How to get the document-topic distribution of all documents in gensim LDA?



























I'm new to Python and need to build an LDA project. After some preprocessing steps, here is my code:



from gensim.corpora import Dictionary
from gensim.models import LdaModel

# docs is a list of tokenized documents (lists of strings)
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

num_topics = 10
chunksize = 2000
passes = 20
iterations = 400
eval_every = None  # skip perplexity evaluation during training
temp = dictionary[0]  # access the dictionary once so id2token gets populated
id2word = dictionary.id2token
model = LdaModel(corpus=corpus, id2word=id2word, chunksize=chunksize,
                 alpha='auto', eta='auto',
                 random_state=42,
                 iterations=iterations, num_topics=num_topics,
                 passes=passes, eval_every=eval_every)


I want the topic distribution for every document in docs, i.e. 10 topic probabilities per document, but when I use:



get_document_topics = model.get_document_topics(corpus)
print(get_document_topics)


the output is only:



<gensim.interfaces.TransformedCorpus object at 0x000001DF28708E10>


How do I get a topic distribution of docs?










      python-3.x gensim lda topic-modeling probability-distribution














      edited Nov 15 '18 at 6:45







      wayne64001

















      asked Nov 15 '18 at 6:23









      wayne64001
























          1 Answer














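          The function get_document_topics is normally called on a single document in BOW format. You're calling it on the full corpus (a list of documents), so it returns a lazy iterable object (a TransformedCorpus) that produces the scores for each document only when you iterate over it or index into it. Printing it therefore shows just the object itself.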



          You have a few options. If you just want one document, run it on the document you want the values for:



          get_document_topics = model.get_document_topics(corpus[0])


          or do the following to get an array of scores for all the documents:



          get_document_topics = [model.get_document_topics(item) for item in corpus]


          Or directly access each object from your original code:



          get_document_topics = model.get_document_topics(corpus)
          print(get_document_topics[0])
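One thing to watch for with any of the options above: by default, get_document_topics leaves out topics whose probability falls below the model's minimum_probability threshold, so a document may come back with fewer than 10 (topic_id, probability) pairs. Passing minimum_probability=0 to the call is the usual way to ask for (nearly) all of them. The sketch below does not need gensim and only shows how such sparse rows could be expanded into fixed-length vectors. The helper name to_dense is hypothetical and the sample rows are made up for illustration.

```python
def to_dense(sparse_row, num_topics):
    """Expand [(topic_id, prob), ...] into a list of length num_topics.

    Topics missing from the sparse row are filled with 0.0.
    """
    dense = [0.0] * num_topics
    for topic_id, prob in sparse_row:
        dense[topic_id] = float(prob)
    return dense


# Made-up rows in the shape gensim returns: one list of pairs per document.
sample_rows = [
    [(0, 0.7), (3, 0.3)],
    [(1, 0.25), (2, 0.25), (9, 0.5)],
]

# One fixed-length vector per document, indexed by topic id.
doc_topic_matrix = [to_dense(row, num_topics=10) for row in sample_rows]

for vector in doc_topic_matrix:
    print(vector)
```

With the model from the question, the rows would presumably come from [model.get_document_topics(bow, minimum_probability=0) for bow in corpus] instead of sample_rows.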





























          • Thanks! Is it possible to get a topic distribution for the whole set of docs rather than for a single document? I want to check the importance of the 10 topics in the corpus.

            – wayne64001
            Nov 15 '18 at 9:29











          • I'm not sure exactly what you're looking for. LDA works by figuring out how important a topic is for a document, relative to the whole corpus. If you want to see what it thinks of as a topic, use model.show_topics(). According to the gensim documentation at radimrehurek.com/gensim/models/… : "Unlike LSA, there is no natural ordering between the topics in LDA. The returned topics subset of all topics is therefore arbitrary and may change between two LDA training runs."

            – Andrew McDowell
            Nov 15 '18 at 10:28
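One possible reading of the follow-up question is "which topics carry the most weight across the corpus?". A rough way to estimate that, under the assumption that averaging per-document distributions is an acceptable summary, is to sum the per-document topic probabilities (optionally weighted by document length) and normalize. The sketch below is plain Python with made-up rows. The name corpus_topic_weights and its doc_lengths parameter are hypothetical and not part of gensim.

```python
def corpus_topic_weights(doc_topic_rows, num_topics, doc_lengths=None):
    """Combine sparse per-document topic rows into one corpus-level vector.

    Each row is a list of (topic_id, prob) pairs. If doc_lengths is given,
    each document's contribution is scaled by its length. The result is
    normalized to sum to 1 (or left as all zeros if there is no mass).
    """
    if doc_lengths is None:
        doc_lengths = [1.0] * len(doc_topic_rows)
    totals = [0.0] * num_topics
    for row, weight in zip(doc_topic_rows, doc_lengths):
        for topic_id, prob in row:
            totals[topic_id] += weight * float(prob)
    norm = sum(totals)
    return [t / norm for t in totals] if norm > 0 else totals


# Made-up per-document rows for a 3-topic model.
rows = [
    [(0, 0.5), (1, 0.5)],
    [(1, 1.0)],
]

weights = corpus_topic_weights(rows, num_topics=3)

# Rank topics from most to least prominent across the corpus.
ranked = sorted(enumerate(weights), key=lambda pair: pair[1], reverse=True)
print(ranked)
```

Whether to weight by document length is a design choice: unweighted averaging treats every document equally, while length weighting approximates the share of tokens attributed to each topic.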













          answered Nov 15 '18 at 8:41









          Andrew McDowell












