This repository has been archived by the owner on Jan 18, 2024. It is now read-only.

Identify Survey topic #89

Open
6 tasks
TiffanyAndrews opened this issue Oct 22, 2019 · 0 comments


TiffanyAndrews commented Oct 22, 2019

10x Qualitative Data User story

As a data scientist, I want to identify the number of topics so that the LDA algorithm can match them with keywords in the text data I provide.

Acceptance criteria

  • [ ] identify the total number of topics (K) in the training data
  • all topics are related
  • each word is assigned to a topic
  • the probabilities across all K topics sum to 1

To do:

  • choose the total number of topics (K)
  • go through each comment and randomly assign each word in the comment to one of the K topics
  • For each comment c
    • For each word w in c
      A. For each topic t, compute two things:
         1) p(topic t | comment c)
         2) p(word w | topic t)
      B. Reassign w to a new topic, choosing t with probability proportional to p(topic t | comment c) * p(word w | topic t) (the probability that topic t generated word w)
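
The to-do steps above can be sketched in plain Python as a toy collapsed Gibbs sampler. This is only an illustration under stated assumptions, not the project's implementation: the function name `lda_gibbs`, the smoothing values `alpha` and `beta`, the iteration count, and the step of removing a word's current assignment before recomputing the probabilities are all assumptions that the issue does not specify.

```python
import random


def lda_gibbs(comments, K, iterations=50, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler following the to-do steps (hypothetical helper).

    comments: list of tokenized comments (each a list of words).
    alpha, beta: smoothing hyperparameters (assumed values, not from the issue).
    Returns (assignments, comment_topic_probs).
    """
    rng = random.Random(seed)
    vocab = sorted({w for c in comments for w in c})
    V = len(vocab)

    # Randomly assign each word in each comment to one of the K topics.
    assignments = [[rng.randrange(K) for _ in c] for c in comments]

    # Count tables from which the two probabilities are computed.
    comment_topic = [[0] * K for _ in comments]               # words in comment c with topic t
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(K)]  # times word w has topic t
    topic_total = [0] * K                                     # all words with topic t
    for c, words in enumerate(comments):
        for i, w in enumerate(words):
            t = assignments[c][i]
            comment_topic[c][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1

    for _ in range(iterations):
        for c, words in enumerate(comments):
            for i, w in enumerate(words):
                # Take the current assignment of w out of the counts.
                old = assignments[c][i]
                comment_topic[c][old] -= 1
                topic_word[old][w] -= 1
                topic_total[old] -= 1

                # A. p(topic t | comment c) * p(word w | topic t), unnormalized.
                weights = [
                    (comment_topic[c][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + V * beta)
                    for t in range(K)
                ]

                # B. Reassign w, choosing t in proportion to its weight.
                new = rng.choices(range(K), weights=weights)[0]
                assignments[c][i] = new
                comment_topic[c][new] += 1
                topic_word[new][w] += 1
                topic_total[new] += 1

    # Smoothed per-comment topic probabilities; each row sums to 1.
    comment_topic_probs = [
        [(n + alpha) / (len(words) + K * alpha) for n in counts]
        for counts, words in zip(comment_topic, comments)
    ]
    return assignments, comment_topic_probs
```

Calling `lda_gibbs(tokenized_comments, K=2)` returns one topic index per word and one probability row per comment, which lines up with the acceptance criteria that each word is assigned to a topic and that the probabilities across the K topics sum to 1.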