naturalLanguageProcessing

Journey to learning natural language processing.

Abstract

In times of political turmoil, the news we see is often not fully accurate, whatever the source. Different parties and biased sources release their own versions of events, and tabloid-style outlets and platforms such as Buzzfeed and Facebook add to the noise. This project tries to predict the accuracy of news from its text. To do that it uses natural language processing (NLP), a set of techniques, often built on machine learning, for getting computers to work with human language.

Goal

Learn natural language processing, then predict the accuracy of a news item from keywords/tags in the article title. One approach is to identify the grammatical roles of words (such as verb and object) in the article's title and in a summary of the article.

Method

Fact-check news based on keywords and phrases. The third column of the data contains the statements; predict how many of them could be "fact-checked." Try breaking each statement into Subject-Verb-Object tuples and checking those tuples against the data.
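A minimal sketch of the Subject-Verb-Object idea, assuming the statement has already been tokenized and tagged with Penn Treebank part-of-speech tags. The function name `extract_svo` and the "nearest noun on each side of the first verb" heuristic are illustrative assumptions, not the project's actual method; a real sentence would need a proper parser.

```python
def extract_svo(tagged_tokens):
    """Return a naive (subject, verb, object) tuple, or None if one cannot be found.

    tagged_tokens is a list of (word, tag) pairs using Penn Treebank tags.
    Heuristic (an assumption for illustration): take the first verb, the
    closest noun or pronoun before it as the subject, and the first noun
    or pronoun after it as the object.
    """
    def is_nominal(tag):
        return tag.startswith("NN") or tag == "PRP"

    # Index of the first token tagged as any verb form (VB, VBD, VBZ, ...).
    verb_index = next(
        (i for i, (_, tag) in enumerate(tagged_tokens) if tag.startswith("VB")),
        None,
    )
    if verb_index is None:
        return None

    before = tagged_tokens[:verb_index]
    after = tagged_tokens[verb_index + 1:]

    subject = next((word for word, tag in reversed(before) if is_nominal(tag)), None)
    obj = next((word for word, tag in after if is_nominal(tag)), None)
    if subject is None or obj is None:
        return None

    return (subject, tagged_tokens[verb_index][0], obj)


# Hand-tagged example statement (made up for illustration).
example = [("The", "DT"), ("senator", "NN"), ("raised", "VBD"), ("taxes", "NNS")]
svo = extract_svo(example)
```

In practice the tagged tokens would likely come from `nltk.pos_tag(nltk.word_tokenize(statement))`, which needs NLTK's tokenizer and tagger data to be downloaded first.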

Tools: NLTK and scikit-learn.
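As a rough sketch of where scikit-learn could fit in, the snippet below trains a Naive Bayes classifier and a linear SVM on TF-IDF features of short statements. The statements and labels are made up purely for illustration, and the choice of `TfidfVectorizer`, `MultinomialNB`, and `LinearSVC` is an assumption rather than the project's confirmed setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training data, invented for this example; real data would come
# from the statements column of the project's dataset.
statements = [
    "The city council approved the new budget on Monday",
    "The unemployment rate fell last quarter according to official figures",
    "The bridge reopened after scheduled repairs",
    "Aliens secretly control the national bank",
    "Drinking seawater cures every known disease",
    "The moon will crash into the ocean next week",
]
labels = [
    "accurate", "accurate", "accurate",
    "inaccurate", "inaccurate", "inaccurate",
]


def train(classifier):
    """Fit a TF-IDF + classifier pipeline on the toy statements."""
    pipeline = make_pipeline(TfidfVectorizer(), classifier)
    pipeline.fit(statements, labels)
    return pipeline


nb_model = train(MultinomialNB())
svm_model = train(LinearSVC())

# Predict a label for an unseen (also invented) statement.
new_statement = ["The council approved repairs to the bridge"]
nb_prediction = nb_model.predict(new_statement)[0]
svm_prediction = svm_model.predict(new_statement)[0]
```

With only six training examples the predictions mean very little; the point is the shape of the pipeline, which stays the same when the toy lists are swapped for real statements and labels.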

PIVOTING!!!!!!!!!!

To Do 5/31/2017

  • Get familiar with NLP using the resources below (feel free to add your own!)
  • Write to a new csv with columns: word, part of speech, original phrase, accuracy of original phrase.
  • Algorithms to use: Naive Bayes classifier, SVM
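The CSV layout from the to-do list could be produced along these lines. This is a sketch using only the standard library; the column names in `FIELDNAMES`, the helper names, and the output file name are assumptions, and the tagged tokens are hand-written stand-ins for real tagger output.

```python
import csv

# Column names are an assumed spelling of the four columns in the to-do list.
FIELDNAMES = ["word", "part_of_speech", "original_phrase", "accuracy"]


def build_rows(tagged_tokens, phrase, accuracy):
    """Create one CSV row per (word, tag) pair, repeating the phrase and its label."""
    return [
        {
            "word": word,
            "part_of_speech": tag,
            "original_phrase": phrase,
            "accuracy": accuracy,
        }
        for word, tag in tagged_tokens
    ]


def write_rows(path, rows):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


# Hand-tagged example (made up); a tagger such as nltk.pos_tag would supply these.
phrase = "The senator raised taxes"
tagged = [("The", "DT"), ("senator", "NN"), ("raised", "VBD"), ("taxes", "NNS")]
rows = build_rows(tagged, phrase, "true")

if __name__ == "__main__":
    write_rows("tagged_words.csv", rows)  # hypothetical output file name
```

Storing one word per row keeps the file easy to filter by part of speech, at the cost of repeating the original phrase and its accuracy label on every row.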

Resources

  • https://www.dataquest.io/blog/natural-language-processing-with-python/
  • https://pythonprogramming.net/naive-bayes-classifier-nltk-tutorial/?completed=/words-as-features-nltk-tutorial/
  • http://textminingonline.com/dive-into-nltk-part-ii-sentence-tokenize-and-word-tokenize
  • http://victoria.lviv.ua/html/fl5/NaturalLanguageProcessingWithPython.pdf