Repository housing a Chrome extension demo developed at HackZurich 2017. The extension wraps a Python script that analyzes the text of a news article to determine its fringiness (a measure of how far it departs from the mainstream). It does so by comparing the entities that PermID extracts from the article against the entities in a real-time stream of articles fetched via the Thomson Reuters ® API.
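To make the idea of comparing an article's entities against a reference stream more concrete, here is a minimal sketch. It is not the `fringiness` function from this repository: the names `jaccard` and `toy_fringiness`, the use of Jaccard similarity, and the entity sets are all assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two entity sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def toy_fringiness(article_entities, stream_entities):
    """Hypothetical score: 1 minus the mean Jaccard similarity to the stream.

    0.0 means the article's entity set is identical to that of every
    reference article; 1.0 means it shares no entity with any of them.
    """
    if not stream_entities:
        raise ValueError("need at least one reference article")
    sims = [jaccard(article_entities, ref) for ref in stream_entities]
    return 1.0 - sum(sims) / len(sims)


# Made-up entity sets, for illustration only (not real PermID output).
stream = [
    {"European Central Bank", "Mario Draghi", "Eurozone"},
    {"European Central Bank", "Eurozone", "Germany"},
]
mainstream = {"European Central Bank", "Eurozone"}
fringe = {"Atlantis", "Moon Base"}

print(toy_fringiness(mainstream, stream))  # lower score: entities overlap with the stream
print(toy_fringiness(fringe, stream))      # highest score: no shared entities
```

An article whose entities rarely appear in the reference stream scores close to 1 in this toy version, which is the intuition behind calling it fringe.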
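from bokeh.embed import file_html
from bokeh.resources import CDN

# fastrun, res_to_matrix, fringiness and embedding_plot_bokeh are defined in this repository.
text = "<few-paragraphs-of-news-text>"
res = fastrun(text)
# res_times must be supplied by the caller; it is not defined in this snippet.
x, y, f = fringiness(res_to_matrix(res_times)[0])
plot = embedding_plot_bokeh(x, y, f, res)

# Render the plot as a standalone HTML page and save it.
html = file_html(plot, CDN, title="my plot")
with open("file.html", "w") as file:
    file.write(html)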
See also the Jupyter Notebook
- Nikola Nikolov
- Daniel Keller
- Stan Kerstjens
- Martin Holub
Download the word vectors from https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit and update the path to them in document_similarity.py.
You need the gensim and NLTK libraries: pip install gensim nltk