nsfads

Introduction

This repository contains R code to munge, analyze, and publish (as a Shiny web app) summaries of word counts in National Science Foundation (NSF) grant abstracts from the Division of Mathematical Sciences (DMS). See mathtrends.ssk.im for an example.

The relevant files are:

xml_munge.R
gen_tdm.R
tdm_dms.R
ui.R
server.R

For any questions, bug reports, etc., contact Steven S. Kim via e-mail at [email protected].

Requirements

Required R packages include:

XML
plyr
data.table
tm
RWeka
ggplot2
stringr

The XML files containing abstract data were downloaded from the NSF website.
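Before running the scripts, it can help to check which of the packages listed above are missing from the local R library. The following is a minimal sketch, not code from the repository: the names `required_pkgs` and `missing_pkgs` are assumptions for illustration. Note that RWeka depends on rJava and therefore needs a working Java installation, and that running ui.R and server.R presumably also requires the shiny package, which is not listed above.

```r
# Packages listed in the Requirements section of this README.
required_pkgs <- c("XML", "plyr", "data.table", "tm", "RWeka", "ggplot2", "stringr")

# Names of the listed packages that are not yet installed locally.
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install them from CRAN:
  # install.packages(missing_pkgs)
} else {
  message("All required packages are installed.")
}
```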

Project Notes

This project was heavily influenced by the Google Ngram Viewer.
By default, the code covers the years 1990 through 2015. This range is an arbitrary choice and can be changed by updating the YEARS constant in the code. However, many XML files from earlier years do not contain abstract data.
Key functionality is provided by the tm text-mining package in R.
The file tdm_dms.R sparsifies the TermDocumentMatrix to include only terms that occur in at least 20% of the years analyzed.
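The 20% rule can be illustrated on a small term-by-year count matrix. This is a sketch in base R under stated assumptions, not the repository's implementation: the matrix values, the term names, and the constant `MIN_YEAR_SHARE` are made up for illustration. The tm package provides `removeSparseTerms()` for a comparable filter on TermDocumentMatrix objects, but whether the repository uses it is not confirmed here.

```r
# Toy term-by-year count matrix (rows = terms, columns = years).
# The values are made up for illustration only.
counts <- matrix(
  c(3, 0, 2, 5, 1,   # "network": present in 4 of 5 years
    0, 4, 0, 0, 7,   # "outreach": present in 2 of 5 years
    0, 0, 0, 0, 0),  # "zeta": present in no year
  nrow = 3, byrow = TRUE,
  dimnames = list(c("network", "outreach", "zeta"), as.character(2011:2015))
)

MIN_YEAR_SHARE <- 0.2  # hypothetical name for the 20% threshold

# Fraction of years in which each term occurs at least once.
year_share <- rowMeans(counts > 0)

# Keep only the terms that reach the threshold.
sparse_counts <- counts[year_share >= MIN_YEAR_SHARE, , drop = FALSE]
rownames(sparse_counts)
```

Dropping rarely used terms keeps the saved matrix small, at the cost of making those terms unavailable to queries.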
A few example terms with interesting trends:
- machine learning vs. data vs. statistics + statistical
- biology + biological
- underrepresented, minority + minorities
- outreach
- young researchers and undergraduate, graduate
- develop, advance + advances
- the project will
- network + networks
- control, partial differential
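One plausible reading of the "+" in the examples above is that related terms are summed into a single series before plotting. The sketch below shows that reading on toy data. It is an assumption, not the app's confirmed behavior: the counts, the yearly grant totals, and the names `term_counts`, `grants_per_year`, and `combined_share` are all hypothetical.

```r
# Toy counts of abstracts per year mentioning each term (made-up values).
years <- 2011:2013
term_counts <- rbind(
  statistics  = c(10, 12, 15),
  statistical = c(20, 18, 25)
)
colnames(term_counts) <- years

# Made-up total number of grants per year, used to normalize the counts.
grants_per_year <- c(100, 120, 160)

# "statistics + statistical": sum the two rows, then divide by the yearly totals.
combined_share <- colSums(term_counts[c("statistics", "statistical"), ]) / grants_per_year
combined_share
```

Normalizing by the number of grants per year makes years with different funding volumes comparable.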
Some eventual TODOs:
- smoothing the time series
- a "shuffle" option incorporating list of sample queries
- make the plot interactive with tool-tips on hover
- look at all divisions and make comparisons across NSF
- compare to NIH/DOD/NSERC funding priorities
- use a Markov model to generate a "sample" abstract
- map textual differences across corpora
- a "dollar-weighted" count (weighting gram proportion in a given grant by dollars in grant)
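The "dollar-weighted" count in the last TODO can be sketched as a weighted mean: each grant's gram proportion is weighted by its award amount. The example below is only an illustration of that idea, since the feature is not implemented. The data frame `grants`, its column names, and all values are hypothetical.

```r
# Toy grants: award size and gram occurrences per abstract (made-up values).
grants <- data.frame(
  year        = c(2014, 2014, 2015),
  dollars     = c(100000, 300000, 200000),
  gram_count  = c(2, 3, 4),       # occurrences of the gram in the abstract
  total_grams = c(200, 100, 400)  # all grams in the abstract
)

# Proportion of each abstract's grams that match the gram of interest.
grants$proportion <- grants$gram_count / grants$total_grams

# Per year: mean proportion weighted by dollars, next to the unweighted mean.
dollar_weighted <- sapply(split(grants, grants$year), function(g) {
  weighted.mean(g$proportion, w = g$dollars)
})
unweighted <- sapply(split(grants, grants$year), function(g) mean(g$proportion))

dollar_weighted
unweighted
```

In this toy data the larger 2014 grant uses the gram more often, so the dollar-weighted value for 2014 exceeds the unweighted one.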
