Data Repositories
Quite a few pages on the web already host data for different ML experiments. This page lists them. @ALL: Please extend this page if you know of other repositories!
- UCI
- MLData
- Benchmark Data Sets for Highly Imbalanced Binary Classification
  http://www.cs.gsu.edu/~zding/research/imbalance-data/x19data.txt
- UCR
- KEEL
- LIBSVM
- NIPS Feature Selection Challenge Data
- AutoWeka Data
- More Feature Selection
- BigML's list of 1000+ data sources
  http://blog.bigml.com/2013/02/28/data-data-data-thousands-of-public-data-sources/
- Massive list from Data Science Central
  http://www.datasciencecentral.com/profiles/blogs/data-sources-for-cool-data-science-projects
- R packages:
- UTwente Activity recognition datasets:
- Quandl
- KDNuggets list of data sets:
  http://www.kdnuggets.com/2015/04/awesome-public-datasets-github.html
- Microarray data:
  http://genomics-pubs.princeton.edu/oncology/
  http://svitsrv25.epfl.ch/R-doc/library/multtest/html/golub.html