Releases: capitalone/DataProfiler

v0.3.5

16 Mar 21:06
f63cad6
  • Enhancement: 50-90% reduced profiling time
    • Improved methods for predicting unique rows and nulls in rows
  • Enhancement: Users can now select header row for delimited files
  • Bug Fix: Added header detection on delimited files with only strings
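To illustrate what selecting a header row means for a delimited file, here is a minimal standard-library sketch. It is not DataProfiler's implementation or API; the function name `read_with_header_row` and its parameters are hypothetical and exist only for this example.

```python
import csv
import io


def read_with_header_row(text, header_row=0, delimiter=","):
    """Parse delimited text, treating row `header_row` (0-based) as the header.

    Rows above the header are discarded; rows below it become dicts.
    Hypothetical helper for illustration only, not part of DataProfiler.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not 0 <= header_row < len(rows):
        raise IndexError(f"header_row {header_row} is outside the file")
    header = rows[header_row]
    return [dict(zip(header, row)) for row in rows[header_row + 1:]]


# The first line is a stray banner, so the real header is on row 1.
sample = "exported by tool\nname,age\nAda,36\nLin,41\n"
records = read_with_header_row(sample, header_row=1)
```

Picking the header explicitly is useful when auto-detection is ambiguous, for example when every column holds strings and the header looks like just another data row.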

v0.3.4

12 Mar 19:28
5e5f64e
  • Significantly improved header detection on structured datasets
  • Updated model
    • New entities: DATE, TIME, US_STATE, DRIVERS_LICENSE
    • Removed entities: INTEGER_BIG
  • New, easier way to extend the model with additional labels
  • ML requirements are now installed separately via pip install dataprofiler[ml] (required for the labeler)
  • Profiler & Labeler only load TensorFlow when necessary
  • Minor bug fixes & improved testing

v0.3.2

04 Mar 05:09
7c05449
  • TensorFlow only runs when a labeler executes
  • Improved CSV detection
  • 2-8x memory reduction in profiling
  • Various bug fixes

v0.3.1

23 Feb 20:49
93a9b6e
  • Dramatically reduced memory requirements for the data labeler
  • Renamed the module: data_profiler -> dataprofiler
  • Improved delimited (CSV) file detection
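Because the module was renamed from data_profiler to dataprofiler, code that must run against either version can check which name is importable. This is a minimal sketch; the helper `first_available` is hypothetical and not part of the library.

```python
import importlib.util


def first_available(candidates):
    """Return the first importable top-level module name, or None.

    Hypothetical helper; uses find_spec so nothing is actually imported.
    """
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prefer the new name, fall back to the pre-rename one.
module_name = first_available(["dataprofiler", "data_profiler"])
```

If `module_name` is not None, it can be passed to `importlib.import_module` to load whichever version is installed.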

v0.3.0

11 Feb 20:01
07e8b3b

Initial Data Profiler release.
Load a file. Extract profile. Save output.
See README.md for full information regarding this release.
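The load, profile, save workflow might look like the sketch below. The `Data`, `Profiler`, and `report` names follow the project's later README as best recalled, so the exact API at v0.3.0 may differ, and before v0.3.1 the module was named data_profiler. The wrapper function `profile_to_json` and its parameters are hypothetical.

```python
import json


def profile_to_json(input_path, output_path):
    """Load a file, extract its profile, and save the report as JSON.

    Hypothetical wrapper; the library calls are assumptions based on the
    project's later README and may not match v0.3.0 exactly.
    """
    # Imported here so this sketch can be loaded without the library installed.
    import dataprofiler as dp  # named data_profiler before v0.3.1

    data = dp.Data(input_path)      # load a file
    profile = dp.Profiler(data)     # extract profile
    report = profile.report()       # report as a dictionary

    # default=str guards against values that json cannot serialize directly.
    with open(output_path, "w") as fh:
        json.dump(report, fh, indent=2, default=str)
    return report
```

Calling `profile_to_json("your_file.csv", "profile.json")` would then write the report to disk, assuming the library is installed and the calls above match the installed version.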