v0.1.0

@changhiskhan changhiskhan released this 27 Sep 23:07

Highlights

  1. Documentation is now live, and a Quickstart Notebook is available.
  2. Lance is now integrated with PyTorch and supports multiple workers.
  3. Vision-specific extension types: Box2d provides vectorized IoU, and the Image types make it easy to perform I/O and convert between bytes, PIL, NumPy, and tensors.
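
To illustrate what "vectorized IoU" means for boxes in `[x0, y0, x1, y1]` form, here is a minimal NumPy sketch. It is not Lance's `Box2d` implementation or API; the function name `pairwise_iou` and the sample boxes are assumptions made for this example only.

```python
import numpy as np


def pairwise_iou(boxes_a, boxes_b):
    """Return an (N, M) matrix of IoU values for two sets of boxes.

    Hypothetical helper for illustration; boxes are [x0, y0, x1, y1].
    """
    a = np.asarray(boxes_a, dtype=float)
    b = np.asarray(boxes_b, dtype=float)

    # Broadcast (N, 1) against (1, M) to get every pairwise intersection.
    x0 = np.maximum(a[:, None, 0], b[None, :, 0])
    y0 = np.maximum(a[:, None, 1], b[None, :, 1])
    x1 = np.minimum(a[:, None, 2], b[None, :, 2])
    y1 = np.minimum(a[:, None, 3], b[None, :, 3])

    # Negative widths or heights mean the boxes do not overlap.
    inter = np.clip(x1 - x0, 0, None) * np.clip(y1 - y0, 0, None)

    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter

    # Avoid dividing by zero for degenerate (zero-area) pairs.
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)


# Sample boxes (made up for this sketch).
predictions = [[0, 0, 2, 2]]
ground_truth = [[1, 1, 3, 3], [0, 0, 2, 2], [5, 5, 6, 6]]
iou = pairwise_iou(predictions, ground_truth)
```

The whole computation runs as array operations with no Python loop over box pairs, which is the property the highlight refers to.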

What's Changed

  • Setting BatchSize via ScanBuilder by @eddyxu in #135
  • Move Expression based schema project to Schema class by @eddyxu in #137
  • Refactor I/O exec nodes by @eddyxu in #136
  • Simplify RecordBatchReader to use Project.next() by @eddyxu in #139
  • Convert bdd100k dataset in python benchmarks by @eddyxu in #131
  • Fix the condition of Scan advancing batch id by @eddyxu in #143
  • Initial PyTorch Dataset support by @eddyxu in #134
  • Example training code over oxford pet dataset by @eddyxu in #144
  • Test writing fixed size list and fixed size binary via WriteTable by @eddyxu in #151
  • Fix fixed size length calculation by @eddyxu in #152
  • Provide binary to profiling scans by @eddyxu in #149
  • Multi-worker support in Pytorch Dataset by @eddyxu in #147
  • Vision specific extension types by @changhiskhan in #146
  • lance dataset that overrides Dataset.scanner and Dataset.head by @changhiskhan in #158
  • Pickle Image by @changhiskhan in #160
  • Only load manifest once within the dataset and share Manifest among the readers by @eddyxu in #155
  • Improve ergonomics of the Pytorch dataset and Generate embeddings for oxford pet by @eddyxu in #157
  • Fix PlainEncoder to read empty page by @eddyxu in #164
  • Convert coco annotations from the list of structs to struct of lists by @eddyxu in #166
  • Convert coco bounding box format to [x0,y0,x1,y1] format. by @eddyxu in #169
  • Image Array by @changhiskhan in #168
  • Fix writing and reading extension type by @eddyxu in #172
  • Coco improvements by @changhiskhan in #174
  • Support partitioning and group size control in coco dataset generation. by @eddyxu in #175
  • Extension type improvements to support 3d types by @changhiskhan in #173
  • Support converting PIL from Image in pytorch Dataset by @eddyxu in #176
  • Minor fix for 3d extension types by @changhiskhan in #177
  • MS coco dataset training by @eddyxu in #163
  • Change version import to relative import by @eddyxu in #181
  • [python] Mix of minor improvements by @changhiskhan in #182
  • Automatically build document and publish to Github Pages by @eddyxu in #180
  • [benchmarks] simplify the datagen code and remove partitioning for now by @changhiskhan in #183
  • Fix PlainDecoder handle empty filtered array by @eddyxu in #187
  • [python] minor improvements by @changhiskhan in #190
  • Fix bug that attempts to partition on columns which do not exist in the file. by @eddyxu in #189
  • Pass filter indices via Limit and Return empty array in GetListArray by @eddyxu in #191
  • Exclude filter columns from projection by @eddyxu in #194
  • action to bump version for new release by @changhiskhan in #199
  • [C++] [BUG] Adjust offset when the batch size is set for reading by @eddyxu in #201
  • GH action to upload wheels and also make reusable yml by @changhiskhan in #200
  • [Python] Test projection in Python Torch Dataset by @eddyxu in #202
  • Fix typo of calculating offsets for slicing index by @eddyxu in #206
  • Changhiskhan/tutorial by @changhiskhan in #167
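
Two of the COCO changes above (#166 and #169) describe data-layout conversions that are easier to see in code. The following is a plain-Python sketch of the general idea, not the code from those pull requests. It assumes the standard COCO bounding box layout `[x, y, width, height]`; the helper names and the sample annotations are hypothetical.

```python
def coco_box_to_corners(box):
    """Convert a COCO [x, y, width, height] box to [x0, y0, x1, y1]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def to_struct_of_lists(records):
    """Turn a list of dicts (list of structs) into a dict of lists."""
    # Collect field names in first-seen order so the output is stable.
    fields = []
    for rec in records:
        for key in rec:
            if key not in fields:
                fields.append(key)
    # Missing fields become None so every column has the same length.
    return {f: [rec.get(f) for rec in records] for f in fields}


# Sample annotations for one image (made up for this sketch).
annotations = [
    {"category_id": 18, "bbox": [10.0, 20.0, 30.0, 40.0]},
    {"category_id": 1, "bbox": [0.0, 0.0, 5.0, 5.0]},
]

columns = to_struct_of_lists(annotations)
columns["bbox"] = [coco_box_to_corners(b) for b in columns["bbox"]]
```

A struct-of-lists layout keeps each field contiguous, which generally suits columnar formats better than one struct per annotation.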

Full Changelog: v0.0.5...v0.1.0