v0.1.0
Highlights
- Documentation is now live and a Quickstart Notebook is available
- Lance is now integrated with PyTorch and supports multiple workers.
- Vision-specific extension types: Box2d provides vectorized IoU, and Image types make it easy to perform I/O and convert between bytes, PIL, NumPy, and tensors.
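
To illustrate what "vectorized IoU" means in the last highlight, here is a minimal NumPy sketch that computes intersection-over-union for every pair of boxes at once instead of looping in Python. It is not Lance's Box2d implementation or API: the function name `pairwise_iou` is hypothetical, and the `[x0, y0, x1, y1]` box layout is an assumption borrowed from the COCO conversion listed under What's Changed.

```python
import numpy as np


def pairwise_iou(boxes_a, boxes_b):
    """Return an (N, M) array of IoU values for boxes given as [x0, y0, x1, y1].

    Hypothetical helper for illustration only; not part of the Lance API.
    """
    # Add broadcast axes so every box in A is compared with every box in B.
    a = np.asarray(boxes_a, dtype=float)[:, None, :]  # shape (N, 1, 4)
    b = np.asarray(boxes_b, dtype=float)[None, :, :]  # shape (1, M, 4)

    # Corners of the intersection rectangle for each pair.
    ix0 = np.maximum(a[..., 0], b[..., 0])
    iy0 = np.maximum(a[..., 1], b[..., 1])
    ix1 = np.minimum(a[..., 2], b[..., 2])
    iy1 = np.minimum(a[..., 3], b[..., 3])

    # Clip negative widths/heights to zero for boxes that do not overlap.
    inter = np.clip(ix1 - ix0, 0, None) * np.clip(iy1 - iy0, 0, None)

    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    union = area_a + area_b - inter

    # Return 0 where the union is empty (degenerate boxes) to avoid dividing by zero.
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)


if __name__ == "__main__":
    predictions = [[0, 0, 2, 2]]
    ground_truth = [[1, 1, 3, 3], [0, 0, 2, 2], [5, 5, 6, 6]]
    print(pairwise_iou(predictions, ground_truth))
```

The broadcast shapes `(N, 1, 4)` and `(1, M, 4)` are what make the computation vectorized: a single pass produces the full N by M matrix, which is the usual input for matching predictions against ground-truth annotations.
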
What's Changed
- Setting BatchSize via ScanBuilder by @eddyxu in #135
- Move Expression based schema project to Schema class by @eddyxu in #137
- Refactor I/O exec nodes by @eddyxu in #136
- Simplify RecordBatchReader to use Project.next() by @eddyxu in #139
- Convert bdd100k dataset in python benchmarks by @eddyxu in #131
- Fix the condition of Scan advancing batch id by @eddyxu in #143
- Initial PyTorch Dataset support by @eddyxu in #134
- Example training code over oxford pet dataset by @eddyxu in #144
- Test writing fixed size list and fixed size binary via WriteTable by @eddyxu in #151
- Fix fixed size length calculation by @eddyxu in #152
- Provide a binary for profiling scans by @eddyxu in #149
- Multi-worker support in PyTorch Dataset by @eddyxu in #147
- Vision specific extension types by @changhiskhan in #146
- lance dataset that overrides Dataset.scanner and Dataset.head by @changhiskhan in #158
- Pickle Image by @changhiskhan in #160
- Only load manifest once within the dataset and share Manifest among the readers by @eddyxu in #155
- Improve ergonomics of the PyTorch dataset and generate embeddings for oxford pet by @eddyxu in #157
- Fix PlainEncoder to read empty page by @eddyxu in #164
- Convert coco annotations from the list of structs to struct of lists by @eddyxu in #166
- Convert coco bounding boxes to [x0,y0,x1,y1] format by @eddyxu in #169
- Image Array by @changhiskhan in #168
- Fix writing and reading extension type by @eddyxu in #172
- Coco improvements by @changhiskhan in #174
- Support partitioning and group size control in coco dataset generation by @eddyxu in #175
- Extension type improvements to support 3d types by @changhiskhan in #173
- Support converting PIL from Image in PyTorch Dataset by @eddyxu in #176
- Minor fix for 3d extension types by @changhiskhan in #177
- MS coco dataset training by @eddyxu in #163
- Change version import to relative import by @eddyxu in #181
- [python] Mix of minor improvements by @changhiskhan in #182
- Automatically build documentation and publish to GitHub Pages by @eddyxu in #180
- [benchmarks] simplify the datagen code and remove partitioning for now by @changhiskhan in #183
- Fix PlainDecoder to handle empty filtered array by @eddyxu in #187
- [python] minor improvements by @changhiskhan in #190
- Fix bug that attempts to partition on columns that do not exist in the file by @eddyxu in #189
- Pass filter indices via Limit and Return empty array in GetListArray by @eddyxu in #191
- Exclude filter columns from projection by @eddyxu in #194
- GH action to bump version for new release by @changhiskhan in #199
- [C++] [BUG] Adjust offset when the batch size is set for reading by @eddyxu in #201
- GH action to upload wheels and also make reusable yml by @changhiskhan in #200
- [Python] Test projection in Python Torch Dataset by @eddyxu in #202
- Fix typo of calculating offsets for slicing index by @eddyxu in #206
- Changhiskhan/tutorial by @changhiskhan in #167
Full Changelog: v0.0.5...v0.1.0