16 Jan 00:24

434ad65

v0.2.9 pandas extension type for inline images

We've also started to implement Lance in Rust. A new kickass vector indexing feature is coming soon, once we finish some more cleanup and hook the Rust module back into Python.

What's Changed

[DuckDB] Add macro to check window size by @eddyxu in #395
[pandas] Add pandas extension type for ImageBinary by @changhiskhan in #398
python 3.11 is updating and causing error by @changhiskhan in #397
[RUST] Initialize read support in Rust. by @eddyxu in #401
Add missing logical type conversions by @eddyxu in #404
[RUST] Schema projection by @eddyxu in #403
[RUST] Data file reader by @eddyxu in #402
[Rust] Decoder for dictionary encoding by @eddyxu in #406
[Rust] Support full scan for BooleanArray by @changhiskhan in #407
[Rust] Basic reading support for nested fields. by @eddyxu in #408
Add unit tests for all supported primitive types by @changhiskhan in #409
[RUST] Binary encoder and null support. by @eddyxu in #411
[Rust] Fix Cargo publish by @eddyxu in #410
[RUST] Large binary support by @eddyxu in #412
Add support for fixed size list by @changhiskhan in #413
Jaichopra/nuscenes converter by @jaichopra in #364
Add Support for Fixed Size Binary Full scan by @changhiskhan in #414
Bare minimal scanner in Rust by @eddyxu in #415
Set field IDs. by @eddyxu in #417
[Rust] Read/Write Protobuf-backed struct directly from file or buffers. by @eddyxu in #418
[Rust] Lance File Writer by @eddyxu in #419
[Rust] Write dictionary data by @eddyxu in #420
[RUST] Write List/LargeList/FixedSizeList/FixedSizeBinary by @eddyxu in #421
fix byte range and iterator bug by @changhiskhan in #422
Fix dict order in logical type to be consistent with C++ by @eddyxu in #425
Limits notebook GHA to only run when C++ / Python changes. by @eddyxu in #427
Implement futures::Stream for Scanner by @eddyxu in #426
Append column to RecordBatch by @eddyxu in #429
[Rust] Read batch with rowid as a meta column. by @eddyxu in #430
[RUST] argmin and argmax kernel for numeric array by @eddyxu in #432
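
For readers unfamiliar with the argmin/argmax kernels mentioned in #432, here is a minimal pure-Python sketch of the idea. It is not the Rust implementation from the PR: the function names, the use of None for nulls, and the first-occurrence tie-breaking are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def argmin(values: Sequence[Optional[float]]) -> Optional[int]:
    """Return the index of the smallest non-null value, or None if there is none."""
    best: Optional[int] = None
    for i, v in enumerate(values):
        if v is None:
            continue  # nulls are skipped (assumed behaviour)
        if best is None or v < values[best]:
            best = i  # strict '<' keeps the first occurrence on ties
    return best


def argmax(values: Sequence[Optional[float]]) -> Optional[int]:
    """Return the index of the largest non-null value, or None if there is none."""
    best: Optional[int] = None
    for i, v in enumerate(values):
        if v is None:
            continue  # nulls are skipped (assumed behaviour)
        if best is None or v > values[best]:
            best = i  # strict '>' keeps the first occurrence on ties
    return best
```

Returning None for an empty or all-null input avoids inventing a sentinel index; a real kernel may handle that case differently.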

Full Changelog: v0.2.8...v0.2.9

Contributors

eddyxu, changhiskhan, and jaichopra

24 Dec 03:37

changhiskhan

v0.2.8

b8a949b

v0.2.8 Happy Holidays!

This release contains the following:

A full-fledged ML data quality improvement workflow using Lance showing model performance insights, detecting mislabels, and doing active learning. An experimental integration with Label Studio is demonstrated as well.
Critical bug fix for reading and writing dictionary columns
Imagenet dataset converter

What's Changed

[BUG] Fix reading version aux data reading and writing by @eddyxu in #384
[Benchmark] upload scripts for coco / imagenet benchmark dataset by @eddyxu in #385
Closes #387 by @changhiskhan in #388
Data quality notebook and associated code by @changhiskhan in #389
[DUCKDB] Do not build PyTorch by default by @eddyxu in #392
brew pin python by @changhiskhan in #391
fix off by one error using negative indices for diff'ing by @changhiskhan in #383
Fix GHA for duckdb extension by @changhiskhan in #394
[DUCKDB] Add a Derivative macro by @eddyxu in #393
[Benchmark] Create imagenet from raw dataset by @eddyxu in #386
Various fixes for imagenet and fmt changes by @changhiskhan in #396

Full Changelog: v0.2.7...v0.2.8

Contributors

eddyxu and changhiskhan

19 Dec 00:48

eddyxu

v0.2.7

173ac9d

v0.2.7 Dataset Diff and Metrics computation, and Dataset Version Metadata

What's Changed

create and update tarball for pets by @changhiskhan in #372
[C++] Sanity check to verify column does not overlap when merging a new table by @eddyxu in #375
update notebooks so s3 credentials are not required by @changhiskhan in #376
Add function to get version as of a certain date. Also formatting by @changhiskhan in #378
convenience for comparing metrics across versions by @changhiskhan in #379
Changhiskhan/datadiff by @changhiskhan in #380
Refactor dataset diff and compute metric by @changhiskhan in #381
[C++] Attach new schema update when update dataset by @eddyxu in #374
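
The "get version as of a certain date" change (#378) can be pictured as a lookup over version timestamps. The sketch below is only an illustration under stated assumptions: the Version tuple and the version_asof function are hypothetical names, not Lance's API, and it assumes "as of" means the latest version created at or before the given time.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, NamedTuple, Optional


class Version(NamedTuple):
    """Hypothetical stand-in for a dataset version record."""
    number: int
    timestamp: datetime


def version_asof(versions: List[Version], when: datetime) -> Optional[Version]:
    """Return the latest version created at or before `when`, or None if all are newer."""
    ordered = sorted(versions, key=lambda v: v.timestamp)
    stamps = [v.timestamp for v in ordered]
    # bisect_right counts the versions whose timestamp is <= `when`.
    idx = bisect_right(stamps, when)
    return ordered[idx - 1] if idx else None


versions = [
    Version(1, datetime(2022, 12, 1)),
    Version(2, datetime(2022, 12, 10)),
    Version(3, datetime(2022, 12, 18)),
]
picked = version_asof(versions, datetime(2022, 12, 12))
```

Here picked is version 2, because version 3 was created after the requested date. The sketch uses naive datetimes throughout; mixing naive and timezone-aware values would raise a TypeError in the comparison.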

Full Changelog: v0.2.6...v0.2.7

Contributors

eddyxu and changhiskhan

13 Dec 02:41

eddyxu

v0.2.6

00102dc

v0.2.6 Schema evolution bug fixes, Google Colab support, and more datasets

What's Changed

[C++] Remove unused Reader APIs by @eddyxu in #344
[Python] fix timezone issue with version timestamp by @changhiskhan in #345
[C++] add Dataset::Make(string) API by @eddyxu in #346
[DUCKDB] Native duckdb lance reader by @eddyxu in #347
[DUCKDB] Read a special version of dataset by @eddyxu in #350
[DUCKDB] Fix duckdb manylinux build by @eddyxu in #351
[Python] Add colab badge to notebooks by @eddyxu in #354
[Notebook] ML dev cycle for DINO by @eddyxu in #355
[DUCKDB] fix type mapping for other int types by @changhiskhan in #359
[Python] Fix lance.dataset open local related path by @eddyxu in #365
[C++] Store relative path for data files by @eddyxu in #368
[C++] Add RAII util (defer) to auto cleanup / close resources after exiting the scope by @eddyxu in #369
[Python] Convert of ImageNet 1K into Lance dataset by @eddyxu in #366
[Python] Imagenet data quality analytics notebook by @eddyxu in #370

Full Changelog: v0.2.5...v0.2.6

Contributors

eddyxu and changhiskhan

02 Dec 06:15

eddyxu

v0.2.5

ceb65ae

v0.2.5 Schema evolution, support merging with arrow Table

What's Changed

[DOC] Fix notebook build by @eddyxu in #339
[Python] lance.write_dataset takes pandas DataFrame by @eddyxu in #342
[DOC] update readme docs to cater for import pathways from df/parquet by @jaichopra in #340
[Python] Improve PyTorch dataset ergonomic by @eddyxu in #336
[C++] Add columns from in-memory table by @eddyxu in #337
[Python] append column with an in-memory PyArrow Table by @eddyxu in #338
[C++][Python] Add timestamp to each manifest version. by @eddyxu in #343

Full Changelog: v0.2.4...v0.2.5

Contributors

eddyxu and jaichopra

28 Nov 21:25

eddyxu

v0.2.4

b6ba75f

v0.2.4: Schema Evolution and Append Column

Support Schema Evolution via Append Column.

What's Changed

[Notebook] fixes for notebook backing the blog post by @changhiskhan in #316
[C++] Append column by @eddyxu in #299
[Python] Append columns by @eddyxu in #318
Use column projection during update by @eddyxu in #322
update to duckdb 0.6 by @changhiskhan in #312
[Python] Support add column via Expression. by @eddyxu in #324
[Python] Expose projection for append column by @eddyxu in #325
[C++] Support column projection during add_columns via expression by @eddyxu in #326
[Python] Pytorch Dataset uses Fragment instead of files and support versions by @eddyxu in #327
[C++] Move writer API to a private API by @eddyxu in #329
[C++] Refactor Metadata class to eliminate protobuf reference. by @eddyxu in #328
[C++] Performance profiling and improvement by @eddyxu in #333
[C++] Upgrade lq cmd tool to be able to inspect new versioned format by @eddyxu in #334

Full Changelog: v0.2.3...v0.2.4

Contributors

eddyxu and changhiskhan

16 Nov 04:23

changhiskhan

v0.2.3

a55f929

v0.2.3 Bugfix release; breaks dataset proto schema

What's Changed

[C++] Project schema via field Ids and Schema intersection by @eddyxu in #305
when writing in batches, handle all na arrays properly by @changhiskhan in #306
[C++] Use LanceFragment to build I/O exec plan by @eddyxu in #307
[CI] Fix Github Action warning to upgrade nodejs 12 based actions by @eddyxu in #309
Update README.md by @changhiskhan in #310
Temporarily pin duckdb to 0.5.1 by @changhiskhan in #313
Notebook for new blog post on versioning by @changhiskhan in #311
[C++] Fix reading dictionary values from manifest files by @eddyxu in #314

Full Changelog: v0.2.2...v0.2.3

Contributors

eddyxu and changhiskhan

09 Nov 17:25

eddyxu

v0.2.2

8a9d736

v0.2.2 Python notebooks and CV dataset conversion.

What's Changed

[DOC] Update README.md by @jaichopra in #294
[DUCKDB] Script to upload lance extension zip by @changhiskhan in #295
[C++] Scan Node reads multiple files by @eddyxu in #300
[Python] Add lance.util.duckdb to help install the extension transparently by @changhiskhan in #301
[Python] Notebook fixes by @changhiskhan in #303
[Python] Make dataset conversion a feature by @changhiskhan in #304

Full Changelog: v0.2.1...v0.2.2

Contributors

eddyxu, changhiskhan, and jaichopra

04 Nov 22:41

changhiskhan

v0.2.1

70d72fb

v0.2.1 Bug fix release

Fixed a bug affecting writes of fixed size list arrays, as well as the datagen code for COCO.
Updated to the newly released Arrow 10.0.

What's Changed

remove duplicate test_mac.sh by @changhiskhan in #284
Fix build on intel mac by @eddyxu in #286
[C++] Fix write fixed list array bug by @eddyxu in #288
Upgrade Apache Arrow to 10.0 by @eddyxu in #266
temporary hack to fix pytorch loader until it can handle a versioned … by @changhiskhan in #293
fix image_id alignment in coco datagen by @changhiskhan in #289

Full Changelog: v0.2.0...v0.2.1

Contributors

eddyxu and changhiskhan

02 Nov 19:38

eddyxu

v0.2.0

f4eaa21

v0.2.0 Dataset Versioning, DuckDB extension built with CUDA

Highlights

Lance Dataset versioning support
DuckDB extension supports building against PyTorch with CUDA
Revamp README and documentation.

What's Changed

Fetch Dataset Versions by @eddyxu in #272
Readability improvement for metadata class by @Renkai in #275
[DuckDB] Enable DuckDB extension to build/run on CUDA-enabled PyTorch by @changhiskhan in #273
[Python] Support multi-versioned dataset by @eddyxu in #278
[Document] Add logo/README refresh by @jaichopra in #279
[Python] Fetch dataset versions. by @eddyxu in #280
[Python] Support group size and rows_per_file customization via write_dataset API by @eddyxu in #281
[Python] use new write API in python benchmark by @eddyxu in #282

Full Changelog: v0.1.5...v0.2.0

Contributors

eddyxu, changhiskhan, and 2 other contributors

Releases: lancedb/lance
