Add fittable #140

stephantul · 2024-12-23T13:31:59Z

No description provided.

codecov · 2024-12-23T13:33:46Z

Codecov Report

Attention: Patch coverage is 97.79736% with 10 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| model2vec/inference/model.py | 90.47% | 6 Missing ⚠️ |
| model2vec/train/classifier.py | 97.38% | 4 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| model2vec/inference/__init__.py | `100.00% <100.00%> (ø)` |
| model2vec/model.py | `94.55% <100.00%> (ø)` |
| model2vec/train/__init__.py | `100.00% <100.00%> (ø)` |
| model2vec/train/base.py | `100.00% <100.00%> (ø)` |
| model2vec/utils.py | `92.53% <ø> (ø)` |
| tests/conftest.py | `100.00% <100.00%> (ø)` |
| tests/test_inference.py | `100.00% <100.00%> (ø)` |
| tests/test_trainable.py | `100.00% <100.00%> (ø)` |
| model2vec/train/classifier.py | `97.38% <97.38%> (ø)` |
| model2vec/inference/model.py | `90.47% <90.47%> (ø)` |

... and 1 file with indirect coverage changes
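
As a sanity check on the report, the sketch below recomputes the patch percentages from the missing-line counts. The line totals used (454 overall, 63 and 153 per file) are not stated in the report; they are inferred here as totals consistent with the reported percentages and should be read as assumptions. Truncating to two decimals is also an assumption, chosen because it reproduces the per-file figures.

```python
import math


def patch_coverage(total_lines: int, missing_lines: int) -> float:
    """Return the percentage of changed lines that are covered by tests."""
    return 100.0 * (total_lines - missing_lines) / total_lines


def truncate(value: float, decimals: int = 2) -> float:
    """Cut a value off after `decimals` places without rounding up."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


# Assumed totals: inferred from the percentages, not taken from the report.
overall = patch_coverage(total_lines=454, missing_lines=10)
inference_model = patch_coverage(total_lines=63, missing_lines=6)
classifier = patch_coverage(total_lines=153, missing_lines=4)

print(f"overall patch coverage: {overall:.5f}%")
print(f"model2vec/inference/model.py: {truncate(inference_model)}%")
print(f"model2vec/train/classifier.py: {truncate(classifier)}%")
```

Under these assumed totals, the two files with missing lines account for 216 of the 454 changed lines, which would leave the remaining changed lines fully covered.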

Pringled

Looks good! Some minor comments and suggestions.

model2vec/train/classifier.py

model2vec/train/base.py

pyproject.toml

model2vec/train/classifier.py

davidberenstein1957

Looks super useful. Left some comments. Also, perhaps we can add some reference to multi-label usage somewhere?

model2vec/model.py

davidberenstein1957 · 2025-01-26T09:00:15Z

model2vec/inference/model.py

+        """Save the model to a folder."""
+        save_pipeline(self, path)
+
+    def push_to_hub(self, repo_id: str, token: str | None = None, private: bool = False) -> None:


I would add a model card and perhaps tags or a library reference; this helps a lot with visibility, usability, and findability.

https://huggingface.co/docs/hub/model-cards#specifying-a-library

stephantul · Jan 26, 2025

This already happens: we push the underlying static model to the Hub, and that model has a model card. The model card template is specified in the root of the code.
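
For context on the mechanism behind the linked page: the Hub reads a YAML metadata block at the top of a repository's README.md, and a `library_name` entry in that block is what associates a model with a library. The sketch below renders such a block using only the standard library. The helper name `render_model_card`, the tag values, and the card body are illustrative assumptions, not the template that model2vec ships.

```python
def render_model_card(library_name: str, tags: list[str], body: str) -> str:
    """Build README.md text: YAML metadata block first, then the card body."""
    lines = ["---", f"library_name: {library_name}", "tags:"]
    lines += [f"- {tag}" for tag in tags]
    lines += ["---", "", body]
    return "\n".join(lines) + "\n"


# Illustrative values only; the real card would come from the project's template.
card = render_model_card(
    library_name="model2vec",
    tags=["embeddings", "static-embeddings"],
    body="# Example classifier\n\nShort description of the model goes here.",
)
print(card)
```

Writing the returned string to README.md in the pushed folder is what would make the metadata visible on the Hub.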

davidberenstein1957 · 2025-01-26T09:01:25Z

model2vec/inference/model.py

+        self.head = head
+
+    @classmethod
+    def from_pretrained(


Can't we load it from the Hub? Perhaps we should align the arguments a bit with the transformers naming, given that you've also adopted from_pretrained.

For example, using pretrained_model_name_or_path: https://huggingface.co/docs/transformers/v4.48.0/en/model_doc/auto#transformers.AutoTokenizer.from_pretrained

stephantul · Jan 26, 2025

from_pretrained loads from the Hub. The arguments mimic those of StaticModel; they don't match transformers exactly, but we're wary of introducing breaking changes.
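
The trade-off in this exchange (transformers-style naming versus not breaking existing callers) can be illustrated with a purely hypothetical sketch: a loader keeps its existing first argument and also accepts `pretrained_model_name_or_path` as a keyword-only alias. The class `ExamplePipeline`, the argument name `path`, and the behaviour shown are invented for illustration; this is not model2vec's implementation, and nothing in this PR adopts it.

```python
from __future__ import annotations


class ExamplePipeline:
    """Hypothetical stand-in used only to illustrate argument aliasing."""

    def __init__(self, source: str, token: str | None = None) -> None:
        self.source = source
        self.token = token

    @classmethod
    def from_pretrained(
        cls,
        path: str | None = None,
        token: str | None = None,
        *,
        pretrained_model_name_or_path: str | None = None,
    ) -> ExamplePipeline:
        # Reject ambiguous calls that pass both spellings.
        if path is not None and pretrained_model_name_or_path is not None:
            raise TypeError("Pass either `path` or `pretrained_model_name_or_path`, not both.")
        source = path if path is not None else pretrained_model_name_or_path
        if source is None:
            raise TypeError("A local folder or Hub repository id is required.")
        # A real loader would read files from disk or from the Hub here.
        return cls(source, token=token)


# Existing call sites keep working, and the transformers-style keyword is accepted too.
old_style = ExamplePipeline.from_pretrained("user/repo")
new_style = ExamplePipeline.from_pretrained(pretrained_model_name_or_path="user/repo")
print(old_style.source == new_style.source)
```

The alias avoids a breaking change at the cost of two spellings for one argument, which is one reason a project might reasonably decline it.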

model2vec/inference/model.py

model2vec/train/README.md

model2vec/train/classifier.py

stephantul added 11 commits December 22, 2024 12:38

Fix tokenizer issue

4078a3b

fix issue with warning

09f888d

regenerate lock file

2167a4e

fix lock file

c95dca5

Try to not select 2.5.1

b5d8bb7

fix: issue with dividers in utils

3e68669

Try to not select 2.5.0

1ae4d61

fix: do not up version

1349b0c

Attempt special fix

4b83d59

merge

9515b83

feat: add training

dfd865b

stephantul added 8 commits December 23, 2024 14:38

merge with old

c4ba272

fix: no grad

4713bfa

use numpy

e8058bb

Add train_test_split

a59127e

fix: issue with fit not resetting

310fbb5

feat: add lightning

b1899d1

merge

e27f9dc

Fix bugs

8df3aaf

stephantul marked this pull request as ready for review January 3, 2025 20:22

stephantul requested a review from Pringled January 3, 2025 20:22

Pringled requested changes Jan 4, 2025

stephantul added 2 commits January 5, 2025 16:07

fix: reviewer comments

839d88a

fix train issue

8457357

Pringled approved these changes Jan 5, 2025

stephantul added 4 commits January 7, 2025 17:21

fix issue with trainer

a750709

fix: truncate during training

e83c54e

feat: tokenize maximum length truncation

803565d

fixes

9052806

stephantul added 23 commits January 8, 2025 10:03

typo

2f9fbf4

Add progressbar

f1e08c3

small code changes, add docs

bb54a76

fix training comments

69ee4ee

Merge branch 'main' into add-fittable

9962be7

Add pipeline saving

ffec235

fix bug

0af84fc

fix issue with normalize test

c829745

change default batch size

9ce65a1

feat: add sklearn skops pipeline

e1169fb

Device handling and automatic batch size

f096824

Add docstrings, defaults

ff3ebdf

docs

b4e966a

fix: rename

8f65bfd

fix: rename

8cdb668

fix installation

e96a72a

rename

3e76083

Add training tutorial

9f1cb5a

Add tutorial link

e2d92b9

Merge branch 'main' into add-fittable

657cef0

test: add tests

773009f

fix tests

7015341

tests: fix tests

8ab8456

stephantul requested a review from Pringled January 24, 2025 18:49

davidberenstein1957 reviewed Jan 26, 2025

stephantul added 5 commits January 26, 2025 13:26

Address comments

e21e61f

Add inference reqs to train reqs

ff75af9

fix normalize

87de7c4

update lock file

1fb33f1

Merge branch 'main' into add-fittable

59f0076
