Add value error if both d sae and expansion factor set #301

anthonyduong9 · 2024-09-24T20:32:50Z

Description

Changes LanguageModelSAERunnerConfig so if d_sae and expansion_factor are not None, __post_init__() raises a ValueError and if d_sae and expansion_factor are None, __post_init__() sets expansion_factor to 4. d_sae and expansion_factor are both attributes to set the size of an SAE, and expansion_factor has a default value of 4, for backward compatibility.

Fixes #47

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)

Checklist:

I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have not rewritten tests relating to key interfaces which would affect backward compatibility

You have tested formatting, typing and unit tests (acceptance tests not currently in use)

I have run make check-ci to check format and linting. (you can run make format to format code if needed.)

Performance Check.

If you have implemented a training change, please indicate precisely how performance changes with respect to the following metrics:

L0
CE Loss
MSE Loss
Feature Dashboard Interpretability

Please links to wandb dashboards with a control and test group.

chanind

The core code changes look good, but why were unrelated test values changed?

chanind · 2024-09-25T21:04:17Z

tests/unit/helpers.py

-    # useful for checking the correctness of the configuration
-    # in the tests.
-    mock_config.__post_init__()
+    mock_config = LanguageModelSAERunnerConfig(**mock_config_dict)


fair enough! This feels less hacky than doing __post_init__() like before 👍

Haha, yeah, we shouldn't effectively call __post_init__() twice.

chanind · 2024-09-25T21:04:50Z

tests/unit/training/test_activations_store.py

@@ -28,7 +28,6 @@ def tokenize_with_bos(model: HookedTransformer, text: str) -> list[int]:
        {
            "model_name": "tiny-stories-1M",
            "dataset_path": "roneneldan/TinyStories",
-            "tokenized": False,


Why were these values changed?

Because build_sae_cfg() now takes a dict, updates another dict, and then passes it to a dataclass, whereas before it instantiated a dataclass, then looped over the dict and called setattr() to set the dataclass attributes. So now, since "tokenized" doesn't match any attributes of LanguageModelSAERunnerConfig, its __init__() method would return an error/tests would fail.

I checked throughout the codebase that there are no references to a field tokenized for any instance of LanguageModelSAERunnerConfig.

oooh nice, so if we mistype attributes in tests now we'll get an error? 🥇

chanind · 2024-09-25T21:06:40Z

tests/unit/training/test_sae_basic.py

@@ -202,7 +198,8 @@ def test_sae_save_and_load_from_pretrained_gated(tmp_path: Path) -> None:

 def test_sae_save_and_load_from_pretrained_topk(tmp_path: Path) -> None:
    cfg = build_sae_cfg(
-        activation_fn_str="topk", activation_fn_kwargs={"k": 30}, device="cpu"


why was this changed?

For similar reasons as in this comment. activation_fn_str isn't an attribute of LanguageModelSAERunnerConfig.

chanind · 2024-09-25T21:08:54Z

tests/unit/training/test_sae_trainer.py

@@ -204,7 +204,6 @@ def test_train_sae_group_on_language_model__runs(
    checkpoint_dir = tmp_path / "checkpoint"
    cfg = build_sae_cfg(
        checkpoint_path=str(checkpoint_dir),
-        train_batch_size=32,


Why was this changed?

For similar reasons as in this comment. train_batch_size isn't an attribute of LanguageModelSAERunnerConfig.

chanind · 2024-09-25T22:54:31Z

Awesome work with this, and thanks for fixing the issues with tests have invalid config fields set!

…us#301) * adds ValueError if both d_sae and expansion_factor set * renames class * removes commented out line

anthonyduong9 added 3 commits September 25, 2024 10:06

adds ValueError if both d_sae and expansion_factor set

f1ee575

renames class

d77c0ca

removes commented out line

3d1b263

anthonyduong9 force-pushed the add-ValueError-if-both-d_sae-and-expansion_factor-set branch from adfe458 to 3d1b263 Compare September 25, 2024 17:08

anthonyduong9 marked this pull request as ready for review September 25, 2024 17:14

chanind reviewed Sep 25, 2024

View reviewed changes

anthonyduong9 requested a review from chanind September 25, 2024 22:41

chanind merged commit 999ffe8 into jbloomAus:main Sep 25, 2024
5 checks passed

zhenningdavidliu pushed a commit to decandido/SAELens that referenced this pull request Oct 4, 2024

feat: Add value error if both d sae and expansion factor set (jbloomA…

5d8379c

…us#301) * adds ValueError if both d_sae and expansion_factor set * renames class * removes commented out line

tom-pollak pushed a commit to tom-pollak/SAELens that referenced this pull request Oct 22, 2024

feat: Add value error if both d sae and expansion factor set (jbloomA…

e5c1e7a

…us#301) * adds ValueError if both d_sae and expansion_factor set * renames class * removes commented out line

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add value error if both d sae and expansion factor set #301

Add value error if both d sae and expansion factor set #301

anthonyduong9 commented Sep 24, 2024 •

edited

Loading

chanind left a comment

chanind Sep 25, 2024

anthonyduong9 Sep 25, 2024

chanind Sep 25, 2024

anthonyduong9 Sep 25, 2024

chanind Sep 25, 2024

chanind Sep 25, 2024

anthonyduong9 Sep 25, 2024 •

edited

Loading

chanind Sep 25, 2024

anthonyduong9 Sep 25, 2024 •

edited

Loading

chanind commented Sep 25, 2024

Add value error if both d sae and expansion factor set #301

Add value error if both d sae and expansion factor set #301

Conversation

anthonyduong9 commented Sep 24, 2024 • edited Loading

Description

Type of change

Checklist:

You have tested formatting, typing and unit tests (acceptance tests not currently in use)

Performance Check.

chanind left a comment

Choose a reason for hiding this comment

chanind Sep 25, 2024

Choose a reason for hiding this comment

anthonyduong9 Sep 25, 2024

Choose a reason for hiding this comment

chanind Sep 25, 2024

Choose a reason for hiding this comment

anthonyduong9 Sep 25, 2024

Choose a reason for hiding this comment

chanind Sep 25, 2024

Choose a reason for hiding this comment

chanind Sep 25, 2024

Choose a reason for hiding this comment

anthonyduong9 Sep 25, 2024 • edited Loading

Choose a reason for hiding this comment

chanind Sep 25, 2024

Choose a reason for hiding this comment

anthonyduong9 Sep 25, 2024 • edited Loading

Choose a reason for hiding this comment

chanind commented Sep 25, 2024

anthonyduong9 commented Sep 24, 2024 •

edited

Loading

anthonyduong9 Sep 25, 2024 •

edited

Loading

anthonyduong9 Sep 25, 2024 •

edited

Loading