Forbidden Relations of Disabled Hyperparameters #397

bbudescu · 2024-11-19T17:53:41Z

Avoid evaluating forbidden relations on HPs that were disabled/not sampled because of other conditions that were not met, since retrieving int values (for category indices) from nans fails.

Fixes #396 (also see automl/SMAC3#1161).

…ampled because of conditions that were not met

eddiebergman

Hi, thanks a lot for the PR for the issue you raised. I'm sorry I did not get back to you but I appreciate the initiative to go and fix it!

Happy to merge this, but would you mind adding a single test-case to make sure this regression doesn't occur again?

…forbidden relations on conditioned parameters - regression test for crash

- finalize regression test

bbudescu · 2024-11-20T07:24:37Z

Hi! Thanks for replying. I added a unit test for the fixes I wrote. I hope they make sense, I'm not intimately familiar with the architecture of the project, but this is where I thought it made most sense to add these nan checks, at a first sight. I wasn't sure whether this was supposed to be addressed, e.g., at a lower level, by replacing nans with some invalid value for array indices, like a negative value or np.iinfo.max, or some other custom placeholder.

bbudescu · 2024-11-20T07:41:02Z

Also, since I caught your attention here :), I was hoping that perhaps you're familiar with another question I have about SMAC3 (I asked it on that project's issue board here automl/SMAC3#1163 but nobody answered for a couple of days, and I couldn't find any other place to ask, like a forum or chat).

Do you know whether one could run SMAC3 without anaconda? I mean, in the docs, it's listed as the preferred installation environment, but still, is there a chance I could avoid this dependency? Does it depend on any custom package build specific for conda or something? Does it crash or is its performance affected in any significant way?

I mean, I tried running the tests both with and without conda, and the same tests (2 or 3 of them) crash in both environments.

Thanks, and sorry for asking this here.

eddiebergman · 2024-11-20T17:10:40Z

Hiyo, for now let's just consider where you implemented it as being correct :)

There's probably a better place to put it but I unfortunatly don't have the bandwidth to look into it and benchmark it at the moment.

As for your SMAC question, I will answer there!

eddiebergman · 2024-11-20T17:31:11Z

I will just run these tests and then merge it in and make a release :) Might not check back on it tonight but will do tomorrow at latest!

eddiebergman · 2024-11-20T17:33:45Z

Sorry last thing that seems to need to be done. Can you run
pip install ruff in your environment and then run ruff --fix src. Should hopefully automatically fix the 4 things in picked up in pre-commit

bbudescu · 2024-11-21T06:41:51Z

Sure. I thought that the ruff formatter suggestion wasn't really readable, but I came up with something that made it pass.

Look, there's one question I have about my fix. Is it possible that the nans encountered actually should be passed through the HP's transformer and cast to get a usable value? I mean, for an OrdinalHyperparameter where the index in the user-provided vector is sampled, nan is definitely not convertible to a valid index, but perhaps converters for other kinds of HPs don't mind receiving nan.

eddiebergman · 2024-11-21T08:55:11Z

Good question, perhaps there's a world where that could be done meaningfully, but in practice, ConfigSpace has a very concrete assumption and it is that np.nan stands for a value which is not active.

Becuase there is conditionality, we need to somehow represent non-active inside a numpy float vector. Really your only options are to use some sentinal value, of which only 3 make sense np.inf, -np.inf or np.nan. Out of these 3, np.nan is the one that was went with.

bbudescu · 2024-11-21T09:09:44Z

Good question, perhaps there's a world where that could be done meaningfully, but in practice, ConfigSpace has a very concrete assumption and it is that np.nan stands for a value which is not active.

Becuase there is conditionality, we need to somehow represent non-active inside a numpy float vector. Really your only options are to use some sentinal value, of which only 3 make sense np.inf, -np.inf or np.nan. Out of these 3, np.nan is the one that was went with.

Ok, well, if that convention is clear, than that's what I was concerned about. I didn't know that it has been convened to do so, although I did expect it, as it has the solid arguments that you present. I just wanted to make sure this is the case, as I'm not intimately familiar with the implementation, and I could imagine that, e.g., nan can also be used to denote something else (besides being passed by the user as the value for a Constant or as one of the options for a Categorical). I guess after making some assumptions in the past that seemed obvious, and turned out to not hold later on, I prefer asking explicitly :).

bbudescu · 2024-11-21T09:24:40Z

Sorry last thing that seems to need to be done. Can you run pip install ruff in your environment and then run ruff --fix src. Should hopefully automatically fix the 4 things in picked up in pre-commit

As stated in a previous comment above, I did run ruff on the code. Now, the stage in which the pre-commit fails is ruff-format (which doesn't take a --fix; on ruff version 0.7.4, which I'm using, this is an option only for the ruff check command).

Anyway, I used ruff format src, got its suggestions, modified the code manually until ruff was happy with it and didn't modify code any more in a way in which it becomes less readable (in my opinion), but the CI pipeline still failed in the same ruff-format stage.

I now ran ruff format on the entire project, and it made some welcome modifications to the test code I had added, but it also touched on other files outside the scope of the bug I was addressing. I guess I was supposed to run ruff format test. The formatting suggestions on files in the scripts directory look safe and good to me, though, so perhaps you'd want to keep those, as well. If not, let me know, so I can undo them, and only touch the forbidden code and test files.

eddiebergman · 2024-11-21T10:04:58Z

All good, worst case I merge and just make another small patch to fix it to get this in. Don't want to hold you up :)

eddiebergman · 2024-11-21T10:13:56Z

Merged and fixed on main :)

eddiebergman · 2024-11-21T10:21:03Z

Released in v1.2.1 on PyPI :)

bbudescu · 2024-11-21T11:05:07Z

Released in v1.2.1 on PyPI :)

Thanks a huge lot!

P.S.: btw, I can see in the pre-commit pipeline now fails in the ruff stage. I did run ruff check src and ruff check tst locally, and it came up with a bunch of modifications (including one in the test I wrote, where I think it wants to import stuff in alphabetical order). Should I just push them here?

eddiebergman · 2024-11-21T11:56:27Z

It's alright, I already took care of it, the main branch is passing with it :) Thank you though!

avoid evaluating forbidden relations on HPOs that were disabled/not s…

84899b7

…ampled because of conditions that were not met

eddiebergman reviewed Nov 19, 2024

View reviewed changes

Bogdan Budescu added 3 commits November 20, 2024 08:49

- add nan checks for single vals, not just arrays, before evaluating …

166e8aa

…forbidden relations on conditioned parameters - regression test for crash

- aesthetic refactor: rename for improved legibility

e5c525c

- finalize regression test

test all forbidden relations, not just equals

1673f3a

rearrange some code to pass ruff formatter check

5ebe8e4

run ruff format on the entire project, not just on src subdir

1c207f2

eddiebergman merged commit d944e16 into automl:main Nov 21, 2024
16 of 17 checks passed

bbudescu mentioned this pull request Nov 21, 2024

Forbidden relations of conditioned hyperparameters automl/SMAC3#1161

Closed

bbudescu deleted the fix-forbiddens-depending-on-conditionals branch November 21, 2024 12:01

bbudescu mentioned this pull request Dec 17, 2024

[Feature Request] Objectives RoI automl/SMAC3#1172

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Forbidden Relations of Disabled Hyperparameters #397

Forbidden Relations of Disabled Hyperparameters #397

bbudescu commented Nov 19, 2024

eddiebergman left a comment

bbudescu commented Nov 20, 2024

bbudescu commented Nov 20, 2024

eddiebergman commented Nov 20, 2024

eddiebergman commented Nov 20, 2024

eddiebergman commented Nov 20, 2024

bbudescu commented Nov 21, 2024

eddiebergman commented Nov 21, 2024

bbudescu commented Nov 21, 2024

bbudescu commented Nov 21, 2024 •

edited

Loading

eddiebergman commented Nov 21, 2024

eddiebergman commented Nov 21, 2024

eddiebergman commented Nov 21, 2024 •

edited

Loading

bbudescu commented Nov 21, 2024

eddiebergman commented Nov 21, 2024

Forbidden Relations of Disabled Hyperparameters #397

Forbidden Relations of Disabled Hyperparameters #397

Conversation

bbudescu commented Nov 19, 2024

eddiebergman left a comment

Choose a reason for hiding this comment

bbudescu commented Nov 20, 2024

bbudescu commented Nov 20, 2024

eddiebergman commented Nov 20, 2024

eddiebergman commented Nov 20, 2024

eddiebergman commented Nov 20, 2024

bbudescu commented Nov 21, 2024

eddiebergman commented Nov 21, 2024

bbudescu commented Nov 21, 2024

bbudescu commented Nov 21, 2024 • edited Loading

eddiebergman commented Nov 21, 2024

eddiebergman commented Nov 21, 2024

eddiebergman commented Nov 21, 2024 • edited Loading

bbudescu commented Nov 21, 2024

eddiebergman commented Nov 21, 2024

bbudescu commented Nov 21, 2024 •

edited

Loading

eddiebergman commented Nov 21, 2024 •

edited

Loading