Commit
Refactor metasyn and make CLI more feature complete (#227)
A big refactor with some breaking changes:

- In `MetaFrame.from_dataframe()`, `spec` becomes `var_specs` and changes from a dictionary to a list.
- `MetaFrame`: A new `from_config` classmethod to create a datafree MetaFrame.
- CLI interface is now feature complete using .toml files as input for the configuration.
- CLI: For the `create_meta` method, the input and output parameters are now optional.
- `UniformDistribution` has its parameters changed to `low` and `high` to conform with the discrete version.
- `MetaVar`: The `detect` method is gone and you should now either use the `fit` method or the `__init__` itself.
- `MetaVar`:
- `Privacy`: Privacy classes can and should now be dynamically loaded using entry points.
- `MetaConfig`: A new configuration object that is used to parse and check toml/dictionaries.
- `VarConfig`: A new variable configuration object.
- `VarConfigAccess`: An accessor for the configuration object that takes into account defaults from the meta_config.
- `DistributionSpec`: A new distribution configuration class that can parse and check configurations.
- `DistributionProviderList`: A new `create` method to allow for datafree creation of distributions.
- `fit_kwargs` is now in the `DistributionSpec`
- `unique` is now in the `DistributionSpec`
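To illustrate the first breaking change, here is a minimal sketch of migrating a dictionary-style `spec` to a list-style `var_specs`. The entry shape (plain dictionaries with a `name` key and `unique` nested under `distribution`) is an assumption modelled on the example .toml file in this commit, and `spec_to_var_specs` is a hypothetical helper, not part of metasyn.

```python
def spec_to_var_specs(spec: dict[str, dict]) -> list[dict]:
    """Hypothetical helper: turn {column: options} into a list of per-variable dicts.

    Assumes each list entry carries its column under "name" and that `unique`
    now lives inside the "distribution" mapping, as in the example .toml file.
    """
    var_specs = []
    for name, options in spec.items():
        remaining = dict(options)  # copy so the caller's dict is not mutated
        entry = {"name": name}
        # Assumes any existing "distribution" option is itself a mapping.
        dist = dict(remaining.pop("distribution", {}))
        if "unique" in remaining:
            dist["unique"] = remaining.pop("unique")
        if dist:
            entry["distribution"] = dist
        entry.update(remaining)  # carry over everything else unchanged
        var_specs.append(entry)
    return var_specs


# Old-style input (illustrative values only)
old_spec = {"PassengerId": {"unique": True}, "Name": {"prop_missing": 0.1}}
var_specs = spec_to_var_specs(old_spec)
```

Whether `from_dataframe()` accepts plain dictionaries or requires dedicated specification objects is not stated in the commit message, so check the changed files before relying on this shape.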
Showing 33 changed files with 969 additions and 409 deletions.

    @@ -1,4 +1,5 @@
    import polars as pl

    from metasyn import MetaFrame

    # example dataframe from polars website


    @@ -0,0 +1,32 @@
    # Example toml file as input for metasyn

    [general]
    dist_providers = ["builtin", "metasyn-disclosure"]

    [general.privacy]
    name = "disclosure"
    parameters = {n_avg = 11}


    [[var]]
    name = "PassengerId"
    distribution = {unique = true}  # Notice booleans are lower case in .toml files.

    [[var]]
    name = "Name"
    prop_missing = 0.1
    description = "Name of the unfortunate passenger of the titanic."
    distribution = {implements = "core.faker", parameters = {faker_type = "name", locale = "en_US"}}

    [[var]]
    name = "Fare"
    distribution = {implements = "core.exponential"}

    [[var]]
    name = "Age"
    distribution = {implements = "core.uniform", parameters = {low = 20, high = 40}}

    [[var]]
    name = "Cabin"
    distribution = {implements = "core.regex", parameters = {regex_data = "[A-F][0-9]{2,3}"}}
    privacy = {name = "disclosure", parameters = {n_avg = 21}}
