Add text size to unstructured profiler #340
AnhTruong commented on Jul 16, 2021
- add capacity to global stats
- add tests
dataprofiler/profilers/utils.py
Outdated
:type data: Union[list, numpy.array, pandas.DataFrame]
:param unit: capacity unit (B, K, M, or G)
:type unit: string
:return: capacity of the input data
Need to update the docstring to get rid of "capacity".
changed
}

# ensure all data are of type str
data = data.apply(str)

# get capacity
base_stats = {"memory_size": utils.get_capacity(data, unit='M')}
This is still called capacity
changed
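For readers following this thread, here is a minimal sketch of what the snippet under review appears to do: cast every entry to text, then record its size in megabytes under the `memory_size` key. It is a plain-list analogue of the pandas `data.apply(str)` call, and the helper names `to_text` and `utf8_megabytes` are hypothetical, not part of the PR.

```python
def to_text(values):
    """Plain-list analogue of `data.apply(str)`: cast every entry to str."""
    return [str(value) for value in values]


def utf8_megabytes(texts):
    """Assumed size measure: total UTF-8 encoded bytes, expressed in MB."""
    total_bytes = sum(len(text.encode("utf-8")) for text in texts)
    return total_bytes / 1024.0 ** 2


# Mixed-type input; non-str entries have no .encode(), hence the cast first.
raw = ["hello", 42, 3.5, None]
texts = to_text(raw)

# Same shape as the dict in the diff, keyed by "memory_size".
base_stats = {"memory_size": utf8_megabytes(texts)}
```

Casting first matters because integers, floats, and `None` cannot be encoded directly; after the cast, every entry contributes a well-defined byte count.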
dataprofiler/profilers/utils.py
Outdated
@@ -504,8 +504,39 @@ def get_memory_size(data, unit='M'):
if unit not in unit_map:
    raise ValueError('Currently only supports the '
                     'memory size unit in {}'.format(list(unit_map.keys())))
capacity = 0
memory_size = 0
Is this tabbed over?
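To make the hunk above easier to follow, here is a rough sketch of how a function with this signature could validate the unit and accumulate a size. Only the unit check and the `memory_size = 0` initialisation come from the diff. The contents of `unit_map`, the UTF-8 byte counting, and the name `memory_size_sketch` are assumptions for illustration, not the PR's actual code.

```python
def memory_size_sketch(data, unit="M"):
    """Hypothetical stand-in for the function in the diff.

    :param data: iterable of text entries
    :param unit: memory size unit (B, K, M, or G)
    :return: memory size of the input data in the requested unit
    """
    # Assumed mapping: each unit is a power of 1024 bytes.
    unit_map = {"B": 0, "K": 1, "M": 2, "G": 3}
    if unit not in unit_map:
        raise ValueError("Currently only supports the "
                         "memory size unit in {}".format(list(unit_map.keys())))

    memory_size = 0
    for text in data:
        # Assumption: size is measured as UTF-8 encoded bytes.
        memory_size += len(str(text).encode("utf-8"))

    return memory_size / 1024.0 ** unit_map[unit]
```

In this sketch `memory_size_sketch(["abc", "de"], unit="B")` counts five bytes, and an unsupported unit such as `"T"` raises `ValueError` before any data is read.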
* add text size
* add error raise for unit
* clean code
* clean test
* fix test
* clean test
* clean test
* clean test
* clean test