EntitySet.add_relationship should error if the child variable is also the index variable of the child entity #1009
Comments
Hi @MaratSaidov, thanks for the bug report. Happy to help you with these issues. Could you share the code you used to create the entityset, and attach the structure image to your comment directly?
Hi @rwedge, thank you for your help. Here is the structure of my EntitySet. Here is the code which creates this structure:
Then I call the Deep Feature Synthesis function:
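As a sketch of what such a DFS run computes (the table and column names below are illustrative assumptions, not the reporter's actual dataset), `ft.dfs` with `target_entity='customers'` automatically builds per-customer aggregations like this pandas equivalent:

```python
import pandas as pd

# Illustrative stand-ins for a customers -> orders schema; these
# names are assumptions, not the reporter's actual tables.
customers = pd.DataFrame({"customer_id": [1, 2]})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "customer_id": [1, 1, 2],
    "price": [5.0, 7.0, 3.0],
})

# Hand-rolled versions of features DFS would derive for
# target_entity='customers': COUNT(orders) and SUM(orders.price).
feature_matrix = customers.set_index("customer_id").join(
    orders.groupby("customer_id").agg(
        count_orders=("order_id", "count"),
        sum_price=("price", "sum"),
    )
)
print(feature_matrix)
```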
The dataframes described above are attached.
Hi @MaratSaidov, I made a mock dataset and was able to replicate the warnings you were getting, so we can investigate those. I didn't encounter the ValueError you reported. Are you able to share the data / code you used to create the EntitySet? A reproducible example would help a lot with diagnosing this. Other information that would help:
Hi @rwedge, I narrowed the problem down a bit. Consider two structures. This input is suitable for a number of DFS runs. However, if I try to add an edgeless table (one with no relationships) to this structure, I get the ValueError described above. Here you can find a bug example notebook: Bug Example. The data tables are placed here: data. The original dataset can be found here: dataset. Notes about the other information:
Thank you!
Hi @MaratSaidov, thanks for the example notebook! I was able to reproduce the error and figure out what was causing it. In both of the graphs you included in your last post, the index column of the order_items entity is also the variable relating it to its parent, which is what triggers the error. However, we can fix this relationship problem. Instead of using order_id as the index of order_items, we can have featuretools create a new unique index. We can remove one preprocessing step:

```python
# choose only unique indices
print("olist_order_items.shape:", olist_order_items.shape)
olist_order_items = olist_order_items.iloc[olist_order_items.drop_duplicates(['order_id']).index, :]
print("olist_order_items.shape:", olist_order_items.shape)
```

I removed the above code because it drops rows; with a different index than order_id, we no longer need each row to have a unique order_id. Then we need to add a new index:

```python
es = es.entity_from_dataframe(entity_id='order_items',
                              dataframe=olist_order_items,
                              index='order_item_unique_id',
                              make_index=True)
```

After making those changes I was able to run DFS without the error. I'm changing the name of the issue to reflect what we should do to fix it in the future, which is to prevent adding a relationship where the child variable is also the index of the child entity.
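The guard proposed in the retitled issue could look roughly like this. EntitySet internals are simplified to plain classes here, so this is a sketch of the check, not featuretools' actual implementation:

```python
class Entity:
    def __init__(self, entity_id, index):
        self.id = entity_id
        self.index = index  # name of the index variable

class EntitySet:
    def __init__(self):
        self.entities = {}
        self.relationships = []

    def add_entity(self, entity):
        self.entities[entity.id] = entity

    def add_relationship(self, parent_id, parent_var, child_id, child_var):
        child = self.entities[child_id]
        # The fix discussed above: refuse relationships whose child
        # variable is also the child entity's index, since an index
        # must be unique and so cannot act as a many-to-one foreign key.
        if child_var == child.index:
            raise ValueError(
                f"Unable to add relationship: {child_var!r} is the index "
                f"variable of entity {child_id!r}"
            )
        self.relationships.append((parent_id, parent_var, child_id, child_var))
```

With a separate unique index on the child (as in the `make_index=True` fix above), the same relationship would be accepted.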
Deep Feature Synthesis Error
I am trying to apply DFS to my `EntitySet`. It is simple and has the following structure: Image of Structure.
However, `ft.dfs` is applicable neither to `target_entity='orders'` nor to `target_entity='customers'`. I found an open issue in your repository: issue. But it is still open and I can't figure out how to fix the errors which appear with both target entities.
Bug/Feature Request Description
There are warnings of the following kind:
WARNING Attempting to add feature <Feature: customer_zip_code_prefix / 1> which is already present. This is likely a bug.
The error message:
ValueError: 'order_id' is both an index level and a column label, which is ambiguous.
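This ValueError comes from pandas itself, not from featuretools: pandas raises it whenever a grouping or join key exists both as a column and as an index level. A minimal reproduction (independent of the reporter's data) that mimics an entity whose index doubles as the relationship key:

```python
import pandas as pd

# Keep 'order_id' as a column while also making it the index
# (drop=False), so the name is both an index level and a column.
df = pd.DataFrame({"order_id": [1, 1, 2], "amount": [5.0, 7.0, 3.0]})
df = df.set_index("order_id", drop=False)

try:
    df.groupby("order_id").sum()
except ValueError as err:
    # 'order_id' is both an index level and a column label, which is ambiguous.
    print(err)
```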
Expectations
How can I get a `feature_matrix_spec` here without any errors or warnings? Thanks for your time!
Output of `featuretools.show_info()`
Featuretools version: 0.15.0
Featuretools installation directory: /usr/local/lib/python3.6/dist-packages/featuretools
SYSTEM INFO
python: 3.6.9.final.0
python-bits: 64
OS: Linux
OS-release: 4.19.104+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
INSTALLED VERSIONS
numpy: 1.18.4
pandas: 1.0.4
tqdm: 4.41.1
PyYAML: 3.13
cloudpickle: 1.3.0
dask: 2.12.0
distributed: 1.25.3
psutil: 5.4.8
pip: 19.3.1
setuptools: 47.1.1