Failed example: scripts/sample_data_loader.py #311

Closed
eleijonmarck opened this issue Mar 1, 2020 · 11 comments

Comments

@eleijonmarck

python 3.6.6
ubuntu 19.10

venv - amundsendatabuilder

When I run through the example at https://github.com/lyft/amundsen/blob/master/docs/installation.md#bootstrap-a-default-version-of-amundsen-using-docker

I get the following error:

~/dev/amundsen/amundsendatabuilder @dac8110d
amundsendatabuilder $ python example/scripts/sample_data_loader.py 

Traceback (most recent call last):
  File "example/scripts/sample_data_loader.py", line 642, in <module>
    load_table_data_from_csv('sample_table_programmatic_source.csv', 'programmatic')
  File "example/scripts/sample_data_loader.py", line 88, in load_table_data_from_csv
    i['description_source']) for i in dr]
  File "example/scripts/sample_data_loader.py", line 88, in <listcomp>
    i['description_source']) for i in dr]
KeyError: 'schema'
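
A `KeyError` like this is what a `csv.DictReader` row raises when the CSV header has no column with that name (here `dr` is presumably such a reader). The following is a minimal sketch, not the loader's actual code, for checking a file's header before loading it. The two required column names are taken from the traceback above; the function name and the sample header are hypothetical.

```python
import csv
import io

# Column names seen in the traceback; the real loader may need more.
REQUIRED_COLUMNS = {"schema", "description_source"}


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return sorted(set(required) - header)


# Hypothetical CSV whose header lacks a 'schema' column.
sample = io.StringIO("database,cluster,name,description_source\nhive,gold,t1,manual\n")
print(missing_columns(sample, REQUIRED_COLUMNS))  # → ['schema']
```

If the list is non-empty, the sample CSV and the loader script are likely out of sync, for example because one of them comes from an older checkout.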
@feng-tao
Member

feng-tao commented Mar 2, 2020

@eleijonmarck it should be fixed by amundsen-io/amundsendatabuilder#199. The git submodules in the amundsen repo will be auto-updated tomorrow.

@feng-tao
Member

feng-tao commented Mar 2, 2020

If you want to try it now, you could also clone the latest lyft/amundsendatabuilder repo.

@feng-tao
Member

feng-tao commented Mar 2, 2020

@eleijonmarck all the git submodules have been updated to the latest. Could you fetch the latest master, retry, and see if it fixes your issue? Thanks.

@eleijonmarck
Author

@feng-tao will do!

@eleijonmarck
Author

@feng-tao currently I am getting:

Traceback (most recent call last):
  File "example/scripts/sample_data_loader.py", line 723, in <module>
    job_es_table.launch()
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.0.2-py3.6.egg/databuilder/job/job.py", line 78, in launch
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.0.2-py3.6.egg/databuilder/job/job.py", line 74, in launch
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.0.2-py3.6.egg/databuilder/publisher/base_publisher.py", line 37, in publish
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.0.2-py3.6.egg/databuilder/publisher/base_publisher.py", line 34, in publish
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.0.2-py3.6.egg/databuilder/publisher/elasticsearch_publisher.py", line 178, in publish_impl
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/client/indices.py", line 107, in create
    "PUT", _make_path(index), params=params, body=body
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/transport.py", line 353, in perform_request
    timeout=timeout,
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 229, in perform_request
    raise ConnectionError("N/A", str(e), e)
elasticsearch.exceptions.ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7fbb06dacda0>: Failed to establish a new connection: [Errno 111] Connection refused) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7fbb06dacda0>: Failed to establish a new connection: [Errno 111] Connection refused)

It might just be my Docker setup having some weird network issues, although I removed all containers, networks and volumes before running.

@jornh
Contributor

jornh commented Mar 11, 2020

@eleijonmarck any updates on this?

As it stands, it looks like the original issue you reported was resolved by the git submodule update mentioned by Tao.

Your second issue (if still relevant) certainly indicates that you had a container networking issue, which is not directly related to the issue title.

You should be able to connect to the Elasticsearch container from your host on port 9200; otherwise, try on a different host. Also make sure your ES container is running. Please provide its logs if you want others to help troubleshoot.
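
A quick way to test that reachability from the host is a plain TCP connect. This is only a sketch: `localhost` is an assumption about where the container's port 9200 is published, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumed address: the Elasticsearch port published on the Docker host.
    print(port_is_open("localhost", 9200))
```

A result of `False` corresponds to the `[Errno 111] Connection refused` in the traceback: nothing is accepting connections on that port, which usually means the container is not running or the port is not published.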

@eleijonmarck
Author

@jornh if I update to latest submodules I am able to see the examples in the interface populated and I am able to look into the test table and the column names and their definitions.

However, I am still getting connection refused errors:

Traceback (most recent call last):
  File "example/scripts/sample_data_loader.py", line 274, in <module>
    job_es_table.launch()
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.3.2-py3.6.egg/databuilder/job/job.py", line 78, in launch
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.3.2-py3.6.egg/databuilder/job/job.py", line 74, in launch
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.3.2-py3.6.egg/databuilder/publisher/base_publisher.py", line 37, in publish
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.3.2-py3.6.egg/databuilder/publisher/base_publisher.py", line 34, in publish
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/amundsen_databuilder-2.3.2-py3.6.egg/databuilder/publisher/elasticsearch_publisher.py", line 182, in publish_impl
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/client/indices.py", line 107, in create
    "PUT", _make_path(index), params=params, body=body
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/transport.py", line 353, in perform_request
    timeout=timeout,
  File "/home/eleijonmarck/dev/amundsen/amundsendatabuilder/venv/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 229, in perform_request
    raise ConnectionError("N/A", str(e), e)
elasticsearch.exceptions.ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7fc7c10ffcc0>: Failed to establish a new connection: [Errno 111] Connection refused) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7fc7c10ffcc0>: Failed to establish a new connection: [Errno 111] Connection refused)

I could close the issue since I am now able to see the interface populated, but the connection refused error still remains.

@jornh
Contributor

jornh commented Mar 12, 2020

Ok got it. So you’re still experiencing the inability of Databuilder to connect to ElasticSearch.

What’s the docker logs output for ElasticSearch? It should reveal what’s up with the container. Can you connect to its port 9200 from curl or a browser?

@rockie-yang

I hit the same issue when using the default Docker configuration on Mac, which has only 2G of memory allocated. Fixed by:

  • change the memory size for docker to 8G
  • start with docker compose again
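
To see how much memory Docker currently has, `docker info --format '{{.MemTotal}}'` prints the total as a byte count. Below is a small sketch that wraps it; it assumes the `docker` CLI is on the PATH, and both helper names are hypothetical.

```python
import subprocess

GIB = 1024 ** 3


def mem_bytes_to_gib(raw):
    """Convert the byte count printed by docker info into GiB."""
    return int(raw.strip()) / GIB


def docker_memory_gib():
    """Ask the Docker daemon for its total memory, in GiB."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.MemTotal}}"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text output; also works on Python 3.6
        check=True,
    )
    return mem_bytes_to_gib(result.stdout)


# Usage (needs a running Docker daemon):
#   print("Docker memory: %.1f GiB" % docker_memory_gib())
```

A value near 2 GiB matches the default described above; after raising the limit in the Docker settings, the number should change accordingly.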

@feng-tao
Member

Thanks @rockie-yang. @eleijonmarck, please try it and see if it fixes your issue.

jornh added a commit to jornh/amundsen that referenced this issue Mar 20, 2020
For context see amundsen-io#311 (comment) and various Slack threads.
@jornh
Contributor

jornh commented Mar 20, 2020

I added some troubleshooting hints in #337

Since we didn't hear back I think this issue is resolved, right? @eleijonmarck @feng-tao

feng-tao pushed a commit that referenced this issue Mar 20, 2020
For context see #311 (comment) and various Slack threads.