Ingest -> Rollback - Ingest comes to different results. (kafka topic ingestion) #6733
Comments
There is a crude workaround for this bug: changing platform_instance: ******** |
@YuriyGavrilov Did you have any transformer in your recipe? |
@treff7es Sure sink: |
We recently made some fixes to the transformers setup in acryl-datahub v0.9.3.2. Could you try bumping your cli version? https://datahubproject.io/docs/ui-ingestion/#advanced-running-with-a-specific-cli-version |
OK, but it will take some time. We decided to wait for 0.9.4. I will report the results here. |
@hsheth2 Rollback works perfectly for the new ingestion. One thing that still disappoints me is that data from the previous ingestion runs is still present. I rolled back everything but still see data, and I don't know how to remove it properly. If I change "platform_instance: Kafka.Pss" back to upper case, "platform_instance: Kafka.PSS", I get duplicated data. So currently there is no proper way to roll back "everything" without duplicating data. |
Maybe I need to switch the runtime environment back to the old version and do the rollback there. I will try it. |
No, that doesn't help. After all the rollbacks there should be no data left, but I still see it. |
@YuriyGavrilov what do you mean by this, if there's still data present from previous runs?
If you'd like to delete everything in kafka and start over, you can use |
Yes - changing platform instance will cause all new entities to be minted. That being said, previous rollback should still end up clearing all data. However, it will take some time for all deletes to be applied (can take up to 30 minutes). |
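To illustrate why a case change in platform_instance mints new entities, here is a minimal sketch. It assumes the dataset URN embeds the platform instance as a prefix of the dataset name; the helper `sketch_dataset_urn`, the topic name `orders`, and the `PROD` environment are hypothetical and only for illustration. This is not the DataHub library's own code.

```python
from typing import Optional


def sketch_dataset_urn(
    platform: str,
    name: str,
    env: str = "PROD",  # assumed environment, for illustration only
    platform_instance: Optional[str] = None,
) -> str:
    """Simplified, hypothetical URN builder (not the DataHub implementation)."""
    # Assumption: the platform instance is prepended to the dataset name.
    qualified = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{qualified},{env})"


# "orders" is a made-up topic name; the two instances differ only in case.
old_urn = sketch_dataset_urn("kafka", "orders", platform_instance="Kafka.PSS")
new_urn = sketch_dataset_urn("kafka", "orders", platform_instance="Kafka.Pss")

print(old_urn)
print(new_urn)
print("same entity:", old_urn == new_urn)
```

Because the two strings differ, the second ingestion would create a separate set of entities instead of updating the first set, which matches the duplicated data described above.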
@YuriyGavrilov are you still experiencing issues with rollback? If not, I might close this issue as it's been without activity for some time now |
@hsheth2 Yes, the data is still present after I roll back everything. Thanks for "datahub delete --platform kafka". |
@chriscollins3456 I think this case could be closed, since there is an existing way to delete everything for a platform (datahub delete --platform kafka). |
This is what I receive after running datahub delete --platform kafka:
datahub delete --platform kafka
Not Found\n\n\n For request 'POST /entities?action=search'\n \n\n \n\n'
[2023-01-10 17:55:58,133] ERROR {datahub.entrypoints:213} - Command failed: 404 Client Error: Not Found for url: https://datahub.uat.edp.s7.aero/entities?action=search
Traceback (most recent call last):
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/entrypoints.py", line 171, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/upgrade/upgrade.py", line 385, in async_wrapper
    loop.run_until_complete(run_func_check_upgrade())
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/upgrade/upgrade.py", line 372, in run_func_check_upgrade
    ret = await the_one_future
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/upgrade/upgrade.py", line 365, in run_inner_func
    return await loop.run_in_executor(
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/telemetry/telemetry.py", line 344, in wrapper
    raise e
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/telemetry/telemetry.py", line 296, in wrapper
    res = func(*args, **kwargs)
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/cli/delete_cli.py", line 235, in delete
    deletion_result = delete_with_filters(
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/telemetry/telemetry.py", line 344, in wrapper
    raise e
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/telemetry/telemetry.py", line 296, in wrapper
    res = func(*args, **kwargs)
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/cli/delete_cli.py", line 292, in delete_with_filters
    urns = list(
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/datahub/cli/cli_utils.py", line 419, in get_urns_by_filter
    response.raise_for_status()
  File "/home/yuandronnikov/venv/datahub-env/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://datahub.uat.edp.s7.aero/entities?action=search |
@YuriyGavrilov your CLI should be configured with the address of datahub-gms, not the address of the datahub-frontend. That's likely why it's giving a 404 not found error |
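A rough sketch of why the configured server address matters: judging by the traceback above, the CLI appends `/entities?action=search` to whatever server address it was given. The helper `search_endpoint` and the GMS host name and port below are hypothetical placeholders, not values from this deployment or functions from the DataHub CLI.

```python
def search_endpoint(server: str) -> str:
    """Hypothetical helper: build the search URL the way the traceback suggests."""
    # Strip a trailing slash so the path is not doubled.
    return f"{server.rstrip('/')}/entities?action=search"


# Address taken from the error message in this thread (the frontend).
frontend_server = "https://datahub.uat.edp.s7.aero"
# Hypothetical GMS address; the real host and port depend on the deployment.
gms_server = "http://datahub-gms.example.internal:8080"

print(search_endpoint(frontend_server))  # the URL that returned 404
print(search_endpoint(gms_server))       # the kind of URL the CLI should target
```

The first URL is exactly the one reported in the 404 error, which is consistent with the CLI having been pointed at the frontend rather than at GMS.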
Thanks @hsheth2, I actually don't know yet how to do that. |
There is also a linked bug for the CLI delete option: #5992 |
@YuriyGavrilov that bug is with the delete command, but this issue seems to be with rollback, so my understanding is that the two issues are different? |
@hsheth2 yes it is different |
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io |
ok let's close it. |
Describe the bug
There is an ingestion for a Kafka topic. Everything was normal and nothing foreshadowed trouble. The ingestion ran successfully but updated zero records because it had no read access rights to the topic (pic1).
After adding access rights, everything works fine.
20 topics were added. The problem appears after rolling back these 20 topics and running the ingestion again. The log shows 20 errors, exactly the same number as topics, and the overall log reports zero records updated.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Ingest -> Rollback -> Ingest should produce the same results.
There should be no errors in the log like ... for each updated entity.
Screenshots
Desktop (please complete the following information):
Additional context