Enhance Airflow Logs API to fetch logs from Amazon Cloudwatch with time range #32897

Closed
1 of 2 tasks
rushabh-lokhande opened this issue Jul 27, 2023 · 6 comments · Fixed by #33231

@rushabh-lokhande

rushabh-lokhande commented Jul 27, 2023

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

MWAA Version: 2.4.3
Airflow Version: 2.4.3

Airflow currently fetches logs from CloudWatch without specifying a time range, so when the CloudWatch logs are large and the log streams are old, the Airflow UI cannot display the logs and shows this error message instead:

*** Reading remote log from Cloudwatch log_group: airflow-cdp-airflow243-XXXX-Task log_stream: dag_id=<DAG_NAME>/run_id=scheduled__2023-07-27T07_25_00+00_00/task_id=<TASK_ID>/attempt=1.log.
Could not read remote logs from log_group: airflow-cdp-airflow243-XXXXXX-Task log_group: airflow-cdp-airflow243-XXXX-Task log_stream: dag_id=<DAG_NAME>/run_id=scheduled__2023-07-27T07_25_00+00_00/task_id=<TASK_ID>/attempt=1.log

The Airflow API needs to pass start and end timestamps to the Amazon CloudWatch GetLogEvents API to resolve this error. Doing so should also improve the performance of fetching logs.

This is a critical issue for customers who need to fetch logs to investigate failed pipelines that are a few days to a few weeks old.

What you think should happen instead

The Airflow API needs to pass start and end timestamps to the Amazon CloudWatch GetLogEvents API to resolve this error.
This should also improve the performance of fetching logs.
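
To make the request concrete, here is a minimal sketch of what a time-bounded read could look like. It is not the provider's actual implementation. The helper name `iter_log_events` and its `start_ms`/`end_ms` parameters are hypothetical; the only real API assumed is the boto3 CloudWatch Logs client method `get_log_events`, which accepts `startTime` and `endTime` as milliseconds since the Unix epoch (UTC).

```python
from typing import Any, Dict, Iterator, Optional


def iter_log_events(
    client: Any,
    log_group: str,
    log_stream: str,
    start_ms: Optional[int] = None,
    end_ms: Optional[int] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield events from one log stream, optionally limited to a time window.

    `client` is expected to behave like boto3.client("logs").
    `start_ms` and `end_ms` are hypothetical parameter names for this sketch.
    """
    kwargs: Dict[str, Any] = {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "startFromHead": True,  # read oldest events first
    }
    # Only send the bounds that were provided, so the unbounded
    # behaviour stays available to callers that want it.
    if start_ms is not None:
        kwargs["startTime"] = start_ms
    if end_ms is not None:
        kwargs["endTime"] = end_ms

    token: Optional[str] = None
    while True:
        if token is not None:
            kwargs["nextToken"] = token
        response = client.get_log_events(**kwargs)
        yield from response.get("events", [])
        next_token = response.get("nextForwardToken")
        # GetLogEvents returns the same token that was passed in once
        # the end of the stream has been reached.
        if next_token is None or next_token == token:
            break
        token = next_token
```

With a real client this would be called as `iter_log_events(boto3.client("logs"), log_group, log_stream, start_ms, end_ms)`. Keeping both bounds optional means a still-running task (no end time yet) can be read with only a lower bound.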

How to reproduce

This issue is intermittent and happens mostly on FAILED tasks.

  1. Log onto Amazon MWAA Service
  2. Open Airflow UI
  3. Select DAG
  4. Select the Failed Tasks
  5. Select Logs
    You should see an error message like the one below in the logs:
*** Reading remote log from Cloudwatch log_group: airflow-cdp-airflow243-XXXX-Task log_stream: dag_id=<DAG_NAME>/run_id=scheduled__2023-07-27T07_25_00+00_00/task_id=<TASK_ID>/attempt=1.log.
Could not read remote logs from log_group: airflow-cdp-airflow243-XXXXXX-Task log_group: airflow-cdp-airflow243-XXXX-Task log_stream: dag_id=<DAG_NAME>/run_id=scheduled__2023-07-27T07_25_00+00_00/task_id=<TASK_ID>/attempt=1.log

Operating System

Running with Amazon MWAA

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.3.1
apache-airflow==2.4.3

Deployment

Amazon (AWS) MWAA

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@rushabh-lokhande rushabh-lokhande added the area:core, kind:bug, and needs-triage labels Jul 27, 2023
@boring-cyborg

boring-cyborg bot commented Jul 27, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.

@hussein-awala hussein-awala added the provider:amazon, area:providers, area:logging, and good first issue labels and removed the area:core and needs-triage labels Jul 29, 2023
@hussein-awala
Member

@rushabh-lokhande would you like to work on fixing this issue?

@ivica-k
Contributor

ivica-k commented Jul 30, 2023

Hey @rushabh-lokhande, in your opinion, where would the values for start time and end time come from? I'm guessing from the start and end time of the DAG itself?

@rlokhande1982

> Hey @rushabh-lokhande, in your opinion, where would the values for start time and end time come from? I'm guessing from the start and end time of the DAG itself?

Yes @ivica-k, we will pick up the start time and end time, with UTC conversion if needed, since CloudWatch translates the dates to UTC. Also, even if we send start_date and end_date instead of start_timestamps and end_timestamps, that should narrow down the Amazon CloudWatch logs window and improve performance.
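
As a rough illustration of the UTC conversion mentioned above, the sketch below turns a start and end datetime into the epoch-millisecond values that GetLogEvents takes for `startTime` and `endTime`. The function names `to_epoch_ms` and `log_window_ms`, the padding around the window, and the choice to treat naive datetimes as UTC are all assumptions made for this example, not something decided in this thread.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Convert a datetime to whole milliseconds since the Unix epoch (UTC)."""
    if dt.tzinfo is None:
        # Assumption for this sketch: a naive datetime is already in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids float rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def log_window_ms(
    start: datetime,
    end: Optional[datetime] = None,
    padding: timedelta = timedelta(seconds=30),  # hypothetical safety margin
) -> Tuple[int, Optional[int]]:
    """Return (start_ms, end_ms) for a log query window.

    `end` may be None for a task that is still running, in which case
    no upper bound is returned.
    """
    start_ms = to_epoch_ms(start - padding)
    end_ms = to_epoch_ms(end + padding) if end is not None else None
    return start_ms, end_ms


# Example: the scheduled run time that appears in the log stream name above.
run_start = datetime(2023, 7, 27, 7, 25, tzinfo=timezone.utc)
window = log_window_ms(run_start, run_start + timedelta(minutes=5))
```

The padding is there because log events may be written slightly before or after the recorded start and end times; whether any margin is needed, and how large, would have to be checked against real task logs.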

@rushabh-lokhande
Author

This issue is linked to the performance enhancement in #20814.

@shubham22

fyi @vincbeck
