Improve fetching logs from AWS #33231
Changes look good! But maybe add some unit tests for the changes to get_cloudwatch_logs, and maybe get_log_events?
Good call! I added some unit tests for |
Make use of the start and end time inputs to the CloudWatch API to reduce the log search space and speed up log retrieval.
Fetching logs from AWS CloudWatch can take a long time, or even fail, when the log stream is old and no time boundary is specified in the query.
Example: if you try to view the logs of an old task in the Airflow UI, loading can be very slow or fail outright.
Setting start_time and end_time (end_time is the most important) drastically improves the latency. The CloudWatch team recommended adding an end time to restrict the search space and improve performance. I tested it with a 10-day-old task and saw the time to fetch task logs from the UI drop from 5 seconds to 1 second.
Resolves #32897
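For illustration, here is a minimal sketch of what bounding a CloudWatch query by time can look like. It is not the code from this PR: the helper names to_epoch_ms and fetch_log_events are hypothetical, and the client argument is assumed to be a boto3 CloudWatch Logs client (boto3.client("logs")), whose get_log_events call accepts startTime and endTime as milliseconds since the epoch.

```python
from datetime import datetime, timezone


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    return int(dt.timestamp() * 1000)


def fetch_log_events(client, log_group: str, log_stream: str,
                     start: datetime, end: datetime) -> list:
    """Hypothetical helper: read one log stream, restricted to [start, end].

    `client` is assumed to be a boto3 CloudWatch Logs client.
    """
    kwargs = {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "startTime": to_epoch_ms(start),
        # The end bound is what restricts the search space the most.
        "endTime": to_epoch_ms(end),
        "startFromHead": True,
    }
    events = []
    token = None
    while True:
        if token is not None:
            kwargs["nextToken"] = token
        response = client.get_log_events(**kwargs)
        events.extend(response["events"])
        next_token = response.get("nextForwardToken")
        # The stream is exhausted when the same token comes back again.
        if next_token is None or next_token == token:
            break
        token = next_token
    return events
```

A caller would typically derive start and end from the task's own start and end dates, possibly with a small safety margin on each side so that late-arriving log lines are not cut off; the size of such a margin is a design choice and is not specified here.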