Serialization error after successful DAG run #20875
Comments
Thanks for opening your first issue here! Be sure to follow the issue template!
Already fixed in main.
Hello all, Apache Airflow version: 2.2.5
We are passing the params at the task level, and we have also tried a couple of alternatives, without success. Can anyone please guide us on how to fix this error?
Migrating to 2.3.* and using `airflow dags reserialize` should fix it.
I believed this issue had been fixed in 2.2.5 - it originally occurred in 2.2.3, as far as I know. Will it be solved by upgrading to 2.3.0 or 2.3.1? Also, can you please explain the root cause of the issue, in case we are unable to upgrade to 2.3.* right now? That would be a major change for us - it would also mean we have to use Python 3.7 or greater. Is the issue related to the DB upgrade or to something in the code?
Python 3.6 reached end of life in October and stopped receiving security fixes, so you are putting your company at great risk by not migrating in a "planned" way - it is possible that a critical bug will be discovered and you will HAVE TO upgrade in a hurry, like it was with Log4Shell. If your reason is Python 3.6, I would strongly recommend you start the migration process now.
The root cause is that your serialized DAGs were likely created with some older versions of libraries, and the serialized objects are likely not restorable with the new versions - depending on which libraries, objects, and dependencies you had, it might be something that is not at all in the control of Airflow. The "reserialize" command simply deletes the serialized DAGs from the database, which forces them to be re-serialized again. Think of them as "cached versions" of your DAGs.
If you reserialize - likely yes. But the only way to get it fixed is to upgrade anyway, so holding off the upgrade makes no sense whatsoever, even if you are not sure whether it is going to be fixed. There might be reasons why some fixes do not work for you - but no one will give you a guarantee that your problem will be fixed, I am afraid. This is free software. It comes with no such guarantees. If you need guarantees, there are companies who provide paid support, and there you can have more expectations, I think.

Since this is free and open software, you can also take a look at what the "reserialize" command is doing - you can easily find it in the code and do a similar thing yourself. This is always an option, but then you take the responsibility on yourself - the maintainers decided to implement the reserialize command to help in such cases, but it absolutely requires migrating to 2.3. You can take the risk of applying a similar approach to a pre-2.3 version (but it is your decision and responsibility if you do not follow the path recommended by the maintainers). So you have options - either you follow the way we recommend (and with good support of the community), or you choose to hold off and bear the consequences of the decision :)
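For illustration, a rough sketch of what "doing a similar thing yourself" could look like - it assumes direct access to the metadata database through Airflow's own session, and it only mirrors the behaviour described above (dropping the cached serialized DAGs so they get rebuilt on the next parse); it is not the upstream implementation of the reserialize command:

```python
# Hedged sketch: clear the serialized_dag "cache" so DAGs are re-serialized
# on the next parse. Run with the scheduler stopped, and back up the DB first.
from airflow.models.serialized_dag import SerializedDagModel
from airflow.settings import Session

session = Session()
# Delete all cached serialized DAG rows; the scheduler / DAG processor
# will repopulate them from the DAG files.
session.query(SerializedDagModel).delete(synchronize_session=False)
session.commit()
session.close()
```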
And yes. In most cases (when we do not have enough data to diagnose a user's environment and specific case), the advice here for checking whether the problem is fixed will be to migrate. With far less than a 1:1000 ratio of maintainers to users, it is far easier for our users to pay back to the community (and other users) by testing whether the new version fixes their problem than for maintainers to spend time analysing the individual customisations and modifications that impact it. It's just pure math.
Thanks for the quick response.
But that too doesn't seem to work, as the DAGs were still giving import errors.
My guess is that you still have some workers using an older Airflow version.
Alternatively, somewhere in your system *.pyc or __pycache__ files are stale and an old version of the Python bytecode is being used. Those are all the guesses I can come up with, but maybe you should look for similar anomalies - maybe somewhere in your deployment the code is cached. Cache invalidation is one of the hardest things.
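If you want to rule out the stale-bytecode guess, something along these lines could help - a minimal sketch, assuming the DAGs folder path below is a placeholder for your own deployment:

```python
# Hedged helper sketch: remove cached bytecode under the DAGs folder so a
# stale .pyc file cannot shadow updated source code.
import pathlib

dags_folder = pathlib.Path("/opt/airflow/dags")  # hypothetical path

for pyc in dags_folder.rglob("*.pyc"):
    pyc.unlink()
for cache_dir in dags_folder.rglob("__pycache__"):
    try:
        cache_dir.rmdir()  # succeeds only once the directory is empty
    except OSError:
        pass
```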
Apache Airflow version
2.2.3 (latest released)
What happened
In case we use both `params` and `on_success_callback` within the DAG definition, then after triggering the DAG from the UI (option "Trigger DAG") and the finish of the successful DAG run, I got the DAG Import Errors displayed in the UI. The issue is related to the `_serialize_params_dict` method of the `BaseSerialization` class.
Part of the error message:
In consequence there is also no update of the `serialized_dag` table in the backend DB.
Additionally:
- The error occurs only after a successful DAG run from the UI (I did not check the output of a run from the CLI or of a scheduled run).
- If we remove the `params` or the `on_success_callback` attribute, there is no serialization error.

To confirm this issue I set up a completely new Airflow instance in a separate virtualenv and still had the same issue.
What you expected to happen
Successful serialization after the successful DAG run
How to reproduce
To reproduce this error I prepared a very simple DAG, presented below.
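A minimal sketch of such a DAG, consistent with the description above - the DAG id, param name, and callback body are illustrative placeholders:

```python
# Minimal repro sketch: a DAG combining `params` and `on_success_callback`,
# the combination reported to trigger the serialization error.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def success_callback(context):
    # Hypothetical callback body; any callable should do.
    print("DAG run finished successfully")


with DAG(
    dag_id="params_callback_repro",  # hypothetical name
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    params={"my_param": "default_value"},  # hypothetical param
    on_success_callback=success_callback,
) as dag:
    PythonOperator(
        task_id="do_nothing",
        python_callable=lambda: None,
    )
```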
Additional information about the setup:
After successful execution of this DAG through the "Trigger DAG" option in the UI, the error should be displayed (of course after a refresh of the webpage).
Operating System
macOS 12.1
Versions of Apache Airflow Providers
n/a
Deployment
Virtualenv installation
Deployment details
Anything else
There is one scenario in which there will be no error: if the DAG run completes within `min_serialized_dag_update_interval` seconds of the last serialization, then there will be no error, because the serialization process will not start after the DAG run (at least this is my explanation).
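For context, that interval is a regular Airflow config option; a quick, illustrative way to check the effective value on a given installation:

```python
# Illustrative check: the scheduler refreshes a DAG's serialized_dag row at
# most once per this many seconds ([core] min_serialized_dag_update_interval).
from airflow.configuration import conf

print(conf.getint("core", "min_serialized_dag_update_interval"))
```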
Are you willing to submit PR?
Code of Conduct