issues with "params" #15

Closed
erwanj opened this issue Mar 24, 2022 · 7 comments · Fixed by #16

erwanj commented Mar 24, 2022

We have just upgraded from Airflow v1.10.11 to Airflow v2.2.4.
We are now getting the following error in Airflow. This DAG was working correctly before the migration.

Broken DAG: [/opt/airflow/dags/dag1.py] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 578, in serialize_operator
    serialize_op['params'] = cls._serialize_params_dict(op.params)
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 451, in _serialize_params_dict
    if f'{v.__module__}.{v.__class__.__name__}' == 'airflow.models.param.Param':
AttributeError: 'str' object has no attribute '__module__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 939, in to_dict
    json_dict = {"__version": cls.SERIALIZER_VERSION, "dag": cls.serialize_dag(var)}
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 851, in serialize_dag
    raise SerializationError(f'Failed to serialize DAG {dag.dag_id!r}: {e}')
airflow.exceptions.SerializationError: Failed to serialize DAG 'dag1': 'str' object has no attribute '__module__'

The DAG looks like this:

# DAG creation
with DAG(
    dag_id="dag1",
    default_args=DEFAULT_ARGS,
    max_active_runs=1,
    description='dag1',
    schedule_interval=WORKFLOW_SCHEDULE_INTERVAL,
    catchup=False
) as dag:

    extract_tables = KitchenOperator(
        pdi_conn_id='pdi_default',
        task_id="extract_tables",
        directory=PROJECT_DIRECTORY,
        job="jb_01_load_ods_tables",
        file=PROJECT_DIRECTORY+"jb_01_load_ods_tables.kjb",
        params={
            "date": '{{ ds }}'
        }
    )

If we remove "date": '{{ ds }}', the error in Airflow disappears.

Is it possible that the issue is linked with airflow-pentaho-plugin?

Member

piffall commented Mar 24, 2022

Hello @erwanj

I don't think so. This looks like an Airflow issue with DAG serialization. I found a similar issue in the Airflow repository:
apache/airflow#20875

I will investigate this further.

Please take a look at DAG serialization; disabling it may be a temporary workaround for you.
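
The AttributeError in the traceback can be reproduced without Airflow. The sketch below is only an illustration: `qualified_class_name` mirrors the f-string from the failing line, while `ParamLike` and `try_qualified_class_name` are hypothetical names invented here and are not part of Airflow or the plugin. It shows that a plain `str` instance has no `__module__` attribute, whereas an instance of a user-defined class does.

```python
# Stand-in for the comparison shown in the traceback; this is not Airflow's code path.
def qualified_class_name(value):
    """Build '<module>.<class name>' the same way the failing line does."""
    return f"{value.__module__}.{value.__class__.__name__}"


class ParamLike:
    """Hypothetical user-defined wrapper, standing in for a Param-style object."""

    def __init__(self, default):
        self.default = default


def try_qualified_class_name(value):
    """Return the qualified name, or a description of the error if the lookup fails."""
    try:
        return qualified_class_name(value)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


# A plain string value, as passed in params={"date": '{{ ds }}'}.
plain_result = try_qualified_class_name("{{ ds }}")

# The same value wrapped in an instance of a user-defined class.
wrapped_result = try_qualified_class_name(ParamLike("{{ ds }}"))

print(plain_result)
print(wrapped_result)
```

Instances of built-in types such as `str` do not expose `__module__`, so any check that reads it directly on a value assumes the value has already been wrapped in a regular class.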

piffall self-assigned this Mar 24, 2022
Member

piffall commented Mar 24, 2022

Please try running:

airflow dags reserialize

Author

erwanj commented Mar 24, 2022

Thank you very much for your feedback.
I have tried with pdi_flow.py from the sample_dags folder. I removed job2 and trans2 because I don't have a Carte server.
I get the same behaviour.

Broken DAG: [/opt/airflow/dags/pdi_flow.py] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 578, in serialize_operator
    serialize_op['params'] = cls._serialize_params_dict(op.params)
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 451, in _serialize_params_dict
    if f'{v.__module__}.{v.__class__.__name__}' == 'airflow.models.param.Param':
AttributeError: 'str' object has no attribute '__module__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 939, in to_dict
    json_dict = {"__version": cls.SERIALIZER_VERSION, "dag": cls.serialize_dag(var)}
  File "/usr/local/lib/python3.8/dist-packages/airflow/serialization/serialized_objects.py", line 851, in serialize_dag
    raise SerializationError(f'Failed to serialize DAG {dag.dag_id!r}: {e}')
airflow.exceptions.SerializationError: Failed to serialize DAG 'pdi_flow': 'str' object has no attribute '__module__'

If I remove 'date': '{{ ds }}' from params, I don't get the error.

If you are able to, could you confirm whether the sample DAG pdi_flow.py works on Airflow v2.2.4?

Thank you.

Author

erwanj commented Mar 24, 2022

It seems to work now.
I had to add this import:
from airflow.models.param import Param
and use the following syntax for params:

        params={
            'date': Param('{{ ds }}')
        }

Thank you for your help

piffall added the bug label Mar 24, 2022
Member

piffall commented Mar 24, 2022

It seems that params should not be used as the plugin's argument name, because it is already being used by BaseOperator.
I found a conversation about this.

This argument, params, will be renamed to task_params.
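
The name collision described above can be sketched with stand-in classes. Everything in this block is hypothetical (`WrappedParam`, `BaseOperatorLike`, `CollidingOperator`, `RenamedOperator`, `all_wrapped`); it is neither Airflow's nor the plugin's code, and it assumes the problem arises when a subclass overwrites an attribute that the base class manages. It only illustrates why keeping the plugin's values under a separate name such as `task_params` avoids interfering with the base class.

```python
class WrappedParam:
    """Hypothetical wrapper type that the base class expects around each value."""

    def __init__(self, default):
        self.default = default


class BaseOperatorLike:
    """Hypothetical stand-in for a base operator that owns the name `params`."""

    def __init__(self, task_id, params=None):
        self.task_id = task_id
        # The base class wraps every value so later code can rely on the wrapper type.
        self.params = {key: WrappedParam(value) for key, value in (params or {}).items()}


class CollidingOperator(BaseOperatorLike):
    """Reuses the name `params` for its own job parameters."""

    def __init__(self, task_id, params=None):
        super().__init__(task_id)
        # Overwrites the attribute managed by the base class with a plain dict.
        self.params = params or {}


class RenamedOperator(BaseOperatorLike):
    """Keeps its own job parameters under a separate name."""

    def __init__(self, task_id, task_params=None, **kwargs):
        super().__init__(task_id, **kwargs)
        self.task_params = task_params or {}


def all_wrapped(operator):
    """Check whether every value in `operator.params` still has the wrapper type."""
    return all(isinstance(value, WrappedParam) for value in operator.params.values())


colliding = CollidingOperator("extract_tables", params={"date": "{{ ds }}"})
renamed = RenamedOperator(
    "extract_tables",
    task_params={"date": "{{ ds }}"},
    params={"env": "dev"},
)

colliding_ok = all_wrapped(colliding)
renamed_ok = all_wrapped(renamed)
print(colliding_ok, renamed_ok)
```

In this sketch the colliding operator leaves plain strings where the base class expects wrapped values, while the renamed operator leaves `params` untouched and keeps its own values in `task_params`.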

piffall added a commit that referenced this issue Mar 24, 2022
piffall added a commit that referenced this issue Mar 24, 2022
Member

piffall commented Mar 24, 2022

Hi @erwanj, I applied a patch and released v1.0.9. Please update the plugin. Thank you for reporting this.

Author

erwanj commented Mar 28, 2022

Thank you very much for your help.
