Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read only endpoint for XCom #8134 #9170

Merged
merged 4 commits into from
Jun 25, 2020
Merged

Conversation

randr97
Copy link
Contributor

@randr97 randr97 commented Jun 7, 2020

Closes #8134

Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Jun 7, 2020
@randr97 randr97 marked this pull request as draft June 7, 2020 18:24
@randr97 randr97 force-pushed the REST_API_XCOMS branch 2 times, most recently from e13159b to 530a5e9 Compare June 7, 2020 20:52
@randr97 randr97 requested a review from mik-laj June 7, 2020 20:54
@ashb
Copy link
Member

ashb commented Jun 9, 2020

@mik-laj Are you happy with c973a07?

@mik-laj
Copy link
Member

mik-laj commented Jun 10, 2020

@ashb It looks good, but I don't know if it suits our code style. We use explicit joins instead of relying on ORM generated more often. Generating joints by ORM together with the use of multi-column primary keys can lead to spaghetti. Each model will have a relationship with each model because many values are repeated. We only use ORM-based relationships in one place - DagModel.tags It fits here because we can easily retrieve the object(DagModel) and metadata. (DagTag). In other places, we need careful control to ensure good performance.

I also prepared an example.
16f8299

    query = session.query(XCom)
    query = query.filter(and_(XCom.dag_id == dag_id,
                              XCom.task_id == task_id,
                              XCom.key == xcom_key))
    query = query.join(DR, and_(XCom.dag_id == DR.dag_id, XCom.execution_date == DR.execution_date))
    query = query.filter(DR.run_id == dag_run_id)

    q_object = query.one_or_none()

@ashb
Copy link
Member

ashb commented Jun 10, 2020

It looks good, but I don't know if it suits our code style. We use explicit joins instead of relying on ORM generated more often.

@mik-laj I agree we don't currently use relationships; I guess I am proposing that we currently start using them. The queries generated is the same/similar. Using explicit relationships feels like a good idea to me, as it documents "yes, this is an 'official' relationship between these two tables, not just something that happens to have the same named columns".

And the example query I had in my commit message was more complex than we needed. This works:

ti = s.query(TI).filter_by(
    dag_id='example_dag1', task_id='print_date2'
).join(TI.dag_run).filter_by(
    run_id='scheduled__2020-06-08T11:00:00+00:00'
).one_or_none()

@mik-laj
Copy link
Member

mik-laj commented Jun 10, 2020

@ashb Personally, I prefer explicite joins. Most of our contributors are SQL experts, so the current approach for them is more natural. We can start using them, but I think it's worth trying to migrate some of the existing code to find out the real limitations and problems. We may not have the shortest code now, but it is readable by our contributors and trusted.

We already had a similar situation in the project. We wanted to start using assertions, fixtures and other new solutions for writing the test. We had discussions and votes on whether this is a good idea. Few people who tried to use it in practice. I wanted to use it for writing the API and gave up.
There were two main problems. assert statements The assertion expression displays much less readable error messages. Fixtures are not compatible with provide_session decorator.

Based on this experience, I think it's best to introduce a new idea by trying to rewrite part of the code. Otherwise, the idea you are proposing affects many users, so it will be better when we discuss it on the mailing list.

@ashb
Copy link
Member

ashb commented Jun 10, 2020

Most of our contributors are SQL experts

Maybe experts at making SQL do what they need, but given the number of n+1 queries you had to fix, maybe not expert at using SQL efficiently 😉

@randr97 randr97 requested a review from mik-laj June 13, 2020 19:27
@randr97 randr97 marked this pull request as ready for review June 13, 2020 19:27
airflow/models/dagrun.py Outdated Show resolved Hide resolved
@randr97 randr97 requested a review from mik-laj June 17, 2020 17:54
@randr97 randr97 changed the title [WIP] Read only endpoint for XCom #8134 Read only endpoint for XCom #8134 Jun 17, 2020
@randr97 randr97 requested a review from mik-laj June 19, 2020 16:51
randr97 and others added 4 commits June 23, 2020 20:08
This lets us select like this:

```
(
    s.query(DagRun)
    .filter_by(dag_id=='example_dag1', run_id=='scheduled__2020-06-08T11:00:00+00:00')
    .join(DagRun.task_instances)
    .with_entities(TaskInstance).filter_by(task_id='print_date2')
```

This lets us easily go from the URL arguments of the API to the task
instance(s)
@ephraimbuddy ephraimbuddy mentioned this pull request Jun 25, 2020
6 tasks
@mik-laj mik-laj merged commit 5744a47 into apache:master Jun 25, 2020
kaxil pushed a commit to kaxil/airflow that referenced this pull request Jun 27, 2020
Co-authored-by: Ash Berlin-Taylor <[email protected]>
Co-authored-by: Kamil Breguła <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API Endpoints - Read-only - XCOM
6 participants