[AIRFLOW-2981] Fix TypeError in dataflow operators #3831

Merged: 1 commit, Sep 1, 2018

kaxil (Member) commented Sep 1, 2018

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    The GoogleCloudBucketHelper.google_cloud_to_local function compares a list to an int, which raises a TypeError:

        ...
        path_components = file_name[self.GCS_PREFIX_LENGTH:].split('/')
        if path_components < 2:
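
The failure and one possible fix can be sketched as follows. This is a minimal sketch that assumes Python 3 (where ordering comparisons between a list and an int raise TypeError) and assumes the intended check is on the length of the list. The standalone functions `split_gcs_path` and `buggy_check`, the module-level constant, and the error message are hypothetical stand-ins, not the helper's actual code.

```python
GCS_PREFIX_LENGTH = len("gs://")  # stand-in for self.GCS_PREFIX_LENGTH (assumed)


def buggy_check(file_name):
    """Reproduces the comparison quoted in the description."""
    path_components = file_name[GCS_PREFIX_LENGTH:].split("/")
    return path_components < 2  # list < int: raises TypeError on Python 3


def split_gcs_path(file_name):
    """Hypothetical fixed version: compare the number of components."""
    path_components = file_name[GCS_PREFIX_LENGTH:].split("/")
    if len(path_components) < 2:
        raise ValueError("Invalid GCS object path: {}".format(file_name))
    bucket_id = path_components[0]
    object_id = "/".join(path_components[1:])
    return bucket_id, object_id
```

With this sketch, a path such as `gs://bucket/dir/file.jar` splits into a bucket and an object path, while a path with no object part is rejected with an explicit error instead of a TypeError.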

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
  • GoogleCloudBucketHelperTest.test_invalid_object_path

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

Code Quality

  • Passes git diff upstream/master -u -- "*.py" | flake8 --diff

- Fix TypeError in dataflow operators when using GCS jar or py_file
kaxil requested review from ashb, Fokko and feng-tao on September 1, 2018 01:58

kaxil (Member, Author) commented Sep 1, 2018

cc @fenglu-g @tswast

codecov-io commented Sep 1, 2018

Codecov Report

No coverage uploaded for pull request base (master@f279151).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master    #3831   +/-   ##
=========================================
  Coverage          ?   77.43%           
=========================================
  Files             ?      203           
  Lines             ?    15846           
  Branches          ?        0           
=========================================
  Hits              ?    12271           
  Misses            ?     3575           
  Partials          ?        0

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f279151...8aa8eb4.

feng-tao (Member) left a comment

lgtm

kaxil merged commit db9bb7f into apache:master on Sep 1, 2018
jeffkpayne pushed a commit to bomboradata/bombora-incubator-airflow that referenced this pull request Sep 1, 2018
- Fix TypeError in dataflow operators when using GCS jar or py_file
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Sep 4, 2018
add 8fit to list of companies

[AIRFLOW-XXX] Add THE ICONIC to the list of orgs using Airflow

Closes apache#3807 from ksaagariconic/patch-2

[AIRFLOW-2933] Enable Codecov on Docker-CI Build (apache#3780)

- Add missing variables and use codecov instead of coveralls.
  It wasn't working because environment variables were missing.
  The codecov library heavily depends on the environment variables in
  the CI to determine how to push the reports to codecov.

- Remove the explicit passing of the variables in the `tox.ini`
  since it is already done in the `docker-compose.yml`,
  having to maintain this at two places makes it brittle.

- Removed the empty Codecov yml since codecov was complaining that
  it was unable to parse it

[AIRFLOW-2960] Pin boto3 to <1.8 (apache#3810)

boto3 1.8 was released a few days ago and it breaks our tests.

[AIRFLOW-2957] Remove obsolete sensor references

[AIRFLOW-2959] Refine HTTPSensor doc (apache#3809)

HTTP Error code other than 404,
or Connection Refused, would fail the sensor
itself directly (no more poking).

[AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test (apache#3811)

Simplify this test since it takes up 15% of the total test time. This is because
every example dag, with some exclusions, is backfilled, which puts some
pressure on the scheduler. Having the test cover just a couple
of dags should be sufficient.

254 seconds:
[success] 15.03% tests.BackfillJobTest.test_backfill_examples: 254.9323s

[AIRFLOW-XXX] Remove residual line in Changelog (apache#3814)

[AIRFLOW-2930] Fix celery executor scheduler crash (apache#3784)

Caused by an update in PR apache#3740:
execute_command.apply_async(args=command, ...)
- command is a list of short unicode strings, and the code above passes them
  as multiple arguments to a function defined as taking only one argument.
- command = ["airflow", "run", "dag323", ...]
- args = command = ["airflow", "run", "dag323", ...]
- execute_command("airflow", "run", "dag323", ...) raises an error and exits.
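
The argument mismatch described above can be reproduced without Celery. In this minimal sketch, `apply_async_like` is a hypothetical stand-in that only mimics how `args` is unpacked into positional arguments; it is not Celery's API. Wrapping the list (`args=[command]`) is an assumption about the shape of the fix, since the commit message does not show it.

```python
def execute_command(command):
    """Stand-in for the task function, which takes exactly one argument."""
    return " ".join(command)


def apply_async_like(func, args):
    """Hypothetical stand-in: unpacks `args` as positional arguments."""
    return func(*args)


command = ["airflow", "run", "dag323"]

# args=command would unpack to execute_command("airflow", "run", "dag323"),
# which raises TypeError because the function accepts a single argument.
# args=[command] passes the whole list as that single argument.
result = apply_async_like(execute_command, args=[command])
```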

[AIRFLOW-2916] Arg `verify` for AwsHook() & S3 sensors/operators (apache#3764)

This is useful when
1. users want to use a different CA cert bundle than the
  one used by botocore.
2. users want to have '--no-verify-ssl'. This is especially useful
  when we're using on-premises S3 or other implementations of
  object storage, like IBM's Cloud Object Storage.

The default value here is `None`, which is also the default
value in boto3, so that backward compatibility is ensured too.

Reference:
https://boto3.readthedocs.io/en/latest/reference/core/session.html

[AIRFLOW-2709] Improve error handling in Databricks hook (apache#3570)

* Use float for default value
* Use status code to determine whether an error is retryable
* Fix wrong type in assertion
* Fix style to prevent lines from exceeding 90 characters
* Fix wrong way of checking exception type

[AIRFLOW-2854] kubernetes_pod_operator add more configuration items (apache#3697)

* kubernetes_pod_operator add more configuration items
* fix test_kubernetes_pod_operator test_faulty_service_account failure case
* fix review comment issues
* pod_operator add hostnetwork config
* add doc example

[AIRFLOW-2994] Fix command status check in Qubole Check operator (apache#3790)

[AIRFLOW-2928] Use uuid4 instead of uuid1 (apache#3779)

for better randomness.
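
As a brief illustration of the difference, using only the standard library `uuid` module: a version 1 UUID is derived from the host ID, a sequence number and the current time, while a version 4 UUID is generated from random bits. The variable names are illustrative only.

```python
import uuid

# Version 1: built from the host ID, a sequence number and the current time.
time_based = uuid.uuid1()

# Version 4: built from random bits, so it does not embed host or time data.
random_based = uuid.uuid4()

versions = (time_based.version, random_based.version)
```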

[AIRFLOW-2993] s3_to_sftp and sftp_to_s3 operators (apache#3828)

[AIRFLOW-2993] Added sftp_to_s3 and s3_to_sftp operators (apache#3828)

[AIRFLOW-2949] Add syntax highlight for single quote strings (apache#3795)

* AIRFLOW-2949: Add syntax highlight for single quote strings

* AIRFLOW-2949: Also updated new UI main.css

[AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator (apache#3793)

Different combinations of arguments are possible, and
some of them are handled 'silently', so users
may not be fully aware of what happens.

For example
- Users only need to provide either `ssh_hook`
  or `ssh_conn_id`, but this is not clear in the docs.
- If both are provided, `ssh_conn_id` is ignored.
- If `remote_host` is provided, it replaces
  the `remote_host` which was defined in `ssh_hook`
  or predefined in the connection of `ssh_conn_id`.

These should be documented clearly so that the behavior is
transparent to users. log.info() should also be
used to remind users and provide clear logs.

In addition, add instance check for ssh_hook to ensure
it is of the correct type (SSHHook).

Tests are updated for this PR.
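
The precedence rules listed in this commit message can be sketched as below. This is an illustrative model only: `SSHHookSketch` and `resolve_hook` are hypothetical names, and the exception types are assumptions; the real operators and `SSHHook` are not reproduced here.

```python
class SSHHookSketch:
    """Hypothetical stand-in for SSHHook with only the fields used here."""

    def __init__(self, ssh_conn_id=None, remote_host=None):
        self.ssh_conn_id = ssh_conn_id
        self.remote_host = remote_host


def resolve_hook(ssh_hook=None, ssh_conn_id=None, remote_host=None):
    """Apply the documented precedence between the three arguments."""
    # Instance check: a supplied hook must be of the expected type.
    if ssh_hook is not None and not isinstance(ssh_hook, SSHHookSketch):
        raise TypeError("ssh_hook must be an SSHHookSketch instance")
    if ssh_hook is None:
        if ssh_conn_id is None:
            raise ValueError("provide either ssh_hook or ssh_conn_id")
        ssh_hook = SSHHookSketch(ssh_conn_id=ssh_conn_id)
    # If both were given, ssh_hook wins and ssh_conn_id is ignored.
    # An explicit remote_host replaces whatever the hook already holds.
    if remote_host is not None:
        ssh_hook.remote_host = remote_host
    return ssh_hook
```

For example, calling `resolve_hook(ssh_hook=hook, ssh_conn_id="other", remote_host="host-b")` returns the same `hook` object with its `remote_host` replaced and its own connection id untouched.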

[AIRFLOW-XXX] Fix Broken Link in CONTRIBUTING.md

[AIRFLOW-2980] ReadTheDocs - Fix Missing API Reference

[AIRFLOW-2984] Convert operator dates to UTC (apache#3822)

Tasks can have start_dates or end_dates separately
from the DAG. These need to be converted to UTC, otherwise
we cannot use them for calculating the next execution
date.
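
A minimal sketch of such a conversion, using only the standard library `datetime` module rather than Airflow's own timezone utilities. The helper `to_utc` is hypothetical, and treating naive datetimes as already being in a default timezone is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value, default_tz=timezone.utc):
    """Return `value` as a timezone-aware datetime in UTC.

    Naive datetimes are assumed to be in `default_tz` (sketch assumption).
    """
    if value is None:
        return None
    if value.tzinfo is None:
        value = value.replace(tzinfo=default_tz)
    return value.astimezone(timezone.utc)


# A task-level start_date given in a UTC+02:00 zone.
plus_two = timezone(timedelta(hours=2))
start_date = datetime(2018, 9, 1, 2, 0, tzinfo=plus_two)
start_date_utc = to_utc(start_date)
```

Once both the DAG-level and task-level dates are in UTC, they can be compared and used in date arithmetic without mixing offsets.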

[AIRFLOW-2779] Make GHE auth third party licensed (apache#3803)

This reinstates the original license.

[AIRFLOW-XXX] Add Format to list of companies (apache#3824)

[AIRFLOW-2993] Added sftp_to_s3 and s3_to_sftp operators (apache#3828)

Addition of s3_to_sftp and sftp_to_s3 operators.

[AIRFLOW-2900] Show code for packaged DAGs (apache#3749)

[AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro (apache#3821)

[AIRFLOW-2989] Add param to set bootDiskType in Dataproc Op (apache#3825)

Add param to set bootDiskType for master and
worker nodes in `DataprocClusterCreateOperator`

[AIRFLOW-2974] Extended Databricks hook with clusters operation (apache#3817)

Add hooks for:
- cluster start,
- restart,
- terminate.
Add unit tests for the added hooks.
Add hooks for cluster start, restart and terminate.
Add unit tests for the added hooks.
Add cluster_id variable for performing cluster operation tests.

[AIRFLOW-2993] Fix Docstrings for Operators (apache#3828)

Addition of s3_to_sftp and sftp_to_s3 operators.

Add 'steps' into template_fields in EmrAddSteps

Rendering templates which are in steps is especially useful if you
want to pass execution time as one of the parameters of a step in
an EMR cluster. All fields in template_fields will get rendered.

[AIRFLOW-2993] Addition of s3_to_sftp and sftp_to_s3 operators.

[AIRFLOW-1762] Implement key_file support in ssh_hook create_tunnel

Switched to using sshtunnel package instead of
popen approach

Closes apache#3473 from NielsZeilemaker/ssh_hook

Addition of s3_to_sftp and sftp_to_s3 operators.

[AIRFLOW-2993] sftp_to_s3 and s3_to_sftp Operators (apache#3828)

Addition of s3_to_sftp and sftp_to_s3 operators.

[AIRFLOW-XXX] Fix Docstrings for Operators (apache#3820)

[AIRFLOW-2993] Renamed operators to meet name length requirements.

[AIRFLOW-2993] Renamed operators to meet name length requirements (apache#3828)

[AIRFLOW-2993] Corrected flake8 line diff format (apache#3828)

[AIRFLOW-2994] Fix flatten_results for BigQueryOperator (apache#3829)

[AIRFLOW-2951] Update dag_run table end_date when state change (apache#3798)

Airflow currently only changes the dag_run table end_date value when
a user terminates a dag in the web UI. The end_date is not updated
when Airflow detects that a dag has finished and updates its state.

This commit adds an end_date update to DagRun's set_state function to
fix the problem mentioned above.
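
The behavior described here might look roughly like the sketch below. `DagRunSketch`, the state strings and the `FINISHED_STATES` set are hypothetical stand-ins chosen for illustration; clearing `end_date` when a run leaves a finished state is also an assumption, not something the commit message states.

```python
from datetime import datetime, timezone

FINISHED_STATES = {"success", "failed"}  # assumed terminal states for this sketch


class DagRunSketch:
    """Hypothetical model of a dag run that tracks state and end_date."""

    def __init__(self):
        self._state = "running"
        self.end_date = None

    def set_state(self, state):
        if self._state != state:
            self._state = state
            # Record the end time whenever the run reaches a finished state,
            # regardless of whether a user or the scheduler triggered it.
            if state in FINISHED_STATES:
                self.end_date = datetime.now(timezone.utc)
            else:
                self.end_date = None


run = DagRunSketch()
run.set_state("success")
```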

[AIRFLOW-2145] fix deadlock on clearing running TI (apache#3657)

A `shutdown` task is not considered to be `unfinished`, so a dag run can
deadlock when all `unfinished` downstream tasks are waiting on a task
that's in the `shutdown` state. Fix this by considering `shutdown` to
be `unfinished`, since it's not truly a terminal state.
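
A deliberately simplified model of this deadlock check is sketched below. The function `is_deadlocked`, the state strings and the two state sets are hypothetical; the real scheduler logic considers more conditions than this. The sketch only shows why leaving `shutdown` out of the unfinished set makes a run look deadlocked.

```python
def is_deadlocked(tasks, unfinished_states):
    """tasks maps a task name to (state, list of upstream task names).

    In this simplified model a run is deadlocked when it has unfinished
    tasks and none of them has all of its upstream tasks succeeded.
    """
    unfinished = [name for name, (state, _) in tasks.items()
                  if state in unfinished_states]

    def deps_met(name):
        return all(tasks[up][0] == "success" for up in tasks[name][1])

    return bool(unfinished) and not any(deps_met(name) for name in unfinished)


# "load" waits on "extract", which is currently in the shutdown state.
tasks = {"extract": ("shutdown", []), "load": ("none", ["extract"])}

states_before = {"none", "scheduled", "queued", "running", "up_for_retry"}
states_after = states_before | {"shutdown"}

deadlocked_before = is_deadlocked(tasks, states_before)
deadlocked_after = is_deadlocked(tasks, states_after)
```

In this model, the run is reported as deadlocked only while `shutdown` is excluded from the unfinished states; once it is included, the `extract` task counts as unfinished work with its dependencies met, so the run is not flagged.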

[AIRFLOW-2981] Fix TypeError in dataflow operators (apache#3831)

- Fix TypeError in dataflow operators when using GCS jar or py_file

[AIRFLOW-XXX] Fix typo in docstring of gcs_to_bq (apache#3833)

[AIRFLOW-2476] Allow tabulate up to 0.8.2 (apache#3835)

[AIRFLOW-XXX] Fix typos in faq.rst (apache#3837)

[AIRFLOW-2979] Make celery_result_backend conf Backwards compatible (apache#3832)

(apache#2806) Renamed `celery_result_backend` to `result_backend` and broke backwards compatibility.

[AIRFLOW-2866] Fix missing CSRF token head when using RBAC UI (apache#3804)

[AIRFLOW-491] Add feature to pass extra api configs to BQ Hook (apache#3733)

[AIRFLOW-208] Add badge to show supported Python versions (apache#3839)

[AIRFLOW-2993] Added sftp_to_s3 operator and s3_to_sftp operator. (apache#3828)
ashb pushed a commit to ashb/airflow that referenced this pull request Oct 4, 2018
- Fix TypeError in dataflow operators when using GCS jar or py_file
ashb pushed a commit to ashb/airflow that referenced this pull request Oct 22, 2018
- Fix TypeError in dataflow operators when using GCS jar or py_file
galak75 pushed a commit to VilledeMontreal/incubator-airflow that referenced this pull request Nov 23, 2018
- Fix TypeError in dataflow operators when using GCS jar or py_file
aliceabe pushed a commit to aliceabe/incubator-airflow that referenced this pull request Jan 3, 2019
- Fix TypeError in dataflow operators when using GCS jar or py_file
kaxil deleted the AIRFLOW-2981-fix-dataproc-type-error branch on January 8, 2019 23:18
cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Jan 23, 2019
- Fix TypeError in dataflow operators when using GCS jar or py_file