Feature/able to retry all connections #194

Merged 10 commits on Aug 13, 2021
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -3,10 +3,12 @@
### Fixes
- Add pyodbc import error message to dbt.exceptions.RuntimeException to get more detailed information when running `dbt debug` ([#192](https://github.com/dbt-labs/dbt-spark/pull/192))
- Add support for ODBC Server Side Parameters, allowing options that need to be set with the `SET` statement to be used ([#201](https://github.com/dbt-labs/dbt-spark/pull/201))
- Add `retry_all` configuration setting to retry all connection issues, not just those the `_is_retryable_error` function deems retryable ([#194](https://github.com/dbt-labs/dbt-spark/pull/194))

### Contributors
- [@JCZuurmond](https://github.com/JCZuurmond) ([#192](https://github.com/fishtown-analytics/dbt-spark/pull/192))
- [@jethron](https://github.com/jethron) ([#201](https://github.com/fishtown-analytics/dbt-spark/pull/201))
- [@gregingenii](https://github.com/gregingenii) ([#194](https://github.com/dbt-labs/dbt-spark/pull/194))

## dbt-spark 0.21.0b1 (August 3, 2021)

@@ -62,7 +64,7 @@
## dbt-spark 0.19.1b2 (February 26, 2021)

### Under the hood
- update serialization calls to use new API in dbt-core `0.19.1b2` ([#150](https://github.com/fishtown-analytics/dbt-spark/pull/150))
- Update serialization calls to use new API in dbt-core `0.19.1b2` ([#150](https://github.com/fishtown-analytics/dbt-spark/pull/150))

## dbt-spark 0.19.0.1 (February 26, 2021)

4 changes: 4 additions & 0 deletions README.md
@@ -74,6 +74,7 @@ A dbt profile for Spark connections supports the following configurations:
| connect_timeout | The number of seconds to wait before retrying to connect to a Pending Spark cluster | ❌ | ❔ (`10`) | ❔ (`10`) | `60` |
| connect_retries | The number of times to try connecting to a Pending Spark cluster before giving up | ❌ | ❔ (`0`) | ❔ (`0`) | `5` |
| use_ssl | The value of `hive.server2.use.SSL` (`True` or `False`). The default SSL store (`ssl.get_default_verify_paths()`) is used as the location for the SSL certificate | ❌ | ❔ (`False`) | ❌ | `True` |
| retry_all | Whether to retry all failed connections, not just 'retryable' ones | ❌ | ❔ (`false`) | ❔ (`false`) | `false` |

**Databricks** connections differ based on the cloud provider:

@@ -124,6 +125,7 @@ your_profile_name:
kerberos_service_name: hive
connect_retries: 5
connect_timeout: 60
retry_all: true
```


@@ -145,6 +147,7 @@ your_profile_name:
# optional
connect_retries: 5
connect_timeout: 60
retry_all: true
```


@@ -251,6 +254,7 @@ spark-testing:
schema: analytics
connect_retries: 5
connect_timeout: 60
retry_all: true
```
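As a rough rule of thumb, the back-off these settings introduce is bounded by `connect_retries * connect_timeout` seconds of sleeping, since the adapter sleeps `connect_timeout` seconds between attempts. A quick sketch (the helper name is hypothetical, not part of dbt-spark):

```python
def max_retry_wait(connect_retries: int, connect_timeout: int) -> int:
    # Total seconds spent sleeping between attempts, ignoring the
    # time each connection attempt itself takes.
    return connect_retries * connect_timeout


# With the profile above: 5 retries x 60 s = 300 s of back-off.
print(max_retry_wait(5, 60))  # 300
```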

Connecting to the local spark instance:
11 changes: 11 additions & 0 deletions dbt/adapters/spark/connections.py
@@ -73,6 +73,7 @@ class SparkCredentials(Credentials):
connect_timeout: int = 10
use_ssl: bool = False
server_side_parameters: Dict[str, Any] = field(default_factory=dict)
retry_all: bool = False

@classmethod
def __pre_deserialize__(cls, data):
@@ -454,6 +455,16 @@ def open(cls, connection):
)
logger.warning(msg)
time.sleep(creds.connect_timeout)
elif creds.retry_all and creds.connect_retries > 0:
msg = (
f"Warning: {getattr(exc, 'message', 'No message')}, "
f"retrying due to 'retry_all' configuration "
f"set to true.\n\tRetrying in "
f"{creds.connect_timeout} seconds "
f"({i} of {creds.connect_retries})"
)
logger.warning(msg)
time.sleep(creds.connect_timeout)
else:
raise dbt.exceptions.FailedToConnectException(
'failed to connect'
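The retry behavior this diff adds can be condensed as follows. This is a simplified sketch with assumed names (`open_with_retries`, `is_retryable`), not the actual `SparkConnectionManager.open` implementation; it shows only how `retry_all` widens the set of errors that trigger a retry:

```python
import time


class FailedToConnectException(Exception):
    pass


def open_with_retries(connect, creds, is_retryable):
    """Try `connect()` up to `1 + creds.connect_retries` times.

    `is_retryable(exc)` stands in for `_is_retryable_error`; when it
    returns False, `creds.retry_all` decides whether to retry anyway.
    """
    for i in range(1 + creds.connect_retries):
        try:
            return connect()
        except Exception as exc:
            retries_left = i < creds.connect_retries
            if retries_left and (is_retryable(exc) or creds.retry_all):
                # Back off before the next attempt, as the diff above does.
                time.sleep(creds.connect_timeout)
            else:
                raise FailedToConnectException("failed to connect") from exc
```

With `retry_all: false` (the default), a non-retryable error fails immediately; with `retry_all: true`, every error is retried until the attempts are exhausted.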