feat: Added Spark support for Delta and Avro #2757

creativedutchmen
Contributor

When using Spark as your offline store, it's useful to be able to read from Delta tables, so that you get a consistent view of the data even when writing and reading at the same time (e.g. when using Spark Streaming to stream updates to the table and materialising every 10 minutes).

Since we're calling spark_session.read.format(<FORMAT>), which is the preferred way of reading Delta and Avro, this should work without further changes.

I've added Avro support because some (cloud) products can write the contents of a topic to Avro automatically. Being able to materialise from this persisted stream could be useful if you don't feel like building it yourself.
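
As a rough illustration of the read path described above, here is a minimal sketch, not the actual Feast code. The `FileFormat` enum and the `load_source` helper are hypothetical names made up for this example; the only real Spark API assumed is `spark_session.read.format(...).load(...)`. The session is passed in as a parameter, so the snippet itself does not import PySpark.

```python
from enum import Enum


class FileFormat(Enum):
    """Hypothetical list of formats a file-based Spark source might accept."""

    CSV = "csv"
    JSON = "json"
    PARQUET = "parquet"
    DELTA = "delta"  # requires the Delta Lake package on the Spark classpath
    AVRO = "avro"  # requires the external spark-avro module


def load_source(spark_session, path, file_format):
    """Read `path` through spark_session.read.format(<FORMAT>).load(path).

    `file_format` is matched case-insensitively against FileFormat;
    an unknown name raises ValueError before Spark is touched.
    """
    fmt = FileFormat(file_format.lower())
    return spark_session.read.format(fmt.value).load(path)
```

With a real `SparkSession` this would be called as `load_source(spark, "/path/to/table", "delta")`. Because the format string is passed straight to the generic reader, Delta and Avro need no special-casing in the calling code, provided the matching Spark packages are available to the session.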

codecov-commenter commented Jun 4, 2022

Codecov Report

Merging #2757 (89579b7) into master (0d195c4) will decrease coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #2757      +/-   ##
==========================================
- Coverage   80.59%   80.55%   -0.05%     
==========================================
  Files         173      173              
  Lines       15006    15008       +2     
==========================================
- Hits        12094    12089       -5     
- Misses       2912     2919       +7     
| Flag | Coverage Δ |
| --- | --- |
| integrationtests | 70.65% <100.00%> (-0.38%) ⬇️ |
| unittests | 59.52% <100.00%> (+<0.01%) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| ...stores/contrib/spark_offline_store/spark_source.py | 61.15% <100.00%> (+0.56%) ⬆️ |
| sdk/python/tests/utils/online_read_write_test.py | 93.54% <0.00%> (-6.46%) ⬇️ |
| ...ion/registration/test_stream_feature_view_apply.py | 92.85% <0.00%> (-3.58%) ⬇️ |
| .../integration/online_store/test_online_retrieval.py | 96.84% <0.00%> (-3.16%) ⬇️ |
| ...ython/feast/embedded_go/online_features_service.py | 89.14% <0.00%> (-0.78%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Collaborator

@adchia adchia left a comment

/lgtm

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adchia, creativedutchmen

@feast-ci-bot feast-ci-bot merged commit 7d16516 into feast-dev:master Jun 6, 2022