Elasticsearch.StreamingStore behaviour or something alike #31

Closed
imranismail opened this issue Jun 11, 2018 · 4 comments · Fixed by #36

Comments

@imranismail

The reason: a store that returns a stream could use Repo.stream inside Repo.transaction with a timeout of :infinity.

LIMIT + OFFSET is linear in the offset. To get the last 100 rows of a 1 million row table, the database has to go through the first 999,900. Using a cursor or a stream with a timeout of :infinity can help in this case.

Right now I avoid having long queries (waiting for the offset to reach 999,900) by doing something like this:

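        import Ecto.Query

        # On SQL adapters, Repo.stream/2 has to be enumerated inside a transaction.
        Repo.transaction(fn ->
          User
          |> select([:name, :email, :phone, :id])
          |> Repo.stream()
          |> Stream.drop(offset)
          |> Enum.take(limit)
        end, timeout: :infinity)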

But streaming to the end in one shot would be much preferred.
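
As a rough illustration of what "streaming to the end in one shot" could look like, here is a minimal sketch. It assumes a hypothetical helper module (`StreamingUploadSketch`) and a caller-supplied `upload_fun`; neither comes from this library. The point is only that the source is walked once, in batches, instead of being re-queried with a growing offset.

```elixir
defmodule StreamingUploadSketch do
  @moduledoc """
  Hypothetical helper, not part of any library: consumes any enumerable of
  records in fixed-size batches and hands each batch to `upload_fun`.
  """

  # Walks `records` exactly once, so a lazy source (for example the result of
  # Repo.stream/2 enumerated inside Repo.transaction/2) is never re-read from
  # the start. Returns the number of records processed.
  def upload_in_batches(records, batch_size, upload_fun)
      when is_integer(batch_size) and batch_size > 0 and is_function(upload_fun, 1) do
    records
    |> Stream.chunk_every(batch_size)
    |> Stream.each(upload_fun)
    |> Enum.reduce(0, fn batch, total -> total + length(batch) end)
  end
end

# With Ecto, the call would presumably sit inside a transaction, roughly:
#
#   Repo.transaction(fn ->
#     User
#     |> Repo.stream()
#     |> StreamingUploadSketch.upload_in_batches(500, &bulk_upload/1)
#   end, timeout: :infinity)
#
# where `bulk_upload/1` is a placeholder for whatever sends a batch to
# Elasticsearch.
```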

@imranismail
Author

This would also work well when data is ingested from a GenStage producer.

danielberkompas added a commit that referenced this issue Jul 24, 2018
This fixes the performance problem noted in #31 with the offset and
limit strategy. If the store returns a stream, the queries on most SQL
databases will be more efficient.
@danielberkompas
Owner

@imranismail I have a PR open to do this: #36. Do you have any feedback?

danielberkompas added a commit that referenced this issue Aug 31, 2018
@cdunn
Contributor

cdunn commented Oct 27, 2018

@danielberkompas @imranismail The switch to streams affects the ability to preload relationships. Ecto emits this warning:

        warning: passing a query with preloads to Repo.stream/2 leads to erratic behaviour and will raise in future Ecto versions

See elixir-ecto/ecto#2424.
I'm still looking at how to handle this appropriately, but thought I'd mention it.
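
One possible workaround for the preload warning, sketched under assumptions: stream the rows without preloads, then preload each chunk separately. The module name `ChunkedPreloadSketch` and the injected `preload_fun` are hypothetical; with Ecto, `preload_fun` could be something like `&Repo.preload(&1, :comments)`, since `Repo.preload/3` accepts a list of structs. This is not necessarily how the thread's participants resolved it.

```elixir
defmodule ChunkedPreloadSketch do
  # Splits a lazy stream into chunks, applies `preload_fun` to each chunk
  # (a list), and flattens the results back into a single lazy stream.
  # `preload_fun` must return a list, as Repo.preload/3 does for list input.
  def preload_in_chunks(stream, chunk_size, preload_fun)
      when is_integer(chunk_size) and chunk_size > 0 and is_function(preload_fun, 1) do
    stream
    |> Stream.chunk_every(chunk_size)
    |> Stream.flat_map(preload_fun)
  end
end

# Rough Ecto usage (assumes a `:comments` association, which is illustrative):
#
#   Repo.transaction(fn ->
#     Post
#     |> Repo.stream()
#     |> ChunkedPreloadSketch.preload_in_chunks(100, &Repo.preload(&1, :comments))
#     |> Enum.to_list()
#   end, timeout: :infinity)
```

Preloading per chunk issues one extra query per association per chunk, so the chunk size trades memory against the number of round trips.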

@danielberkompas
Owner

I think the solution might be to use a database cursor instead of Repo.stream. I did this in Cloak and it seems to work well.
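
For readers unfamiliar with the idea, here is a sketch of cursor-like iteration using keyset pagination, which is a substitute technique and not necessarily what Cloak does. It assumes rows carry a positive integer `id` and that a hypothetical `fetch_page` function returns rows ordered by ascending `id`; with Ecto that function might run a query such as `from u in User, where: u.id > ^last_id, order_by: u.id, limit: ^page_size`.

```elixir
defmodule KeysetStreamSketch do
  # Builds a lazy stream by repeatedly asking `fetch_page` for the rows whose
  # id is greater than the last id seen. Each page is a bounded query, so no
  # long-running transaction or growing OFFSET is needed.
  # Assumption: ids are positive integers, so 0 is a safe starting point.
  def stream(fetch_page, page_size)
      when is_function(fetch_page, 2) and is_integer(page_size) and page_size > 0 do
    Stream.resource(
      fn -> 0 end,
      fn
        :done ->
          {:halt, :done}

        last_id ->
          case fetch_page.(last_id, page_size) do
            [] ->
              {:halt, last_id}

            rows ->
              # A short page means the table is exhausted; skip the extra query.
              next = if length(rows) < page_size, do: :done, else: List.last(rows).id
              {rows, next}
          end
      end,
      fn _ -> :ok end
    )
  end
end
```

Because each page is fetched by a separate query, preloads can be applied per page without going through `Repo.stream/2`.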
