Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flushing Redis DB causes trouble retrieving cached queries #636

Closed
gzagatti opened this issue Apr 16, 2019 · 3 comments · Fixed by #664
Closed

Flushing Redis DB causes trouble retrieving cached queries #636

gzagatti opened this issue Apr 16, 2019 · 3 comments · Fixed by #664
Labels
bug Something isn't working

Comments

@gzagatti
Copy link
Contributor

Describe the bug
After flushing the Redis DB, the retrieval of cached queries through q.get_query() causes a re-computation of the query, that is, a call to q._make_query() instead of simply retrieving the cached table.

This is a bug because whileq.is_stored reports as if the query is stored, when the user attempts to fetch it one finds that the query is being re-calculated.

The problem is basically caused because get_query uses redis to check whether the query has been completed, which will be False if the redis has been flushed:

...
state_machine = QueryStateMachine(self.redis, self.md5)
state_machine.wait_until_complete()
if state_machine.is_completed and self.connection.has_table(
                schema=schema, name=name
):
    try:
        touch_cache(self.connection, self.md5)
   ...

On the other hand, is_stored is simpler and only looks at whether the cache table exists:

...
try:
    schema, name = self.fully_qualified_table_name.split(".")
    return self.connection.has_table(name, schema)

Product
Flowamchine

Version
master branch

To Reproduce

  1. Create a new query: q = Query()
  2. Store the query: q.store()
  3. Flush the DB: q.redis.flush_db()
  4. Check that the query is indeed stored: q.is_stored
  5. Retrieve the query to find to your surprise the query is being recomputed: q.get_query()

Expected behavior
Cached queries should not be recomputed unless the user explicit asks for it. Flushing redis should not mean that the user wants to reset the cache. Rather, the redis database should be updated accordingly.

The call to get_query should set the QueryStateMachine to completed if the cache table exists.

@greenape
Copy link
Member

Ouch. Yep. Obviously ideally one would avoid flushing redis, but clearly the state machine needs to reconcile itself with what’s actually in the cache as well.

@gzagatti
Copy link
Contributor Author

Yep, I agree. For now, I have update redis manually with:

   def update_redis(q):                                                                                                                                                                                                                                                        
       if q.is_stored:                                                                                                                                                                                                                                                         
          m = flowmachine.core.query_state.QueryStateMachine(q.redis, q.md5)                                                                                                                                                                                                  
          m.enqueue()                                                                                                                                                                                                                                                         
          m.execute()                                                                                                                                                                                                                                                         
          m.finish()           

@greenape greenape added the bug Something isn't working label Apr 17, 2019
@greenape
Copy link
Member

Got to be a bit careful here, because if Redis were to be flushed while something transactional was going on, then the state of FlowDB may be misleading. I'd be inclined to say that this should be a manually triggered reconcile, because doing it on the fly with potentially multiple flowmachine instances is gonna be asking for race conditions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants