Caching translations #201
I am not too familiar with the design of […]. We want to have: […]
Why do we need to hash marian words? We just need to worry about the input string (with sentence boundaries). We can assume the output is going to be the same every time?
The bergamot-translator itself is not thread-safe. We can have a single thread populate the cache before returning the callback.
What do you mean bergamot-translator is not thread-safe? One can submit multiple translation tasks concurrently and get them back; I thought I fixed that. There are concurrent hash table implementations out there. An ersatz version of this is bucket-level (or band-level) read-write locks instead of slapping a mutex around the whole thing.
Oh, didn't know it was fixed.
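For illustration only, a minimal sketch of the bucket-level read-write lock idea mentioned above: shard the map by key hash and give each shard its own `std::shared_mutex`, so lookups in different buckets never contend. All names here are hypothetical and not part of bergamot-translator.

```cpp
#include <array>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical sketch: a hash map split into shards, each guarded by its own
// reader-writer lock instead of one global mutex.
template <class Key, class Value, size_t NumShards = 16>
class ShardedMap {
public:
  bool find(const Key &key, Value &value) const {
    const Shard &shard = shardFor(key);
    std::shared_lock<std::shared_mutex> lock(shard.mutex);  // shared (read) lock
    auto it = shard.map.find(key);
    if (it == shard.map.end()) return false;
    value = it->second;
    return true;
  }

  void insert(const Key &key, Value value) {
    Shard &shard = shardFor(key);
    std::unique_lock<std::shared_mutex> lock(shard.mutex);  // exclusive (write) lock
    shard.map[key] = std::move(value);
  }

private:
  struct Shard {
    mutable std::shared_mutex mutex;
    std::unordered_map<Key, Value> map;
  };

  const Shard &shardFor(const Key &key) const {
    return shards_[std::hash<Key>{}(key) % NumShards];
  }
  Shard &shardFor(const Key &key) {
    return shards_[std::hash<Key>{}(key) % NumShards];
  }

  std::array<Shard, NumShards> shards_;
};
```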
I'm bringing in the draft implementation for a more concrete picture. Working with sentences might be possible with some reorganization.
I haven't accounted for a single writer flushing all history updates into the cache at the end. That seems like the better thing to do; will incorporate.
Any references you recommend? Is there an eviction policy we agree on, and in that case, what data structure for the ordering? Also, isn't updating the ordering a write on every cache fetch or update? I'm hoping I can fix the API so the integrations work now with the naive implementation and keep working when we replace it.
Hey. We had a discussion with Jerin about the cache implementation. Here are the main points: […]
My suggestion would be to try to fit Microsoft's L4 to our needs, if it is easy enough to strip Boost out of it.
https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic2 would be nice but it's C++20.
Oh, we fixed this in #227.
For potential use in outbound translation (#78) and https://github.com/XapaJIaMnu/translateLocally, we're implementing a translation caching mechanism.
The plan is to implement a simple LRU cache (using a map + linked list) with `std::string` or equivalent `marian::Words` as the key and `History` as the value.

It is agreed upon offline that the value is `History`. Since `marian::Words` are what generates a `History`, I propose we just implement a hash for `marian::Words` and use this as the key. With this, the cache is only marian primitives plus the cache code.
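A rough sketch of what that hash could look like, combining the word ids of a segment in a `hash_combine` style. The include path and the `toWordIndex()` accessor are my best guess at marian's API and may need adjusting; `WordsHasher` is an illustrative name.

```cpp
#include <cstddef>
#include <functional>

#include "data/types.h"  // marian::Word, marian::Words (path assumed)

// Sketch of a hash functor over marian::Words so it can key an unordered map.
struct WordsHasher {
  size_t operator()(const marian::Words &words) const {
    size_t seed = words.size();
    for (const marian::Word &word : words) {
      // hash_combine-style mixing of each word's vocabulary index.
      size_t h = std::hash<size_t>{}(word.toWordIndex());
      seed ^= h + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2);
    }
    return seed;
  }
};
```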
The easiest way to get this into the existing source is during the preparation of a `Request`. `Segments = marian::Words`, and we can have a stage during construction where we pre-fill the histories:

bergamot-translator/src/translator/request.cpp, Lines 19 to 29 in cb855be
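Roughly, the pre-fill stage could look like the sketch below. This is not the current `Request` code; `prefillFromCache`, `cache_`, and `fetch()` are placeholder names, and only the per-segment lookup is shown.

```cpp
// Hypothetical pre-fill stage run during Request construction: look every
// segment (marian::Words) up in the cache and record the History on a hit,
// so that sentence never needs to be sent to the batcher.
void Request::prefillFromCache() {
  for (size_t index = 0; index < segments_.size(); index++) {
    Ptr<History> cached = cache_->fetch(segments_[index]);  // nullptr on miss
    if (cached) {
      histories_[index] = cached;
    }
  }
}
```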
The iteration mechanism provided to convert to a batch can now skip the entries where `Histories` are already computed (not `nullptr`). The rest of the pipeline (`Response` construction matching the `Request`, etc.) will follow.

This means adding an `isCacheCompleted()` on `RequestSentence`, relaying back to `isCacheCompleted(size_t index)` on `Request`, for use in the batch preparation below:

bergamot-translator/src/translator/batcher.cpp, Lines 47 to 54 in cb855be
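The batch-preparation loop would then look something like the following sketch; the surrounding names (`addWholeRequest`, `numSegments()`, `addSentenceWithPriority`) paraphrase the existing batcher rather than quote it.

```cpp
// Sketch: while queueing a Request's sentences, skip the ones whose History
// was already pre-filled from the cache.
void Batcher::addWholeRequest(Ptr<Request> request) {
  for (size_t index = 0; index < request->numSegments(); index++) {
    if (request->isCacheCompleted(index)) {
      continue;  // History already available from the cache
    }
    RequestSentence sentence(index, request);
    addSentenceWithPriority(sentence);
  }
}
```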
The cache will have to be thread-safe, since multiple async calls can potentially read simultaneously. A cache fetch also updates the recently-used ordering (a write), so slapping a mutex around access seems reasonable(?).
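As a concrete (if naive) starting point, here is a single-mutex LRU sketch in the map + linked list shape described above; the class, method names, and capacity handling are illustrative only, not a committed design.

```cpp
#include <list>
#include <mutex>
#include <unordered_map>
#include <utility>

// Illustrative single-mutex LRU cache: even fetch() takes the lock because a
// hit moves the entry to the front of the recency list (a write).
template <class Key, class Value, class Hash = std::hash<Key>>
class LRUCache {
public:
  explicit LRUCache(size_t capacity) : capacity_(capacity) {}

  bool fetch(const Key &key, Value &value) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = map_.find(key);
    if (it == map_.end()) return false;
    order_.splice(order_.begin(), order_, it->second);  // mark most-recently-used
    value = it->second->second;
    return true;
  }

  void insert(const Key &key, Value value) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = map_.find(key);
    if (it != map_.end()) {
      it->second->second = std::move(value);
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    order_.emplace_front(key, std::move(value));
    map_[key] = order_.begin();
    if (map_.size() > capacity_) {  // evict the least-recently-used entry
      map_.erase(order_.back().first);
      order_.pop_back();
    }
  }

private:
  using Entry = std::pair<Key, Value>;
  size_t capacity_;
  std::mutex mutex_;
  std::list<Entry> order_;  // front = most recent, back = eviction candidate
  std::unordered_map<Key, typename std::list<Entry>::iterator, Hash> map_;
};
```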
The cache will be populated on `processHistory` at `Request`:

bergamot-translator/src/translator/request.cpp, Lines 37 to 47 in cb855be
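The write side could then be a small addition inside `processHistory`, roughly as sketched below; `cache_` is again a placeholder member, the signature is assumed, and the existing completion bookkeeping is elided.

```cpp
// Sketch: when a translated History arrives for sentence `index`, record it
// for Response construction (existing behaviour) and also publish it to the
// cache keyed by the segment's marian::Words.
void Request::processHistory(size_t index, Ptr<History> history) {
  histories_[index] = history;
  cache_->insert(segments_[index], history);  // new: reusable by later requests
  // ... existing completion logic (build the Response once every sentence has
  // a History) stays as it is.
}
```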
cc @kpu @XapaJIaMnu