Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache for translations #227

Merged
merged 31 commits into from
Oct 27, 2021
Merged

Cache for translations #227

merged 31 commits into from
Oct 27, 2021

Conversation

jerinphilip
Copy link
Contributor

Mostly copied over from #202 (comment), with a few modifications.

@jerinphilip jerinphilip changed the title Cache with std::atomic on shared-pointers Cache with atomic on shared-pointers Oct 9, 2021
@kpu
Copy link
Member

kpu commented Oct 9, 2021

I had designed it with the intention you could take an existing shared_ptr object and stuff it in the cache. Is that a useful case?

@jerinphilip
Copy link
Contributor Author

Is that a useful case?

Yes? Start with Ptr<History> don't convert it since you fiercely hate conversions? For us:

using TranslationCache = AtomicCache<size_t, Ptr<History>>;

I'm still suspicious how there won't be any ABA in this cache, but doesn't seem to violate anything in the unit-tests (which are not that big, TBH). We are not editing one memory location, but a whole lot of it. Instructions (which enable lock-free atomics) operate at 1 mem-loc at a time, does it not? If it were this simple, why cant I find this everywhere online, I wonder?

@kpu
Copy link
Member

kpu commented Oct 9, 2021

I hate unnecessary conversions. I also hate unnecessary memory allocations especially if they are shared_ptr that doubles the memory allocations.

Do I really need shared_ptr<AtomicCache::Record<size_t, shared_ptr<History> > ?

Do not understand your concern about ABA. The object pointed to is kept alive the entire time it's in custody of a shared_ptr so nobody can create a second object there.

@jerinphilip
Copy link
Contributor Author

jerinphilip commented Oct 9, 2021

Do I really need shared_ptr<AtomicCache::Record<size_t, shared_ptr<History> > ?

I ran into std::atomic<..> requires is_trivially_copyable with AtomicCache::Record<size_t, shared_ptr<History>. 15ee836 (#227)

Even without Ptr<History> as value (AtomicCache<int, int>) instead, I am getting linking failures to atomic_load and atomic_store. aa6320a (#227).

I'm not sure where I'm headed, but here's some documentation. Looks like we have conversion, and ProcessedRequestSentence / storage is staying.

https://en.cppreference.com/w/cpp/atomic/atomic#Primary_template

The primary std::atomic template may be instantiated with any TriviallyCopyable type T satisfying both CopyConstructible and CopyAssignable. The program is ill-formed if any of following values is false:

https://en.cppreference.com/w/cpp/types/is_trivially_copyable:

Objects of trivially-copyable types that are not potentially-overlapping subobjects are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read().

@kpu
Copy link
Member

kpu commented Oct 9, 2021

What was the problem with the version I wrote using an array of std::shared_ptr<History> ?

@jerinphilip
Copy link
Contributor Author

jerinphilip commented Oct 9, 2021

What was the problem with the version I wrote using an array of std::shared_ptr<History>?

I barely tried to get that one to compile (assuming you wrote in a GitHub comment to provide a rough idea, not after testing), which is 2779090 (#227). Record simply came because the following inconsistency.

Find()
      const std::shared_ptr<Entry> &bucket = entries_[hash_(key) % entries_.size()];
                                                      ^

Store()
	        std::shared_ptr<Entry> &bucket = entries_[hash_(*entry) % entries_.size()];
                                                             ^

2779090 (#227) seems to at-least compile with no link errors (although I haven't tried to instantiate a specialization with size_t from marian::Words key and Ptr<History> value).

@kpu
Copy link
Member

kpu commented Oct 9, 2021

There are two ways to make a store's interface.

  1. Have a key-value struct which should be std::pair just for consistency with the standard library.
  2. Have a single opaque Entry class. The Hash functor is overloaded to accept both Entry and Key. Similarly the Equal functor is overloaded to accept both Entry and Key making a comparison on that. This is what you were missing.

But I think the issue is that the key is not a function of History so much as a function of the input sentence and model used. Therefore you need a size_t key alongside the History object. In that case a key-value store makes sense. It may as well be std::pair<std::size_t, History>. The History object looks moveable. You may as well move (not copy) it out of std::shared_ptr<History> misery into the std::pair<std::size_t, History>. Then we'll have std::vector<std::shared_ptr<std::pair<std::size_t, History> > > as the ultimate type stored by the class (though don't write it that way, use some typedefs).

@jerinphilip
Copy link
Contributor Author

Have a single opaque Entry class. The Hash functor is overloaded to accept both Entry and Key. Similarly the Equal functor is overloaded to accept both Entry and Key making a comparison on that. This is what you were missing.

Nothing about this looks pleasant even after multiple re-reads. I genuinely want to know when this works for a solution - could you please provide a reference to better understand the motivations of such an interface?

As for the previous comment, I've done the following:

using Record = std::pair<Key, Value>;

Do I really need shared_ptr<AtomicCache::Record<size_t, shared_ptr<History> > ?

Because History is not TriviallyCopyable, std::pair<size_t, History> is also not TriviallyCopyable. Set that aside, std::pair is not TriviallyCopyable to begin with. Hence we need std::shared_ptr<std::pair<...>> is my understanding as of now.

@jerinphilip jerinphilip changed the base branch from main to cache October 11, 2021 23:28
@jerinphilip
Copy link
Contributor Author

@kpu The bergamot-translator library part of cache is added now. I've changed base to a development branch, and to avoid major trouble at reviewing 1000+LoC, for review to be done incrementally. There will be broken/untested intermediates. Further changes will have CLI alterations (I can probably settle for a chaotic "union" config, at the cost of more headhurt) which will get me to test-apps (which contribute to a decent LoC). I request a review happen now so I can work on top of this to generate smaller diffs.

Do I really need shared_ptr<AtomicCache::Record<size_t, shared_ptr<History>> ?

This can be avoided by taking responsibility of mutex-buckets, and is done. (Key, Value) = (size_t(model, words), Ptr<History>). Since Ptr<History> is already allocated and cost incurred, an std::move into a History object only adds the value of hiding things (underlying objects are still Ptr Hypothesis linked list inside Beam). I expect WebAssembly to downgrade gracefully (it only complained at OS specific Semaphores in PCQueue). The atomic_load/share proposition internally had mutexes anyway.

std::lock_guard<std::mutex> lock(mutexBuckets_[mutexId]);
Record &candidate = records_[index];

if (!used_[index]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually care about these statistics enough to have an extra vector<bool>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no means to tell if I hit cache or not placing reasonable asserts (at test-apps) without this. Are there alternatives I can use?

The current assert is put in a sentence, get translation. Put same sentence again. assert(cache hits the second time).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The used variable is gone, hit/miss statistics remain.

Copy link
Member

@graemenail graemenail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some art to arriving at the default values for the cache in the different apps using it. It is clear that the BlockingService should have just a single mutex, but the others?

src/translator/cache.h Show resolved Hide resolved
src/translator/cache.h Outdated Show resolved Hide resolved
src/translator/request.cpp Outdated Show resolved Hide resolved
src/tests/units/cache_tests.cpp Outdated Show resolved Hide resolved
src/translator/cache.h Show resolved Hide resolved
src/tests/units/cache_tests.cpp Show resolved Hide resolved
src/translator/cache.h Outdated Show resolved Hide resolved
@jerinphilip
Copy link
Contributor Author

Is there some art to arriving at the default values for the cache in the different apps using it.

My understanding is this is a coarse-to-fine-grained locking control knob. An upper bound could simply be the min(number of processors, workers) simultaneously accessing the shared memory thus providing enough partitions to avoid contention. At 1 bucket, there's max contention. At min(worker, cores) there's hopefully minimum contention.

We did a search for this for L4 (#202 (comment)). Could do if necessary, bit cumbersome to setup.

I see that this should be documented somewhere. Will try to see where best fits.

@graemenail graemenail self-requested a review October 27, 2021 08:17
graemenail
graemenail previously approved these changes Oct 27, 2021
@jerinphilip jerinphilip changed the base branch from cache to main October 27, 2021 18:17
@jerinphilip jerinphilip dismissed graemenail’s stale review October 27, 2021 18:17

The base branch was changed.

@jerinphilip jerinphilip changed the title Cache with atomic on shared-pointers Cache for translations Oct 27, 2021
@jerinphilip jerinphilip merged commit 2b98c67 into browsermt:main Oct 27, 2021
@jerinphilip jerinphilip mentioned this pull request Oct 31, 2021
const Record &candidate = records_[index];
if (equals_(key, candidate.first)) {
value = candidate.second;
stats_.hits += 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit late to this party, but shouldn't that be atomic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a mutex above making this section atomic..?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, you're right with regards to stats..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::atomic<int>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants