Caffeine Cache #66
Hi, Noah! Thanks for contributing this, I will try to find time to consider it this weekend. In general, this sort of thing would benefit from discussing on the community Zulip stream before spending too much time on it. For example, the support for Version 6 is necessary for Beat Link to work inside Max/MSP, for afterglow-max, and so I would be hesitant to give it up simply to replace a library that is otherwise working just fine. Until you explained that it was necessary to work with native-image, I was leaning towards rejecting it out of hand. The ability to work with native-image does make it more likely to be something to consider, but that world is completely outside the scope of my own projects for which I created this library, because they use Clojure and rely on
Thanks for the consideration. I didn't spend too much time on this yet, beyond putting in necessary effort to get a proof of concept of a native compile working, which I believe I have fully proven out now. I plan to share my findings on how I got it working soon. I totally understand if version I now can see some other possibilities too. I believe I need a way to avoid ConcurrentLinkedHashMap at runtime, which likely could also be accomplished through other changes to ArtFinder such as a way to disable the cache, or a pluggable interface. Would you like all discussion about this to move to a Zulip thread?
The problem with concurrentlinkedhashmap under a native compile is that it crashes when it is initialized in the ArtFinder:
$ mvn -Pnative native:compile
$ % ./target/beatlink
Exception in thread "main" java.lang.NoSuchFieldError: the Unsafe
at com.googlecode.concurrentlinkedhashmap.ConcurrentHashMapV8$1.run(ConcurrentHashMapV8.java:4150)
at com.googlecode.concurrentlinkedhashmap.ConcurrentHashMapV8$1.run(ConcurrentHashMapV8.java:4140)
at [email protected]/java.security.AccessController.executePrivileged(AccessController.java:114)
at [email protected]/java.security.AccessController.doPrivileged(AccessController.java:571)
at com.googlecode.concurrentlinkedhashmap.ConcurrentHashMapV8.getUnsafe(ConcurrentHashMapV8.java:4139)
at com.googlecode.concurrentlinkedhashmap.ConcurrentHashMapV8.<clinit>(ConcurrentHashMapV8.java:4101)
at com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap.<init>(ConcurrentLinkedHashMap.java:221)
at com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap.<init>(ConcurrentLinkedHashMap.java:104)
at com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Builder.build(ConcurrentLinkedHashMap.java:1598)
at org.deepsymmetry.beatlink.data.ArtFinder.<init>(ArtFinder.java:240)
at org.deepsymmetry.beatlink.data.ArtFinder.<clinit>(ArtFinder.java:627)
at net.mixable.vizlab.beatl.BeatlApplication.startBeatLinkListeners(BeatlApplication.java:82)
at net.mixable.vizlab.beatl.BeatlApplication.main(BeatlApplication.java:60)
at [email protected]/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)
This discussion is being mirrored to Zulip thanks to the GitHub integration bot, so we can continue it here. If someone there wants to weigh in, they can either do so there and I can convey it over to GitHub, or they can create a GitHub account if they don’t already have one to comment here directly. I was thinking a bit about this overnight, and one of the things I was curious about is what could possibly make I see some ways forward. First, I could abandon compatibility with afterglow-max, and release a new major version that requires Java 8. There are probably not that many people who are still running in that environment, and they can perhaps stick to older versions of Beat Link. But I see this as a last resort. Another possibility which I like better is if you could fork the
The error comes from an "Unsafe" thing in concurrentlinkedhashmap. Some breadcrumbs:
This makes sense to me and seems like the pragmatic path forward. For now I'll work off my fork/branch, then circle back to this with a solution that maintains Java 6 compatibility.
@ben-manes if you have any more advice or ideas for a path that supports native and Java 6 I'm all ears. Truly appreciate all the open source work you all have done here.
The last version of ConcurrentLinkedHashMap switched to the Java 8 rewrite of ConcurrentHashMap, which was made available 3 years prior to the formal release. That version predated lambdas (as competing designs were being debated) and Java's multi-year release cycles were in limbo due to Sun/Oracle/Google mayhem. The speedup was very significant and the library had users to whom that mattered (e.g. Apache Cassandra). Internally the library doesn't use any V8 features, so switching back to the JDK's is a one-line change. Those who wanted to embed it have simply forked it into their code with that modification, e.g. Groovy.

For a small, quick-and-dirty, easy to maintain cache I'd recommend using Clock / SecondChance (1960s). That is a FIFO with a per-entry boolean indicating it was hit; on eviction it re-enqueues those entries with that flag unset. This gives an approximate LRU that supports concurrent reads at the worst-case cost of an O(n) eviction. That's perfectly fine if the cache is small, and if it is very large then a scan threshold might be preferred (rarely evict a hot entry by mistake to avoid GC-pause-like hiccups). There are variations that approximate MRU / LFU, but they get complex as they try to become broadly robust to different workloads. It is probably the most popular approach, e.g. Postgres' buffer pool (m-bit clock sweep) and Linux's page cache ("double clock" variant of ClockPro). You can see a simple example that I wrote in response to a user (I think Apache Cassandra) building an early alpha jar of CLHM from source that was not thread-safe yet (as I was still exploring algorithms). You can write one yourself in 20 or so LOC.

Another neat approach is random sampling: select K entries and evict the one with the lowest utility value (e.g. recency timestamp, frequency count). The reads are concurrent as they only have to update the entry for an approx. LRU, and a small K value is good enough (redis uses 5). It is also easy to implement, e.g. maintain an array of keys, and on insert + eviction the new entry's key takes the victim's slot. This approach is old and popular, but is probably not as well known because it doesn't offer much flexibility to improve the hit rate beyond an approx. LRU / LFU.

CLHM / Guava / Caffeine adopt a more complex scheme inspired by a database's write-ahead transaction log. Instead of updating the policy immediately (as in LRU every read is a write to the global order), the events are buffered in intermediate queues and replayed against the policy in a non-blocking fashion. We use lock-free ring buffers that are hashed to by the thread's id, discard events when full, and are drained if full or on a write. This captures a sample of the reads, allows for using any non-threadsafe eviction policy like an LRU, and an O(1) policy avoids surprising degradation scenarios like Clock's. It took a long time until I actually investigated alternative policies, where Caffeine's is much more advanced. It is very helpful for a general purpose library where you don't know the workloads or performance requirements of your users, so it must be perfect for everyone. Pragmatically it is overkill for a single-purpose cache where you want something very simple to write and maintain, so Clock or random sampling are good enough.
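To make the random sampling idea above concrete, here is a minimal Java sketch. It is not code from CLHM, Caffeine, Redis, or Beat Link; the class name SampledLruCache, its constructor parameters, and its methods are all hypothetical. It assumes a logical counter as the recency "timestamp", samples slots with replacement, and uses synchronized methods for simplicity, whereas the description above allows reads to run concurrently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of sampled eviction: look at a few random
// entries and evict the least recently used one among them.
final class SampledLruCache<K, V> {
    private static final class Entry<T> {
        T value;
        long lastAccess; // logical clock value of the most recent read or write
    }

    private final Map<K, Entry<V>> data = new HashMap<K, Entry<V>>();
    private final Object[] keys;   // one slot per cached key, filled from index 0
    private final int sampleSize;  // the "K" in the text; small values such as 5 suffice
    private final Random random;
    private long tick;             // logical clock, bumped on every access

    SampledLruCache(int capacity, int sampleSize, Random random) {
        if (capacity < 1 || sampleSize < 1) {
            throw new IllegalArgumentException("capacity and sampleSize must be positive");
        }
        this.keys = new Object[capacity];
        this.sampleSize = sampleSize;
        this.random = random;
    }

    synchronized V get(K key) {
        Entry<V> entry = data.get(key);
        if (entry == null) {
            return null;
        }
        entry.lastAccess = ++tick; // a read only touches the entry itself
        return entry.value;
    }

    synchronized void put(K key, V value) {
        Entry<V> entry = data.get(key);
        if (entry != null) {       // existing key: update in place, slot unchanged
            entry.value = value;
            entry.lastAccess = ++tick;
            return;
        }
        int slot;
        if (data.size() < keys.length) {
            slot = data.size();    // still room: use the next free slot
        } else {
            slot = pickVictimSlot();
            data.remove(keys[slot]); // evict, then the new key takes the victim's slot
        }
        entry = new Entry<V>();
        entry.value = value;
        entry.lastAccess = ++tick;
        keys[slot] = key;
        data.put(key, entry);
    }

    synchronized int size() {
        return data.size();
    }

    // Samples slots (with replacement) and returns the one whose entry is oldest.
    // Only called when every slot is occupied.
    private int pickVictimSlot() {
        int victim = -1;
        long oldest = Long.MAX_VALUE;
        for (int i = 0; i < sampleSize; i++) {
            int slot = random.nextInt(keys.length);
            long accessed = data.get(keys[slot]).lastAccess;
            if (accessed < oldest) {
                oldest = accessed;
                victim = slot;
            }
        }
        return victim;
    }
}
```

Because the victim is chosen from a random sample, which entry gets evicted can differ from a strict LRU; the only guarantees in this sketch are that the cache never exceeds its capacity and that a newly inserted key is always present right after the insert.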
Thanks for all this research, history and ideas! I very much agree that these impressive libraries have performance concerns that are worlds beyond our needs, so we should be able to go with something simple, focused, and minimal, and not run into problems. And even though I personally have no need for compatibility with native-image, I would like the library to be usable in as many contexts as possible, so I am happy to help get it there.
I am taking a crack at further simplifying and embedding the clock/second-chance example Ben so kindly shared above. Noah, I am curious, what cool thing are you building that will use Beat Link in a native-image context? 😄
There is so little concurrency pressure on this cache that we can get by with a handful of synchronized methods that do the coordinated work of updating the hash maps of data, usage flags, and the eviction queue. I have started 7.4.0-SNAPSHOT builds for the next release which incorporate this change, and remove the dependency on It appears to be working fine for me in Beat Link Trigger. Please let me know if it also works for you in native-image land. If you need me to cut a formal release of this change so you can release something of your own, let me know. Otherwise I will leave it as a snapshot while I am working on version 7.4.0 of BLT, in case there are any other coordinated changes I want to make. Unfortunately there wasn’t really anything for me to directly use from this PR, but I did cite it and acknowledge you in the change log.
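For readers who want to picture the shape of such a cache, the following is a minimal sketch of a clock / second-chance cache guarded by synchronized methods. It is not the code that went into Beat Link; the class name SecondChanceCache and its methods are hypothetical, and it assumes non-null keys and values and a fixed capacity. It keeps the three structures mentioned above: a map of data, a map of usage flags, and a FIFO eviction queue.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a second-chance (clock) cache: a FIFO queue
// whose entries get one reprieve if they were read since they were enqueued.
final class SecondChanceCache<K, V> {
    private final int capacity;
    private final Map<K, V> data = new HashMap<K, V>();
    private final Map<K, Boolean> used = new HashMap<K, Boolean>();
    private final ArrayDeque<K> queue = new ArrayDeque<K>();

    SecondChanceCache(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    synchronized V get(K key) {
        V value = data.get(key);
        if (value != null) {
            used.put(key, Boolean.TRUE); // mark as hit so eviction spares it once
        }
        return value;
    }

    synchronized void put(K key, V value) {
        if (key == null || value == null) {
            throw new NullPointerException("null keys and values are not supported");
        }
        if (data.containsKey(key)) {     // existing key: replace value, keep queue position
            data.put(key, value);
            return;
        }
        if (data.size() >= capacity) {
            evictOne();                  // make room before inserting the new entry
        }
        data.put(key, value);
        used.put(key, Boolean.FALSE);
        queue.addLast(key);
    }

    synchronized int size() {
        return data.size();
    }

    // Walks the queue from the head. Entries that were hit have their flag
    // cleared and go to the back; the first entry with a clear flag is evicted.
    // Worst case this visits every entry once, which is the O(n) eviction cost.
    private void evictOne() {
        while (true) {
            K candidate = queue.removeFirst();
            if (used.get(candidate)) {
                used.put(candidate, Boolean.FALSE);
                queue.addLast(candidate);
            } else {
                data.remove(candidate);
                used.remove(candidate);
                return;
            }
        }
    }
}
```

With a capacity of 2, inserting "a" and "b", reading "a", and then inserting "c" evicts "b": the read flagged "a", so the eviction scan moves it to the back of the queue and removes the unflagged "b" instead.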
@brunchboy my initial test is looking good! With Your change looks great to me, replacing an old dependency with a really simple solution. Thanks for the new release, thanks for the credit, and major thanks to Ben for his advice!
FYI I'm integrating beat-link into a DJ and VJ recording and performance app I'm building with the Tauri cross-platform app framework. A single binary built with native-image greatly helps with running it as a "sidecar".
That sounds awesome! I hope you will share a link to your project so I can highlight it once it is ready for public consumption. Meanwhile, shall we close this PR?
Closing this. Thanks again. I will circle back to document the native compile steps I figured out, and to share my project integrating beat-link when it's ready.
Use com.github.ben-manes.caffeine.
According to upstream docs:
Caffeine supports native-image which com.googlecode.concurrentlinkedhashmap does not.
Tested locally with: