-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove synchronized
from LocalCache::get(key, loader)
to allow for VirtualThread
-friendly value-loading
#6845
Comments
I would be happy to work on a PR for this myself, If someone can explain to me what exactly the I have been trying to wrap my head around this for some time now (without success), and I can't find a test case that covers this. |
The fix was made before common.cache was open-sourced, so I can't link to a specific commit for it. The fix cited #369. It added the following test, which was removed as part of a switch to "white-box" tests, again before the code was open-sourced: public void testRecursiveDuplicateComputation() throws InterruptedException {
final class RecursiveFunction implements Function<String, String> {
Map<String, String> map;
@Override public String apply(String key) {
return map.get(key);
}
}
final RecursiveFunction computeFunction = new RecursiveFunction();
final ConcurrentMap<String, String> map = new MapMaker()
.makeComputingMap(computeFunction);
computeFunction.map = map;
// tells the test when the compution has completed
final CountDownLatch doneSignal = new CountDownLatch(1);
Thread thread = new Thread() {
@Override public void run() {
try {
map.get("foo");
} finally {
doneSignal.countDown();
}
}
};
thread.setUncaughtExceptionHandler(new UncaughtExceptionHandler() {
@Override public void uncaughtException(Thread t, Throwable e) {}
});
thread.start();
boolean done = doneSignal.await(1, TimeUnit.SECONDS);
if (!done) {
StringBuilder builder = new StringBuilder();
for (StackTraceElement trace : thread.getStackTrace()) {
builder.append("\tat ").append(trace).append('\n');
}
fail(builder.toString());
}
} The diff to the prod code was: @@ -17,6 +17,7 @@
package com.google.common.collect;
import static com.google.common.base.Preconditions.checkNotNull;
+import static com.google.common.base.Preconditions.checkState;
import com.google.common.base.Equivalence;
import com.google.common.base.Function;
@@ -119,7 +120,13 @@
// This thread solely created the entry.
boolean success = false;
try {
- V value = computingValueReference.compute(key, hash);
+ V value = null;
+ // Synchronizes on the entry to allow failing fast when a
+ // recursive computation is detected. This is not full-proof
+ // since the entry may be copied when the segment is written to.
+ synchronized (entry) {
+ value = computingValueReference.compute(key, hash);
+ }
checkNotNull(value, "compute() returned null unexpectedly");
success = true;
return value;
@@ -136,6 +143,7 @@
try {
while (true) {
try {
+ checkState(!Thread.holdsLock(entry), "Recursive computation");
ValueReference<K, V> valueReference = entry.getValueReference();
V value = valueReference.waitForValue();
if (value == null) { The Perhaps we should be using a All that said: Our usual recommendation is to use Caffeine instead of our caches. Caffeine has so far taken the position that it's not worth restructuring the code to support virtual threads because eventually I don't know whether we'd act on a PR here or not. On the one hand, I'm shocked that this is the only use of |
My recollection is that this was added in one of my first bug fixes when learning that code. The original author was on sabbatical and no one else knew the code very well, but a commonly reported footgun were deadlocks in production due to recursive loads. I didn't want to use a ThreadLocal because those had a reputation for performance issues (multiple rewrites). The alternative was to stash the thread id in the loading entry as extra bookkeeping, but that needed care for proper clean up. I advocated for the holdsLock trick because an uncontended lock is free, it had no extra memory (header bit), there were no observed degradations, and it was a very low risk change to unfamiliar code. We then unofficially supported recursive loads despite discouraging them, and future discussions wrt ConcurrentHashMap facing the same problem. (My name might not be on the internal commit bit due to my manager having preferred that our team ships CLs to others to avoid perceptions that 20% projects might distract from the core project's timeline) Virtual thread support in mutexes is on going work (latest status). It may have been premature to have released the feature in Java 21 without it, as it is very easy to deadlock in JDK only code due to ample usage of synchronization throughout the standard library. The answer isn't to remove all of that and switching to I still stand by my position in Caffeine that it is not worth the effort and instead the JDK team will solve it soonish. It is good to be aware of, but primarily as a reason to avoid adoption of VTs until they are more robust. |
Ok a couple of things first just so I understand correctly:
Regarding your question about other pinning @cpovirk : So far I haven't discovered anything else, but I'm still working my way through it. Regarding the possible fix using a The only question for me that remains is: would you consider a PR for this? I will probably end up implementing either the But I also absolutely understand your point though @ben-manes . I just happen to need a solution now, so I kind of just need to deal with all the quirks that |
I went ahead and implemented the An edgecase that this implementation would not detect is if one tries to recursively load the same entry, but with another "normal" loader in the middle. In that case the threadlocal would be reset by the second loader, and the third one could no longer determine that loader nr. 1 already tried to load this entry. But this really seems very niche to me. I also wrote a test for it. Please do let me know If you would like me to submit a PR, or if you have other suggestions or comments. |
Unfortunately circular recursive calls (a -> b -> a) is not unheard of for longer dependency chains. The recursion is rarely intentional but rather a side-effect of calling an application api that circles back eventually. A fast fail is much better than a deadlock since the developer won’t understand at first what went wrong. Thus, the ThreadLocal would need to capture the list of loading entries. That’s still not horrible or difficult to implement. Alternatively the loading reference entry would need to capture the thread. This entry type is temporary and swapped out once complete. That would avoid the entry copying concerns that the doc comment mentions since the copy could capture the thread as needed. This is probably the ideal approach, just a bit more work and think time to avoid regressions. If you can use Caffeine, then AsyncCache is compatible as it defers the compute to a future so the monitor is held briefly for establishing the mapping. The executor could be set to a VT and the synchronous view used for convenience. |
@ben-manes I updated my implementation to remeber the loading Thanks for the hint, I'm going to have a look at it. But if possible I would still prefer to fix the issue in guava. We use guava caches quite extensively, and I'm guessing that switching to caffeine would require a pretty major rework. |
Those changes look good. From a quick review it doesn't seem this entry type needs to be copied ( Caffeine is mostly a drop-in replacement for Guava's cache with minor differences. The performance difference can be quite stark, see below for a zipfian workload on my macbook with 16 threads and varying Guava's concurrencyLevel (defaults to 4). Whether that translates to realizable gains is application specific, so you might consider capturing a jfr profile to see if the caches remain a bottleneck after your fix. |
Ok great. Can I open a PR for this now? Cool. A migration to Caffeine is definitely on our radar and will happen at some point. |
No guarantees, but I'd say that opening a PR is likely to be worth your time. I can have a look at it and run some tests. Thanks. |
The fix is now released as part of 33.0.0. |
…12730). The `LocalCache` change led to [false reports of recursion during refresh](#6851 (comment)). This CL keeps the new tests from the `LocalCache` change, which already would have passed before the CL. Ideally we would additionally include [tests that demonstrate the refresh problem](#6851 (comment)), but we'll need to see if the contributor can sign a CLA. This rollback un-fixes #6845. Users of `common.cache` and virtual threads will likely have to wait until `synchronized` no longer pins threads. (And at that point, if not earlier, they should migrate to [Caffeine](https://github.com/ben-manes/caffeine) :)) RELNOTES=`cache`: Fixed a bug that could cause [false "recursive load" reports during refresh](#6851 (comment)). PiperOrigin-RevId: 605049501
…12730). The `LocalCache` change led to [false reports of recursion during refresh](#6851 (comment)). This CL keeps the new tests from the `LocalCache` change, which already would have passed before the CL. Ideally we would additionally include [tests that demonstrate the refresh problem](#6851 (comment)), but we'll need to see if the contributor can sign a CLA. This rollback un-fixes #6845. Users of `common.cache` and virtual threads will likely have to wait until `synchronized` no longer pins threads. (And at that point, if not earlier, they should migrate to [Caffeine](https://github.com/ben-manes/caffeine) :)) RELNOTES=`cache`: Fixed a bug that could cause [false "recursive load" reports during refresh](#6851 (comment)). PiperOrigin-RevId: 605049501
…12730). The `LocalCache` change led to [false reports of recursion during refresh](#6851 (comment)). This CL keeps the new tests from the `LocalCache` change, which already would have passed before the CL. Ideally we would additionally include [tests that demonstrate the refresh problem](#6851 (comment)), but we'll need to see if the contributor can sign a CLA. This rollback un-fixes #6845. Users of `common.cache` and virtual threads will likely have to wait until `synchronized` no longer pins threads. (And at that point, if not earlier, they should migrate to [Caffeine](https://github.com/ben-manes/caffeine) :)) RELNOTES=`cache`: Fixed a bug that could cause [false "recursive load" reports during refresh](#6851 (comment)). PiperOrigin-RevId: 605069776
...and, as #6851 (comment) notes, the fix was rolled back for 33.1.0. Given that the fix was really more of a workaround for a temporary issue with virtual threads (on which there has been progress), and given that we continue to recommend Caffeine over Guava's cache, and given that our previous fix caused problems, I don't anticipate that we'll work on this further. |
API(s)
How do you want it to be improved?
This load should be possible without using
synchronized
.Why do we need it to be improved?
Using synchronized here is not performant when the current Thread is a Java 21
VirtualThread
and the loader does some blocking operation, because thesynchronized
prevents the JVM from unmounting the Thread. The performance could be drastically improved by using p.e.ReentrantLock
.Example
Current Behavior
In the above example, the loader is invoked. This invocation happens inside a
synchronized
block incom.google.common.cache.LocalCache$Segment.lockedGetOrLoad
. Because the loader does I/O but is inside thesynchronized
block theVirtualThread
can not be unmounted and therefore blocks its carrier. This is bad for the throughput of the application. This behavior can be observed using any loader that does a blocking operation when it is invoked from aVirtualThread
(just start your JVM with-Djdk.tracePinnedThreads=full
). The performance impact is dependent on how long the loader takes, but since generally one only caches things that are expensive to do I believe this to be a problem for pretty much anyone usingLocalCache
in combination withVirtualThreads
.Desired Behavior
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad
should work gracefully with VirtualThreads and not block the carrier Thread if a loader does I/O or some other blocking operation.The method should therefore be refactored to lock using some other method, p.e. a
ReentrantLock
.Concrete Use Cases
We use a
LocalCache
to cache frequently used database objects. Our loader therefore executes a SQL query over the network, which is a blocking operation and takes comparatively long. We are currently experimenting with usingVirtualThreads
in order to improve the throughput of I/O heavy workloads.Checklist
I agree to follow the code of conduct.
I have read and understood the contribution guidelines.
I have read and understood Guava's philosophy, and I strongly believe that this proposal aligns with it.
The text was updated successfully, but these errors were encountered: