Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager #6898

leventov · 2019-01-22T05:54:22Z

Concurrent maps (i.e. ConcurrentHashMap and ConcurrentSkipListMap) should be assigned to variables of their own type or of type ConcurrentMap, but not plain Map.

Why this is important can be seen in CoordinatorRuleManager, where it's pretty obvious that the code

ConcurrentMap<String, List<Rule>> theRules = rules.get();
if (theRules.get(dataSource) != null) {
  retVal.addAll(theRules.get(dataSource));
}

has a race condition, but previously, when the type of the variable was Map, it was not obvious.
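
The race in the snippet above comes from calling get() twice: the entry can disappear between the null check and the second read. A minimal sketch of the problem and of one way to avoid it follows. It is not the code from this PR: the class name, the String-typed rules, and the helper methods are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for CoordinatorRuleManager; rules are plain Strings here.
class RuleLookupSketch {
  private final ConcurrentMap<String, List<String>> rules = new ConcurrentHashMap<>();

  void setRules(String dataSource, List<String> newRules) {
    rules.put(dataSource, newRules);
  }

  void removeRules(String dataSource) {
    rules.remove(dataSource);
  }

  // Racy: another thread may remove the entry between the two get() calls,
  // so the second get() can return null and addAll(null) throws NullPointerException.
  List<String> getRulesRacy(String dataSource) {
    List<String> retVal = new ArrayList<>();
    if (rules.get(dataSource) != null) {
      retVal.addAll(rules.get(dataSource));
    }
    return retVal;
  }

  // Safer: read the map once and only use the local reference afterwards.
  List<String> getRules(String dataSource) {
    List<String> retVal = new ArrayList<>();
    List<String> dataSourceRules = rules.get(dataSource);
    if (dataSourceRules != null) {
      retVal.addAll(dataSourceRules);
    }
    return retVal;
  }
}
```

Declaring the variable as ConcurrentMap does not change behavior; it only makes it visible to the reader that the map may change between two calls.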

This PR fixes that race condition in CoordinatorRuleManager. It also improves the logic in DirectDruidClient and ResourcePool.

…s; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool

QiuMM · 2019-01-22T06:45:37Z

server/src/main/java/org/apache/druid/client/DirectDruidClient.java

@@ -92,7 +93,8 @@

  private static final Logger log = new Logger(DirectDruidClient.class);

-  private static final Map<Class<? extends Query>, Pair<JavaType, JavaType>> typesMap = new ConcurrentHashMap<>();
+  private static final ConcurrentHashMap<Class<? extends Query>, Pair<JavaType, JavaType>> typesMap =


Change ConcurrentHashMap to ConcurrentMap.

Actually, it often should be the opposite: ConcurrentHashMap should be deliberately used instead of ConcurrentMap whenever compute(), computeIfAbsent(), etc. are called on the map, because ConcurrentHashMap guarantees atomicity and linearizability of such operations, but ConcurrentMap doesn't. E.g. ConcurrentSkipListMap merely guarantees that if two threads call computeIfAbsent() on the same key at the same time, the program won't crash with IllegalStateException or ConcurrentModificationException, but the lambdas could be computed in parallel and it's unknown which one wins.

I will go through this PR and change the types.

Got it, thanks for the explanation.
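
The atomicity guarantee discussed in this thread can be observed directly. The sketch below is only an illustration with hypothetical names: several threads call computeIfAbsent() for the same absent key on a ConcurrentHashMap, and a counter records how often the mapping function runs. ConcurrentHashMap documents that the function is applied at most once per key. With a ConcurrentSkipListMap in its place the count could be higher, because that map does not promise the function is applied atomically.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeIfAbsentAtomicitySketch {
  // Returns how many times the mapping function ran when `threads` threads
  // call computeIfAbsent() for the same absent key at about the same time.
  static int countComputations(int threads) {
    ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
    AtomicInteger computations = new AtomicInteger();
    CountDownLatch start = new CountDownLatch(1);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          start.await(); // line the threads up so the calls overlap
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        map.computeIfAbsent("key", k -> {
          computations.incrementAndGet();
          return 42;
        });
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("worker threads did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return computations.get();
  }
}
```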

QiuMM · 2019-01-22T06:55:21Z

server/src/main/java/org/apache/druid/client/DirectDruidClient.java

@@ -159,14 +158,15 @@ public int getNumOpenConnections()

    Pair<JavaType, JavaType> types = typesMap.get(query.getClass());
    if (types == null) {


Since computeIfAbsent is used, this if (types == null) check can be removed.

It could improve concurrency, see #4397 (comment). I'll add a comment.

Hmm, computeIfAbsent() may call get() again (ConcurrentHashMap doesn't, ConcurrentSkipListMap does), so I think we shouldn't use computeIfAbsent(); using put() is OK here.

If two queries of the same new type are run in parallel, there could be a race between them. Maybe it could be tolerated here because the computation (the body of the lambda) could be run in parallel and re-run for the same type with no harm, but computeIfAbsent() is clearer.

OK, it's fine.
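
To make the two options from this thread concrete, here is a small sketch with hypothetical names (TypeCacheSketch and describe() are not from DirectDruidClient). With get() followed by put(), two threads can both see null, both compute the value, and the last put() wins. That is harmless only when the computation is idempotent. computeIfAbsent() on a ConcurrentHashMap expresses the same intent in one atomic call.

```java
import java.util.concurrent.ConcurrentHashMap;

class TypeCacheSketch {
  private final ConcurrentHashMap<Class<?>, String> typeNames = new ConcurrentHashMap<>();

  // Hypothetical stand-in for a more expensive derivation of per-class metadata.
  private static String describe(Class<?> clazz) {
    return clazz.getSimpleName();
  }

  // Check-then-put: not atomic. Two threads may both compute the value;
  // tolerable only because describe() returns the same result every time.
  String describeWithPut(Class<?> clazz) {
    String name = typeNames.get(clazz);
    if (name == null) {
      name = describe(clazz);
      typeNames.put(clazz, name);
    }
    return name;
  }

  // Atomic on ConcurrentHashMap: the function is applied at most once per key.
  String describeWithComputeIfAbsent(Class<?> clazz) {
    return typeNames.computeIfAbsent(clazz, TypeCacheSketch::describe);
  }
}
```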

QiuMM · 2019-01-22T06:56:07Z

server/src/main/java/org/apache/druid/curator/discovery/CuratorDruidNodeDiscoveryProvider.java

@@ -67,7 +67,7 @@

  private ExecutorService listenerExecutor;

-  private final Map<NodeType, NodeTypeWatcher> nodeTypeWatchers = new ConcurrentHashMap<>();
+  private final ConcurrentHashMap<NodeType, NodeTypeWatcher> nodeTypeWatchers = new ConcurrentHashMap<>();


Same, change ConcurrentHashMap to ConcurrentMap.

computeIfAbsent is called on this one, https://github.com/apache/incubator-druid/blob/101ca9c00f5fd3721abc01add0aa2b81afcf381d/server/src/main/java/org/apache/druid/curator/discovery/CuratorDruidNodeDiscoveryProvider.java#L91

QiuMM · 2019-01-22T06:58:12Z

server/src/main/java/org/apache/druid/segment/realtime/appenderator/AppenderatorImpl.java

@@ -121,7 +121,7 @@
  private final IndexIO indexIO;
  private final IndexMerger indexMerger;
  private final Cache cache;
-  private final Map<SegmentIdWithShardSpec, Sink> sinks = new ConcurrentHashMap<>();
+  private final ConcurrentHashMap<SegmentIdWithShardSpec, Sink> sinks = new ConcurrentHashMap<>();


Same, change ConcurrentHashMap to ConcurrentMap.

yeah, I guess this one looks like it could just be ConcurrentMap

QiuMM · 2019-01-22T07:09:43Z

server/src/main/java/org/apache/druid/server/coordinator/ReplicationThrottler.java

-      return retVal;
+      ConcurrentMap<SegmentId, String> segments = currentlyProcessingSegments.get(tier);
+      List<String> segmentsAndHosts = new ArrayList<>();
+      segments.forEach((segmentId, serverId) -> segmentsAndHosts.add(segmentId + " ON " + serverId));


I wonder why you didn't use StringUtils.format.

Simple string concatenation should be faster because it doesn't involve parsing the format string.
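
For comparison, both variants are sketched below with hypothetical names. Druid's StringUtils.format is not reproduced here; the JDK's String.format is used as an assumed stand-in. The two methods produce the same strings, but the second has to parse the format string on every call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SegmentLabelSketch {
  // Plain concatenation, as in the diff above.
  static List<String> withConcatenation(Map<String, String> segments) {
    List<String> segmentsAndHosts = new ArrayList<>();
    segments.forEach((segmentId, serverId) -> segmentsAndHosts.add(segmentId + " ON " + serverId));
    return segmentsAndHosts;
  }

  // Same output through a format string (String.format as a stand-in).
  static List<String> withFormat(Map<String, String> segments) {
    List<String> segmentsAndHosts = new ArrayList<>();
    segments.forEach(
        (segmentId, serverId) -> segmentsAndHosts.add(String.format("%s ON %s", segmentId, serverId))
    );
    return segmentsAndHosts;
  }
}
```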

…erge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java

leventov · 2019-01-23T04:11:25Z

@QiuMM in a newer commit I've added comments and enforced that ConcurrentHashMap type is used when needed.

@jihoonson FYI I refactored Counters so that now it just holds static utility methods.

gianm

LGTM. I had a couple of comments and questions but nothing critical.

gianm · 2019-01-27T22:12:55Z

.idea/inspectionProfiles/Druid.xml

+        <constraint name="x" within="" contains="" />
+        <constraint name="y" nameOfExprType="java\.util\.concurrent\.ConcurrentMap" expressionTypes="java.util.concurrent.ConcurrentMap" exprTypeWithinHierarchy="true" within="" contains="" />
+      </searchConfiguration>
+      <searchConfiguration name="A ConcurrentHashMap on which compute() is called should be assinged into variables of ConcurrentHashMap type, not ConcurrentMap" text="$x$.compute($y$, $z$)" recursive="true" caseInsensitive="true" type="JAVA">


Consider including the rationale in this message: it is not obvious that it's because ConcurrentMap does not guarantee atomicity.

This field is not meant to be a message; it's a configuration name, and I think I already overuse these names. Probably neither desktop IntelliJ nor TeamCity CI is prepared for something multiline in this field.

gianm · 2019-01-27T22:19:13Z

indexing-service/src/main/java/org/apache/druid/indexing/common/Counters.java

  {
-    return intCounters.computeIfAbsent(key, k -> new AtomicInteger()).addAndGet(val);
+    // get() before computeIfAbsent() is an optimization to avoid locking in computeIfAbsent() if not needed.


Any idea why ConcurrentHashMap does not already employ an optimization like this?

That's a throughput vs. scalability tradeoff, plus a lack of information. We potentially do two operations instead of one, and in exchange avoid locking in some cases.

At sites where computeIfAbsent() is expected to find the key absent and compute the value most of the time, the get() guard just makes things worse.

There is also an area where it's hard for me to say which approach is better: when the map is big and computeIfAbsent() constitutes a significant part of the app's CPU consumption (the bigger the map and the hotter the computeIfAbsent() call, the more likely it is better not to guard computeIfAbsent() with get()). I think computeIfAbsent() is never anywhere near that hot on Druid nodes, but I could be wrong.

On ConcurrentHashMap's side, it would be useful if something like computeIfAbsentMoreScalableButMaybeDoingExtraWork() existed, which would not compute the hash bucket twice and would just walk the collision chain twice. But it's easy to imagine why such a method doesn't exist.
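
The guarded pattern discussed in this thread can be sketched as follows. This is an illustration with hypothetical names, not the refactored Counters class: a get() is tried first, and computeIfAbsent() is called only when the key is missing. It pays off when the key is usually present and costs an extra lookup when it is usually absent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class GuardedCounterSketch {
  // Adds val to the counter for key and returns the new total.
  static <K> int incrementAndGet(ConcurrentHashMap<K, AtomicInteger> counters, K key, int val) {
    // get() before computeIfAbsent(): the common "key already present" case
    // is served by a plain read.
    AtomicInteger counter = counters.get(key);
    if (counter == null) {
      // Still atomic: if another thread inserted the counter in the meantime,
      // computeIfAbsent() returns that counter instead of creating a second one.
      counter = counters.computeIfAbsent(key, k -> new AtomicInteger());
    }
    return counter.addAndGet(val);
  }
}
```

The map parameter is typed as ConcurrentHashMap rather than ConcurrentMap, in line with the rule this PR enforces for maps on which computeIfAbsent() is called.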

…zation; IdentityHashMap optimization

gianm · 2019-01-29T17:03:38Z

It looks like some unit tests are failing now with similar messages. Maybe a recent change broke something?

clintropolis

LGTM 👍

clintropolis · 2019-02-04T02:42:54Z

server/src/main/java/org/apache/druid/curator/discovery/CuratorDruidNodeDiscoveryProvider.java

@@ -67,7 +67,7 @@

  private ExecutorService listenerExecutor;

-  private final Map<NodeType, NodeTypeWatcher> nodeTypeWatchers = new ConcurrentHashMap<>();
+  private final ConcurrentHashMap<NodeType, NodeTypeWatcher> nodeTypeWatchers = new ConcurrentHashMap<>();


computeIfAbsent is called on this one, https://github.com/apache/incubator-druid/blob/101ca9c00f5fd3721abc01add0aa2b81afcf381d/server/src/main/java/org/apache/druid/curator/discovery/CuratorDruidNodeDiscoveryProvider.java#L91

clintropolis · 2019-02-04T02:47:32Z

server/src/main/java/org/apache/druid/segment/realtime/appenderator/AppenderatorImpl.java

@@ -121,7 +121,7 @@
  private final IndexIO indexIO;
  private final IndexMerger indexMerger;
  private final Cache cache;
-  private final Map<SegmentIdWithShardSpec, Sink> sinks = new ConcurrentHashMap<>();
+  private final ConcurrentHashMap<SegmentIdWithShardSpec, Sink> sinks = new ConcurrentHashMap<>();


yeah, I guess this one looks like it could just be ConcurrentMap

Prohibit assigning concurrent maps into Map-types variables and field…

101ca9c

…s; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool

leventov added Bug Area - Automation/Static Analysis labels Jan 22, 2019

leventov changed the title ~~Prohibit assigning concurrent maps into Map-types variables and fields~~ Prohibit assigning concurrent maps into Map-typed variables and fields Jan 22, 2019

leventov changed the title ~~Prohibit assigning concurrent maps into Map-typed variables and fields~~ Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager Jan 22, 2019

QiuMM reviewed Jan 22, 2019

leventov added 4 commits January 23, 2019 11:23

Remove unnecessary comment

19056c1

Checkstyle

2791d9f

Merge remote-tracking branch 'upstream/master' into concurrent-map-type

5071ea0

Fix getFromExtensions()

9d48d18

gianm approved these changes Jan 27, 2019

leventov added the WIP label Jan 28, 2019

jihoonson mentioned this pull request Jan 28, 2019

Introduce published segment cache in broker #6901

Merged

leventov removed the WIP label Jan 29, 2019

Add a reference to the comment about guarded computeIfAbsent() optimi…

14307c3

…zation; IdentityHashMap optimization

leventov force-pushed the concurrent-map-type branch from 11057ee to 14307c3 Compare January 29, 2019 04:07

Fix UriCacheGeneratorTest

e363b01

jon-wei added this to the 0.14.0 milestone Feb 1, 2019

leventov mentioned this pull request Feb 1, 2019

Design of MaterializedViewQuery #6977

Open

Workaround issue with MaterializedViewQueryQueryToolChest

cd472a1

clintropolis approved these changes Feb 4, 2019

leventov added the WIP label Feb 4, 2019

Strengthen Appenderator's contract regarding concurrency

4bb8a7a

leventov removed the WIP label Feb 4, 2019

clintropolis merged commit 0e926e8 into apache:master Feb 4, 2019

leventov deleted the concurrent-map-type branch February 5, 2019 02:57

clintropolis mentioned this pull request May 28, 2019

Use map.putIfAbsent() or map.computeIfAbsent() as appropriate instead of containsKey() + put() #7764

Merged

leventov mentioned this pull request Jan 30, 2020

CoordinatorRuleManager.rules doesn't need to store ConcurrentHashMap #9292

Closed
