[SPARK-6237][NETWORK] Network-layer changes to allow stream upload. #21346

squito · 2018-05-16T18:39:56Z

These changes allow an RPCHandler to receive an upload as a stream of
data, without having to buffer the entire message in the FrameDecoder.
The primary use case is for replicating large blocks. By itself, this change is adding dead-code that is not being used -- it is a step towards SPARK-24296.

Added unit tests for handling streaming data, including successfully sending data, and failures in reading the stream with concurrent requests.

Summary of changes:

Introduce a new UploadStream RPC which is sent to push a large payload as a stream (in contrast, the pre-existing StreamRequest and StreamResponse RPCs are used for pull-based streaming).
Generalize RpcHandler.receive() to support requests which contain streams.
Generalize StreamInterceptor to handle both request and response messages (previously it only handled responses).
Introduce StdChannelListener to abstract away common logging logic in ChannelFuture listeners.

SparkQA · 2018-05-16T18:56:12Z

Test build #90693 has finished for PR 21346 at commit 49e0a80.

This patch fails MiMa tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
public class StreamInterceptor<T extends Message> implements TransportFrameDecoder.Interceptor
public final class UploadStream extends AbstractMessage implements RequestMessage
public class StreamData

JoshRosen · 2018-05-16T22:12:26Z

It's been a little while since I've thought about this issue, so I have a few clarifying questions to help me understand the high-level changes:

I recall that the problem with large shuffle blocks was that the OneForOneBlockFetcher strategy basically read the entire block as a single chunk, which becomes a problem for large blocks. I understand that we have now removed this limitation for shuffles by using a streaming transfer strategy only for large blocks (above some threshold). Is this patch conceptually doing the same thing for push-based communication where the action is initiated by a sender (e.g. to push a block for replication)? Does it also affect pull-based remote cache block reads or will that be handled separately?
Given that we already seem to have pull-based openStream() calls which can be initiated from the receive side, could we simplify things here by pushing a "this value is big, pull it" message and then have the remote end initiate a streaming read, similar to how DirectTaskResult and IndirectTaskResult work?
For remote reads of large cached blocks: is it true that this works today only if the block is on disk but fails if the block is in memory? If certain size limit problems only occur when things are cached in memory, can we simplify anything if we add a requirement that blocks above 2GB can only be cached on disk (regardless of storage level)?

SparkQA · 2018-05-16T22:48:27Z

Test build #90694 has finished for PR 21346 at commit fa3ac4e.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin

I plan to take a look at this at some point but I basically wanted to ask the same thing as Josh's question 2.

vanzin · 2018-05-16T23:33:37Z

project/MimaExcludes.scala

-    ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.ml.regression.DecisionTreeRegressionModel.this")
+    ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.ml.regression.DecisionTreeRegressionModel.this"),
+
+    // [SPARK-6237][NETWORK] Network-layer changes to allow stream upload


I think we started adding these at the top since that is cleaner (doesn't require changing the previous exclusion rule).

squito · 2018-05-17T02:47:37Z

All good questions and stuff I had wondered about too -- I should actually be sure to comment on these on the jira as well:

I recall that the problem with large shuffle blocks was that the OneForOneBlockFetcher strategy basically read the entire block as a single chunk, which becomes a problem for large blocks. I understand that we have now removed this limitation for shuffles by using a streaming transfer strategy only for large blocks (above some threshold). Is this patch conceptually doing the same thing for push-based communication where the action is initiated by a sender (e.g. to push a block for replication)?

yes

Does it also affect pull-based remote cache block reads or will that be handled separately?

that was already handled by https://issues.apache.org/jira/browse/SPARK-22062 (despite the title saying its something else entirely). That said, I recently discovered that my tests doing this for large blocks was incorrect, so I need to reconfirm this (I need to rearrange my test a little, and I've got a different aspect of this in flight so will take a couple of days probably).

Given that we already seem to have pull-based openStream() calls which can be initiated from the receive side, could we simplify things here by pushing a "this value is big, pull it" message and then have the remote end initiate a streaming read, similar to how DirectTaskResult and IndirectTaskResult work?

its certainly possible to do this, and I started taking this approach, but I stopped because replication is synchronous. So you'd have to add a callback for when the block is finally fetched, to go back to this initial call -- but also add timeout logic to avoid waiting forever if the destination went away. It all seemed much more complicated than doing it the way I'm proposing here.

For remote reads of large cached blocks: is it true that this works today only if the block is on disk but fails if the block is in memory? If certain size limit problems only occur when things are cached in memory, can we simplify anything if we add a requirement that blocks above 2GB can only be cached on disk (regardless of storage level)?

Correct; I'm currently investigating what we can do to address this. (sorry, again I discovered my test was broken shortly after posting this.) It would certainly simplify things if we only supported this for disk cached blocks -- what exactly are you proposing? Just failing when its cached in memory, and telling the user to rerun with disk caching? Changing the block manager to automatically cache on disk also when the block is > 2gb? Or when sending the block, just write it to a temp file, and then send from that?

The problem here is on the sending side, not the receiving side; netty uses an int to manage the length of a ByteBuf based msg, but it uses a long for a FileRegion based msg (code is a little different in the latest on branch 4.1, but same problem is still there). I'm investigating making a "FileRegion" that is actually backed by a ChunkedByteBuffer.

But that would go into another jira under SPARK-6235

squito · 2018-05-17T15:38:40Z

btw I may have made the pull-based approach sound more complex than I meant to, I'm happy to take that approach if you think its better. The fact the replication is synchronous doesn't really matter, I just meant its not a fire-and-forget msg, we have to setup the callbacks to confirm the block has been fetched (or a failure). It just seemed like extra indirection to me, and I thought it would be better to stay closer to the UploadBlock path.

Are there particular reasons you think that approach would be better? I guess the receiver can throttle the requests, but on the other hand the task on the sender will block waiting for the replication to finish (whether its success or failure), so we really don't want it to wait too long.

These changes allow an RPCHandler to receive an upload as a stream of data, without having to buffer the entire message in the FrameDecoder. The primary use case is for replicating large blocks. Added unit tests.

SparkQA · 2018-05-23T21:48:36Z

Test build #91056 has finished for PR 21346 at commit 3098b9c.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-05-23T21:57:28Z

Test build #4188 has finished for PR 21346 at commit 3098b9c.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-05-24T18:35:45Z

Test build #91121 has finished for PR 21346 at commit 2fef75f.

This patch fails Java style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-05-24T22:16:56Z

Test build #91122 has finished for PR 21346 at commit 54533c2.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-05-25T02:35:34Z

Test build #91136 has finished for PR 21346 at commit 32f4f94.

This patch fails Java style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-05-25T07:05:02Z

Test build #91138 has finished for PR 21346 at commit 331124b.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-05-25T19:24:20Z

Test build #4189 has finished for PR 21346 at commit 331124b.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

squito · 2018-05-25T20:07:15Z

Last failures are known flakies.

A few updates here from my last set of comments. I've posted an overall design doc, and shared the tests I'm running on a cluster. I think the tests cover all the cases care about, but would appreciate review on that tests too. I can change this to use the existing pull approach for large blocks, rather than updating the push one if you want. If you're OK with this, there will be one more PR on top of this to make use of the new uploadStream functionality.

There will be another PR as well to cover reading large remote blocks in memory for SPARK-24307

vanzin

After reading the code the changes seem simpler than I expected. While it would have been nice to re-use the existing feature if possible, I can see how a "please pull this" approach could make resource management on the server side a little more complicated; the server would have to keep around some list of things that are waiting to be pulled, when with the new RPC is just sends the message and doesn't need to keep state around for cleanup.

I need to go through the stuff you attached to the bug still.

vanzin · 2018-05-25T20:02:52Z

common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java

+
+    channel.writeAndFlush(new UploadStream(requestId, meta, data))
+        .addListener(future -> {
+          if (future.isSuccess()) {


First reaction is that it's about the right time to refactor this into a helper method... all instances in this class look quite similar.

vanzin · 2018-05-25T20:05:15Z

common/network-common/src/main/java/org/apache/spark/network/protocol/UploadStream.java

+import org.apache.spark.network.buffer.NettyManagedBuffer;
+
+/**
+ * An RPC with data that is sent outside of the frame, so it can be read in a stream.


as a stream?

vanzin · 2018-05-25T20:08:53Z

common/network-common/src/main/java/org/apache/spark/network/server/RpcHandler.java

@@ -38,15 +38,24 @@
   *
   * This method will not be called in parallel for a single TransportClient (i.e., channel).
   *
+   * The rpc *might* included a data stream in <code>streamData</code>(eg. for uploading a large


space before (

vanzin · 2018-05-25T20:13:03Z

common/network-common/src/main/java/org/apache/spark/network/server/StreamData.java

+   *
+   * If an exception is thrown from the callback, it will be propogated back to the sender as an rpc
+   * failure.
+   * @param callback


either remove or document all parameters (and add an empty line before).

vanzin · 2018-05-25T20:15:17Z

...on/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java

@@ -23,25 +23,16 @@
 import com.google.common.base.Throwables;
 import io.netty.channel.Channel;
 import io.netty.channel.ChannelFuture;
+import org.apache.spark.network.protocol.*;


These are in the wrong place.

vanzin · 2018-05-25T20:25:09Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

+    res.successMessages = Collections.synchronizedSet(new HashSet<String>());
+    res.errorMessages = Collections.synchronizedSet(new HashSet<String>());
+
+    for (String stream: streams) {


space before :

vanzin · 2018-05-25T20:28:24Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

+    final String streamId;
+    final RpcResult res;
+    final Semaphore sem;
+    RpcStreamCallback(String streamId, RpcResult res, Semaphore sem) {


add empty line

vanzin · 2018-05-25T20:29:08Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

@@ -193,10 +299,78 @@ public void sendOneWayMessage() throws Exception {
    }
  }

+  @Test
+  public void sendRpcWithStreamOneAtATime() throws Exception {
+    for (String stream: StreamTestHelper.STREAMS) {


space before :

vanzin · 2018-05-25T20:34:30Z

common/network-common/src/test/java/org/apache/spark/network/StreamTestHelper.java

+package org.apache.spark.network;
+
+import com.google.common.io.Files;
+import org.apache.spark.network.buffer.FileSegmentManagedBuffer;


Wrong place... basically in every file you've changed.

ooops, sorry got used to the style checker warning finding these in scala. fixed these now.

vanzin · 2018-05-25T20:36:01Z

project/MimaExcludes.scala

@@ -36,6 +36,9 @@ object MimaExcludes {

  // Exclude rules for 2.4.x
  lazy val v24excludes = v23excludes ++ Seq(
+    // [SPARK-6237][NETWORK] Network-layer changes to allow stream upload
+    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.network.netty.NettyBlockRpcServer.receive"),


Kinda wondering why this class is public in the first place... along with SparkTransportConf in the same package.

I suspect that it's because we might want to access these across Java package boundaries and Java doesn't have the equivalent of Scala's nested package scoped private[package].

I only see references to them in Scala code... also private[package] translates to public in Java, so that would at least avoid the mima checks.

vanzin · 2018-05-25T21:33:28Z

I've posted an overall design doc, and shared the tests I'm running on a cluster.

Where did you post those? Couldn't find them on the bug, nor the bug linked from that bug.

EDIT: nevermind, it's linked from the "epic" (SPARK-6235).

SparkQA · 2018-05-27T02:23:21Z

Test build #91192 has finished for PR 21346 at commit f4d9123.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
class RpcChannelListener extends StdChannelListener

SparkQA · 2018-05-27T07:05:02Z

Test build #91195 has finished for PR 21346 at commit 7bd1b43.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-05-29T19:52:00Z

Test build #4190 has finished for PR 21346 at commit 7bd1b43.

This patch fails Spark unit tests.
This patch does not merge cleanly.
This patch adds no public classes.

vanzin · 2018-06-04T17:47:02Z

common/network-common/src/main/java/org/apache/spark/network/server/RpcHandler.java

   * @param client A channel client which enables the handler to make requests back to the sender
   *               of this RPC. This will always be the exact same object for a particular channel.
   * @param message The serialized bytes of the RPC.
+   * @param streamData StreamData if there is data which is meant to be read via a StreamCallback;


I'm wondering if a separate callback for these streams wouldn't be better. It would at the very least avoid having to change all the existing handlers.

But it would also make it clearer what the contract is. For example, the callback could return the stream callback to be registered.

It also doesn't seem like StreamData itself has a lot of useful information other than the registration method, so it could be replaced with parameters in the new callback, avoiding having to expose that type to RPC handlers.

I've done this refactoring, and I agree it made the change significantly simpler.

SparkQA · 2018-06-04T20:26:45Z

Test build #91450 has finished for PR 21346 at commit 6c086c5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-06-13T19:18:15Z

Test build #91791 has started for PR 21346 at commit 3d28a1b.

SparkQA · 2018-06-13T19:29:24Z

Test build #91792 has finished for PR 21346 at commit cf991a9.

This patch fails Java style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-06-14T00:28:11Z

Test build #91793 has finished for PR 21346 at commit 8a18da5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

witgo · 2018-06-14T02:01:13Z

common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java

+      logger.trace("Sending RPC to {}", getRemoteAddress(channel));
+    }
+
+    long requestId = Math.abs(UUID.randomUUID().getLeastSignificantBits());


This Math.abs(UUID.randomUUID().getLeastSignificantBits()); is repeated twice. Move it to a separate new method .

SparkQA · 2018-06-14T19:38:42Z

Test build #91854 has finished for PR 21346 at commit 1a222aa.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin

Mostly style nits, aside from the test issue.

vanzin · 2018-06-14T21:59:44Z

common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java

-          logger.trace("Sending request {} to {} took {} ms", streamChunkId,
-            getRemoteAddress(channel), timeTaken);
+    channel.writeAndFlush(new ChunkFetchRequest(streamChunkId))
+      .addListener( new StdChannelListener(startTime, streamChunkId) {


nit: no space after (

vanzin · 2018-06-14T22:01:01Z

common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java


    return requestId;
  }

+  /**
+   * Send data to the remote end as a stream.   This differs from stream() in that this is a request


I know you're in the "2 spaces after period camp", but that's 3.

vanzin · 2018-06-14T22:05:47Z

common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java

+    return requestId;
+  }
+
+  private class StdChannelListener


I personally try to keep nested classes at the bottom of the enclosing class, but up to you.

vanzin · 2018-06-14T22:09:10Z

common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java

+      ManagedBuffer meta,
+      ManagedBuffer data,
+      RpcResponseCallback callback) {
+    long startTime = System.currentTimeMillis();


Seems like it should be easy to move this to StdChannelListener's constructor. Looks pretty similar in all methods.

I didn't do that the originally as I figured you wanted the startTime to be before writeAndFlush, but I can work around that too.

vanzin · 2018-06-14T22:20:26Z

...on/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java

+      respond(new RpcFailure(req.requestId, Throwables.getStackTraceAsString(e)));
+      // We choose to totally fail the channel, rather than trying to recover as we do in other
+      // cases.  We don't know how many bytes of the stream the client has already sent for the
+      // stream, its not worth trying to recover.


vanzin · 2018-06-14T22:34:23Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

+    final StreamSuite.TestCallback helper;
+    final OutputStream out;
+    final File outFile;
+    VerifyingStreamCallback(String streamId) throws IOException {


nit: add empty line

whoops, sorry I missed this one. fixed now

vanzin · 2018-06-14T22:35:09Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

+        base.get(expected);
+        assertEquals(expected.length, result.length);
+        assertTrue("buffers don't match", Arrays.equals(expected, result));
+


nit: remove

vanzin · 2018-06-14T22:38:09Z

common/network-common/src/test/java/org/apache/spark/network/StreamTestHelper.java

+  static final String[] STREAMS = { "largeBuffer", "smallBuffer", "emptyBuffer", "file" };
+
+  final File testFile;
+  File tempDir;


final for all these?

vanzin · 2018-06-14T22:38:28Z

common/network-common/src/test/java/org/apache/spark/network/StreamTestHelper.java

+    }
+  }
+
+


nit: remove

vanzin · 2018-06-14T22:39:15Z

common/network-common/src/test/java/org/apache/spark/network/StreamTestHelper.java

+
+  void cleanup() {
+    if (tempDir != null) {
+      for (File f : tempDir.listFiles()) {


JavaUtils.deleteRecursively.

witgo · 2018-06-14T02:23:07Z

common/network-common/src/main/java/org/apache/spark/network/protocol/UploadStream.java

+/**
+ * An RPC with data that is sent outside of the frame, so it can be read as a stream.
+ */
+public final class UploadStream extends AbstractMessage implements RequestMessage {


Is it possible to merge UploadStream and RpcRequest into a class?

perhaps, but do you think that is really that useful? the handling of them is different (both in the network layer and the outer RpcHandler). And other things being equal, I'm biased to fewer changes to existing code paths.

vanzin

Looks good overall, I'll give Josh some time to comment if he has anything to say.

I'd have moved the debug logging into the new listener too, but that's minor.

vanzin · 2018-06-15T17:50:59Z

...on/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java

+      TransportFrameDecoder frameDecoder = (TransportFrameDecoder)
+          channel.pipeline().get(TransportFrameDecoder.HANDLER_NAME);
+      ByteBuffer meta = req.meta.nioByteBuffer();
+      StreamCallbackWithID streamHandler = rpcHandler.receiveStream(reverseClient, meta, callback);


Check for null? Otherwise you'll get some weird NPE buried in some other code path.

vanzin · 2018-06-15T17:55:01Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

+    }
+    streamCallbacks.values().forEach(streamCallback -> {
+      try {
+        streamCallback.waitForCompletionAndVerify(TimeUnit.SECONDS.toMillis(5));


Isn't the wait part now redundant, after you waited for the semaphore?

vanzin · 2018-06-15T17:55:46Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

+    streamCallbacks.values().forEach(streamCallback -> {
+      try {
+        streamCallback.waitForCompletionAndVerify(TimeUnit.SECONDS.toMillis(5));
+      } catch (IOException e) {


Method throws Exception, so this seems unnecessary.

forEach doesn't like the IOException

SparkQA · 2018-06-15T20:11:47Z

Test build #91925 has finished for PR 21346 at commit ea4a1f5.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-06-20T00:29:54Z

Test build #92101 has finished for PR 21346 at commit fd62f61.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin

LGTM pending tests.

vanzin · 2018-06-26T17:30:19Z

common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java

+    final StreamSuite.TestCallback helper;
+    final OutputStream out;
+    final File outFile;
+    VerifyingStreamCallback(String streamId) throws IOException {


SparkQA · 2018-06-26T21:53:21Z

Test build #92347 has finished for PR 21346 at commit 58d52b9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-06-26T21:54:29Z

Test build #92349 has finished for PR 21346 at commit cd11abc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2018-06-26T22:56:31Z

Merging to master.

These changes allow an RPCHandler to receive an upload as a stream of data, without having to buffer the entire message in the FrameDecoder. The primary use case is for replicating large blocks. By itself, this change is adding dead-code that is not being used -- it is a step towards SPARK-24296. Added unit tests for handling streaming data, including successfully sending data, and failures in reading the stream with concurrent requests. Summary of changes: * Introduce a new UploadStream RPC which is sent to push a large payload as a stream (in contrast, the pre-existing StreamRequest and StreamResponse RPCs are used for pull-based streaming). * Generalize RpcHandler.receive() to support requests which contain streams. * Generalize StreamInterceptor to handle both request and response messages (previously it only handled responses). * Introduce StdChannelListener to abstract away common logging logic in ChannelFuture listeners. Author: Imran Rashid <[email protected]> Closes apache#21346 from squito/upload_stream. Ref: LIHADOOP-52972 RB=2070064

vanzin reviewed May 16, 2018

View reviewed changes

squito added 2 commits May 23, 2018 12:49

[SPARK-6237][NETWORK] Network-layer changes to allow stream upload.

5683b10

These changes allow an RPCHandler to receive an upload as a stream of data, without having to buffer the entire message in the FrameDecoder. The primary use case is for replicating large blocks. Added unit tests.

fix

3098b9c

squito force-pushed the upload_stream branch from fa3ac4e to 3098b9c Compare May 23, 2018 17:51

squito added 2 commits May 24, 2018 12:22

Merge branch 'master' into upload_stream

9dddf06

fixes

2fef75f

java style checks

54533c2

don't care whether error is closed or reset

32f4f94

style

331124b

vanzin reviewed May 25, 2018

View reviewed changes

review feedback

f4d9123

private

7bd1b43

squito mentioned this pull request May 29, 2018

[SPARK-24296][CORE] Replicate large blocks as a stream. #21451

Closed

vanzin reviewed Jun 4, 2018

View reviewed changes

squito added 3 commits June 12, 2018 21:49

wip refactoring

d357885

fixes

93a5adf

Merge branch 'master' into upload_stream

25e48f5

cleanup

cf991a9

squito force-pushed the upload_stream branch from 3d28a1b to cf991a9 Compare June 13, 2018 19:20

fix

8a18da5

witgo reviewed Jun 14, 2018

View reviewed changes

factor out requestId()

1a222aa

vanzin reviewed Jun 14, 2018

View reviewed changes

witgo reviewed Jun 15, 2018

View reviewed changes

review feedback

ea4a1f5

vanzin reviewed Jun 15, 2018

View reviewed changes

review feedback

fd62f61

Merge branch 'master' into upload_stream

58d52b9

vanzin reviewed Jun 26, 2018

View reviewed changes

review feedback

cd11abc

asfgit closed this in 16f2c3e Jun 26, 2018

Victsm mentioned this pull request Dec 14, 2020

[SPARK-32916][SHUFFLE][test-maven][test-hadoop2.7] Ensure the number of chunks in meta file and index file are equal #30433

Closed

[SPARK-6237][NETWORK] Network-layer changes to allow stream upload. #21346

[SPARK-6237][NETWORK] Network-layer changes to allow stream upload. #21346

Conversation

squito commented May 16, 2018 • edited Loading

SparkQA commented May 16, 2018

JoshRosen commented May 16, 2018

SparkQA commented May 16, 2018

vanzin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

squito commented May 17, 2018

squito commented May 17, 2018

SparkQA commented May 23, 2018

SparkQA commented May 23, 2018

SparkQA commented May 24, 2018

SparkQA commented May 24, 2018

SparkQA commented May 25, 2018

SparkQA commented May 25, 2018

SparkQA commented May 25, 2018

squito commented May 25, 2018

vanzin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vanzin commented May 25, 2018 • edited Loading

SparkQA commented May 27, 2018

SparkQA commented May 27, 2018

SparkQA commented May 29, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented Jun 4, 2018

SparkQA commented Jun 13, 2018

SparkQA commented Jun 13, 2018

SparkQA commented Jun 14, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented Jun 14, 2018

vanzin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vanzin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented Jun 15, 2018

SparkQA commented Jun 20, 2018

vanzin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented Jun 26, 2018

SparkQA commented Jun 26, 2018

vanzin commented Jun 26, 2018

squito commented May 16, 2018 •

edited

Loading

vanzin commented May 25, 2018 •

edited

Loading