Work with Cyberduck on transfer tuning #86

Closed
michael-conway opened this issue Apr 7, 2015 · 11 comments

@michael-conway
Collaborator

see
https://trac.cyberduck.io/ticket/8720

Make sure they are using parallel transfer rather than an output stream, and work on tuning.

@michael-conway
Collaborator Author

They are using I/O streams:

You can find the implementation at [1] which uses the IRODSFileFactory interface. This is the best option for us as it allows us to pipe the stream through our standard I/O shared with all protocols. I have written an alternative implementation that uses the DataTransferOperations interface [2] but couldn’t test it due to the iRODS maintenance window today.

— David

[1] https://svn.cyberduck.io/trunk/source/ch/cyberduck/core/irods/IRODSWriteFeature.java
[2] https://svn.cyberduck.io/trunk/source/ch/cyberduck/core/irods/IRODSUploadFeature.java
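To make the distinction in this thread concrete, here is a minimal sketch of the two transfer styles being compared: one sequential stream versus several threads that each move a byte range. It uses only the JDK, with local files standing in for the iRODS connection. It is not Jargon or Cyberduck code, and `TransferStyles`, `streamCopy`, `parallelCopy`, and `copyRange` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TransferStyles {

    // Style 1: a single sequential stream, as a generic I/O pipeline would use.
    static void streamCopy(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[64 * 1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }

    // Style 2: split the file into byte ranges and copy each range on its own thread.
    static void parallelCopy(Path source, Path target, int threads) throws Exception {
        long size = Files.size(source);
        long segment = (size + threads - 1) / threads; // ceiling division
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            List<Future<?>> pending = new ArrayList<>();
            for (long start = 0; start < size; start += segment) {
                final long from = start;
                final long to = Math.min(size, start + segment);
                pending.add(pool.submit(() -> {
                    copyRange(in, out, from, to);
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // propagate any failure from a worker
            }
        } finally {
            pool.shutdown();
        }
    }

    // Copies bytes [from, to) using positional reads and writes, which are safe across threads.
    private static void copyRange(FileChannel in, FileChannel out, long from, long to)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(64 * 1024);
        long position = from;
        while (position < to) {
            buffer.clear();
            buffer.limit((int) Math.min(buffer.capacity(), to - position));
            int read = in.read(buffer, position);
            if (read < 0) {
                break; // source ended earlier than expected
            }
            buffer.flip();
            long writeAt = position;
            while (buffer.hasRemaining()) {
                writeAt += out.write(buffer, writeAt);
            }
            position += read;
        }
    }

    public static void main(String[] args) throws Exception {
        Path source = Files.createTempFile("transfer", ".src");
        byte[] data = new byte[1_000_000];
        new Random(42).nextBytes(data);
        Files.write(source, data);
        Path viaStream = Files.createTempFile("transfer", ".stream");
        Path viaParallel = Files.createTempFile("transfer", ".parallel");
        streamCopy(source, viaStream);
        parallelCopy(source, viaParallel, 4);
        // Both styles should produce identical bytes; only the timing profile differs.
        System.out.println(Arrays.equals(Files.readAllBytes(viaStream),
                Files.readAllBytes(viaParallel)));
    }
}
```

The stream variant fits a shared I/O pipeline more naturally, which matches the reason given above for preferring it; the ranged variant only pays off when the server and network can service several ranges at once.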

@dkocher
Contributor

dkocher commented Apr 8, 2015

I can confirm that my tests show the DataTransferOperations interface gives us regular upload performance, although still not what we achieve with multipart uploads to S3 or OpenStack Swift.

@dkocher
Contributor

dkocher commented Apr 8, 2015

See also #87.

@michael-conway
Collaborator Author

David, checking in on this: is the current status that Cyberduck is using streaming I/O for download?

This would be better done using the DataTransferOperations 'get' method. I/O streaming is something that we'll be working on in 4.1, and it actually has more to do with core server tuning. I am happy to help code with you guys this week to get us onto 'get' processing rather than streaming download.

I am also trying to go through some of these other issues to get to a release this week.

@dkocher
Contributor

dkocher commented Apr 23, 2015

Added a transfer implementation for iRODS that uses the DataTransferOperations interface for downloads in r17427 [1]. It is not currently enabled due to issues #88 and #98.

[1] https://trac.cyberduck.io/changeset/17427

@michael-conway
Collaborator Author

Just curious: why is the status callback a blocker? I can take a look tomorrow, but I have to prioritize.

@michael-conway
Collaborator Author

Re iRODS vs. Swift: iRODS performance is related to a number of factors that don't have anything to do with DataTransferOperations. For example, is there replication happening? Are server-side policies being applied as files are uploaded? What is the current load/scale of the grid?

Comparisons between systems are, alas, often apples to oranges; the better benchmark will be between Cyberduck and bare iput/iget operations.

It can even be a question of what sort of checksum validation is being done for individual file transfers... With the Java API we tend to run 10-15% overhead over the pure C transfers, but are doing more things as part of the process (including checksum verification, the level of granularity of callbacks, etc).
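As a rough illustration of the like-for-like comparison suggested above, the sketch below times two transfer routines and reports throughput and relative overhead. `TransferBenchmark`, `throughputMBps`, `overheadPercent`, and `bestOf` are hypothetical helpers, and the `Thread.sleep` workloads are placeholders; a real run would wrap the Cyberduck transfer and a bare `iput`/`iget` against the same server and file.

```java
import java.util.concurrent.Callable;

final class TransferBenchmark {

    // Throughput in MB/s (1 MB = 1,000,000 bytes) for `bytes` moved in `nanos`.
    static double throughputMBps(long bytes, long nanos) {
        return (bytes / 1_000_000.0) / (nanos / 1_000_000_000.0);
    }

    // Extra time taken by the candidate, as a percentage of the baseline time.
    static double overheadPercent(long baselineNanos, long candidateNanos) {
        return 100.0 * (candidateNanos - baselineNanos) / baselineNanos;
    }

    // Runs the transfer `runs` times and returns the best wall-clock time in nanoseconds.
    static long bestOf(int runs, Callable<Void> transfer) throws Exception {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            transfer.call();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) throws Exception {
        long bytes = 50_000_000L; // size of the file assumed to be transferred
        // Placeholder workloads; swap in the real transfers being compared.
        long baseline = bestOf(3, () -> { Thread.sleep(100); return null; });
        long candidate = bestOf(3, () -> { Thread.sleep(112); return null; });
        System.out.printf("baseline  %.1f MB/s%n", throughputMBps(bytes, baseline));
        System.out.printf("candidate %.1f MB/s%n", throughputMBps(bytes, candidate));
        System.out.printf("overhead  %.1f%%%n", overheadPercent(baseline, candidate));
    }
}
```

Taking the best of several runs is one simple way to reduce noise from grid load; it does not control for replication or server-side policies, which would need to be held constant separately.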

Depending on the network and topology we can also see faster performance than other protocols....

@michael-conway michael-conway modified the milestones: idrop for jargon 4.0.2 release support - 4.0.2.1, Maintenance release 4.0.2.2 includes iRODS 4.1 Apr 24, 2015
@dkocher
Contributor

dkocher commented Apr 24, 2015

We have set setComputeAndVerifyChecksumAfterTransfer and setComputeChecksumAfterTransfer both to false on TransferOptions.
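For context on what disabling those flags saves: a post-transfer checksum means reading the data again to hash it, on top of the transfer itself. The sketch below shows that extra pass using only the JDK. It is not Jargon's implementation; `ChecksumAfterTransfer`, `md5Hex`, and `copyAndVerify` are hypothetical names, a local file copy stands in for the transfer, and the choice of MD5 is an assumption made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ChecksumAfterTransfer {

    // Hashes the whole file; this is the additional full read a checksum step costs.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[64 * 1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Copies the file and, only when `verify` is true, compares source and target digests.
    static void copyAndVerify(Path source, Path target, boolean verify)
            throws IOException, NoSuchAlgorithmException {
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        if (verify && !md5Hex(source).equals(md5Hex(target))) {
            throw new IOException("Checksum mismatch after transfer: " + target);
        }
    }

    public static void main(String[] args) throws Exception {
        Path source = Files.createTempFile("checksum", ".src");
        Files.write(source, "example payload".getBytes(StandardCharsets.UTF_8));
        Path target = Files.createTempFile("checksum", ".dst");
        copyAndVerify(source, target, true);
        System.out.println("verified " + md5Hex(target));
    }
}
```

With `verify` set to false the copy is the only pass over the data, which is the behaviour a benchmark would want when isolating raw transfer speed.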

@michael-conway
Collaborator Author

I see that. I've checked Cyberduck out of svn and am now working on
getting ant set up, etc.

As it stands, I see the various functions implemented as subclasses;
that seems straightforward enough. I was unclear on what it means to say
that 'get' is turned off for download, as the GitHub issue implies. There
is a read/write that appears to use streams, as well as an upload/download,
and I was curious what the role of each of those features is.

I hope to do some testing against known servers here, so I can get a good
performance comparison. If you are testing against iPlant's grid (maybe
QA?), then there are many factors that come into play in trying to benchmark
performance.

Cheers
MC

On Fri, Apr 24, 2015 at 10:14 AM, David Kocher [email protected]
wrote:

We have set setComputeAndVerifyChecksumAfterTransfer and
setComputeChecksumAfterTransfer both to false on TransferOptions.

@dkocher
Contributor

dkocher commented Apr 24, 2015

You can enable the download and upload feature overrides using the patch below. This will disable the stream-based transfers and instead use the functionality from DataTransferOperations.

Index: source/ch/cyberduck/core/irods/IRODSSession.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- source/ch/cyberduck/core/irods/IRODSSession.java    (revision 17423)
+++ source/ch/cyberduck/core/irods/IRODSSession.java    (revision )
@@ -31,10 +31,12 @@
 import ch.cyberduck.core.features.Copy;
 import ch.cyberduck.core.features.Delete;
 import ch.cyberduck.core.features.Directory;
+import ch.cyberduck.core.features.Download;
 import ch.cyberduck.core.features.Find;
 import ch.cyberduck.core.features.Move;
 import ch.cyberduck.core.features.Read;
 import ch.cyberduck.core.features.Touch;
+import ch.cyberduck.core.features.Upload;
 import ch.cyberduck.core.features.Write;
 import ch.cyberduck.core.ssl.DefaultX509KeyManager;
 import ch.cyberduck.core.ssl.DisabledX509TrustManager;
@@ -170,6 +172,12 @@
         }
         if(type == Copy.class) {
             return (T) new IRODSCopyFeature(this);
+        }
+        if(type == Download.class) {
+            return (T) new IRODSDownloadFeature(this);
+        }
+        if(type == Upload.class) {
+            return (T) new IRODSUploadFeature(this);
         }
         return super.getFeature(type);
     }
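
The patch works by having the session hand out implementations keyed on the requested feature type, so returning `Download` and `Upload` implementations switches transfers away from the stream-based path. A stripped-down sketch of that dispatch is below. Every type in it is a hypothetical stand-in rather than one of Cyberduck's actual classes, and both the `null` fallback in the base session and the caller-side fallback to streaming are assumptions made for the example.

```java
final class FeatureLookupSketch {

    // Hypothetical feature interfaces, standing in for the real ones.
    interface Download { String describe(); }
    interface Upload { String describe(); }

    static class BulkDownload implements Download {
        public String describe() { return "bulk get"; }
    }

    static class BulkUpload implements Upload {
        public String describe() { return "bulk put"; }
    }

    static class BaseSession {
        // In this sketch the base session offers no specialised features.
        <T> T getFeature(Class<T> type) {
            return null;
        }
    }

    static class SketchSession extends BaseSession {
        @Override
        @SuppressWarnings("unchecked")
        <T> T getFeature(Class<T> type) {
            if (type == Download.class) {
                return (T) new BulkDownload();
            }
            if (type == Upload.class) {
                return (T) new BulkUpload();
            }
            return super.getFeature(type);
        }
    }

    // Caller side: use the override when the session provides one, else fall back to streaming.
    static String chooseDownload(BaseSession session) {
        Download override = session.getFeature(Download.class);
        return override != null ? override.describe() : "stream read";
    }

    public static void main(String[] args) {
        System.out.println(chooseDownload(new BaseSession()));
        System.out.println(chooseDownload(new SketchSession()));
    }
}
```

Keeping the lookup in one place means the override can be enabled or removed with a small patch like the one above, without touching the callers.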

@michael-conway
Collaborator Author

Appears to duplicate several other issues. It looks like get was not being called, with I/O streams used instead. Get was turned off because of an incorrect length on the first intra-file status callback, but that issue was fixed separately. Let's evaluate the get performance using DataTransferOperations versus streaming I/O, and we can review any outstanding issues.
