-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds Streams API support for networking task of PDF.js project. #8617
Conversation
src/core/worker.js
Outdated
@@ -197,6 +197,115 @@ IPDFStreamRangeReader.prototype = { | |||
} | |||
|
|||
/** @implements {IPDFStream} */ | |||
class PDFNetworkStream { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the IRC discussion yesterday, I'm assuming that we shouldn't use class
here (or elsewhere), since this code is in src/core
!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm..., I see. Okay, I will change it to the way we implement class
in PDF.js, like: IPDFStream. Although, I still think class
looks nice :-)
src/core/worker.js
Outdated
@@ -197,6 +197,115 @@ IPDFStreamRangeReader.prototype = { | |||
} | |||
|
|||
/** @implements {IPDFStream} */ | |||
class PDFNetworkStream { | |||
constructor(data, msgHandler) { | |||
this._data = data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Underscore is usually used to denote a property, or method, that should be considered "private". However, in both PDFNetworkStreamReader
and PDFNetworkStreamRangeReader
below, you're directly accessing _data
(and _msgHandler
).
To me, this feels like abusing notation, and it'd be better it you change the properties to this.data = data
(and this.msgHandler
) here instead. (Another option would be to keep them "private" here, and add getters.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I will change it to this.data = data
and this.msgHandler = msgHandler
.
3ac72c1
to
4ad80e1
Compare
src/shared/util.js
Outdated
let sendStreamRequest = ({ stream, chunk, success, reason, }) => { | ||
let sendStreamRequest = ({ stream, chunk, transfers, | ||
success, reason, }) => { | ||
if (transfers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to ask: Why isn't it sufficient to simply always call
this.postMessage({ sourceName, targetName, stream, streamId,
chunk, success, reason, }, transfers);
and just let the postMessage
method deal with it?
Also, don't you need to add similar code in _processStreamMessage
below as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and just let the postMessage method deal with it?
Yeah, right. Seems like this.postMessage
is doing same thing, that I am trying here. Okay, calling just this.postMessage
sounds like a good idea. Thanks.
Also, don't you need to add similar code in _processStreamMessage below as well?
sendStreamResponse
mostly used for signaling *_complete
messages e.g. pull_complete
(to resolve callbacks), and I don't think we are going to send transfers in these types of messages, at least for now. Maybe we require this in future, but not sure. @yurydelendik, do we require this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the concern, but sendStreamRequest will have optional transfers, and I recommend just call this.postMessage({...}, transfers)
as mentioned above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thinking, do we really need same changes in sendStreamResponse? like:
let sendStreamResponse = ({ stream, transfers, success, reason, }) => {
this.postMessage({ sourceName, targetName, stream,
success, streamId, reason, }, transfers);
};
- sink.enqueue(result.value);
+ sink.enqueue(new Uint8Array(result.value), 1, [result.value]);
read() {
- return this._reader.read();
+ return this._reader.read().then(({value, done}) => {
+ if (done) { return {value: undefined, done: true}; }
+ return {value: value.buffer, done: false};
+ });;
}, |
src/display/api.js
Outdated
}, this); | ||
|
||
messageHandler.on('GetRangeReader', function(data, sink) { | ||
this._PDFNetworkStream = new PDFNetworkStream(data.data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to create new PDFNetworkStream (and data field can be removed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks.
b8be2db
to
02baba8
Compare
src/core/worker.js
Outdated
this.data = data; | ||
let source = data.source; | ||
this._contentLength = source.length; | ||
this.msgHandler = msgHandler; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Let's place this.msgHandler = msgHandler;
right after the this.data = data;
line, to group the properties that are set from the provided parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks.
src/core/worker.js
Outdated
this._isStreamingSupported = false; | ||
|
||
let readableStream = this._msgHandler.sendWithStream('GetReader', | ||
this._data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Indentation, let's align this directly under 'GetReader'
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
src/core/worker.js
Outdated
this._reader = readableStream.getReader(); | ||
|
||
this._headersReady = this._msgHandler.sendWithPromise('ReaderHeadersReady'). | ||
then((data) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Two more spaces of indentation on this line please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/core/worker.js
Outdated
this.onProgress = null; | ||
|
||
let readableStream = this._msgHandler.sendWithStream('GetRangeReader', | ||
{ begin, end, }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Indentation, let's align this directly under 'GetRangeReader' instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
src/display/api.js
Outdated
Promise.all([SystemJS.import('pdfjs/core/network'), | ||
SystemJS.import('pdfjs/core/worker')]).then((modules) => { | ||
Promise.resolve(SystemJS.import('pdfjs/core/worker')). | ||
then((modules) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Please indent this line with two more spaces, and also change it to then(worker) => {
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks.
src/display/api.js
Outdated
// Enqueue data chunk into sink, and transfer it | ||
// to other side as `Transferable` object. | ||
sink.enqueue(new Uint8Array(value), 1, [value]); | ||
}).catch(function(reason) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for an intermediate function here, }).catch(sink.error);
should work just as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, Thanks for pointing out.
src/display/api.js
Outdated
return; | ||
} | ||
sink.enqueue(new Uint8Array(value), 1, [value]); | ||
}).catch(function(reason) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for an intermediate function here, }).catch(sink.error);
should work just as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
src/shared/util.js
Outdated
let sendStreamRequest = ({ stream, chunk, transfers, | ||
success, reason, }) => { | ||
this.postMessage({ sourceName, targetName, stream, streamId, | ||
chunk, success, reason, }, transfers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: alignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, done with new commit. Thanks.
02baba8
to
a13a4f8
Compare
src/worker_loader.js
Outdated
@@ -28,7 +28,6 @@ importScripts('./shared/compatibility.js'); | |||
importScripts('../node_modules/systemjs/dist/system.js'); | |||
importScripts('../systemjs.config.js'); | |||
|
|||
Promise.all([SystemJS.import('pdfjs/core/network'), | |||
SystemJS.import('pdfjs/core/worker')]).then(function () { | |||
Promise.resolve(SystemJS.import('pdfjs/core/worker')).then(function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and in api.js, you can directly use SystemJS.import
without wrapping with Promise.resolve
:
SystemJS.import('pdfjs/core/worker').then(function () {...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks.
src/core/worker.js
Outdated
this._contentLength = null; | ||
this._isRangeSupported = false; | ||
this._isStreamingSupported = false; | ||
let queueingStrategy = { highWaterMark: 100, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove queueingStrategy from here and range reader, or make explicit highWaterMark:1 and size:()=>1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed in new commit. Thanks.
src/display/api.js
Outdated
}, this); | ||
|
||
messageHandler.on('GetRangeReader', function(data, sink) { | ||
this._rangeReader = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to let rangeReader
it must be unique for every new range request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
01ad046
to
70909a2
Compare
70909a2
to
9951972
Compare
src/display/network.js
Outdated
if (initialData && initialData.length > 0) { | ||
this._queuedChunks.push(initialData); | ||
} | ||
this._msgHandler = msgHandler; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need this._msgHandler and msgHandler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, done with new commit.
92b4dbd
to
79821fa
Compare
src/display/api.js
Outdated
this._networkStream = new PDFWorkerStream(data.source); | ||
} else { | ||
this._networkStream = new PDFNetworkStream(data); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this code to the WorkerTransport constructor and remove this.networkStreamCapability promise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with new commit. Thanks.
79821fa
to
a317816
Compare
src/display/api.js
Outdated
@@ -1522,6 +1522,14 @@ var WorkerTransport = (function WorkerTransportClosure() { | |||
this.destroyCapability = null; | |||
this._passwordCapability = null; | |||
|
|||
this._networkStream = null; | |||
this._fullReader = null; | |||
if (pdfDataRangeTransport /* === true = chunkedViewerLoading */) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The comment isn't completely easy to read at first, and furthermore I'm not sure how much value it adds!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed with new commit. Thanks.
a317816
to
cf02bcf
Compare
288e9c1
to
ec6de1a
Compare
/botio-linux preview |
From: Bot.io (Linux m4)ReceivedCommand cmd_preview from @yurydelendik received. Current queue size: 0 Live output at: http://54.67.70.0:8877/f0e8b6d03703f3d/output.txt |
/botio test |
From: Bot.io (Linux m4)ReceivedCommand cmd_test from @yurydelendik received. Current queue size: 1 Live output at: http://54.67.70.0:8877/31567a9079e83b3/output.txt |
From: Bot.io (Windows)ReceivedCommand cmd_test from @yurydelendik received. Current queue size: 0 Live output at: http://54.215.176.217:8877/02ee71ae88e44dc/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/f0e8b6d03703f3d/output.txt Total script time: 2.32 mins Published |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/31567a9079e83b3/output.txt Total script time: 16.58 mins
|
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/02ee71ae88e44dc/output.txt Total script time: 29.32 mins
|
network.js file moved to main thread and `PDFNetworkStream` implemented at worker thread, that is used to ask for data whenever worker needs.
ec6de1a
to
1091067
Compare
/botio-linux preview |
From: Bot.io (Linux m4)ReceivedCommand cmd_preview from @yurydelendik received. Current queue size: 1 Live output at: http://54.67.70.0:8877/b49b2e015836bb8/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/b49b2e015836bb8/output.txt Total script time: 2.32 mins Published |
Thank you for the patch. |
It seems that this patch may have caused some weird behaviour in the Firefox addon with Looking at data received through [1] It may be necessary to do "skip-cache" reloads, i.e. Ctrl+F5 to ensure that the PDF data isn't cached. |
I was encountering with behavior before: the initial XHR fetches quickly entire file when cached. I think it's normal for small files. We just need to pay attention if data and events is handled properly in chunks and streams. |
Huh, I'm seeing this when doing skip-cache reloads (as indicated above), so I don't think caching should be relevant then!? |
Not sure I'm getting the same behavior locally: PDFViewerApplication.progress reports proper percent values in ascending order (that start from 4). |
I can confirm that it works locally (with |
FWIW I'm testing on Nightly (e10s) with current development addon. (non-e10s is unusably broken) |
Adds Streams API support for networking task of PDF.js project.
This PR aims to add Streams API support for networking task of PDF.js. For the same,
network.js
file is moved to the main thread andPDFNetwokStream
is implemented at worker to talk to main viasendWithStream
and/orsendWithPromise
method ofMessageHandler
.Worker will ask for data(when needed) by sending stream messages to main side. PDF data is returned from the handlers(main to worker) in small chunks.
TODO:
enqueue needs to access tranfers parameter
check how size and highwatermark work (for now highwatermark = 1 and every chunk size = 1)