Implement IO.read
Categories
(Remote Protocol :: CDP, task, P1)
Tracking
(firefox73 fixed)
Tracking | Status | |
---|---|---|
firefox73 | --- | fixed |
People
(Reporter: whimboo, Assigned: whimboo)
References
(Blocks 3 open bugs, )
Details
(Whiteboard: [puppeteer-beta-mvp])
Attachments
(2 files)
IO.read allows to read data from a stream as specified by a handle
(uuid). It is required by Puppeteer's Page.printToPDF
method.
I would suggest that we maintain a singleton for the current session, which keeps track of in-memory (blob) data, and data as stored on disk (we don't want to keep everything in memory - like PDFs). Only data as referenced by the handle
can be accessed to prevent unauthorized access to random files.
Assignee | ||
Comment 1•5 years ago
|
||
Quick update here. I got it all implemented and it works fine for file streams. My first solution was to make use of nsIFileInputstream
and NetUtils.readInputStream()
but that is actually blocking the main process of Firefox and as such had to change it yesterday to make use OS.File which executes any IO via a worker on a different thread.
Also to prevent base64 encoding issues for binary content that may contain characters outside of the range of 0x0 to 0xFF, I want to directly encode a Uint8Array into a base64 string. As reading the base64 MDN article it looks like that I have to use an external library like https://github.com/beatgammit/base64-js to make use of that feature. But maybe I'm missing something there and there is an already existent XPCOM interface that could be utilized for that required purpose? Olli, are you aware of one or do you know whom to task about that? Thanks.
Comment 2•5 years ago
|
||
https://searchfox.org/mozilla-central/source/xpcom/io/nsIScriptableBase64Encoder.idl
And perhaps you could use that with https://searchfox.org/mozilla-central/rev/efdf9bb55789ea782ae3a431bda6be74a87b041e/xpcom/io/nsStringStream.h#53-61
Assignee | ||
Comment 3•5 years ago
|
||
Olli, thank you for the reply. Both proposed methods work on nsIInputStream only, and I had to move our code to OS.File
as mentioned above. As such I have a Uint8Array
what OS.File.read()
returns. Based on your reply I assume there is nothing available in Gecko right now to do that direct encoding?
Comment 4•5 years ago
|
||
NS_NewByteInputStream gives you an nsIInputStream if you have some data to pass in as a param.
And there are APIs to get the raw data out from Uint8Array
Assignee | ||
Comment 5•5 years ago
|
||
This patch implements the IO.read() method to allow
reading streams for files and blobs.
Therefor all the methods in the IO domain need a registry
for streams. Those have to be stored globally because
they need to be kept existent across different client
connections.
Assignee | ||
Comment 6•5 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #4)
NS_NewByteInputStream gives you an nsIInputStream if you have some data to pass in as a param.
And there are APIs to get the raw data out from Uint8Array
It doesn't have a scriptable interface definition, so there is no way for me to make use of it from the Remote Agent which is implemented in JS. Or would the DOM team be in favor to expose it to JS?
Comment 7•5 years ago
|
||
oh, right, this all needs to be JS.
I can't recall if we have anything for Uint8Array -> base64 exposed to JS.
Perhaps baku recalls.
Comment 8•5 years ago
|
||
Can you please give more context of this IO.read()? Do you have any spec?
Asking this because Blob/File have a stream() method. Is it enough?
Assignee | ||
Comment 9•5 years ago
|
||
Sadly there is no spec. So this documentation is everything what we have. The rest is reverse engineering. Maybe have a look at the attached patch to get a better sense. Btw if you get a 404 when opening the documentation try to reload or go to a different domain and back. The viewer is somewhat broken.
Which stream()
method are you referring to? I'm using OS.File.read()
to read the bytes as Uint8Array. Does that one directly return a base64 encoded string?
Comment 10•5 years ago
|
||
Thanks for the link. I'll write a few comments in your phab push.
Assignee | ||
Comment 11•5 years ago
|
||
(In reply to Andrea Marchesini [:baku] from comment #10)
Thanks for the link. I'll write a few comments in your phab push.
Thank you for the reply. I'm not sure if you have seen my own comments yet, so I will just set needinfo for you again.
Assignee | ||
Comment 12•5 years ago
|
||
I tested the IO.read
method again with Chrome and the chrome-remote-interface library. And as noticed the streams are not shared between connections:
➜ bin/client.js inspect
>>> var s = Page.printToPDF({ transferMode: "ReturnAsStream" });
>>> IO.read({ handle: "1", offset: -400 });
Promise { <pending>, inspect: [AsyncFunction] }
(To exit, press ^C again or ^D or type .exit)
➜ bin/client.js inspect
>>> var d = IO.read({ handle: "1", offset: -400 });
>>> (node:56220) UnhandledPromiseRejectionWarning: Error: Invalid stream handle
As such I will also limit the lifetime of streams to the current connection.
Assignee | ||
Comment 13•5 years ago
|
||
As it turned out this is too complicated for now. As such we will do it as a follow-up if necessary. For now by using uuids for the handle it will make it way harder to retrieve a stream handle of a different connection.
Comment 14•5 years ago
|
||
For historical record:
As the attached patches are using UUIDs for stream handles, the question of making the lifetime of the stream registry cache equivalent to that of a connection or to make it global becomes secondary, since a client would necessarily need to know the handle of the other connection’s stream in order to access it.
Updated•5 years ago
|
Assignee | ||
Comment 15•5 years ago
|
||
Comment 16•5 years ago
|
||
Assignee | ||
Updated•5 years ago
|
Comment 17•5 years ago
|
||
bugherder |
Updated•4 years ago
|
Description
•