"Page.navigate" has to block until a response received
Categories
(Remote Protocol :: CDP, defect, P1)
Tracking
(firefox77 fixed)
Tracking | Status | |
---|---|---|
firefox77 | --- | fixed |
People
(Reporter: whimboo, Assigned: impossibus)
References
(Blocks 3 open bugs)
Details
(Whiteboard: [puppeteer-beta-mvp])
Attachments
(5 files)
When checking the logs from Chrome I can see that the Page.navigate
command blocks until the Network.responseReceived
event has been emitted:
puppeteer:protocol SEND ► {"sessionId":"1717CA6E17AF1412C5304DFEB1CDF719","method":"Page.navigate","params":{"url":"http://localhost/~henrik/page.html","frameId":"EF117F4222FDCCDD62044A9F9A001C59"},"id":17} +3ms
puppeteer:protocol ◀ RECV {"method":"Network.requestWillBeSent","params":{"requestId":"D667AFB7CA9718CABDC8B52822C26DE4","loaderId":"D667AFB7CA9718CABDC8B52822C26DE4","documentURL":"http://localhost/~henrik/page.html","request":{"url":"http://localhost/~henrik/page.html","method":"GET","headers":{"Upgrade-Insecure-Requests":"1","User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36","Sec-Fetch-User":"?1"},"mixedContentType":"none","initialPriority":"VeryHigh","referrerPolicy":"no-referrer-when-downgrade"},"timestamp":45457.827741,"wallTime":1574717750.218332,"initiator":{"type":"other"},"type":"Document","frameId":"EF117F4222FDCCDD62044A9F9A001C59","hasUserGesture":false},"sessionId":"1717CA6E17AF1412C5304DFEB1CDF719"} +6ms
puppeteer:protocol ◀ RECV {"method":"Network.requestWillBeSentExtraInfo","params":{"requestId":"D667AFB7CA9718CABDC8B52822C26DE4","blockedCookies":[],"headers":{"Host":"localhost","Connection":"keep-alive","Upgrade-Insecure-Requests":"1","User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36","Sec-Fetch-User":"?1","Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3","Sec-Fetch-Site":"none","Sec-Fetch-Mode":"navigate","Accept-Encoding":"gzip, deflate, br","Accept-Language":"en-GB,en-US;q=0.9,en;q=0.8"}},"sessionId":"1717CA6E17AF1412C5304DFEB1CDF719"} +3ms
puppeteer:protocol ◀ RECV {"method":"Network.responseReceivedExtraInfo","params":{"requestId":"D667AFB7CA9718CABDC8B52822C26DE4","blockedCookies":[],"headers":{"Date":"Mon, 25 Nov 2019 21:35:50 GMT","Server":"Apache/2.4.34 (Unix) PHP/7.1.23","Last-Modified":"Thu, 21 Nov 2019 21:48:38 GMT","ETag":"\"53-597e242857980\"","Accept-Ranges":"bytes","Content-Length":"83","Keep-Alive":"timeout=5, max=100","Connection":"Keep-Alive","Content-Type":"text/html"},"headersText":"HTTP/1.1 200 OK\r\nDate: Mon, 25 Nov 2019 21:35:50 GMT\r\nServer: Apache/2.4.34 (Unix) PHP/7.1.23\r\nLast-Modified: Thu, 21 Nov 2019 21:48:38 GMT\r\nETag: \"53-597e242857980\"\r\nAccept-Ranges: bytes\r\nContent-Length: 83\r\nKeep-Alive: timeout=5, max=100\r\nConnection: Keep-Alive\r\nContent-Type: text/html\r\n\r\n"},"sessionId":"1717CA6E17AF1412C5304DFEB1CDF719"} +2ms
puppeteer:protocol ◀ RECV {"method":"Network.responseReceived","params":{"requestId":"D667AFB7CA9718CABDC8B52822C26DE4","loaderId":"D667AFB7CA9718CABDC8B52822C26DE4","timestamp":45457.834269,"type":"Document","response":{"url":"http://localhost/~henrik/page.html","status":200,"statusText":"OK","headers":{"Date":"Mon, 25 Nov 2019 21:35:50 GMT","Server":"Apache/2.4.34 (Unix) PHP/7.1.23","Last-Modified":"Thu, 21 Nov 2019 21:48:38 GMT","ETag":"\"53-597e242857980\"","Accept-Ranges":"bytes","Content-Length":"83","Keep-Alive":"timeout=5, max=100","Connection":"Keep-Alive","Content-Type":"text/html"},"headersText":"HTTP/1.1 200 OK\r\nDate: Mon, 25 Nov 2019 21:35:50 GMT\r\nServer: Apache/2.4.34 (Unix) PHP/7.1.23\r\nLast-Modified: Thu, 21 Nov 2019 21:48:38 GMT\r\nETag: \"53-597e242857980\"\r\nAccept-Ranges: bytes\r\nContent-Length: 83\r\nKeep-Alive: timeout=5, max=100\r\nConnection: Keep-Alive\r\nContent-Type: text/html\r\n\r\n","mimeType":"text/html","requestHeaders":{"Host":"localhost","Connection":"keep-alive","Upgrade-Insecure-Requests":"1","User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36","Sec-Fetch-User":"?1","Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3","Sec-Fetch-Site":"none","Sec-Fetch-Mode":"navigate","Accept-Encoding":"gzip, deflate, br","Accept-Language":"en-GB,en-US;q=0.9,en;q=0.8"},"requestHeadersText":"GET /~henrik/page.html HTTP/1.1\r\nHost: localhost\r\nConnection: keep-alive\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36\r\nSec-Fetch-User: ?1\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3\r\nSec-Fetch-Site: none\r\nSec-Fetch-Mode: navigate\r\nAccept-Encoding: gzip, deflate, br\r\nAccept-Language: en-GB,en-US;q=0.9,en;q=0.8\r\n","connectionReused":false,"connectionId":41,"remoteIPAddress":"[::1]","remotePort":80,"fromDiskCache":false,"fromServiceWorker":false,"fromPrefetchCache":false,"encodedDataLength":292,"timing":{"requestTime":45457.829565,"proxyStart":-1,"proxyEnd":-1,"dnsStart":0.814,"dnsEnd":0.822,"connectStart":0.822,"connectEnd":0.98,"sslStart":-1,"sslEnd":-1,"workerStart":-1,"workerReady":-1,"sendStart":1.016,"sendEnd":1.077,"pushStart":0,"pushEnd":0,"receiveHeadersEnd":3.332},"protocol":"http/1.1","securityState":"secure"},"frameId":"EF117F4222FDCCDD62044A9F9A001C59"},"sessionId":"1717CA6E17AF1412C5304DFEB1CDF719"} +2ms
puppeteer:protocol ◀ RECV {"id":17,"result":{"frameId":"EF117F4222FDCCDD62044A9F9A001C59","loaderId":"D667AFB7CA9718CABDC8B52822C26DE4"},"sessionId":"1717CA6E17AF1412C5304DFEB1CDF719"} +1ms
Right now our implementation returns immediately:
puppeteer:protocol SEND ► {"sessionId":1,"method":"Page.navigate","params":{"url":"http://localhost/~henrik/page.html","frameId":23},"id":17} +1s
puppeteer:protocol ◀ RECV {"sessionId":1,"id":17,"result":{"frameId":23}} +21ms
puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Network.requestWillBeSent","params":{"requestId":"100837242175489","loaderId":"100837242175489","documentURL":"about:blank","request":{"url":"http://localhost/~henrik/page.html","method":"GET","headers":[],"hasPostData":false,"isLinkPreload":false},"type":"Document","frameId":23}} +7ms
By doing that we would be able to also forward the loaderId
, and have a proper errorText
if a request fails. Both is currently missing.
Reporter | ||
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
Add priority tag because early return from Page.navigate introduces races in the life cycle of execution contexts, and therefore impacts many other commands.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 2•5 years ago
|
||
Seems like webNavigation could be useful here and for missing page life cycle events in general.
Reporter | ||
Comment 3•5 years ago
|
||
But that's only for browser extensions, right?
Reporter | ||
Comment 4•5 years ago
|
||
I think we really have to get the Network.responseReceived
event working (see comment 0) first, and let Page.navigate
wait for it.
Assignee | ||
Comment 5•5 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #3)
But that's only for browser extensions, right?
Yeah, nevermind. Plus, I mixed it up with nsIWebNavigation (which is very different, and not useful in this case).
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #4)
I think we really have to get the
Network.responseReceived
event working (see comment 0) first, and letPage.navigate
wait for it.
Perhaps. I'm not sure it's a good idea to mix domains that way, similar concern as in Bug 1603223, but it can be a short-term solution.
Reporter | ||
Comment 6•5 years ago
|
||
(In reply to Maja Frydrychowicz :maja_zf (UTC-4) (maja@mozilla.com) from comment #5)
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #4)
I think we really have to get the
Network.responseReceived
event working (see comment 0) first, and letPage.navigate
wait for it.Perhaps. I'm not sure it's a good idea to mix domains that way, similar concern as in Bug 1603223, but it can be a short-term solution.
Network.responseReceived
should emit an internal event, which the navigation code could subscribe for, and unsubscribe right after. That way I don't see that we would have to mix domain code.
Assignee | ||
Comment 7•5 years ago
|
||
Judging by how Page.navigate
is actually used in clients like Puppeteer and Playwright, I think all this method is expected to do is start the navigation and return an error, if any. In practice, clients call Page.navigate
and then wait for other page life cycle events to draw conclusions about the result of the navigation. Taking Puppeteer as an example (but the
same approach is taken in Playwright), their Page.goto
ultimately calls into a method that sends Page.navigate
and starts a life cycle watcher.
So, I will write a solution that largely leaves our Page.navigate
as is and instead make sure we emit other events as expected by clients.
(As an aside, I'd say this method is a good example of where we might diverge in the future in Remote Protocol versus CDP. The CDP semantics are perhaps too low-level here and clients have to write boiler-plate to handle basic navigation correctly. Instead we could handle those details in the Remote Protocol implementation.)
Henrik, before I change blockers and dependencies here, I'd like your input since you had concluded a few months ago that this is causing problems with screenshots and other features. I think our implementations of screenshots etc will have to account for life cycle events rather than relying solely on Page.navigate
to wait long enough.
Reporter | ||
Comment 8•5 years ago
|
||
(In reply to Maja Frydrychowicz :maja_zf (UTC-4) (maja@mozilla.com) from comment #7)
Judging by how
Page.navigate
is actually used in clients like Puppeteer and Playwright, I think all this method is expected to do is start the navigation and return an error, if any.
And that includes handling the response HTTP status code of the initial request. Otherwise the method won't be able to return error details like this one:
>>> var a = Page.navigate({ url: "https://untrusted-root.badssl.com/" });
>>> a
Promise {
{
frameId: 'D501758CF3A9566E820C8BA6B0113AE3',
loaderId: '034C2344514DB59666E1E1FAAF8ED8F7',
errorText: 'net::ERR_CERT_INVALID'
},
inspect: [AsyncFunction]
}
That given we have to await the Network.responseReceived
event and handle its data. See other examples in Puppeteer's navigation.spec.js unit tests. We won't get those details when firing off the page load, and returning immediately as what we do at the moment.
In practice, clients call
Page.navigate
and then wait for other page life cycle events to draw conclusions about the result of the navigation.
Correct. That is for all the events as fired during a page load like DOMContentLoaded
, and load
. So that part don't need to be covered here.
Taking Puppeteer as an example (but the
same approach is taken in Playwright), theirPage.goto
ultimately calls into a method that sendsPage.navigate
and starts a life cycle watcher.So, I will write a solution that largely leaves our
Page.navigate
as is and instead make sure we emit other events as expected by clients.
That is something different and not covered by this bug.
Henrik, before I change blockers and dependencies here, I'd like your input since you had concluded a few months ago that this is causing problems with screenshots and other features. I think our implementations of screenshots etc will have to account for life cycle events rather than relying solely on
Page.navigate
to wait long enough.
It didn't cause problems, but the examples weren't only waiting for load
but networkidle0
, which we haven't implemented.
Assignee | ||
Comment 9•5 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #8)
And that includes handling the response HTTP status code of the initial request. Otherwise the method won't be able to return error details like this one:
>>> var a = Page.navigate({ url: "https://untrusted-root.badssl.com/" }); >>> a Promise { { frameId: 'D501758CF3A9566E820C8BA6B0113AE3', loaderId: '034C2344514DB59666E1E1FAAF8ED8F7', errorText: 'net::ERR_CERT_INVALID' }, inspect: [AsyncFunction] }
That given we have to await the
Network.responseReceived
event and handle its data. See other examples in Puppeteer's navigation.spec.js unit tests. We won't get those details when firing off the page load, and returning immediately as what we do at the moment.
Fair point, thanks for the example. I'll try to use NetworkObserver or something along those lines to get the missing data. The Page domain can't depend on notifications from the Network domain (even an internal event, as you suggest) since those notifications aren't necessarily enabled.
Reporter | ||
Comment 10•5 years ago
|
||
Internal events are always send compared to the public ones. See the difference between request
, and Network.requestWillBeSent
.
Assignee | ||
Comment 11•5 years ago
|
||
This facilitates monitoring the navigation response.
The remaining work for moving navigation-related code into
the parent process should be done in Bug 1612538.
Assignee | ||
Comment 12•5 years ago
|
||
NetworkObserver will be used by both Network and Page domains.
Depends on D71654
Assignee | ||
Comment 13•5 years ago
|
||
Depends on D71655
Assignee | ||
Comment 14•5 years ago
|
||
Depends on D71656
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 17•5 years ago
|
||
Depends on D71657
Comment 18•5 years ago
|
||
Comment 19•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e5e712f9a861
https://hg.mozilla.org/mozilla-central/rev/82c49d62a2dc
https://hg.mozilla.org/mozilla-central/rev/1fdf3bef9319
https://hg.mozilla.org/mozilla-central/rev/9ecdb6c3e171
https://hg.mozilla.org/mozilla-central/rev/a1152a9c5dc5
Assignee | ||
Updated•5 years ago
|
Reporter | ||
Comment 20•5 years ago
|
||
(In reply to Maja Frydrychowicz :maja_zf (UTC-4) (maja@mozilla.com) from comment #12)
Created attachment 9141861 [details]
Bug 1599257 - [remote] Create one NetworkObserver per SessionNetworkObserver will be used by both Network and Page domains.
Please note that this was not enough yet given that TabSession
sub-classes Session
, and as such we register a new NetworkObserver
for each single tab, and destroy it when the tab gets closed:
0:08.15 GECKO(5071) 1588666265692 RemoteAgent TRACE <-(connection {a83ed2df-3857-964b-a46f-ee8a5065a4a2}) {"method":"Target.targetCreated","params":{"targetInfo":{"browserContextId":null,"targetId":33,"type":"page","url":"about:blank"}}}
0:08.21 GECKO(5071) 1588666265756 RemoteAgent TRACE Registering observers for NetworkObserver
0:08.22 GECKO(5071) 1588666265763 RemoteAgent TRACE <-(connection {a83ed2df-3857-964b-a46f-ee8a5065a4a2}) {"id":2}
0:08.23 GECKO(5071) 1588666265770 RemoteAgent TRACE <-(connection {a83ed2df-3857-964b-a46f-ee8a5065a4a2}) {"method":"Target.targetDestroyed","params":{"targetId":33}}
0:08.25 GECKO(5071) 1588666265794 RemoteAgent TRACE Unregistered observers for NetworkObserver
With my refactoring on bug 1593226 I will work towards making it a real singleton.
Updated•4 years ago
|
Description
•