Emit "init" lifecycleEvent in Runtime.evaluate when document.open() is called (to fix page.setContent in Puppeteer)
Categories
(Remote Protocol :: CDP, task, P1)
Tracking
(firefox76 fixed)
Tracking | Status | |
---|---|---|
firefox76 | --- | fixed |
People
(Reporter: whimboo, Assigned: impossibus)
References
(Blocks 1 open bug)
Details
(Whiteboard: [puppeteer-beta-mvp])
Attachments
(2 files)
Puppeteer relies on an init
lifecycleEvent when document.open()
gets called:
This is how it has been implemented in Chromium:
https://chromium.googlesource.com/chromium/src/+/16de2eabee54981bf2f6b74c45d876bb47264885%5E%21/
I asked Olli in how we could determine in Gecko that document.open
has been called. He proposed to use a readystatechange
listener, and then check when the document's state changed back to READYSTATE_UNINITIALIZED. When this happened we could emit the init
lifecycleEvent via the Page
domain.
Reporter | ||
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
This is needed for Page.setContent for Puppeteer, and it's used a lot in their unit tests. Perhaps we should change the tests to use data URIs if setContent isn't popular outside of Puppeteer unit tests.
Assignee | ||
Comment 2•5 years ago
|
||
Todo -- push to try with data URI returned by setContent to see how many tests we'll pass with this change.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 3•5 years ago
|
||
Replacing Puppeteer's Page.setContent implementation with just a navigation to a data URL works well for the vast majority of Puppeteer unit tests. When running this change against Chrome, test failures appear in navigation.spec.js, page.spec.js (actual setContent tests) and one in requestinterception.spec.js. These failures show what other behaviour is expected from setContent -- we still have to support that.
Assignee | ||
Comment 4•5 years ago
|
||
~60 more tests would pass: see https://bugzilla.mozilla.org/show_bug.cgi?id=1616452#c1
Reporter | ||
Comment 5•5 years ago
|
||
Given that more than 60 additional Puppeteer unit tests are passing this with, and that this will be pretty helpful for us to verify that our implementations work, I will try to get this fixed.
Reporter | ||
Comment 6•5 years ago
|
||
Actually we do not really hang in setContent given that the lifecycle watcher is fine with the DOMContentLoaded
lifecycle event. So emitting init
might be optional.
Assignee | ||
Comment 7•5 years ago
|
||
There's a race. I'm investigating. Here are two log excerpts, with newlines added for highlight. Using Puppeteer at 0b1a9ce (latest master).
PASS
puppeteer:protocol SEND ► {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"() => {\n let retVal = '';\n if (document.doctype)\n retVal = new XMLSerializer().serializeToString(document.doctype);\n if (document.documentElement)\n retVal += document.documentElement.outerHTML;\n return retVal;\n }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":18} +2ms
1583179174320 RemoteAgent TRACE <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179174291}}
puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.domContentEventFired","params":{"timestamp":1583179174291}} +19ms
puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179174291}} +2ms
1583179174333 RemoteAgent TRACE (connection {5d7fc80e-5369-fb43-87b8-460b4e186241})-> {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"() => {\n let retVal = '';\n if (document.doctype)\n retVal = new XMLSerializer().serializeToString(document.doctype);\n if (document.documentElement)\n retVal += document.documentElement.outerHTML;\n return retVal;\n }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":18}
1583179174340 RemoteAgent TRACE <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"sessionId":1,"id":18,"result":{"result":{"type":"string","value":"<html><head></head><body><div>hello</div></body></html>","description":"<html><head></head><body><div>hello</div></body></html>"}}}
puppeteer:protocol ◀ RECV {"sessionId":1,"id":18,"result":{"result":{"type":"string","value":"<html><head></head><body><div>hello</div></body></html>","description":"<html><head></head><body><div>hello</div></body></html>"}}} +20ms
puppeteer:protocol SEND ► {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":19} +1ms
1583179174353 RemoteAgent TRACE (connection {5d7fc80e-5369-fb43-87b8-460b4e186241})-> {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":19}
1583179174411 RemoteAgent TRACE <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"method":"Target.targetDestroyed","params":{"targetId":11}}
puppeteer:protocol ◀ RECV {"method":"Target.targetDestroyed","params":{"targetId":11}} +70ms
1583179174427 RemoteAgent TRACE <-(connection {5d7fc80e-5369-fb43-87b8-460b4e186241}) {"id":19}
puppeteer:protocol ◀ RECV {"id":19} +17ms
1) [ OK ] Firefox Browser Page Page.setContent should work (page.spec.js:715:5)
FAIL
puppeteer:protocol SEND ► {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"html => {\n document.open();\n document.write(html);\n document.close();\n }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[{"value":"<div>hello</div>"}],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":17} +2ms
1583179615485 RemoteAgent TRACE (connection {2a905a15-88d7-934b-8b34-3087fb72c635})-> {"sessionId":1,"method":"Runtime.callFunctionOn","params":{"functionDeclaration":"html => {\n document.open();\n document.write(html);\n document.close();\n }\n//# sourceURL=__puppeteer_evaluation_script__\n","executionContextId":2,"arguments":[{"value":"<div>hello</div>"}],"returnByValue":true,"awaitPromise":true,"userGesture":true},"id":17}
1583179615496 RemoteAgent TRACE <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"sessionId":1,"id":17,"result":{"result":{"type":"undefined"}}}
1583179615496 RemoteAgent TRACE <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"sessionId":1,"method":"Page.domContentEventFired","params":{"timestamp":1583179615490}}
puppeteer:protocol ◀ RECV {"sessionId":1,"id":17,"result":{"result":{"type":"undefined"}}} +28ms
puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.domContentEventFired","params":{"timestamp":1583179615490}} +11ms
1583179615518 RemoteAgent TRACE <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179615490}}
puppeteer:protocol ◀ RECV {"sessionId":1,"method":"Page.lifecycleEvent","params":{"frameId":"11","loaderId":null,"name":"DOMContentLoaded","timestamp":1583179615490}} +13ms
puppeteer:protocol SEND ► {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":18} +15s
1583179630477 RemoteAgent TRACE (connection {2a905a15-88d7-934b-8b34-3087fb72c635})-> {"method":"Target.disposeBrowserContext","params":{"browserContextId":6},"id":18}
1583179630504 RemoteAgent TRACE <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"method":"Target.targetDestroyed","params":{"targetId":11}}
puppeteer:protocol ◀ RECV {"method":"Target.targetDestroyed","params":{"targetId":11}} +32ms
1583179630528 RemoteAgent TRACE <-(connection {2a905a15-88d7-934b-8b34-3087fb72c635}) {"id":18}
puppeteer:protocol ◀ RECV {"id":18} +26ms
1) [TIME] Firefox Browser Page Page.setContent should work (page.spec.js:715:5)
Message:
Timeout Exceeded 15000ms
Assignee | ||
Comment 8•5 years ago
|
||
Failures increase when SLOW_MO=80.
As Henrik says, this bug might be a symptom of Bug 1599257
Assignee | ||
Comment 9•5 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #6)
Actually we do not really hang in setContent given that the lifecycle watcher is fine with the
DOMContentLoaded
lifecycle event. So emittinginit
might be optional.
We do need init, actually.
Puppeteer tracks life cycle events internally in a way that doesn't correspond exactly to the events received from the browser. For example, it records "load" and "DOMContentLoaded" when it sees "Page.frameStoppedLoading". It's that internal state that is checked by the life cycle watcher for setContent, and it's looking for "load".
Due to the race mentioned in Comment 7, and due to the missing init event which is needed to clear the internal life cycle state, the life cycle watcher finds a 'load' from the initial navigation to about:blank and that's what resolves the Promise for setContent.
So we need to emit an init Page.lifeCycleEvent (based on readystate as Olli suggested) as well as a load Page.lifeCycleEvent which we should get from the document.write/close in the setContent script.
However, I can't capture either of those events from the Page domain. The only events we get via the domain's chromeEventHandler are DOMContentLoaded, pagehide, pageshow. I've tried listening for other events but I get nothing. I also tried listening on this.content, this.content.document, etc.
Henrik, who would be a good person to ask about this? Could you forward the need-info while I'm away the next days?
Reporter | ||
Comment 10•5 years ago
|
||
Olli, do you have any idea why the chromeEventHandler
for readystatechange
doesn't fire when it is registered? Maybe we use the wrong node or even a wrong method? See Maja's comment from above. Thanks!
Comment 11•5 years ago
|
||
Are you perhaps using a listener in bubble phase? readystatechange doesn't bubble, so need to use capturing listener.
Assignee | ||
Comment 12•5 years ago
|
||
Many thanks!
So with capture: true
we also see load
and unload
in addition to readystatechange
on chromeEventHandler
. We were really missing these on the Page domain, so expect the way we emit life cycle events can improve a lot now.
As an aside, I also looked at what we get from the webProgress onStateChange
handler when a document.open/write/close occurs: an about:document-onload-blocker
request starts at that point, so that might be an extra source of info in case we don't get what we need from readystatechange.
Assignee | ||
Comment 13•5 years ago
|
||
Evaluating document.open
+ write
+ close
triggers the readyState
to change to "loading". Puppeteer, for example, relies on a seeing an
"init" life cycle event in that case for their Page.setContent
method.
Also move "Page.loadEventFired" to be triggered by an actual "load"
instead of "pagehide", which we don't see after a document.write
.
Updated•5 years ago
|
Assignee | ||
Comment 14•5 years ago
|
||
Depends on D67702
Reporter | ||
Comment 15•5 years ago
|
||
Note that this patch reduces the test job duration from about 30 minutes to only ~14 minutes! This contributes well to ci costs. CC'ing Joel for info. It also means soon we can run this test job more often again.
Comment 16•5 years ago
|
||
Comment 17•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6a8ca36f31e8
https://hg.mozilla.org/mozilla-central/rev/06b9dd699708
Updated•4 years ago
|
Description
•