Closed Bug 1770496 Opened 2 years ago Closed 2 years ago

Prototype for HAR generator support

Categories

(Remote Protocol :: WebDriver BiDi, task, P1)

task
Points:
13

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: whimboo, Assigned: jdescottes)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [webdriver:m4])

Attachments

(4 files)

In milestone 5 we want to work towards supporting HAR generation. As such we need to investigate requirements and possible platform blockers.

The wanted prototype should cover the following topics:

  • Can we retrieve all the network event related information that we need to generate HAR files?
  • Check proper support for all necessary Platform APIs
  • Quick and dirty addition of any further WebDriver BiDi features that are required (like navigation events for DOMContentLoaded and load)

Finally we should make sure that there is no significant impact on performance with this feature enabled.

Julian will have a look at this bug around mid July.

Assignee: nobody → jdescottes
Status: NEW → ASSIGNED

Draft spec PR for Network module is at https://github.com/w3c/webdriver-bidi/pull/204

After spending some time porting the DevTools network observer codebase to BiDi, I wanted to summarize my understanding of the architecture below.

Overall architecture

DevTools network event handling is using several observers and helpers in order to collect all the necessary information.

In the parent process, the overall activity is monitored by a NetworkObserver class. The role of this class will be to observe activities and notifications to detect requests which match a set of filters (for instance, capturing only requests related to a given tab). To be clear, even if the intent is to only monitor requests for a specific tab, by design the observers will be triggered for all requests, and the NetworkObserver is manually filtering out unwanted ones.

The activity part is handled via gActivityDistributor.addObserver. This mainly allows us to detect new requests and collect timings. The activity observer is triggered at various steps of a request's lifecycle. Some steps are only used to collect a timing, others are used for collecting more information.

On top of this, several observer notifications are used: http-on-examine-response, http-on-examine-cached-response, http-on-modify-request and http-on-stop-request. The http-on-examine-(cached-)response ones are used to collect response headers & cookies as well as to handle cached responses. http-on-modify-request is only used in throttling scenarios in DevTools, whereas in our CDP implementation, it is where we detect new requests. http-on-stop-request is used for some edge cases (e.g. CORS blocked requests).

There are two other observer notifications handled by the NetworkObserver: http-on-failed-opening-request and service-worker-synthesized-response, but they are intended to be used from the content process or for service worker requests so we will not focus on them for this description.

When a request is detected, a NetworkResponseListener will be instantiated. Unlike the NetworkObserver, this listener is dedicated to the channel of a single request. It will be used to collect the response content and to retrieve security information.

A very basic description of the architecture for monitoring events is that the NetworkObserver will attempt to detect requests matching its filters. Then it will create a NetworkResponseListener and after that both will work together to collect all the data relevant to the request's channel.
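To make the split of responsibilities concrete, here is a minimal sketch of the pattern described above: one observer that sees every request and filters manually, plus one listener per matching channel. The class names, the filter shape and the mock channel objects are all hypothetical stand-ins, not the actual DevTools classes.

```javascript
// Hypothetical, simplified stand-ins for NetworkResponseListener / NetworkObserver.
class ResponseListenerSketch {
  constructor(channel) {
    this.channel = channel; // dedicated to exactly one channel
    this.bodyChunks = [];
  }

  onData(chunk) {
    this.bodyChunks.push(chunk);
  }
}

class NetworkObserverSketch {
  constructor(filters) {
    this.filters = filters; // assumed shape, e.g. { browsingContextId: 12 }
    this.listeners = new Map(); // channelId -> ResponseListenerSketch
  }

  // Called for every request in the browser, not only the ones we care about.
  onRequestDetected(channel) {
    if (!this.matchesFilters(channel)) {
      return null; // unwanted request, filtered out manually
    }
    const listener = new ResponseListenerSketch(channel);
    this.listeners.set(channel.channelId, listener);
    return listener;
  }

  matchesFilters(channel) {
    const { browsingContextId } = this.filters;
    return (
      browsingContextId === undefined ||
      channel.browsingContextId === browsingContextId
    );
  }
}

// Mock channels: plain objects, not real nsIChannel instances.
const sketchObserver = new NetworkObserverSketch({ browsingContextId: 12 });
const keptListener = sketchObserver.onRequestDetected({
  channelId: 1,
  browsingContextId: 12,
  url: "https://example.com/",
});
const droppedListener = sketchObserver.onRequestDetected({
  channelId: 2,
  browsingContextId: 99,
  url: "https://example.org/",
});
```

In this sketch only the first channel gets a listener; the second one is still seen by the observer but is dropped by the filter.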

Several objects hold the information about a request, from a base platform object (nsIChannel) to a DevTools actor (or resource), with intermediary wrappers in the mix. Since the NetworkObserver is not tied to a channel, these objects are what currently holds the channel-specific data. Basically, for a single request/channel, the following instances/objects are created:

  • NetworkResponseListener instance
  • httpActivity object
  • response object
  • owner (either a NetworkEventActor or NETWORK_EVENT resource)

There is no clear ownership between the various objects, so it can be hard to track the data flow. For instance, the NetworkObserver will sometimes write information in the response object, which will later on be accessed by the NetworkResponseListener to be passed to the owner.

Monitoring network events amounts to a complicated state machine, with many edge cases & potential race conditions. So in order to hide the complexity from the main DevTools code, all the data for a given network request ends up on the owner object, which defines helpful setters for all the important information (addRequestHeaders, addResponseStart, etc.). This can help when navigating the codebase: when you want to understand how a given piece of data is collected, start from the end, find the corresponding add* method and then try to understand when and how it is called by the NetworkObserver machinery.
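As an illustration of the "start from the end" tip, here is a rough sketch of an owner object that only exposes add* setters. Only the two method names addRequestHeaders and addResponseStart come from the description above; the class name, the stored field names and the call log are assumptions made for the example.

```javascript
// Hypothetical owner: every piece of request data arrives through an add* setter.
class NetworkEventOwnerSketch {
  constructor() {
    this.data = {};
    this.calls = []; // order in which the observer machinery reported data
  }

  addRequestHeaders(headers) {
    this._record("addRequestHeaders", { requestHeaders: headers });
  }

  addResponseStart(info) {
    this._record("addResponseStart", { responseStart: info });
  }

  _record(setterName, fields) {
    this.calls.push(setterName);
    Object.assign(this.data, fields);
  }
}

const sketchOwner = new NetworkEventOwnerSketch();
sketchOwner.addRequestHeaders([{ name: "Host", value: "example.com" }]);
sketchOwner.addResponseStart({ status: 200 });
```

Logging or putting a breakpoint in one setter is then enough to find out which part of the observer machinery provides that data, and when.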

Collecting request-specific data

A request is usually detected via the ActivityObserver, when receiving the activity with subtype ACTIVITY_SUBTYPE_REQUEST_HEADER. At that point, the request headers will be extracted from the channel. Since cookies are part of the headers as well, they are also parsed here. The entry point to extract cookies is nsIChannel:visitRequestHeaders, plus some helpers to parse and encode the content for safe viewing.
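The visitor-based extraction can be sketched as follows. This is not the DevTools code: the channel is a mock that only mimics the visitRequestHeaders(visitor) / visitHeader(name, value) shape, and parseCookieHeader is a hypothetical helper that does no encoding for safe viewing.

```javascript
// Mock channel exposing a visitor-style method with the same name as the platform API.
function createMockChannel(headers) {
  return {
    visitRequestHeaders(visitor) {
      for (const [name, value] of headers) {
        visitor.visitHeader(name, value);
      }
    },
  };
}

// Hypothetical helper: split a Cookie request header into name/value pairs.
function parseCookieHeader(headerValue) {
  return headerValue
    .split(";")
    .map(pair => pair.trim())
    .filter(pair => pair.length > 0)
    .map(pair => {
      const index = pair.indexOf("=");
      return index === -1
        ? { name: pair, value: "" }
        : { name: pair.slice(0, index), value: pair.slice(index + 1) };
    });
}

// Collect headers, and parse cookies out of the Cookie header while visiting.
function collectRequestHeadersAndCookies(channel) {
  const headers = [];
  let cookies = [];
  channel.visitRequestHeaders({
    visitHeader(name, value) {
      headers.push({ name, value });
      if (name.toLowerCase() === "cookie") {
        cookies = cookies.concat(parseCookieHeader(value));
      }
    },
  });
  return { headers, cookies };
}

const collectedRequestData = collectRequestHeadersAndCookies(
  createMockChannel([
    ["Host", "example.com"],
    ["Cookie", "sid=abc123; theme=dark"],
  ])
);
```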

A lot of basic information can already be extracted from the channel, such as the URL and the method. See for instance DevTools' createNetworkEvent.

There are also edge cases where the request is detected at a different point in time. If the observer started after request headers were sent, the request might be detected only via the http-on-stop-request observer notification. Same thing if the request was blocked.

If there is any request post data to collect, it is captured via the activity observer, when receiving ACTIVITY_SUBTYPE_REQUEST_BODY_SENT. There is no direct API to access it; it is retrieved by reading the appropriate input stream. See DevTools' readPostTextFromRequest.

Collecting response-specific data

When it comes to response cookies and headers, they are caught by the observer notifications http-on-examine-response and http-on-examine-cached-response. Similarly to request headers and cookies, the entry point is also on nsIChannel; this time the API is visitOriginalResponseHeaders.

Otherwise, as explained earlier, a dedicated NetworkResponseListener is created, which will listen to request events as an nsIRequestObserver (onStartRequest, onStopRequest). When onStartRequest is called, it will start monitoring the input stream to collect the response body as it comes through, as an nsIStreamListener. The way DevTools listens to the stream is noticeably more complex than what is done in remote/cdp. It involves an nsIStreamListenerTee, apparently to avoid hiding network errors (Bug 515051). We should review if this is still relevant and which solution should be used moving forward.
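The general idea of a "tee" listener can be sketched like this: the monitor keeps a copy of the data while the original consumer still receives every callback, including the final status. This is a simplified illustration only. The real nsIStreamListener callbacks work with input streams, offsets and counts, whereas this sketch passes string chunks, and the class name is hypothetical.

```javascript
// Simplified tee: copy data for the monitor, forward everything to the original listener.
class TeeListenerSketch {
  constructor(originalListener) {
    this.originalListener = originalListener;
    this.chunks = [];
    this.status = null;
  }

  onStartRequest(request) {
    this.originalListener.onStartRequest(request);
  }

  onDataAvailable(request, chunk) {
    this.chunks.push(chunk); // copy kept for the network monitor
    this.originalListener.onDataAvailable(request, chunk); // consumer sees the same data
  }

  onStopRequest(request, status) {
    this.status = status; // errors stay visible to both sides
    this.originalListener.onStopRequest(request, status);
  }

  get body() {
    return this.chunks.join("");
  }
}

// Mock consumer that records what it receives.
const consumerLog = [];
const teeListener = new TeeListenerSketch({
  onStartRequest: () => consumerLog.push("start"),
  onDataAvailable: (request, chunk) => consumerLog.push(`data:${chunk}`),
  onStopRequest: (request, status) => consumerLog.push(`stop:${status}`),
});

teeListener.onStartRequest({});
teeListener.onDataAvailable({}, "<html>");
teeListener.onDataAvailable({}, "</html>");
teeListener.onStopRequest({}, 0);
```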

During onStartRequest DevTools will also start collecting security information about the request. This is at least helpful to extract the protocol information. remote/cdp is also extracting security information, although it does that on the http-on-examine-response observer notification.

Collecting timings

DevToolsNetworkObserver monitors the activity via gActivityDistributor.addObserver.
The translation table from activity subtype to timing data used by DevTools is:

  httpTransactionCodes: {
    0x5001: "REQUEST_HEADER",
    0x5002: "REQUEST_BODY_SENT",
    0x5003: "RESPONSE_START",
    0x5004: "RESPONSE_HEADER",
    0x5005: "RESPONSE_COMPLETE",
    0x5006: "TRANSACTION_CLOSE",

    0x804b0003: "STATUS_RESOLVING",
    0x804b000b: "STATUS_RESOLVED",
    0x804b0007: "STATUS_CONNECTING_TO",
    0x804b0004: "STATUS_CONNECTED_TO",
    0x804b0005: "STATUS_SENDING_TO",
    0x804b000a: "STATUS_WAITING_FOR",
    0x804b0006: "STATUS_RECEIVING_FROM",
    0x804b000c: "STATUS_TLS_STARTING",
    0x804b000d: "STATUS_TLS_ENDING",
  },

The corresponding platform activity subtypes and transport status can be found in 2 spots:

Each activity subtype might be received several times, in which case we record both the first and the last timestamp. If it was only received once, last has the same value as first.
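The first/last bookkeeping can be sketched as below. The three codes are taken from the table above; the function name, the shape of the timings object and the timestamps are assumptions for illustration.

```javascript
// Subset of the translation table above, enough for the example.
const SUBTYPE_NAMES = {
  0x5001: "REQUEST_HEADER",
  0x5003: "RESPONSE_START",
  0x5005: "RESPONSE_COMPLETE",
};

// Record a timestamp for a stage, keeping both the first and the last occurrence.
function recordActivity(timings, subtype, timestamp) {
  const stage = SUBTYPE_NAMES[subtype];
  if (!stage) {
    return; // unknown subtype, ignored in this sketch
  }
  if (!Object.prototype.hasOwnProperty.call(timings, stage)) {
    timings[stage] = { first: timestamp, last: timestamp };
  } else {
    timings[stage].last = timestamp;
  }
}

const recordedTimings = {};
recordActivity(recordedTimings, 0x5001, 90); // received once
recordActivity(recordedTimings, 0x5003, 100); // received twice
recordActivity(recordedTimings, 0x5003, 130);
recordActivity(recordedTimings, 0x9999, 140); // not in the table
```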

Those times are then translated into meaningful timings via the logic at https://searchfox.org/mozilla-central/rev/f3616b887b8627d8ad841bb1a11138ed658206c5/devtools/server/actors/network-monitor/network-observer.js#1067-1360.
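To give an idea of what such a translation looks like, here is a deliberately simplified sketch. It is not the DevTools logic linked above, which handles many more cases. It assumes timestamps already expressed in milliseconds, stage records of the form { first, last }, and uses -1 for timings that cannot be computed, as HAR does for timings that do not apply.

```javascript
// Duration between the first occurrence of one stage and the last of another, or -1.
function stageSpan(stages, startName, endName) {
  const start = stages[startName];
  const end = stages[endName];
  return start && end ? end.last - start.first : -1;
}

// Simplified mapping from recorded stages to a few HAR-like timings.
function computeSimplifiedTimings(stages) {
  // The request is considered sent after the body if there is one, else after the headers.
  const requestEnd = stages.REQUEST_BODY_SENT || stages.REQUEST_HEADER;
  const wait =
    requestEnd && stages.RESPONSE_START
      ? stages.RESPONSE_START.first - requestEnd.last
      : -1;

  return {
    dns: stageSpan(stages, "STATUS_RESOLVING", "STATUS_RESOLVED"),
    connect: stageSpan(stages, "STATUS_CONNECTING_TO", "STATUS_CONNECTED_TO"),
    wait,
    receive: stageSpan(stages, "RESPONSE_START", "RESPONSE_COMPLETE"),
  };
}

const simplifiedTimings = computeSimplifiedTimings({
  STATUS_RESOLVING: { first: 0, last: 0 },
  STATUS_RESOLVED: { first: 4, last: 4 },
  STATUS_CONNECTING_TO: { first: 4, last: 4 },
  STATUS_CONNECTED_TO: { first: 9, last: 9 },
  REQUEST_HEADER: { first: 10, last: 10 },
  RESPONSE_START: { first: 60, last: 60 },
  RESPONSE_COMPLETE: { first: 85, last: 85 },
});
```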

Closing comments

There are several differences between the DevTools and remote/cdp implementations. For data collected by both implementations, they don't necessarily rely on the same events/notifications to decide when to collect it. Some of that might be because remote/cdp is not extracting timings and therefore does not need to monitor the activity, whereas DevTools needs the activity observer anyway and therefore reuses it to also detect requests, etc. It is hard to say if we could make the activity observer solely about timings. Maybe the DevTools approach is also covering edge cases?

Given the complexity, my preferred option to move forward here would be to clean up the DevTools implementation and use that in both CDP and BiDi. We should remove intermediary objects, make the API easier to understand, and make it more configurable so that BiDi / CDP can opt out of capturing response bodies, for instance.

But reusing the same layer means we can rely on all the existing coverage for DevTools.

I will now upload my stack for network + har modules in BiDi. We can then decide what is left to do in this bug.

Depends on D154868

Depends on D154869

Attachment #9290255 - Attachment description: Bug 1770496 - [bidi] proto: Add support for responseHeadersSize → Bug 1770496 - [bidi] proto: Add support for HAR information not in BiDi spec

Here is an example of a HAR file generated with the patches attached here. It was recorded by navigating from about:newtab to wikipedia.org.

In those patches, HAR generation is implemented as an additional module, with two commands:

  • startRecording
  • stopRecording

The module is mostly listening to events from the network module, but also using a few platform APIs to create the necessary metadata for the HAR file, basically the creator and browser properties:

      "creator": {
        "name": "Firefox",
        "version": "105.0a1"
      },
      "browser": {
        "name": "Firefox",
        "version": "105.0a1"
      },

This is not AFAIK accessible from BiDi but would be trivial to expose.
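For readers who have not looked at the patches, the overall shape of such a module could look like the sketch below. It is not the code from the attached patches: the class name, the onResponseCompleted hook and the event payload shape are assumptions, and the entries are far from complete HAR entries.

```javascript
// Hypothetical recorder: collects network events between startRecording and stopRecording.
class HarRecorderSketch {
  constructor(browserInfo) {
    this.browserInfo = browserInfo; // e.g. { name: "Firefox", version: "105.0a1" }
    this.entries = null; // null while not recording
  }

  startRecording() {
    this.entries = [];
  }

  // Would be called for each completed response reported by the network module.
  onResponseCompleted(event) {
    if (this.entries === null) {
      return; // not recording, drop the event
    }
    this.entries.push({
      startedDateTime: new Date(event.timestamp).toISOString(),
      request: { method: event.request.method, url: event.request.url },
      response: { status: event.response.status },
    });
  }

  stopRecording() {
    const har = {
      log: {
        version: "1.2",
        creator: { ...this.browserInfo },
        browser: { ...this.browserInfo },
        pages: [],
        entries: this.entries || [],
      },
    };
    this.entries = null;
    return har;
  }
}

const sampleEvent = {
  timestamp: Date.UTC(2022, 7, 23, 7, 4, 40),
  request: { method: "GET", url: "https://wikipedia.org/" },
  response: { status: 200 },
};

const harRecorder = new HarRecorderSketch({ name: "Firefox", version: "105.0a1" });
harRecorder.onResponseCompleted(sampleEvent); // ignored, recording not started
harRecorder.startRecording();
harRecorder.onResponseCompleted(sampleEvent);
const recordedHar = harRecorder.stopRecording();
```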

Another piece of information currently missing from this HAR file is the pages array, as well as the pageref for each entry. All network events currently have a context information, however I am wondering if this will be specific enough. For a top-level browsing context, the context id will be the same across navigations. If a tab navigates from pageA to pageB, I am wondering if a request initiated from pageA might end up classified under pageB depending on when we actually process the event. This issue has been reported for the current HAR support in Firefox. I feel like the only way to address this would be to really have an id unique to the window global exposed to the caller.
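A small sketch of why a per-window-global id would help: both pages below share the same context id, so only the window id can tell them apart, no matter when the event is processed. The field names contextId and innerWindowId on pages and entries are hypothetical; nothing like this is exposed by the current events.

```javascript
// Assign a pageref to each entry based on a (hypothetical) unique window global id.
function assignPageRefs(pages, entries) {
  const pageIdByWindow = new Map(
    pages.map(page => [page.innerWindowId, page.id])
  );
  return entries.map(entry => ({
    ...entry,
    pageref: pageIdByWindow.has(entry.innerWindowId)
      ? pageIdByWindow.get(entry.innerWindowId)
      : null,
  }));
}

// Same tab (same contextId), two successive documents.
const sketchPages = [
  { id: "page_1", contextId: "ctx-1", innerWindowId: 101 },
  { id: "page_2", contextId: "ctx-1", innerWindowId: 102 },
];

// The first request was initiated by pageA but is processed after the navigation to pageB.
const sketchEntries = assignPageRefs(sketchPages, [
  { url: "https://a.example/late.js", contextId: "ctx-1", innerWindowId: 101 },
  { url: "https://b.example/", contextId: "ctx-1", innerWindowId: 102 },
]);
```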

Another note about the pages entry: each page normally has a title property, which in Firefox gives the actual title of the page (as the spec says). But Chromium-based browsers set the URL of the page instead, e.g.:

    "pages": [
      {
        "startedDateTime": "2022-08-23T07:04:40.642Z",
        "id": "page_1",
        "title": "https://wikipedia.org/",
        "pageTimings": {
          "onContentLoad": 322.7859999751672,
          "onLoad": 322.60700000915676
        }
      }
    ],

I found it odd that the URL was not included in the page entry by definition, so it's interesting to see that in practice implementations differ on that front. I'm not sure how to move forward there, but adding a url to the page entry on top of title would probably be good?

(In reply to Julian Descottes [:jdescottes] from comment #8)

> I found it odd that the URL was not included in the page entry by definition, so it's interesting to see that in practice implementations differ on that front. I'm not sure how to move forward there, but adding a url to the page entry on top of title would probably be good?

Good point, the url could be included in the page entry.
For now we can do the same as Chromium-based browsers.

Blocks: 1787436
Blocks: 1787445

Filed all follow up bugs discussed in the meeting, closing.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED