863440 - Proposal for FHR document payload format v.Next

Reporter

Description


12 years ago

Attached file Example JSON document following proposed format (deleted) — Details

This is a proposal for a substantial change to the FHR schema to support better tracking of data when the payload version changes or when some factor of the application info, such as build number or memory, changes.

Currently, a change in the schema raises the question of how to report the mixture of old and new data. The problem is most visible on the day a change occurs, because some sessions for that day were recorded with the old schema and some with the new one. If we throw out the old data, we lose important information; if we attempt to merge the two, we risk corrupting the new data with the old, potentially leading to incorrect analysis.

A similar problem exists when the user upgrades the application or makes other changes to the environment, such as installing new memory or upgrading or disabling an add-on. Without longitudinal tracking of the app info data, any analysis that focuses on a factor such as build number becomes unreliable for data recorded before the session that submitted the payload.

A proposal that addresses both of these issues and yields a future-proof schema is laid out below.

Synopsis:

Whenever a session starts, the data for the session is keyed with a hash of all the app info variables. Whenever an app info variable changes, the new hash and the variables that changed (i.e. the diff) are stored to be part of the submission payload. Each date in the dates section of the payload contains one or more objects whose key is the hash value and whose value is the collection of session information pertaining to that set of app info values.

Workflow:

* On app / session startup:
** Retrieve all current app info values.
** Compute a hash for the set.
** Key the new session data collection with this hash.
** Check whether this hash and value set has been previously persisted. If not, persist it.
* On app / session start after an abnormal termination:
** Persist and finalize the previous session data into storage under the given hash key with the session end flag of "abnormal termination".
* On app / session close:
** Persist and finalize session data into storage under the given hash key with the session end flag of "normal termination".
* On dynamic app info change (e.g. change of a restartless add-on or enable/disable of a plug-in):
** Finalize the current session with the time of the change and record the session end flag of "app info change".
** Start a new session with a startup time of the change and a session start flag of "app info change".
* On document construction for submission:
** Construct the dates object of the payload.
*** Add a day object with the date as the property name for every day we have session info to record.
**** Remember the app info hash keys that were used for each date in a set of all observed hash keys.
**** Write each hash key as a property name under the date object with all of the session data for that app info hash as the value.
** Construct the appInfo object of the payload.
*** Add an object to the appInfo object with a property name of "current" and a value of all app info values for the current session.
*** Add each observed hash key as an additional property to the appInfo object, with the hash key as the property name and the value being only the app info variables that have different values from the current app info. Note that this means we will have a property whose name is the current hash key and whose value is an empty object.
* On removal of out of date entries:
** Remove out of date entries as normal.
** Scan the remaining dates and remember the observed hash keys used in those dates.
** Remove any app info records that do not exist in the observed set of hash keys.
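The document construction and pruning steps of the workflow could be sketched roughly as follows. This is only an illustration in Python, not FHR code: the function names (`diff_from_current`, `build_payload`, `prune`), the in-memory dictionaries standing in for storage, the example hash keys, and the use of ISO date strings are all assumptions made for the sketch.

```python
import json
from typing import Any

Env = dict[str, Any]
# date string -> app info hash key -> list of session records (assumed layout)
Days = dict[str, dict[str, list[dict[str, Any]]]]


def diff_from_current(env: Env, current: Env) -> Env:
    """Return only the variables in env whose values differ from current.

    Variables that exist only in current are not represented; the proposal
    does not say how those should be encoded.
    """
    missing = object()
    return {k: v for k, v in env.items() if current.get(k, missing) != v}


def build_payload(current_key: str, env_records: dict[str, Env], days: Days) -> dict[str, Any]:
    """Build the dates and appInfo objects as described in the workflow."""
    dates: dict[str, Any] = {}
    observed: set[str] = set()
    for day, by_key in days.items():
        # Each day holds one property per hash key observed on that day.
        dates[day] = {key: list(sessions) for key, sessions in by_key.items()}
        observed.update(by_key)

    current = env_records[current_key]
    app_info: dict[str, Any] = {"current": dict(current)}
    for key in sorted(observed):
        # Older environments are stored only as a diff against "current".
        app_info[key] = diff_from_current(env_records[key], current)
    return {"dates": dates, "appInfo": app_info}


def prune(env_records: dict[str, Env], days: Days, oldest_day: str):
    """Drop days before oldest_day, then drop app info records no day uses."""
    kept_days = {d: v for d, v in days.items() if d >= oldest_day}
    observed = {key for by_key in kept_days.values() for key in by_key}
    kept_env = {k: e for k, e in env_records.items() if k in observed}
    return kept_env, kept_days


if __name__ == "__main__":
    # Hypothetical data: a build upgrade happened during 2013-04-15.
    records = {
        "a1": {"buildID": "20130401", "memoryMB": 4096},
        "b2": {"buildID": "20130415", "memoryMB": 4096},
    }
    history = {
        "2013-04-14": {"a1": [{"end": "normal termination"}]},
        "2013-04-15": {
            "a1": [{"end": "normal termination"}],
            "b2": [{"end": "abnormal termination"}],
        },
    }
    print(json.dumps(build_payload("b2", records, history), indent=2))
```

In this sketch the day of the upgrade carries two hash keys side by side, so sessions recorded under the old and new environments are never merged.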
For the purposes of this document, we should consider "appInfo" to include all info about the environment (appInfo + sysInfo) to cover things like memory size and operating system version.
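As a rough illustration of hashing the combined environment, the sketch below serializes appInfo and sysInfo together in a canonical form and hashes the result. The choice of SHA-256, the canonical JSON serialization, the function name `environment_hash`, and the example field names are assumptions for the sketch; the proposal does not specify a hash function or a serialization.

```python
import hashlib
import json
from typing import Any


def environment_hash(app_info: dict[str, Any], sys_info: dict[str, Any]) -> str:
    """Hash the combined environment (appInfo + sysInfo) into one key."""
    combined = {"appInfo": app_info, "sysInfo": sys_info}
    # Sorting keys and fixing separators makes the serialization independent
    # of the order in which the values were collected.
    canonical = json.dumps(combined, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical values: installing more memory yields a different key.
    before = environment_hash({"buildID": "20130415"}, {"memoryMB": 4096, "os": "WINNT"})
    after = environment_hash({"buildID": "20130415"}, {"memoryMB": 8192, "os": "WINNT"})
    print(before != after)
```

Because the serialization is canonical, two sessions with the same environment values map to the same key regardless of collection order, while any change to a value such as memory size produces a new key.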

Attachment #739263 - Flags: feedback?(mconnor)

Attachment #739263 - Flags: feedback?(gps)

Gregory Szorc [:gps]

Comment 1


12 years ago

Comment on attachment 739263 [details]
Example JSON document following proposed format

Initial reaction is good, and I think most of what you propose is doable. I think we should get all the engineers on a call together to flesh out some things, as I expect it will take a lot of back and forth. Perhaps dumping this on an etherpad (and allowing inline comments) is a good first step?

Attachment #739263 - Flags: feedback?(gps) → feedback+

Daniel Einspanjer [:dre] [:deinspanjer]

Reporter

Comment 2


12 years ago

Attached file A WIP translator from v1 to vNext (deleted) — Details

I think a hack / discussion session would be great. I'm also attaching a WIP version of a translator that Mark did. It doesn't translate to precisely the format described in the proposal yet (it leaves a lot of data still in the days segments), but it would be a good starting point for further experimentation.

Gregory Szorc [:gps]

Comment 3


12 years ago

I was going to suggest that the 'hash' doesn't really need to be a hash and could instead be any consistent identifier (say, an integer). However, I'm guessing you want the identifier to be globally consistent to make server-side aggregation easier? If that's an objective, perhaps we should limit the hash to more common properties (build ID, platform, etc.) and not include add-ons?
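The restricted-hash idea could look something like the sketch below, where only a fixed set of common properties feeds the identifier. The property list in `COMMON_KEYS`, the function name `common_key`, the use of SHA-256, and the example values are hypothetical and only illustrate the trade-off.

```python
import hashlib
import json
from typing import Any

# Hypothetical choice of "common" properties; add-ons are deliberately excluded.
COMMON_KEYS = ("buildID", "platform")


def common_key(env: dict[str, Any]) -> str:
    """Hash only the common properties so the key is the same across clients."""
    subset = {k: env.get(k) for k in COMMON_KEYS}
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    client_a = {"buildID": "20130415", "platform": "WINNT", "addons": ["foo@example.com"]}
    client_b = {"buildID": "20130415", "platform": "WINNT", "addons": []}
    # Both clients share a key despite having different add-ons installed.
    print(common_key(client_a) == common_key(client_b))
```

The cost of this variant is that an add-on change no longer produces a new key, so add-on history would have to be tracked some other way.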

Example JSON document following proposed format 12 years ago Daniel Einspanjer [:dre] [:deinspanjer] (deleted), application/json	gps : feedback+	Details
A WIP translator from v1 to vNext 12 years ago Daniel Einspanjer [:dre] [:deinspanjer] (deleted), text/x-wiki		Details