1209390 - Use standard lz4 file format instead of the non-standard jsonlz4/mozlz4

Reporter

Description

9 years ago

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:41.0) Gecko/20100101 Firefox/41.0
Build ID: 20150918100310

Steps to reproduce:

Use Firefox.

Actual results:

Bookmark backup files (in "bookmarkbackups/") and other files (such as things in "crashes/") are lz4-compressed files, but they use a non-standard format.

Result: Users cannot avail themselves of standard, commonly available tools to inspect these files, which contain *their* data. Instead they have to resort to Firefox-specific (or Mozilla-specific, same point) hacks to access their data. [1]

[1] Such as using the Library GUI in Firefox to export bookmarks; or using Mozilla's lz4 interfaces through XPCOM.

Expected results:

Mozilla should use standard file formats. You promised to switch to a standard format once one was defined [2]. One was defined a while ago [3]. Standard tools have been available for some time [4]. Why are you still delaying?

[2] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/workerlz4/lz4.js#49
[3] https://github.com/Cyan4973/lz4/blob/master/lz4_Block_format.md
    https://github.com/Cyan4973/lz4/blob/master/lz4_Frame_format.md
[4] For example: https://packages.debian.org/jessie/liblz4-tool

Boris Zbarsky [:bzbarsky]

Updated

9 years ago

Component: General → Places

Product: Core → Toolkit

vlakoff

Comment 1

9 years ago

The lz4.js file has been moved to /toolkit/components/lz4/lz4.js [1]

[1] http://mxr.mozilla.org/mozilla-central/source/toolkit/components/lz4/lz4.js

Avi Halachmi (:avih)

Updated

9 years ago

Blocks: 818587

Marco Bonardo [:mak]

Comment 2

9 years ago

Places will use a different format when the platform (toolkit) moves to a different format. And I think it will be the same for all the consumers. So this is a more general Toolkit bug.

Component: Places → General

Marco Bonardo [:mak]

Updated

9 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

Marco Bonardo [:mak]

Comment 3

9 years ago

(In reply to Marco Bonardo [:mak] from comment #2)
> Places will use a different format, when the platform (toolkit) will move to
> a different format.

To clarify: if the compressor starts creating files in the more widely supported format, and the decompressor can still support the old format (I don't see why not, since there's a header it can detect), then fixing the lz4 component in toolkit will be enough to automatically move all the consumers to the new format.
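
The header detection mentioned above could look roughly like the following Python sketch. It is an illustration only, not the toolkit implementation: the legacy magic value `mozLz40\0` is an assumption not stated in this bug, the standard magic number 0x184D2204 comes from the LZ4 frame format document linked in the description, and the names `Container` and `sniff_container` are hypothetical.

```python
from enum import Enum

# Assumption: the legacy Firefox files start with this 8-byte magic.
MOZLZ4_MAGIC = b"mozLz40\0"
# Standard LZ4 frame magic number 0x184D2204, stored little-endian.
LZ4_FRAME_MAGIC = b"\x04\x22\x4d\x18"


class Container(Enum):
    MOZLZ4 = "mozlz4"
    LZ4_FRAME = "lz4-frame"
    UNKNOWN = "unknown"


def sniff_container(data: bytes) -> Container:
    """Guess the container format from the leading bytes of a file."""
    if data.startswith(MOZLZ4_MAGIC):
        return Container.MOZLZ4
    if data.startswith(LZ4_FRAME_MAGIC):
        return Container.LZ4_FRAME
    return Container.UNKNOWN
```

A decompressor that dispatches on this result could keep reading old files while the compressor writes only the standard format.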

Avi Halachmi (:avih)

Comment 4

9 years ago

I've created an unofficial stand-alone decompressor for `.jsonlz4` files. The project with source code is hosted here: https://github.com/avih/dejsonlz4 . Initial v0.1 release can be found here https://github.com/avih/dejsonlz4/releases and includes a Windows executable `dejsonlz4.exe`. It should hopefully compile easily elsewhere too. Please take any discussions regarding this project to the project page on github.

Comment hidden (offtopic)

Avi Halachmi (:avih)

Comment 6

9 years ago

(In reply to Anthony Thyssen from comment #5)
> Could you include in your source code "README.md" file ...

(In reply to Avi Halachmi (:avih) from comment #4)
> Please take any discussions regarding this project to the project page on github.

Comment hidden (offtopic)

Georg Fritzsche [:gfritzsche]

Comment 8

8 years ago

(In reply to S from comment #7)
> (In reply to Avi Halachmi (:avih) from comment #6)
> > (In reply to Avi Halachmi (:avih) from comment #4)
> > > Please take any discussions regarding this project to the project page on github.
>
> Thanks for the piece of code to decompress. But do you have any idea to go
> the other way and compress?

Please take this to GitHub, this can't be answered here.

H.-Dirk Schmitt

Comment 9

8 years ago

The current obfuscation by changing the magic field in the lz4 compressed files doesn't provide any security enhancements. For advanced users it is now more complicated to "patch" or "synchronise" configuration files. So +1 for migrating to standard lz4-compressed or uncompressed files.

Marco Bonardo [:mak]

Comment 10

8 years ago

(In reply to H.-Dirk Schmitt from comment #9)
> The current obfuscation by changing the magic field in the lz4 compressed
> files doesn't provide any security enhancements.

There is no intent to obfuscate here, nor any security enhancement. At the time lz4 was added there wasn't a standard format, so a very simple header was created in front of the payload. Now a standard exists, but nobody internally has had the time to convert the encoder/decoder to it, and nobody has volunteered to do that yet.

Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)

Updated

8 years ago

Blocks: 1320539

ZeroUnderscoreOu

Comment 11

8 years ago

I would like to support this change with regard to search engines. Currently, I'm forced to use side tools, and export/import is severely limited. Also, I don't really see how compression/signing prevents search hijacking when (from my experience) most of the time it's done by side software that has all the needed tools for that.

Stefan Endrullis

Comment 12

7 years ago

Is there any news on this? Is Mozilla planning to replace the Firefox-specific lz4 format with a standard format? I definitely want to vote for a standard format, because it's then much easier to work with in other tools.

For instance, I've written a Scala/Java library that reads the state of the Firefox session file. I've chosen the JVM to be platform independent. With a standard lz4 format there would not be any problem, since there are multiple lz4 libraries for Java. However, this Firefox-specific format makes the whole decoding much more complicated. I can either try to reimplement the Firefox-specific implementation in Java, or I have to deploy the library with several platform-dependent tools like dejsonlz4 which do the decompression for the specific platform. Both solutions are cumbersome, and therefore I really hope for a format change.

Are there any workarounds for tool developers like me? For instance, is it possible to decompress the Firefox-specific format with some standard lz4 decompressors by slightly changing the format or something like that?

Stefan Endrullis

Comment 13

7 years ago

OK, forget my last question. Fortunately, the Firefox-specific format is actually the same as lz4 except that there is a 12-byte prefix, which can simply be skipped when decompressing the file. So there is a workaround for now. :)
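
For other tool authors, the workaround can be sketched in pure Python without any third-party lz4 package. This is a sketch under assumptions not stated in this bug: that the 12-byte prefix is the 8-byte magic `mozLz40\0` followed by a 4-byte little-endian decompressed size, and that the payload is a single raw LZ4 block (as described in the LZ4 block format document linked in the description). The function names are hypothetical.

```python
import struct

# Assumption: 8-byte magic, then a 4-byte little-endian decompressed size.
MOZLZ4_MAGIC = b"mozLz40\0"


def _read_length(src: bytes, i: int, length: int):
    """Read LZ4 extension bytes: keep adding bytes until one is below 255."""
    while True:
        b = src[i]
        i += 1
        length += b
        if b != 255:
            return length, i


def lz4_block_decompress(src: bytes) -> bytes:
    """Decode one raw LZ4 block (a series of literal runs and match copies)."""
    out = bytearray()
    i, n = 0, len(src)
    while i < n:
        token = src[i]
        i += 1
        lit_len = token >> 4
        if lit_len == 15:
            lit_len, i = _read_length(src, i, lit_len)
        out += src[i:i + lit_len]
        i += lit_len
        if i >= n:
            break  # the last sequence carries literals only
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")
        match_len = token & 0x0F
        if match_len == 15:
            match_len, i = _read_length(src, i, match_len)
        match_len += 4  # the minimum match length is 4
        start = len(out) - offset
        for k in range(match_len):  # byte by byte: matches may overlap
            out.append(out[start + k])
    return bytes(out)


def decode_mozlz4(data: bytes) -> bytes:
    """Skip the 12-byte prefix and decompress the rest as an LZ4 block."""
    if data[:8] != MOZLZ4_MAGIC:
        raise ValueError("missing mozlz4 magic")
    (expected_size,) = struct.unpack_from("<I", data, 8)
    body = lz4_block_decompress(data[12:])
    if len(body) != expected_size:
        raise ValueError("decompressed size does not match the header")
    return body
```

Passing the bytes of a `.jsonlz4` file to `decode_mozlz4` should return the JSON text as bytes, if the assumptions about the prefix hold. A real lz4 library would be faster; this only shows that nothing beyond the prefix is Firefox-specific.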

Marco Bonardo [:mak]

Comment 14

7 years ago

Yes, the only non-standard thing is the header.

Priority: -- → P5

vlakoff

Comment 15

7 years ago

FWIW, I've been using this PHP code for several years: https://gist.github.com/vlakoff/3139e310664285c6c83b

Also, note that it's not the latest LZ4, but v1.3, which is not upward compatible.

Graham Perrin

Comment 16

7 years ago

(In reply to Stefan Endrullis from comment #12)
> … workarounds …

mozlz4-edit – Add-ons for Firefox
<https://addons.mozilla.org/addon/mozlz4-edit/>

> … open, edit and save …

janek

Comment 17

5 years ago

Another utility, for people trying to work around this: https://gist.github.com/Tblue/62ff47bef7f894e92ed5

What I'm wondering now - why compress at all? For me, and I guess for most other users, this file doesn't even reach a Megabyte. Yes, lz4 can compress it to around 10%, but who cares at that size?

Marco Bonardo [:mak]

Comment 18

5 years ago

It does less I/O when storing and retrieving on disk.

Marcel Partap

Comment 19

4 years ago

OK, from a quick benchmark, I'd suggest switching to zstd? It seems to compress/decompress slightly faster than LZ4 with triple the compression ratio...
(I unpacked a ~4MiB upgrade.jsonlz4 I had in my profile and duplicated its contents a random number of times, then ran multitime with mozlz4 (implemented in python3, which may skew the results), zstd and pigz commands packing and unpacking the test file, on a tmpfs, with BOINC paused on my hexacore machine.)

652M test.json
123M test.json.gz
144M test.json.lz4
 43M test.json.zst

Compressing:

++ multitime -v zstd -kf test.json
===> Executing zstd -kf test.json
===> multitime results
1: zstd -kf test.json
            Mean        Std.Dev.    Min         Median      Max
real        1.266       0.000       1.266       1.266       1.266       
user        1.363       0.000       1.363       1.363       1.363       
sys         0.178       0.000       0.178       0.178       0.178       

++ multitime -v sh -c 'mozlz4 -c < '\''test.json'\'' > '\''test.json'\''.lz4'
===> Executing sh -c "mozlz4 -c < 'test.json' > 'test.json'.lz4"
===> multitime results
1: sh -c "mozlz4 -c < 'test.json' > 'test.json'.lz4"
            Mean        Std.Dev.    Min         Median      Max
real        1.353       0.000       1.353       1.353       1.353       
user        0.852       0.000       0.852       0.852       0.852       
sys         0.497       0.000       0.497       0.497       0.497       

++ multitime -v pigz -kf test.json
===> Executing pigz -kf test.json
===> multitime results
1: pigz -kf test.json
            Mean        Std.Dev.    Min         Median      Max
real        2.010       0.000       2.010       2.010       2.010       
user        22.482      0.000       22.482      22.482      22.482      
sys         0.454       0.000       0.454       0.454       0.454

Decompression:

++ multitime -v zstd -dc test.json.zst
===> Executing zstd -dc test.json.zst
===> multitime results
1: zstd -dc test.json.zst
            Mean        Std.Dev.    Min         Median      Max
real        0.368       0.000       0.368       0.368       0.368       
user        0.363       0.000       0.363       0.363       0.363       
sys         0.004       0.000       0.004       0.004       0.004       

++ multitime -v sh -c 'mozlz4 -d < test.json.lz4 > /dev/null'
===> Executing sh -c "mozlz4 -d < test.json.lz4 > /dev/null"
===> multitime results
1: sh -c "mozlz4 -d < test.json.lz4 > /dev/null"
            Mean        Std.Dev.    Min         Median      Max
real        0.863       0.000       0.863       0.863       0.863       
user        0.408       0.000       0.408       0.408       0.408       
sys         0.455       0.000       0.455       0.455       0.455       

++ multitime -v pigz -dc test.json.gz
===> Executing pigz -dc test.json.gz
===> multitime results
1: pigz -dc test.json.gz
            Mean        Std.Dev.    Min         Median      Max
real        1.467       0.000       1.467       1.467       1.467       
user        2.184       0.000       2.184       2.184       2.184       
sys         0.174       0.000       0.174       0.174       0.174
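
To put numbers on the "triple the compression ratio" remark, the file sizes listed above can be turned into ratios with a few lines of Python. This is only arithmetic on the four sizes reported in this comment (in the "M" units shown); the variable names are hypothetical.

```python
# Sizes as listed above, in the "M" units reported by the commenter.
original = 652
compressed = {"gzip (pigz)": 123, "lz4 (mozlz4)": 144, "zstd": 43}

# Compression ratio = original size / compressed size (higher is better).
ratios = {name: original / size for name, size in compressed.items()}

# How much smaller the zstd output is than the lz4 output.
zstd_vs_lz4 = compressed["lz4 (mozlz4)"] / compressed["zstd"]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
print(f"zstd output is {zstd_vs_lz4:.1f}x smaller than lz4 output")
```

Note that every Std.Dev. value in the timings is 0.000, which suggests each command was run only once, so the timing differences should be read as rough indications.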

mozbugz

Comment 20

3 years ago

For anyone coming across this needing to decompress these files:

/* 
NOTE: BEFORE RUNNING THIS SCRIPT, CHECK THIS SETTING:
Type or paste about:config into the address bar and press Enter
Click the button promising to be careful
In the search box type devt and pause while Firefox filters the list
If devtools.chrome.enabled is false, double-click it to toggle to true

Paste this entire script into the command line at the bottom of the Browser Console (Windows: Ctrl+Shift+j)
Then press Enter to run the script. A file picker should promptly open.
*/

async function convert() {
  // Set up file chooser
  var fp = Components.classes["@mozilla.org/filepicker;1"]
    .createInstance(Components.interfaces.nsIFilePicker);
  fp.init(window, "Open File", Components.interfaces.nsIFilePicker.modeOpen);
  fp.appendFilter("Bookmark Backup Files", "*.jsonlz4");
  // Open the file chooser and wait for the user's choice
  var result = await new Promise(resolve => fp.open(resolve));
  // Proceed only if a file was chosen
  if (result == Components.interfaces.nsIFilePicker.returnOK) {
    var file = fp.file;
    // Check that file can be used
    if (file.exists() && file.isFile() && file.isReadable()) {
      var oldfile = file.path;
      // Construct output file name
      var newfile = oldfile.replace(".jsonlz4", "_converted.json");
      // See: http://forums.mozillazine.org/viewtopic.php?p=14111285#p14111285
      var {utils:Cu} = Components;
      Cu.import("resource://gre/modules/osfile.jsm");
      // Without an encoding option, read() resolves to the decompressed bytes
      var data = await OS.File.read(oldfile, { compression: "lz4" });
      // Wait for the write to finish so that errors are not silently dropped
      await OS.File.writeAtomic(newfile, data);
    }
  }
}
convert()

BMO Automation

Updated

2 years ago

Severity: normal → S3

BugBot [:suhaib / :marco/ :calixte]

Comment 21

2 years ago

The severity field for this bug is relatively low, S3. However, the bug has 25 votes and 52 CCs.
:mossop, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(dtownsend)

BugBot (nomail) [:suhaib / :marco/ :calixte]

Comment 22

2 years ago

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(dtownsend)

Comment hidden (advocacy)

People, it is important to support open formats.

I know Mozilla does not intend to prevent interoperability using this format. But other companies regularly try this. Remember Microsoft in the 90s with the "Embrace, extend, and extinguish"?

Mozilla should lead by example and support open formats as much as possible. This is a trivial modification and could be done easily.

I understand this is a very little thing, but remember your manifesto:

Principle 6

The effectiveness of the internet as a public resource depends upon interoperability (protocols, data formats, content), innovation and decentralized participation worldwide.

This should be accomplished on the little things too.

Dave Townsend [:mossop]

Comment 24

1 year ago

I don't think there is any value in posting further comments in support of this. We're on board with doing it; we just don't currently have the resources available to do it. If someone wants to work on implementing it, then we would review a patch.