Bugzilla

Reporter

Comment 8

•

7 years ago

(In reply to Punam Dahiya [:pdahiya] from comment #7) > Hi Andy, I am working on a prototype (bug 1357160) that need some way to > update target.path after download is created and since your team explored, > it will be great if you can share issues found in getting this implemented. > Thanks In short, there are at least three different paths for starting downloads in Firefox, and not all of them are written in a way that facilitates asynchronously determining the target filename. The three paths I know of are: 1. Downloads.createDownload() http://searchfox.org/mozilla-central/rev/20963d7269b1b14d455f47bc0260d0653015bf84/toolkit/components/jsdownloads/src/Downloads.jsm#70 2. ContentAreaUtils.DownloadURL() http://searchfox.org/mozilla-central/rev/20963d7269b1b14d455f47bc0260d0653015bf84/toolkit/content/contentAreaUtils.js#793 3. From uriloader/exthandler/nsExternalHelperAppService.cpp (difficult to point to a single line of code there...) It is the third path that is most problematic for onDeterminingFilename. The ideal solution would be significant refactoring of that code path to move a bunch of the logic into the Download objects in toolkit/components/jsdownloads/ but that is a pretty substantial undertaking.

Flags: needinfo?(amckay)

Tom S [:evilpie]

Comment 9

•

7 years ago

This seems to be a big issue for Downloader add-ons. (i.e https://github.com/marklieberman/downloadstar/issues/9) How hard would it be to just support this for downloads created with chrome.downloads API?

Reporter

Comment 10

•

7 years ago

(In reply to Tom Schuster [:evilpie] from comment #9) > How hard would it be > to just support this for downloads created with chrome.downloads API? I'm not sure, but why couldn't an extension that wants to use this on its own downloads just supply the filename parameter when it first creates the download?

Comment 11

•

7 years ago

(In reply to Andrew Swan [:aswan] from comment #10) > (In reply to Tom Schuster [:evilpie] from comment #9) > > How hard would it be > > to just support this for downloads created with chrome.downloads API? > > I'm not sure, but why couldn't an extension that wants to use this on its > own downloads just supply the filename parameter when it first creates the > download? That was already explained in the link given. It requires each and every download be requested via a lower-level API like XMLHttpRequest to determine the filename, then cancelled, then requested again through chrome.downloads, which has significant overhead and breaks sites which only allow a single request per generated URL.

Reporter

Comment 12

•

7 years ago

(In reply to Stephan Sokolow from comment #11) > (In reply to Andrew Swan [:aswan] from comment #10) > > (In reply to Tom Schuster [:evilpie] from comment #9) > > > How hard would it be > > > to just support this for downloads created with chrome.downloads API? > > > > I'm not sure, but why couldn't an extension that wants to use this on its > > own downloads just supply the filename parameter when it first creates the > > download? > > That was already explained in the link given. > > It requires each and every download be requested via a lower-level API like > XMLHttpRequest to determine the filename, then cancelled, then requested > again through chrome.downloads, which has significant overhead and breaks > sites which only allow a single request per generated URL. For that scenario, this bug alone is not enough, bug 1254327 would also need to be fixed.

Tom S [:evilpie]

Comment 13

•

7 years ago

I looked into this a bit, but it seems even fixing this just for downloads started with browser.downloads.download is hard. From what I can tell we always create the target file path/name even before starting the download. We don't even currently seem to look at the Content-Disposition header.

BMO Automation

Updated

•

6 years ago

Product: Toolkit → WebExtensions

Mark Striemer [:mstriemer]

Reporter

Updated

•

5 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1579456

Comment 16

•

5 years ago

I had tried this at some point in the past and also concluded that it is hard. The API didn't seem compatible with how downloads are handled in Firefox.

Assignee: mstriemer → nobody

Type: defect → enhancement

Comment 17

•

5 years ago

Move the "destination stuff" (conflict action, making directories) into DownloadCore.
Implement optional Content-Disposition handling in DownloadCore (which is a nice to have regardless), falling back to deriving the name from either the original URL or the final URL (not sure which would be best). With some sanity checks, of course. Optional in that it should be possible to force a name (ext-downloads needs this). Not sure if legacy/pdf need to actually handle this, as it's probably handled elsewhere already, but copysaver certainly does.
Have DownloadCore notify the web extensions backend (and/or any other interested parties) and wait for the responses before moving the file to the final target path. A simple registerSuggestionProvider() type API where a provider is essentially a function (download): Promise<string?> would suffice to track what to notify.
Have ext-downloads observe the notifications, and in turn fire onDeterminingFilename on extension listeners, validate the results according to API rules and report the result back to DownloadsCore.
DownloadsCore collects the suggestions, validates according to it's rules (things like reserved names in Windows and/or reserved characters should be avoided) and if any is acceptable changes the final target path accordingly, considering the conflict action.
DownloadsCore creates the directories as necessary and moves the file to the final target path.

One hurdle is that the notification stuff probably needs to be async because IPC is involved to call into extensions. But that is solvable by delaying the final move until the results are in.

Of course, once the final target location changes, the part file will be in the wrong place too. Not sure if it should be moved (which requires flushing the file, closing it, moving it, reopening it) or if it is OK to keep it in the original location. I seem to remember the .part location is derived from the target location, which would mean it has to be moved so it can be found if the download is stopped and resumed.

mak, I am not entirely sure when what implementation of the Trinity that is DownloadsCopySaver, DownloadsLegacySaver, DownloadsPDFSaver kicks in. DownloadsCopySaver seems to be for creating new downloads ad-hoc, while DownloadsLegacySaver will adopt an existing channel (e.g. when a user clicked a link that resulted in a download)? And DownloadsPDFSaver is special for pdfjs integration?
Could you shed some light?
Also, does this sound like a plan, in general?

Flags: needinfo?(mak77)

Comment 18

•

5 years ago

(In reply to Nils Maier [:nmaier] from comment #17)

mak, I am not entirely sure when what implementation of the Trinity that is DownloadsCopySaver, DownloadsLegacySaver, DownloadsPDFSaver kicks in. DownloadsCopySaver seems to be for creating new downloads ad-hoc, while DownloadsLegacySaver will adopt an existing channel (e.g. when a user clicked a link that resulted in a download)? And DownloadsPDFSaver is special for pdfjs integration?

DownloadLegacySaver is for downloads initiated by the network stack, like clicking on a link, as you said. The network starts the download in background and we adopt it. In the past I suggested renaming it to DownloadAdoptingSaver, because Legacy doesn't really say anything (there was a plan to get full control of the download but it never went through), never filed a bug for that renaming though.
DownloadCopySaver is for downloads initiated by the downloads code, we retain full control of the process.
DownloadPDFSaver is special because it creates a download from a live PDF document, it's pretty limited in functionality.

I still have to evaluate the plan, will do asap, so I'm leaving the needinfo.

Comment 19

•

5 years ago

(In reply to Nils Maier [:nmaier] from comment #17)

Move the "destination stuff" (conflict action, making directories) into DownloadCore.

This is a bit generic, it would benefit from making a more detailed plan. Maybe you should try to do a whole breakdown of the needed changes and keep it in a gdoc or similar, then dependencies can be filed from it.

Implement optional Content-Disposition handling in DownloadCore (which is a nice to have regardless), falling back to deriving the name from either the original URL or the final URL (not sure which would be best).

This sounds like related, but not strictly necessary for this specific bug. Is it?

With some sanity checks, of course. Optional in that it should be possible to force a name (ext-downloads needs this). Not sure if legacy/pdf need to actually handle this, as it's probably handled elsewhere already, but copysaver certainly does.

The legacy saver will also have to handle onDeterminingFilename. It looks like there may be a general path to go through onDeterminingFilename, if every saver needs it.

One hurdle is that the notification stuff probably needs to be async because IPC is involved to call into extensions. But that is solvable by delaying the final move until the results are in.

Yes, using await it won't be crazy, I'd still probably race the promises with a timeout, we don't want to wait forever for a broken extensions.

Of course, once the final target location changes, the part file will be in the wrong place too. Not sure if it should be moved (which requires flushing the file, closing it, moving it, reopening it) or if it is OK to keep it in the original location. I seem to remember the .part location is derived from the target location, which would mean it has to be moved so it can be found if the download is stopped and resumed.

Both the placeholder and .part must be moved. If it's too complex to do while ongoing, it should be stored and persisted (eventually) in downloads.json, then done when the download is finished, though this may confuse the user. BackgroundFileSaver.cpp is doing that move internally I think, unfortunately I don't know that part very well, but it can be studied.

Flags: needinfo?(mak77)

Comment 20

•

5 years ago

(In reply to Marco Bonardo [::mak] from comment #19)

Yes, using await it won't be crazy, I'd still probably race the promises with a timeout, we don't want to wait forever for a broken extensions.

I have mixed feelings on this because I have a history of running into tab/extension loadouts which cause the browser to bog down for long periods of time and I wouldn't want brokenness elsewhere to randomly cause apparent breakages here.

(eg. At the moment, my chrome process will seize up at 100% of one core for up to a minute once or twice a day and, since most of my extensions are privacy or security extensions and I don't know how to reliably reproduce the problem on demand, I've been unwilling to risk doing an enable/disable binary search among them.)

Comment 21

•

5 years ago

(In reply to Stephan Sokolow from comment #20)

I have mixed feelings on this because I have a history of running into tab/extension loadouts which cause the browser to bog down for long periods of time and I wouldn't want brokenness elsewhere to randomly cause apparent breakages here.

I'm not sure I get your concern, my suggestion is just to unblock the download after some seconds if an add-on never replies to the callback. That has no relation with hangs or heavy load.

Comment 22

•

5 years ago

(In reply to Marco Bonardo [::mak] from comment #21)

(In reply to Stephan Sokolow from comment #20)

I have mixed feelings on this because I have a history of running into tab/extension loadouts which cause the browser to bog down for long periods of time and I wouldn't want brokenness elsewhere to randomly cause apparent breakages here.

I'm not sure I get your concern, my suggestion is just to unblock the download after some seconds if an add-on never replies to the callback. That has no relation with hangs or heavy load.

My concern is that heavy load elsewhere in the browser may slow the extension down to the point that the timeout fires before the extension manages to respond through no fault of its own.

Comment 23

•

5 years ago

I understand it's an unpredictable situation, but If the timeout is large enough it should not be a big deal, and still better than having downloads get stuck forever, imo.

Comment 24

•

5 years ago

The onDeterminingFilename API is such that it receives a suggest function which it must call synchronously before the listener function returns. If that function is not called, then it's just regarded as no suggestion provided. An extension therefore cannot forget to call and stall the download indefinitely.

What makes the entire thing async is the need for the WebExtension implementation to relay the event from the main process into the add-on process and relay back the result. It is my understanding this cannot stall forever either (if implemented correctly), and will throw an Error in the main process in the worst case if the add-on process dies prematurely

Thus, dealing with timeout should not be necessary.

Comment 25

•

5 years ago

Attached patch WIP patch v0 (deleted) — Details — Splinter Review

Here is a work in progress patch.
What's missing still:

tests for the new functionality
onDeterminingFilename for DownloadsLegacy (and pdf)
hooking up call sites other than the WebExtensions API with regards to allowNameDetermination.

What's working:

Determining the file name from channel meta data (Content-Disposition or redirected URL)
Existing tests (well, I tested the components/downloads xpcshell tests and the ext-download ones for now)
onDeterminingFilename for WebExtension-initiated downloads (which is my primary use case to begin with)

Marco, you wanted a more detailed plan, so here it is kinda ;)
I'd like some feedback to the overall approach (not a full code review with nitpicking etc). Are there any major showstopper issues you see there?

Also, as for hooking up other call sites, IMO downloads should be onDeterminingFilename-able unless the user specifically picked a file already. So e.g.

legacy downloads would generally be onDeterminingFilename-able unless the file picker was shown,
"Save Link As" asks to pick a save location so it shouldn't be onDeterminingFilename-able
The PDF download button normally does not show a file picker, so it normally should be onDeterminingFilename-able
extension-initiated downloads should be onDeterminingFilename-able unless they specify a filename in the create options (this is implemented already, and to match Chrome behavior).

I think this would pretty much match Chrome's behavior.
Does this sound reasonable?

Then there is the issue of moving the part file around or not. The issue here is that the part file location is user-defined and does not necessarily have any relationship with the target file, tho often it is just target + ".part". So I see three options now

Leave it be. Easiest, implementation-wise. The only minor drawback might be that the user gets confused when the .part file and the target file seem unrelated. Then again, they might be already "unrelated", and we do not have to move it, meaning less I/O, and one piece less that can break.
Always move it around (as long as name determination is allowed).
Only move it around when partFile === targetFile + ".part".

Thoughts?

Flags: needinfo?(mak77)

Comment 26

•

5 years ago

(In reply to Nils Maier [:nmaier] from comment #25)

legacy downloads would generally be onDeterminingFilename-able unless the file picker was shown,

"Save Link As" asks to pick a save location so it shouldn't be onDeterminingFilename-able

I seem to remember Chrome's behaviour being to fire onDeterminingFilename before showing the file picker so the extension could change the default that was presented to the user.

In fact, that's one of my target uses. To do things like "If the download is coming from domain X and it's a Zip or RAR file, override the extension in the Save As picker to .cbz or .cbr, respectively".

Comment 27

•

5 years ago

I seem to remember Chrome's behaviour being to fire onDeterminingFilename before showing the file picker so the extension could change the default that was presented to the user.

Interesting, kinda makes sense. I wasn't aware of this.
This might be somewhat hard to implement in Firefox I suppose as - if I remember correctly - downloads are not even created until after such a picker is shown.

In fact, that's one of my target uses. To do things like "If the download is coming from domain X and it's a Zip or RAR file, override the extension in the Save As picker to .cbz or .cbr, respectively".

Is that public or even open source? And some instructions?
I'd like to play with it.

Flags: needinfo?(from_bugzilla2)

Comment 28

•

5 years ago

(In reply to Nils Maier [:nmaier] from comment #27)

Is that public or even open source? And some instructions?
I'd like to play with it.

It was just a hard-coded one-off thing that I was side-loading back before Chrome introduced an "opening the Save As dialog has a small but non-zero probability of causing the UI to stop responding to X11 input events" bug that wound up driving me back to Firefox exclusively.

The .cbz example was for a site that you can't just create a free account on and get testable downloads, but I think I also had it correcting Twitter images to change ".jpeg:large.jpg" to ".jpg".

I'll see if I've still got a copy kicking around after I've slept.

Comment 29

•

5 years ago

As I suspected and kinda remembered the legacy downloads (that is initiated from clicking links or using save as) go through HelperAppDlg.jsm which handles the prompts (both, the generic save/launch one, and the file picker for save-as) before they hit nsITransfer (DownloadLegacyTransfer). So implementing onDeterminingFilename before those prompts are shown would be a HUGE task involving essentially rewriting the entire click-to-transfer pipeline.

Comment 30

•

5 years ago

Attached file Example Chrome extension which suggests names for Save As (deleted) — Details

Sorry for the delay.

I'm not sure whether I should share the domain where I was doing the .zip to .cbz correction, but the extension's hard-coded ruleset did also contain an entry for fixing .jpeg_large to .jpg when downloading images off Twitter which I left in.

The attached Zip contains it in unpacked form for easy exploration and I just verified that it does modify the Save As dialog's contents under Chromium 77.0.3865.90.

Flags: needinfo?(from_bugzilla2)

Comment 31

•

5 years ago

Also, I just realized another use I'd want to put such a thing to... Rewriting the default filenames provided to Ctrl+S so I don't have to keep manually swapping in underscores for characters like colons that are valid on Linux+ext2/3/4 and often show up in page titles, but are not valid under FAT32, exFAT, or NTFS's Win32 personality.

(I prefer not to have to load my "Web Page, complete" saves on a Linux machine and re-save them to make them compatible with flash drives and Windows machines without breaking the references to the files in ..._files.)

Comment 32

•

5 years ago

Thanks for the infos.
And, I agree, there are use cases for applying file name changes before showing any pickers.

I think I was tired and not entirely thinking straight when I said it would be a massive undertaking. to implement this... Just patching in the determineFileName step into HelperAppDlg.jsm, right before it does determine the name itself, should be fine.

Tynach

Comment 33

•

5 years ago

Something regarding Chrome's behavior that might be related to this, is that if I click to 'save link as' for a large file and wait a while with the file picker up, and then choose a path to save the file to, the download will usually already be partly completed as though it had been downloading in the background while the file picker was open.

I know it's not directly related to this issue per se, but I think it reveals a bit about how Chrome manages downloads - that is, they do everything to start the download up front, before even giving the user a chance to override anything.

It might be worth debating whether Firefox should use a similar approach or not. On one hand, it means that by the time a user chooses a directory for the file to reside in, the file is already that much closer to being downloaded. On the other hand, if a user doesn't realize how large the file is and is on a metered connection, that portion of the file already downloaded might go against their data cap even if they cancel the transfer. There might be other pros and cons, but those are two things that I could come up with off the top of my head.

If Firefox chooses to mimic Chrome in this regard, then it sounds like much of the work that would be required to implement onDeterminingFilename would be done anyway. Otherwise, perhaps it's too much work to do that much refactoring and reorganization just for this one API feature.

Comment 34

•

5 years ago

Firefox stopped doing that at some point?

I'm certain I remember it behaving that way back during the height of the single-process, XUL UI era and assumed Chrome was copying Firefox on that front.

Tynach

Comment 35

•

5 years ago

I admit I haven't tested that, and I based my post on the wording in other comments, saying things like the download isn't 'created' until after the filepicker is shown, and discussing what to do if the user or script changes what directory to download the file to (if it has to move the .part file, or just finish downloading and then move it, and so on).

I might be misunderstanding the wording, too (perhaps 'after the filepicker is shown' still refers to before the user picks a location).

Comment 36

•

5 years ago

I haven't looked into it but my impression based on prior experience was that the download starts streaming immediately but the higher-level API object that would need to be fed to onDeterminingFilename doesn't get created until after a filename is chosen and something about Firefox's internal architecture makes that difficult to change.

Guoyou Ng

Comment 37

•

5 years ago

As a developer of an extension that renames downloads, this is one API I'm sorely missing from Firefox. I've had lots of user issues with trying to get filenames via a separate fetch request (cookies, Referer, Content-Disposition). The Chrome equivalent generally works great with far fewer lines of custom code. Having this API or an equivalent in Firefox would make my users much happier.

Reporter

Comment 38

•

5 years ago

Firefox does start the download before the final filename is assigned but that's not really relevant for this bug. The problem is that at some point a DownloadTarget object gets created, and all the rest of the downloads code assumes that once this object is created, the target filename never changes. Discussion related to this and an attempt at addressing it happened some time ago over on bug 1254327. If anybody is interested in working on this bug, reading up on that bug and potentially picking up the abandoned patch there could be a good place to start.

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1254327