Closed Bug 454059 Opened 16 years ago Closed 3 years ago

Creating PDF of web page: hyperlinks are lost.

Categories

(Core :: Printing: Output, enhancement, P2)

Tracking

RESOLVED FIXED
90 Branch
Tracking Status
firefox90 --- fixed

People

(Reporter: paul.noble, Assigned: jfkthame)

References

(Blocks 1 open bug)

Details

(Whiteboard: [print2020][layout:backlog])

Attachments

(6 files)

User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.1) Gecko/2008070206 Firefox/3.0.1
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.1) Gecko/2008070206 Firefox/3.0.1

Using Firefox to create a PDF of a web page, all hyperlinks are lost. The same process in Safari results in all hyperlinks being retained in the saved PDF file. Though it may relate to the PDF plug-in issue, I could not find specific reference to this in prior bug reports.

Reproducible: Always

Steps to Reproduce:
1. File > Print > PDF > Save as PDF.
2. View using Preview (default).
3. (Tried also viewing the same PDF files on a PC using Acrobat reader.)

Actual Results: PDF file does not retain hyperlinks.

Expected Results: PDF file should retain hyperlinks.

I can confirm this (that the hyperlinks are preserved in Safari but not in Firefox 3.0.1 on OS X 10.5.4). The same thing happens printing to a PDF file on Linux -- so this is presumably a cross-platform problem. The hyperlinks aren't preserved printing to PDF from FF2, so this isn't a regression. The hyperlinks aren't preserved printing to a PDF file in Opera, but they are in another WebKit browser (Shiira). So there may be some clues how to do this (preserve the hyperlinks) in WebKit source code.
Status: UNCONFIRMED → NEW
Component: General → Printing: Output
Ever confirmed: true
OS: Mac OS X → All
Product: Firefox → Core
QA Contact: general → printing
Hardware: Macintosh → All
I argue that this is an enhancement request, not a bug. You can't hyperlink a bit of ink on a piece of paper, so it's not surprising that when an application renders a document for printing it doesn't put hyperlinks in it. Putting hyperlinks in a PDF file is thus outside the scope of printing, and something that we can choose to add as an enhancement. On the other hand, if we had a specific feature to save a web page as a PDF (requested in bug 162659), then how this behaves would be another matter.
Severity: major → enhancement
Who says PDFs are printed on paper? Storing articles in PDF is standard procedure for me and basic workflow. The fact that I use the macOS print dialog is just historical legacy in how "Save to PDF" has evolved in macOS. I really would love to see this. It's a bit painful having to switch browsers just to be able to have a PDF of an article where I do not lose all hyperlinks.
(In reply to steve-_- from comment #7)
> Who says PDFs are printed on paper?

Nobody. Printing, as a concept, is putting ink onto paper or a similar medium. If a particular OS has PDF output built into its printing facility, or the computer has on it a printer driver for generating PDF files, then normally the content of the PDF generated thereby will be the same as the content of the paper document that the printing facility would ordinarily produce, because printing isn't designed to do things that can't be done with ink on paper. I'm not saying it wouldn't be useful (though it should be done by implementing bug 162659, so that the feature isn't exclusive to the macOS version), just that producing hyperlinked documents isn't what printing is designed to do. Hence my claim that this is an enhancement request.
Also reported by an Ubuntu user on Ubuntu Launchpad Tracker: https://bugs.launchpad.net/ubuntu/+source/thunderbird/+bug/1551949 Thanks G

Noticed that when saving as pdf on macOS via print dialog, links are intact when using Firefox 66.0b6. Has this bug been fixed?

(In reply to steve-_- from comment #10)

> Noticed that when saving as pdf on macOS via print dialog, links are intact when using Firefox 66.0b6. Has this bug been fixed?

Sorry, excuse the noise, it's just that URLs are clickable. But links behind text are still lost :(

(In reply to steve-_- from comment #11)

> (In reply to steve-_- from comment #10)
>
> > Noticed that when saving as pdf on macOS via print dialog, links are intact when using Firefox 66.0b6. Has this bug been fixed?
>
> Sorry, excuse the noise, it's just that URLs are clickable. But links behind text are still lost :(

So, if a web page contains a URL as part of the visible text of the page, it will be hyperlinked. Correct? This suggests to me that macOS does this out of the box when printing to PDF, and it would equally happen when printing to PDF from another program, such as a word processor or plain text editor.

Has anyone found a workaround for this now 11-year-old ticket?

I am in need of this feature and don't want to have to hop back to Chrome just for this. I am making academic journal article templates for PDF printing using the paged.js library, so this feature is very important to us! Linking citations from inline to the bibliography is a standard feature these days for PDF articles (which are now overwhelmingly used on the computer rather than in print), as is linking to figures, tables, equations, etc.

It would be great if anyone has a workaround for this that I could use for now; otherwise the project will need to go back to Chrome.

Thanks!

One issue here is that the output is deceptive.

The document is rendered such that the links look like links: they are rendered in blue, and underlined.

I was fooled by this and sent a document to someone without actually trying the links.

If you're not going to make it work, the least you could do is not fake the appearance, you know? Would it be difficult to pop up a dialog box or something?

"this document contains hyperlinks; these will not work [Save PDF Anyway] [Cancel]"

There doesn't seem to be an elegant way to +1 here... So consider this my +1.

It's easy enough to switch over to Safari or Chrome to create the pdfs, but I'd much rather not have to leave Firefox to do this.

(In reply to s.parsons from comment #17)

> There doesn't seem to be an elegant way to +1 here... So consider this my +1.

Bugzilla has a voting feature. If you open the 'Details' panel you should see it.

FWIW cairo has functionality for creating links as of v1.16, but it looks like the in-tree cairo is too old (v1.9.5).

Blocks: 1601429
Whiteboard: [print2020]

(In reply to Jonathan Watt [:jwatt] from comment #19)

> FWIW cairo has functionality for creating links as of v1.16, but it looks like the in-tree cairo is too old (v1.9.5).

I suspect it wouldn't be too difficult to port that functionality over to our version.

(In reply to kaz from comment #16)

> One issue here is that the output is deceptive.
>
> The document is rendered such that the links look like links: they are rendered in blue, and underlined.

This is handled by CSS. In order to not have links highlighted, there would need to be a CSS rule specified as @media print, either in the webpage CSS or in the default stylesheet.

> I was fooled by this and sent a document to someone without actually trying the links.
>
> If you're not going to make it work, the least you could do is not fake the appearance, you know? Would it be difficult to pop up a dialog box or something?
>
> "this document contains hyperlinks; these will not work [Save PDF Anyway] [Cancel]"

Given that nearly all web pages have hyperlinks, I think that would look somewhat ridiculous, and annoy a user who is creating PDFs of several webpages. Furthermore, I'm not sure if it's possible to detect if the user has selected 'Save as PDF' under macOS, or is using a PDF printer driver under any OS.

(In reply to Stewart Gordon from comment #21)

> (In reply to kaz from comment #16)
>
> > The document is rendered such that the links look like links: they are rendered in blue, and underlined.
>
> This is handled by CSS. In order to not have links highlighted, there would need to be a CSS rule specified as @media print, either in the webpage CSS or in the default stylesheet.

Well, let's see. It's not going to be handled in the web page CSS, is it.

Because the web page author doesn't care that you're using Firefox to save the page as PDF, and that it happens to strip hyperlinks of their functionality while retaining their styling.

Maybe the save-to-PDF feature should pull a piece of CSS from behind the magic curtain, and apply it to de-style the links that it has no intention of making work.

> > If you're not going to make it work, the least you could do is not fake the appearance, you know? Would it be difficult to pop up a dialog box or something?
> >
> > "this document contains hyperlinks; these will not work [Save PDF Anyway] [Cancel]"
>
> Given that nearly all web pages have hyperlinks, I think that would look somewhat ridiculous, and annoy a user who is creating PDFs of several webpages.

It was not my intent to design the exact UI in my comment, and still isn't; but can we pretend I had included the obligatory "[ ] don't show me this dialog again".

> Furthermore, I'm not sure if it's possible to detect if the user has selected 'Save as PDF' under macOS, or is using a PDF printer driver under any OS.

I'm pretty sure this entire bug is not about using a "PDF driver under any OS" but saving a PDF from Firefox to a file.

(In reply to kaz from comment #22)

> (In reply to Stewart Gordon from comment #21)
>
> > (In reply to kaz from comment #16)
> >
> > > The document is rendered such that the links look like links: they are rendered in blue, and underlined.
> >
> > This is handled by CSS. In order to not have links highlighted, there would need to be a CSS rule specified as @media print, either in the webpage CSS or in the default stylesheet.
>
> Well, let's see. It's not going to be handled in the web page CSS, is it.
>
> Because the web page author doesn't care that you're using Firefox to save the page as PDF, and that it happens to strip hyperlinks of their functionality while retaining their styling.

I disagree. To "strip hyperlinks of their functionality while retaining their styling", as you put it, is equally what happens when printing to paper. So some web page authors might set such CSS to counteract this behaviour.

> Maybe the save-to-PDF feature should pull a piece of CSS from behind the magic curtain, and apply it to de-style the links that it has no intention of making work.

This is somewhat off-topic for this bug report, which is about what happens when one uses the built-in feature of macOS to generate a PDF from the print dialog. Bug 162659 would be a better place to discuss this.

> I'm pretty sure this entire bug is not about using a "PDF driver under any OS" but saving a PDF from Firefox to a file.

It's about what happens under the Save as PDF functionality built into macOS as part of its printing provision. Firefox doesn't have the latter functionality at the moment. It's requested in bug 162659.

(In reply to Stewart Gordon from comment #23)

> I disagree. To "strip hyperlinks of their functionality while retaining their styling", as you put it, is equally what happens when printing to paper.

But PDF isn't paper, it's a purgatory between electronic and print media.

Sure, someone might decide to take a PDF and ink it onto a physical page, but until then the expectation is that hyperlinks will work. LaTeX is primarily used for creating physically printed documents, but even it has packages for hyperlinking.

(In reply to Conway from comment #24)

> Sure, someone might decide to take a PDF and ink it onto a physical page, but until then the expectation is that hyperlinks will work.

When using a PDF export feature within an application, yes. But this bug ticket is about the scenario that all the application is doing is printing.

(In reply to Stewart Gordon from comment #25)

> (In reply to Conway from comment #24)
>
> > Sure, someone might decide to take a PDF and ink it onto a physical page, but until then the expectation is that hyperlinks will work.
>
> When using a PDF export feature within an application, yes. But this bug ticket is about the scenario that all the application is doing is printing.

Can you refrain from posting any more comments until you actually read the bug submitter's (now twelve-year-old) description and steps to reproduce?

(In reply to kaz from comment #16)

> If you're not going to make it work, the least you could do is not fake the appearance, you know? Would it be difficult to pop up a dialog box or something?

I'm sympathetic to where you're coming from, but this bug is filed against the platform code and is about implementing the missing functionality in the platform code. Platform developers do not make decisions about adding/changing frontend UI such as opening dialog boxes, so you would need to file a new bug against the 'Firefox' component to get that attention.

As for the rest of the discussion, from my perspective as a platform dev it adds a lot of text that we have to read through to check we're not missing anything once we get around to trying to fix this bug. The more bugs we have with conversations like that, the more unproductive time is used up reading through comments that don't help inform the development work involved. You're all welcome to comment, but please do keep that in mind and try to keep unnecessary comments to a minimum.

Yes, please; let's get back to the core issue. The Firefox print function does not save functioning hyperlinks when the output is PDF. I dislike having to switch to Chromium (I use Linux) to create a PDF with functioning hyperlinks. Can we please get Firefox to create a PDF with hyperlinks that work? As far as I'm concerned, it can be under either the 'Save Page As ...' menu item or the 'Print ...' menu item this bug was opened under. Thanks!

I got a question on chat about using SkiaPDF instead of cairo for this and I might as well add what info I have here.

SkiaPDF does indeed have functionality to annotate the PDFs it generates. However, although we use Skia, we only made an experimental start on integrating and using SkiaPDF as a printing backend. Finishing that off is probably still a fair amount of work. For anyone looking into fixing this bug the expedient way forward is most likely to try porting the functionality from a newer version of cairo to our old, in-tree version, as suggested by Jeff in comment 20.

Severity: normal → N/A
Priority: -- → P2
Whiteboard: [print2020] → [print2020][layout:backlog]
Depends on: 1689995
Depends on: 739096

With the cairo update in bug 739096, the functions cairo_tag_begin and cairo_tag_end become available, which can be used (with the "Link" tag type) to generate links in PDF output.

There are two ways to use this. One approach is to issue cairo_tag_begin with an associated URL, then do some drawing, and then do cairo_tag_end; the PDF backend will collect the bounding rect of whatever gets drawn in between the begin/end pair, and generate a link for this area. The alternative is to explicitly pass a rect (or list of rects) to cairo_tag_begin, rather than relying on the backend to collect the affected area.
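To make the two calling styles concrete, here is a minimal sketch of the attribute string each one would pass to cairo_tag_begin with the "Link" tag. It assumes cairo's key=value attribute syntax, with a `uri` string attribute and an optional `rect` array of x/y/width/height values; the helper names `escape_attr_string` and `link_attributes` are made up for illustration and are not part of cairo or Gecko.

```python
def escape_attr_string(value: str) -> str:
    """Escape backslashes and single quotes for a quoted attribute value."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def link_attributes(uri: str, rects=None) -> str:
    """Build an attribute string for a cairo "Link" tag (illustrative helper).

    rects is an optional list of (x, y, width, height) tuples.
    Without rects, the backend is left to collect the area drawn
    between the begin and end calls; with rects, the link area is
    stated explicitly.
    """
    attrs = f"uri='{escape_attr_string(uri)}'"
    if rects:
        flat = " ".join(str(v) for rect in rects for v in rect)
        attrs += f" rect=[{flat}]"
    return attrs


# Style 1: no rect, rely on whatever is drawn between begin and end.
accumulate = link_attributes("https://example.com/")

# Style 2: explicit rect(s), e.g. one per line box of a wrapped link.
explicit = link_attributes(
    "https://example.com/",
    [(72, 100, 120, 14), (72, 114, 40, 14)],
)
```

Either string would then be handed to cairo_tag_begin, followed (after any drawing) by the matching cairo_tag_end.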

I've experimented with both options, and it seems to me that we get a better result by explicitly passing the frame rect of whatever frame(s) are associated with the link, rather than depending on cairo's accumulation of the area between begin and end. This results in clickable PDF links that better match the clickable areas of the HTML document as viewed in the browser. (Which makes sense, as the active area of an HTML link is determined by the frame's area, not by the actual rendered ink.)
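As a rough illustration of why the two areas can differ, the sketch below compares a frame rect with the union of the ink rects drawn inside it. All numbers are invented, and `Rect` and `union` are hypothetical helpers, not Gecko or cairo types.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def union(rects):
    """Smallest rect covering all the given rects (the accumulated ink area)."""
    left = min(r.x for r in rects)
    top = min(r.y for r in rects)
    right = max(r.x + r.width for r in rects)
    bottom = max(r.y + r.height for r in rects)
    return Rect(left, top, right - left, bottom - top)


# Hypothetical link frame with some padding around its text.
frame_rect = Rect(100, 200, 80, 20)
# Hypothetical ink bounds of the two words drawn inside that frame.
ink_rects = [Rect(104, 205, 30, 9), Rect(138, 203, 36, 12)]
accumulated = union(ink_rects)

# A click in the padding hits the frame but misses the accumulated ink.
click = (102, 201)
in_frame = frame_rect.contains(*click)
in_ink = accumulated.contains(*click)
```

With these numbers the click lands inside the frame rect but outside the accumulated ink area, which is the kind of mismatch with the browser's clickable area that passing the frame rect avoids.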

Currently, only the PDF backend supports generating links like this. On macOS, our Save as PDF functionality actually goes via the cairo-quartz backend and the macOS printing architecture, rather than cairo's PDF backend. So to support links there, we'll need to add support for tag begin/end to the quartz surface. Fortunately, we don't currently need to implement everything that the PDF backend provides; just supporting tag_begin with the Link type and an explicit rect will provide enough functionality here.

(Eventually, it would be awesome to implement more Tagged PDF support -- e.g. to tag document structure, internal destinations, etc -- but that's for another day, another bug.)

This provides a basic Link() API on DrawTarget, intended to generate a link
for a single rectangular area (which will be the rect of a frame corresponding
to a link element).

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED

This implements a subset of the tag() function on the quartz surface backend;
just enough to support generating links in PDF output. In particular, the
only tag type supported is Link, and we require the link area to be passed
as a list of rects in the 'begin' call; we don't support accumulating all
drawing operations between 'begin' and 'end' into a link area.

Depends on D114205

Depends on D114206

Attachment #9220571 - Attachment description: Bug 454059 - Add a new PaintForPrinting display list builder mode, and only create a Linkifier when printing. r?mstange → Bug 454059 - Add a new PaintForPrinting display list builder mode. r?mstange
Attachment #9220571 - Attachment description: Bug 454059 - Add a new PaintForPrinting display list builder mode. r?mstange → Bug 454059 - Add a new PaintForPrinting display list builder mode, and only create a Linkifier when printing. r?mstange
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d4d8d352b8eb
Add a Link() method to DrawTarget, and implement it in DrawTargetCairo. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/d23a00ef79ac
Support Link in DrawTargetRecording and playback. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/47fc505e89a8
Add [minimal] tag support to cairo-quartz-surface.c. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/37a85c239237
Add a LINK display-item type. r=mattwoodrow
https://hg.mozilla.org/integration/autoland/rev/5d1efa61543d
Add a new PaintForPrinting display list builder mode, and only create a Linkifier when printing. r=mstange
https://hg.mozilla.org/integration/autoland/rev/26b31c2612b4
Generate hyperlinks in PDF output for HTML link elements. r=mstange,mattwoodrow

Cannot print when the link uses display: contents.

Steps to reproduce the problem:

1. Open data:text/html,<a href="http://a.b.c" style="display: contents">link</a>
2. Print as PDF.

related chromium issue: https://bugs.chromium.org/p/chromium/issues/detail?id=1160139&q=&can=4

Blocks: 1711064
Regressions: 1711064

Should this get a release-note?

Status: RESOLVED → VERIFIED
Status: VERIFIED → RESOLVED
Closed: 3 years ago

(In reply to Tom Schuster [:evilpie] from comment #42)

> Should this get a release-note?

I think so. I added "Print to PDF now produces working hyperlinks"

Regressions: 1721424
Regressions: 1748077

Thanks for now supporting links :)
... but it works well enough only for "external" links, and not as well for "local" ones that refer to places inside the same PDF.
... in that case, the generated link does not work at all (i.e. it is not clickable) when the PDF file is viewed in Firefox (which uses pdf.js internally),
and when the PDF file is viewed in an external PDF viewer the link does work, but as an external link pointing to the linked element in the original HTML file, not inside the same PDF file, which now contains a PDF equivalent of that HTML element ...
Below is a test case to show what I mean:

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>PDF test</title>
<style>
@media print{
h1{page-break-before: always;}
}
</style>
</head>
<body>
<div id="hello">Hello!</div>
<h1>Links ...</h1>
This is a test to show how <a href="http://www.example.com/">external</a> and <a href="#hello">internal</a> links are treated while printing HTML files into PDF.
</body>
</html>

(In reply to o.parhizkari from comment #44)

> Thanks for now supporting links :)
> ... but it works well enough only for "external" links, and not as well for "local" ones that refer to places inside the same PDF

Yes, we know. You can try setting print.save_as_pdf.internal_destinations.enabled to true in about:config to ask Firefox to create internal links in the PDF, but be aware that this functionality may be unreliable (which is why it's not currently enabled by default). See bug 1729276 and the other issues linked from there for more details.
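For readers unsure what separates the two kinds of link in the test case above, here is a small sketch of one way to classify an href as internal (same document plus a fragment) or external. It only illustrates the distinction and is not a description of how Firefox decides; `classify_link` and the example document URL are hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def classify_link(document_url: str, href: str):
    """Return ("internal", fragment) or ("external", absolute_url)."""
    resolved = urljoin(document_url, href)
    target, fragment = urldefrag(resolved)
    base, _ = urldefrag(document_url)
    # Internal: points back into the same document and names a fragment.
    if fragment and target == base:
        return ("internal", fragment)
    return ("external", resolved)


doc = "https://example.org/test.html"  # hypothetical location of the test page
internal = classify_link(doc, "#hello")
external = classify_link(doc, "http://www.example.com/")
```

In the test case, `#hello` would fall in the first group, so it would ideally become a jump to the "Hello!" element's position inside the PDF, while the example.com link stays a plain URI link.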
