I have left Harvard as of July 1, 2008 to take a position at NYU. This website has been cached and left static. Feel free to browse my new website, aka "What the heck is a Clinical Associate Professor?"

04.27.07

Configuring tex4ht to link to PDF

Posted in Courses, Computing, LaTeX at 7:19 am by leingang

As I wrote before, I’m a big fan of using TeX4ht to convert (La)TeX documents to HTML and its variants. One thing I liked was its professed configurability. So I gave it a whirl.

The HTML that TeX4ht produces still isn’t quite perfect. It needs some help with tables, and if I use custom list styles they don’t carry over (although that might be a browser thing, now that I come to think of it). So I wanted to insert two things into the HTML file: a <LINK> element in the document head indicating to any machine reader that the PDF version exists, and a note at the top of the document body indicating to any human reader that the PDF version exists and is more authoritative.

So here’s my linktopdf.cfg file, which I put in the same directory as the LaTeX file.

\Preamble{xhtml}
% You can put definitions here
\begin{document}
% This stuff goes in the <head> of the HTML document
\HCode{<link rev="alternate" media="print" href="\jobname.pdf" />}
\EndPreamble
% This goes at the beginning of the <body> of the document.
Note: This is an automatically generated HTML conversion of a LaTeX file, provided for convenience. The authoritative version is the \Link[\jobname.pdf]{}{PDF}PDF\EndLink\ version.

To compile a LaTeX file with this configuration use

$ htlatex foo.tex "linktopdf" "" "-cvalidate"

and that ought to do it! I’d kind of like the note to go after the title, rather than at the very beginning of the document, but that would take a little more reading about the configuration process. A more advanced approach would be to give the note a class and add CSS to the head that sets this paragraph in a different style.
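
For what it’s worth, here is a rough, untested sketch of that more advanced variant. It rests on a few assumptions: the class name pdfnote is made up for illustration, the styling goes through TeX4ht’s \Css command (which, as far as I know, writes the rule to the generated stylesheet rather than literally into the head), and wrapping the note in a div with \HCode may need some extra paragraph handling before the output validates.

```latex
\Preamble{xhtml}
% Assumption: "pdfnote" is a hypothetical class name.
% \Css sends this rule to the stylesheet that TeX4ht generates.
\Css{.pdfnote { font-size: small; font-style: italic; }}
\begin{document}
% Same <link> element as before, for machine readers
\HCode{<link rev="alternate" media="print" href="\jobname.pdf" />}
\EndPreamble
% Wrap the note in a <div> so the rule above applies to it
\HCode{<div class="pdfnote">}
Note: This is an automatically generated HTML conversion of a LaTeX file, provided for convenience. The authoritative version is the \Link[\jobname.pdf]{}{PDF}PDF\EndLink\ version.
\HCode{</div>}
```

If TeX4ht opens a paragraph around the note, the div and p tags could end up badly nested, so treat this as a starting point rather than a finished configuration.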

I thought for a while about whether I wanted to use a LINK REL or a LINK REV declaration. The question is: is it more useful or correct to say that the PDF file is a printable version of the HTML, or that the HTML is a screen version of the PDF? Perhaps the latter, but I don’t exactly know which would make a machine go get the PDF instead.
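
To make the choice concrete, the two candidates differ in a single attribute. This is only a side-by-side sketch of the line from the configuration file; just one of the two would actually go between \begin{document} and \EndPreamble.

```latex
% REL: the PDF is an alternate (print) version of this HTML page
\HCode{<link rel="alternate" media="print" href="\jobname.pdf" />}
% REV: this HTML page is an alternate version of the PDF
\HCode{<link rev="alternate" media="print" href="\jobname.pdf" />}
```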
