LEDA
document and graphics conversions
What to expect
LEDA uses a battery of programs to convert your RTF file into HTML and PDF
versions. Most of the time these will produce converted files that are reasonably
faithful to the original document. However, all conversion programs have limitations,
and it's possible that conversions will be less than perfect -- not surprising,
when you consider just how many things you can do with a word processor.
Here are some things you can do that will help your RTF documents look as good
as possible once converted:
- In general, RTF generated using Microsoft Word is superior to RTF generated
by WordPerfect; having written the standard, Microsoft is a little better
at conforming to it. Under some circumstances it may benefit you to open your
WordPerfect document in Word, then save it as RTF. A list of known
conversion bugs is below.
- In order to be converted, your graphics have to be incorporated in the RTF
file (as opposed to being linked as external objects). This is done differently
by different word processors; consult online help for the word processor you
use.
- Sizing and scaling of graphics is done by applying a one-size-fits-all scaling
algorithm that sometimes won't fit any; it tends to err on the large side.
If you like, you can adjust this; see the sections on conversions below.
- The PDF conversion (in particular) is fairly "broad-brush" --
fonts are available in only five sizes (extra-small, small, medium, large,
and extra-large) and a few families (e.g., proportional Times-Roman-like, sans-serif Helvetica-like,
and so on). Fine gradations of type size and style probably won't come through.
As with many other problems, it is well to bear in mind that the reader will
have access to the actual document as it originally appeared in the RTF version,
if fine points of appearance are critical.
- Some special characters are not available in HTML, and hence will not translate.
- Other character-translation problems arise because the various conversion
programs we use don't understand some characters in some character sets. In
PDF versions, this usually results in an empty rectangle (looking something
like a check box) in your document where the character in question should
be. We can improve this in future versions if you report
the problem to us.
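Before reporting a character problem, it can help to know which special characters a document actually contains. The following is a minimal sketch and not part of LEDA: it assumes the document uses the usual RTF character escapes, \'hh (a byte in the document's code page) and \uN (a Unicode code point), and the function name list_special_characters is hypothetical.

```python
import re
from collections import Counter

# RTF character escapes: \'hh (two hex digits) or \uN (decimal code point).
# Control words such as \ul or \uc1 are not matched, because a digit or a
# minus sign must follow the "u" directly.
ESCAPE = re.compile(r"\\'[0-9a-fA-F]{2}|\\u-?\d+")


def list_special_characters(rtf_text: str) -> Counter:
    """Count each character escape found in the RTF source (case-folded)."""
    return Counter(m.group(0).lower() for m in ESCAPE.finditer(rtf_text))


if __name__ == "__main__":
    sample = r"caf\'e9 \u8212? d\'E9j\'e0 \ul text\ulnone \uc1"
    for escape, count in list_special_characters(sample).most_common():
        print(escape, count)
```

The list of escapes, together with the name of the word processor that produced the file, is the kind of detail that makes a conversion report easier to act on.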
It's worth remembering that readers have access to the original RTF version
if an absolutely accurate version is needed, and that submitters have the option
of replacing our converted files with their own should they require something
different or better than what LEDA provides.
We are especially eager to hear about conversion problems -- so let
us know of any that you encounter.
Conversion into HTML -- the technical process
LEDA uses the popular rtf2html program for conversion to HTML. Its behavior
is controlled by configuration files found in the /usr/local/leda/etc directory,
and in /usr/local/leda/bin/rtf2html. You also have control over the translation
of special characters and the like, and of image scaling. We have found that
rtf2html is sufficiently feature-rich that we don't need to supplement it with
pre- or post-filtering scripts, though we may encounter a need to do so in the
future.
Full documentation for rtf2html is in HTML format on your LEDA server at /leda/manual/r2hdocs/guide.htm.
The documentation on the Logictran site is for a newer, commercial successor
to rtf2html but most of the same features are present. You may alter the configuration
files to suit your needs -- and specify alternate configurations for different
series you set up in LEDA. This is particularly helpful if you want different
logos or stylesheets for different journals, etc.
To set up a different HTML "look" for a series:
- Create a new html-trn file to control the conversion. It will probably be
easiest to copy the "vanilla" html-trn file (/usr/local/leda/etc/html-trn)
under a new name, then edit the new file in accordance with the documents
at /leda/manual/r2hdocs/guide.htm.
- Associate the new html-trn file with the series by editing the series at /leda/review/view_all_serials.php3.
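The first step above can be sketched in Python. This is only an illustration: the function name clone_translation_file and the example file name html-trn-myjournal are hypothetical, and copying the file by hand works just as well. The sketch refuses to overwrite an existing file so that a configuration already in use by another series is not clobbered.

```python
import shutil
from pathlib import Path


def clone_translation_file(vanilla: Path, new_name: str) -> Path:
    """Copy the vanilla html-trn file to new_name in the same directory."""
    target = vanilla.with_name(new_name)
    if target.exists():
        # Never overwrite a configuration another series may depend on.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(vanilla, target)
    return target


if __name__ == "__main__":
    # Path taken from the text above; the new name is a made-up example.
    new_file = clone_translation_file(
        Path("/usr/local/leda/etc/html-trn"), "html-trn-myjournal"
    )
    print("Edit", new_file, "and then associate it with the series.")
```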
Conversion into PDF -- the technical process
Unfortunately, Adobe does not sell a version of Acrobat Distiller that will
run under Linux (at least not at a reasonable price; an enterprise-scale
document conversion server product that creates PDF costs around $5K
at the time of this writing). As a result, we use a multistep process that first
transforms the RTF document into LaTeX, and from there to PDF:
- The RTF document is pre-filtered. This step is basically used to suppress
features (like document comments) that we don't want to see in the PDF document.
- The RTF is converted to LaTeX using the rtf2latex2e package.
- The intermediate LaTeX file is filtered; this adjusts footnote numbering,
takes care of graphics scaling, and so on.
- The LaTeX file is converted to PDF.
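To give a feel for what the pre-filtering step involves, here is a minimal sketch of removing document comments from RTF. It is not the LEDA pre-filter itself: it assumes comments are stored in {\*\annotation ...} groups, as the RTF specification describes, and the function name strip_rtf_groups is hypothetical. Because RTF groups nest, the sketch counts braces rather than relying on a regular expression.

```python
def strip_rtf_groups(rtf: str, opener: str = r"{\*\annotation") -> str:
    """Remove every group that starts with `opener`, nested groups included."""
    out = []
    i = 0
    n = len(rtf)
    while i < n:
        # Copy backslash pairs whole so an escaped brace (\{ or \}) is never
        # mistaken for the start or end of a group.
        if rtf[i] == "\\" and i + 1 < n:
            out.append(rtf[i:i + 2])
            i += 2
            continue
        if rtf.startswith(opener, i):
            depth = 0
            # Skip ahead to the brace that closes the unwanted group.
            while i < n:
                ch = rtf[i]
                if ch == "\\" and i + 1 < n:
                    i += 2
                    continue
                if ch == "{":
                    depth += 1
                elif ch == "}":
                    depth -= 1
                i += 1
                if depth == 0:
                    break
            continue
        out.append(rtf[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = r"{\rtf1 Hello{\*\annotation {\b note} text} world}"
    print(strip_rtf_groups(sample))
```

Passing a different opener removes other kinds of groups in the same way.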
If you're a serious conversion hacker, you can work magic by customizing the
various scripts involved. Depending on what you want to do, you may find it
easier to alter one or the other of our filtering scripts. We've found it easier
to use the RTF pre-filter to remove information and the LaTeX filter
to add or alter information.
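As an example of altering information at the LaTeX stage, the sketch below caps the width of every included graphic. It is an illustration only and does not reproduce the LEDA filter: it assumes the intermediate file places graphics with \includegraphics commands from the graphicx package, and the function name cap_graphics_width and the default of 0.8 are hypothetical choices.

```python
import re

# Matches \includegraphics{file} and \includegraphics[options]{file}.
INCLUDE = re.compile(r"\\includegraphics(\[[^\]]*\])?\{([^}]*)\}")


def cap_graphics_width(latex: str, fraction: float = 0.8) -> str:
    """Rewrite each \\includegraphics to a fixed fraction of the line width."""
    def rewrite(match: re.Match) -> str:
        filename = match.group(2)
        # Any existing options are discarded in favour of a single width.
        return "\\includegraphics[width=%g\\linewidth]{%s}" % (fraction, filename)

    return INCLUDE.sub(rewrite, latex)


if __name__ == "__main__":
    print(cap_graphics_width(r"\includegraphics{fig1.png}"))
```

A filter like this trades fidelity for predictability: every graphic comes out the same width regardless of its size in the original document, which may or may not suit a given series.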
Where to find more
/usr/local/leda/bin/ledacvt.sh is the shell script that controls the conversion
of LEDA documents; it in turn calls other programs found (for the most part)
in the /usr/local/leda/bin directory tree. Looking at the source is probably
the best way to understand exactly what's being done.
What to do if our conversions don't satisfy you
At the time your document was approved for release, you should have received a
confirming e-mail containing a password, document number, and URL that can be
used to substitute your own PDF or HTML files for the ones created automatically
by LEDA. This is a direct file-for-file replacement -- you can't restructure your
document completely, but you can substitute, for example, an HTML file you create for
one we create.
Known conversion bugs (last modified 3 January 2002):
- Hanging indents don't indent if the document originated in WordPerfect.