\def\dvips{\textsf{dvips}}
\title{\LaTeX, \dvips, \acro{EPS} and the web \ldots}
\author[Sebastian Rahtz]{Sebastian Rahtz\\
Elsevier Science Ltd\\The Boulevard, Langford
Lane\\Kidlington\\
Oxford, UK\\\texttt{s.rahtz@elsevier.co.uk}}
\begin{Article}
\begin{abstract}
Readers of \TeX\ question \emph{fora} like \texttt{comp.text.tex}
will often see people asking
what the issues surrounding Encapsulated PostScript are,
how one goes about making \acro{EPS} files from
\LaTeX\ output, and perhaps how to use them on the World Wide Web. 
This short note\footnote{Reprinted from \TUB{} 16(3) with kind
permission of Barbara Beeton.}
offers some suggestions.
\end{abstract}

\section{What and why is \acro{EPS}?}
\acro{EPS} stands for Encapsulated PostScript; \acro{EPS} files \emph{are}
PostScript, but they conform to a minimum standard of good behaviour.
This is so they can be included in other documents, possibly resized
or rotated. In practice \acro{EPS} means not using certain commands which have global
effects (don't worry, this is quite rare), and inserting structured
comments (starting with \verb|%%|) which tell other programs something
about the file.  The \emph{PostScript Language Reference Manual} goes
into great depth describing what these comments can contain, but the
minimum necessary for practical purposes is:
\begin{enumerate}
\item A first line starting \texttt{\%!PS-Adobe}; 
\dvips, for instance,
puts \texttt{\%!PS-Adobe-2.0 EPSF-2.0} in its output, meaning that it
claims conformance with version 2 of the \acro{EPS} standard
(we are now at version 3);
\item A `BoundingBox', like \texttt{\%\%BoundingBox: 33 101 584 715}
which tells applications how much space on the page is occupied.
\end{enumerate}
How do you turn \acro{PS} files into \acro{EPS} files? They probably are already, if
they come from a reputable bit of software (avoid anything from
Microsoft)\Dash a good check is to see if there is a BoundingBox.

You will come across three types of problem with files that look like
\acro{EPS}. Firstly, the BoundingBox may not be accurate; since this
determines how much space will be left in enclosing applications like
\TeX, it matters. Keith Reckdahl's recent tutorial in \TUB\ goes into
detail on this problem.

Secondly, your file may be \emph{serious} \acro{EPS}, and use all the
facilities of structured comments to specify what sort of resources
(fonts etc) it expects you to supply when you deal with it. This is
bad news if you are in the \TeX\ world outside a Macintosh. Look out for
lines with words like \texttt{ProcSetsNeeded}.

Thirdly, your file may think it is \acro{EPS}, but in fact breaks the rules,
and has weird PostScript in it. The rescue technique is to read it
with a forgiving PostScript interpreter, and get a new version written
out. Three programs to try are:
\begin{enumerate}
\item Adobe Acrobat Distiller; this turns PostScript files into \acro{PDF},
and Acrobat Exchange can then load them, and save them as ordinary
PostScript. Since it is written by Adobe, Distiller is an extremely
powerful PostScript interpreter, and can cope with almost anything you
throw at it. It is not cheap (except to academics), but worth having. 
\item Recent versions of Adobe Illustrator share some of the 
Acrobat code, and can read PostScript files, as well as edit \acro{PDF} files.
\item  The free Ghostscript is now a very mature and sophisticated
product. It understands all of the current Level~2
PostScript, and can turn it into a wide variety of bitmap
forms. Version~4 (released in June~1996) 
also performs many of the functions of Distiller, and it
already reads \acro{PDF} files and writes PostScript. Unfortunately, its
handling of PostScript text to \acro{PDF} is at present unfinished. However,
you can still use Ghostscript to read your PostScript and 
write it out again as a bitmap (e.g.~\acro{TIFF}).
\end{enumerate}

\section{What about \texttt{dvi} to Encapsulated PostScript?}
Most \TeX\ systems, free or commercial, supply a \texttt{dvi} to PostScript
driver; most of them write out more or less acceptable Encapsulated
PostScript, but three are especially well-featured (in the author's
experience): the Macintosh Textures driver, Y\&Y's \textsf{dvipsone} for
Windows and the free \dvips. Since the latter is available for all
platforms, is well-supported, and is probably the finest of its
type,\footnote{For several years, \textsf{dvipsone} has offered partial
downloading of fonts, a very powerful feature, but this is now coming
into \dvips; there are also flaws in \dvips' use of structured \acro{EPS}
comments, and Textures is superior in this respect.} 
we shall concentrate on that.

If you want to produce reusable PostScript output from \dvips{} (and
this includes output destined for Acrobat Distiller), the
absolute priority is to use outline fonts, not the PK fonts
traditionally used by \TeX. You can either use traditional 
fonts (usually commercial, like Adobe Times, but Ghostscript now comes
with an excellent free set donated by \acro{URW}) or Computer Modern itself
in PostScript Type 1 format. Either buy these from \acro{Y}\&\acro{Y} for Windows and
Unix or Blue Sky for Macintosh, or use Basil Malyshev's BaKoMa set, of
almost comparable quality.\footnote{Windows-worshippers may prefer to
get into the world of TrueType fonts, which are available for Computer
Modern from Kinch Computer Company.}

If you do not use outline fonts, and re-use your output scaled up, you
will not like the effect of Figure~\ref{tips1}
at all, compared to Figure~\ref{tips2}. If you want to turn your
documents into \acro{PDF}, Distiller will produce vile results from PK fonts.
\begin{figure*}
\includegraphics[width=\textwidth,height=1.2in]{tips1} 
\caption{Bitmap \acro{EPS} file, enlarged and distorted}\label{tips1}
\includegraphics[width=\textwidth,height=1.2in]{tips2} 
\caption{Outline font \acro{EPS} file, enlarged and distorted}\label{tips2}
\end{figure*}

The second priority is to get the right bounding box. Surprisingly
many applications cheat by simply making it the page size, regardless
of whether the whole area is used. \dvips{} does this by default too,
but has a command-line option \texttt{-E}, which asks it to try and
calculate the actual extent used. Note that \acro{EPS} files are, by
definition, only one page, so you also have to use \dvips{} options to
select just one page. There are two caveats when preparing the
input. Firstly, make sure you do not include a page number (try
\verb|\pagestyle{empty}| in \LaTeX), or else the bounding box will
cover that too. Secondly, \dvips{} does not always work out the extent of text
correctly. For instance, if you wrote (why, I have no idea):
\begin{verbatim}
Hello\raisebox{10pt}[0pt][0pt]{Up there}!
\end{verbatim}
you would be asking \LaTeX{} to raise \emph{Up there} off the
baseline, but to pretend that it has no effect on the height
calculation. \dvips{} will believe this, and calculate a bounding box
on the \emph{claimed} height.
If you use complicated add-in packages like
PSTricks, which add in arbitrary PostScript code, you will also end up in
real trouble.  In these cases you can either adjust the BoundingBox by hand,
or place invisible marks in \LaTeX\ to make sure that \dvips{}
recognizes the full extent.
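
As a minimal sketch of a suitable input file (the file name
\texttt{frag.tex} and the formula are only illustrative assumptions),
the following produces a single page with no page number:
\begin{verbatim}
% frag.tex: one fragment on one page
\documentclass{article}
\pagestyle{empty}  % suppress the page number
\begin{document}
$\sqrt{x^2+y^2}$
\end{document}
\end{verbatim}
Running \LaTeX{} on this, and then \texttt{dvips -E -o frag.eps frag.dvi},
should give an \acro{EPS} file whose bounding box encloses just the formula.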

A useful trick to remember if you think that \TeX{} knows what you
want, but \dvips{} does not, is to make judicious use of
color. If you wanted to use PSTricks to encircle a mathematical
symbol, you might write:
\begin{verbatim}
absurd \pscirclebox{$\surd$}
\end{verbatim}
\TeX{} leaves the right space, since the PSTricks macros understand
what is going on, but \dvips{} is told to draw the circle in raw
PostScript, and the bounding box calculation ignores that. The result
is that the limits are set just around the size of the letters. If we wrote:
\begin{verbatim}
\framebox{absurd \pscirclebox{$\surd$}}
\end{verbatim}
it would work correctly, because \dvips{} would look at the enclosing
frame, not just the words. But you end up with an unwanted box; so
make it (in effect) invisible by writing:
\begin{verbatim}
{\color{white}\setlength{\fboxsep}{0pt}%
 \framebox{%
  {\color{black}absurd
  \pscirclebox{$\surd$}}%
  }%
}
\end{verbatim}
This creates a white frame around black text; \LaTeX{} proceeds
happily, and so does \dvips, calculating the right extents, but
nothing shows on paper. Obviously, this only works against a white
background, as in a monochrome environment.

\section{\LaTeX\ to \acro{EPS} to \acro{GIF} to Web}
Why do we do all this in practice? Often, these days, because people
want their \LaTeX{} mathematical output on the World Wide Web, and
their only recourse is to embed \acro{GIF} images in their \acro{HTML}. The
sophisticated \emph{latex2html} program does all this for you; its
technique is worth understanding, as it has general utility; the
sequence of events is:
\begin{enumerate}
\item Place bits of \LaTeX\ in a special file, one fragment per page,
and with no page numbers;
\item Run \LaTeX\ to generate a multi-page \texttt{dvi} file;
\item Use \dvips' \texttt{-i} and \texttt{-S} options to generate one
self-contained output file  per page;
\item Give each page to Ghostscript, and ask it to render them in \acro{PBM}
(Portable Bitmap) form;
\item Use the \acro{PBM}plus/Netpbm utility \emph{pnmcrop} to trim away white
space;
\item Use the \emph{ppmtogif} utility to convert the result to a \acro{GIF} image.
\end{enumerate}
Note that it does \emph{not} use the \texttt{-E} option for \dvips{},
but relies on  simply removing all white pixels until just text is
left. This has the advantage that it avoids the problem we saw in the
last section, but it has three disadvantages:
\begin{enumerate}
\item The \acro{PBM} utilities are primarily Unix tools, and many people do not
have access to them;
\item The cropping process is memory-intensive, slow and eats
temporary disk space;
\item The cropping forces everything to the baseline, effectively. A
character like the em-dash (---), which sits above the baseline, will be
cropped above and below, so that the placed \acro{GIF} looks wrong.
\end{enumerate}
The core of the problem is the use of Ghostscript, which always
creates a page-sized bitmap, even if there is only one word on the
page. What we want is for
Ghostscript to render just the
portion of the image inside the bounding box, if we 
\emph{do} use the \texttt{-E} flag for \dvips. We can achieve this by
giving Ghostscript a customized page size, which is the size of the
bounding box.  
Then we can insert some extra
PostScript code to move the image so that it starts at the 0,0
coordinate (adjusting the bounding box accordingly). 
Ghostscript then displays or
converts the image just within the desired area, and no cropping is
needed. 

The transformations of the bounding box can be achieved using
\emph{epsffit}, which is part of Angus Duggan's 
\texttt{psutils} collection
(\acro{CTAN}:\texttt{support/psutils}); the page size change is most easily
done using a Level~2 PostScript operator \texttt{setpagedevice}. 
Thus a PostScript file which starts:
\begin{verbatim}
%!PS-Adobe-2.0 EPSF-2.0
%%BoundingBox: 135 528 284 668
...
\end{verbatim}
needs to be transformed to something like:
\begin{verbatim}
%!PS-Adobe-2.0 EPSF-2.0
%%BoundingBox: 0 0 149 140
<< /PageSize [149 140] >> setpagedevice
gsave -135 -528 translate
...
grestore
\end{verbatim}
Here we have worked out the width and height of the enclosing rectangle (149
$\times$ 140 units), moved the origin down to 0,0 on the page, and set
the page size. PostScript purists will shudder at the
\texttt{setpagedevice} command, and point out that this is probably
illegal in Encapsulated PostScript, but as long as we only use this
file strictly in the controlled environment of Ghostscript, we are
safe enough. Figure \ref{fitps} lists a
simple Perl script which performs the necessary changes to a
PostScript file for Ghostscript to eat, without any need for
\emph{epsffit}.\footnote{I am aware that it does not cope with an
\texttt{(atend)} bounding box\ldots}
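
To spell out the arithmetic, write the original bounding box as
$(x_0, y_0, x_1, y_1)$ (symbols introduced here purely for
illustration); for the example above,
\begin{eqnarray*}
\mbox{width}  &=& x_1 - x_0 = 284 - 135 = 149,\\
\mbox{height} &=& y_1 - y_0 = 668 - 528 = 140,
\end{eqnarray*}
and the translation is by $(-x_0,-y_0) = (-135,-528)$.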
\begin{figure*}
\begin{verbatim}
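#!/usr/local/bin/perl
# Rewrite the first %%BoundingBox so that it starts at 0,0, set the
# page size to match, and shift the image; later ones are dropped.
$bbneeded=1;
$bbpatt="[0-9.-]";
while (<>) { 
 if ( /%%BoundingBox:\s*($bbpatt+)\s+($bbpatt+)\s+($bbpatt+)\s+($bbpatt+)/ ) 
 { 
     if ($bbneeded) {
         # $1 $2 = lower left corner, $3 $4 = upper right corner
         $width = $3 - $1;
         $height = $4 - $2;
         $xoffset = 0 - $1;
         $yoffset = 0 - $2;
         print "%%BoundingBox: 0 0 $width $height\n";
         print "<< /PageSize [$width $height] >> setpagedevice\n";
         print "gsave $xoffset $yoffset translate\n";
         $bbneeded=0;
     }
 }
 else {  print; }   # copy every other line unchanged
}
print "grestore\n";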
\end{verbatim}
\caption{A Perl script to transform an \acro{EPS} file for Ghostscript}
\label{fitps}
\end{figure*}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure*}[t!]
\includegraphics[width=\textwidth]{tips3}
\caption[]{\LaTeX{} $\rightarrow$ \texttt{dvi}
$\rightarrow$ \acro{EPS}
$\rightarrow$ \acro{GIF}}\label{tips3}
\includegraphics[width=\textwidth]{tips4}
\caption[]{\LaTeX{} $\rightarrow$ \texttt{dvi}
$\rightarrow$ \acro{EPS}
$\rightarrow$ \acro{GIF}, anti-aliased}\label{tips4}
\end{figure*}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure*}[t!]
\centerline{\includegraphics{tips5}}
\caption[]{Mid-aligned \acro{GIF} image in Netscape}\label{tips5}
\end{figure*}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


Now that Ghostscript is only rendering the desired area, we can use
its built-in bitmap output facilities. The Unix or \acro{DOS} command line:
\begin{verbatim}
gs -dNOPAUSE  -q -r100 -sDEVICE=tiffg4 \
 -sOutputFile=foo.tif foo.ps -c quit
\end{verbatim}
will generate a \acro{TIFF} fax group 4 image (Ghostscript does not support
\acro{GIF} output directly, for legal reasons) at 100dpi
of just the imaged area of the PostScript file \texttt{foo.ps}
with no further ado. Ghostscript version 4 adds
anti-aliasing facilities; using the Netpbm tools under Unix, we
can create a variant \acro{GIF} image,  using the command line:
\begin{verbatim}
 gs -r100 -dNOPAUSE  -q -sOutputFile=-  \
  -sDEVICE=pnm -dTextAlphaBits=4 \
  -dGraphicsAlphaBits=4 foo.ps -c quit | \
  ppmtogif -interlace \
  -transparent \#ffffff > \
  equation.gif
\end{verbatim}
Figures~\ref{tips3} and \ref{tips4} show the result of the transformations
without and with anti-aliasing, respectively. 

There is one remaining problem --- the World Wide Web browsers can
usually align images top, middle or bottom; but what if we have an
image of some characters with descenders below the base line? Bottom
alignment of the images places the bottom of the descenders on the
baseline; top alignment is ridiculous, and middle alignment is not
quite right either. The answer is to use middle alignment, and make
\TeX\ lie to \dvips{} (and thence down the chain) about the extent of
the character; making its depth equal to its height, and then middle
aligning it in the Web browser, has the desired effect. 
So how do we make \TeX{} lie? Here is my suggestion:
\begin{verbatim}
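\makeatletter  % @ is used in the names below
\newsavebox{\@Fragment}
\def\Fragment#1{%
 \begingroup  % keep color and \fboxsep local
 \savebox{\@Fragment}{#1}%
 % make height and depth both equal to the larger
 \@tempdima\ht\@Fragment
 \@tempdimb\dp\@Fragment
 \ifdim\@tempdima>\@tempdimb
  \dp\@Fragment\@tempdima
 \else
  \ht\@Fragment\@tempdimb
 \fi
 \fboxsep0pt
 \color{white}%
 \fbox{%
 {\color{black}%
  \box\@Fragment}%
 }%
 \endgroup
}
\makeatother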
\end{verbatim}
I use the \LaTeX{} box framing command to ensure that \dvips{} thinks
the depth is there, with the same color trick as we saw earlier.
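
As a usage sketch, assume (hypothetically) that the definition above
has been saved in a file \texttt{fragment.sty}, so that \texttt{@} is
treated as a letter when it is read; the file name and the formulas
are illustrative only. Each fragment then goes on its own page:
\begin{verbatim}
\documentclass{article}
\usepackage{color}     % provides \color
\usepackage{fragment}  % hypothetical file holding \Fragment
\pagestyle{empty}      % no page numbers
\begin{document}
\Fragment{$\eta$}
\newpage
\Fragment{$\int_0^1 f(x)\,dx$}
\end{document}
\end{verbatim}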

Unfortunately, there is a side effect --- an \acro{HTML} browser loading the
resulting \acro{GIF} image mid-aligns the image and sticks the `ballast'
white space into the line below, making an unsightly gap (see 
Figure~\ref{tips5}, where the Greek \ensuremath{\eta}s have a small descender). 
With the current browser technology, there is little that can be done
about this. In practice, we will have to check first whether there
\emph{is} any descender; if so, we use the mid-align technique, and
accept the gap; if there is not, we can make a simpler process and use
bottom alignment.
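
A minimal sketch of such a check follows; the names \verb|\@Probe| and
\verb|\ReportAlign| are hypothetical, and writing the decision to the
log with \verb|\typeout| is just one assumed way of passing it on to
whatever generates the \acro{HTML}:
\begin{verbatim}
\makeatletter
\newsavebox{\@Probe}
\newcommand{\ReportAlign}[1]{%
 \sbox{\@Probe}{#1}% measure the fragment
 \ifdim\dp\@Probe>\z@
  \typeout{Page \thepage: descender, align middle}%
 \else
  \typeout{Page \thepage: no descender, align bottom}%
 \fi
 \usebox{\@Probe}}
\makeatother
\end{verbatim}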

It is imperative, of course, that Web-making readers do not take these
examples as `recipes', without both a precise specification of the
desired Web page and an understanding of some of the basic
image-processing techniques. The aim here has simply been to show how
relatively trivial and efficient it is to create bitmap output from
\LaTeX{} and \dvips{} using the free facilities of Ghostscript.
\end{Article}