\title{\LaTeX{} and tables}
\author[R. Allan Reese]{R. Allan Reese\\ Computer Centre\\Hull University\\ \texttt{r.a.reese@ucc.hull.ac.uk}}

\begin{Article}

\section{Tables}

Kroonenberg's article on tables (\cite{Kroo}) made some useful suggestions on \LaTeX{} coding but the examples left much to be desired as patterns to follow. I fully agree that the \LaTeX{} User Guide (\cite{lamport}), \cite{kopka} and virtually every book on word processing place far too much stress on rules as boxes. \cite{chapman} gives better guidance on the presentation of tables; I particularly enjoy commending this book, as it challenges one of the Great Lies of Life.\footnote{I'm from the Government and I'm here to help you.} Ehrenberg's short article (\cite{ehrena}) should be studied by anyone putting figures into a table. \cite{reynolds} is a more general book and discusses presentations of all types.

Chapman gives clear, straightforward guidance with many examples of good and bad practice. Here, for example, is a point almost always overlooked by people who believe `a picture is worth a thousand words':
\begin{quote}
``Since neither charts nor tables ever `speak for themselves', in order to communicate a message effectively either must be accompanied by a verbal summary.''
\end{quote}

Tabular material tends to be complex. For a recent workshop on table construction in \LaTeX{}, I selected what seemed to me the simplest tables from journals to hand. The criteria for simplicity were small size and not appearing to need a wide variety of \LaTeX{} commands. In serious use, you would expect to use more commands than when teaching, so this restriction would be relaxed. More complex tables were often `tables within tables', for which \LaTeX's analytical approach of defining logical units is well suited. However, another of Chapman's truths that is easily swept aside by an author's enthusiasm is:
\begin{quote}
``Tables should be small.
It is better to include three or four compact tables, each illustrating one or two points succinctly, rather than construct a single large table which is then referred to in text covering a large number of paragraphs or pages.''
\end{quote}
Her next sentence will strike a chord with any \LaTeX{} user:
\begin{quote}
``Small tables are easier to position close to their verbal summary, and are easier to include in the main report.''
\end{quote}

Several problems arose in recreating the examples using \LaTeX. Some were due to the desire to reproduce the table exactly as seen, rather than re-present the information in a natural way. On the other hand, the exercise did bring out the flexibility of \LaTeX's standard tabulation tools.

The short workshop was successful, in that most of my (Computer Centre) colleagues were able to add further lines into a part-built table despite having no previous experience of \LaTeX. They followed the layout by copying the commands that they saw; this contrasts greatly with WYSIWYG word processors where you may see a table but have no idea of how it was constructed.

\LaTeX{} is an excellent program for formatting tables. It is not, however, realistic to expect tables to be laid out optimally first time and automatically. This is one area where the author must be prepared to make judgements and manual adjustments.

\section{Rules of thumb}

Rules for constructing tables can be divided into those dealing with the content and those dealing with the layout. The typesetter will generally have little say in the content or ordering of the information, but an editor might (should) make suggestions. The evidence of most journals suggests their editors are not as critical of tables or graphics as of text.

Chapman distinguishes between demonstration tables and reference tables. The reader uses the former to perceive a pattern and the latter to look up a value. Tables that try to do both are rarely successful.
Ehrenberg's examples would come under the demonstration category, for which he suggests:
\begin{enumerate}
\item give marginal averages to provide a visual focus
\item order the rows or columns by the marginal averages or some relevant measure of size
\item put figures to be compared in columns rather than in rows (i.e., to aid mental arithmetic)
\item round all numbers to two effective digits, unless the exact value is for reference
\end{enumerate}
See \cite{ehrena} and \cite{ehrenb} for further explanation and discussion.

Layout can make a table easier to read\Dash or can destroy its meaning. Sweep away the black boxes and apply two principles. A table is a discontinuity in reading. The reader is not to scan it linearly, as if reading text. You must therefore guide their eyes:
\begin{enumerate}
\item use white space to separate objects
\item use lines (rules) to join or point connections.
\end{enumerate}

As when setting text, beginners tend to add too much space, though Chapman complains that even professional compositors like to widen tables to fill the text width. Physically compact tables are easier to scan. \cite{visdis} makes this point about graphics; make them smaller so the reader can see the pattern not the dots. This applies also to tables.

White space is used to set off the table from the body text, usually by centering. Spaces within the table are used to indent hierarchical headings and to break the table into sections. In another article (\cite{reesettn}) I discuss \LaTeX{} constructs for interposing space after a set number of lines or when the initial letter changes.

Horizontal rules are standard at top and bottom to further demarcate the table. There is usually a thick rule under the table heading, a thin rule under the banner (column) headings and a thin rule between the table body and any explanatory notes. Incidentally, one of Chapman's warning examples shows how disastrous it is to rely on the reader reading the footnotes to understand the table.
They must be strictly to expand or qualify a detail. Kroonenberg mentions the problem of text coming too close to rules. This can be adjusted with struts or non-aligned \verb|\vspace|; I like to insert a small extra space above the first row and after the last, to unify the body of the table and distinguish it from the headings.

Vertical rules are used sparingly, if at all. It is better to put a little space between stubs (row labels) and the `data' columns, and between the columns and the marginal `averages', than to add rules crossing the direction of scanning. You can help the reader scan across gaps by adding leaders, or by centering rather than justifying short items.

Use a smaller font inside tables. Kroonenberg implies this by discussing sans serif fonts ``to set off the table.'' It is another way to make the table more compact and visually distinguished.

Telephone directories are large reference tables which demonstrate these principles. They are set in the smallest readable font size, have leaders to bridge the gap between name and number, and are usually multi-column with white space separators.

\section{Implementation}

The layout principles are easily implemented in standard \LaTeX. The table itself uses the \verb|tabular| environment. The number of columns will often be greater than at first apparent, with many multicolumn items. For example, this is a natural way to handle hierarchical stubs, where the primary labels span two columns and the secondary labels start in column two (indented). \LaTeX{} will calculate column widths but it is often desirable to force several columns to have the same width. Headings will often require either \verb|\multicolumn| or \verb|\noalign| to position them aesthetically.

\LaTeX{} 2.09 had only \verb|\hline| and \verb|\cline|. The \verb|hhline| package is worth fetching from CTAN\@.

Line spacing for the whole table can be adjusted with \verb|\arraystretch| and made different from \verb|\baselinestretch|.
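As a minimal sketch of these points (the row labels and figures are invented purely for illustration and come from none of the tables discussed here), the fragment below uses an empty first column and \verb|\multicolumn| so that primary stubs span two columns while secondary stubs start, indented, in column two. The change to \verb|\arraystretch| is made inside a group so that it affects this table only; the value shown is illustrative.

\begin{verbatim}
{% group keeps the change local
\renewcommand{\arraystretch}{0.9}% illustrative
\begin{tabular}{@{}l@{\quad}lrr@{}}
\hline
\multicolumn{2}{@{}l}{Region} & 1990 & 1991 \\
\hline
\multicolumn{2}{@{}l}{North}  &      &      \\
  & urban                     &   12 &   15 \\
  & rural                     &    7 &    9 \\
\multicolumn{2}{@{}l}{South}  &      &      \\
  & urban                     &   21 &   24 \\
  & rural                     &   11 &   10 \\
\hline
\end{tabular}
}
\end{verbatim}

The indent of the secondary stubs is set once, by the \verb|@{\quad}| in the column specification, so it stays consistent from row to row.
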
Bear in mind that the table can be compressed as well as extended, and \verb|\arraystretch|$=0.9$ may make the pattern more obvious and the table \emph{more} readable.

The table is then embedded in an environment to set it off from the text. You can choose from \verb|quote|, \verb|center|, display maths or \verb|table|. Set the font size smaller within that environment, sans serif for the table and body text style for the captions. (The examples in this article follow their originals as closely as possible and don't do this. I think they would be improved if they did.)

The table environment makes the object into a float, hence not to be broken between pages and with a cross-referencing label. The \verb|longtable| package caters for tables that are too large for a single page.

Some mechanism should be used to create a left and right indent, the most obvious being to define the table width explicitly and center it. \verb|minipage| puts the footnotes to the table at the bottom of the table, using different marks from those in the body text. Putting the \verb|minipage| round the \verb|table| rather than just the \verb|tabular| also makes the caption narrower. I prefer the table reference to stand out, and often use the \verb|hangcapt| package.

When there are several tables or a house style, it is better to define a new environment to ensure consistency in their presentation. I also commonly define new length constants for use in tables rather than copy the values.

\section{A \LaTeX{} gap}

One common format for tables has rows of short data values but a final column containing texts. The description parameter may therefore tell \LaTeX{} to calculate the widths of data columns from the values, but the final column will be a \verb|parbox| and should sensibly use the remainder of the \verb|linewidth|. This is one need that \LaTeX{} (2.09) blatantly fails to meet. You have to set the width of a \verb|parbox| or \texttt{p}(aragraph) column.
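To make the requirement concrete, here is a minimal sketch in which the width of the final text column is held in a named length. The name \verb|\notewidth|, the value 5cm and the table contents are all assumptions made up for this illustration; nothing here computes the width automatically.

\begin{verbatim}
\newlength{\notewidth}      % hypothetical name
\setlength{\notewidth}{5cm} % assumed; set by hand
\begin{tabular}{@{}lrp{\notewidth}@{}}
\hline
Item & Value & Comment \\
\hline
A & 12 & A longer remark that \LaTeX{}
         breaks over several lines. \\
B &  7 & Another remark. \\
\hline
\end{tabular}
\end{verbatim}

Because the width appears only in the \verb|\setlength| line, adjusting it after a trial run means changing one value rather than every row.
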
Kroonenberg implies this problem when discussing \verb|\raggedright|; as she points out, to choose unjustified text you have to enclose each text in a \verb|parbox|. This is less work if the width is set as a name (to be calculated) and not the numeric length.

My pragmatic solution is to set the value for the final column initially to the \verb|\linewidth|, run \LaTeX{}, note (from the log) the overfill, and subtract that from \verb|\linewidth|.

\section{Kroonenberg's `after' table}

The `after' table of Economic Forecasts is still poor. This is partly due to the content. It lacks a text explanation (possibly the primary source had one), but I \emph{guess} that the intention is to compare forecasts from two sources. Numbers going \emph{down} the columns are not related (in this sense), so flipping (transposing) the table would make logical sense. This might be the best solution on a wider page, as having the table span two columns is yet another way of distinguishing it from the body.

If we insist on keeping it within one (text) column, there is room for another column of values as a `margin'. Without a rubric, I cannot decide if the table is trying to show disagreement between the two sets of forecasts, or similarity. Dependent on the message, the best margin might be the difference between each pair of forecasts, a $+$/$-$ sign for higher or lower, the ratio, or the average. The forecasts themselves should all be rounded to two significant digits.

We seem to have lost between the `before' and `after' the detail that the figures are percentages; and \emph{was} government income really a `percentage change of a percentage'? The labels that include ($\times$ 1000 persons) are misleading; I take that notation to mean that the numbers shown \emph{have been} multiplied by 1000 (like \TeX{} magnifications). The correct notation would be (000s of persons).

Making the rules extend to the linewidth was presumably not thought out.
They unbalance the design, and all the white space implies an omission. The spaces between the columns are too wide. If ``mutations w.r.t.~1991'' and ``absolute quantities'' are to be used as primary divisions, then the secondary stubs (``real consumption'' etc.) should be indented.

If you think too deeply about the title, are these forecast changes or changes to (earlier) forecasts? Follow Chapman's advice and split the table into two, with headings of the form ``Forecast percentage changes 1991/92 in National Economy'' and ``Forecast quantities\ldots''. I leave this redrafting as an exercise for the reader.

A parenthetic remark for those preparing tables of accounts comes from \cite{townsend}. He suggests that ``statements comparing budgets to actual should be written not in the usual terms of higher (lower) but in plain English of better (or worse) than predicted by the budget. This eliminates the mental gear changes between income items (where parentheses are bad) and expense items (where parentheses are good).'' Typography helping the reader.

\section{New examples}

The following examples are shown as output, in the expectation that the interested reader will obtain the `input' from the author or the editor. If there is sufficient interest, the \LaTeX\ source for the tables can be placed on the CTAN archives.

\subsection{Example 1\Dash Cohabitation}

This is taken from \textit{Key Data 88} (\cite{key}) which had itself extracted it from \textit{Social Trends}. \textit{Key Data} is a sample of UK government statistics published annually as an educational resource and a guide to the more extensive sources. The table as printed had inconsistencies in the use of italic and upright fonts, and in its indentation.

\begin{table*}
\begin{center}
\caption{Cohabitation} \label{cohab}
% The table in the book is sans-serif with the heading bold.
{\sffamily
\setlength{\doublerulesep}{0pt}
\begin{tabular}{llrrrrr}
\multicolumn{7}{p{.5\textwidth}}%
{\begin{tabbing}
\bf 2.12\quad \= Percentage of women aged 18--49 cohabiting:\\
\> by age
\end{tabbing}}\\
\multicolumn{2}{c}{\textit{Great Britain}}&
\multicolumn{5}{r}{Percentage and numbers}\\
\hline \hline
 & & 1979 & 1981 & 1983 & 1984 & 1985 \\
\hline
\multicolumn{2}{l}{Age group \textit{(percentages)}} \\
\hspace*{1em} & \textit{18--24} & 4.5 & 5.6 & 5.2 & 7.3 & 9.1 \\
 & \textit{25--49} & 2.2 & 2.6 & 3.2 & 3.3 & 3.9 \\
\multicolumn{2}{c}{\textit{All aged 18--49}} & 2.7 & 3.3 & 3.6 & 4.2 & 5.0 \\
\\
\multicolumn{2}{l}{Women in sample \textit{(numbers)}} \\
 & (=100\% above)\\
 & 18--24 & 1,353 & 1,517 & 1,191 & 1,174 & 1,182 \\
 & 25--49 & 4,651 & 5,007 & 4,094 & 4,070 & 4,182 \\
\cline{3-7}
\multicolumn{2}{c}{\textit{All aged 18--49}} & 6,004 & 6,524 & 5,285 & 5,244 & 5,364\\
\hline \hline
\end{tabular}
}
\end{center}
\end{table*}

Table~\ref{cohab} could be further improved by a distinct split and putting the two halves the other way round: ``Numbers of women surveyed in each year, in two age groups'' and ``Of the samples, percentage cohabiting''. Note that the first half is essentially for reference and the second half for demonstration.

\subsection{Example 2\Dash Vulture meat}

Table~\ref{meat} is taken from Ibis (\cite{thibault}), journal of the British Ornithologists' Union. It is a simple table, but appears in a two-column layout and has footnotes. The original used numbers as the footnote markers. I found this disconcerting, as \textit{G${}^{\mathrm{2}}$} looks like a numeric power; so I'll use the \LaTeX{} default in a minipage, which also sets the table to the width of an Ibis column.

This table is adequate for reference, and is discussed in an adjacent text.
``Comparison of food availability among territories showed significant differences.\ \ldots'' The rows are not in an obvious order, and if the primary aim is to compare the number of, say, sheep in each territory, it might have been better flipped. A marginal total of the number of prey animals, or their biomass, in each territory would have been helpful. It is debatable whether the figures should be rounded, since they are taken from previous studies which are themselves recorded or reported with different precisions.

\begin{table}[H]
\begin{center}
\begin{minipage}{81mm}
\caption{\em Numbers of ungulates in Lammergeier territories (except for wild boar, differences among territories are statistically significant, $\chi^2_{16}=16.825, \mbox{P} < 0.001$).}\label{meat}
\vspace{1ex}\ \\
\renewcommand{\footnoterule}{\rule{0pt}{0pt}}
\begin{tabular*}{81mm}{l@{\quad\extracolsep{\fill}}rrrrr}
\hline\vspace{.4ex}\\
 & \multicolumn{5}{c}{Territory}\vspace{.2ex}\\
\cline{2-6}\vspace{.2ex}\\
Species & \textit{B}\footnote{Dubray \& Roux (1990).}
 & \textit{G}\footnote{Anonymous (1989).}
 & \textit{R}\footnote{Direction \dots}
 & \textit{T}${}^\alph{mpfootnote}$ % Gash way of repeating superscript.
 & \textit{V}${}^\alph{mpfootnote}$ \\
\hline\vspace{.2ex}\\
Sheep & 2000 & 1400 & 5240 & 3165 & 3480 \\
Mouflon & $<200$ & $-$ & $-$ & $<400$ & $<5$ \\
Goat & 570 & 4142 & 1880 & 1510 & 820 \\
Cattle & 2300 & 6402 & 7204 & 6774 & 965 \\
Pig & 3400 & 7188 & 1900 & 400 & 522 \\
Boar & $+$\footnote{$+ =$ present. $- =$ absent.} & $+$ & $+$ & $+$ & $+$\\[.5ex]
\vspace{.2ex}\\
\hline
\end{tabular*}
\end{minipage}
\end{center}
\end{table}

\subsection{Example 3\Dash YOPs}

The problem with this table from \cite[p11 and, slightly changed, p35]{chapman} is that one column has an entry spreading over several lines (the brace linking them). The hint for doing this is in \cite[p79]{kopka}, but they show only one optional argument to \verb|\raisebox|.
Using both optional arguments (setting height and depth to zero) and putting the `brace array' on the line opposite its middle is the easiest way to centralize it vertically. A convenient feature is that you can reset the \verb|\arraystretch| and the brace is still the right size to cover three rows. This example is also set in a minipage but the footnote markers have been reset as numeric. Follow DEK's advice: \emph{don't use footnotes in text.}

% The table in the book is column-width, sans-serif with the heading bold.
\begin{center}
\begin{minipage}{85mm}
\small \sffamily % choose font
\renewcommand{\footnoterule}{\rule{0pt}{0pt} \vspace{0pt}}
\renewcommand{\thempfootnote}{\arabic{mpfootnote}}
\renewcommand{\arraystretch}{1.1}
\begin{tabular*}{85mm}{@{}l@{\extracolsep{\fill}}r@{\extracolsep{2em}}r@{}}
\noalign{\bf % bold font for whole heading
\begin{tabbing}
Table 3\quad \= Entrants to Youth Opportunities\\
\> Programme in Wales: by type of\\
\> scheme
\end{tabbing}}\\
\multicolumn{3}{l}{\textit{Wales 1978 to 1980\hfill Percentages}}\\
\hline
 & 1978/79 & 1979/80\\
\cline{2-2} \cline{3-3}
\noalign{\vspace{3pt}}
WEEP\footnote{Work Experience on Employers' Premises} & 89 & 73\\
Short Training Course & 10 & 10\\
Community Service & & 9\\
Project based work experience & & 7\\
Training Workshops &\raisebox{0pt}[0pt][0pt]{% Centre brace
  \(\begin{array}{lr}
  \left. % null delimiter to match brace
  \begin{array}{@{\extracolsep{0pt}}l}\strut \\ \strut \\ \strut \end{array}
  \right\} % the big brace
  \end{array}\)}
 1 & 1\\
Induction and other\footnote{Employment induction courses and other remedial and preparatory courses} & & 1\\
\noalign{\vspace{3pt}}
\hline
\noalign{\vspace{3pt}}
Total (100\%) & 15,000 & 22,000\\
\noalign{\vspace{3pt}}
\hline
\end{tabular*}
\end{minipage}%end of \sf default
\end{center}

The table has a large gap between columns 1 and 2, making it difficult to see at a glance whether the brace includes `Community Service'.
The original rubric (\cite{msc}) was:
\begin{quote}
``One of the major aims in 1979/80 was to increase the range of provision available to meet the varying needs of unemployed young people. In the early days of the Programme, there was heavy reliance on the Work Experience on Employers' Premises (WEEP) element, but the table reflects the increasing provision that has now been made in the other elements of YOP.''
\end{quote}

As presented, the first impression is that the number of schemes apparently grew, but each scheme attracted only a small percentage. `1' as a rounded value is very uninformative, especially as the compared figures are about 10 and 80\%. There is also a wide gap between the caption `(100\%)' and the figures it refers to. A quick sum shows that the numbers entering WEEP went \textit{up} by 20\% between the two periods. It's a classic political table; you can fiddle it to either praise or condemn.

\begin{thebibliography}{99}
\frenchspacing
\bibitem[Chapman 1986]{chapman}{Chapman M. \& B. Mahon (1986) \textit{Plain Figures}. HMSO}
\bibitem[Ehrenberg 1981]{ehrena}{Ehrenberg A.S.C. (1981) \textit{The Problem of Numeracy}. The American Statistician, Vol 35, No 2}
\bibitem[Ehrenberg 1978]{ehrenb}{Ehrenberg A.S.C. (1978) \textit{Data Reduction: Analysing and Interpreting Statistical Data}. John Wiley}
\bibitem[CSO 1988]{key}{Central Statistical Office (1988) \textit{Key Data 88}. HMSO London}
\bibitem[Kopka \& Daly 1993]{kopka}{Kopka H. \& P.W. Daly (1993) \textit{A Guide to \LaTeX}. Addison-Wesley}
\bibitem[Kroonenberg 1994]{Kroo}{Kroonenberg S. (1994) \textit{Table Design}. Baskerville Vol.~4 No.~4 (reprint of article from NTG journal)}
\bibitem[Lamport 1986]{lamport}{Lamport L. (1986,1994) \textit{\LaTeX: User's Guide}. Addison-Wesley}
\bibitem[MSC 1979/80]{msc}{Manpower Services Commission \textit{Annual Report 1979/80}, para~8.34 and Table~35. HMSO}
\bibitem[Reese forthcoming]{reesettn}{Reese R.A. \textit{Dividing a Table Alphabetically}, submitted to TTN}
\bibitem[Reynolds 1983]{reynolds}{Reynolds, L. (1983) \textit{Presentation of Data in Science}. Nijhoff, The Hague}
\bibitem[Thibault et al. 1993]{thibault}{Thibault J.-C., Vigne J.-D. \& J. Torre (1993) \textit{The Diet of young Lammergeiers in Corsica}. Ibis Vol.~135 No.~1}
\bibitem[Townsend 1970]{townsend}{Townsend R. (1970) \textit{Up the Organization}. Michael Joseph, London (Coronet edition 1971)}
\bibitem[Tufte 1983]{visdis}{Tufte, E. (1983) \textit{The Visual Display of Quantitative Information}. Graphics Press, Conn}
\end{thebibliography}
\end{Article}