General guide for creating publication quality tables using R, Sweave, and LaTeX
There is a range of tools available for creating publication quality tables using R, Sweave, and LaTeX.
In particular, there are helper functions like latex() in the Hmisc package, and xtable() in the xtable package. I've also often written my own code so that I could have complete control over table formatting (e.g., see this example).
However, when preparing publication quality tables a range of issues often arise:
- how and when to apply numeric formatting
- how to precisely control alignment of columns and cells
- how to precisely control cell borders
- how to convert variable labels to variable names
- and so on
Beyond the high level issues of specifying the desired table format, there are issues of implementation.
- When should a helper function such as xtable be used?
- Which helper function should be used in a given situation?
- How can the default output of helper functions be customised to particular requirements?
Question
It seems to me that the above issues are deserving of a detailed textbook-style introduction.
Are there any online or offline resources that provide a detailed overview of how to produce publication quality tables using R, Sweave, and LaTeX, and that address the issues discussed above?
Solution 1:
Just to tie this up with a nice little bow: at the time of writing, the best existing tutorials on publication-quality tables and usage scenarios appear to be an amalgamation of these documents:
- A Sweave example (source)
- The Joy of Sweave: A Beginner's Guide to Reproducible Research with Sweave (source)
- Latex and R via Sweave: An example document how to use Sweave (source)
- Sweave = R · LaTeX2 (source)
- The xtable gallery (source)
- The Sweave Homepage
- LaTeX documentation
Going beyond the scope of what currently exists, you may want to ask the author of The Joy of Sweave for a document on publication-quality tables specifically. It seems like he's gone above and beyond this problem in his research. In addition to the questions you've raised, this space specifically could use a style guide that, flatly, does not currently exist.
And, as mentioned in the question errata, this is a perfect example of a question for https://tex.stackexchange.com/. I encourage you to continue to ask specific questions there when you run into any difficulties in your current projects.
Solution 2:
The stargazer package can create publication-quality tables, including ones based on templates designed to resemble existing academic journals, from commonly used R statistical functions and packages (lm, glm, plm, svyglm, survival, pscl, AER, and others). It is also good for creating summary statistics tables, and it can directly output data frame content as well.
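As a rough illustration of the three uses mentioned above, here is a minimal sketch. It assumes stargazer is installed; the mtcars models, the table title, and the choice of the "aer" template are only placeholders picked for the example.

```r
# Sketch only: assumes the stargazer package is installed.
library(stargazer)

# Two nested linear models on a built-in dataset (illustrative choice).
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)

# Regression table as LaTeX, using the template that imitates the
# American Economic Review; header = FALSE drops the leading comment lines.
reg_tex <- capture.output(
  stargazer(fit1, fit2, type = "latex", style = "aer",
            title = "Fuel economy models", header = FALSE)
)

# Summary statistics table (the default behaviour for data frame input).
sum_tex <- capture.output(
  stargazer(mtcars[, c("mpg", "wt", "hp")], type = "latex", header = FALSE)
)

# Data frame content written out as-is instead of being summarised.
raw_tex <- capture.output(
  stargazer(head(mtcars[, c("mpg", "wt", "hp")]), type = "latex",
            summary = FALSE, header = FALSE)
)

# Show the regression table source.
cat(reg_tex, sep = "\n")
```

In a Sweave document you would normally call stargazer() directly inside a chunk with results=tex rather than capturing the output as done here.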
Solution 3:
There is a tabular function in the tables package which addresses formatting, alignment and label operations. The package has a vignette which is a good starting point.
Solution 4:
xtable has worked fine for me so far.
In combination with siunitx, and when necessary, longtable, it can produce pretty effective tables, in my opinion. With packages like booktabs and caption, the aesthetics can be pleasing too.
I am not sure this level of detail was asked for by the OP, but for what it's worth, the basic implementation could be something along these lines: https://tex.stackexchange.com/questions/41067/caption-for-longtable-in-sweave/41183#41183 (my own answer to another question).
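For readers who just want something to start from, here is a minimal sketch of an xtable call combined with booktabs and longtable. It is not the code from the linked answer; the data, caption, and label are placeholders, and siunitx column types are left out to keep it short. The LaTeX preamble is assumed to load booktabs and longtable.

```r
# Sketch only: assumes the xtable package is installed and that the
# LaTeX preamble contains \usepackage{booktabs} and \usepackage{longtable}.
library(xtable)

# Small illustrative data frame built from a built-in dataset.
dat <- data.frame(
  model = rownames(mtcars)[1:5],
  mpg   = mtcars$mpg[1:5],
  wt    = mtcars$wt[1:5],
  stringsAsFactors = FALSE
)

# align and digits each need one extra leading entry for the row names
# column, even though row names are suppressed when printing.
xt <- xtable(dat,
             caption = "Fuel economy and weight",  # placeholder caption
             label   = "tab:cars",                 # placeholder label
             align   = c("l", "l", "r", "r"),
             digits  = c(0, 0, 1, 3))

# booktabs = TRUE swaps \hline for \toprule, \midrule and \bottomrule;
# longtable must not be wrapped in a float, hence floating = FALSE.
tex <- capture.output(
  print(xt,
        include.rownames    = FALSE,
        booktabs            = TRUE,
        tabular.environment = "longtable",
        floating            = FALSE,
        caption.placement   = "top")
)

cat(tex, sep = "\n")
```

Inside a Sweave chunk with results=tex, the print() call alone is enough; capture.output() is only used here so the generated LaTeX can be inspected.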