General guide for creating publication quality tables using R, Sweave, and LaTeX

There is a range of tools available for creating publication quality tables using R, Sweave, and LaTeX. In particular, there are helper functions such as latex in the Hmisc package and xtable in the xtable package. I've also often written my own code so that I could have complete control over table formatting (e.g., see this example).

However, when preparing publication quality tables a range of issues often arise:

  • how and when to apply numeric formatting
  • how to precisely control alignment of columns and cells
  • how to precisely control cell borders
  • how to convert variable labels to variable names
  • and so on

Beyond the high level issues of specifying the desired table format, there are issues of implementation.

  • When should a helper function such as xtable be used?
  • Which helper function should be used in a given situation?
  • How can the default output of helper functions be customised to particular requirements?

Question

It seems to me that the above issues are deserving of a detailed textbook-style introduction.

Are there any online or offline resources that provide a detailed overview of how to produce publication quality tables using R, Sweave, and LaTeX, and that address the issues discussed above?


Solution 1:

Just to tie this up with a nice little bow: at the time of writing, the best existing tutorials on publication-quality tables and usage scenarios appear to be an amalgamation of these documents:

  • A Sweave example (source)
  • The Joy of Sweave: A Beginner's Guide to Reproducible Research with Sweave (source)
  • Latex and R via Sweave: An example document how to use Sweave (source)
  • Sweave = R · LaTeX² (source)
  • The xtable gallery (source)
  • The Sweave Homepage
  • LaTeX documentation

Going beyond what currently exists, you may want to ask the author of The Joy of Sweave for a document specifically on publication-quality tables; it seems he has gone well beyond this problem in his own research. In addition to the questions you've raised, this area could really use a style guide, which, flatly, does not currently exist.

And, as mentioned in the question errata, this is a perfect example of a question for https://tex.stackexchange.com/. I encourage you to continue to ask specific questions there when you run into any difficulties in your current projects.

Solution 2:

The stargazer package can create publication-quality tables, including ones based on templates designed to resemble existing academic journals, from commonly used R statistical functions and packages (lm, glm, plm, svyglm, survival, pscl, AER, and others). It is also good for creating summary statistics tables, and it can directly output data frame content as well.
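
To give a rough idea of what this looks like in practice, here is a minimal sketch. The model, the built-in mtcars data, the titles, and the label are my own illustrative assumptions; the "aer" style is one of the journal templates that stargazer ships with.

```r
library(stargazer)

# Illustrative model on a built-in data set (an assumption for this sketch)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Regression table as LaTeX, using the template that resembles the
# American Economic Review; capture.output() keeps the printed lines
reg_tex <- capture.output(
  stargazer(fit, type = "latex", style = "aer",
            title = "Fuel economy regression", label = "tab:mpg")
)

# Summary statistics table for a data frame
sum_tex <- capture.output(
  stargazer(mtcars[, c("mpg", "wt", "hp")], type = "latex",
            title = "Summary statistics")
)

# Data frame content printed as is (no summarising)
raw_tex <- capture.output(
  stargazer(head(mtcars[, c("mpg", "wt", "hp")]), type = "latex",
            summary = FALSE)
)

cat(reg_tex, sep = "\n")
```

Inside a Sweave document you would normally call stargazer directly in a chunk with results=tex rather than capturing the output as done here.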

Solution 3:

There is a tabular function in the tables package which addresses formatting, alignment and label operations. The package has a vignette which is a good starting point.
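
As a hedged illustration, the sketch below follows the kind of formula used in the package vignette, applied to the built-in iris data. I am assuming a reasonably recent version of tables, where toLatex() produces the LaTeX code; older versions relied on latex() from Hmisc instead.

```r
library(tables)

# Rows: one per species plus an overall row ("+ 1").
# Columns: a count, then mean and sd of two variables,
# with the numeric cells formatted to 2 significant digits.
tab <- tabular(
  (Species + 1) ~ (n = 1) +
    Format(digits = 2) * (Sepal.Length + Sepal.Width) * (mean + sd),
  data = iris
)

tab           # plain-text preview in the console
toLatex(tab)  # LaTeX code for inclusion in a Sweave document
```

The formula interface is what handles the labelling and formatting: the left-hand side defines the rows, the right-hand side defines the columns, and Format() applies to everything it is multiplied with.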

Solution 4:

xtable has worked fine for me so far. In combination with siunitx, and when necessary, longtable, it can produce pretty effective tables, in my opinion. With packages like booktabs and caption, the aesthetics can be pleasing too.

I am not sure this level of detail was asked for by the OP, but for what it's worth, the basic implementation could be something along these lines: https://tex.stackexchange.com/questions/41067/caption-for-longtable-in-sweave/41183#41183 (my own answer to another question).
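
For a self-contained illustration of the xtable part only, something like the following should work. The data frame, caption, and label are made up for this sketch, and the LaTeX preamble is assumed to load booktabs (and longtable for the second call).

```r
library(xtable)

# Made-up data for illustration
df <- data.frame(
  group = c("A", "B", "C"),
  mean  = c(1.2345, 10.5, 100.25),
  sd    = c(0.1234, 1.5, 12.75)
)

# align and digits each need one extra leading entry for the row names
xt <- xtable(df,
             caption = "Descriptive statistics by group",
             label   = "tab:desc",
             align   = c("l", "l", "r", "r"),
             digits  = c(0, 0, 2, 2))

# booktabs = TRUE swaps \hline for \toprule, \midrule and \bottomrule;
# print.results = FALSE returns the LaTeX as a string instead of printing
tex <- print(xt, booktabs = TRUE, include.rownames = FALSE,
             caption.placement = "top", print.results = FALSE)

# Same table as a longtable, for tables that span several pages
long_tex <- print(xt, booktabs = TRUE, include.rownames = FALSE,
                  tabular.environment = "longtable", floating = FALSE,
                  caption.placement = "top", print.results = FALSE)

cat(tex)
```

In a Sweave chunk with results=tex you would drop print.results = FALSE and let print() write the table straight into the document.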