Using loops with knitr to produce multiple pdf reports... need a little help to get me over the hump

Solution 1:

You don't need to re-define the data in the .Rnw file and I think the warning is coming from the fact that you are putting the output name together with Hospital (the full vector of hospitals) rather than hosp (the loop index).

Following your example, testingloops.Rnw would be

\documentclass[10pt]{article}
\usepackage[margin=1.15 in]{geometry}
<<loaddata, echo=FALSE, message=FALSE>>=
subgroup <- df[ df$Hospital == hosp,]
@

\begin{document}
<<setup, echo=FALSE >>=
  opts_chunk$set(fig.path = paste("test", hosp , sep=""))
@

Some infomative text about hospital \Sexpr{hosp}

<<plots, echo=FALSE >>=
  for(ward in unique(subgroup$Ward)){
    subgroup2 <- subgroup[subgroup$Ward == ward,]
    #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
    savename <- paste(hosp, ward)
    plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
  }
@
\end{document}

and the driver R file would be just

##  make my data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)

## knitr loop
library("knitr")
for (hosp in unique(df$Hospital)){
  knit2pdf("testingloops.Rnw", output=paste0('report_', hosp, '.tex'))
}

Solution 2:

Great question! This works for me with the other bits you've supplied in your question. Note that I've replaced your hosp with just x. I've called your Rnw file test.rnw

# input data
Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)

# generate the tex files, one for each hospital in df
library(knitr)
lapply(unique(df$Hospital), function(x) 
       knit("C:\\emacs\\test.rnw", 
            output=paste('report_', x, '.tex', sep="")))

# generate PDFs from the tex files, one for each hospital in df
lapply(unique(df$Hospital), function(x)
       tools::texi2pdf(paste0("C:\\emacs\\", paste0('report_', x, '.tex')), 
                       clean = TRUE, quiet = TRUE))

I've replaced your loops withlapply and anonymous functions, which often seem to be considered more R-ish.

Here you can see where I replaced the hosp with x in the rnw file:

\documentclass[10pt]{article}
\usepackage[margin=1.15 in]{geometry}
<<loaddata, echo=FALSE, message=FALSE>>=
  Hospital <- c(rep("A", 20), rep("B", 20))
Ward <- rep(c(rep("ICU", 10), rep("Medicine", 10)), 2)
Month <- rep(seq(1:10), 4)
Outcomes <- rnorm(40, 20, 5)
df <- data.frame(Hospital, Ward, Month, Outcomes)
subgroup <- df[ df$Hospital == x,]
@

\begin{document}
<<setup, echo=FALSE >>=
  opts_chunk$set(fig.path = paste("test", x , sep=""))
@

Some informative text about hospital \Sexpr{x}

<<plots, echo=FALSE >>=
  for(ward in unique(subgroup$Ward)){
    subgroup2 <- subgroup[subgroup$Ward == ward,]
    #     subgroup2 <- subgroup2[ order(subgroup2$Month),]
    savename <- paste(x, ward)
    plot(subgroup2$Month, subgroup2$Outcomes, type="o", main=paste("Trend plot for", savename))
  }
@
\end{document}

The result is two tex files (report_A.tex, report_B.tex), four PDFs for the figures (A1, A2, B1, B2) and two PDFs for the reports (report_A.pdf, report_B.pdf), each with their figures in them. Is that what you were after?

Solution 3:

In this answer I intend to answer a more general question: "Using loops to produce multiple pdf reports", and not your specific example. This is because this trend was quite hard to follow as a noob. I managed to get it working eventualy (html version), so this is my humble solution. There are probably some better ones published here, I just can't fully understand them yet.

  1. create RMD file with your design and save it in the working\input directory (in Rstudio: file->newfile->R markdown). This file should include all functions you need to make the plots in the report (simply declare them in one of those code chunks). Think of this file as the template for all future reports. Don't worry about passing the data into it's environment after chewing it up earlier- I will cover that in (2). the key issue to understand is that all calculations are done further down the pipeline (at the moment you render the RMD file).

  2. create the loop you need to use in a diffrent control r file. In my case, there is a loop that iterates over all files in the directory, and gets them into data frame. then I want to pass those dataframes into the RMD, along with other data variables, in order to plot them. This is how its done:

    run_on_all<-function(path_in="path:\\where\\your\\input\\and\\RMD\\is", path_out="path:\\where\\your\\output\\will\\be") setwd(path_in) ibrary(rmarkdown) library(knitr) list_of_file_names=list.files(path = getwd, pattern = "*.csv") #this gets a list of the input files names for (file_name in list_of_file_names) { data=read.csv(file_name) #read file into data frame report_name=paste(some_variable_name,".html",sep="") render("your_template.Rmd",output_file =report_name,output_dir =path_out,list(data,all other parameters you want to input into the RMD))} }

  3. The most important command is the render function call. It allows you to throw into the RMD environment any paramenters you wish. It also allows you to change the name of the report and to change the output location. In addition, by calling it you are also generating the report, so you get it all in one line.(note that if the call to RMD is within a function, you may find that the variables you input are missing, yet the report will still be published correctly)

summary

there are two files you need- RMD file, that will be the template for all additional reports and a control file. the control file gets data, chews it up and pass the chewed parameteres into the RMD (via the render function). the RMD gets the data, does some computations, plots it and publishes it in a new file (also by the render function). I hope I helped.