How to read the contents of an .sql file into an R script to run a query?

I've had trouble with reading sql files myself, and have found that often times the syntax gets broken if there are any single line comments in the sql. Since in R you store the sql statement as a single line string, if there are any double dashes in the sql it will essentially comment out any code after the double dash.

This is a function that I typically use whenever I am reading in a .sql file to be used in R.

getSQL <- function(filepath){
  con = file(filepath, "r")
  sql.string <- ""

  while (TRUE){
    line <- readLines(con, n = 1)

    if ( length(line) == 0 ){
      break
    }

    line <- gsub("\\t", " ", line)

    if(grepl("--",line) == TRUE){
      line <- paste(sub("--","/*",line),"*/")
    }

    sql.string <- paste(sql.string, line)
  }

  close(con)
  return(sql.string)
}

I've found for queries with multiple lines, the read_file() function from the readr package works well. The only thing you have to be mindful of is to avoid single quotes (double quotes are fine). You can even add comments this way.

Example query, saved as query.sql

SELECT 
COUNT(1) as "my_count"
-- comment goes here
FROM -- tabs work too
  my_table

I can then store the results in a data frame with

df <- dbGetQuery(con, statement = read_file('query.sql'))

You can use the read_file() function from the readr package.

fileName = read_file("C:/Users/me/Desktop/my_script.sql")

You will get a string variable fileName with the desired text.

Note: Use / instead of \\\


The answer by Matt Jewett is quite useful, but I wanted to add that I sometimes encounter the following warning when trying to read .sql files generated by sql server using that answer:

Warning message: In readLines(con, n = 1) : line 1 appears to contain an embedded nul

The first line returned by readLines is often "ÿþ" in these cases (i.e. the UTF-16 byte order mark) and subsequent lines are not read properly. I solved this by opening the sql file in Microsoft SQL Server Management Studio and selecting

File -> Save As ...

then on the small downarrow next to the save button selecting

Save with Encoding ...

and choosing

Unicode (UTF-8 without signature) - Codepage 65001

from the Encoding dropdown menu.

If you do not have Microsoft SQL Server Management Studio and are using a Windows machine, you could also try opening the file with the default text editor and then selecting

File -> Save As ...

Encoding: UTF-8

to save with a .txt file extension.

Interestingly changing the file within Microsoft SQL Server Management Studio removes the BOM (byte order mark) altogether, whereas changing the file within the text editor converts the BOM to the UTF-8 BOM but nevertheless causes the query to be properly read using the referenced answer.