How to perform a GET request for Elasticsearch in R

I'm new to Elasticsearch and I'm trying to run a basic query in R. Because I need an API key, I have not been able to use any of the available Elasticsearch libraries for R.

I can retrieve all of the documents in the Elasticsearch index, but I can't seem to run custom queries. I think my GET request isn't properly formatted. Here is what I have so far:

json_query <- jsonlite::toJSON('{
    "query": {
        "match" : {
            "LastName": "Baggins"
        }
    }
}
')

I've tried to pass my_query as a body= parameter, but the query is ignored and the request returns 10000 unfiltered documents instead. I then tried pasting it onto the url parameter:

get_scroll_id <-  httr::GET(url =paste("'https://Myserver:9200/indexOfInterest/_search?scroll=1m&size=10000'",my_query),
                            encoding='json',
                            add_headers(.headers = c("Authorization" = "ApiKey ****", "Content-Type" = "application/json")),
                            config=httr::config(ssl_verifypeer = FALSE,ssl_verifyhost = FALSE))

scroll_data <- fromJSON(content(get_scroll_id, as="text"))

This gives me the error:

Error in curl::curl_fetch_memory(url, handle = handle) : 
  Protocol "" not supported or disabled in libcurl

I have also tried to add the query as the query parameter as follows:

get_scroll_id <-  httr::GET(url ='https://Myserver:9200/indexOfInterest/_search?scroll=1m&size=10000',
                            query= json_query,
                            encoding='json',
                            add_headers(.headers = c("Authorization" = "ApiKey *****==", "Content-Type" = "application/json")),
                            verbose(),
                            config=httr::config(ssl_verifypeer = FALSE,ssl_verifyhost = FALSE))

This gives me the output:

GET https://Myserver:9200/indexOfInterest/_search?{
    "query": {
        "match" : {
            "LastName" : "Baggins"
        }
    }
}

Options:
* ssl_verifypeer: FALSE
* ssl_verifyhost: FALSE
* debugfunction: function (type, msg) 
{
    switch(type + 1, text = if (info) prefix_message("*  ", msg), headerIn = prefix_message("<- ", msg), headerOut = prefix_message("-> ", msg), dataIn = if (data_in) prefix_message("<<  ", msg, TRUE), dataOut = if (data_out) prefix_message(">> ", msg, TRUE), sslDataIn = if (ssl && data_in) prefix_message("*< ", msg, TRUE), sslDataOut = if (ssl && data_out) prefix_message("*> ", msg, TRUE))
}
* verbose: TRUE
Headers:
* Authorization: ApiKey *****==
* Content-Type: application/json

According to the Elasticsearch documentation, the equivalent curl command is:

curl 'localhost:9200/get-together/event/_search?pretty&scroll=1m' -d '{
    "query": {
        "match": {
            "LastName": "Baggins"
        }
    }
}'

How can I create the correct command for Elasticsearch?


Solution 1:

I think the problem is that httr::GET() simply doesn't support a body parameter, because sending a body with a GET request is uncommon (see this SO answer about HTTP GET with request body).

You can use a POST request instead, since the _search endpoint accepts both GET and POST. That works for me. Try the following and see if it helps:

library(httr)
library(rjson)

# The query is already valid JSON, so keep it as a plain string.
# No toJSON()/fromJSON() round trip is needed.
my_query <- '{
  "query": {
    "match": {
      "LastName": "Baggins"
    }
  }
}'

# A character body is sent as is
response <- httr::POST(
  url = "https://Myserver:9200/indexOfInterest/_search",
  httr::add_headers(
    .headers = c(
      "Authorization" = "ApiKey ****",
      "Content-Type" = "application/json"
    )
  ),
  body = my_query
)

# Parse the JSON response text into an R list
data <- rjson::fromJSON(content(response, as = "text"))
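As an alternative to a hand-written JSON string, the request body can be built from a nested R list, which is the kind of input toJSON() is meant for. The following is only a sketch: build_match_query is a hypothetical helper name, not part of any package, and it assumes jsonlite is installed.

```r
library(jsonlite)

# Hypothetical helper: build a match query from a field name and a value.
build_match_query <- function(field, value) {
  # setNames() lets the field name be supplied as a variable
  query <- list(query = list(match = setNames(list(value), field)))
  # auto_unbox = TRUE emits "Baggins" rather than ["Baggins"]
  jsonlite::toJSON(query, auto_unbox = TRUE)
}

body_json <- build_match_query("LastName", "Baggins")
cat(body_json, "\n")
```

The result can then be passed to the POST call as body = as.character(body_json).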

EDIT:

If you really insist on doing a GET request, try the following using curl. I couldn't test the Authorization part, but the rest is working:

library(curl)
library(jsonlite)

# Keep the query as a plain JSON string; it is sent unchanged as the body.
my_query <- '{
  "query": {
    "match": {
      "LastName": "Baggins"
    }
  }
}'

h <- new_handle(verbose = TRUE)
handle_setheaders(h,
  "Authorization" = "ApiKey ****",
  "Content-Type" = "application/json"
)
# postfields attaches the body; customrequest forces the verb back to GET
handle_setopt(handle = h, postfields = my_query, customrequest = "GET")

# Named res rather than c to avoid masking base::c()
res <- curl_fetch_memory(url = "https://Myserver:9200/indexOfInterest/_search", handle = h)

prettify(rawToChar(res$content))

The trick here is to use the postfields option to pass the body. On its own, that would make curl issue a POST request, so setting customrequest="GET" explicitly tells it to use GET instead.
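If this is needed for more than one query, the handle setup could be wrapped in a small function. This is a sketch under assumptions: make_get_with_body_handle is a hypothetical name, the API key is a placeholder, and the actual fetch is left commented out because it needs a reachable server.

```r
library(curl)

# Hypothetical helper: build a curl handle that sends `body` with the GET verb.
make_get_with_body_handle <- function(body, api_key) {
  h <- curl::new_handle()
  curl::handle_setheaders(h,
    "Authorization" = paste("ApiKey", api_key),
    "Content-Type" = "application/json"
  )
  # postfields attaches the body (which alone would imply POST);
  # customrequest then overrides the verb to GET.
  curl::handle_setopt(h, postfields = body, customrequest = "GET")
  h
}

query <- '{"query": {"match": {"LastName": "Baggins"}}}'
h <- make_get_with_body_handle(query, api_key = "****")

# Not run here because it requires network access to the server:
# res <- curl::curl_fetch_memory("https://Myserver:9200/indexOfInterest/_search", handle = h)
# jsonlite::prettify(rawToChar(res$content))
```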