Elasticsearch Bulk Index JSON Data
You need to read that JSON file and build a bulk request in the format expected by the _bulk endpoint: one line for the command and one line for the document, each terminated by a newline character, repeated for each document:
curl -XPOST localhost:9200/your_index/_bulk -d '
{"index": {"_index": "your_index", "_type": "your_type", "_id": "975463711"}}
{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
{"index": {"_index": "your_index", "_type": "your_type", "_id": "975463943"}}
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
... etc for all your documents
'
Just make sure to replace your_index and your_type with the actual index and type names you're using.
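If the documents are already available as a parsed JSON array, the body described above can be generated rather than written by hand. The following is a minimal Python sketch, not part of the original answer; the function name build_bulk_body and its parameters are hypothetical, and it assumes every document carries the field named by id_field.

```python
import json


def build_bulk_body(docs, index, id_field=None, doc_type=None):
    """Return an NDJSON string: one action line, then one document line, per document."""
    lines = []
    for doc in docs:
        meta = {"_index": index}
        if doc_type is not None:
            # Only relevant for older clusters that still use document types.
            meta["_type"] = doc_type
        if id_field is not None:
            # Reuse a field of the document as its _id.
            meta["_id"] = str(doc[id_field])
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(doc))
    # Every line, including the last one, is terminated by a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"},
    {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"},
]
body = build_bulk_body(docs, "your_index", id_field="Id", doc_type="your_type")
print(body, end="")
```

Serializing with json.dumps keeps each document on a single line, because newlines inside string values are escaped.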
UPDATE
Note that the command line (the action line that precedes each document) can be shortened by removing _index and _type if those are specified in your URL. It is also possible to remove _id if you specify the path to your id field in your mapping (note that this feature will be deprecated in ES 2.0, though). At the very least, your command line can look like {"index":{}} for all documents, but it is always mandatory because it specifies which kind of operation you want to perform (in this case, index the document).
UPDATE 2
curl -XPOST localhost:9200/index_local/my_doc_type/_bulk --data-binary @/home/data1.json
/home/data1.json should look like this:
{"index":{}}
{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
{"index":{}}
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
{"index":{}}
{"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"}
UPDATE 3
You can refer to this answer to see how to generate the new json style file mentioned in UPDATE 2.
UPDATE 4
As of ES 7.x, the doc_type is not necessary anymore and should simply be _doc instead of my_doc_type. As of ES 8.x, the doc type will be removed completely. You can read more about this here
As of today, 6.1.2 is the latest version of Elasticsearch, and the curl command that works for me on Windows (x64) is:
curl -s -XPOST localhost:9200/my_index/my_index_type/_bulk -H "Content-Type: application/x-ndjson" --data-binary @D:\data\mydata.json
The format of the data that should be present in mydata.json remains the same as shown in @val's answer
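For readers who prefer a script over curl, the same request can be assembled with Python's standard library. This is only a sketch under assumptions: the helper name make_bulk_request is hypothetical, the URL mirrors the one in the curl command, and the request is built but not sent, so no cluster is needed to try it.

```python
import urllib.request


def make_bulk_request(url, payload):
    """Build (but do not send) a POST request for the _bulk endpoint."""
    return urllib.request.Request(
        url,
        data=payload,  # raw bytes, sent unmodified like curl's --data-binary
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


payload = (
    b'{"index":{}}\n'
    b'{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}\n'
)
req = make_bulk_request("http://localhost:9200/my_index/my_index_type/_bulk", payload)

# To send it against a running cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

In practice the payload would be read from the data file in binary mode, for example with open(path, "rb"), so that the newlines are preserved exactly.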
A valid Elasticsearch bulk API request would be something like (ending with a newline):
POST http://localhost:9200/products_slo_development_temp_2/productModel/_bulk
{ "index":{ } }
{"RequestedCountry":"slo","Id":1860,"Title":"Stol"}
{ "index":{ } }
{"RequestedCountry":"slo","Id":1860,"Title":"Miza"}
Elasticsearch bulk api documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
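Since a missing trailing newline or a misplaced line is easy to overlook, a quick local check before sending can help. The sketch below is an illustration only, with a hypothetical function name; it assumes an index-only body like the one above, where every action line is followed by exactly one document line.

```python
import json


def check_bulk_body(body):
    """Return a list of problems found in an index-only bulk body (empty if none)."""
    problems = []
    if not body.endswith("\n"):
        problems.append("body does not end with a newline")
    lines = body.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    if len(lines) % 2 != 0:
        problems.append("odd number of lines: every action needs a document line")
    for number, line in enumerate(lines, start=1):
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number} is not valid JSON")
            continue
        # Odd-numbered lines are expected to be action lines.
        if number % 2 == 1 and not (isinstance(parsed, dict) and "index" in parsed):
            problems.append(f"line {number} should be an index action line")
    return problems


good = '{ "index":{ } }\n{"RequestedCountry":"slo","Id":1860,"Title":"Stol"}\n'
print(check_bulk_body(good))
print(check_bulk_body(good.rstrip("\n")))
```

The check is deliberately narrow: other bulk operations, such as delete, do not take a document line and would be reported as problems here.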
This is how I do it.
I send a POST HTTP request with the uri variable as the URI/URL of the HTTP request; the elasticsearchJson variable is the JSON sent in the body of the HTTP request, formatted for the Elasticsearch bulk API:
var uri = @"/" + indexName + "/productModel/_bulk";
var json = JsonConvert.SerializeObject(sqlResult);
var elasticsearchJson = GetElasticsearchBulkJsonFromJson(json, "RequestedCountry");
Helper method for generating the required json format for the Elasticsearch bulk api:
public string GetElasticsearchBulkJsonFromJson(string jsonStringWithArrayOfObjects, string firstParameterNameOfObjectInJsonStringArrayOfObjects)
{
return @"{ ""index"":{ } }
" + jsonStringWithArrayOfObjects.Substring(1, jsonStringWithArrayOfObjects.Length - 2).Replace(@",{""" + firstParameterNameOfObjectInJsonStringArrayOfObjects + @"""", @"
{ ""index"":{ } }
{""" + firstParameterNameOfObjectInJsonStringArrayOfObjects + @"""") + @"
";
}
The first property/field in my JSON object is the RequestedCountry property; that's why I use it in this example.
productModel is my Elasticsearch document type.
sqlResult is a C# generic list with products.
This answer is for Elasticsearch 7.x onwards. _type is deprecated. As others have mentioned, you can read the file programmatically and construct a request body as described below. Also, I see that each of your JSON objects has the Id attribute, so you could set the document's internal id (_id) to be the same as this attribute. The updated _bulk API request would look like this:
HTTP Method: POST
URI: /<index_name>/_bulk
Request body (should end with a new line):
{"index":{"_id": "975463711"}}
{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
{"index":{"_id": "975463943"}}
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}