How to make Elasticsearch add the timestamp field to every document in all indices?

Solution 1:

Elasticsearch used to support automatically adding timestamps to documents being indexed, but deprecated this feature in 2.0.0.

From the version 5.5 documentation:

The _timestamp and _ttl fields were deprecated and are now removed. As a replacement for _timestamp, you should populate a regular date field with the current timestamp on application side.
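
A minimal sketch of the application-side approach the documentation describes, assuming documents are plain Python dicts. The field name `indexed_at` and the helper `with_timestamp` are hypothetical names chosen for illustration; the value is a UTC timestamp in ISO 8601 form with millisecond precision, which a regular date field with the default format should accept.

```python
from datetime import datetime, timezone


def with_timestamp(doc, field="indexed_at", now=None):
    """Return a copy of doc with an ISO 8601 UTC timestamp under `field`.

    `field` and this helper are illustrative names, not part of any
    Elasticsearch API. An existing value is left untouched.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    stamped = dict(doc)  # shallow copy so the caller's dict is not mutated
    stamped.setdefault(field, now.isoformat(timespec="milliseconds"))
    return stamped


doc = with_timestamp({"title": "hello"})
# `doc` can now be passed as the body of an index request.
```

Because the timestamp is taken on the client, it reflects the moment the application built the document, not the moment Elasticsearch indexed it.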

Solution 2:

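You can do this by enabling the _timestamp field in the mappings when creating your index (this only works on versions that still have the _timestamp field; it was deprecated in 2.0.0 and later removed).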

$ curl -XPOST localhost:9200/test -d '{
"settings" : {
    "number_of_shards" : 1
},
"mappings" : {
    "_default_":{
        "_timestamp" : {
            "enabled" : true,
            "store" : true
        }
    }
  }
}'

This automatically creates a _timestamp for every document you put in the index. After indexing a document, the _timestamp field is returned when you request it.

Solution 3:

Adding another way to get the indexing timestamp. Hope this helps someone.

An ingest pipeline can be used to add a timestamp when a document is indexed. Here is a sample:

PUT _ingest/pipeline/indexed_at
{
  "description": "Adds indexed_at timestamp to documents",
  "processors": [
    {
      "set": {
        "field": "_source.indexed_at",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}

Earlier, Elasticsearch only supported named pipelines, so the 'pipeline' parameter had to be specified on the Elasticsearch endpoint used to write/index documents. (Ref: link) This was a bit troublesome, as you would need to change the endpoints on the application side.

With Elasticsearch version >= 6.5, you can specify a default pipeline for an index using the index.default_pipeline setting. (Refer link for details)

Here is how to set the default pipeline:

PUT ms-test/_settings
{
  "index.default_pipeline": "indexed_at"
}

I haven't tried this out yet, as I haven't upgraded to ES 6.5, but the above command should work.
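
For readers using the Python client, the two requests above could be sketched as follows. This assumes `es` is an elasticsearch-py client from the 6.x or 7.x series, where `ingest.put_pipeline` and `indices.put_settings` accept a `body` argument; the helper names `indexed_at_pipeline` and `install_indexed_at` are hypothetical, and the sketch has not been run against a live cluster.

```python
def indexed_at_pipeline():
    """Body of the `indexed_at` ingest pipeline shown above."""
    return {
        "description": "Adds indexed_at timestamp to documents",
        "processors": [
            {"set": {"field": "_source.indexed_at",
                     "value": "{{_ingest.timestamp}}"}}
        ],
    }


def install_indexed_at(es, index, pipeline_id="indexed_at"):
    """Register the pipeline, then make it the default for `index`.

    `es` is assumed to be an elasticsearch-py 6.x/7.x client; the
    default_pipeline setting needs Elasticsearch 6.5 or later.
    """
    es.ingest.put_pipeline(id=pipeline_id, body=indexed_at_pipeline())
    es.indices.put_settings(
        index=index,
        body={"index.default_pipeline": pipeline_id},
    )
```

It would be called as `install_indexed_at(Elasticsearch(hosts=["localhost"]), "ms-test")`. Registering the pipeline before changing the setting avoids pointing the index at a pipeline that does not exist yet.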

Solution 4:

You can make use of default index pipelines, leverage the script processor, and thus emulate the auto_now_add functionality you may know from Django and DEFAULT GETDATE() from SQL.

The process of adding a default yyyy-MM-dd HH:mm:ss date goes like this:

1. Create the pipeline and specify which indices it'll be allowed to run on:

PUT _ingest/pipeline/auto_now_add
{
  "description": "Assigns the current date if not yet present and if the index name is whitelisted",
  "processors": [
    {
      "script": {
        "source": """
          // skip if not whitelisted
          if (![ "myindex",
                 "logs-index",
                 "..."
              ].contains(ctx['_index'])) { return; }
          
          // don't overwrite if present
          if (ctx['created_at'] != null) { return; }
          
          ctx['created_at'] = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(new Date());
        """
      }
    }
  ]
}

Side note: the ingest processor's Painless script context is documented here.

2. Update the default_pipeline setting in all of your indices:

PUT _all/_settings
{
  "index": {
    "default_pipeline": "auto_now_add"
  }
}

Side note: you can restrict the target indices using the multi-target syntax:

PUT myindex,logs-2021-*/_settings?allow_no_indices=true
{
  "index": {
    "default_pipeline": "auto_now_add"
  }
}

3. Ingest a document to one of the configured indices:

PUT myindex/_doc/1
{
  "abc": "def"
}

4. Verify that the date string has been added:

GET myindex/_search
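
The branching of the script in step 1 can be mirrored in plain Python, which may help when reasoning about the expected behaviour or unit-testing it without a cluster. This is only an emulation, not how Elasticsearch runs the pipeline; `ALLOWED_INDICES`, `auto_now_add` and `hits_missing_created_at` are hypothetical names, and the index names are the ones from the whitelist above.

```python
import re
from datetime import datetime

# Index names taken from the whitelist in step 1 (placeholder entry omitted).
ALLOWED_INDICES = {"myindex", "logs-index"}
# Python equivalent of the "yyyy-MM-dd HH:mm:ss" pattern used in the script.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def auto_now_add(index, source, now=None):
    """Mimic the pipeline: set created_at unless skipped or already present."""
    if index not in ALLOWED_INDICES:
        return source  # skip if not whitelisted
    if source.get("created_at") is not None:
        return source  # don't overwrite if present
    stamped = dict(source)
    stamped["created_at"] = (now or datetime.now()).strftime(DATE_FORMAT)
    return stamped


def hits_missing_created_at(search_response):
    """Return the _id of every hit whose created_at is absent or malformed."""
    pattern = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
    return [
        hit["_id"]
        for hit in search_response["hits"]["hits"]
        if not pattern.fullmatch(str(hit["_source"].get("created_at", "")))
    ]
```

The second helper takes the parsed JSON body of the search in step 4; an empty list means every returned document carries the date string.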

Solution 5:

An example for Elasticsearch 6.6.2 in Python 3:

from elasticsearch import Elasticsearch

es = Elasticsearch(hosts=["localhost"])

timestamp_pipeline_setting = {
  "description": "insert timestamp field for all documents",
  "processors": [
    {
      "set": {
        "field": "ingest_timestamp",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}

es.ingest.put_pipeline("timestamp_pipeline", timestamp_pipeline_setting)

conf = {
    "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1,
        "default_pipeline": "timestamp_pipeline"
    },
    "mappings": {
        "articles":{
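            # With dynamic "false", unmapped fields such as ingest_timestamp
            # are kept in _source but are not indexed or searchable.
            "dynamic": "false",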
            "_source" : {"enabled" : "true" },
            "properties": {
                "title": {
                    "type": "text",
                },
                "content": {
                    "type": "text",
                },
            }
        }
    }
}

response = es.indices.create(
    index="articles_index",
    body=conf,
    ignore=400,  # ignore the 400 "index already exists" error
)

print('\nresponse:', response)

doc = {
    'title': 'automatically adding a timestamp to documents',
    'content': 'prior to version 5 of Elasticsearch, documents had a metadata field called _timestamp. When enabled, this _timestamp was automatically added to every document. It would tell you the exact time a document had been indexed.',
}
res = es.index(index="articles_index", doc_type="articles", id=100001, body=doc)
print(res)

res = es.get(index="articles_index", doc_type="articles", id=100001)
print(res)

For ES 7.x, the example should work after removing the doc_type parameters, as they are no longer supported.
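
On 7.x the index body is also expected without the type name under "mappings", since mapping types were removed. A minimal sketch of that conversion, assuming the body has exactly one mapping type as in `conf` above; the helper `to_typeless` is a hypothetical name, not part of the client library.

```python
import copy


def to_typeless(conf):
    """Return a copy of a 6.x index body with the single mapping type removed.

    Assumes `conf["mappings"]` has exactly one key, the type name
    (for example "articles"). A body that already has "properties"
    directly under "mappings" is returned unchanged.
    """
    converted = copy.deepcopy(conf)
    mappings = converted.get("mappings", {})
    if "properties" in mappings:
        return converted  # already typeless
    if len(mappings) != 1:
        raise ValueError("expected exactly one mapping type")
    (type_body,) = mappings.values()
    converted["mappings"] = type_body  # lift properties etc. one level up
    return converted
```

The index and get calls would then drop `doc_type`, for example `es.index(index="articles_index", id=100001, body=doc)`.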