Elasticsearch searchable synthetic fields

Suppose a source document (JSON) contains two fields, a and b, both of type long. I would like to construct a synthetic field (e.g. c) by concatenating the values of those fields with an underscore and indexing it as a keyword.

That is, I am looking for a feature that could be supported with an imaginary, partial mapping like this:

...

  "a": { "type": "long" },
  "b": { "type": "long" },
  "c": {
    "type": "keyword"
    "expression": "${a}_${b}" 
  },
...

NOTE: The mapping above was made up just for the sake of the example. It is NOT valid!

So what I am looking for is a feature in Elasticsearch, or a recipe or hint, that supports this requirement. The field need not be stored in _source; it just needs to be searchable.


Solution 1:

There are two steps to this: a dynamic template and an ingest pipeline.

I'm assuming your field c is non-trivial (or that there may be several such fields), so you may want to match it in a dynamic template using match and assign the keyword mapping to it:

PUT synthetic
{
  "mappings": {
    "dynamic_templates": [
      {
        "c_like_field": {
          "match_mapping_type": "string",
          "match":   "c*",
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ],
    "properties": {
      "a": {
        "type": "long"
      },
      "b": {
        "type": "long"
      }
    }
  }
}

Then you can set up an ingest pipeline that concatenates a and b:

PUT _ingest/pipeline/combined_ab
{
  "description" : "Concatenates fields a & b",
  "processors" : [
    {
      "set" : {
        "field": "c",
        "value": "{{_source.a}}_{{_source.b}}"
      }
    }
  ]
}
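Before wiring the pipeline into indexing, you can dry-run it with the standard simulate API to check the concatenation without creating any documents (the sample values here are arbitrary):

POST _ingest/pipeline/combined_ab/_simulate
{
  "docs": [
    {
      "_source": {
        "a": 1,
        "b": 2
      }
    }
  ]
}

The response should show the processed document with c set to "1_2".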

After ingesting a new doc (with the activated pipeline!)

POST synthetic/_doc?pipeline=combined_ab
{
  "a": 531351351351,
  "b": 251531313213
}
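As a side note, instead of passing ?pipeline= on every indexing request, you can make the pipeline the index default via the index.default_pipeline setting:

PUT synthetic/_settings
{
  "index.default_pipeline": "combined_ab"
}

With that in place, plain POST synthetic/_doc requests will run through the pipeline automatically.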

we're good to go:

GET synthetic/_search

yields

{
  "a":531351351351,
  "b":251531313213,
  "c":"531351351351_251531313213"
}

You can verify the generated keyword mapping with GET synthetic/_mapping too.
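And since c is indexed as a keyword, exact-match queries against it now work as expected, e.g. a term query using the value from the document above:

GET synthetic/_search
{
  "query": {
    "term": {
      "c": "531351351351_251531313213"
    }
  }
}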