How to copy some ElasticSearch data to a new index

Let's say I have movie data in my Elasticsearch index, which I created like this:

curl -XPUT "http://192.168.0.2:9200/movies/movie/1" -H 'Content-Type: application/json' -d'
{
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972
}'

I have a bunch of movies from different years. I want to copy all the movies from a particular year (say, 1972) into a new index called "70sMovies", but I can't see how to do that.


Solution 1:

Since Elasticsearch 2.3 you can use the built-in _reindex API.

For example:

POST /_reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "new_twitter"
  }
}

Or copy only a subset of the documents by adding a query to the source:

POST /_reindex
{
  "source": {
    "index": "twitter",
    "query": {
      "term": {
        "user": "kimchy"
      }
    }
  },
  "dest": {
    "index": "new_twitter"
  }
}
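Applied to the question's scenario, the same approach would look roughly like this (the index name, field name, and destination name are assumptions taken from the question; note that Elasticsearch index names must be lowercase, so something like "movies-1970s" rather than "70sMovies"):

```
POST /_reindex
{
  "source": {
    "index": "movies",
    "query": {
      "term": {
        "year": 1972
      }
    }
  },
  "dest": {
    "index": "movies-1970s"
  }
}
```

If the destination index does not exist, _reindex will create it with default settings, so you may want to create it with the desired mappings first.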

Read more: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

Solution 2:

The best approach would be to use the elasticsearch-dump tool: https://github.com/taskrabbit/elasticsearch-dump.

A real-world example I used:

elasticdump \
  --input=http://localhost:9700/.kibana \
  --output=http://localhost:9700/.kibana_read_only \
  --type=mapping
elasticdump \
  --input=http://localhost:9700/.kibana \
  --output=http://localhost:9700/.kibana_read_only \
  --type=data
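elasticdump can also copy just a subset of documents via its --searchBody option, which takes a regular Elasticsearch query. A sketch for the question's scenario (the host, port, and index names here are assumptions, not taken from the original example):

```shell
# Copy only documents matching the query from "movies" into "movies-1970s".
# Assumes both indices live on a cluster reachable at localhost:9200.
elasticdump \
  --input=http://localhost:9200/movies \
  --output=http://localhost:9200/movies-1970s \
  --type=data \
  --searchBody='{"query": {"term": {"year": 1972}}}'
```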

Solution 3:

Check out knapsack: https://github.com/jprante/elasticsearch-knapsack

Once you have the plugin installed and working, you could export part of your index via query. For example:

curl -XPOST 'localhost:9200/test/test/_export' -d '{
    "query" : {
        "match" : {
            "myfield" : "myvalue"
        }
    },
    "fields" : [ "_parent", "_source" ]
}'

This will create a tarball with only your query results, which you can then import into another index.
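If I recall the plugin's conventions correctly, the counterpart _import endpoint restores such an archive into an index; a sketch, assuming the archive was created as above and the target index/type names are hypothetical:

```shell
# Restore the previously exported archive into the target index.
# Knapsack looks for the archive created by the matching _export by default.
curl -XPOST 'localhost:9200/newindex/test/_import'
```

Check the knapsack README for the exact parameters (such as the archive path) supported by your plugin version.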