
Migrate your Elasticsearch data

Applies to: Elastic Stack, ECE, Elastic Cloud Hosted

You might have switched to Elastic Cloud Hosted (ECH) or Elastic Cloud Enterprise (ECE) for any number of reasons, and you’re likely wondering how to get your existing Elasticsearch data into your new infrastructure. Along with easily creating as many new deployments with Elasticsearch clusters as you need, you have several options for moving your data over. Choose the option that works best for you:

  • Index your data from the original source, which is the simplest method and provides the greatest flexibility for the Elasticsearch version and ingestion method.
  • Reindex from a remote cluster, which rebuilds the index from scratch.
  • Restore from a snapshot, which copies the existing indices.
Note

Although this guide focuses on migrating data from a self-managed cluster to an Elastic Cloud Hosted or Elastic Cloud Enterprise deployment, the steps can also be adapted for other scenarios, such as when the source cluster is managed by Elastic Cloud on Kubernetes, or when migrating from Elastic Cloud Enterprise to Elastic Cloud Hosted.

If both the source and destination clusters belong to the same Elastic Cloud Hosted or Elastic Cloud Enterprise environment, refer to Restore a snapshot across clusters.

Depending on which option you choose, you might have limitations or need to do some preparation beforehand.

Indexing from the source
The new cluster must be the same size as your old one, or larger, to accommodate the data.
Reindex from a remote cluster

The new cluster must be the same size as your old one, or larger, to accommodate the data. Depending on your security settings for your old cluster, you might need to temporarily allow TCP traffic on port 9243 for this procedure.

For Elastic Cloud Hosted, if your cluster is self-managed with a self-signed certificate, you can follow this step-by-step migration guide.

Restore from a snapshot
The new cluster must be the same size as your old one, or larger, to accommodate the data. The new cluster must also be an Elasticsearch version that is compatible with the old cluster (check Elasticsearch snapshot version compatibility for details). If you have not already done so, you will need to set up snapshots for your old cluster using a repository that can be accessed from the new cluster.
Migrating system Elasticsearch indices

In Elasticsearch 8.0 and later versions, to back up and restore system indices and system data streams such as .kibana or .security, you must snapshot and restore the related feature's feature state.

Refer to Migrate system indices to learn how to restore the internal Elasticsearch system indices from a snapshot.
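If you’re unsure which feature states exist on your cluster, you can list them with the features API and then name the ones you want when creating a snapshot. The following is only a minimal sketch; MY_REPOSITORY and MY_SNAPSHOT are placeholder names, and the feature names returned by your cluster may differ:

    # List the features whose state can be captured in a snapshot
    GET _features

    # Create a snapshot that includes the Kibana and security feature states
    PUT _snapshot/MY_REPOSITORY/MY_SNAPSHOT?wait_for_completion=true
    {
      "indices": "*",
      "feature_states": ["kibana", "security"]
    }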

If you still have access to the original data source, outside of your old Elasticsearch cluster, you can load the data from there. This might be the simplest option, allowing you to choose the Elasticsearch version and take advantage of the latest features. You can use any ingestion method that you want: Logstash, Beats, the Elasticsearch clients, or whatever works best for you.
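For example, if you export documents from the original source, you could load them into the new cluster with a simple bulk request. This is only a sketch; my-new-index and the document contents are placeholders:

    # Index exported documents in batches with the bulk API
    POST _bulk
    { "index": { "_index": "my-new-index" } }
    { "@timestamp": "2025-01-01T00:00:00Z", "message": "a document exported from the original source" }
    { "index": { "_index": "my-new-index" } }
    { "@timestamp": "2025-01-01T00:01:00Z", "message": "another exported document" }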

If the original source isn’t available or has other issues that make it non-viable, there are still two more migration options: reindexing the data from a remote cluster or restoring from a snapshot.

Through the Elasticsearch reindex API, you can connect your new Elasticsearch deployment remotely to your old Elasticsearch cluster. This pulls the data from your old cluster and indexes it into your new one. Reindexing essentially rebuilds the index from scratch, and it can be more resource-intensive to run than a snapshot restore.

Warning

Reindex operations do not migrate index mappings, settings, or associated index templates from the source cluster.

Before migrating your Elasticsearch data, define the necessary mappings and templates on the new cluster. The easiest way to do this is to copy the relevant index templates from the old cluster to the new one before starting reindex operations.
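For example, assuming a hypothetical composable template named my-logs-template, you could read its definition from the old cluster and recreate it on the new one before reindexing. The template body shown here is only illustrative:

    # On the old cluster: retrieve the template definition
    GET _index_template/my-logs-template

    # On the new cluster: recreate it with the settings and mappings
    # returned by the previous request
    PUT _index_template/my-logs-template
    {
      "index_patterns": ["my-logs-*"],
      "template": {
        "settings": {
          "number_of_shards": 1
        },
        "mappings": {
          "properties": {
            "@timestamp": { "type": "date" }
          }
        }
      }
    }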

Follow these steps to reindex data remotely:

  1. Log in to Elastic Cloud Hosted or Elastic Cloud Enterprise.

  2. Select a deployment or create one.

  3. Ensure that the new Elasticsearch cluster can access the remote source cluster to perform the reindex operation. Access is controlled by the Elasticsearch reindex.remote.whitelist user setting.

    Domains matching the patterns ["*.io:*", "*.com:*"] are allowed by default, so if your remote host URL matches that pattern you do not need to explicitly define reindex.remote.whitelist.

    Otherwise, if your remote endpoint is not covered by the default patterns, adjust the setting to add the remote Elasticsearch cluster as an allowed host:

    1. From your deployment menu, go to the Edit page.

    2. In the Elasticsearch section, select Manage user settings and extensions. For deployments with existing user settings, you may have to expand the Edit elasticsearch.yml caret for each node type instead.

    3. Add the reindex.remote.whitelist: [REMOTE_HOST:PORT] user setting, where REMOTE_HOST is a pattern matching the URL of the remote Elasticsearch host that you are reindexing from, and PORT is the host's port number. Do not include the https:// prefix.

      Note that if you override this setting, it replaces the defaults: ["*.io:*", "*.com:*"]. If you still want these patterns to be allowed, you need to specify them explicitly in the value.

      For example:

      reindex.remote.whitelist: ["*.us-east-1.aws.found.io:9243", "*.com:*"]

    4. Save your changes.

  4. Using the API Console or within Kibana, either create the destination index with the appropriate settings and mappings, or ensure that the relevant index templates are in place. A minimal sketch of creating a destination index follows these steps.

  5. Using the API Console or Kibana DevTools Console, reindex the data remotely from the old cluster:

    POST _reindex
    {
      "source": {
        "remote": {
          "host": "https://REMOTE_ELASTICSEARCH_ENDPOINT:PORT",
          "username": "USER",
          "password": "PASSWORD"
        },
        "index": "INDEX_NAME",
        "query": {
          "match_all": {}
        }
      },
      "dest": {
        "index": "INDEX_NAME"
      }
    }
    

    For additional options and details, refer to the reindex API documentation.

  6. Verify that the new index is present:

    GET INDEX_NAME/_search?pretty
    
  7. If you are not planning to reindex more data from the remote, you can remove the reindex.remote.whitelist user setting that you added previously.
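As mentioned in step 4, the following is a minimal sketch of creating a destination index before reindexing into it. The index name, settings, and mappings are placeholders and should match your data:

    # Create the destination index with explicit settings and mappings
    PUT INDEX_NAME
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "message": { "type": "text" }
        }
      }
    }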

Restoring from a snapshot is often the fastest and most reliable way to migrate data between Elasticsearch clusters. It preserves mappings, settings, and optionally parts of the cluster state such as index templates, component templates, and system indices.

System indices can be restored by including their corresponding feature states in the restore operation, allowing you to retain internal configurations related to security, Kibana, or other stack features.
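For example, a restore request that includes specific feature states might look like the following sketch. MY_REPOSITORY, MY_SNAPSHOT, and INDEX_NAME are placeholders, and the feature names available on your cluster may differ (you can list them with GET _features):

    # Restore selected indices plus the Kibana and security feature states
    POST _snapshot/MY_REPOSITORY/MY_SNAPSHOT/_restore
    {
      "indices": "INDEX_NAME",
      "include_global_state": false,
      "feature_states": ["kibana", "security"]
    }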

This method is especially useful when you want to fully replicate the source cluster or when remote reindexing is not possible, for example if the source cluster is in a degraded or unreachable state.

To use this method, the new cluster must have access to the snapshot repository that contains the data from the old cluster. Also ensure that both clusters use compatible versions.
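To check compatibility, you can compare the version reported by each cluster, for example:

    # Run on both the old and the new cluster and compare version.number
    GET /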

For more information, refer to Restore into a different cluster.

Note

For Elastic Cloud Enterprise users, Amazon S3 buckets are the most common choice, but you should be able to restore from any addressable external storage that contains your Elasticsearch snapshots.

The following steps assume you already have a snapshot repository configured in the old cluster, with at least one valid snapshot containing the data you want to migrate.

In this step, you’ll configure a snapshot repository in the new cluster that points to the storage location used by the old cluster. This allows the new cluster to access and restore snapshots created in the original environment.

Tip

If your new Elastic Cloud Hosted or Elastic Cloud Enterprise deployment cannot connect to the same repository used by your self-managed cluster, for example if it's a private NFS share, consider one of the following alternatives:

  • Back up your repository to a supported storage system such as AWS S3, Google Cloud Storage, or Azure Blob Storage, and then configure your new cluster to use that location for the data migration.
  • Expose the repository contents over FTP, HTTP, or HTTPS, and use a read-only URL repository type in your new deployment to access the snapshots.
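If you use the read-only URL repository alternative, registering it might look like the following sketch. The repository name and URL are placeholders, and depending on your configuration you might also need to add the URL to the repositories.url.allowed_urls user setting:

    # Register a read-only URL repository pointing at the exposed snapshot files
    PUT _snapshot/my-readonly-repo
    {
      "type": "url",
      "settings": {
        "url": "https://snapshots.example.com/my-repository/"
      }
    }
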
  1. On your old Elasticsearch cluster, retrieve the snapshot repository configuration:

    GET /_snapshot/_all
    

    Take note of the repository name and type (for example, s3, gcs, or azure), its base path, and any additional settings. Authentication credentials are often stored in the secure settings on each node. You’ll need to replicate all this configuration when registering the repository in the new ECH or ECE deployment.

    If your old cluster has multiple repositories configured, identify the repository with the snapshots containing the data that you want to migrate.

  2. Add the snapshot repository on the new cluster. An example of registering an existing repository through the API is sketched after these steps.

    Considerations:

    • If you’re migrating searchable snapshots, the repository name must be identical in the source and destination clusters.
    • If the old cluster still has write access to the repository, register the repository as read-only to avoid data corruption. This can be done using the readonly: true option.

    To connect the existing snapshot repository to your new deployment, follow the steps for the storage provider where the repository is hosted:

    Important

    Although these instructions are focused on Elastic Cloud Hosted, you should follow the same steps for Elastic Cloud Enterprise by configuring the repository directly at the deployment level.

    Do not configure the repository as an ECE-managed repository; that repository type is intended for automatic snapshots of deployments. In this scenario, you need to add a custom repository that already contains the snapshots from the other cluster.
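As referenced in step 2, registering an existing repository through the API might look like the following sketch for an S3 bucket. The repository name, bucket, and base path are placeholders; use the values you noted from the old cluster, keep readonly: true while the old cluster can still write to the repository, and supply credentials in whatever way your environment requires (for example, through the Elasticsearch keystore or a named S3 client):

    # Register the existing S3 repository as read-only on the new cluster
    PUT _snapshot/my-migration-repo
    {
      "type": "s3",
      "settings": {
        "bucket": "MY_BUCKET",
        "base_path": "my/base/path",
        "readonly": true
      }
    }

    # Optionally verify that the new cluster can reach the repository
    POST _snapshot/my-migration-repo/_verify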

After the repository has been registered and verified, you are ready to restore data from any of its snapshots to your new cluster.
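For example, you can list the snapshots in the newly registered repository and inspect the one you plan to restore. The repository and snapshot names are placeholders:

    # List all snapshots in the repository
    GET _snapshot/my-migration-repo/_all

    # Inspect a specific snapshot, including the indices it contains
    GET _snapshot/my-migration-repo/MY_SNAPSHOT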

You can run a restore operation using the Kibana Management UI, or using the Elasticsearch API. Refer to Restore a snapshot for more details, including API-based examples.

For details about the contents of a snapshot, refer to Snapshot and restore > Snapshot contents.

To start the restore process:

  1. Open Kibana and go to Management > Snapshot and Restore.

  2. Under the Snapshots tab, you can find the available snapshots from your newly added snapshot repository. Select any snapshot to view its details, and from there you can choose to restore it.

  3. Select Restore.

  4. Select the index or indices you wish to restore.

  5. Optionally, configure additional restore options, such as Restore aliases, Restore global state, or Restore feature state.

  6. Select Restore snapshot to begin the process.

  7. Verify that each restored index is available in your deployment. You can do this using the Kibana Index Management UI, or by running this query:

    GET INDEX_NAME/_search?pretty
    

    If you have restored many indices, you can also run GET _cat/indices?s=index to list all indices for verification. A short sketch for checking restore progress and document counts follows.
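If a restore is still in progress, or you want to confirm that document counts match the source, the following sketch may help. INDEX_NAME is a placeholder:

    # Check recovery progress for a restored index
    GET INDEX_NAME/_recovery?human

    # Confirm the document count matches the source index
    GET INDEX_NAME/_count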