Use the Synonyms APIs to Update Synonyms Conveniently in Elasticsearch

Author:Murphy  |  View: 22375  |  Time: 2025-03-22 23:55:43

The synonyms feature of Elasticsearch is very powerful and can significantly enhance your search engine's efficiency when properly used. A common pain point, however, is updating a synonyms set once it is in use.

Synonyms defined inline in the settings of an index cannot be updated directly: we need to close the index, update the settings, and re-open the index for the changes to take effect. Another way is to use a synonyms file, which can be updated and then applied by reloading the index. However, a synonyms file is difficult to manage when the Elasticsearch cluster is distributed or hosted in the cloud, because the file must be placed on every node of the cluster.

The good news is that there is now a third way, which is much more convenient than the previous two: we can use the synonyms APIs to manage synonyms. Even though this is still a beta feature of Elasticsearch at the time of writing, I think it will be adopted soon because it is in high demand among developers and solves the tricky problem of updating synonyms sets very conveniently. We will explore the common usage of the synonyms APIs in this post.


Preparation

We will use the following docker-compose.yaml file to start Elasticsearch and Kibana locally for demonstration.

version: "3.9"
services:
  elasticsearch:
    image: elasticsearch:8.11.1
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
      - xpack.security.enabled=false
    ports:
      - target: 9200
        published: 9200
    networks:
      - elastic

  kibana:
    image: kibana:8.11.1
    ports:
      - target: 5601
        published: 5601
    depends_on:
      - elasticsearch
    networks:
      - elastic      

networks:
  elastic:
    name: elastic
    driver: bridge

Note that you need at least version 8.10.0 of Elasticsearch to use the synonyms APIs. Using the latest available version is recommended, as the feature is likely to be more mature there.
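
A quick way to check this requirement from a script is to compare the version number reported by the cluster with 8.10.0. The following is a minimal sketch: the helper names `parse_version` and `supports_synonyms_api` are hypothetical, and it assumes a plain `major.minor.patch` version string such as the one found under `version.number` in the response of `GET /`.

```python
MIN_VERSION = (8, 10, 0)  # minimum version mentioned above for the synonyms APIs


def parse_version(number: str) -> tuple:
    """Turn '8.11.1' (or '8.11.1-SNAPSHOT') into (8, 11, 1)."""
    core = number.split("-", 1)[0]  # drop suffixes such as '-SNAPSHOT'
    return tuple(int(part) for part in core.split("."))


def supports_synonyms_api(number: str) -> bool:
    """Return True when the version is at least 8.10.0."""
    return parse_version(number) >= MIN_VERSION


print(supports_synonyms_api("8.11.1"))  # True (the version used in docker-compose.yaml)
print(supports_synonyms_api("8.9.2"))   # False
```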


Create a synonyms set

When Elasticsearch and Kibana are started with the above docker-compose.yaml file, we can go to http://localhost:5601 to manage Elasticsearch indexes and synonyms.

To use the synonyms APIs to manage synonyms, we need to create a synonyms set first before it can be used in an Elasticsearch index.

We can use the _synonyms endpoint to create or update a synonyms set using the PUT request:

PUT _synonyms/inventory-synonyms-set
{
  "synonyms_set": [
    {
      "id": "synonym-1",
      "synonyms": "ps => playstation"
    },
    {
      "synonyms": "javascript,ecmascript,js"
    }
  ]
}
  • inventory-synonyms-set is a user-defined name of the synonyms set.
  • synonyms_set is the required key for the request body which includes an array of synonyms rules.
  • Each synonym rule is an object with an optional id key and a mandatory synonyms key. If the id is not provided, an identifier will be created by Elasticsearch. The value for synonyms is a rule defined in the Solr format as used in the previous post.
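
The same request body can also be assembled in Python before it is sent. The following is a minimal sketch; `build_synonyms_set` is a hypothetical helper, not part of any Elasticsearch library. It accepts either a bare rule string or an `(id, rule)` pair and leaves out the `id` key when none is given, so that Elasticsearch generates one.

```python
import json


def build_synonyms_set(rules):
    """Build the body for PUT _synonyms/<name> from rule strings or (id, rule) pairs."""
    synonyms_set = []
    for rule in rules:
        if isinstance(rule, str):
            rule_id, synonyms = None, rule
        else:
            rule_id, synonyms = rule
        if not synonyms.strip():
            raise ValueError("a synonym rule must not be empty")
        entry = {"synonyms": synonyms}
        if rule_id is not None:
            # Put the optional id first, as in the request shown above.
            entry = {"id": rule_id, "synonyms": synonyms}
        synonyms_set.append(entry)
    return {"synonyms_set": synonyms_set}


body = build_synonyms_set([
    ("synonym-1", "ps => playstation"),
    "javascript,ecmascript,js",
])
print(json.dumps(body, indent=2))
```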

Create an Elasticsearch index using the synonyms set

When the synonyms set is created, it can be used in the synonym or synonym_graph token filters when an index is created:


PUT /inventory
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "index_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase"
            ]
          },
          "search_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "synonym_filter"
            ]
          }
        },
        "filter": {
          "synonym_filter": {
            "type": "synonym_graph",
            "synonyms_set": "inventory-synonyms-set",
            "updateable": true
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "index_analyzer",
        "search_analyzer": "search_analyzer"
      }
    }
  }
}

As we can see, the configuration for using a synonyms set is very similar to that for a synonyms file, as demonstrated in detail in this post. We just need to change synonyms_path to synonyms_set.
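
If you already have a filter definition that points to a synonyms file, the switch can be sketched as a small dictionary transformation. `use_synonyms_set` is a hypothetical helper name, and the file name `synonyms.txt` is only an example.

```python
def use_synonyms_set(filter_config: dict, set_name: str) -> dict:
    """Return a copy of a synonym filter definition that reads from a synonyms set."""
    # Drop the file-based (or inline) source of the rules and keep everything else.
    converted = {
        key: value
        for key, value in filter_config.items()
        if key not in ("synonyms_path", "synonyms")
    }
    converted["synonyms_set"] = set_name
    return converted


file_based = {
    "type": "synonym_graph",
    "synonyms_path": "synonyms.txt",  # example file name
    "updateable": True,
}
set_based = use_synonyms_set(file_based, "inventory-synonyms-set")
print(set_based)
```

The original dictionary is left untouched, so both variants can be compared side by side before an index is created.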


Test the synonyms set

We can use the _analyze endpoint to analyze some text and test the synonyms added in the previous step using the API:

GET /inventory/_analyze
{
  "analyzer": "search_analyzer",
  "text": "PS"
}

{
  "tokens": [
    {
      "token": "playstation",
      "start_offset": 0,
      "end_offset": 2,
      "type": "SYNONYM",
      "position": 0
    }
  ]
}

GET /inventory/_analyze
{
  "analyzer": "search_analyzer",
  "text": "JS"
}

{
  "tokens": [
    {
      "token": "javascript",
      ......
    },
    {
      "token": "ecmascript",
      ......
    },
    {
      "token": "js",
      ......
    }
  ]
}

It shows that the synonyms added by the API are working properly.
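
To see why `PS` becomes a single token while `JS` becomes three, it can help to model the two rule types in a few lines of Python. This is only a toy illustration of the Solr rule format for single-word terms, not how Elasticsearch implements the filter. The function names are hypothetical, and the sketch simply stops at the first matching rule.

```python
def parse_rule(rule: str):
    """Split a Solr-format rule into (inputs, outputs) lists of lowercase terms."""
    if "=>" in rule:
        # Explicit mapping: terms on the left are replaced by terms on the right.
        left, right = rule.split("=>", 1)
        inputs = [term.strip().lower() for term in left.split(",")]
        outputs = [term.strip().lower() for term in right.split(",")]
    else:
        # Equivalent synonyms: every term expands to the whole group.
        inputs = [term.strip().lower() for term in rule.split(",")]
        outputs = inputs
    return inputs, outputs


def expand(token: str, rules) -> list:
    """Return the tokens a single-word query term would produce under the rules."""
    token = token.lower()  # mirrors the lowercase filter that runs before the synonym filter
    for rule in rules:
        inputs, outputs = parse_rule(rule)
        if token in inputs:
            return list(outputs)
    return [token]  # no rule matched, the token passes through unchanged


rules = ["ps => playstation", "javascript,ecmascript,js"]
print(expand("PS", rules))  # ['playstation']
print(expand("JS", rules))  # ['javascript', 'ecmascript', 'js']
print(expand("go", rules))  # ['go']
```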


Update a synonyms set

Let's now use the synonyms API to update the synonyms set. This is where the synonyms API really shines because we don't need to close, open, or reload the corresponding Elasticsearch indexes, which alleviates the pain dramatically for developers.

We can use the PUT method to update the synonyms set as a whole. Be extremely careful here: the request replaces the entire set, so if you send only the new synonym rules, all the original rules are lost, which is very destructive in production.

Let's add a new synonym rule to inventory-synonyms-set:

PUT _synonyms/inventory-synonyms-set
{
  "synonyms_set": [
    {
      "id": "synonym-1",
      "synonyms": "ps => playstation"
    },
    {
      "synonyms": "javascript,ecmascript,js"
    },
    {
      "synonyms": "py => python"
    }
  ]
}

Note that the rules of the original synonyms set must be included in the request as well.
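
One way to lower the risk of dropping rules is to read the current rules first, merge the new ones in, and send the combined list. The sketch below assumes the official Python client (8.10 or later) and its `synonyms.get_synonym` and `synonyms.put_synonym` methods; `merge_rules`, `fetch_all_rules` and `add_rules` are hypothetical helper names. It also assumes the GET request is paginated, so it pages through the set with `from_` and `size`.

```python
def merge_rules(existing, additions):
    """Combine two rule lists; a rule with the same id or the same text is not duplicated."""
    merged = [dict(rule) for rule in existing]
    by_id = {rule["id"]: i for i, rule in enumerate(merged) if "id" in rule}
    texts = {rule["synonyms"] for rule in merged}
    for rule in additions:
        rule = dict(rule)
        rule_id = rule.get("id")
        if rule_id is not None and rule_id in by_id:
            merged[by_id[rule_id]] = rule  # same id: the new definition wins
        elif rule["synonyms"] not in texts:
            if rule_id is not None:
                by_id[rule_id] = len(merged)
            merged.append(rule)
            texts.add(rule["synonyms"])
    return merged


def fetch_all_rules(client, set_id, page_size=100):
    """Page through a synonyms set and return all of its rules."""
    rules, offset = [], 0
    while True:
        page = client.synonyms.get_synonym(id=set_id, from_=offset, size=page_size)
        rules.extend(page["synonyms_set"])
        offset += page_size
        if offset >= page["count"] or not page["synonyms_set"]:
            return rules


def add_rules(client, set_id, additions):
    """Replace the set with its current rules plus the additions."""
    merged = merge_rules(fetch_all_rules(client, set_id), additions)
    return client.synonyms.put_synonym(id=set_id, synonyms_set=merged)


existing = [
    {"id": "synonym-1", "synonyms": "ps => playstation"},
    {"synonyms": "javascript,ecmascript,js"},
]
print(merge_rules(existing, [{"synonyms": "py => python"}]))

# Usage against the local cluster (not run here):
# from elasticsearch import Elasticsearch
# es_client = Elasticsearch("http://localhost:9200")
# add_rules(es_client, "inventory-synonyms-set", [{"synonyms": "py => python"}])
```

Keep in mind that this read-merge-write sequence is not atomic: if another process changes the set between the read and the write, its changes are overwritten.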

When an existing synonyms set is updated, the search analyzers that use the synonym set are reloaded automatically for all indices, which can be seen in the response of the PUT request above:

{
  "result": "updated",
  "reload_analyzers_details": {
    "_shards": {
      "total": 2,
      "successful": 1,
      "failed": 0
    },
    "reload_details": [
      {
        "index": "inventory",
        "reloaded_analyzers": [
          "search_analyzer"
        ],
        "reloaded_node_ids": [
          "MD0GUTvcQAOvsHdUIBunNw"
        ]
      }
    ]
  }
}
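
In a script, it can be useful to verify from this response that the analyzers were actually reloaded. The following is a minimal sketch that only relies on the response fields shown above; `summarize_reload` is a hypothetical helper name.

```python
def summarize_reload(response: dict) -> dict:
    """Map each index to the analyzers reloaded for it; raise if any shard failed."""
    details = response.get("reload_analyzers_details", {})
    failed = details.get("_shards", {}).get("failed", 0)
    if failed:
        raise RuntimeError(f"{failed} shard(s) failed to reload analyzers")
    return {
        item["index"]: item["reloaded_analyzers"]
        for item in details.get("reload_details", [])
    }


# The response body shown above, as a Python dictionary.
response = {
    "result": "updated",
    "reload_analyzers_details": {
        "_shards": {"total": 2, "successful": 1, "failed": 0},
        "reload_details": [
            {
                "index": "inventory",
                "reloaded_analyzers": ["search_analyzer"],
                "reloaded_node_ids": ["MD0GUTvcQAOvsHdUIBunNw"],
            }
        ],
    },
}
print(summarize_reload(response))  # {'inventory': ['search_analyzer']}
```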

We can use the _analyze endpoint to test the newly added synonym:

GET /inventory/_analyze
{
  "analyzer": "search_analyzer",
  "text": "py"
}

{
  "tokens": [
    {
      "token": "python",
      "start_offset": 0,
      "end_offset": 2,
      "type": "SYNONYM",
      "position": 0
    }
  ]
}

Yes, the synonyms set is updated successfully and it's effective in real time with no need to close, open, or reload the corresponding index.

Besides, we can also perform incremental updates by adding a rule directly. In this case, we need to specify an id for the rule in the path:

PUT _synonyms/inventory-synonyms-set/synonym-ipod
{
  "synonyms": "i-pod, i pod => ipod"
}

You can also delete a synonym rule by its id:


DELETE _synonyms/inventory-synonyms-set/synonym-ipod

Updating the synonym rules individually works in the same way as updating the synonyms set as a whole.
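
The two single-rule requests can also be prepared from Python with only the standard library. The sketch below builds the requests without sending them. `rule_request` is a hypothetical helper, and the base URL assumes the local cluster from the docker-compose.yaml file with security disabled.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:9200"  # assumption: local cluster without authentication


def rule_request(method, set_id, rule_id, synonyms=None):
    """Build (but do not send) a request for a single synonym rule."""
    url = f"{BASE_URL}/_synonyms/{quote(set_id, safe='')}/{quote(rule_id, safe='')}"
    data = None
    headers = {}
    if synonyms is not None:
        # Only PUT carries a body: the rule text in Solr format.
        data = json.dumps({"synonyms": synonyms}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


put_req = rule_request("PUT", "inventory-synonyms-set", "synonym-ipod", "i-pod, i pod => ipod")
delete_req = rule_request("DELETE", "inventory-synonyms-set", "synonym-ipod")
print(put_req.get_method(), put_req.full_url)
print(delete_req.get_method(), delete_req.full_url)

# To actually send a request (requires the cluster to be running):
# with urllib.request.urlopen(put_req) as resp:
#     print(json.load(resp))
```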


Monitor synonyms sets

We can use the GET method to retrieve the content of a synonyms set directly:

GET _synonyms/inventory-synonyms-set

{
  "count": 3,
  "synonyms_set": [
    {
      "id": "qP8w_osBr1bbhRIfbz1c",
      "synonyms": "javascript,ecmascript,js"
    },
    {
      "id": "qf8w_osBr1bbhRIfbz1c",
      "synonyms": "py => python"
    },
    {
      "id": "synonym-1",
      "synonyms": "ps => playstation"
    }
  ]
}

In practice, we would want a script that checks the number of rules in each synonyms set to make sure that none have been removed accidentally, for example by a full update that left out the original rules. The following Python snippet lists all synonyms sets together with their rule counts:

from elasticsearch import Elasticsearch

es_client = Elasticsearch("http://localhost:9200")

es_client.synonyms.get_synonyms_sets()
# ObjectApiResponse({'count': 1, 'results': [{'synonyms_set': 'inventory-synonyms-set', 'count': 3}]})
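
Building on that response, a simple check could compare each count with a minimum you expect. This is only a sketch: `find_small_sets` and the expected numbers are hypothetical, and the listing is passed in as a plain dictionary with the same shape as the response above.

```python
def find_small_sets(listing: dict, expected_minimums: dict) -> dict:
    """Return {set_name: (actual, expected)} for sets that are missing or too small."""
    actual = {item["synonyms_set"]: item["count"] for item in listing["results"]}
    problems = {}
    for name, minimum in expected_minimums.items():
        count = actual.get(name, 0)  # a missing set counts as zero rules
        if count < minimum:
            problems[name] = (count, minimum)
    return problems


listing = {"count": 1, "results": [{"synonyms_set": "inventory-synonyms-set", "count": 3}]}
print(find_small_sets(listing, {"inventory-synonyms-set": 3}))  # {}
print(find_small_sets(listing, {"inventory-synonyms-set": 5}))  # {'inventory-synonyms-set': (3, 5)}
```

A non-empty result could then trigger an alert or fail a deployment step.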

In this post, we introduced the basics of the Elasticsearch synonyms APIs and demonstrated how to use them to manage synonyms conveniently. With this functionality, we don't need to close and re-open an index as with inline synonyms, or reload the index manually as with a synonyms file. It can make our search engines more stable and our work as developers much easier.

