Deboosting of search time synonym matches #49001

Closed
cbuescher opened this issue Nov 12, 2019 · 3 comments
Labels: >enhancement, :Search/Analysis (How text is split into tokens), :Search/Search (Search-related issues that do not fall into other categories), Team:Search (Meta label for search team)

Comments

@cbuescher
Member

Currently, if a search term is expanded via synonym filters, the term it is expanded to is treated the same way as the original search term for relevance scoring. For example, if I search for "automobile" and there is a synonym rule that expands this to "car", results matching "car" can rank higher than matches for the exact search term. Here is a very simple example that illustrates this:

PUT /test_index
{
    "settings": {
        "index" : {
            "analysis" : {
                "analyzer" : {
                    "synonym" : {
                        "tokenizer" : "standard",
                        "filter" : ["synonym"]
                    }
                },
                "filter" : {
                    "synonym" : {
                        "type" : "synonym",
                        "lenient": true,
                        "synonyms" : ["car, automobile"]
                    }
                }
            }
        }
    },
    "mappings": {
      "properties": {
        "field1" : {
          "type" : "text",
          "analyzer": "standard",
          "search_analyzer": "synonym"
        }
      }
    }
}

PUT /test_index/_doc/1
{
  "field1" : "fast car"
}

PUT /test_index/_doc/2
{
  "field1" : "fast automobile"
}

GET /test_index/_search
{
  "query": {
    "match": {
      "field1": "automobile"
    }
  }
}

When running this locally on one shard, both results returned have the same score. Ideally we should be able to take the token type of the search term into account and score the "car" match with a slightly lower boost than the one on the exact search term.
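
Until something like this exists, one possible workaround is sketched below. It is not part of the proposal itself and assumes the `test_index` setup from the example above. The `must` clause keeps the synonym-expanded match, while an extra `should` clause repeats the query with the `analyzer` parameter set to `standard`, so only documents containing the literal search term receive the additional score.

```console
GET /test_index/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "field1": "automobile"
        }
      },
      "should": {
        "match": {
          "field1": {
            "query": "automobile",
            "analyzer": "standard"
          }
        }
      }
    }
  }
}
```

With the two documents above, document 2 ("fast automobile") matches both clauses and document 1 ("fast car") matches only the `must` clause, so the exact match should rank first. This boosts exact matches rather than deboosting synonym matches, and it requires changing the query on the client side, which is why built-in support would still be preferable.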

@cbuescher cbuescher added the >enhancement, :Search/Analysis, :Search/Search, and v8.0.0 labels Nov 12, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Analysis)

cbuescher pushed a commit to cbuescher/elasticsearch that referenced this issue Nov 13, 2019
This is a first WIP draft at tackling elastic#49001 to get some feedback. While this
approach should solve some of the single and multi-term cases, I'm not sure if
the changes in `blendTerms` are correct. Also, some method overrides might be
better moved up the inheritance hierarchy into Lucene. Opening this PR mostly
for initial discussion.
@jimczi jimczi removed the v8.0.0 label Nov 14, 2019
@rjernst rjernst added the Team:Search label May 4, 2020
@AlShaffey

AlShaffey commented Jul 5, 2021

This one is actually important. However, I lean more toward weighted synonyms: I need control over the weight of each synonym, something like this.

So, is this possible?

@javanna
Member

javanna commented Jun 19, 2024

This has been open for quite a while, and hasn't had a lot of interest. For now I'm going to close this as something we aren't planning on implementing. We can re-open it later if needed.

@javanna javanna closed this as not planned Jun 19, 2024