
Partial search returns zero hits

  •  1
  • Paul85  · Tech Community  · 7 years ago

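    I have been using Elasticsearch (v6.1.3) successfully for exact-match searches. But when I try a partial or case-insensitive search (for example {"query": {"match": {"demodata": "Hello"}}} or {"query": {"match": {"demodata": "ell"}}} ), I get zero hits. I don't know why. I set up the analyzers following the hints in this post: Partial search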

    from elasticsearch import Elasticsearch
    es = Elasticsearch()
    settings={
        "mappings": {
            "my-type": {
                'properties': {"demodata": {
                    "type": "string",
                    "search_analyzer": "search_ngram",
                    "index_analyzer": "index_ngram"
                }
            }},
    
        },
        "settings": {
                "analysis": {
                        "filter": {
                                "ngram_filter": {
                                        "type": "ngram",
                                        "min_gram": 3,
                                        "max_gram": 8
                                }
                        },
                        "analyzer": {
                                "index_ngram": {
                                        "type": "custom",
                                        "tokenizer": "keyword",
                                        "filter": [ "ngram_filter", "lowercase" ]
                                },
                                "search_ngram": {
                                        "type": "custom",
                                        "tokenizer": "keyword",
                                        "filter": "lowercase"
                                }
                        }
                }
        }
    }
    es.indices.create(index="my-index", body=settings, ignore=400)
    docs=[
        { "demodata": "hello" },
        { "demodata": "hi" },
        { "demodata": "bye" },
        { "demodata": "HelLo WoRld!" }
    ]
    for doc in docs:
        res = es.index(index="my-index", doc_type="my-type", body=doc)
    
    res = es.search(index="my-index", body={"query": {"match": {"demodata": "Hello"}}})
    print("Got %d Hits:" % res["hits"]["total"])
    print (res)
    

    Updated code based on Piotr Pradzynski's input, but it still does not work:

    from elasticsearch import Elasticsearch
    es = Elasticsearch()
    if not es.indices.exists(index="my-index"):
        customset={
            "settings": {
                "analysis": {
                    "analyzer": {
                        "my_analyzer": {
                            "tokenizer": "my_tokenizer"
                        }
                    },
                    "tokenizer": {
                        "my_tokenizer": {
                            "type": "ngram",
                            "min_gram": 3,
                            "max_gram": 20,
                            "token_chars": [
                                "letter",
                                "digit"
                            ]
                        }
                    }
                }
            }
        }
    
    
        es.indices.create(index="my-index", body=customset, ignore=400)
        docs=[
            { "demodata": "hELLO" },
            { "demodata": "hi" },
            { "demodata": "bye" },
            { "demodata": "HeLlo WoRld!" },
            { "demodata": "xyz@abc.com" }
        ]
        for doc in docs:
            res = es.index(index="my-index", doc_type="my-type", body=doc)
    
    
    
    es.indices.refresh(index="my-index")
    res = es.search(index="my-index", body={"query": {"match": {"demodata":{"query":"ell","analyzer": "my_analyzer"}}}})
    
    #print res
    print("Got %d Hits:" % res["hits"]["total"])
    print (res)
    
    2 answers  |  latest 7 years ago
        1
  •  1
  •   Piotr Pradzynski    7 years ago

    I think you should use the NGram Tokenizer instead of the NGram Token Filter, and add a multi field that uses this tokenizer.

    Something like this:

    PUT my-index
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "ngram_analyzer": {
              "tokenizer": "ngram_tokenizer",
              "filter": [
                "lowercase",
                "asciifolding"
              ]
            }
          },
          "tokenizer": {
            "ngram_tokenizer": {
              "type": "ngram",
              "min_gram": 3,
              "max_gram": 15,
              "token_chars": [
                "letter",
                "digit"
              ]
            }
          }
        }
      },
      "mappings": {
        "my-type": {
          "properties": {
            "demodata": {
              "type": "text",
              "fields": {
                "ngram": {
                  "type": "text",
                  "analyzer": "ngram_analyzer",
                  "search_analyzer": "standard"
                }
              }
            }
          }
        }
      }
    }
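
    For readers using the Python client from the question, the request above could be issued roughly as follows. This is a sketch, assuming the `elasticsearch` Python client 6.x; `build_index_body` and `recreate_index` are hypothetical helper names, not part of the library. The index is deleted first because the analyzer of an already mapped field cannot be changed in place. Unlike the question's code, the create call does not pass `ignore=400`, so a rejected mapping raises an error instead of being silently skipped.

```python
def build_index_body():
    """Return the settings and mappings from the PUT request above as a dict."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "ngram_analyzer": {
                        "tokenizer": "ngram_tokenizer",
                        "filter": ["lowercase", "asciifolding"],
                    }
                },
                "tokenizer": {
                    "ngram_tokenizer": {
                        "type": "ngram",
                        "min_gram": 3,
                        "max_gram": 15,
                        "token_chars": ["letter", "digit"],
                    }
                },
            }
        },
        "mappings": {
            "my-type": {
                "properties": {
                    "demodata": {
                        "type": "text",
                        "fields": {
                            "ngram": {
                                "type": "text",
                                "analyzer": "ngram_analyzer",
                                "search_analyzer": "standard",
                            }
                        },
                    }
                }
            }
        },
    }


def recreate_index(es, index="my-index"):
    """Drop the index if it exists, then create it with the n-gram mapping.

    `es` is assumed to be an elasticsearch.Elasticsearch client (6.x).
    """
    # ignore=404 only tolerates "index not found" when deleting.
    es.indices.delete(index=index, ignore=404)
    # No ignore=400 here: an invalid mapping should fail loudly.
    es.indices.create(index=index, body=build_index_body())
```

    With a running cluster this would be called as `recreate_index(Elasticsearch())`, followed by indexing the documents and refreshing the index as in the question.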
    

    Then you have to use the added multi field demodata.ngram in the search:

    res = es.search(index="my-index", body={"query": {"match": {"demodata.ngram": "Hello"}}})
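
    To see why searching demodata.ngram finds partial matches, the analysis chain can be approximated locally. This is a rough sketch, not Elasticsearch code: `ngram_tokens`, `search_terms` and `matches` are hypothetical helpers that imitate the n-gram tokenizer (letters and digits only, 3 to 15 characters) plus the lowercase filter at index time, and a simplified standard analyzer at search time. The asciifolding filter is left out.

```python
import re


def ngram_tokens(text, min_gram=3, max_gram=15):
    """Approximate the n-gram tokenizer (token_chars: letter, digit)
    followed by the lowercase filter."""
    tokens = set()
    # Any character that is not a letter or digit splits the text into words.
    for word in re.findall(r"[^\W_]+", text):
        word = word.lower()
        for size in range(min_gram, max_gram + 1):
            for start in range(len(word) - size + 1):
                tokens.add(word[start:start + size])
    return tokens


def search_terms(query):
    """Rough stand-in for the standard analyzer: lowercased whole words."""
    return [w.lower() for w in re.findall(r"[^\W_]+", query)]


def matches(doc, query):
    """A match query hits when any analyzed query term equals an indexed token."""
    indexed = ngram_tokens(doc)
    return any(term in indexed for term in search_terms(query))


docs = ["hELLO", "hi", "bye", "HeLlo WoRld!", "xyz@abc.com"]
for query in ("Hello", "ell"):
    print(query, [d for d in docs if matches(d, query)])
# Hello ['hELLO', 'HeLlo WoRld!']
# ell ['hELLO', 'HeLlo WoRld!']
```

    The query string itself is not split into n-grams, because the search analyzer is standard. It only has to equal one of the n-grams stored at index time, which is why both "Hello" and "ell" hit the same documents in this sketch.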
    
        2
  •  0
  •   MrSimple    7 years ago

    What you need is a query_string search.

    https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html

    {
      "query":{
        "query_string":{
          "query":"demodata: *ell*"
        }
      }
    }
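
    From the Python client used in the question, a query like this could be built and sent as sketched below. `wildcard_body` and `wildcard_hits` are hypothetical helper names, and the fragment is assumed to be plain letters or digits with no query_string special characters. `wildcard_hits` is only a local approximation using `fnmatch`: it compares `*ell*` against the lowercased whole-word terms a default text field would hold, and does not talk to Elasticsearch. Leading wildcards have to examine many terms, so the Elasticsearch documentation warns that they can be slow on large indices.

```python
import fnmatch
import re


def wildcard_body(field, fragment):
    """Build a query_string body that looks for `fragment` anywhere in a term."""
    return {"query": {"query_string": {"query": "%s:*%s*" % (field, fragment)}}}


def wildcard_hits(docs, fragment):
    """Local approximation: match *fragment* against lowercased whole words."""
    pattern = "*%s*" % fragment.lower()
    hits = []
    for doc in docs:
        terms = [w.lower() for w in re.findall(r"[^\W_]+", doc)]
        if any(fnmatch.fnmatchcase(term, pattern) for term in terms):
            hits.append(doc)
    return hits
```

    With a client object `es` as in the question, the body would be passed as `es.search(index="my-index", body=wildcard_body("demodata", "ell"))`.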