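This post reuses the data from this article: https://www.cnblogs.com/MRLL/p/12691763.html
While querying, I noticed that documents with exactly the same name were given different relevance scores.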
GET /z_test/doc/_search
{
  "explain": false,
  "query": {
    "match_phrase": {
      "name": "测试"
    }
  }
}
Result:
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : 5,
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "z_test",
        "_type" : "doc",
        "_id" : "D4eQcnEBf_xjEc-wO9P0",
        "_score" : 0.5753642,
        "_source" : { "name" : "测试123" }
      },
      {
        "_index" : "z_test",
        "_type" : "doc",
        "_id" : "2oeLcnEBf_xjEc-wFNK2",
        "_score" : 0.5753642,
        "_source" : { "name" : "测试" }
      },
      {
        "_index" : "z_test",
        "_type" : "doc",
        "_id" : "_analyze",
        "_score" : 0.45840853,
        "_source" : { "name" : "测试" }
      },
      {
        "_index" : "z_test",
        "_type" : "doc",
        "_id" : "qHeKcnEBvg5mZsCPxwX1",
        "_score" : 0.3672113,
        "_source" : { "name" : "测试" }
      },
      {
        "_index" : "z_test",
        "_type" : "doc",
        "_id" : "AVSTcnEBjEFwhOIJHS0S",
        "_score" : 0.33573607,
        "_source" : { "name" : "测试1" }
      }
    ]
  }
}
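After checking the documentation, I learned that the relevance score includes a factor called inverse document frequency (IDF). It reflects how often the search term appears across the documents in the index: the more documents contain the term, the less weight it carries in the score. By default each shard computes this from its own documents only, so identical documents stored on different shards can end up with different scores.
See this article: https://blog.csdn.net/paditang/article/details/79098830
The fix is to use the following query: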
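To make the effect of shard-local statistics concrete, here is a minimal sketch of the IDF arithmetic. It assumes the BM25 similarity (the default since Elasticsearch 5.0) and its IDF formula ln(1 + (N - n + 0.5) / (n + 0.5)); the shard names and document counts are hypothetical and are not taken from the z_test index.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """BM25 inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical per-shard statistics:
# (documents in the shard, documents in the shard that contain the term)
shard_stats = {
    "shard_0": (1, 1),
    "shard_1": (2, 2),
    "shard_2": (3, 2),
}

# query_then_fetch: every shard scores with its own local statistics.
local_idf = {
    shard: bm25_idf(n_docs, df) for shard, (n_docs, df) in shard_stats.items()
}

# dfs_query_then_fetch: statistics are summed across shards before scoring.
total_docs = sum(n_docs for n_docs, _ in shard_stats.values())
total_df = sum(df for _, df in shard_stats.values())
global_idf = bm25_idf(total_docs, total_df)

for shard, idf in local_idf.items():
    print(f"{shard}: local idf = {idf:.7f}")
print(f"global idf = {global_idf:.7f}")
```

The same term gets a different IDF on each hypothetical shard, while the summed statistics give a single value for all of them. As a side note, 2 × bm25_idf(1, 1) is about 0.5754, which would be consistent with the top score in the result if the query analyzes into two terms on a shard holding one matching document; that reading is an inference, not something the source states.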
GET /z_test/doc/_search?search_type=dfs_query_then_fetch
{
  "explain": false,
  "query": {
    "match": {
      "name": {
        "query": "测试"
      }
    }
  }
}
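dfs_query_then_fetch first collects term statistics from all shards, so documents are scored with global statistics instead of shard-local ones.
The default search type is query_then_fetch.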
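When the search is issued from code rather than from the console, search_type is passed as a URL query parameter in the same way. The sketch below uses only the Python standard library; the host http://localhost:9200 and the helper names build_search_request and search are assumptions made for illustration, not part of the original post.

```python
import json
import urllib.parse
import urllib.request

ES_HOST = "http://localhost:9200"  # assumption: a local, unauthenticated node


def build_search_request(index, doc_type, text,
                         search_type="dfs_query_then_fetch"):
    """Build (but do not send) a match query on the name field."""
    params = urllib.parse.urlencode({"search_type": search_type})
    url = f"{ES_HOST}/{index}/{doc_type}/_search?{params}"
    body = json.dumps({"query": {"match": {"name": {"query": text}}}})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # _search accepts POST as well as GET
    )


def search(index, doc_type, text, search_type="dfs_query_then_fetch"):
    """Send the request and return the decoded JSON response."""
    request = build_search_request(index, doc_type, text, search_type)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_search_request("z_test", "doc", "测试")
print(request.full_url)
```

Calling search("z_test", "doc", "测试") against a running node would return the same kind of response as the console request; passing search_type="query_then_fetch" switches back to the default behaviour for comparison.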
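The mapping below sets index_options to docs on the name field, so only document numbers are indexed for it, without term frequencies or positions:
PUT /my_index
{
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "text",
          "index_options": "docs"
        }
      }
    }
  }
}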
Original post: https://www.cnblogs.com/MRLL/p/12692950.html