
Elasticsearch queries

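Elasticsearch queries are very flexible and rather complex, and they depend on domain-specific knowledge. I worked all of this out a few weeks ago, did not write it down, and forgot it again. When investigating, the key is to have a rich set of data to query: testing queries while reading the documentation makes everything much easier to understand.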

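There are quite a few details. A query has to fit the mapping of the index it targets: how a text field is indexed determines which query types it supports. The query syntax is also very flexible: it offers both full-text (inexact) and exact matching, and several queries can be combined with a bool query.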
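As a rough illustration of combining queries with bool, the sketch below builds a request body in Python that filters on an exact term and scores by a full-text match. The field names status and title and their values are made up for illustration; the body only relies on the bool, filter, must, term and match keys of the query DSL.

```python
import json


def bool_query(exact_field, exact_value, text_field, text):
    """Build a bool query: filter by an exact term, score by a full-text match."""
    return {
        "query": {
            "bool": {
                # filter clauses must match but do not contribute to the score
                "filter": [{"term": {exact_field: {"value": exact_value}}}],
                # must clauses must match and do contribute to the score
                "must": [{"match": {text_field: {"query": text}}}],
            }
        }
    }


# Hypothetical field names and values, for illustration only.
body = bool_query("status", "published", "title", "ice wine")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after a GET <index>/_search line in the Console to try it against real data.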

match query vs. term query

Names such as term query and match query refer to the key used in the query body: a term query has a term key under query, and a match query has a match key.

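A term query does exact matching; a match query does full-text matching on analyzed text. The name "match query" is misleading: it sounds like an exact match, but it is actually an inexact full-text search.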
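A minimal sketch of that point: the two helpers below (hypothetical names, plain Python dicts) produce bodies that differ only in the key under query, and query_type reads that key back.

```python
def term_query(field, value):
    """Exact lookup: the value is sent as-is and is not analyzed."""
    return {"query": {"term": {field: {"value": value}}}}


def match_query(field, text):
    """Full-text lookup: the text is analyzed before matching."""
    return {"query": {"match": {field: {"query": text}}}}


def query_type(body):
    """Return the single key directly under "query", which names the query type."""
    (name,) = body["query"].keys()
    return name


print(query_type(term_query("user.id", "kimchy")))
print(query_type(match_query("message", "this is a test")))
```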

term query: exact matching

Term query

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html

A term query matches exactly, but the field it targets should not be of type text (unless the field is also indexed with a keyword sub-field).

Returns documents that contain an exact term in a provided field.

  • You can use the term query to find documents based on a precise value such as a price, a product ID, or a username.
  • Avoid using the term query for text fields.
  • By default, Elasticsearch changes the values of text fields as part of analysis. This can make finding exact matches for text field values difficult.
  • To search text field values, use the match query instead.

GET /_search
{
  "query": {
    "term": {
      "user.id": {
        "value": "kimchy",
        "boost": 1.0
      }
    }
  }
}

Avoid using the term query for text fields

To better search text fields, the match query also analyzes your provided search term before performing a search. This means the match query can search text fields for analyzed tokens rather than an exact term.

The term query does not analyze the search term. The term query only searches for the exact term you provide. This means the term query may return poor or no results when searching text fields.
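The effect can be imitated with a toy inverted index in plain Python. This is only a sketch: analyze is a crude stand-in for the real analyzer (lowercase, split on non-alphanumeric characters), the second document is made up, and nothing here calls Elasticsearch.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase, then split on anything that is not a-z or 0-9."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def build_index(docs):
    """Map each analyzed token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_search(index, value):
    """term: look the value up exactly as given, without analysis."""
    return index.get(value, set())


def match_search(index, text):
    """match: analyze the input first, then OR the per-token hits together."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


docs = {1: "Wine - Ice Wine", 2: "Red Wine"}  # document 2 is invented
index = build_index(docs)

print(term_search(index, "Wine - Ice Wine"))  # no single token equals the whole string
print(term_search(index, "wine"))             # a lowercase token does exist
print(match_search(index, "Ice Wine"))        # analyzed into "ice" and "wine"
```

The first lookup finds nothing because the index only holds the tokens wine, ice and red, which is the same reason a term query on a text field tends to return poor or no results.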

keyword sub-field on a text field

Adding a keyword sub-field to a text field lets a term query match the whole field value, although the extra index obviously takes additional space.

match and term giving different results on text field elasticsearch

https://stackoverflow.com/questions/58254201/match-and-term-giving-different-results-on-text-field-elasticsearch

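A term query against a text field is compared with the analyzed tokens, so it usually does not match the whole original value. To make a term query match the complete value, define an additional keyword sub-field in the mapping and point the query at that sub-field.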

Term queries are to be used on keyword fields in order to get the exact match.

Note that the query targets name.keyword rather than name:

GET /product/_search
{
  "from": 0,
  "size": 1000,
  "query": {
    "term": {
      "name.keyword": {
        "value": "Wine - Ice Wine"
      }
    }
  }
}

fields

https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html

It is often useful to index the same field in different ways for different purposes. This is the purpose of multi-fields. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations:

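The same field can be defined more than once in the mapping to support different query needs, for example by adding a keyword sub-field to support aggregations or exact keyword queries.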

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "city": {
        "type": "text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

PUT my-index-000001/_doc/1
{
  "city": "New York"
}

PUT my-index-000001/_doc/2
{
  "city": "York"
}

GET my-index-000001/_search
{
  "query": {
    "match": {
      "city": "york"
    }
  },
  "sort": {
    "city.raw": "asc"
  },
  "aggs": {
    "Cities": {
      "terms": {
        "field": "city.raw"
      }
    }
  }
}
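To make the roles of city and city.raw concrete, here is a toy re-enactment of the search above in plain Python. It is a sketch under simplifying assumptions: the text field is analyzed by lowercasing and splitting on whitespace, the raw sub-field keeps the original string untouched, and the function name match_city is invented.

```python
from collections import Counter

docs = {1: {"city": "New York"}, 2: {"city": "York"}}


def text_tokens(value):
    """Toy analysis for the text field "city": lowercase and split on whitespace."""
    return value.lower().split()


def match_city(query):
    """match on "city": a document hits if it shares any analyzed token with the query."""
    wanted = set(text_tokens(query))
    return [doc_id for doc_id, doc in docs.items() if wanted & set(text_tokens(doc["city"]))]


hits = match_city("york")

# sort on "city.raw": order by the untouched original string, ascending
sorted_hits = sorted(hits, key=lambda doc_id: docs[doc_id]["city"])

# terms aggregation on "city.raw": one bucket per distinct original string
buckets = Counter(docs[doc_id]["city"] for doc_id in hits)

print(sorted_hits)
print(dict(buckets))
```

Both documents match because each contains the token york, while sorting and bucketing use the whole strings "New York" and "York".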

match query: full-text matching

Again, the name is misleading: match does not mean exact match, it means an inexact full-text search over analyzed text.

Full text queries

https://www.elastic.co/guide/en/elasticsearch/reference/current/full-text-queries.html

This page lists the queries that make up the full-text query family.

The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing.

  • match query The standard query for performing full text queries, including fuzzy matching and phrase or proximity queries.

  • match_phrase query Like the match query but used for matching exact phrases or word proximity matches.

  • multi_match query The multi-field version of the match query.

  • query_string query Supports the compact Lucene query string syntax, allowing you to specify AND|OR|NOT conditions and multi-field search within a single query string. For expert users only.

  • intervals query A full text query that allows fine-grained control of the ordering and proximity of matching terms.

match query

Match query

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html

Returns documents that match a provided text, number, date or boolean value. The provided text is analyzed before matching.

The match query is the standard query for performing a full-text search, including options for fuzzy matching.

GET /_search
{
  "query": {
    "match": {
      "message": {
        "query": "this is a test"
      }
    }
  }
}

  • boost: relevance weighting parameter

(Optional, float) Floating point number used to decrease or increase the relevance scores of the query. Defaults to 1.0.

Boost values are relative to the default value of 1.0. A boost value between 0 and 1.0 decreases the relevance score. A value greater than 1.0 increases the relevance score.

  • fuzziness: maximum edit distance

(Optional, string) Maximum edit distance allowed for matching. See Fuzziness for valid values and more information. See Fuzziness in the match query for an example.
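To get a feel for what an edit distance is, the sketch below computes the plain Levenshtein distance (insertions, deletions, substitutions) in Python and builds a match body with a fuzziness value. It is an approximation for intuition only: Elasticsearch's own handling, for example of transposed characters, may differ, and the helper names are invented.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def fuzzy_match_query(field, text, fuzziness):
    """Build a match query body with the fuzziness parameter set."""
    return {"query": {"match": {field: {"query": text, "fuzziness": fuzziness}}}}


print(edit_distance("kimchy", "kimchi"))
print(edit_distance("test", "tset"))
print(fuzzy_match_query("user.id", "kimchi", "1"))
```

With one substitution between them, "kimchi" is within distance 1 of "kimchy", so a fuzziness of 1 would be enough to tolerate that typo under this simplified model.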

Match phrase query

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase.html

A phrase query matches terms up to a configurable slop (which defaults to 0) in any order. Transposed terms have a slop of 2.

GET /_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "this is a test",
        "analyzer": "my_analyzer"
      }
    }
  }
}

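With slop left at its default of 0, a match_phrase query requires the analyzed terms to appear next to each other and in order, which gets reasonably close to matching the whole text exactly.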
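One way to build intuition for slop is the simplified model below: for each query term, take the difference between its position in the document and its position in the phrase, and call the spread of those differences the slop the match needs. This is an assumption-laden sketch, not Lucene's implementation; it ignores repeated query terms, and the function name required_slop is invented. It does agree with the two facts quoted above: an exact phrase needs slop 0 and transposed terms need slop 2.

```python
from itertools import product


def required_slop(query_terms, doc_terms):
    """Smallest slop at which this toy model matches, or None if a term is missing."""
    offsets_per_term = []
    for q_pos, term in enumerate(query_terms):
        # how far each occurrence in the document sits from where the phrase wants it
        offsets = [d_pos - q_pos for d_pos, t in enumerate(doc_terms) if t == term]
        if not offsets:
            return None
        offsets_per_term.append(offsets)
    # try every choice of occurrences and keep the tightest spread
    return min(max(combo) - min(combo) for combo in product(*offsets_per_term))


print(required_slop("this is a test".split(), "this is a test".split()))
print(required_slop("is this".split(), "this is".split()))
print(required_slop("quick fox".split(), "quick brown fox".split()))
```

A document matches a match_phrase query in this model when required_slop is not None and is at most the slop given in the query.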
