我爱海鲸

Common Elasticsearch operations

我爱海鲸 2024-11-26 23:42:22

Overview: Elasticsearch and Kibana

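1. CRUD with the Elasticsearch API:

Create an index
PUT /lib/
{
  "settings":{
    "index":{
      "number_of_shards":3,     // number of primary shards
      "number_of_replicas":0    // number of replica copies
    }
  }
}

Alternatively, create an index with default settings:  PUT lib2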

The response is:
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "lib2"
}
which means the index was created successfully.



GET /lib/_settings    View the settings of one index


GET _all/_settings   View the settings of all indices

Create:
Add a document to a given index. With PUT, the id must be specified.
PUT /lib/user/1
{ "first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" :"I like to collect rock albums",
"interests": [ "music" ]
}


With POST the id can be omitted and is generated automatically
POST /lib/user/
{"first_name" : "Douglas",
"last_name":"Fir",
"age" :23,
"about":"I like to build cabinets",
"interests": [ "forestry" ]
}

Read:
GET /lib/user/1  Get the document with id 1 of type user in the lib index
GET /lib/user/nsrgZXABkJVXv6_Qvv_m   Get a document by its auto-generated id
GET /lib/user/1?_source=age,about   Return only the given fields


Update:
Use PUT to update by replacing the whole document
PUT /lib/user/1
{ "first_name" : "Jane",
"last_name" : "Smith",
"age" : 36,
"about" :"I like to collect rock albums",
"interests": [ "music" ]
}

Use POST with _update to change only the given fields
POST /lib/user/1/_update
{ 
  "doc":{
    "age":33
  }
}

Delete:
DELETE /lib/user/1   Delete the document with id 1
DELETE lib2  Delete the index

2. Filter queries

A filter does not compute relevance scores and its results can be cached, so a filter is generally faster than a scoring query.
POST /lib4/items/_bulk
{"index":{"_id":1}}
{"price":40,"itemID":"ID100123"}
{"index":{"_id":2}}
{"price":50,"itemID":"ID100124"}
{"index":{"_id":3}}
{"price":25,"itemID":"ID100125"}
{"index":{"_id":4}}
{"price":30,"itemID":"ID100126"}
{"index":{"_id":5}}
{"price":null,"itemID":"ID100127"}

GET /lib4/items/_search
{
  "query":{
    "bool":{
      "filter":[
        {"term":{"price":25}}
      ]
    }
  }
}

GET /lib4/items/_search
{
  "query":{
    "bool":{
      "filter":[
        {"terms":{"price":[25,40]}}
      ]
    }
  }
}


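The following query returns no hits. Under dynamic mapping, itemID is mapped as text, so the value is analyzed and lowercased by default; the indexed term is id100123, not ID100123.
GET /lib4/items/_search
{
  "query":{
    "bool":{
      "filter":[
        {"terms":{"itemID":["ID100123"]}}
      ]
    }
  }
}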
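Why the uppercase value misses can be illustrated outside Elasticsearch. The sketch below is plain Python, not the Elasticsearch API: `analyze` is only a rough stand-in for the standard analyzer, and the helper names are made up for this example.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def build_inverted_index(docs, field):
    """Map each indexed term to the set of document ids that contain it."""
    index = {}
    for doc_id, doc in docs.items():
        for term in analyze(doc[field]):
            index.setdefault(term, set()).add(doc_id)
    return index

def term_lookup(index, value):
    """A term-style lookup uses the value as-is, with no analysis."""
    return index.get(value, set())

docs = {1: {"itemID": "ID100123"}, 2: {"itemID": "ID100124"}}
index = build_inverted_index(docs, "itemID")

print(term_lookup(index, "ID100123"))  # no match: the indexed term is lowercase
print(term_lookup(index, "id100123"))  # matches document 1
```

The same reasoning explains why a lowercase value matches an analyzed text field while the original uppercase value only matches a field that is indexed without analysis.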

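Check the mapping, then delete the index and recreate it with itemID mapped as keyword, so the value is indexed as a single unanalyzed term (re-index the documents afterwards):
GET lib4/_mapping

DELETE lib4


PUT lib4
{
  "mappings":{
    "items":{
      "properties":{
        "itemID":{
          "type":"keyword"
        }
      }
    }
  }
}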


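bool filter query
Combines several filter conditions.
Format:
{"bool":{"must":[],"should":[],"must_not":[]}}
must: conditions that must match (AND)
should: conditions that may or may not match (OR)
must_not: conditions that must not match (NOT)
Note: the lowercase id100123 in the examples below matches the analyzed terms of a text field; against a keyword field, use the original ID100123.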

GET /lib4/items/_search
{
  "query":{
    "bool":{
        "should":[
          {"term":{"price":25}},
          {"term":{"itemID":"id100123"}}
        ],
        "must_not":{
          "term":{"price":30}
        }
    }
  }
}

GET /lib4/items/_search
{
  "query":{
    "bool":{
      "should": [
        {"term":{"itemID": "id100123"}},
        {
          "bool":{
            "must": [
              {"term": {"itemID": "id100124"}},
              {"term": {"price": 40}}
            ]
          }
        }
      ]
    }
  }
}
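The two bool queries above can be traced by hand with a small in-memory sketch. This is plain Python, not the Elasticsearch API; the helpers `term` and `bool_filter` are hypothetical, and the itemID values are written in lowercase on the assumption that they were indexed through a text field. It also assumes the default rule that at least one should clause must match when there is no must clause.

```python
def term(field, value):
    """Return a predicate that is true when doc[field] equals value."""
    return lambda doc: doc.get(field) == value

def bool_filter(must=(), should=(), must_not=()):
    """Combine predicates: all of must, none of must_not, and at least one
    should clause when no must clause is given."""
    def matches(doc):
        if not all(p(doc) for p in must):
            return False
        if any(p(doc) for p in must_not):
            return False
        if should and not must:
            return any(p(doc) for p in should)
        return True
    return matches

items = [
    {"price": 40, "itemID": "id100123"},
    {"price": 50, "itemID": "id100124"},
    {"price": 25, "itemID": "id100125"},
    {"price": 30, "itemID": "id100126"},
    {"price": None, "itemID": "id100127"},
]

# price == 25 OR itemID == id100123, AND NOT price == 30
first = bool_filter(
    should=[term("price", 25), term("itemID", "id100123")],
    must_not=[term("price", 30)],
)
hits = [d["itemID"] for d in items if first(d)]

# itemID == id100123 OR (itemID == id100124 AND price == 40)
second = bool_filter(
    should=[
        term("itemID", "id100123"),
        bool_filter(must=[term("itemID", "id100124"), term("price", 40)]),
    ]
)
nested_hits = [d["itemID"] for d in items if second(d)]

print(hits)
print(nested_hits)
```

In the second query the nested bool never matches, because the item with id100124 costs 50 rather than 40.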

Range filter
gt:>
lt:<
gte:>=
lte:<=

GET /lib4/items/_search
{
  "query":{
    "bool":{
      "filter":{
        "range":{
          "price":{
            "gt":25,
            "lt":50
          }
        }
      }
    }
  }
}

GET lib4/items/_search
{
  "query":{
    "bool":{
      "filter":{
        "exists": {
          "field": "price"
        }
      }
    }
  }
}

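3. Mapping and queries:

GET /lib/user/_mapping   Get the mapping

What a mapping does: it defines the properties of each field, including whether the field is analyzed.

Properties (several of these, such as "type":"string", "index":"not_analyzed" and include_in_all, are syntax from older Elasticsearch versions):
"store":false // whether to store the field value separately from _source; default false, in which case the field can be searched but its value cannot be retrieved on its own
"index":true // whether the field is indexed; when set to false the field is not indexed and cannot be searched
"analyzer":"ik" // analyzer to use; the default is the standard analyzer
"boost":123 // field-level score boost; default 1.0
"doc_values":false // enabled by default for not_analyzed fields and unavailable for analyzed fields; greatly speeds up sorting and aggregations and saves memory
"fielddata":{"format":"disabled"} // for analyzed fields, improves performance when sorting or aggregating; for non-analyzed fields doc_values is recommended instead
"fields":{"raw":{"type":"string","index":"not_analyzed"}} // index the same field value in more than one way, for example one analyzed and one not analyzed
"ignore_above":100 // strings longer than 100 characters are ignored and not indexed
"include_in_all":true // whether the field is included in the _all field; default true unless index is set to no
"index_options":"docs" // four options: docs (document number), freqs (document number + term frequency), positions (document number + term frequency + position, typically for proximity queries), offsets (document number + term frequency + position + offsets, typically for highlighting); analyzed fields default to positions, all others to docs
"norms":{"enabled":true,"loading":"lazy"} // default for analyzed fields; non-analyzed fields default to "enabled":false; stores the length normalization factor and index-time boost; recommended for fields that take part in scoring, at the cost of extra memory
"null_value":"NULL" // value indexed in place of a missing field; only usable on string fields, and on an analyzed field the null value is analyzed too
"position_increment_gap":0 // affects proximity and phrase queries; can be set on multi-valued or analyzed fields; a slop can be given at query time; default 100
"search_analyzer":"ik" // analyzer used at search time; defaults to the same as analyzer; for example, index with standard + ngram and search with standard to implement autocomplete
"similarity":"BM25" // scoring algorithm used for the field
"term_vector":"no" // term vectors are not stored by default; supported values are yes (terms), with_positions (terms + positions), with_offsets (terms + offsets), with_positions_offsets (terms + positions + offsets); speeds up the fast vector highlighter but increases index size, so it is not suited to large data volumes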



// Create the sample data
PUT /myindex/articel/1
{
  "post_date":"2018-05-10",
  "title":"Java",
  "content":"java is the best language",
  "author_id":119
}

PUT /myindex/articel/2
{
  "post_date":"2018-05-12",
  "title":"html",
  "content":"I like html",
  "author_id":120
}

PUT /myindex/articel/3
{
  "post_date":"2018-05-16",
  "title":"es",
  "content":"Es is disbuted document store",
  "author_id":110
}


Query the data:
GET myindex/articel/_search

# no hits
GET myindex/articel/_search?q=post_date:2018
# returns a hit
GET myindex/articel/_search?q=post_date:2018-05-16

GET myindex/articel/_search?q=content:is

4. match queries:

A match query is analyzer-aware: it analyzes the query text with the field's analyzer and then searches for the resulting terms.
GET /lib3/user/_search
{
  "query":{
    "match":{
      "name":"zhaoliu zhaoming"
    }
  }
}

GET lib3/user/_search
{
  "query":{
    "match":{
      "age":20
    }
  }
}

match_all: return all documents
GET /lib3/user/_search
{
  "query":{
    "match_all": {}
  }
}

GET /lib3/user/_search
{
  "query":{
    "match_all": {}
  }
  , "_source": {
    "includes": ["name","address"],
    "excludes": ["age","brithday"]
  }
}

Wildcards in field names:
GET /lib3/user/_search
{
  "query":{
    "match_all": {}
  }
  , "_source": {
    "includes": ["addr*"],
    "excludes": ["age","bri*"]
  }
}

Sorting:
GET /lib3/user/_search
{
  "query":{
    "match_all": {}
  },
   "sort": [
     {
       "age": {
         "order": "desc"
       }
     }
   ]
}

multi_match: search several fields at once
GET lib3/user/_search
{
  "query":{
    "multi_match": {
      "query": "lvyou",
      "fields": ["interests","name"]
    }
  }
}

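match_phrase: phrase query
Elasticsearch first analyzes the query string and builds a phrase query from the analyzed text, so every term in the phrase must match and the terms must keep their relative positions.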
GET lib3/user/_search
{
  "query":{
    "match_phrase": {
      "interests": "duanlian,lvyou"
    }
  }
}
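The phrase rule (all terms present, in the same order, next to each other) can be sketched in a few lines. This is plain Python for illustration, not the Elasticsearch API: `analyze` only approximates the standard analyzer, `phrase_match` is a hypothetical helper, and the sketch assumes a slop of 0.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def phrase_match(field_text, phrase):
    """True when the analyzed phrase terms occur consecutively and in order."""
    tokens = analyze(field_text)
    wanted = analyze(phrase)
    n = len(wanted)
    if n == 0:
        return False
    return any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))

interests = "xi huan hejiu,duanlian,lvyou"

print(phrase_match(interests, "duanlian,lvyou"))  # adjacent and in order
print(phrase_match(interests, "lvyou,duanlian"))  # same terms, wrong order
print(phrase_match(interests, "hejiu lvyou"))     # both present, not adjacent
```

Only the first call matches, which mirrors why the match_phrase query above finds the user whose interests end with duanlian,lvyou.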

Return only selected fields:
GET lib3/user/_search
{
  "_source": ["address","name"],
  "query":{
    "match":{
      "interests":"changge"
    }
  }
}

Prefix match query
GET lib3/user/_search
{
  "_source": ["address","name"],
  "query":{
    "match_phrase_prefix":{
      "name":{
        "query":"zhao"
        
      }
    }
  }
}


Range query:
GET lib3/user/_search
{
  "query":{
    "range":{
      "brithday":{
        "from":"1990-10-10",
        "to":"2018-05-01"
      }
    }
  }
}

GET lib3/user/_search
{
  "query":{
    "range":{
      "age":{
        "from":"20",
        "to":"25",
        "include_lower":true,
        "include_upper":false
      }
    }
  }
}

wildcard query:
Supports the wildcards * and ? in the query term
* matches zero or more characters
? matches exactly one character
GET /lib3/user/_search
{
  "query":{
    "wildcard": {
      "name": "zhao*"
    }
  }
}

GET /lib3/user/_search
{
  "query":{
    "wildcard": {
      "name": "li?i"
    }
  }
}
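The two wildcard patterns above can be tried against the sample names with Python's standard `fnmatch` module, which uses the same meaning for * and ?. This is only an illustration of the pattern semantics, not of how Elasticsearch evaluates the query; `fnmatch` additionally treats [...] as a character class, which the wildcard query does not, and the helper name `wildcard` is made up.

```python
from fnmatch import fnmatchcase

def wildcard(terms, pattern):
    """Return the terms that match a pattern using * and ? (case-sensitive)."""
    return [t for t in terms if fnmatchcase(t, pattern)]

names = ["zhaoliu", "zhaoming", "lisi", "wangwu", "zhangsan"]

print(wildcard(names, "zhao*"))  # names starting with zhao
print(wildcard(names, "li?i"))   # four letters: l, i, any character, i
```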

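fuzzy: approximate (fuzzy) matching
value: the term to search for
boost: weight of the query; default 1.0
min_similarity: minimum similarity for a match, default 0.5; for strings a value between 0 and 1 inclusive, for numbers it may exceed 1, for dates a value such as 1d or 1m, where 1d means one day (an option from older versions; later releases use fuzziness instead)
prefix_length: length of the leading prefix that must match exactly; default 0
max_expansions: maximum number of terms the query may expand to; these notes describe it as unlimited by default, while current versions default to 50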
GET /lib3/user/_search
{
  "query":{
    "fuzzy":{
      "interests":"changge"
    }
  }
}

GET /lib3/user/_search
{
  "query":{
    "fuzzy":{
      "interests":{
        "value": "changge"
      }
    }
  }
}
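Fuzzy matching is based on edit distance between terms. The sketch below uses plain Levenshtein distance in Python; Elasticsearch by default also counts a transposition of two adjacent characters as one edit, which this simplified version does not. The function names and the `max_edits` parameter are illustrative only.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free when equal
            ))
        prev = cur
    return prev[-1]

def fuzzy(terms, value, max_edits=2, prefix_length=0):
    """Terms within max_edits of value that share its first prefix_length characters."""
    return [
        t for t in terms
        if t[:prefix_length] == value[:prefix_length]
        and edit_distance(t, value) <= max_edits
    ]

terms = ["hejiu", "duanlian", "changge", "lvyou", "tiaowu"]

print(edit_distance("chagge", "changge"))     # one missing letter
print(fuzzy(terms, "chagge"))                  # the misspelling still finds changge
print(fuzzy(terms, "chagge", prefix_length=3))
```

A longer `prefix_length` narrows the candidate terms before any distance is computed, which is why it is often used to keep fuzzy queries cheap.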

Highlight search results (matched terms are wrapped in a tag for emphasis):
GET /lib3/user/_search
{
  "query":{
    "match": {
      "interests": "changge"
    }
  },
  "highlight":{
    "fields": {
      "interests": {}
    }
  }
}

5. Fetching documents in bulk with the Multi Get API:

Fetch selected fields of several documents:
GET /_mget
{
  "docs":[
    {
      "_index": "lib",
      "_type": "user",
      "_id": 1,
      "_source": "interests"
    },
    {
      "_index": "lib",
      "_type": "user",
      "_id": 2,
      "_source": "interests"
    }
  ]
}


Fetch documents from the same index and type:
GET /lib/user/_mget
{
  "docs":[
      {
        "_id":1
      },
      {
        "_type": "user",
        "_id": 2
      }
    ]
}

Shorter form:
GET /lib/user/_mget
{
  "ids":["1","2","3"]
}

6. Query examples:

Prepare the data:
PUT /lib3
{
  "settings":{
    "number_of_shards":3,
    "number_of_replicas":0
  },
  "mappings":{
    "user":{
      "properties":{
        "name":{"type":"text"},
        "address":{"type":"text"},
        "age":{"type":"integer"},
        "interests":{"type":"text"},
        "brithday":{"type":"date"}
      }
    }
  }
}

PUT /lib3/user/1
{
  "name":"zhaoliu",
  "address":"hei long jiang sheng tie ling shi",
  "age":50,
  "brithday":"1970-12-12",
  "interests":"xi huan hejiu,duanlian,lvyou"
}

PUT /lib3/user/2
{
  "name":"zhaoming",
  "address":"bei jing hai dian qu qing he zhen",
  "age":20,
  "brithday":"1998-12-12",
  "interests":"xi huan hejiu,duanlian,changge"
}

PUT /lib3/user/3
{
  "name":"lisi",
  "address":"bei jing hai dian qu qing he zhen",
  "age":23,
  "brithday":"1998-12-12",
  "interests":"xi huan hejiu,duanlian,changge"
}

PUT /lib3/user/4 
{
  "name":"wangwu",
  "address":"bei jing hai dian qu qing he zhen",
  "age":26,
  "brithday":"1998-12-12",
  "interests":"xi huan hejiu,duanlian,changge"
}

PUT /lib3/user/5
{
  "name":"zhangsan",
  "address":"bei jing hai dian qu qing he zhen",
  "age":29,
  "brithday":"1998-12-12",
  "interests":"xi huan hejiu,duanlian,changge,tiaowu"
}

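Query:
# max_score: the highest relevance score among the matching documents
GET /lib3/user/_search?q=name:lisi

Find documents whose interests contain changge, sorted by age in descending order
GET /lib3/user/_search?q=interests:changge&sort=age:desc

term and terms queries
A term query looks up the exact term in the inverted index and is unaware of analyzers, which makes it suitable for keyword, numeric, and date fields.
term: find documents whose field contains the given term
GET /lib3/user/_search
{
  "query":{"term":{"interests":"changge"}}
}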

Limit the number of results returned:
from: offset of the first hit to return; size: number of hits to return
GET /lib3/user/_search
{
  "from":0,"size":2,"query":{"terms": {
    "interests": [
      "hejiu",
      "changge"
    ]
  }}
}

Return the version number of each hit:
GET /lib3/user/_search
{
  "from":0,"size":2,
  "version":true,
  "query":{"terms": {
    "interests": [
      "hejiu",
      "changge"
    ]
  }}
}

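7. Version control and concurrency:

Every create, update, or delete increments the document's version by one. Passing ?version=N makes the write succeed only if the document's current version is N.
PUT /lib/user/4?version=2
{ "first_name" : "Jane",
"last_name" : "Smith",
"age" : 38,
"about" :"I like to collect rock albums",
"interests": [ "music" ]
}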

External versioning: the supplied version must be greater than the document's current version
PUT /lib/user/4?version=4&version_type=external
{ "first_name" : "Jane",
"last_name" : "Smith",
"age" : 37,
"about" :"I like to collect rock albums",
"interests": [ "music" ]
}
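The two version checks can be modelled with a small in-memory store. This is a plain Python sketch of the rules as these notes describe them (internal: the supplied version must equal the current one; external: it must be greater), not the Elasticsearch client; `index_doc` and `VersionConflict` are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when the supplied version fails the check."""

def index_doc(store, doc_id, source, version=None, version_type="internal"):
    """Write a document, applying an internal or external version check."""
    current = store[doc_id]["_version"] if doc_id in store else 0
    if version_type == "external":
        if version is None:
            raise ValueError("external versioning needs an explicit version")
        if version <= current:
            raise VersionConflict(f"{version} is not greater than {current}")
        new_version = version          # the caller's version is stored as-is
    else:
        if version is not None and version != current:
            raise VersionConflict(f"expected {version}, current is {current}")
        new_version = current + 1      # every write bumps the version by one
    store[doc_id] = {"_version": new_version, "_source": source}
    return new_version

store = {}
index_doc(store, 4, {"age": 36})                 # creates version 1
index_doc(store, 4, {"age": 37})                 # version 2
index_doc(store, 4, {"age": 38}, version=2)      # check passes, version 3
try:
    index_doc(store, 4, {"age": 39}, version=2)  # stale version, rejected
except VersionConflict as err:
    print("conflict:", err)
index_doc(store, 4, {"age": 37}, version=4, version_type="external")
print(store[4]["_version"])
```

A client that hits a conflict would typically re-read the document, reapply its change, and retry with the new version.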

8. Searching across multiple indices and types:

GET /lib/user/4

GET _search

GET /lib/_search

GET /lib,lib3/_search

GET /*3,*4/_search

GET /lib/user/_search

GET /lib,lib4/user,items/_search

GET /_all/_search

GET /_all/user,items/_search

9. Pagination:

from: offset of the first document to return
size: number of documents to return
GET /lib/user/_search?from=0&size=3
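A small sketch of how a page number maps to from and size. The helper name `page_params` and the 1-based page convention are assumptions for illustration; the 10,000 limit reflects the default value of index.max_result_window.

```python
MAX_RESULT_WINDOW = 10_000  # default index.max_result_window

def page_params(page, page_size):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    offset = (page - 1) * page_size
    if offset + page_size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {"from": offset, "size": page_size}

print(page_params(1, 3))  # first page, same as from=0&size=3 above
print(page_params(3, 3))  # third page starts at offset 6
```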

10. Compound queries:

 GET /lib3/user/_search
{
  "query":{
    "bool":{
      "must":{"match":{"interests":"changge"}},
      "must_not":{"match":{"interests":"lvyou"}},
      "should": [
        {"match": {
          "address": "bei jing"
        }},
        {"range":{"brithday": {
          "gte": "1996-01-01"
        }}}
      ]
    }
  }
}


GET /lib3/user/_search
{
  "query":{
    "bool":{
      "must":{"match":{"interests":"changge"}},
      "must_not":{"match":{"interests":"lvyou"}},
      "should": [
        {"match": {
          "address": "bei jing"
        }}
      ],
      "filter": {
        "range": {"brithday": {
          "gte": "1996-01-01"
        }}
      }
    }
  }
}

GET /lib3/user/_search
{
  "query":{
    "bool":{
      "must":{"match":{"interests":"changge"}},
      "must_not":{"match":{"interests":"lvyou"}},
      "should": [
        {"match": {
          "address": "bei jing"
        }}
      ],
      "filter": {
        "bool":{
          "must":[
            {"range":{"brithday":{"gte":"1990-01-01"}}},
            {"range":{"age":{"lte":30}}}
            ],
            "must_not":[
              {"term":{"age":"29"}}
              ]
        }
      }
    }
  }
}

constant_score query
It applies a fixed, constant score to every matching document, and is commonly used when you only need to run a filter and no scoring query.
GET /lib3/user/_search
{
  "query":{
    "constant_score": {
      "filter": {
        "term":{"interests":"changge"}
      }
    }
  }
}
Placing the term query inside constant_score turns it into a non-scoring filter; this form can replace a bool query that contains only a filter clause.

11. Partial updates with scripts (Groovy in the original notes; recent versions default to Painless):

GET /lib/user/4

POST /lib/user/4/_update
{
  "script":"ctx._source.age+=1"
}

POST /lib/user/4/_update
{
  "script":"ctx._source.last_name+='hehe'"
}

POST /lib/user/4/_update
{
  "script":{
    "source":"ctx._source.interests.add(params.tag)",
    "params":{
      "tag":"football"
    }
  }
}

POST /lib/user/4/_update
{
  "script":{
    "source":"ctx._source.interests.remove(ctx._source.interests.indexOf(params.tag))",
    "params":{
      "tag":"football"
    }
  }
}


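Delete the document conditionally (the script sets ctx.op to delete when age equals params.count, otherwise to none)
POST /lib/user/4/_update
{
  "script":{
    "source":"ctx.op=ctx._source.age==params.count?'delete':'none'",
    "params":{
      "count":22
    }
  }
}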


upsert: if the document does not exist it is initialized from the upsert body; if it exists, the script "ctx._source.age+=1" is executed
POST /lib/user/4/_update
{
  "script":"ctx._source.age+=1",
  "upsert":{
    "first_name":"Jame",
    "last_name":"Lucy",
    "age":20,
    "about":"I like to collect rock albums",
    "interests":[
      "music"
    ]
  }
}
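The upsert behaviour can be summarized in a short in-memory sketch: the first call inserts the upsert body without running the script, and later calls run the script on the stored document. This is plain Python, not the Elasticsearch API; `update_with_upsert` is a hypothetical helper and the script is modelled as an ordinary function.

```python
def update_with_upsert(store, doc_id, script, upsert):
    """Insert the upsert body if the document is missing, otherwise run the script."""
    if doc_id not in store:
        store[doc_id] = dict(upsert)  # the script is not applied on first insert
        return "created"
    script(store[doc_id])
    return "updated"

def increment_age(source):
    """Stand-in for the script ctx._source.age += 1."""
    source["age"] += 1

store = {}
upsert_body = {"first_name": "Jame", "last_name": "Lucy", "age": 20}

print(update_with_upsert(store, 4, increment_age, upsert_body), store[4]["age"])
print(update_with_upsert(store, 4, increment_age, upsert_body), store[4]["age"])
```

The first line reports a created document with age 20, the second an updated one with age 21.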

12. Aggregations:

Sum of all item prices
GET /lib4/items/_search
{
  "size":0,
 "aggs":{
   "price_of_sum":{
     "sum":{
       "field":"price"
     }
   }
 }
}

Minimum of all item prices
GET /lib4/items/_search
{
  "size":0,
 "aggs":{
   "price_of_sum":{
     "min":{
       "field":"price"
     }
   }
 }
}

Average of all item prices
GET /lib4/items/_search
{
  "size":0,
 "aggs":{
   "price_of_sum":{
     "avg":{
       "field":"price"
     }
   }
 }
}


Number of distinct item prices (cardinality)
GET /lib4/items/_search
{
  "size":0,
 "aggs":{
   "price_of_sum":{
     "cardinality":{
       "field":"price"
     }
   }
 }
}

Group items by price
GET /lib4/items/_search
{
  "size":0,
 "aggs":{
   "price_of_group":{
     "terms":{
       "field":"price"
     }
   }
 }
}
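The aggregations above can be checked by hand against the five sample items. This sketch is plain Python, not the Elasticsearch API; it assumes documents with a null price are skipped, which is how missing values are treated by default, and `price_aggs` is a made-up helper. The real cardinality aggregation is approximate on large data sets, while this one is exact.

```python
from collections import Counter

def price_aggs(prices):
    """Compute sum, min, avg, distinct count, and per-value buckets, skipping nulls."""
    values = [p for p in prices if p is not None]
    return {
        "sum": sum(values),
        "min": min(values),
        "avg": sum(values) / len(values),
        "cardinality": len(set(values)),
        "terms": dict(Counter(values)),  # value -> doc_count
    }

result = price_aggs([40, 50, 25, 30, None])
print(result["sum"], result["min"], result["avg"], result["cardinality"])  # 145 25 36.25 4
print(result["terms"])
```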

Group the users interested in singing (changge) by age, with buckets ordered by average age in descending order
GET /lib3/user/_search
{
  "size": 0, 
  "query":{
    "match": {
      "interests": "changge"
    }
  },
  "aggs":{
    "age_of_group":{
      "terms":{
        "field":"age"
        , "order": {
          "age_of_avg": "desc"
        }
      },
      "aggs":{
        "age_of_avg":{
          "avg": {
            "field": "age"
          }
        }
      }
    }

  }
}

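13. Batch operations with the Bulk API:

Format of a bulk request
{action:{metadata}}\n
{requestbody}\n

action: the operation to perform

create: create the document only if it does not already exist

update: update a document

index: create a new document or replace an existing one

delete: delete a document

metadata: _index, _type, _id

Difference between create and index:
If the document already exists, create fails with a document-already-exists error, whereas index succeeds.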

For example:

{"delete":{"_index":"lib","_type":"user","_id":"1"}}
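Because the bulk body is newline-delimited JSON rather than a single JSON document, it is easy to get wrong when built by hand. The following sketch assembles a body in the format described above using only Python's `json` module; `build_bulk_body` is a hypothetical helper, and sending the request is left out.

```python
import json

def build_bulk_body(actions):
    """Build an NDJSON bulk body from (action, metadata, body) tuples.

    delete takes no body line; every other action needs one.
    """
    lines = []
    for action, metadata, body in actions:
        lines.append(json.dumps({action: metadata}))
        if action != "delete":
            if body is None:
                raise ValueError(f"{action} requires a request body")
            lines.append(json.dumps(body))
    # every line, including the last, must end with a newline
    return "\n".join(lines) + "\n"

payload = build_bulk_body([
    ("delete", {"_index": "lib2", "_type": "books", "_id": "4"}, None),
    ("create", {"_index": "tt", "_type": "ttt", "_id": "100"}, {"name": "lisi"}),
    ("update", {"_index": "lib2", "_type": "books", "_id": "3"}, {"doc": {"price": 58}}),
])
print(payload, end="")
```

The delete action contributes one line and the other two contribute two lines each, so this payload has five lines in total.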

Bulk indexing:
POST /lib2/books/_bulk
{"index":{"_id":1}}
{"title":"java","price":55}
{"index":{"_id":2}}
{"title":"Html5","price":45}
{"index":{"_id":3}}
{"title":"Php","price":35}
{"index":{"_id":4}}
{"title":"Python","price":50}


Fetch the documents:
GET /lib2/books/_mget
{
  "ids":["1","2","3","4"]
}

Mixed bulk operations:
POST /lib2/books/_bulk
{"delete":{"_index":"lib2","_type":"books","_id":4}}
{"create":{"_index":"tt","_type":"ttt","_id":"100"}}
{"name":"lisi"}
{"update":{"_index":"lib2","_type":"books","_id":"3"}}
{"doc":{"price":58}}
