elasticsearch. 向导

Delete By Query API

该接口允许你通过执行一个查询来批量删除,可以通过简单query string参数或者是 Query DSL 的方式,下面是一个简单的例子:
The delete by query API allows to delete documents from one or more indices and one or more types based on a query. The query can either be provided using a simple query string as a parameter, or using the Query DSL defined within the request body. Here is an example:

$ curl -XDELETE 'http://localhost:9200/twitter/tweet/_query?q=user:kimchy'

$ curl -XDELETE 'http://localhost:9200/twitter/tweet/_query' -d '{
    "term" : { "user" : "kimchy" }
}
'

上面的两个例子都是删除twitter索引下面某个用户的操作,操作将返回如下信息:
Both above examples end up doing the same thing, which is delete all tweets from the twitter index for a certain user. The result of the commands is:

{
    "ok" : true,
    "_indices" : {
        "twitter" : { 
            "_shards" : {
                "total" : 5,
                "successful" : 5,
                "failed" : 0
            }
        }
    }
}

注,删除的时候还可以传递版本信息,另外,不推荐一次性删除大量的数据,大多数情况下,将数据重建到一个新的索引上会好一些
Note, delete by query bypasses versioning support. Also, it is not recommended to delete “large chunks of the data in an index”, many times, its better to simply reindex into a new index.

多索引和多类型(Multiple Indices and Types)

delete by query可以在一个索引或者多个索引上执行,同理一个类型或多个类型都是支持的。
The delete by query API can be applies to multiple types within an index, and across multiple indices. For example, we can delete all documents across all types within the twitter index:

$ curl -XDELETE 'http://localhost:9200/twitter/_query?q=user:kimchy'

删除指定类型的数据
We can also delete within specific types:

$ curl -XDELETE 'http://localhost:9200/twitter/tweet,user/_query?q=user:kimchy'

删除多个索引下的数据
We can also delete all tweets with a certain tag across several indices (for example, when each user has his own index):

$ curl -XDELETE 'http://localhost:9200/kimchy,elasticsearch/_query?q=tag:wow'

甚至索引索引下
Or even delete across all indices:

$ curl -XDELETE 'http://localhost:9200/_all/_query?q=tag:wow'

可接受的参数(Request Parameters)

可以接受参数q来传递查询条件,另外还有如下额外的参数可以配置
When executing a delete by query using the query parameter q, the query passed is a query string using Lucene query parser. There are additional parameters that can be passed:

Name Description
df 默认的字段,如果你的查询条件没有指定前缀的时候,默认使用这个字段作为限定符。The default field to use when no field prefix is defined within the query.
analyzer 处理query string的默认分析器。The analyzer name to be used when analyzing the query string.
default_operator 默认的操作符,可以是 ANDOR ,默认是 OR 。 The default operator to be used, can be AND or OR. Defaults to OR.

以下内容和delete接口类似,参照 Delete

Request Body

The delete by query can use the Query DSL within its body in order to express the query that should be executed and delete all documents. The body content can also be passed as a REST parameter named source.

Distributed

The delete by query API is broadcast across all primary shards, and from there, replicated across all shards replicas.

Routing

The routing value (a comma separated list of the routing values) can be specified to control which shards the delete by query request will be executed on.

Replication Type

The replication of the operation can be done in an asynchronous manner to the replicas (the operation will return once it has be executed on the primary shard). The replication parameter can be set to async (defaults to sync) in order to enable it.

Write Consistency

Control if the operation will be allowed to execute based on the number of active shards within that partition (replication group). The values allowed are one, quorum, and all. The parameter to set it is consistency, and it defaults to the node level setting of action.write_consistency which in turn defaults to quorum.

For example, in a N shards with 2 replicas index, there will have to be at least 2 active shards within the relevant partition (quorum) for the operation to succeed. In a N shards with 1 replica scenario, there will need to be a single shard active (in this case, one and quorum is the same).

 
Fork me on GitHub