Elasticsearch Advanced Features
2025/8/15, about a 2-minute read
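Prerequisites
Before reading this article, make sure you:
- Understand the basic concepts of Elasticsearch
- Are comfortable with basic query and indexing operations
- Are familiar with integrating Elasticsearch into Spring Boot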
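Analyzers
1. Built-in analyzers
Elasticsearch ships with several built-in analyzers:
- Standard Analyzer: the default; splits text on word boundaries and lowercases the tokens
- Simple Analyzer: splits on any non-letter character and lowercases the tokens
- Whitespace Analyzer: splits on whitespace only and does not lowercase
- Stop Analyzer: like the Simple Analyzer, but also removes stop words
- Pattern Analyzer: splits text using a regular expression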
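The difference between these analyzers is easiest to see on a concrete string. The sketch below is a plain-Java approximation of the Whitespace and Simple analyzers; it does not call Elasticsearch or Lucene, and the class name `BuiltInAnalyzerSketch` is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Plain-Java approximation of two built-in analyzers (not the Lucene implementation).
final class BuiltInAnalyzerSketch {

    // Whitespace Analyzer: split on whitespace, keep case and punctuation.
    static List<String> whitespace(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Simple Analyzer: split on any non-letter character, then lowercase.
    static List<String> simple(String text) {
        return Arrays.stream(text.split("[^\\p{L}]+"))
                .filter(token -> !token.isEmpty())
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "The 2 QUICK Brown-Foxes";
        System.out.println(whitespace(text)); // [The, 2, QUICK, Brown-Foxes]
        System.out.println(simple(text));     // [the, quick, brown, foxes]
    }
}
```

Note how the Simple approximation drops the digit and splits the hyphenated word, while the Whitespace one keeps both untouched.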
2. Custom analyzers
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": ["html_strip"],
          "filter": ["lowercase", "stop", "asciifolding"]
        }
      }
    }
  }
}
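A custom analyzer always runs in a fixed order: character filters first, then the tokenizer, then the token filters in the order listed. The following plain-Java sketch mimics that order for the configuration above. It is only an approximation under stated assumptions: the regexes stand in for `html_strip` and the `standard` tokenizer, the stop list is a small illustrative subset, and `CustomAnalyzerSketch` is a hypothetical name.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Approximates the pipeline html_strip -> standard -> lowercase -> stop -> asciifolding.
final class CustomAnalyzerSketch {

    // Illustrative subset only; the real "stop" filter uses a longer English list by default.
    private static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "is", "of", "the", "to");

    static List<String> analyze(String text) {
        // 1. char_filter: strip HTML tags before tokenizing (rough stand-in for html_strip)
        String stripped = text.replaceAll("<[^>]*>", " ");
        // 2. tokenizer: split on anything that is not a letter, digit, or combining mark
        return Arrays.stream(stripped.split("[^\\p{L}\\p{N}\\p{M}]+"))
                .filter(token -> !token.isEmpty())
                // 3. token filters, applied in the configured order
                .map(token -> token.toLowerCase(Locale.ROOT))   // lowercase
                .filter(token -> !STOP_WORDS.contains(token))   // stop
                .map(CustomAnalyzerSketch::foldToAscii)         // asciifolding (approximate)
                .collect(Collectors.toList());
    }

    // Decompose accented characters and drop the combining marks, e.g. "café" -> "cafe".
    static String foldToAscii(String token) {
        return Normalizer.normalize(token, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(analyze("<p>The Café is OPEN</p>")); // [cafe, open]
    }
}
```

The order matters: because `lowercase` runs before `stop`, the capitalized "The" is still removed by the lowercase stop list.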
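3. IK analyzer
IK is a third-party Chinese analysis plugin that must be installed on every node. A common setup indexes with the fine-grained ik_max_word analyzer and searches with the coarser ik_smart analyzer:
@Field(type = FieldType.Text, analyzer = "ik_max_word", searchAnalyzer = "ik_smart")
private String content;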
Aggregations
1. Metric aggregations
public Map<String, Object> getMetricAggregations() {
    // Build the metric aggregations. The variables use the concrete builder types
    // because addAggregation expects an AbstractAggregationBuilder (Spring Data Elasticsearch 4.x).
    StatsAggregationBuilder priceStats = AggregationBuilders
            .stats("price_stats")
            .field("price");
    AvgAggregationBuilder avgPrice = AggregationBuilders
            .avg("avg_price")
            .field("price");
    // Run the aggregation query
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .addAggregation(priceStats)
            .addAggregation(avgPrice)
            .build();
    SearchHits<Product> searchHits = elasticsearchOperations
            .search(searchQuery, Product.class);
    // Process the results
    Map<String, Object> results = new HashMap<>();
    // ... parse the aggregation results
    return results;
}
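To make clear what the two aggregations return, here is a plain-Java sketch that computes the same five values a `stats` aggregation reports (count, min, max, avg, sum) over an in-memory list. It does not talk to Elasticsearch; `StatsAggregationSketch` and the sample prices are invented for illustration.

```java
import java.util.DoubleSummaryStatistics;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Computes, in memory, the values a "stats" aggregation reports for a numeric field.
final class StatsAggregationSketch {

    static Map<String, Object> stats(List<Double> prices) {
        DoubleSummaryStatistics summary = prices.stream()
                .mapToDouble(Double::doubleValue)
                .summaryStatistics();
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("count", summary.getCount());
        result.put("min", summary.getMin());
        result.put("max", summary.getMax());
        result.put("avg", summary.getAverage());
        result.put("sum", summary.getSum());
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical prices of three products
        System.out.println(stats(List.of(10.0, 20.0, 30.0)));
    }
}
```

Since `stats` already includes the average, a separate `avg` aggregation on the same field is only needed when the other values are not wanted.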
2. Bucket aggregations
public Map<String, Long> getCategoryDistribution() {
    // Build a terms (bucket) aggregation; "category" should be a keyword field
    TermsAggregationBuilder termsAgg = AggregationBuilders
            .terms("category_count")
            .field("category")
            .size(10);
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .addAggregation(termsAgg)
            .build();
    SearchHits<Product> searchHits = elasticsearchOperations
            .search(searchQuery, Product.class);
    // Process the results
    Map<String, Long> distribution = new HashMap<>();
    // ... parse the aggregation results
    return distribution;
}
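Conceptually, a terms aggregation groups documents by field value, counts each group, and returns the top `size` buckets ordered by document count. The sketch below reproduces that logic over an in-memory list; it is not the Elasticsearch implementation (which works per shard and can return approximate counts), and `TermsAggregationSketch` is a hypothetical name.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// In-memory equivalent of a terms aggregation: group, count, keep the top N buckets.
final class TermsAggregationSketch {

    static Map<String, Long> topCategories(List<String> categories, int size) {
        // One bucket per distinct value, with its document count
        Map<String, Long> counts = categories.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // Order by count descending, then by key ascending, and keep the first `size` buckets
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (first, second) -> first, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<String> categories = List.of("books", "toys", "books", "games", "toys", "books");
        System.out.println(topCategories(categories, 2)); // {books=3, toys=2}
    }
}
```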
Cluster management
1. Cluster health check
GET _cluster/health
Example response:
{
  "cluster_name": "elasticsearch",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 3,
  "number_of_data_nodes": 2,
  "active_primary_shards": 5,
  "active_shards": 10,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0
}
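The `status` field is the first thing to check in this response. As a small sketch of how a monitoring job might interpret it, the helper below maps each status to its meaning and flags anything that is not fully healthy. The class and method names are invented for illustration and are not part of any Elasticsearch client.

```java
// Interprets the "status" and "unassigned_shards" fields of a cluster health response.
final class ClusterHealthSketch {

    static String describe(String status) {
        switch (status) {
            case "green":
                return "all primary and replica shards are allocated";
            case "yellow":
                return "all primary shards are allocated, but at least one replica is not";
            case "red":
                return "at least one primary shard is not allocated";
            default:
                throw new IllegalArgumentException("unknown status: " + status);
        }
    }

    // Anything other than green, or any unassigned shard, deserves a closer look.
    static boolean needsAttention(String status, int unassignedShards) {
        return !"green".equals(status) || unassignedShards > 0;
    }

    public static void main(String[] args) {
        System.out.println(describe("yellow"));
        System.out.println(needsAttention("green", 0)); // false
    }
}
```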
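2. Node management
import java.io.IOException;
import lombok.RequiredArgsConstructor;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.action.admin.cluster.health.ClusterHealthRequest;
import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class ClusterService {
    private final RestHighLevelClient client;

    public ClusterHealthResponse getClusterHealth() throws IOException {
        return client.cluster().health(
                new ClusterHealthRequest(),
                RequestOptions.DEFAULT
        );
    }

    // RestHighLevelClient has no nodes() client, so the nodes info API is called
    // through the low-level client and the raw JSON body is returned.
    public String getNodesInfo() throws IOException {
        Response response = client.getLowLevelClient()
                .performRequest(new Request("GET", "/_nodes"));
        return EntityUtils.toString(response.getEntity());
    }
}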
Performance optimization
1. Index optimization
- Choose a sensible number of shards
- Use an appropriate analyzer
- Tune the mapping settings
PUT my_index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "refresh_interval": "30s"
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "keywords": {
        "type": "keyword"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word",
        "index_options": "offsets"
      }
    }
  }
}
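Because `number_of_shards` cannot be changed on an existing index without reindexing, splitting, or shrinking it, it helps to estimate the value up front. A commonly cited rule of thumb is to keep each shard in the range of a few tens of gigabytes. The sketch below turns that into a simple calculation; the 30 GB target and the class name `ShardCountSketch` are assumptions for illustration, not values from this article.

```java
// Rough estimate of the primary shard count from the expected index size.
final class ShardCountSketch {

    static int primaryShards(double expectedIndexSizeGb, double targetShardSizeGb) {
        if (expectedIndexSizeGb < 0 || targetShardSizeGb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Round up so no shard exceeds the target size; always keep at least one shard
        return Math.max(1, (int) Math.ceil(expectedIndexSizeGb / targetShardSizeGb));
    }

    public static void main(String[] args) {
        // Assumed: about 90 GB of data and a 30 GB target per shard
        System.out.println(primaryShards(90, 30)); // 3
    }
}
```

Treat the result as a starting point and validate it against real query and indexing load.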
2. Query optimization
- Use filter context for conditions that do not need scoring
- Avoid scripts where possible
- Paginate sensibly and avoid deep from/size paging
public SearchHits<Product> optimizedSearch(String keyword, String category) {
    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
    // must clauses contribute to the relevance score
    if (keyword != null) {
        boolQuery.must(QueryBuilders.matchQuery("name", keyword));
    }
    // filter clauses only include or exclude documents; they are not scored
    if (category != null) {
        boolQuery.filter(QueryBuilders.termQuery("category", category));
    }
    // from/size paging is fine for the first pages;
    // for deep pagination use scroll or search_after instead
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(boolQuery)
            .withPageable(PageRequest.of(0, 20))
            .build();
    return elasticsearchOperations.search(searchQuery, Product.class);
}
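By default, from/size paging is capped by `index.max_result_window` (10,000 hits), and every deeper page costs more because all preceding hits must be collected first. `search_after` avoids this by passing the sort values of the last hit of the previous page as a cursor. The plain-Java sketch below illustrates that idea over an in-memory list. It is not a client call: `SearchAfterSketch`, `Doc`, and the sample data are invented for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrates cursor-style paging: each page starts strictly after the previous page's last hit.
final class SearchAfterSketch {

    static final class Doc {
        final String id;
        final double price;

        Doc(String id, double price) {
            this.id = id;
            this.price = price;
        }
    }

    // Sort by price, then by id as a unique tiebreaker so the cursor is unambiguous.
    static final Comparator<Doc> SORT = Comparator
            .comparingDouble((Doc d) -> d.price)
            .thenComparing((Doc d) -> d.id);

    // Returns up to `size` docs that sort strictly after `after` (null means first page).
    static List<Doc> page(List<Doc> docs, Doc after, int size) {
        return docs.stream()
                .sorted(SORT)
                .filter(d -> after == null || SORT.compare(d, after) > 0)
                .limit(size)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(new Doc("d", 30.0), new Doc("b", 20.0),
                new Doc("a", 10.0), new Doc("c", 20.0));
        List<Doc> first = page(docs, null, 2);           // a, b
        List<Doc> second = page(docs, first.get(1), 2);  // c, d
        second.forEach(d -> System.out.println(d.id));
    }
}
```

The tiebreaker matters: without a unique last sort key, documents with equal prices could be skipped or repeated between pages.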
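Summary
This article covered the following advanced Elasticsearch features:
- ✅ Using and customizing analyzers
- ✅ Implementing aggregations
- ✅ Managing and monitoring a cluster
- ✅ Performance optimization tips
Next steps
- Learn more best practices
- Explore security configuration
- Look into monitoring solutions
I hope this article was helpful! If you have any questions, feel free to discuss them in the comments.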