Elasticsearch Search Engine (Basics and Deployment)

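A good Elasticsearch engineer needs to know the official Elastic documentation, examples, and templates well, because the Elasticsearch API is complex and far less regular than SQL, which makes it harder to pick up.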

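Some concepts

Elasticsearch

  • A near-real-time search engine
  • Use cases include e-commerce search, log analysis, metrics analysis, geospatial search, and more

Kibana

  • The visualization tool built for Elasticsearch
  • Offers a rich set of charts and an interactive, real-time experience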

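Common terms

  • Cluster

    • One or more nodes that together provide the service to clients
  • Node

    • A single running instance of Elasticsearch (one JVM process)
  • Index

    • A collection of documents that share the same fields
  • Shard

    • An index is split into N pieces stored across the nodes of the cluster; each piece is called a shard
    • There are two types: primary and replica
    • The smallest unit of data that Elasticsearch manages
  • Document

    • A data record that a user stores in Elasticsearch, for example: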
    {
    "remote_ip": "93.180.71.3",
    "user_name": "-",
    "@timestamp": "2015-05-17T08:05:32.000Z",
    "request_action": "GET",
    "request": "/downloads/product_1",
    "http_version": "1.1",
    "response": "304",
    "bytes": "0",
    "referrer": "-",
    "agent": "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.21)"
    }
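
As a small illustration of how a document like the one above could be shipped to Elasticsearch, the sketch below serializes it into the newline-delimited JSON body used by the `_bulk` API: one action line followed by one source line per document. The index name `nginx-logs` and the helper `build_bulk_body` are hypothetical, and nothing is sent over the network.

```python
import json


def build_bulk_body(index_name, documents):
    """Return an NDJSON string: one action line plus one source line per document."""
    lines = []
    for doc in documents:
        # Action line: tells Elasticsearch to index the next line into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # A bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


doc = {
    "remote_ip": "93.180.71.3",
    "@timestamp": "2015-05-17T08:05:32.000Z",
    "request_action": "GET",
    "request": "/downloads/product_1",
    "response": "304",
}

# "nginx-logs" is a made-up index name used only for this example.
body = build_bulk_body("nginx-logs", [doc])
print(body)
```

The resulting string is what would go in the body of a `POST /_bulk` request, for instance from the Kibana Dev Tools console.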

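Deployment

How do you work with Elasticsearch? Through Kibana.

Packages

Download the installation packages from the official Elastic website.

Reference documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.13/install-elasticsearch.html

Installation

Taking Windows as an example, you only need to unzip the archive and change into the extracted directory.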

cd C:\elasticsearch-8.13.4

Startup

.\bin\elasticsearch.bat

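Elasticsearch serves requests over HTTPS. On first startup it configures security automatically and prints the following to the console:

  • An automatically generated password for the built-in elastic user
  • The fingerprint of the HTTPS CA certificate (Elasticsearch provides its service over HTTPS)
  • A base64-encoded enrollment token for Kibana to use

Installing Kibana is similar to installing Elasticsearch, so it is not covered in detail here.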
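
To make the fingerprint step more concrete, here is a minimal sketch that computes the SHA-256 fingerprint of a PEM certificate so it can be compared with the value printed at startup. It assumes the printed fingerprint is the SHA-256 digest of the DER-encoded CA certificate written as hex; the function names are hypothetical.

```python
import hashlib
import ssl


def cert_fingerprint(pem_text):
    """Return the SHA-256 fingerprint (lowercase hex) of a PEM-encoded certificate."""
    # Strip the PEM armor and base64-decode to get the raw DER bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.sha256(der_bytes).hexdigest()


def fingerprints_match(pem_text, printed_fingerprint):
    """Compare with a printed fingerprint, ignoring case and ':' separators."""
    normalized = printed_fingerprint.replace(":", "").strip().lower()
    return cert_fingerprint(pem_text) == normalized
```

For example, the text of `config\certs\http_ca.crt` could be read with `open(...).read()` and passed to `fingerprints_match` together with the fingerprint copied from the console.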

Elasticsearch directory structure

elastic
--bin (executables and startup scripts)
--config (configuration files)
--data (index data storage)
--jdk
--lib
--logs

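Starting a cluster

When several nodes of a cluster run on the same machine, each node needs its own ports, so some settings must be specified manually before a new node is started.

The core settings are: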

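# Cluster name; must be identical on every node in the same cluster
cluster.name: cluster
# Node name
node.name: node-master
# Node roles
node.roles: [master]
# Index data directory
path.data: /data
# Master-eligible nodes used to bootstrap the cluster the first time it starts
cluster.initial_master_nodes: ["node-master"]
# Address the HTTP API binds to; change to localhost for local-only access
http.host: 0.0.0.0
# Addresses of the other nodes (transport addresses; the default transport port is 9300)
discovery.seed_hosts: ["127.0.0.1:9201","127.0.0.1:9202"]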
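
The port requirement can be sketched as a small generator that gives each local node its own `http.port` and `transport.port` and points `discovery.seed_hosts` at the other nodes. This is only an illustration: the node names, the base ports, and the choice to list every node in `cluster.initial_master_nodes` (which assumes all of them are master-eligible) are assumptions, not values from the setup above.

```python
def node_settings(cluster_name, node_names, base_http_port=9200, base_transport_port=9300):
    """Build one settings dict per node, each with its own HTTP and transport port."""
    transport_addrs = {
        name: f"127.0.0.1:{base_transport_port + i}"
        for i, name in enumerate(node_names)
    }
    configs = []
    for i, name in enumerate(node_names):
        configs.append({
            "cluster.name": cluster_name,
            "node.name": name,
            "http.port": base_http_port + i,
            "transport.port": base_transport_port + i,
            # Seed hosts are the transport addresses of every *other* node.
            "discovery.seed_hosts": [
                addr for other, addr in transport_addrs.items() if other != name
            ],
            # Assumption: every node is master-eligible.
            "cluster.initial_master_nodes": list(node_names),
        })
    return configs


def render_yaml(settings):
    """Render a flat settings dict as simple `key: value` YAML lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, list):
            rendered = "[" + ", ".join(f'"{item}"' for item in value) + "]"
        else:
            rendered = str(value)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines)


# Hypothetical three-node local cluster.
for settings in node_settings("cluster", ["node-1", "node-2", "node-3"]):
    print(render_yaml(settings))
    print()
```

Each printed block could serve as the starting point for one node's `elasticsearch.yml`.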

Some Kibana settings

# Elasticsearch address
elasticsearch.hosts: ['https://127.0.0.1:9200']
elasticsearch.username: kibana_system
#elasticsearch.serviceAccountToken: AAEAAWVsYXN0aWMva2liYW5hL2Vucm9sbC1wcm9jZXNzLXRva2VuLTE3Mzk5MzI0MjE3NTI6ZjdhV0FNbUVUWDIxMFZsUmtfR0U3UQ
elasticsearch.ssl.certificateAuthorities: ['x:\xxx\xxx\elasticSearch\elasticsearch-8.13.4\config\certs\http_ca.crt']

Some common commands

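bin\elasticsearch-create-enrollment-token -s node    // run on an existing node to generate an enrollment token for a new node
bin\elasticsearch --enrollment-token <enrollment-token>    // start a new node with the token so that it joins the existing cluster
bin\elasticsearch-reset-password -u elastic    // generate a new password for the elastic user (in 8.x this replaces the deprecated elasticsearch-setup-passwords)
More commands are available in the bin directory.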
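
Since the enrollment token is base64-encoded, it can be inspected before use. The sketch below assumes the token decodes to a JSON object; the field names in the sample (`ver`, `adr`, `fgr`, `key`) are assumptions about the 8.x token layout, and the sample values are made up.

```python
import base64
import json


def decode_enrollment_token(token):
    """Decode a base64 enrollment token into a dict (assumes a JSON payload)."""
    # Re-add any stripped padding so the decoder accepts the input.
    padded = token + "=" * (-len(token) % 4)
    return json.loads(base64.b64decode(padded).decode("utf-8"))


# Made-up sample payload; a real token comes from elasticsearch-create-enrollment-token.
sample_payload = {
    "ver": "8.13.4",
    "adr": ["127.0.0.1:9200"],
    "fgr": "0000",
    "key": "id:secret",
}
sample_token = base64.b64encode(json.dumps(sample_payload).encode("utf-8")).decode("ascii")

print(decode_enrollment_token(sample_token))
```

Treat a real token like a credential: it contains a key, so avoid pasting it into logs or shared terminals.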

Starting with Docker

Docker Compose is used to start the Elasticsearch and Kibana images. The following is a basic three-node Docker Compose file.

version: "2.2"

services:
  setup:
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
    user: "0"
    command: >
      bash -c '
        if [ x${ELASTIC_PASSWORD} == x ]; then
          echo "Set the ELASTIC_PASSWORD environment variable in the .env file";
          exit 1;
        elif [ x${KIBANA_PASSWORD} == x ]; then
          echo "Set the KIBANA_PASSWORD environment variable in the .env file";
          exit 1;
        fi;
        if [ ! -f config/certs/ca.zip ]; then
          echo "Creating CA";
          bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip;
          unzip config/certs/ca.zip -d config/certs;
        fi;
        if [ ! -f config/certs/certs.zip ]; then
          echo "Creating certs";
          echo -ne \
          "instances:\n"\
          "  - name: es01\n"\
          "    dns:\n"\
          "      - es01\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          "  - name: es02\n"\
          "    dns:\n"\
          "      - es02\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          "  - name: es03\n"\
          "    dns:\n"\
          "      - es03\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          > config/certs/instances.yml;
          bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key;
          unzip config/certs/certs.zip -d config/certs;
        fi;
        echo "Setting file permissions"
        chown -R root:root config/certs;
        find . -type d -exec chmod 750 \{\} \;;
        find . -type f -exec chmod 640 \{\} \;;
        echo "Waiting for Elasticsearch availability";
        until curl -s --cacert config/certs/ca/ca.crt https://es01:9200 | grep -q "missing authentication credentials"; do sleep 30; done;
        echo "Setting kibana_system password";
        until curl -s -X POST --cacert config/certs/ca/ca.crt -u elastic:${ELASTIC_PASSWORD} -H "Content-Type: application/json" https://es01:9200/_security/user/kibana_system/_password -d "{\"password\":\"${KIBANA_PASSWORD}\"}" | grep -q "^{}"; do sleep 10; done;
        echo "All done!";
      '
    healthcheck:
      test: ["CMD-SHELL", "[ -f config/certs/es01/es01.crt ]"]
      interval: 1s
      timeout: 5s
      retries: 120

  es01:
    depends_on:
      setup:
        condition: service_healthy
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
      - esdata01:/usr/share/elasticsearch/data
    ports:
      - ${ES_PORT}:9200
    environment:
      - node.name=es01
      - cluster.name=${CLUSTER_NAME}
      - cluster.initial_master_nodes=es01,es02,es03
      - discovery.seed_hosts=es02,es03
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - bootstrap.memory_lock=true
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=true
      - xpack.security.http.ssl.key=certs/es01/es01.key
      - xpack.security.http.ssl.certificate=certs/es01/es01.crt
      - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.http.ssl.verification_mode=certificate
      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.key=certs/es01/es01.key
      - xpack.security.transport.ssl.certificate=certs/es01/es01.crt
      - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.license.self_generated.type=${LICENSE}
    mem_limit: ${MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120

  es02:
    depends_on:
      - es01
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
      - esdata02:/usr/share/elasticsearch/data
    environment:
      - node.name=es02
      - cluster.name=${CLUSTER_NAME}
      - cluster.initial_master_nodes=es01,es02,es03
      - discovery.seed_hosts=es01,es03
      - bootstrap.memory_lock=true
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=true
      - xpack.security.http.ssl.key=certs/es02/es02.key
      - xpack.security.http.ssl.certificate=certs/es02/es02.crt
      - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.http.ssl.verification_mode=certificate
      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.key=certs/es02/es02.key
      - xpack.security.transport.ssl.certificate=certs/es02/es02.crt
      - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.license.self_generated.type=${LICENSE}
    mem_limit: ${MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120

  es03:
    depends_on:
      - es02
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
      - esdata03:/usr/share/elasticsearch/data
    environment:
      - node.name=es03
      - cluster.name=${CLUSTER_NAME}
      - cluster.initial_master_nodes=es01,es02,es03
      - discovery.seed_hosts=es01,es02
      - bootstrap.memory_lock=true
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=true
      - xpack.security.http.ssl.key=certs/es03/es03.key
      - xpack.security.http.ssl.certificate=certs/es03/es03.crt
      - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.http.ssl.verification_mode=certificate
      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.key=certs/es03/es03.key
      - xpack.security.transport.ssl.certificate=certs/es03/es03.crt
      - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.license.self_generated.type=${LICENSE}
    mem_limit: ${MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120

  kibana:
    depends_on:
      es01:
        condition: service_healthy
      es02:
        condition: service_healthy
      es03:
        condition: service_healthy
    image: docker.elastic.co/kibana/kibana:${STACK_VERSION}
    volumes:
      - certs:/usr/share/kibana/config/certs
      - kibanadata:/usr/share/kibana/data
    ports:
      - ${KIBANA_PORT}:5601
    environment:
      - SERVERNAME=kibana
      - ELASTICSEARCH_HOSTS=https://es01:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
      - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt
    mem_limit: ${MEM_LIMIT}
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s -I http://localhost:5601 | grep -q 'HTTP/1.1 302 Found'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120

volumes:
  certs:
    driver: local
  esdata01:
    driver: local
  esdata02:
    driver: local
  esdata03:
    driver: local
  kibanadata:
    driver: local
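
The Compose file above reads its values (`STACK_VERSION`, `ELASTIC_PASSWORD`, `KIBANA_PASSWORD`, `CLUSTER_NAME`, `LICENSE`, `ES_PORT`, `KIBANA_PORT`, `MEM_LIMIT`) from a `.env` file. As a rough pre-flight check, the following sketch lists every `${VAR}` referenced in a Compose text that has no non-empty value in a `.env` text. The function names are hypothetical, and the parsing is deliberately simple (no quoting or default-value syntax).

```python
import re


def referenced_variables(compose_text):
    """Collect every ${NAME} placeholder used in the compose text."""
    return set(re.findall(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", compose_text))


def defined_variables(env_text):
    """Collect variable names that have a non-empty value in the .env text."""
    defined = set()
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value.strip():
            defined.add(name.strip())
    return defined


def missing_variables(compose_text, env_text):
    """Return the referenced variables that are not defined, sorted by name."""
    return sorted(referenced_variables(compose_text) - defined_variables(env_text))


# Small illustrative inputs; real use would read docker-compose.yml and .env from disk.
compose_snippet = (
    "image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}\n"
    "ports:\n"
    "  - ${ES_PORT}:9200\n"
    "mem_limit: ${MEM_LIMIT}\n"
)
env_snippet = "STACK_VERSION=8.13.4\nES_PORT=9200\nMEM_LIMIT=\n"

print(missing_variables(compose_snippet, env_snippet))  # prints ['MEM_LIMIT']
```

An empty result means every placeholder has a value, which avoids the `setup` service exiting early on an unset password.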

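Node roles

  • Master

    • Master node; mainly responsible for managing and distributing the cluster metadata (cluster state)
    • The brain: it decides things such as how data is allocated
  • Data

    • Data node; mainly responsible for storing data and handling read and write requests
    • The worker that does the actual work
    • Workers are also divided into tiers:

    Tier   | Access frequency                 | Storage medium                | Typical scenarios                         | Storage cost
    Hot    | Very high (real-time access)     | High-speed storage (e.g. SSD) | Transaction data, real-time computation   |
    Warm   | Medium                           | High-performance HDD          | Recent logs, active archives              |
    Cold   |                                  | Standard HDD / object storage | Historical data, compliance archives      |
    Frozen | Very low (almost never accessed) | Tape / cloud archive          | Legal retention, disaster-recovery backup | Very low
  • Ingest

    • Pre-processing node; mainly responsible for processing and transforming data
  • Coordinating/Client

    • Coordinating node; mainly responsible for forwarding requests
    • The traffic police: it dispatches read and write traffic to the appropriate data nodes
    • The default role: every node acts as a coordinating node regardless of its other roles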
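
The dispatch step performed by a coordinating node can be sketched as hashing a document's routing value (by default its ID) and taking the result modulo the number of primary shards. The code below is a simplified illustration: `zlib.crc32` is a stand-in for the Murmur3 hash that Elasticsearch uses, the real formula involves an additional routing-shards factor, and the document IDs are made up.

```python
import zlib


def pick_shard(routing_value, num_primary_shards):
    """Map a routing value to a primary shard number in [0, num_primary_shards)."""
    # crc32 is only a stand-in hash for this sketch, not what Elasticsearch uses.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# Hypothetical document IDs spread over three primary shards.
doc_ids = ["doc-1", "doc-2", "doc-3", "doc-4", "doc-5", "doc-6"]
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
for doc_id, shard in placement.items():
    print(f"{doc_id} -> shard {shard}")
```

Because the shard number depends on the primary shard count, the same ID can map to a different shard if that count changes, which is one reason the number of primary shards is fixed when an index is created.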

https://andrewjiao.github.io/2022/08/16/elasticsearch/ElasticSearch基础和部署/
Author: Andrew_Jiao
Published: August 16, 2022