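ELK Stack Deployment, Optimization, and Security Hardening Guide
This guide walks through a complete ELK Stack (Elasticsearch, Logstash, Kibana) deployment: preparing the system, installing and configuring each component, and hardening the result. It covers: 1) system requirements and OS tuning; 2) Elasticsearch installation, cluster configuration, and JVM tuning; 3) Logstash input/output configuration and log-processing pipelines; 4) Kibana deployment for visualization; 5) security hardening, including authentication and authorization, SSL/TLS encryption, and firewall rules.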
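1. ELK Stack Overview
The ELK Stack combines three open-source projects, Elasticsearch, Logstash, and Kibana, to collect, store, analyze, and visualize logs.
2. Deployment
2.1 Environment Preparation
System requirements
# OS: CentOS/RHEL 7+ or Ubuntu 20.04+
# Hardware:
# - Memory: at least 8 GB (16 GB or more recommended for production)
# - CPU: 4 cores or more
# - Storage: sized to the log volume; SSD recommended
# Inspect the host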
cat /etc/os-release
free -h
lscpu
df -h
Firewall configuration
# CentOS/RHEL
sudo systemctl start firewalld
sudo systemctl enable firewalld
sudo firewall-cmd --permanent --add-port={9200,5601,5044,9600}/tcp
sudo firewall-cmd --reload
# Ubuntu
sudo ufw allow 9200/tcp
sudo ufw allow 5601/tcp
sudo ufw allow 5044/tcp
sudo ufw allow 9600/tcp
sudo ufw enable
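System tuning
# 1. Disable swap
sudo swapoff -a
# Make it permanent by commenting out swap entries (a backup is saved as /etc/fstab.bak)
sudo sed -i.bak -E '/\sswap\s/ s/^([^#])/#\1/' /etc/fstab
# 2. Raise the file descriptor limit
echo "* soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "elasticsearch soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "elasticsearch hard nofile 65536" | sudo tee -a /etc/security/limits.conf
# 3. Raise the virtual memory map limit
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# 4. Raise the thread limit
echo "* soft nproc 4096" | sudo tee -a /etc/security/limits.conf
echo "elasticsearch soft nproc 4096" | sudo tee -a /etc/security/limits.conf
# Note: limits.conf applies to PAM login sessions only. Services started by systemd take their
# limits from the unit file, so set LimitNOFILE, LimitNPROC, and LimitMEMLOCK with
# `sudo systemctl edit elasticsearch` when the packaged defaults are not enough.
# 5. Create a dedicated user (the RPM/DEB packages create it automatically; only needed for archive installs)
sudo groupadd elasticsearch
sudo useradd -g elasticsearch -m elasticsearch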
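Before installing anything, it can help to confirm that the limits above actually took effect. The following is a minimal sketch of such a preflight check; `require_min` is a hypothetical helper name, and the thresholds simply repeat the values used in this section.

```shell
#!/usr/bin/env bash
# Preflight sketch: compare an observed limit with the minimum configured above.
# require_min is a hypothetical helper, not part of any Elastic tooling.

# require_min NAME ACTUAL REQUIRED -> prints a verdict, returns 1 when ACTUAL is too low
require_min() {
  local name="$1" actual="$2" required="$3"
  if [ "$actual" = "unlimited" ] || [ "$actual" -ge "$required" ]; then
    echo "OK   $name=$actual (>= $required)"
  else
    echo "FAIL $name=$actual (< $required)"
    return 1
  fi
}

# Typical use on the target host (run the ulimit checks as the elasticsearch user):
#   require_min vm.max_map_count "$(sysctl -n vm.max_map_count)" 262144
#   require_min nofile "$(ulimit -n)" 65536
#   require_min nproc  "$(ulimit -u)" 4096
```

The function only compares numbers, so the same helper can be reused for any other integer limit.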
2.2 Elasticsearch Deployment
2.2.1 Install Elasticsearch
# CentOS/RHEL
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
cat <<EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-8.x]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
sudo yum install -y elasticsearch
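# Ubuntu/Debian
# apt-key is deprecated; store the key in a dedicated keyring and reference it with signed-by
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
sudo apt-get install apt-transport-https
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update
sudo apt-get install elasticsearch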
2.2.2 Configure Elasticsearch
# /etc/elasticsearch/elasticsearch.yml
# Cluster
cluster.name: production-cluster
node.name: ${HOSTNAME}
# Network
network.host: [_local_, _site_]
http.port: 9200
# Discovery (single node)
discovery.type: single-node
# Multi-node production example (remove discovery.type: single-node when using these)
# cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
# discovery.seed_hosts: ["host1", "host2", "host3"]
# Data and log paths
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
# Lock the heap in RAM so it is never swapped out
# (under systemd this also requires LimitMEMLOCK=infinity in a unit override)
bootstrap.memory_lock: true
# JVM heap size: at most 50% of physical RAM and no more than 32 GB
# Set in /etc/elasticsearch/jvm.options
# Security features (enabled by default in 8.x)
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
# Encrypted communication
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
# Node roles (adjust to your topology)
node.roles: [master, data, ingest]
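2.2.3 JVM Configuration
# /etc/elasticsearch/jvm.options
# (Elastic recommends putting overrides in a file under /etc/elasticsearch/jvm.options.d/ so that package upgrades do not overwrite them)
# Size the heap to the server: no more than 50% of physical RAM, and never above 32 GB
-Xms4g
-Xmx4g
# Garbage collector settings
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:G1ReservePercent=25
-XX:InitiatingHeapOccupancyPercent=30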
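The heap rule above (half of physical RAM, capped below 32 GB) is easy to get wrong by hand. The sketch below computes it; `recommend_heap_gb` is a hypothetical helper, and the cap of 31 GB is an assumption chosen to stay safely under the 32 GB ceiling stated in this guide.

```shell
#!/usr/bin/env bash
# recommend_heap_gb TOTAL_RAM_GB -> suggested heap size in whole GB
# Rule from this guide: at most 50% of RAM; the 31 GB cap is an assumed safety margin below 32 GB.
recommend_heap_gb() {
  local total_gb="$1" cap_gb=31
  local heap=$(( total_gb / 2 ))
  if [ "$heap" -gt "$cap_gb" ]; then heap="$cap_gb"; fi
  if [ "$heap" -lt 1 ]; then heap=1; fi
  echo "$heap"
}

# Example: print the two JVM flags for a 16 GB host
#   h=$(recommend_heap_gb 16); printf -- '-Xms%sg\n-Xmx%sg\n' "$h" "$h"
```

Setting `-Xms` and `-Xmx` to the same value, as the configuration above does, avoids heap resizing at runtime.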
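2.2.4 Start Elasticsearch
# Set directory ownership
sudo chown -R elasticsearch:elasticsearch /etc/elasticsearch
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch
sudo chown -R elasticsearch:elasticsearch /var/log/elasticsearch
# Start the service
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
# Check status
sudo systemctl status elasticsearch
sudo journalctl -u elasticsearch -f
# Verify the installation. Security is enabled, so use HTTPS; curl prompts for the elastic password.
# http_ca.crt is the CA generated by 8.x package auto-configuration; substitute your own CA file if you issued certificates yourself.
sudo curl --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic -X GET "https://localhost:9200/"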
2.3 Logstash Deployment
2.3.1 Install Logstash
# CentOS/RHEL
sudo yum install -y logstash
# Ubuntu
sudo apt-get install logstash
2.3.2 Configure Logstash
# Create the pipeline configuration directory
sudo mkdir -p /etc/logstash/conf.d
sudo chown -R logstash:logstash /etc/logstash
# Example pipeline: syslog collection
# /etc/logstash/conf.d/syslog.conf
cat <<'EOF' | sudo tee /etc/logstash/conf.d/syslog.conf
input {
  beats {
    port => 5044
    ssl => false
  }
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}
filter {
  # Apply type-specific parsing
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
  # Drop fields that are not needed downstream
  # (removing "tags" also discards failure markers such as _grokparsefailure)
  mutate {
    remove_field => ["@version", "tags"]
  }
}
output {
  # Send to Elasticsearch
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    # User created in 4.1.3 and assigned the logstash_writer role
    user => "logstash_user"
    password => "${LOGSTASH_PASSWORD}"
    ssl => true
    # Verify the server certificate against the CA instead of disabling verification
    cacert => "/etc/logstash/certs/ca.crt"
  }
  # Debug output (remove in production)
  stdout {
    codec => rubydebug
  }
}
EOF
# JMX monitoring options for Logstash
# /etc/logstash/jvm.options
# Warning: these flags expose JMX on port 9010 without authentication or TLS; use them only on a trusted test network
cat <<'EOF' | sudo tee -a /etc/logstash/jvm.options
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=9010
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
EOF
2.3.3 Pipeline Management
# /etc/logstash/pipelines.yml
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
  pipeline.workers: 2
  pipeline.batch.size: 125
  pipeline.batch.delay: 50
2.3.4 Start Logstash
# Provide the password through the service environment file
# (/etc/sysconfig/logstash on CentOS/RHEL, /etc/default/logstash on Ubuntu)
sudo tee /etc/sysconfig/logstash <<EOF
LOGSTASH_PASSWORD=your_secure_password
EOF
sudo chmod 600 /etc/sysconfig/logstash
# Set ownership
sudo chown -R logstash:logstash /etc/logstash
sudo chown -R logstash:logstash /var/lib/logstash
# Start the service
sudo systemctl enable logstash
sudo systemctl start logstash
# Check status
sudo systemctl status logstash
sudo tail -f /var/log/logstash/logstash-plain.log
2.4 Kibana Deployment
2.4.1 Install Kibana
# CentOS/RHEL
sudo yum install -y kibana
# Ubuntu
sudo apt-get install kibana
2.4.2 Configure Kibana
# /etc/kibana/kibana.yml
# Server
server.port: 5601
server.host: "0.0.0.0"
server.name: "kibana-server"
# Elasticsearch connection
elasticsearch.hosts: ["https://localhost:9200"]
elasticsearch.username: "kibana_system"
elasticsearch.password: "${KIBANA_PASSWORD}"
# TLS verification of the Elasticsearch certificate
elasticsearch.ssl.certificateAuthorities: [ "/etc/kibana/certs/ca.crt" ]
elasticsearch.ssl.verificationMode: certificate
# Public URL and session management
server.publicBaseUrl: "https://your-domain.com:5601"
xpack.security.session.lifespan: 7d
xpack.security.session.idleTimeout: 1d
# Feature toggles
# Kibana 8.x removed many per-plugin *.enabled flags (for example xpack.spaces.enabled and
# xpack.monitoring.enabled), and Kibana refuses to start on unknown keys. These features are
# on by default; enable a line only if the settings reference for your version lists it.
# xpack.fleet.enabled: true
# xpack.reporting.enabled: true
# xpack.canvas.enabled: true
# xpack.graph.enabled: true
# xpack.maps.enabled: true
# xpack.monitoring.enabled: true
# xpack.spaces.enabled: true
# Data path
path.data: /var/lib/kibana
# The log destination is set by the file appender below; verify that your version accepts path.logs before enabling it
# path.logs: /var/log/kibana
# Logging
logging:
  appenders:
    file:
      type: file
      fileName: /var/log/kibana/kibana.log
      layout:
        type: json
  root:
    appenders:
      - default
      - file
# Performance
# The optimize.* options belong to the legacy optimizer and are not accepted by Kibana 8.x
# optimize.lazy: true
# optimize.lazyPort: 5602
# optimize.lazyPrebuild: true
2.4.3 Start Kibana
# Create the data directory
sudo mkdir -p /var/lib/kibana
sudo chown -R kibana:kibana /etc/kibana
sudo chown -R kibana:kibana /var/lib/kibana
sudo chown -R kibana:kibana /var/log/kibana
# Start the service
sudo systemctl enable kibana
sudo systemctl start kibana
# Check status
sudo systemctl status kibana
sudo tail -f /var/log/kibana/kibana.log
2.5 Filebeat Deployment (Optional)
# Install Filebeat
# CentOS
sudo yum install filebeat
# Ubuntu
sudo apt-get install filebeat
# Configure Filebeat
# /etc/filebeat/filebeat.yml
# (the log input type is deprecated in 8.x in favor of filestream; it is kept here as in the original setup)
cat <<'EOF' | sudo tee /etc/filebeat/filebeat.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/*.log
      - /var/log/syslog
      - /var/log/messages
    fields:
      type: syslog
    fields_under_root: true
  - type: log
    enabled: true
    paths:
      - /var/log/nginx/access.log
    fields:
      type: nginx-access
    fields_under_root: true
  - type: log
    enabled: true
    paths:
      - /var/log/nginx/error.log
    fields:
      type: nginx-error
    fields_under_root: true
# Processors
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
# Output to Logstash
output.logstash:
  hosts: ["localhost:5044"]
# Monitoring
monitoring.enabled: true
EOF
# Start Filebeat
sudo systemctl enable filebeat
sudo systemctl start filebeat
3. Optimization
3.1 Elasticsearch Optimization
3.1.1 Index Optimization
// Create an index template (legacy _template API; 8.x also offers composable templates through _index_template)
PUT _template/logstash_template
{
"index_patterns": ["logstash-*"],
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"refresh_interval": "30s",
// Caps shard copies per node: 3 primaries with 1 replica make 6 copies, so a limit of 2 needs at least 3 data nodes
"index.routing.allocation.total_shards_per_node": 2,
"index.codec": "best_compression",
"index.translog.durability": "async",
"index.translog.sync_interval": "5s",
"index.translog.flush_threshold_size": "512mb",
// Segment merge tuning
"index.merge.scheduler.max_thread_count": 1,
"index.merge.policy.max_merged_segment": "5gb",
"index.merge.policy.segments_per_tier": 10,
// Index lifecycle management
"lifecycle.name": "logstash_policy",
"lifecycle.rollover_alias": "logstash"
},
"mappings": {
"dynamic_templates": [
{
"strings_as_keywords": {
"match_mapping_type": "string",
"mapping": {
"type": "keyword"
}
}
}
],
"properties": {
"@timestamp": {
"type": "date"
},
"message": {
"type": "text"
}
}
}
}
// ILM policy
PUT _ilm/policy/logstash_policy
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_size": "50gb",
"max_age": "7d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "1d",
"actions": {
"forcemerge": {
"max_num_segments": 1
},
"shrink": {
"number_of_shards": 1
},
"allocate": {
"number_of_replicas": 1
}
}
},
"cold": {
"min_age": "30d",
"actions": {
"allocate": {
"require": {
"data": "cold"
}
}
}
},
"delete": {
"min_age": "90d",
"actions": {
"delete": {}
}
}
}
}
}
3.1.2 Query Optimization
// Use an index alias
POST /_aliases
{
"actions": [
{
"add": {
"index": "logstash-2023.10.01",
"alias": "logstash-current"
}
}
]
}
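// Slow query log: the thresholds are index-level settings, so apply them through the index settings API
PUT /logstash-*/_settings
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.query.info": "5s",
"index.search.slowlog.threshold.fetch.warn": "1s",
"index.search.slowlog.threshold.fetch.info": "800ms"
}
// Slow log verbosity is a cluster-level logger setting
PUT /_cluster/settings
{
"transient": {
"logger.org.elasticsearch.index.search.slowlog.query": "DEBUG",
"logger.org.elasticsearch.index.search.slowlog.fetch": "DEBUG"
}
}
// Query cache: enabled by default. index.queries.cache.enabled is a static setting, so set it when the index is created
PUT /my-index
{
"settings": {
"index.queries.cache.enabled": true
}
}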
3.1.3 Cluster Optimization
// Cluster-wide settings
PUT /_cluster/settings
{
"persistent": {
// Shard balancing weights
"cluster.routing.allocation.balance.shard": 0.45,
"cluster.routing.allocation.balance.index": 0.55,
"cluster.routing.allocation.balance.threshold": 1.0,
// Concurrent recoveries
"cluster.routing.allocation.node_concurrent_recoveries": 2,
"cluster.routing.allocation.node_initial_primaries_recoveries": 4,
// Disk watermarks
"cluster.routing.allocation.disk.watermark.low": "85%",
"cluster.routing.allocation.disk.watermark.high": "90%",
"cluster.routing.allocation.disk.watermark.flood_stage": "95%",
// Adaptive replica selection
"cluster.routing.use_adaptive_replica_selection": true
}
}
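To make the three disk watermarks concrete: above the low mark Elasticsearch stops allocating new shards to the node, above the high mark it tries to move shards away, and at flood stage it puts a read-only block on indices with shards on that node. The sketch below classifies a disk-usage percentage against those thresholds; `watermark_state` is a hypothetical helper and it handles whole-number percentages only.

```shell
#!/usr/bin/env bash
# watermark_state USED_PCT [LOW] [HIGH] [FLOOD] -> ok | low | high | flood_stage
# Defaults mirror the 85% / 90% / 95% watermarks configured above.
watermark_state() {
  local used="$1" low="${2:-85}" high="${3:-90}" flood="${4:-95}"
  if   [ "$used" -ge "$flood" ]; then echo "flood_stage"
  elif [ "$used" -ge "$high" ];  then echo "high"
  elif [ "$used" -ge "$low" ];   then echo "low"
  else echo "ok"
  fi
}

# Example: classify the root filesystem of the current host
#   watermark_state "$(df --output=pcent / | tail -1 | tr -dc '0-9')"
```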
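// Dedicated node roles
# Set ONE of the following in elasticsearch.yml on each node
node.roles: [master, data_hot] # hot data node
# or
node.roles: [data_warm] # warm data node
# or
node.roles: [data_cold] # cold data node
# or
node.roles: [master] # dedicated master node
# or
node.roles: [ml] # machine learning node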
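3.2 Logstash Optimization
3.2.1 Pipeline Performance
# /etc/logstash/pipelines.yml
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
  pipeline.workers: 4 # number of CPU cores
  pipeline.batch.size: 125 # events per worker batch
  pipeline.batch.delay: 50 # maximum wait (ms) before an undersized batch is flushed
  pipeline.ordered: false # whether to preserve event order
  # Queue settings
  queue.type: persisted
  queue.max_bytes: 4gb
  queue.checkpoint.acks: 1024
  queue.checkpoint.writes: 1024
  queue.checkpoint.interval: 1000
3.2.2 Filter Optimization
# Use conditionals so expensive filters only run on events that need them
filter {
  # Drop unwanted events as early as possible
  if [type] != "application" {
    drop {}
  }
  # Skip further processing when grok fails
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
    tag_on_failure => ["_grokparsefailure"]
  }
  if "_grokparsefailure" in [tags] {
    drop {}
  }
  # Load custom patterns from a directory
  grok {
    patterns_dir => ["/etc/logstash/patterns"]
    match => { "message" => "%{MY_PATTERN}" }
  }
  # Prefer the date filter over custom ruby code
  date {
    match => [ "timestamp", "ISO8601" ]
    target => "@timestamp"
  }
}
3.2.3 Output Optimization
output {
  elasticsearch {
    # Add the user, password, and CA settings from 2.3.2 when security is enabled
    hosts => ["https://localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    # Bulk sizing: flush_size and idle_flush_time are obsolete in current versions of this plugin;
    # bulk request size follows pipeline.batch.size and pipeline.batch.delay
    # Retry policy: failed bulk requests are retried automatically, backing off between these bounds (seconds)
    retry_initial_interval => 1
    retry_max_interval => 60
    retry_on_conflict => 3
    # HTTP connection pool
    pool_max => 1000
    pool_max_per_route => 100
    # Explicit document ID: set this only when every event carries [@metadata][_id];
    # otherwise the literal string becomes the ID and events overwrite each other
    document_id => "%{[@metadata][_id]}"
  }
}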
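3.3 Kibana Optimization
3.3.1 Performance Settings
# Memory: Node.js options belong in /etc/kibana/node.options, not in kibana.yml
# /etc/kibana/node.options
--max-old-space-size=4096
# /etc/kibana/kibana.yml
# Timeouts (milliseconds)
elasticsearch.requestTimeout: 30000
elasticsearch.shardTimeout: 30000
# Legacy setting; confirm that your version still accepts it before enabling
# elasticsearch.startupTimeout: 30000
# Session housekeeping
xpack.security.session.cleanupInterval: "10m"
xpack.security.session.concurrentSessions.maxSessions: 300
# Dashboard loading: the optimize.* options (lazy, lazyPort, lazyPrebuild) belong to the legacy optimizer and are not accepted by Kibana 8.x
# Compression
server.compression.enabled: true
server.compression.referrerWhitelist: ["/"]
3.3.2 Search Optimization
// Search-related advanced settings. In 8.x the .kibana index is a restricted system index,
// so apply these values under Stack Management > Advanced Settings rather than writing the config document directly.
{
"search:includeFrozen": false,
"search:timeout": 60000,
"search:queryLanguage": "kuery",
"metrics:max_buckets": 10000
}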
4. Security Hardening
4.1 Elasticsearch Security Hardening
4.1.1 Network-Layer Security
# /etc/elasticsearch/elasticsearch.yml
# Restrict network access
network.host: _site_
network.bind_host: ["192.168.1.100", "127.0.0.1"]
network.publish_host: 192.168.1.100
# HTTP interface
http.host: _site_
http.port: 9200
http.max_content_length: 100mb
http.max_header_size: 16kb
http.compression: true
# CORS: limit which browser origins, methods, and headers may call the HTTP API
# (this constrains cross-origin browser requests; it does not disable HTTP methods for other clients)
http.cors.enabled: true
http.cors.allow-origin: "https://kibana.example.com"
http.cors.allow-methods: "OPTIONS, HEAD, GET, POST"
http.cors.allow-headers: "X-Requested-With, Content-Type, Content-Length, Authorization"
http.cors.allow-credentials: true
# Audit logging
xpack.security.audit.enabled: true
xpack.security.audit.logfile.events.include: authentication_failed, access_denied, tampered_request, connection_denied
xpack.security.audit.logfile.events.exclude: authentication_success
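4.1.2 TLS/SSL Configuration
# Run from the Elasticsearch home directory so the generated files land where the cp commands below expect them
cd /usr/share/elasticsearch
# Generate the CA
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil ca --out elastic-stack-ca.p12
# Generate the node certificate
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert \
--ca elastic-stack-ca.p12 \
--name elasticsearch \
--dns localhost,elasticsearch.example.com \
--ip 192.168.1.100 \
--out elasticsearch.p12
# Move the certificates to a protected directory
sudo mkdir -p /etc/elasticsearch/certs
sudo cp elastic-stack-ca.p12 /etc/elasticsearch/certs/
sudo cp elasticsearch.p12 /etc/elasticsearch/certs/
sudo chown -R elasticsearch:elasticsearch /etc/elasticsearch/certs
sudo chmod 600 /etc/elasticsearch/certs/*.p12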
4.1.3 Users and Privileges
# Set passwords for the built-in users
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u kibana_system
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u logstash_system
# Create a custom role
# (curl prompts for the elastic password; the CA path assumes 8.x auto-configuration)
curl -X POST "https://localhost:9200/_security/role/logstash_writer" \
--cacert /etc/elasticsearch/certs/http_ca.crt \
-H 'Content-Type: application/json' \
-u elastic \
-d '{
"cluster": ["monitor", "manage_index_templates"],
"indices": [
{
"names": ["logstash-*"],
"privileges": ["write", "create_index", "manage", "read"],
"allow_restricted_indices": false
}
]
}'
# Create a user and assign the role (replace the sample password before use)
curl -X POST "https://localhost:9200/_security/user/logstash_user" \
--cacert /etc/elasticsearch/certs/http_ca.crt \
-H 'Content-Type: application/json' \
-u elastic \
-d '{
"password": "SecurePassword123!",
"roles": ["logstash_writer"],
"full_name": "Logstash User",
"email": "logstash@example.com",
"enabled": true
}'
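4.1.4 Security Policy Configuration
// Define the policy as a role (Elasticsearch has no _security/policy endpoint; privileges are granted through roles)
PUT /_security/role/logstash_policy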
{
"cluster": [
"cluster:monitor/main",
"cluster:admin/ingest/pipeline/get"
],
"indices": [
{
"names": ["logstash-*"],
"privileges": ["read", "write", "create_index", "delete", "manage"]
}
],
"applications": [
{
"application": "kibana-.kibana",
"privileges": ["feature_canvas.all", "feature_discover.all"],
"resources": ["*"]
}
]
}
// API key management
POST /_security/api_key
{
"name": "logstash-api-key",
"role_descriptors": {
"logstash_writer": {
"cluster": ["monitor"],
"indices": [
{
"names": ["logstash-*"],
"privileges": ["write", "create_index"]
}
]
}
},
"expiration": "7d"
}
4.2 Logstash Security Hardening
4.2.1 Input Security
# Encrypt the Beats input with SSL/TLS
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate_authorities => ["/etc/logstash/certs/ca.crt"]
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
    # Client authentication: require a valid client certificate (mutual TLS)
    ssl_verify_mode => "force_peer"
    # Close idle client connections after this many seconds
    client_inactivity_timeout => 3600
  }
  # TLS-protected TCP input
  tcp {
    port => 5000
    mode => "server"
    ssl_enable => true
    ssl_cert => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
    ssl_verify => false
    tcp_keep_alive => true
    # IP allowlisting: the host option only selects the bind address and does not accept a CIDR range,
    # so restrict client networks with firewall rules (see 4.4.1)
  }
}
4.2.2 Data Masking
filter {
  # Remove sensitive fields
  mutate {
    remove_field => [
      "password",
      "credit_card",
      "ssn",
      "token",
      "authorization"
    ]
  }
  # Mask email addresses
  if [email] {
    mutate {
      gsub => [
        "email", "@.*", "@example.com"
      ]
    }
  }
  # Anonymize IP addresses (zero the last octet)
  if [client_ip] {
    mutate {
      gsub => [
        "client_ip", "\.\d+$", ".0"
      ]
    }
  }
  # Mask credit card numbers (keep the last four digits)
  if [credit_card_number] {
    mutate {
      gsub => [
        "credit_card_number", "\d{12}(\d{4})", "************\1"
      ]
    }
  }
}
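The masking rules above are plain regular-expression substitutions, so they can be tried on sample values before being deployed in a pipeline. The sketch below mirrors the three `gsub` rules with `sed`; the function names are hypothetical, and `[0-9]` stands in for `\d` because `sed -E` does not support that shorthand.

```shell
#!/usr/bin/env bash
# Stand-alone equivalents of the gsub rules above, for testing patterns on sample input.
# mask_ip, mask_card, and mask_email are hypothetical helper names.

# Zero the last octet of an IPv4 address
mask_ip()    { printf '%s\n' "$1" | sed -E 's/\.[0-9]+$/.0/'; }

# Replace the first 12 digits of a 16-digit card number with asterisks
mask_card()  { printf '%s\n' "$1" | sed -E 's/[0-9]{12}([0-9]{4})/************\1/'; }

# Replace the domain part of an email address
mask_email() { printf '%s\n' "$1" | sed -E 's/@.*/@example.com/'; }

# Example:
#   mask_ip 192.168.1.57          # 192.168.1.0
#   mask_card 4111111111111111    # ************1111
```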
4.2.3 API Access Control and Logging
# Logstash has no dedicated audit log; these settings restrict the monitoring API and emit structured logs
# /etc/logstash/logstash.yml
api.http.host: "127.0.0.1"
api.http.port: 9600
api.auth.type: basic
api.auth.basic.username: "admin"
api.auth.basic.password: "${LOGSTASH_API_PASSWORD}"
log.level: info
log.format: json
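4.3 Kibana Security Hardening
4.3.1 Access Control
# /etc/kibana/kibana.yml
# Security is controlled by Elasticsearch in 8.x; the Kibana-side xpack.security.enabled flag was removed in 8.0
# Encryption key for session data: a random string of at least 32 characters
xpack.security.encryptionKey: "<random string, 32+ characters>"
xpack.security.session.idleTimeout: "1h"
xpack.security.session.lifespan: "7d"
# Authentication providers
xpack.security.authc.providers:
  basic.basic1:
    order: 0
    description: "Local user login"
  saml.saml1:
    order: 1
    realm: "saml1"
    description: "SAML authentication"
# Request protection (XSRF protection is on by default; server.xsrf.disableProtection turns it off)
server.xsrf.allowlist: ["/api/status"]
server.compression.enabled: true
# CORS
server.cors.enabled: true
server.cors.allowCredentials: true
server.cors.allowOrigin: ["https://your-domain.com"]
# The keys below may not exist in your Kibana version; Kibana rejects unknown keys,
# so check the settings reference before enabling them
# server.cors.allowHeaders: ["Authorization", "Content-Type", "kbn-xsrf"]
# Rate limiting (if unsupported, enforce login rate limits at the reverse proxy instead)
# xpack.security.rateLimiters:
#   login:
#     enabled: true
#     ip: 10
#     username: 5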
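4.3.2 Spaces and Role Management
// Create a Kibana space (Kibana API: send it to the Kibana host with a kbn-xsrf header, not to Elasticsearch)
POST /api/spaces/space
{
"id": "security",
"name": "Security",
"description": "Security monitoring space",
"color": "#aabbcc",
"disabledFeatures": ["timelion", "maps"]
}
// Create a Kibana role (Kibana role API; the Elasticsearch _security/role endpoint does not accept the "kibana" section)
PUT /api/security/role/kibana_security_viewer
{
"elasticsearch": {
"cluster": ["monitor"],
"indices": [
{
"names": ["logs-*", "audit-*"],
"privileges": ["read"]
}
]
},
"kibana": [
{
// base and feature privileges cannot be combined in one entry, so base is left empty
"base": [],
"feature": {
"discover": ["read"],
"dashboard": ["read"],
"visualize": ["read"]
},
"spaces": ["security"]
}
]
}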
4.4 System-Level Hardening
4.4.1 Operating System Security
#!/bin/bash
# security_hardening.sh
# 1. Update the system
sudo yum update -y
# 2. Harden SSH
sudo sed -i 's/^#PermitRootLogin yes/PermitRootLogin no/' /etc/ssh/sshd_config
sudo sed -i 's/^#PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo sed -i 's/^#MaxAuthTries 6/MaxAuthTries 3/' /etc/ssh/sshd_config
sudo systemctl restart sshd
# 3. Configure the firewall
sudo firewall-cmd --permanent --new-zone=elk
sudo firewall-cmd --permanent --zone=elk --add-source=192.168.1.0/24
sudo firewall-cmd --permanent --zone=elk --add-port=9200/tcp
sudo firewall-cmd --permanent --zone=elk --add-port=5601/tcp
sudo firewall-cmd --permanent --zone=elk --add-port=5044/tcp
sudo firewall-cmd --reload
# 4. Configure SELinux (CentOS/RHEL); relevant when a reverse proxy such as httpd or nginx fronts Elasticsearch or Kibana
sudo setsebool -P httpd_can_network_connect 1
sudo semanage port -a -t http_port_t -p tcp 9200
sudo semanage port -a -t http_port_t -p tcp 5601
# 5. Configure auditing
sudo yum install audit -y
sudo tee /etc/audit/rules.d/elk.rules <<EOF
-w /etc/elasticsearch -p wa -k elasticsearch_config
-w /var/lib/elasticsearch -p wa -k elasticsearch_data
-w /etc/kibana -p wa -k kibana_config
-w /var/log/elasticsearch -p wa -k elasticsearch_logs
EOF
sudo service auditd restart
# 6. Configure log rotation
# copytruncate is used instead of signalling the process: the JVM treats SIGHUP as a shutdown request
sudo tee /etc/logrotate.d/elk <<EOF
/var/log/elasticsearch/*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    copytruncate
}
EOF
4.4.2 Monitoring and Alerting
# Enable Elasticsearch monitoring collection
PUT _cluster/settings
{
"persistent": {
"xpack.monitoring.collection.enabled": true,
"xpack.monitoring.collection.interval": "10s",
"xpack.monitoring.history.duration": "7d"
}
}
# Alert on cluster health with a watch
PUT _watcher/watch/cluster_health_watch
{
"trigger": {
"schedule": {
"interval": "1m"
}
},
"input": {
"http": {
"request": {
"scheme": "https",
"host": "localhost",
"port": 9200,
"path": "/_cluster/health",
"auth": {
"basic": {
"username": "elastic",
"password": "{{password}}"
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.status": {
"eq": "red"
}
}
},
"actions": {
"send_email": {
"email": {
"to": "admin@example.com",
"subject": "Cluster Health Alert",
"body": "Cluster status is RED!"
}
}
}
}
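5. Backup and Restore
5.1 Elasticsearch Backup Configuration
# Prepare the snapshot repository
# Create the backup directory (on a multi-node cluster this must be a shared filesystem mounted on every node)
sudo mkdir -p /backup/elasticsearch
sudo chown -R elasticsearch:elasticsearch /backup/elasticsearch
# Allow the path in /etc/elasticsearch/elasticsearch.yml, then restart each node:
# path.repo: ["/backup/elasticsearch"]
# Register the filesystem repository
PUT /_snapshot/my_backup
{
"type": "fs",
"settings": {
"location": "/backup/elasticsearch",
"compress": true,
"max_restore_bytes_per_sec": "100mb",
"max_snapshot_bytes_per_sec": "100mb"
}
}
# Create a snapshot lifecycle policy (runs daily at 01:30)
PUT /_slm/policy/daily-snapshots
{
"schedule": "0 30 1 * * ?",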
"name": "<daily-snap-{now/d}>",
"repository": "my_backup",
"config": {
"indices": ["*"],
"ignore_unavailable": false,
"include_global_state": true,
"metadata": {
"taken_by": "slm",
"taken_because": "Scheduled daily snapshot"
}
},
"retention": {
"expire_after": "30d",
"min_count": 5,
"max_count": 50
}
}
# Take a snapshot immediately
PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true
{
"indices": "logstash-*",
"ignore_unavailable": true,
"include_global_state": false,
"metadata": {
"taken_by": "admin",
"taken_because": "backup before upgrade"
}
}
5.2 Restore Strategy
# List available snapshots
GET /_snapshot/my_backup/_all
# Restore a snapshot
POST /_snapshot/my_backup/snapshot_1/_restore
{
"indices": "logstash-2023.10.*",
"ignore_unavailable": true,
"include_global_state": false,
"rename_pattern": "logstash-(.+)",
"rename_replacement": "restored_logstash-$1"
}
# Restore selected indices
POST /_snapshot/my_backup/snapshot_1/_restore
{
"indices": ["logstash-2023.10.01", "logstash-2023.10.02"],
"index_settings": {
"index.number_of_replicas": 0
},
"ignore_index_settings": ["index.refresh_interval"]
}
6. Monitoring and Maintenance
6.1 Health Check Script
#!/bin/bash
# elk_health_check.sh
ELASTICSEARCH_HOST="localhost:9200"
KIBANA_HOST="localhost:5601"
LOGSTASH_HOST="localhost:9600"
ALERT_EMAIL="admin@example.com"
# CA for verifying the Elasticsearch certificate (path assumes 8.x auto-configuration)
ES_CACERT="/etc/elasticsearch/certs/http_ca.crt"
# Check Elasticsearch
check_elasticsearch() {
    echo "=== Elasticsearch Health Check ==="
    local es_curl=(curl -s --cacert "$ES_CACERT" -u "elastic:$ELASTIC_PASSWORD")
    # Cluster health
    health=$("${es_curl[@]}" "https://$ELASTICSEARCH_HOST/_cluster/health" | jq -r '.status')
    if [[ "$health" != "green" ]]; then
        echo "ALERT: Elasticsearch cluster status: $health"
        send_alert "Elasticsearch cluster status: $health"
    fi
    # Node status
    nodes=$("${es_curl[@]}" "https://$ELASTICSEARCH_HOST/_cat/nodes?v")
    echo "Nodes:"
    echo "$nodes"
    # Red indices (no "v" parameter: its header row would make the output non-empty even with no red index)
    indices=$("${es_curl[@]}" "https://$ELASTICSEARCH_HOST/_cat/indices?health=red")
    if [[ -n "$indices" ]]; then
        echo "ALERT: Red indices found"
        echo "$indices"
        send_alert "Red indices found"
    fi
    # Disk usage
    disk=$("${es_curl[@]}" "https://$ELASTICSEARCH_HOST/_cat/allocation?v")
    echo "Disk allocation:"
    echo "$disk"
}
# Check Kibana
check_kibana() {
    echo "=== Kibana Health Check ==="
    status=$(curl -s -o /dev/null -w "%{http_code}" "https://$KIBANA_HOST/api/status")
    if [[ "$status" != "200" ]]; then
        echo "ALERT: Kibana returned status: $status"
        send_alert "Kibana status: $status"
    fi
    # Overall Kibana status level
    kibana_status=$(curl -s "https://$KIBANA_HOST/api/status" | jq -r '.status.overall.level')
    echo "Kibana status: $kibana_status"
}
# Check Logstash
check_logstash() {
    echo "=== Logstash Health Check ==="
    # Pipeline status (add -u with the API credentials when api.auth.type: basic is enabled, see 4.2.3)
    pipelines=$(curl -s "http://$LOGSTASH_HOST/_node/pipelines?pretty")
    echo "Logstash pipelines:"
    echo "$pipelines" | jq '.pipelines'
    # JVM status
    jvm=$(curl -s "http://$LOGSTASH_HOST/_node/stats/jvm")
    heap_used=$(echo "$jvm" | jq '.jvm.mem.heap_used_percent')
    # Fall back to 0 when the API is unreachable, so the comparison below does not fail on an empty value
    heap_used=${heap_used:-0}
    if (( heap_used > 80 )); then
        echo "ALERT: Logstash heap usage high: $heap_used%"
        send_alert "Logstash heap usage: $heap_used%"
    fi
}
# Send an alert
send_alert() {
    local message="$1"
    echo "Sending alert: $message"
    # By email
    echo "$message" | mail -s "ELK Stack Alert" "$ALERT_EMAIL"
    # Or through a Slack webhook
    # curl -X POST -H 'Content-type: application/json' \
    # --data "{\"text\":\"$message\"}" \
    # https://hooks.slack.com/services/YOUR/WEBHOOK/URL
}
# Main
main() {
    source /etc/elk_credentials.conf # load credentials
    check_elasticsearch
    check_kibana
    check_logstash
    echo "=== Health check completed at $(date) ==="
}
# Schedule with cron
# crontab -e
# 0 */2 * * * /path/to/elk_health_check.sh >> /var/log/elk_health.log 2>&1
main "$@"
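When the health check runs under a monitoring system rather than cron, a numeric exit code is often more useful than an email. The sketch below maps the cluster status string to the conventional 0/1/2/3 (OK, warning, critical, unknown) codes; `health_exit_code` is a hypothetical helper and the mapping is an assumption borrowed from Nagios-style checks, not something the script above defines.

```shell
#!/usr/bin/env bash
# health_exit_code STATUS -> prints 0 (green), 1 (yellow), 2 (red), or 3 (anything else)
# The 0/1/2/3 convention is an assumption taken from Nagios-style plugins.
health_exit_code() {
  case "$1" in
    green)  echo 0 ;;
    yellow) echo 1 ;;
    red)    echo 2 ;;
    *)      echo 3 ;;
  esac
}

# Example: end a check script with the mapped code
#   exit "$(health_exit_code "$health")"
```

An empty status, which is what `jq` yields when the API call fails, falls into the "unknown" branch instead of being treated as healthy.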
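7. Troubleshooting
7.1 Common Problems
# The curl examples assume security is enabled as configured above; curl prompts for the elastic password
ES="https://localhost:9200"
CACERT="/etc/elasticsearch/certs/http_ca.crt"
# 1. Elasticsearch fails to start
# Check the logs
sudo journalctl -u elasticsearch -f
sudo tail -f /var/log/elasticsearch/production-cluster.log
# Check memory locking
sudo grep -i lock /var/log/elasticsearch/*.log
sudo sysctl vm.max_map_count
# 2. Low disk space
# Show disk usage
curl --cacert "$CACERT" -u elastic -X GET "$ES/_cat/allocation?v"
# Delete old indices (8.x rejects wildcard deletes unless action.destructive_requires_name is set to false)
curl --cacert "$CACERT" -u elastic -X DELETE "$ES/logstash-2023.09.*"
# 3. Performance problems
# Inspect hot threads
curl --cacert "$CACERT" -u elastic -X GET "$ES/_nodes/hot_threads"
# Inspect index status
curl --cacert "$CACERT" -u elastic -X GET "$ES/_cat/indices?v&s=docs.count:desc"
# 4. Connection problems
# Check the firewall
sudo firewall-cmd --list-all
# Check SELinux
sudo ausearch -m avc -ts recent
sudo getsebool -a | grep httpd
# 5. Certificate problems
# Regenerate certificates (with --pem the CA is written as a zip archive; unpack it before signing)
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil ca --pem
sudo unzip /usr/share/elasticsearch/elastic-stack-ca.zip -d /usr/share/elasticsearch
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca-cert /usr/share/elasticsearch/ca/ca.crt --ca-key /usr/share/elasticsearch/ca/ca.key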
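When cleaning up old indices by hand, as in step 2 above, it is safer to compute each index's age from its name than to rely on a wildcard. The sketch below does that for the `logstash-YYYY.MM.dd` naming used in this guide; `index_age_days` is a hypothetical helper and it assumes GNU `date` (the `-d` option), as found on CentOS and Ubuntu.

```shell
#!/usr/bin/env bash
# index_age_days INDEX_NAME [TODAY] -> age in whole days of a logstash-YYYY.MM.dd index
# TODAY defaults to the current UTC date; pass it explicitly (YYYY-MM-DD) for reproducible results.
# Requires GNU date.
index_age_days() {
  local index="$1" today="${2:-$(date -u +%F)}"
  local stamp="${index##*-}"     # e.g. 2023.09.01
  local day="${stamp//./-}"      # e.g. 2023-09-01
  local then_s now_s
  then_s=$(date -u -d "$day" +%s) || return 1
  now_s=$(date -u -d "$today" +%s) || return 1
  echo $(( (now_s - then_s) / 86400 ))
}

# Example: list a candidate for deletion when it is older than 90 days
#   [ "$(index_age_days logstash-2023.09.01)" -gt 90 ] && echo "logstash-2023.09.01"
```

The 90-day figure in the example matches the delete phase of the ILM policy in 3.1.1.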
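8. Best Practices Summary
8.1 Deployment
- Separate node roles: run master, data, and ingest nodes on dedicated hosts
- Multiple availability zones: deploy across zones to improve availability
- Capacity planning: derive shard count and size from the data volume
- Monitoring first: set up monitoring and alerting before deploying
8.2 Security
- Least privilege: give each user and service only the permissions it needs
- Network isolation: keep ELK components on an internal network and expose them through a reverse proxy
- Regular updates: keep the ELK Stack and the operating system current
- Audit logs: enable audit logging and review it regularly
8.3 Performance
- Shard sizing: keep each shard between 20 and 50 GB
- Index lifecycle: set up a sensible hot-warm-cold architecture
- Caching: make appropriate use of the query cache and the request cache
- Bulk operations: use the bulk API to reduce the number of requests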
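The shard-sizing guideline above translates into a simple ceiling division. The sketch below estimates the number of primary shards for an index of a given size; `primary_shards` is a hypothetical helper, and the default target of 40 GB is an assumption picked from inside the 20 to 50 GB range.

```shell
#!/usr/bin/env bash
# primary_shards INDEX_SIZE_GB [TARGET_SHARD_GB] -> number of primary shards (at least 1)
# The 40 GB default is an assumed target within the 20-50 GB range recommended above.
primary_shards() {
  local size_gb="$1" target_gb="${2:-40}"
  local n=$(( (size_gb + target_gb - 1) / target_gb ))   # ceiling division
  if [ "$n" -lt 1 ]; then n=1; fi
  echo "$n"
}

# Example: an index expected to reach 100 GB
#   primary_shards 100        # 3 shards of roughly 33 GB each
```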
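8.4 Maintenance
- Regular backups: automate the snapshot policy
- Capacity monitoring: track disk, memory, and CPU usage
- Log rotation: configure a sensible log retention policy
- Periodic optimization: run force merge and shrink on indices that are no longer being written to
This guide covers ELK Stack deployment, optimization, and security hardening from the basic installation through advanced tuning. Adjust the configuration parameters to your environment, and validate everything in a test environment before deploying to production.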