Elasticsearch Plugin
Monitor the Elasticsearch search and analytics engine with metrics covering cluster health, indices, nodes, search performance, and JVM statistics.
Overview
The Elasticsearch plugin collects detailed metrics from the Elasticsearch REST API, including:
- Cluster Health - Status, nodes, shards, active shards, relocating shards
- Index Statistics - Document count, store size, indexing rate, search rate
- Node Metrics - CPU usage, memory usage, disk usage, JVM heap
- Search Performance - Query time, fetch time, scroll queries
- Indexing Performance - Index time, delete time, merge time
- Thread Pools - Active threads, queue size, rejected tasks
- Cache Statistics - Field cache, query cache, request cache
Requirements
Elasticsearch Version
- Minimum: Elasticsearch 7.0
- Recommended: Elasticsearch 8.0 or later
- Tested with: Elasticsearch 7.17, 8.8, 8.10, 8.11
Python Dependencies
pip install "elasticsearch>=8.7.0"
Auto-installed when using PLUGINS=elasticsearch during agent installation.
Elasticsearch Access
The agent needs HTTP access to the Elasticsearch REST API (default port 9200).
Configuration
Basic Configuration
plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200
With Authentication
plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200
    username: elastic
    password: your-password
With HTTPS
plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch.example.com
    port: 9200
    username: elastic
    password: your-password
    use_ssl: true
    verify_ssl: true
All Configuration Options
plugins:
  elasticsearch:
    enabled: true # Enable/disable plugin
    host: localhost # Elasticsearch host
    port: 9200 # Elasticsearch port
    username: elastic # Username (optional)
    password: password # Password (optional)
    use_ssl: false # Use HTTPS (default: false)
    verify_ssl: true # Verify SSL certificates
    ca_cert: /path/to/ca.pem # CA certificate path
    timeout: 10 # Request timeout (seconds)
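As a rough illustration of how the `use_ssl`, `verify_ssl`, and `ca_cert` options could translate into a TLS setup, the sketch below builds a standard-library `ssl.SSLContext`. The helper name `make_ssl_context` is hypothetical, and the plugin may handle these options differently internally.

```python
import ssl
from typing import Optional


def make_ssl_context(verify_ssl: bool = True, ca_cert: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context from the verify_ssl / ca_cert options (hypothetical helper)."""
    # With ca_cert=None the system trust store is used.
    context = ssl.create_default_context(cafile=ca_cert)
    if not verify_ssl:
        # check_hostname must be turned off before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Supplying a CA file through `ca_cert` keeps verification on for clusters that use a private certificate authority, which is generally preferable to setting `verify_ssl: false`.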
Environment Variables
Configuration can be overridden with environment variables:
export ELASTICSEARCH_HOST="localhost"
export ELASTICSEARCH_PORT="9200"
export ELASTICSEARCH_USERNAME="elastic"
export ELASTICSEARCH_PASSWORD="password"
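The override behaviour can be pictured roughly as follows. This is a minimal sketch, not the agent's actual code: the function name `apply_env_overrides` is hypothetical, and treating empty variables as unset is an assumption.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable -> config key, taken from the variables listed above.
ENV_TO_KEY = {
    "ELASTICSEARCH_HOST": "host",
    "ELASTICSEARCH_PORT": "port",
    "ELASTICSEARCH_USERNAME": "username",
    "ELASTICSEARCH_PASSWORD": "password",
}


def apply_env_overrides(config: Dict[str, object],
                        environ: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Return a copy of config with values replaced by any set environment variables."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for env_name, key in ENV_TO_KEY.items():
        value = environ.get(env_name)
        if value:  # assumption: unset or empty variables leave the file value untouched
            merged[key] = int(value) if key == "port" else value
    return merged
```

For example, `apply_env_overrides({"host": "localhost", "port": 9200})` leaves the file values in place unless one of the four variables is exported in the agent's environment.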
Elasticsearch Setup
Installation
Ubuntu/Debian:
# Install Elasticsearch
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update
sudo apt-get install elasticsearch
# Start service
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch
CentOS/RHEL:
# Add repository
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
cat <<EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
# Install
sudo yum install elasticsearch
# Start service
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch
Security Configuration
Elasticsearch 8.x has security enabled by default.
Get initial password:
# Elasticsearch 8.x creates elastic user on first start
# Password is in installation output or reset with:
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
Create monitoring user:
curl -X POST "localhost:9200/_security/user/monitor" \
  -u elastic:password \
  -H "Content-Type: application/json" \
  -d '{
    "password": "monitor-password",
    "roles": ["monitoring_user"],
    "full_name": "Monitoring User"
  }'
Or disable security (not recommended for production):
# /etc/elasticsearch/elasticsearch.yml
xpack.security.enabled: false
Network Configuration
Bind to all interfaces:
# /etc/elasticsearch/elasticsearch.yml
network.host: 0.0.0.0
Security note: Only expose Elasticsearch to trusted networks.
Collected Metrics
Cluster Health Metrics
Metric | Description | Unit | Type |
---|---|---|---|
cluster_status | Cluster status (2=green, 1=yellow, 0=red) | Numeric | Gauge |
cluster_name | Cluster name | String | Info |
cluster_number_of_nodes | Total nodes in cluster | Count | Gauge |
cluster_number_of_data_nodes | Data nodes in cluster | Count | Gauge |
cluster_active_primary_shards | Active primary shards | Count | Gauge |
cluster_active_shards | Active shards (total) | Count | Gauge |
cluster_relocating_shards | Relocating shards | Count | Gauge |
cluster_initializing_shards | Initializing shards | Count | Gauge |
cluster_unassigned_shards | Unassigned shards | Count | Gauge |
cluster_delayed_unassigned_shards | Delayed unassigned shards | Count | Gauge |
cluster_pending_tasks | Number of pending tasks | Count | Gauge |
cluster_in_flight_fetch | Number of in-flight fetch operations | Count | Gauge |
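The numeric encoding of `cluster_status` (2=green, 1=yellow, 0=red) can be sketched as a small lookup on the `status` field of a `_cluster/health` response. The function name below is hypothetical; raising on an unknown status is an assumption about how such a case might be handled.

```python
STATUS_TO_NUMBER = {"green": 2, "yellow": 1, "red": 0}


def cluster_status_value(health: dict) -> int:
    """Convert the "status" field of a _cluster/health response to 2/1/0."""
    status = str(health.get("status", "")).lower()
    if status not in STATUS_TO_NUMBER:
        # Assumption: an unknown or missing status is treated as an error.
        raise ValueError(f"unexpected cluster status: {status!r}")
    return STATUS_TO_NUMBER[status]
```

Encoding the status so that a higher number means a healthier cluster lets a simple "less than 2" threshold cover both yellow and red.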
Node Metrics
Metric | Description | Unit | Type |
---|---|---|---|
node_jvm_heap_used_bytes | JVM heap used (aggregated across all nodes) | Bytes | Gauge |
node_jvm_heap_max_bytes | JVM heap max (aggregated across all nodes) | Bytes | Gauge |
node_jvm_heap_used_percent | JVM heap utilization percentage | Percent | Gauge |
node_memory_used_bytes | OS memory used (aggregated across all nodes) | Bytes | Gauge |
node_memory_total_bytes | OS memory total (aggregated across all nodes) | Bytes | Gauge |
node_memory_used_percent | OS memory utilization percentage | Percent | Gauge |
node_disk_used_bytes | Disk space used (aggregated across all nodes) | Bytes | Gauge |
node_disk_total_bytes | Total disk space (aggregated across all nodes) | Bytes | Gauge |
node_disk_used_percent | Disk utilization percentage | Percent | Gauge |
node_cpu_percent | CPU usage average across all nodes | Percent | Gauge |
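To make the "aggregated across all nodes" wording concrete, here is a minimal sketch that sums JVM heap figures and averages CPU over a `_nodes/stats` response. The field paths (`jvm.mem.heap_used_in_bytes`, `jvm.mem.heap_max_in_bytes`, `os.cpu.percent`) are those of the node stats API. The function name, and the choice to derive the heap percentage from the summed totals rather than averaging per-node percentages, are assumptions.

```python
def aggregate_node_stats(nodes_stats: dict) -> dict:
    """Aggregate JVM heap and CPU figures across all nodes of a _nodes/stats response."""
    nodes = list(nodes_stats.get("nodes", {}).values())
    heap_used = sum(n["jvm"]["mem"]["heap_used_in_bytes"] for n in nodes)
    heap_max = sum(n["jvm"]["mem"]["heap_max_in_bytes"] for n in nodes)
    cpu_values = [n["os"]["cpu"]["percent"] for n in nodes]
    return {
        "node_jvm_heap_used_bytes": heap_used,
        "node_jvm_heap_max_bytes": heap_max,
        # Assumption: percentage is computed from the cluster-wide totals.
        "node_jvm_heap_used_percent": round(100.0 * heap_used / heap_max, 1) if heap_max else 0.0,
        "node_cpu_percent": round(sum(cpu_values) / len(cpu_values), 1) if cpu_values else 0.0,
    }
```

Because the values are cluster-wide, a single node with a nearly full heap can be masked by idle peers; per-node visibility would need separate collection.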
Indexing Metrics
Metric | Description | Unit | Type |
---|---|---|---|
indexing_index_total | Total indexing operations | Count | Counter |
indexing_index_time_ms | Time spent indexing | Milliseconds | Counter |
Search Metrics
Metric | Description | Unit | Type |
---|---|---|---|
search_query_total | Total search queries | Count | Counter |
search_query_time_ms | Time spent in query phase | Milliseconds | Counter |
search_fetch_total | Total fetch operations | Count | Counter |
search_fetch_time_ms | Time spent in fetch phase | Milliseconds | Counter |
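Because these metrics are cumulative counters, an average latency has to be derived from the difference between two samples. The sketch below shows one way to do that. The function name is hypothetical, and returning 0.0 when the counters have not advanced or have reset is an assumption rather than documented dashboard behaviour.

```python
def average_latency_ms(prev: dict, curr: dict, total_key: str, time_key: str) -> float:
    """Average milliseconds per operation between two samples of cumulative counters."""
    delta_ops = curr[total_key] - prev[total_key]
    delta_ms = curr[time_key] - prev[time_key]
    if delta_ops <= 0 or delta_ms < 0:
        # No new operations, or counters reset (for example after a node restart).
        return 0.0
    return delta_ms / delta_ops
```

Called with `total_key="search_query_total"` and `time_key="search_query_time_ms"`, it yields the average query time over the collection interval; the same function works for the fetch and indexing counters.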
Cache Metrics
Metric | Description | Unit | Type |
---|---|---|---|
cache_field_size_bytes | Field data cache size | Bytes | Gauge |
cache_query_size_bytes | Query cache size | Bytes | Gauge |
Thread Pool Metrics
Metric | Description | Unit | Type |
---|---|---|---|
threadpool_search_queue | Search thread pool queue size | Count | Gauge |
threadpool_search_rejected | Search thread pool rejected tasks | Count | Counter |
threadpool_write_queue | Write thread pool queue size | Count | Gauge |
threadpool_write_rejected | Write thread pool rejected tasks | Count | Counter |
Index Statistics
Metric | Description | Unit | Type |
---|---|---|---|
indices_count | Number of indices | Count | Gauge |
indices_docs_count | Total documents across all indices | Count | Gauge |
indices_store_size_bytes | Total index store size | Bytes | Gauge |
indices_shards_total | Total number of shards | Count | Gauge |
indices_shards_primaries | Number of primary shards | Count | Gauge |
indices_shards_replication | Average replication factor | Number | Gauge |
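The average replication factor relates the other two shard counts. Assuming the metric follows the definition used by the Elasticsearch cluster stats API (replica shards divided by primary shards), it can be reproduced as in this sketch; the function name is hypothetical.

```python
def replication_factor(shards_total: int, shards_primaries: int) -> float:
    """Replica shards per primary shard; 0.0 when there are no primaries."""
    if shards_primaries <= 0:
        return 0.0
    return (shards_total - shards_primaries) / shards_primaries
```

With 120 shards in total and 60 primaries the result is 1.0, meaning each primary has one replica on average.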
Dashboard Metrics
The StatusRadar dashboard displays:
Overview Card
- Cluster Status - Green/yellow/red indicator
- Nodes - Total nodes in cluster
- Documents - Total document count
- Store Size - Total index size
Cluster Health Chart
- Active shards over time
- Unassigned shards
- Relocating shards
- Cluster status changes
Search Performance Chart
- Query rate
- Fetch rate
- Average query time
Indexing Performance Chart
- Indexing rate
- Delete rate
- Average indexing time
JVM Heap Chart
- Heap usage over time
- Heap usage percentage
- GC activity
Node Resources Chart
- CPU usage
- Memory usage
- Disk usage
Installation
Quick Install
PLUGINS='elasticsearch' \
TOKEN='your-agent-token' \
ELASTICSEARCH_USERNAME='elastic' \
ELASTICSEARCH_PASSWORD='your-password' \
bash -c "$(curl -sL https://statusradar.dev/install-agent.sh)"
Install on Existing Agent
- Install Python dependency:
cd /opt/statusradar
source venv/bin/activate # If using venv
pip install elasticsearch
- Enable plugin in config:
sudo nano /opt/statusradar/config/agent.yaml
Add:
plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200
    username: elastic
    password: your-password
- Restart agent:
sudo systemctl restart statusradar-agent
- Verify:
sudo journalctl -u statusradar-agent -n 50 --no-pager | grep elasticsearch
Expected:
INFO: Plugin elasticsearch: Metrics collected successfully
INFO: Plugin elasticsearch: Cluster green, 3 nodes, 1234567 docs
Testing
Manual Plugin Test
cd /opt/statusradar
python3 plugins/elasticsearch_plugin.py
Expected Output:
Plugin: elasticsearch
Enabled: True
Available: True
Collecting metrics...
{
  "cluster_status": "green",
  "nodes_total": 3,
  "nodes_data": 3,
  "shards_active": 120,
  "shards_primary": 60,
  "shards_unassigned": 0,
  "indices_count": 25,
  "docs_count": 1234567,
  "store_size_bytes": 5368709120,
  "indexing_total": 987654,
  "search_query_total": 456789,
  "jvm_heap_used_percent": 45.2,
  "cpu_percent": 12.5
}
Test Elasticsearch Connection
# Check cluster health
curl -u elastic:password "http://localhost:9200/_cluster/health?pretty"
# Check node stats
curl -u elastic:password "http://localhost:9200/_nodes/stats?pretty"
# Check indices
curl -u elastic:password "http://localhost:9200/_cat/indices?v"
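The same cluster health check can be prepared from Python with only the standard library, which is handy when `curl` is not available on the host. This is a hedged sketch: `build_health_request` is a hypothetical helper that only constructs the request, and it is not how the plugin itself talks to Elasticsearch.

```python
import base64
import urllib.request


def build_health_request(host: str, port: int, username: str = "", password: str = "",
                         use_ssl: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the cluster health endpoint."""
    scheme = "https" if use_ssl else "http"
    request = urllib.request.Request(f"{scheme}://{host}:{port}/_cluster/health")
    if username:
        # HTTP Basic authentication, equivalent to curl's -u user:password.
        token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=10)` and parse the body with `json.load`.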
Troubleshooting
Plugin Not Collecting Metrics
Check 1: Is Elasticsearch running?
sudo systemctl status elasticsearch
Check 2: Can agent connect to Elasticsearch?
curl -u elastic:password http://localhost:9200
Check 3: Is Python package installed?
python3 -c "import elasticsearch; print(elasticsearch.__version__)"
Check 4: Check agent logs
sudo journalctl -u statusradar-agent -n 100 --no-pager | grep elasticsearch
Common Errors
"Authentication failed"
Error:
ERROR: Plugin elasticsearch: 401 Unauthorized
Causes:
- Wrong username/password
- Security enabled but no credentials provided
- User doesn't have monitoring permissions
Solution:
# Elasticsearch 8.x - reset password
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
# Or disable security (not recommended)
# Edit /etc/elasticsearch/elasticsearch.yml:
# xpack.security.enabled: false
"Connection refused"
Error:
ERROR: Plugin elasticsearch: Connection refused
Causes:
- Elasticsearch not running
- Wrong host/port
- Firewall blocking connection
Solution:
# Check Elasticsearch is running
sudo systemctl status elasticsearch
# Check port
sudo netstat -tlnp | grep 9200
# Test connection
curl http://localhost:9200
"No module named 'elasticsearch'"
Error:
ERROR: No module named 'elasticsearch'
Solution:
pip install elasticsearch
# Or if using venv:
cd /opt/statusradar && source venv/bin/activate && pip install elasticsearch
"SSL certificate verification failed"
Error:
ERROR: Plugin elasticsearch: SSL certificate verification failed
Solution:
plugins:
  elasticsearch:
    use_ssl: true
    verify_ssl: false # Disable verification (not recommended)
    # Or provide CA certificate:
    # ca_cert: /etc/elasticsearch/certs/ca.crt
Performance Impact
On Elasticsearch
Minimal impact:
- Stats API returns pre-calculated statistics
- No search or indexing operations
- Response time: < 100ms
Benchmark:
- Overhead: < 0.1% CPU
- No measurable performance degradation
On Agent
Resource usage:
- Memory: +20 MB
- CPU: +4% during collection
- Network: +3 KB per collection
Collection time: 0.2-1 second
Use Cases
1. Cluster Health Monitoring
Monitor:
- Cluster status (green/yellow/red)
- Unassigned shards
- Node count
Alert on:
- Cluster status yellow/red
- Unassigned shards > 0
- Node count decreased
2. Search Performance
Monitor:
- Query rate
- Average query time
- Slow queries
Alert on:
- Average query time > 1 second
- Query rate spike
- Search queue growing
3. Indexing Performance
Monitor:
- Indexing rate
- Average indexing time
- Bulk rejections
Alert on:
- Indexing time increasing
- Bulk queue rejections
- Merge time excessive
4. JVM Heap Monitoring
Monitor:
- Heap usage percentage
- GC frequency
- GC duration
Alert on:
- Heap > 75% (risk of OOM)
- Frequent GC pauses
- Long GC pauses
5. Shard Management
Monitor:
- Shard distribution
- Relocating shards
- Initializing shards
Optimize:
- Shard allocation
- Rebalancing
- Shard size
Best Practices
1. Set Appropriate Heap Size
Recommendation: 50% of system RAM, max 31GB
# /etc/elasticsearch/jvm.options
# Set Xms and Xmx to the same value
-Xms16g
-Xmx16g
Never:
- Exceed 31GB (compressed oops limit)
- Use more than 50% of RAM
- Set Xms != Xmx
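The sizing rule (half of system RAM, capped at 31 GB, with Xms equal to Xmx) is simple enough to express as a small calculation. The function names below are hypothetical, and rounding down to whole gigabytes with a 1 GB floor is an assumption made for the sketch.

```python
def recommended_heap_gb(system_ram_gb: float, max_heap_gb: float = 31.0) -> int:
    """Half of system RAM, capped at max_heap_gb, rounded down to whole gigabytes."""
    # Assumption: never suggest less than 1 GB.
    return max(1, int(min(system_ram_gb / 2, max_heap_gb)))


def jvm_heap_options(system_ram_gb: float) -> list:
    """Return matching -Xms/-Xmx lines for jvm.options."""
    heap = recommended_heap_gb(system_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]
```

A host with 32 GB of RAM yields `-Xms16g` and `-Xmx16g`, matching the example above, while a 128 GB host is capped at 31 GB.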
2. Monitor Cluster Status
- Green: All primary and replica shards allocated
- Yellow: All primary shards allocated, some replicas missing
- Red: Some primary shards not allocated
Alert on yellow/red status immediately.
3. Manage Shard Count
Guidelines:
- Keep shard size between 10-50GB
- Limit shards per node to 20-25 per GB of heap
- Use index lifecycle management (ILM)
Calculate optimal shards:
Shards per index = Index size / Target shard size (30GB)
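As a worked version of the formula above, the sketch below rounds the result up so that no shard exceeds the target size. The function name is hypothetical, and always returning at least one shard is an assumption.

```python
import math


def shards_for_index(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Number of primary shards so that each stays at or below the target size."""
    if target_shard_gb <= 0:
        raise ValueError("target_shard_gb must be positive")
    return max(1, math.ceil(index_size_gb / target_shard_gb))
```

A 100 GB index with the 30 GB target gives 4 shards of about 25 GB each, inside the 10-50 GB guideline.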
4. Enable Slow Logs
# Index-level settings: apply per index with PUT /<index>/_settings (not in elasticsearch.yml)
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.fetch.warn: 1s
index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s
5. Monitor JVM Heap
Healthy heap usage: 40-75%
If heap > 75%:
- Increase heap size (up to 31GB)
- Reduce field data cache
- Reduce query load
- Add more nodes
6. Use Monitoring User
Create dedicated read-only monitoring user:
curl -X POST "localhost:9200/_security/user/monitor" \
  -u elastic:password \
  -H "Content-Type: application/json" \
  -d '{
    "password": "monitor-password",
    "roles": ["monitoring_user"]
  }'
Elasticsearch Performance Tuning
Indexing Optimization
Bulk indexing:
POST /_bulk
{ "index": { "_index": "logs" } }
{ "message": "Log entry", "timestamp": "2025-10-15" }
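The bulk request body is newline-delimited JSON: one action line followed by one document line, with a trailing newline after the last line. A minimal sketch of assembling such a body in Python follows; `build_bulk_body` is a hypothetical helper, not part of the plugin or the Elasticsearch client.

```python
import json
from typing import Iterable


def build_bulk_body(index: str, docs: Iterable[dict]) -> str:
    """Serialize documents as an NDJSON body of index actions for POST /_bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document line
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting string is sent with the `Content-Type: application/x-ndjson` header. Batching many documents per request reduces per-request overhead compared with indexing them one at a time.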
Refresh interval (default: 1s):
{
  "index": {
    "refresh_interval": "30s"
  }
}
Search Optimization
Use filters instead of queries when possible:
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "published" } }
      ]
    }
  }
}
Enable request cache:
{
  "index": {
    "requests": {
      "cache": {
        "enable": true
      }
    }
  }
}
Memory Settings
Field data circuit breaker:
indices.breaker.fielddata.limit: 40%
Request circuit breaker:
indices.breaker.request.limit: 60%
Advanced Configuration
Elasticsearch Cluster
Monitor each node separately:
plugins:
  elasticsearch_node1:
    enabled: true
    host: es-node1.internal
    port: 9200
    username: elastic
    password: password
  elasticsearch_node2:
    enabled: true
    host: es-node2.internal
    port: 9200
    username: elastic
    password: password
Docker Container
Monitor Elasticsearch in Docker:
plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch-container
    port: 9200
    username: elastic
    password: password
Docker run:
docker run -d --name elasticsearch \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "ELASTIC_PASSWORD=password" \
docker.elastic.co/elasticsearch/elasticsearch:8.11.0
Elastic Cloud
Monitor Elastic Cloud deployment:
plugins:
  elasticsearch:
    enabled: true
    host: my-deployment.es.eastus2.azure.elastic-cloud.com
    port: 9243
    username: elastic
    password: cloud-password
    use_ssl: true
    verify_ssl: true
Example Configurations
Basic Local
plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200
Production with Authentication
plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch.internal
    port: 9200
    username: monitor
    password: ${ELASTICSEARCH_PASSWORD}
    use_ssl: true
    verify_ssl: true
Elasticsearch Cluster (3 nodes)
plugins:
  es_master:
    enabled: true
    host: es-master.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}
  es_data1:
    enabled: true
    host: es-data1.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}
  es_data2:
    enabled: true
    host: es-data2.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}
Limitations
Current Limitations
- No per-index details - Only cluster-wide statistics
- No query analysis - Use slow logs for query debugging
- No pipeline metrics - Ingest pipeline stats not collected
Scalability
Tested with:
- Clusters with 100+ nodes
- Indices with 1TB+ data
- 10,000+ docs/second indexing rate
Performance:
- Stats API response time constant regardless of cluster size
- No impact on Elasticsearch performance
Monitoring Best Practices
Critical Metrics
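- Cluster status red - At least one primary shard is unassigned; data is unavailable and may be lost
- Unassigned shards > 0 - Replica missing
- JVM heap > 85% - OOM risk
- Free disk space < 10% - High disk watermark (90% used by default) reached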
Alert Thresholds
# Recommended thresholds
cluster_status: != green
shards_unassigned: > 0
jvm_heap_used_percent: > 75
search_query_time_avg_ms: > 1000
indexing_time_avg_ms: > 500
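One way to apply these thresholds to a collected metrics dictionary is sketched below. The rule table mirrors the values listed above; the function name `triggered_alerts` and the decision to skip metrics that are absent from the sample are assumptions, not documented agent behaviour.

```python
import operator

# (metric, comparison, threshold) rules taken from the thresholds listed above.
RULES = [
    ("cluster_status", operator.ne, "green"),
    ("shards_unassigned", operator.gt, 0),
    ("jvm_heap_used_percent", operator.gt, 75),
    ("search_query_time_avg_ms", operator.gt, 1000),
    ("indexing_time_avg_ms", operator.gt, 500),
]


def triggered_alerts(metrics: dict) -> list:
    """Return the names of metrics that breach their threshold; missing metrics are skipped."""
    return [name for name, breached, limit in RULES
            if name in metrics and breached(metrics[name], limit)]
```

Keeping the rules in a plain table makes it straightforward to tune a threshold for a particular cluster without touching the evaluation logic.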
Troubleshooting Performance
Slow Searches
Symptoms: High query time
Solutions:
- Enable query cache
- Use filters instead of queries
- Reduce shard count
- Add more nodes
- Check slow logs
High JVM Heap
Symptoms: Heap > 75%
Solutions:
- Increase heap size (max 31GB)
- Reduce field data
- Disable _source field if not needed
- Use doc_values
- Add more nodes
Unassigned Shards
Symptoms: Yellow/red cluster status
Solutions:
- Check disk space
- Increase
index.number_of_replicas
if needed - Check shard allocation settings
- Manually reroute shards
# Retry failed allocations
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true"
Next Steps