Elasticsearch Plugin

Monitor the Elasticsearch search and analytics engine with metrics covering cluster health, indices, nodes, search performance, and JVM statistics.

Overview

The Elasticsearch plugin collects detailed metrics from the Elasticsearch REST API, including:

Cluster Health - Status, nodes, shards, active shards, relocating shards
Index Statistics - Document count, store size, indexing rate, search rate
Node Metrics - CPU usage, memory usage, disk usage, JVM heap
Search Performance - Query time, fetch time, scroll queries
Indexing Performance - Index time, delete time, merge time
Thread Pools - Active threads, queue size, rejected tasks
Cache Statistics - Field cache, query cache, request cache

Requirements

Elasticsearch Version

Minimum: Elasticsearch 7.0
Recommended: Elasticsearch 8.0 or later
Tested with: Elasticsearch 7.17, 8.8, 8.10, 8.11

Python Dependencies

pip install 'elasticsearch>=8.7.0'

Auto-installed when using PLUGINS=elasticsearch during agent installation.

Elasticsearch Access

The agent needs HTTP access to the Elasticsearch REST API (default port 9200).

Configuration

Basic Configuration

plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200

With Authentication

plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200
    username: elastic
    password: your-password

With HTTPS

plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch.example.com
    port: 9200
    username: elastic
    password: your-password
    use_ssl: true
    verify_ssl: true

All Configuration Options

plugins:
  elasticsearch:
    enabled: true                    # Enable/disable plugin
    host: localhost                  # Elasticsearch host
    port: 9200                       # Elasticsearch port
    username: elastic                # Username (optional)
    password: password               # Password (optional)
    use_ssl: false                   # Use HTTPS (default: false)
    verify_ssl: true                 # Verify SSL certificates
    ca_cert: /path/to/ca.pem        # CA certificate path
    timeout: 10                      # Request timeout (seconds)
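
As an illustration of how these options could translate into connection parameters, the sketch below builds keyword arguments for the `elasticsearch` 8.x Python client. The helper name `client_kwargs` is hypothetical, and the mapping from plugin options to client parameters is an assumption, not the plugin's confirmed implementation.

```python
def client_kwargs(cfg: dict) -> dict:
    """Translate plugin options into keyword arguments for the 8.x client.

    Hypothetical helper: the option-to-parameter mapping is an assumption.
    """
    scheme = "https" if cfg.get("use_ssl", False) else "http"
    host = cfg.get("host", "localhost")
    port = cfg.get("port", 9200)
    kwargs = {
        "hosts": [f"{scheme}://{host}:{port}"],
        "request_timeout": cfg.get("timeout", 10),
    }
    # Credentials are optional; only pass them when a username is set.
    if cfg.get("username"):
        kwargs["basic_auth"] = (cfg["username"], cfg.get("password", ""))
    # TLS options only make sense for HTTPS connections.
    if scheme == "https":
        kwargs["verify_certs"] = cfg.get("verify_ssl", True)
        if cfg.get("ca_cert"):
            kwargs["ca_certs"] = cfg["ca_cert"]
    return kwargs
```

With the client installed, the result could be passed on as `Elasticsearch(**client_kwargs(cfg))`.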

Environment Variables

Configuration can be overridden with environment variables:

export ELASTICSEARCH_HOST="localhost"
export ELASTICSEARCH_PORT="9200"
export ELASTICSEARCH_USERNAME="elastic"
export ELASTICSEARCH_PASSWORD="password"
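
The override behaviour can be pictured with a short sketch. It assumes that a set, non-empty environment variable wins over the file value and that the port is converted to an integer; the helper name `apply_env_overrides` is hypothetical and the agent's actual precedence rules may differ.

```python
import os

# Environment variable -> (config key, type conversion).
ENV_OVERRIDES = {
    "ELASTICSEARCH_HOST": ("host", str),
    "ELASTICSEARCH_PORT": ("port", int),
    "ELASTICSEARCH_USERNAME": ("username", str),
    "ELASTICSEARCH_PASSWORD": ("password", str),
}


def apply_env_overrides(config: dict, environ=None) -> dict:
    """Return a copy of config with environment overrides applied."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for var, (key, cast) in ENV_OVERRIDES.items():
        value = environ.get(var, "")
        if value != "":
            merged[key] = cast(value)
    return merged
```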

Elasticsearch Setup

Installation

Ubuntu/Debian:

# Install Elasticsearch
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

sudo apt-get update
sudo apt-get install elasticsearch

# Start service
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch

CentOS/RHEL:

# Add repository
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

cat <<EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

# Install
sudo yum install elasticsearch

# Start service
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch

Security Configuration

Elasticsearch 8.x has security enabled by default.

Get initial password:

# Elasticsearch 8.x creates elastic user on first start
# Password is in installation output or reset with:
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

Create monitoring user:

curl -X POST "localhost:9200/_security/user/monitor" \
  -u elastic:password \
  -H "Content-Type: application/json" \
  -d '{
    "password": "monitor-password",
    "roles": ["monitoring_user"],
    "full_name": "Monitoring User"
  }'

Or disable security (not recommended for production):

# /etc/elasticsearch/elasticsearch.yml
xpack.security.enabled: false

Network Configuration

Bind to all interfaces:

# /etc/elasticsearch/elasticsearch.yml
network.host: 0.0.0.0

Security note: Only expose Elasticsearch to trusted networks.

Collected Metrics

Cluster Health Metrics

Metric	Description	Unit	Type
`cluster_status`	Cluster status (2=green, 1=yellow, 0=red)	Numeric	Gauge
`cluster_name`	Cluster name	String	Info
`cluster_number_of_nodes`	Total nodes in cluster	Count	Gauge
`cluster_number_of_data_nodes`	Data nodes in cluster	Count	Gauge
`cluster_active_primary_shards`	Active primary shards	Count	Gauge
`cluster_active_shards`	Active shards (total)	Count	Gauge
`cluster_relocating_shards`	Relocating shards	Count	Gauge
`cluster_initializing_shards`	Initializing shards	Count	Gauge
`cluster_unassigned_shards`	Unassigned shards	Count	Gauge
`cluster_delayed_unassigned_shards`	Delayed unassigned shards	Count	Gauge
`cluster_pending_tasks`	Number of pending tasks	Count	Gauge
`cluster_in_flight_fetch`	Number of in-flight fetch operations	Count	Gauge
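
The numeric encoding of `cluster_status` can be reproduced with a small mapping. This is a minimal sketch: the function name `status_to_gauge` is hypothetical, and how the plugin performs the conversion internally is an assumption.

```python
# Numeric encoding from the table above: 2=green, 1=yellow, 0=red.
STATUS_VALUES = {"green": 2, "yellow": 1, "red": 0}


def status_to_gauge(status: str) -> int:
    """Map the status string from GET /_cluster/health to a gauge value."""
    key = status.strip().lower()
    if key not in STATUS_VALUES:
        raise ValueError(f"unknown cluster status: {status!r}")
    return STATUS_VALUES[key]
```

Encoding the status as a number makes it possible to chart status changes and to alert on any value below 2.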

Node Metrics

Metric	Description	Unit	Type
`node_jvm_heap_used_bytes`	JVM heap used (aggregated across all nodes)	Bytes	Gauge
`node_jvm_heap_max_bytes`	JVM heap max (aggregated across all nodes)	Bytes	Gauge
`node_jvm_heap_used_percent`	JVM heap utilization percentage	Percent	Gauge
`node_memory_used_bytes`	OS memory used (aggregated across all nodes)	Bytes	Gauge
`node_memory_total_bytes`	OS memory total (aggregated across all nodes)	Bytes	Gauge
`node_memory_used_percent`	OS memory utilization percentage	Percent	Gauge
`node_disk_used_bytes`	Disk space used (aggregated across all nodes)	Bytes	Gauge
`node_disk_total_bytes`	Total disk space (aggregated across all nodes)	Bytes	Gauge
`node_disk_used_percent`	Disk utilization percentage	Percent	Gauge
`node_cpu_percent`	CPU usage average across all nodes	Percent	Gauge
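
To show what "aggregated across all nodes" can mean for the heap metrics, the sketch below sums the per-node `jvm.mem` fields of a `GET /_nodes/stats` response and derives the percentage from the totals. Whether the plugin computes the percentage from summed totals or averages per-node percentages is an assumption; `aggregate_heap` is a hypothetical name.

```python
def aggregate_heap(nodes_stats: dict) -> dict:
    """Sum JVM heap usage over every node in a _nodes/stats response."""
    used = 0
    maximum = 0
    for node in nodes_stats.get("nodes", {}).values():
        mem = node["jvm"]["mem"]
        used += mem["heap_used_in_bytes"]
        maximum += mem["heap_max_in_bytes"]
    # Guard against an empty response to avoid dividing by zero.
    percent = round(100.0 * used / maximum, 1) if maximum else 0.0
    return {
        "node_jvm_heap_used_bytes": used,
        "node_jvm_heap_max_bytes": maximum,
        "node_jvm_heap_used_percent": percent,
    }


# Two-node example with made-up numbers: 4 GiB and 2 GiB used of 8 GiB each.
GIB = 1024 ** 3
sample = {
    "nodes": {
        "node-a": {"jvm": {"mem": {"heap_used_in_bytes": 4 * GIB, "heap_max_in_bytes": 8 * GIB}}},
        "node-b": {"jvm": {"mem": {"heap_used_in_bytes": 2 * GIB, "heap_max_in_bytes": 8 * GIB}}},
    }
}
result = aggregate_heap(sample)
```

A percentage derived from totals can hide a single node that is close to its limit, so per-node inspection is still useful when the aggregate looks healthy.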

Indexing Metrics

Metric	Description	Unit	Type
`indexing_index_total`	Total indexing operations	Count	Counter
`indexing_index_time_ms`	Time spent indexing	Milliseconds	Counter

Search Metrics

Metric	Description	Unit	Type
`search_query_total`	Total search queries	Count	Counter
`search_query_time_ms`	Time spent in query phase	Milliseconds	Counter
`search_fetch_total`	Total fetch operations	Count	Counter
`search_fetch_time_ms`	Time spent in fetch phase	Milliseconds	Counter
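
Because these metrics are counters, rates and averages come from the difference between two consecutive samples rather than from a single value. The following sketch shows one way to derive a query rate and an average query time; the function name and the output keys are hypothetical.

```python
def search_rates(prev: dict, curr: dict, interval_s: float) -> dict:
    """Derive query rate and average query latency from two counter samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    queries = curr["search_query_total"] - prev["search_query_total"]
    time_ms = curr["search_query_time_ms"] - prev["search_query_time_ms"]
    if queries < 0 or time_ms < 0:
        # Counters went backwards (for example after a node restart):
        # the interval cannot be evaluated.
        return {"query_rate_per_s": None, "avg_query_time_ms": None}
    return {
        "query_rate_per_s": queries / interval_s,
        "avg_query_time_ms": time_ms / queries if queries else 0.0,
    }
```

The same pattern applies to the fetch and indexing counters.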

Cache Metrics

Metric	Description	Unit	Type
`cache_field_size_bytes`	Field data cache size	Bytes	Gauge
`cache_query_size_bytes`	Query cache size	Bytes	Gauge

Thread Pool Metrics

Metric	Description	Unit	Type
`threadpool_search_queue`	Search thread pool queue size	Count	Gauge
`threadpool_search_rejected`	Search thread pool rejected tasks	Count	Counter
`threadpool_write_queue`	Write thread pool queue size	Count	Gauge
`threadpool_write_rejected`	Write thread pool rejected tasks	Count	Counter

Index Statistics

Metric	Description	Unit	Type
`indices_count`	Number of indices	Count	Gauge
`indices_docs_count`	Total documents across all indices	Count	Gauge
`indices_store_size_bytes`	Total index store size	Bytes	Gauge
`indices_shards_total`	Total number of shards	Count	Gauge
`indices_shards_primaries`	Number of primary shards	Count	Gauge
`indices_shards_replication`	Average replication factor	Number	Gauge
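
The replication factor relates to the two shard counts above it as replicas per primary shard. The sketch below assumes `indices_shards_replication` follows that definition; the function name is hypothetical.

```python
def replication_factor(shards_total: int, shards_primaries: int) -> float:
    """Average number of replica shards per primary shard."""
    if shards_primaries <= 0:
        return 0.0
    return (shards_total - shards_primaries) / shards_primaries
```

For example, 120 shards in total with 60 primaries means every primary has one replica on average.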

Dashboard Metrics

The StatusRadar dashboard displays:

Overview Card

Cluster Status - Green/yellow/red indicator
Nodes - Total nodes in cluster
Documents - Total document count
Store Size - Total index size

Cluster Health Chart

Active shards over time
Unassigned shards
Relocating shards
Cluster status changes

Search Performance Chart

Query rate
Fetch rate
Average query time

Indexing Performance Chart

Indexing rate
Delete rate
Average indexing time

JVM Heap Chart

Heap usage over time
Heap usage percentage
GC activity

Node Resources Chart

CPU usage
Memory usage
Disk usage

Installation

Quick Install

PLUGINS='elasticsearch' \
TOKEN='your-agent-token' \
ELASTICSEARCH_USERNAME='elastic' \
ELASTICSEARCH_PASSWORD='your-password' \
bash -c "$(curl -sL https://statusradar.dev/install-agent.sh)"

Install on Existing Agent

Install Python dependency:

cd /opt/statusradar
source venv/bin/activate  # If using venv
pip install 'elasticsearch>=8.7.0'

Enable plugin in config:

sudo nano /opt/statusradar/config/agent.yaml

Add:

plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200
    username: elastic
    password: your-password

Restart agent:

sudo systemctl restart statusradar-agent

Verify:

sudo journalctl -u statusradar-agent -n 50 --no-pager | grep elasticsearch

Expected:

INFO: Plugin elasticsearch: Metrics collected successfully
INFO: Plugin elasticsearch: Cluster green, 3 nodes, 1234567 docs

Testing

Manual Plugin Test

cd /opt/statusradar
python3 plugins/elasticsearch_plugin.py

Expected Output:

Plugin: elasticsearch
Enabled: True
Available: True

Collecting metrics...
{
  "cluster_status": "green",
  "nodes_total": 3,
  "nodes_data": 3,
  "shards_active": 120,
  "shards_primary": 60,
  "shards_unassigned": 0,
  "indices_count": 25,
  "docs_count": 1234567,
  "store_size_bytes": 5368709120,
  "indexing_total": 987654,
  "search_query_total": 456789,
  "jvm_heap_used_percent": 45.2,
  "cpu_percent": 12.5
}

Test Elasticsearch Connection

# Quote the URLs so the shell does not interpret "?" as a glob character

# Check cluster health
curl -u elastic:password "http://localhost:9200/_cluster/health?pretty"

# Check node stats
curl -u elastic:password "http://localhost:9200/_nodes/stats?pretty"

# Check indices
curl -u elastic:password "http://localhost:9200/_cat/indices?v"

Troubleshooting

Plugin Not Collecting Metrics

Check 1: Is Elasticsearch running?

sudo systemctl status elasticsearch

Check 2: Can agent connect to Elasticsearch?

curl -u elastic:password http://localhost:9200

Check 3: Is Python package installed?

python3 -c "import elasticsearch; print(elasticsearch.__version__)"

Check 4: Check agent logs

sudo journalctl -u statusradar-agent -n 100 --no-pager | grep elasticsearch

Common Errors

"Authentication failed"

Error:

ERROR: Plugin elasticsearch: 401 Unauthorized

Causes:

Wrong username/password
Security enabled but no credentials configured for the plugin
User doesn't have monitoring permissions (Elasticsearch typically reports this as 403 Forbidden)

Solution:

# Elasticsearch 8.x - reset password
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

# Or disable security (not recommended)
# Edit /etc/elasticsearch/elasticsearch.yml:
# xpack.security.enabled: false

"Connection refused"

Error:

ERROR: Plugin elasticsearch: Connection refused

Causes:

Elasticsearch not running
Wrong host/port
Firewall blocking connection

Solution:

# Check Elasticsearch is running
sudo systemctl status elasticsearch

# Check port (use "sudo ss -tlnp" if netstat is not installed)
sudo netstat -tlnp | grep 9200

# Test connection
curl http://localhost:9200

"No module named 'elasticsearch'"

Error:

ERROR: No module named 'elasticsearch'

Solution:

pip install 'elasticsearch>=8.7.0'
# Or if using venv:
cd /opt/statusradar && source venv/bin/activate && pip install 'elasticsearch>=8.7.0'

"SSL certificate verification failed"

Error:

ERROR: Plugin elasticsearch: SSL certificate verification failed

Solution:

plugins:
  elasticsearch:
    use_ssl: true
    verify_ssl: false  # Disable verification (not recommended)
    # Or provide CA certificate:
    # ca_cert: /etc/elasticsearch/certs/ca.crt

Performance Impact

On Elasticsearch

Minimal impact:

Stats API returns pre-calculated statistics
No search or indexing operations
Response time: < 100ms

Benchmark:

Overhead: < 0.1% CPU
No measurable performance degradation

On Agent

Resource usage:

Memory: +20 MB
CPU: +4% during collection
Network: +3 KB per collection

Collection time: 0.2-1 second

Use Cases

1. Cluster Health Monitoring

Monitor:

Cluster status (green/yellow/red)
Unassigned shards
Node count

Alert on:

Cluster status yellow/red
Unassigned shards > 0
Node count decreased

2. Search Performance

Monitor:

Query rate
Average query time
Slow queries

Alert on:

Average query time > 1 second
Query rate spike
Search queue growing

3. Indexing Performance

Monitor:

Indexing rate
Average indexing time
Bulk rejections

Alert on:

Indexing time increasing
Bulk queue rejections
Merge time excessive

4. JVM Heap Monitoring

Monitor:

Heap usage percentage
GC frequency
GC duration

Alert on:

Heap > 75% (risk of OOM)
Frequent GC pauses
Long GC pauses

5. Shard Management

Monitor:

Shard distribution
Relocating shards
Initializing shards

Optimize:

Shard allocation
Rebalancing
Shard size

Best Practices

1. Set Appropriate Heap Size

Recommendation: 50% of system RAM, max 31GB

# /etc/elasticsearch/jvm.options
-Xms16g
-Xmx16g  # Same as Xms

Never:

Exceed 31GB (compressed oops limit)
Use more than 50% of RAM
Set Xms != Xmx
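
The sizing rule can be written down as a small calculation. This is a sketch of the rule stated above (half of RAM, capped at 31 GB, identical Xms and Xmx); the function names are hypothetical, and whole gigabytes are used for simplicity.

```python
GIB = 1024 ** 3
MAX_HEAP_BYTES = 31 * GIB  # stay below the compressed oops limit


def recommended_heap_gib(total_ram_bytes: int) -> int:
    """Half of system RAM, capped at 31 GiB, rounded down to whole GiB."""
    half = total_ram_bytes // 2
    return max(1, min(half, MAX_HEAP_BYTES) // GIB)


def jvm_options(total_ram_bytes: int) -> list:
    """Return matching -Xms/-Xmx lines for jvm.options."""
    heap = recommended_heap_gib(total_ram_bytes)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]
```

For a host with 32 GiB of RAM this yields the `-Xms16g` / `-Xmx16g` pair shown above.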

2. Monitor Cluster Status

Green: All primary and replica shards allocated
Yellow: All primary shards allocated, some replicas missing
Red: Some primary shards not allocated

Alert on yellow/red status immediately.

3. Manage Shard Count

Guidelines:

Keep shard size between 10-50GB
Limit shards per node to 20-25 per GB of heap
Use index lifecycle management (ILM)

Calculate optimal shards:

Shards per index = Index size / Target shard size (30GB)
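
The formula rounds up in practice, since a fractional shard is not possible. A minimal sketch, using the 30GB target from above as the default; the function name is hypothetical.

```python
import math

GB = 1024 ** 3


def shards_for_index(index_size_bytes: int, target_shard_size_bytes: int = 30 * GB) -> int:
    """Shards per index = index size / target shard size, rounded up."""
    if target_shard_size_bytes <= 0:
        raise ValueError("target shard size must be positive")
    if index_size_bytes <= 0:
        return 1  # every index needs at least one primary shard
    return max(1, math.ceil(index_size_bytes / target_shard_size_bytes))
```

An index expected to hold 200GB would get 7 primary shards of roughly 29GB each, which stays inside the 10-50GB guideline.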

4. Enable Slow Logs

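# Index-level settings (not accepted in elasticsearch.yml since Elasticsearch 5.x);
# apply them per index with PUT /<index>/_settings or through an index template.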
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.fetch.warn: 1s

index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s

5. Monitor JVM Heap

Healthy heap usage: 40-75%

If heap > 75%:

Increase heap size (up to 31GB)
Reduce field data cache
Reduce query load
Add more nodes

6. Use Monitoring User

Create dedicated read-only monitoring user:

curl -X POST "localhost:9200/_security/user/monitor" \
  -u elastic:password \
  -H "Content-Type: application/json" \
  -d '{
    "password": "monitor-password",
    "roles": ["monitoring_user"]
  }'

Elasticsearch Performance Tuning

Indexing Optimization

Bulk indexing:

POST /_bulk
{ "index": { "_index": "logs" } }
{ "message": "Log entry", "timestamp": "2025-10-15" }
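
The bulk body is newline-delimited JSON: one action line followed by one document line per operation, with a trailing newline at the end. The sketch below builds such a body for index operations; the helper name `bulk_body` is hypothetical.

```python
import json


def bulk_body(index: str, docs: list) -> str:
    """Build an NDJSON body for POST /_bulk containing index actions."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # document line
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body("logs", [
    {"message": "Log entry", "timestamp": "2025-10-15"},
    {"message": "Another entry", "timestamp": "2025-10-15"},
])
```

Sending many documents in one request reduces per-request overhead compared with indexing them one at a time.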

Refresh interval (default: 1s), set per index with PUT /<index>/_settings:

{
  "index": {
    "refresh_interval": "30s"
  }
}

Search Optimization

Use filters instead of queries when possible:

{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "published" } }
      ]
    }
  }
}

Enable request cache:

{
  "index": {
    "requests": {
      "cache": {
        "enable": true
      }
    }
  }
}

Memory Settings

Field data circuit breaker:

indices.breaker.fielddata.limit: 40%

Request circuit breaker:

indices.breaker.request.limit: 60%

Advanced Configuration

Elasticsearch Cluster

Monitor each node separately:

plugins:
  elasticsearch_node1:
    enabled: true
    host: es-node1.internal
    port: 9200
    username: elastic
    password: password

  elasticsearch_node2:
    enabled: true
    host: es-node2.internal
    port: 9200
    username: elastic
    password: password

Docker Container

Monitor Elasticsearch in Docker:

plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch-container
    port: 9200
    username: elastic
    password: password

Docker run:

docker run -d --name elasticsearch \
  -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "ELASTIC_PASSWORD=password" \
  docker.elastic.co/elasticsearch/elasticsearch:8.11.0

Elastic Cloud

Monitor Elastic Cloud deployment:

plugins:
  elasticsearch:
    enabled: true
    host: my-deployment.es.eastus2.azure.elastic-cloud.com
    port: 9243
    username: elastic
    password: cloud-password
    use_ssl: true
    verify_ssl: true

Example Configurations

Basic Local

plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200

Production with Authentication

plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch.internal
    port: 9200
    username: monitor
    password: ${ELASTICSEARCH_PASSWORD}
    use_ssl: true
    verify_ssl: true

Elasticsearch Cluster (3 nodes)

plugins:
  es_master:
    enabled: true
    host: es-master.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}

  es_data1:
    enabled: true
    host: es-data1.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}

  es_data2:
    enabled: true
    host: es-data2.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}

Limitations

Current Limitations

No per-index details - Only cluster-wide statistics
No query analysis - Use slow logs for query debugging
No pipeline metrics - Ingest pipeline stats not collected

Scalability

Tested with:

Clusters with 100+ nodes
Indices with 1TB+ data
10,000+ docs/second indexing rate

Performance:

Stats API response time constant regardless of cluster size
No impact on Elasticsearch performance

Monitoring Best Practices

Critical Metrics

Cluster status red - At least one primary shard is unallocated; some data is unavailable
Unassigned shards > 0 - Replica missing
JVM heap > 85% - OOM risk
Free disk space < 10% - Default high disk watermark (90% used) reached

Alert Thresholds

# Recommended thresholds
cluster_status: != green
shards_unassigned: > 0
jvm_heap_used_percent: > 75
search_query_time_avg_ms: > 1000
indexing_time_avg_ms: > 500
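
These thresholds can be checked mechanically against a collected sample. The sketch below encodes the list as comparison rules and returns the names of the metrics that breach them. The names `THRESHOLDS` and `breached` are hypothetical, the metric keys are taken from the list above, and this is not the dashboard's actual alerting logic.

```python
import operator

# (metric name, comparison, limit), taken from the recommended thresholds.
THRESHOLDS = [
    ("cluster_status", operator.ne, "green"),
    ("shards_unassigned", operator.gt, 0),
    ("jvm_heap_used_percent", operator.gt, 75),
    ("search_query_time_avg_ms", operator.gt, 1000),
    ("indexing_time_avg_ms", operator.gt, 500),
]


def breached(metrics: dict) -> list:
    """Return the names of metrics that violate a threshold."""
    alerts = []
    for name, compare, limit in THRESHOLDS:
        # Metrics missing from the sample are skipped rather than alerted on.
        if name in metrics and compare(metrics[name], limit):
            alerts.append(name)
    return alerts
```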

Troubleshooting Performance

Slow Searches

Symptoms: High query time

Solutions:

Enable query cache
Use filters instead of queries
Reduce shard count
Add more nodes
Check slow logs

High JVM Heap

Symptoms: Heap > 75%

Solutions:

Increase heap size (max 31GB)
Reduce field data
Disable _source field if not needed
Use doc_values
Add more nodes

Unassigned Shards

Symptoms: Yellow/red cluster status

Solutions:

Check disk space
Reduce index.number_of_replicas if the cluster has too few data nodes to host every replica
Check shard allocation settings
Manually reroute shards

# Retry failed allocations
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true"
