Elasticsearch Plugin

Monitor the Elasticsearch search and analytics engine with metrics covering cluster health, indices, nodes, search performance, and JVM statistics.

Overview

The Elasticsearch plugin collects detailed metrics from the Elasticsearch REST API, including:

  • Cluster Health - Status, nodes, shards, active shards, relocating shards
  • Index Statistics - Document count, store size, indexing rate, search rate
  • Node Metrics - CPU usage, memory usage, disk usage, JVM heap
  • Search Performance - Query time, fetch time, scroll queries
  • Indexing Performance - Index time, delete time, merge time
  • Thread Pools - Active threads, queue size, rejected tasks
  • Cache Statistics - Field cache, query cache, request cache

Requirements

Elasticsearch Version

  • Minimum: Elasticsearch 7.0
  • Recommended: Elasticsearch 8.0 or later
  • Tested with: Elasticsearch 7.17, 8.8, 8.10, 8.11

Python Dependencies

pip install "elasticsearch>=8.7.0"

The dependency is installed automatically when PLUGINS=elasticsearch is set during agent installation.

Elasticsearch Access

The agent needs HTTP or HTTPS access to the Elasticsearch REST API (default port 9200).

Configuration

Basic Configuration

plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200

With Authentication

plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200
    username: elastic
    password: your-password

With HTTPS

plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch.example.com
    port: 9200
    username: elastic
    password: your-password
    use_ssl: true
    verify_ssl: true

All Configuration Options

plugins:
  elasticsearch:
    enabled: true                    # Enable/disable plugin
    host: localhost                  # Elasticsearch host
    port: 9200                       # Elasticsearch port
    username: elastic                # Username (optional)
    password: password               # Password (optional)
    use_ssl: false                   # Use HTTPS (default: false)
    verify_ssl: true                 # Verify SSL certificates
    ca_cert: /path/to/ca.pem        # CA certificate path
    timeout: 10                      # Request timeout (seconds)
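
As a rough illustration of how these options fit together, the sketch below turns a config mapping into a base URL and, when use_ssl is set, an SSL context. The helper names build_base_url and build_ssl_context are hypothetical and not part of the plugin; only the option names come from the listing above, and the handling of verify_ssl: false is an assumption.

```python
import ssl
from typing import Optional


def build_base_url(cfg: dict) -> str:
    """Return the REST base URL implied by host, port and use_ssl."""
    scheme = "https" if cfg.get("use_ssl", False) else "http"
    return f"{scheme}://{cfg.get('host', 'localhost')}:{cfg.get('port', 9200)}"


def build_ssl_context(cfg: dict) -> Optional[ssl.SSLContext]:
    """Return an SSL context for HTTPS, or None for plain HTTP."""
    if not cfg.get("use_ssl", False):
        return None
    if not cfg.get("verify_ssl", True):
        # Assumption: verify_ssl=false means "skip certificate checks".
        ctx = ssl.create_default_context()
        ctx.check_hostname = False  # must be disabled before verify_mode
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    # ca_cert is optional; None falls back to the system trust store.
    return ssl.create_default_context(cafile=cfg.get("ca_cert"))
```

With the "With HTTPS" example, build_base_url would yield https://elasticsearch.example.com:9200.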

Environment Variables

Configuration can be overridden with environment variables:

export ELASTICSEARCH_HOST="localhost"
export ELASTICSEARCH_PORT="9200"
export ELASTICSEARCH_USERNAME="elastic"
export ELASTICSEARCH_PASSWORD="password"
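
The override order can be pictured with a small sketch: values from the config file are the baseline and any of the four variables above that are set replace them. The function name apply_env_overrides is hypothetical, and treating an empty variable as "not set" is an assumption rather than documented agent behavior.

```python
import os

# Environment variable -> config key (variable names taken from the list above).
ENV_TO_KEY = {
    "ELASTICSEARCH_HOST": "host",
    "ELASTICSEARCH_PORT": "port",
    "ELASTICSEARCH_USERNAME": "username",
    "ELASTICSEARCH_PASSWORD": "password",
}


def apply_env_overrides(cfg: dict, environ=os.environ) -> dict:
    """Return a copy of cfg with any set environment variables applied on top."""
    merged = dict(cfg)
    for var, key in ENV_TO_KEY.items():
        value = environ.get(var)
        if value:  # unset or empty variables leave the file value alone
            merged[key] = int(value) if key == "port" else value
    return merged
```

Passing a plain dict instead of os.environ makes the merge easy to check in isolation.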

Elasticsearch Setup

Installation

Ubuntu/Debian:

# Install Elasticsearch
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

sudo apt-get update
sudo apt-get install elasticsearch

# Start service
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch

CentOS/RHEL:

# Add repository
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

cat <<EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

# Install
sudo yum install elasticsearch

# Start service
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch

Security Configuration

Elasticsearch 8.x has security enabled by default.

Get initial password:

# Elasticsearch 8.x creates elastic user on first start
# Password is in installation output or reset with:
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

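Create monitoring user (if the stats endpoints return 403 for this user, assign a role that includes the cluster-level monitor privilege):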

curl -X POST "localhost:9200/_security/user/monitor" \
  -u elastic:password \
  -H "Content-Type: application/json" \
  -d '{
    "password": "monitor-password",
    "roles": ["monitoring_user"],
    "full_name": "Monitoring User"
  }'

Or disable security (not recommended for production):

# /etc/elasticsearch/elasticsearch.yml
xpack.security.enabled: false

Network Configuration

Bind to all interfaces:

# /etc/elasticsearch/elasticsearch.yml
network.host: 0.0.0.0

Security note: Only expose Elasticsearch to trusted networks.

Collected Metrics

Cluster Health Metrics

| Metric | Description | Unit | Type |
|---|---|---|---|
| cluster_status | Cluster status (2=green, 1=yellow, 0=red) | Numeric | Gauge |
| cluster_name | Cluster name | String | Info |
| cluster_number_of_nodes | Total nodes in cluster | Count | Gauge |
| cluster_number_of_data_nodes | Data nodes in cluster | Count | Gauge |
| cluster_active_primary_shards | Active primary shards | Count | Gauge |
| cluster_active_shards | Active shards (total) | Count | Gauge |
| cluster_relocating_shards | Relocating shards | Count | Gauge |
| cluster_initializing_shards | Initializing shards | Count | Gauge |
| cluster_unassigned_shards | Unassigned shards | Count | Gauge |
| cluster_delayed_unassigned_shards | Delayed unassigned shards | Count | Gauge |
| cluster_pending_tasks | Number of pending tasks | Count | Gauge |
| cluster_in_flight_fetch | Number of in-flight fetch operations | Count | Gauge |
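
To make the mapping concrete, here is a minimal sketch of how a parsed GET /_cluster/health response could be flattened into the metrics listed above. The response field names are the ones Elasticsearch returns; the function name health_to_metrics and the choice to report an unknown status as 0 are assumptions, not the plugin's confirmed behavior.

```python
# Numeric encoding documented for cluster_status.
STATUS_TO_NUMBER = {"green": 2, "yellow": 1, "red": 0}

# _cluster/health response field -> metric name from the table above.
HEALTH_FIELDS = {
    "number_of_nodes": "cluster_number_of_nodes",
    "number_of_data_nodes": "cluster_number_of_data_nodes",
    "active_primary_shards": "cluster_active_primary_shards",
    "active_shards": "cluster_active_shards",
    "relocating_shards": "cluster_relocating_shards",
    "initializing_shards": "cluster_initializing_shards",
    "unassigned_shards": "cluster_unassigned_shards",
    "delayed_unassigned_shards": "cluster_delayed_unassigned_shards",
    "number_of_pending_tasks": "cluster_pending_tasks",
    "number_of_in_flight_fetch": "cluster_in_flight_fetch",
}


def health_to_metrics(health: dict) -> dict:
    """Flatten a parsed GET /_cluster/health response into metric values."""
    metrics = {
        "cluster_name": health.get("cluster_name", ""),
        # Assumption: an unrecognized status is reported as 0 (red).
        "cluster_status": STATUS_TO_NUMBER.get(health.get("status"), 0),
    }
    for field, metric in HEALTH_FIELDS.items():
        metrics[metric] = health.get(field, 0)
    return metrics
```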

Node Metrics

| Metric | Description | Unit | Type |
|---|---|---|---|
| node_jvm_heap_used_bytes | JVM heap used (aggregated across all nodes) | Bytes | Gauge |
| node_jvm_heap_max_bytes | JVM heap max (aggregated across all nodes) | Bytes | Gauge |
| node_jvm_heap_used_percent | JVM heap utilization percentage | Percent | Gauge |
| node_memory_used_bytes | OS memory used (aggregated across all nodes) | Bytes | Gauge |
| node_memory_total_bytes | OS memory total (aggregated across all nodes) | Bytes | Gauge |
| node_memory_used_percent | OS memory utilization percentage | Percent | Gauge |
| node_disk_used_bytes | Disk space used (aggregated across all nodes) | Bytes | Gauge |
| node_disk_total_bytes | Total disk space (aggregated across all nodes) | Bytes | Gauge |
| node_disk_used_percent | Disk utilization percentage | Percent | Gauge |
| node_cpu_percent | CPU usage average across all nodes | Percent | Gauge |
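
Since these values are aggregated across nodes, a short sketch may help show what "aggregated" can mean in practice: byte values are summed, CPU is averaged, and the heap percentage is derived from the summed totals. The input layout follows the GET /_nodes/stats response; the function name aggregate_node_stats and the exact rounding are assumptions about, not a copy of, the plugin's logic.

```python
def aggregate_node_stats(nodes_stats: dict) -> dict:
    """Aggregate a parsed GET /_nodes/stats response across all nodes."""
    nodes = list(nodes_stats.get("nodes", {}).values())
    heap_used = sum(n["jvm"]["mem"]["heap_used_in_bytes"] for n in nodes)
    heap_max = sum(n["jvm"]["mem"]["heap_max_in_bytes"] for n in nodes)
    cpu = [n["os"]["cpu"]["percent"] for n in nodes]
    return {
        "node_jvm_heap_used_bytes": heap_used,
        "node_jvm_heap_max_bytes": heap_max,
        # Percent is derived from the summed totals, not averaged per node.
        "node_jvm_heap_used_percent": round(100 * heap_used / heap_max, 1) if heap_max else 0.0,
        "node_cpu_percent": round(sum(cpu) / len(cpu), 1) if cpu else 0.0,
    }
```

Deriving the percentage from summed bytes weights large-heap nodes more heavily than a per-node average would, which is worth keeping in mind when a single node is under pressure.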

Indexing Metrics

| Metric | Description | Unit | Type |
|---|---|---|---|
| indexing_index_total | Total indexing operations | Count | Counter |
| indexing_index_time_ms | Time spent indexing | Milliseconds | Counter |

Search Metrics

| Metric | Description | Unit | Type |
|---|---|---|---|
| search_query_total | Total search queries | Count | Counter |
| search_query_time_ms | Time spent in query phase | Milliseconds | Counter |
| search_fetch_total | Total fetch operations | Count | Counter |
| search_fetch_time_ms | Time spent in fetch phase | Milliseconds | Counter |
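
Because these are monotonically increasing counters, rates and averages have to be derived from the difference between two samples. The sketch below shows one way to do that for the query phase; the function name rate_and_average and the output keys are hypothetical, and skipping an interval when a counter goes backwards is an assumption about sensible handling of node restarts.

```python
def rate_and_average(prev: dict, curr: dict, interval_s: float) -> dict:
    """Derive query rate and mean query latency from two counter samples."""
    d_total = curr["search_query_total"] - prev["search_query_total"]
    d_time = curr["search_query_time_ms"] - prev["search_query_time_ms"]
    if d_total < 0 or d_time < 0:
        # Counters went backwards, e.g. after a node restart: skip this interval.
        return {"query_rate_per_s": None, "avg_query_time_ms": None}
    return {
        "query_rate_per_s": d_total / interval_s,
        "avg_query_time_ms": d_time / d_total if d_total else 0.0,
    }
```

For example, 600 additional queries and 3000 ms of additional query time over a 60-second interval give 10 queries per second at an average of 5 ms each. The same arithmetic applies to the fetch and indexing counters.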

Cache Metrics

| Metric | Description | Unit | Type |
|---|---|---|---|
| cache_field_size_bytes | Field data cache size | Bytes | Gauge |
| cache_query_size_bytes | Query cache size | Bytes | Gauge |

Thread Pool Metrics

| Metric | Description | Unit | Type |
|---|---|---|---|
| threadpool_search_queue | Search thread pool queue size | Count | Gauge |
| threadpool_search_rejected | Search thread pool rejected tasks | Count | Counter |
| threadpool_write_queue | Write thread pool queue size | Count | Gauge |
| threadpool_write_rejected | Write thread pool rejected tasks | Count | Counter |

Index Statistics

| Metric | Description | Unit | Type |
|---|---|---|---|
| indices_count | Number of indices | Count | Gauge |
| indices_docs_count | Total documents across all indices | Count | Gauge |
| indices_store_size_bytes | Total index store size | Bytes | Gauge |
| indices_shards_total | Total number of shards | Count | Gauge |
| indices_shards_primaries | Number of primary shards | Count | Gauge |
| indices_shards_replication | Average replication factor | Number | Gauge |
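
The average replication factor can be related to the two shard counts with a one-line calculation. This sketch assumes the factor is defined as replica copies per primary shard, which is how Elasticsearch's cluster stats report it; the function name replication_factor is hypothetical.

```python
def replication_factor(shards_total: int, shards_primaries: int) -> float:
    """Average number of replica copies per primary shard."""
    if shards_primaries <= 0:
        return 0.0
    return (shards_total - shards_primaries) / shards_primaries


print(replication_factor(120, 60))  # 1.0
```

A cluster with 120 shards of which 60 are primaries therefore has one replica per primary on average.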

Dashboard Metrics

The StatusRadar dashboard displays:

Overview Card

  • Cluster Status - Green/yellow/red indicator
  • Nodes - Total nodes in cluster
  • Documents - Total document count
  • Store Size - Total index size

Cluster Health Chart

  • Active shards over time
  • Unassigned shards
  • Relocating shards
  • Cluster status changes

Search Performance Chart

  • Query rate
  • Fetch rate
  • Average query time

Indexing Performance Chart

  • Indexing rate
  • Delete rate
  • Average indexing time

JVM Heap Chart

  • Heap usage over time
  • Heap usage percentage
  • GC activity

Node Resources Chart

  • CPU usage
  • Memory usage
  • Disk usage

Installation

Quick Install

PLUGINS='elasticsearch' \
TOKEN='your-agent-token' \
ELASTICSEARCH_USERNAME='elastic' \
ELASTICSEARCH_PASSWORD='your-password' \
bash -c "$(curl -sL https://statusradar.dev/install-agent.sh)"

Install on Existing Agent

  1. Install Python dependency:

    cd /opt/statusradar
    source venv/bin/activate  # If using venv
    pip install elasticsearch
  2. Enable plugin in config:

    sudo nano /opt/statusradar/config/agent.yaml

    Add:

    plugins:
      elasticsearch:
        enabled: true
        host: localhost
        port: 9200
        username: elastic
        password: your-password
  3. Restart agent:

    sudo systemctl restart statusradar-agent
  4. Verify:

    sudo journalctl -u statusradar-agent -n 50 --no-pager | grep elasticsearch

    Expected:

    INFO: Plugin elasticsearch: Metrics collected successfully
    INFO: Plugin elasticsearch: Cluster green, 3 nodes, 1234567 docs

Testing

Manual Plugin Test

cd /opt/statusradar
python3 plugins/elasticsearch_plugin.py

Expected Output:

Plugin: elasticsearch
Enabled: True
Available: True

Collecting metrics...
{
  "cluster_status": "green",
  "nodes_total": 3,
  "nodes_data": 3,
  "shards_active": 120,
  "shards_primary": 60,
  "shards_unassigned": 0,
  "indices_count": 25,
  "docs_count": 1234567,
  "store_size_bytes": 5368709120,
  "indexing_total": 987654,
  "search_query_total": 456789,
  "jvm_heap_used_percent": 45.2,
  "cpu_percent": 12.5
}

Test Elasticsearch Connection

# Check cluster health
curl -u elastic:password "http://localhost:9200/_cluster/health?pretty"

# Check node stats
curl -u elastic:password "http://localhost:9200/_nodes/stats?pretty"

# Check indices
curl -u elastic:password "http://localhost:9200/_cat/indices?v"
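
The same health check can be reproduced from Python using only the standard library, which is handy when curl is not available on the agent host. This is a minimal sketch, not the plugin's code: the function names are hypothetical, and it assumes HTTP Basic authentication as in the curl commands above.

```python
import base64
import json
import urllib.request


def build_health_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET /_cluster/health request with an HTTP Basic auth header."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/_cluster/health",
        headers={"Authorization": f"Basic {token}"},
    )


def fetch_health(base_url: str, username: str, password: str, timeout: float = 10.0) -> dict:
    """Send the request and return the parsed JSON body (needs a reachable cluster)."""
    req = build_health_request(base_url, username, password)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_health("http://localhost:9200", "elastic", "password") against a running cluster returns the same fields that the first curl command prints.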

Troubleshooting

Plugin Not Collecting Metrics

Check 1: Is Elasticsearch running?

sudo systemctl status elasticsearch

Check 2: Can agent connect to Elasticsearch?

curl -u elastic:password http://localhost:9200

Check 3: Is Python package installed?

python3 -c "import elasticsearch; print(elasticsearch.__version__)"

Check 4: Check agent logs

sudo journalctl -u statusradar-agent -n 100 --no-pager | grep elasticsearch

Common Errors

"Authentication failed"

Error:

ERROR: Plugin elasticsearch: 401 Unauthorized

Causes:

  1. Wrong username/password
  2. Security not enabled but credentials provided
  3. User doesn't have monitoring permissions

Solution:

# Elasticsearch 8.x - reset password
sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

# Or disable security (not recommended)
# Edit /etc/elasticsearch/elasticsearch.yml:
# xpack.security.enabled: false

"Connection refused"

Error:

ERROR: Plugin elasticsearch: Connection refused

Causes:

  1. Elasticsearch not running
  2. Wrong host/port
  3. Firewall blocking connection

Solution:

# Check Elasticsearch is running
sudo systemctl status elasticsearch

# Check port
sudo netstat -tlnp | grep 9200

# Test connection
curl http://localhost:9200

"No module named 'elasticsearch'"

Error:

ERROR: No module named 'elasticsearch'

Solution:

pip install elasticsearch
# Or if using venv:
cd /opt/statusradar && source venv/bin/activate && pip install elasticsearch

"SSL certificate verification failed"

Error:

ERROR: Plugin elasticsearch: SSL certificate verification failed

Solution:

plugins:
  elasticsearch:
    use_ssl: true
    verify_ssl: false  # Disable verification (not recommended)
    # Or provide CA certificate:
    # ca_cert: /etc/elasticsearch/certs/ca.crt

Performance Impact

On Elasticsearch

Minimal impact:

  • Stats API returns pre-calculated statistics
  • No search or indexing operations
  • Response time: < 100ms

Benchmark:

  • Overhead: < 0.1% CPU
  • No measurable performance degradation

On Agent

Resource usage:

  • Memory: +20 MB
  • CPU: +4% during collection
  • Network: +3 KB per collection

Collection time: 0.2-1 second

Use Cases

1. Cluster Health Monitoring

Monitor:

  • Cluster status (green/yellow/red)
  • Unassigned shards
  • Node count

Alert on:

  • Cluster status yellow/red
  • Unassigned shards > 0
  • Node count decreased

2. Search Performance

Monitor:

  • Query rate
  • Average query time
  • Slow queries

Alert on:

  • Average query time > 1 second
  • Query rate spike
  • Search queue growing

3. Indexing Performance

Monitor:

  • Indexing rate
  • Average indexing time
  • Bulk rejections

Alert on:

  • Indexing time increasing
  • Bulk queue rejections
  • Merge time excessive

4. JVM Heap Monitoring

Monitor:

  • Heap usage percentage
  • GC frequency
  • GC duration

Alert on:

  • Heap > 75% (risk of OOM)
  • Frequent GC pauses
  • Long GC pauses

5. Shard Management

Monitor:

  • Shard distribution
  • Relocating shards
  • Initializing shards

Optimize:

  • Shard allocation
  • Rebalancing
  • Shard size

Best Practices

1. Set Appropriate Heap Size

Recommendation: 50% of system RAM, max 31GB

# /etc/elasticsearch/jvm.options
# Keep Xms and Xmx identical
-Xms16g
-Xmx16g

Never:

  • Exceed 31GB (compressed oops limit)
  • Use more than 50% of RAM
  • Set Xms != Xmx
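
The sizing rule above (half of RAM, capped at 31GB, Xms equal to Xmx) is simple enough to express as a small helper. The function names are hypothetical, and rounding down to whole gigabytes is an assumption made for readability.

```python
def recommended_heap_gb(system_ram_gb: float, cap_gb: int = 31) -> int:
    """Half of system RAM, rounded down, capped at 31 GB; at least 1 GB."""
    return max(1, min(int(system_ram_gb // 2), cap_gb))


def jvm_heap_options(system_ram_gb: float) -> list:
    """Return matching -Xms/-Xmx lines so initial and maximum heap are equal."""
    heap = recommended_heap_gb(system_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


print(jvm_heap_options(32))   # ['-Xms16g', '-Xmx16g']
print(jvm_heap_options(128))  # ['-Xms31g', '-Xmx31g']
```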

2. Monitor Cluster Status

  • Green: All primary and replica shards allocated
  • Yellow: All primary shards allocated, some replicas missing
  • Red: Some primary shards not allocated

Alert on yellow/red status immediately.

3. Manage Shard Count

Guidelines:

  • Keep shard size between 10-50GB
  • Limit shards per node to 20-25 per GB of heap
  • Use index lifecycle management (ILM)

Calculate optimal shards:

Shards per index = Index size / Target shard size (30GB)
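
As a worked example of the formula, the sketch below rounds up so that no shard exceeds the target size. The function name primary_shards_for is hypothetical; the 30GB default is the target used in the formula above.

```python
import math


def primary_shards_for(index_size_gb: float, target_shard_gb: float = 30) -> int:
    """Round up so no shard exceeds the target size; at least one shard."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


print(primary_shards_for(200))  # 7
print(primary_shards_for(5))    # 1
```

A 200GB index at a 30GB target needs 7 primary shards (200 / 30 is about 6.7, rounded up), each holding roughly 28.6GB.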

4. Enable Slow Logs

# Index-level settings: apply per index with PUT /<index>/_settings
# (index settings are not accepted in elasticsearch.yml)
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.fetch.warn: 1s

index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s

5. Monitor JVM Heap

Healthy heap usage: 40-75%

If heap > 75%:

  • Increase heap size (up to 31GB)
  • Reduce field data cache
  • Reduce query load
  • Add more nodes

6. Use Monitoring User

Create dedicated read-only monitoring user:

curl -X POST "localhost:9200/_security/user/monitor" \
  -u elastic:password \
  -H "Content-Type: application/json" \
  -d '{
    "password": "monitor-password",
    "roles": ["monitoring_user"]
  }'

Elasticsearch Performance Tuning

Indexing Optimization

Bulk indexing:

POST /_bulk
{ "index": { "_index": "logs" } }
{ "message": "Log entry", "timestamp": "2025-10-15" }
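
A bulk body is newline-delimited JSON: one action line followed by one document line per operation, with a trailing newline at the end. The sketch below builds such a body for the index action shown above; the function name build_bulk_body is hypothetical.

```python
import json


def build_bulk_body(index: str, docs: list) -> str:
    """Serialize docs as newline-delimited JSON for POST /_bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string is sent with the Content-Type header set to application/x-ndjson. Batching many documents into one request is what makes bulk indexing cheaper than indexing them one at a time.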

Refresh interval (index setting, default: 1s):

{
  "index": {
    "refresh_interval": "30s"
  }
}

Search Optimization

Use filters instead of queries when possible:

{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "published" } }
      ]
    }
  }
}

Enable request cache:

{
  "index": {
    "requests": {
      "cache": {
        "enable": true
      }
    }
  }
}

Memory Settings

Field data circuit breaker:

indices.breaker.fielddata.limit: 40%

Request circuit breaker:

indices.breaker.request.limit: 60%

Advanced Configuration

Elasticsearch Cluster

Monitor each node separately:

plugins:
  elasticsearch_node1:
    enabled: true
    host: es-node1.internal
    port: 9200
    username: elastic
    password: password

  elasticsearch_node2:
    enabled: true
    host: es-node2.internal
    port: 9200
    username: elastic
    password: password

Docker Container

Monitor Elasticsearch in Docker:

plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch-container
    port: 9200
    username: elastic
    password: password

Docker run:

docker run -d --name elasticsearch \
  -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "ELASTIC_PASSWORD=password" \
  docker.elastic.co/elasticsearch/elasticsearch:8.11.0

Elastic Cloud

Monitor Elastic Cloud deployment:

plugins:
  elasticsearch:
    enabled: true
    host: my-deployment.es.eastus2.azure.elastic-cloud.com
    port: 9243
    username: elastic
    password: cloud-password
    use_ssl: true
    verify_ssl: true

Example Configurations

Basic Local

plugins:
  elasticsearch:
    enabled: true
    host: localhost
    port: 9200

Production with Authentication

plugins:
  elasticsearch:
    enabled: true
    host: elasticsearch.internal
    port: 9200
    username: monitor
    password: ${ELASTICSEARCH_PASSWORD}
    use_ssl: true
    verify_ssl: true
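
The ${ELASTICSEARCH_PASSWORD} placeholder keeps the secret out of the config file. How the agent expands it is not shown here, so the following is only a sketch of one plausible approach using the standard library; the function name expand_placeholders is hypothetical.

```python
import string


def expand_placeholders(value: str, environ: dict) -> str:
    """Replace ${NAME} placeholders with values from environ."""
    # string.Template understands both $NAME and ${NAME}; a missing
    # variable raises KeyError instead of silently producing "".
    return string.Template(value).substitute(environ)
```

Failing loudly on a missing variable is a deliberate choice in this sketch: an empty password would otherwise surface later as a less obvious 401 error.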

Elasticsearch Cluster (3 nodes)

plugins:
  es_master:
    enabled: true
    host: es-master.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}

  es_data1:
    enabled: true
    host: es-data1.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}

  es_data2:
    enabled: true
    host: es-data2.internal
    port: 9200
    username: monitor
    password: ${ES_PASSWORD}

Limitations

Current Limitations

  1. No per-index details - Only cluster-wide statistics
  2. No query analysis - Use slow logs for query debugging
  3. No pipeline metrics - Ingest pipeline stats not collected

Scalability

Tested with:

  • Clusters with 100+ nodes
  • Indices with 1TB+ data
  • 10,000+ docs/second indexing rate

Performance:

  • Stats API response time constant regardless of cluster size
  • No measurable impact on Elasticsearch performance

Monitoring Best Practices

Critical Metrics

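  1. Cluster status red - At least one primary shard is unallocated; part of the data is unavailable
  2. Unassigned shards > 0 - Replica or primary copies missing
  3. JVM heap > 85% - OOM risk
  4. Free disk space < 10% - High disk watermark reached (90% used by default)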

Alert Thresholds

# Recommended thresholds
cluster_status: != green
shards_unassigned: > 0
jvm_heap_used_percent: > 75
search_query_time_avg_ms: > 1000
indexing_time_avg_ms: > 500
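
One way to apply these thresholds to a collected sample is a small table of comparisons. The sketch below is illustrative only: the function name breached and the table layout are hypothetical, the metric names are the ones used in the threshold list above, and metrics missing from a sample are skipped rather than treated as breaches.

```python
import operator

# (metric, comparison, limit) triples mirroring the recommended thresholds.
THRESHOLDS = [
    ("cluster_status", operator.ne, "green"),
    ("shards_unassigned", operator.gt, 0),
    ("jvm_heap_used_percent", operator.gt, 75),
    ("search_query_time_avg_ms", operator.gt, 1000),
    ("indexing_time_avg_ms", operator.gt, 500),
]


def breached(metrics: dict) -> list:
    """Return the names of metrics that cross their threshold."""
    alerts = []
    for name, compare, limit in THRESHOLDS:
        value = metrics.get(name)
        if value is not None and compare(value, limit):
            alerts.append(name)
    return alerts
```

For a sample with a yellow cluster, 2 unassigned shards, and 45.2% heap usage, breached returns ["cluster_status", "shards_unassigned"].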

Troubleshooting Performance

Slow Searches

Symptoms: High query time

Solutions:

  1. Enable query cache
  2. Use filters instead of queries
  3. Reduce shard count
  4. Add more nodes
  5. Check slow logs

High JVM Heap

Symptoms: Heap > 75%

Solutions:

  1. Increase heap size (max 31GB)
  2. Reduce field data
  3. Disable _source field if not needed
  4. Use doc_values
  5. Add more nodes

Unassigned Shards

Symptoms: Yellow/red cluster status

Solutions:

  1. Check disk space
  2. Reduce index.number_of_replicas if the cluster has too few nodes to host every replica
  3. Check shard allocation settings
  4. Manually reroute shards:

    # Retry failed allocations
    curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true"
