PostgreSQL Plugin

Monitor PostgreSQL database servers with comprehensive metrics tracking connections, transactions, locks, cache performance, query performance, and replication lag.

Overview

The PostgreSQL plugin collects detailed metrics from PostgreSQL servers including:

  • Connection Statistics - Total connections, active/idle states, connection limits
  • Transaction Statistics - Commits, rollbacks, transaction health
  • Cache Performance - Buffer cache hit ratio
  • Lock Statistics - Total locks, deadlocks
  • Query Performance - Slow query count (queries > 5 seconds)
  • Replication - Replication lag for standby servers
  • Database Details - Per-database statistics including size, connections, and tuple operations

Requirements

PostgreSQL Version

  • Minimum: PostgreSQL 10
  • Recommended: PostgreSQL 12 or later
  • Tested with: PostgreSQL 10, 11, 12, 13, 14, 15, 16

Python Dependencies

pip install psycopg2-binary>=2.9.0

Auto-installed when using PLUGINS=postgresql during agent installation.

Network Access

The agent must be able to connect to PostgreSQL:

# Test connectivity
psql -h localhost -U monitor -d postgres -c "SELECT 1"

Configuration

Basic Configuration

plugins:
  postgresql:
    enabled: true
    host: localhost
    port: 5432
    user: monitor
    password: 'monitoring-password'
    database: postgres

With SSL/TLS

plugins:
  postgresql:
    enabled: true
    host: postgres.example.com
    port: 5432
    user: monitor
    password: 'secure-password'
    database: postgres
    ssl_mode: require  # Options: disable, require, verify-ca, verify-full

Remote PostgreSQL Server

plugins:
  postgresql:
    enabled: true
    host: postgres-server.example.com
    port: 5432
    user: monitor
    password: 'secure-password'
    database: postgres

All Configuration Options

plugins:
  postgresql:
    enabled: true              # Enable/disable plugin
    host: localhost            # PostgreSQL hostname or IP
    port: 5432                 # PostgreSQL port (default: 5432)
    user: monitor              # PostgreSQL username
    password: ''               # PostgreSQL password
    database: postgres         # Database to connect to
    ssl_mode: disable          # SSL mode: disable, allow, prefer, require, verify-ca, verify-full
    connect_timeout: 10        # Connection timeout in seconds
    application_name: statusradar-agent  # Application name in pg_stat_activity

Environment Variables

Configuration can be overridden with environment variables:

export POSTGRESQL_HOST="localhost"
export POSTGRESQL_PORT="5432"
export POSTGRESQL_USER="monitor"
export POSTGRESQL_PASSWORD="secret"
export POSTGRESQL_DATABASE="postgres"
export POSTGRESQL_SSL_MODE="disable"

PostgreSQL Setup

Create Monitoring User

For PostgreSQL 10+:

CREATE USER monitor WITH PASSWORD 'your-secure-password';
GRANT pg_monitor TO monitor;

Grant access to specific database:

\c postgres
GRANT CONNECT ON DATABASE postgres TO monitor;

Required Privileges

Recommended: pg_monitor role

The pg_monitor role provides read-only access to all monitoring views:

GRANT pg_monitor TO monitor;

This role includes:

  • pg_read_all_stats - Read all pgstat* views
  • pg_read_all_settings - Read all configuration parameters
  • pg_stat_scan_tables - Read table and index statistics

Manual permissions (if pg_monitor not available):

GRANT pg_read_all_stats TO monitor;
GRANT pg_read_all_settings TO monitor;

Configure pg_hba.conf

Allow monitoring user to connect:

For local connections:

# TYPE  DATABASE    USER      ADDRESS         METHOD
host    postgres    monitor   127.0.0.1/32    md5
host    postgres    monitor   ::1/128         md5

For remote connections:

# TYPE  DATABASE    USER      ADDRESS         METHOD
host    postgres    monitor   10.0.0.0/8      md5

Reload PostgreSQL:

sudo systemctl reload postgresql

Verify User Permissions

-- Check user roles
\du monitor

-- Check database access
SELECT datname FROM pg_database WHERE has_database_privilege('monitor', datname, 'CONNECT');

-- Test stats access
SELECT * FROM pg_stat_database LIMIT 1;

Collected Metrics

Connection Statistics

Metric Description Unit Type
connections_total Total active connections (all databases) Count Gauge
connections_active Connections in active state Count Gauge
connections_idle Connections in idle state Count Gauge
connections_idle_in_transaction Connections idle in transaction Count Gauge
max_connections Max connections limit Count Gauge

Connection utilization:

utilization = connections_total / max_connections

Warning signs:

  • connections_idle_in_transaction > 10 - Connection leak or long transactions
  • utilization > 0.8 - Need to increase max_connections

Transaction Statistics

Metric Description Unit Type
transactions_commits Total committed transactions (all databases) Count Counter
transactions_rollbacks Total rolled back transactions (all databases) Count Counter

Transaction health:

commit_ratio = transactions_commits / (transactions_commits + transactions_rollbacks)

Healthy: > 0.95 (95%)

Cache Performance

Metric Description Unit Type
cache_hit_rate Buffer cache hit ratio percentage Percentage Gauge

Calculation:

cache_hit_rate = 100.0 * buffer_hits / (disk_reads + buffer_hits)

Healthy: > 99%

Lock Statistics

Metric Description Unit Type
locks_total Total locks held (all types) Count Gauge
deadlocks_total Total deadlock count (all databases) Count Counter

Monitoring:

  • Sudden spike in locks = lock contention
  • deadlocks_total > 0 = investigate query patterns

Query Performance

Metric Description Unit Type
slow_queries_count Active queries running > 5 seconds Count Gauge

Warning signs:

  • slow_queries_count > 10 - Query optimization needed
  • Sudden spike - check for long-running queries

Replication Statistics (Standby Only)

Metric Description Unit Type
replication_lag_seconds Seconds behind primary Seconds Gauge

Note: Only available on standby servers running in recovery mode.

Warning threshold:

  • replication_lag_seconds > 60 - Replication falling behind

Server Information

Metric Description Unit Type
version PostgreSQL version string String Info

Database Details (Array)

Metric Description Unit Type
databases Array of per-database statistics Array Info

Each database object contains:

  • datname - Database name
  • connections - Active connections
  • commits - Committed transactions
  • rollbacks - Rolled back transactions
  • disk_reads - Blocks read from disk
  • buffer_hits - Blocks read from cache
  • tuples_returned - Rows returned by queries
  • tuples_fetched - Rows fetched by queries
  • tuples_inserted - Rows inserted
  • tuples_updated - Rows updated
  • tuples_deleted - Rows deleted
  • conflicts - Query conflicts (standby only)
  • deadlocks - Deadlock count
  • database_size_bytes - Database size in bytes

Note: Limited to top 10 databases by connection count, excludes template databases.

Dashboard Metrics

The StatusRadar dashboard displays:

Overview Card

  • Status - PostgreSQL server up/down
  • Version - PostgreSQL version
  • Connections - Total active connections
  • Cache Hit Rate - Buffer cache hit percentage

Connection Chart

  • Total connections (connections_total)
  • Active connections (connections_active)
  • Idle connections (connections_idle)
  • Idle in transaction (connections_idle_in_transaction)
  • Max connections limit (max_connections)

Transaction Chart

  • Total commits (transactions_commits)
  • Total rollbacks (transactions_rollbacks)
  • Commit ratio percentage

Cache Performance

  • Cache hit rate percentage (cache_hit_rate)

Lock Statistics

  • Total locks (locks_total)
  • Total deadlocks (deadlocks_total)

Query Performance

  • Slow queries count (slow_queries_count)

Replication Status (Standby Only)

  • Replication lag in seconds (replication_lag_seconds)

Database Details

  • Per-database statistics from databases array
  • Database sizes
  • Per-database transaction counts
  • Per-database tuple operations

Installation

Quick Install

PLUGINS='postgresql' \
POSTGRESQL_USER='monitor' \
POSTGRESQL_PASSWORD='your-password' \
TOKEN='your-agent-token' \
bash -c "$(curl -sL https://statusradar.dev/install-agent.sh)"

Install on Existing Agent

  1. Create PostgreSQL user:

    CREATE USER monitor WITH PASSWORD 'secure-password';
    GRANT pg_monitor TO monitor;
  2. Configure pg_hba.conf:

    sudo nano /etc/postgresql/15/main/pg_hba.conf

    Add:

    host    postgres    monitor    127.0.0.1/32    md5

    Reload:

    sudo systemctl reload postgresql
  3. Install Python dependency:

    cd /opt/statusradar
    source venv/bin/activate  # If using venv
    pip install psycopg2-binary
  4. Enable plugin in config:

    sudo nano /opt/statusradar/config/agent.yaml

    Add:

    plugins:
      postgresql:
        enabled: true
        host: localhost
        port: 5432
        user: monitor
        password: 'secure-password'
        database: postgres
  5. Restart agent:

    sudo systemctl restart statusradar-agent
  6. Verify:

    sudo journalctl -u statusradar-agent -n 50 --no-pager | grep postgresql

    Expected:

    INFO: Plugin postgresql: Metrics collected successfully

Testing

Manual Plugin Test

cd /opt/statusradar
python3 plugins/postgresql_plugin.py

Expected Output:

Plugin: postgresql
Enabled: True
Available: True

Collecting metrics...
{
  "connections_total": 15,
  "transactions_commits": 45000,
  "transactions_rollbacks": 100,
  "deadlocks_total": 5,
  "cache_hit_rate": 99.5,
  "connections_active": 12,
  "connections_idle": 3,
  "connections_idle_in_transaction": 0,
  "max_connections": 100,
  "locks_total": 25,
  "slow_queries_count": 2,
  "version": "PostgreSQL 15.4 on x86_64-pc-linux-gnu...",
  "databases": [
    {
      "datname": "postgres",
      "connections": 10,
      "commits": 30000,
      "rollbacks": 50,
      "disk_reads": 1000,
      "buffer_hits": 300000,
      "tuples_returned": 500000,
      "tuples_fetched": 400000,
      "tuples_inserted": 5000,
      "tuples_updated": 3000,
      "tuples_deleted": 100,
      "conflicts": 0,
      "deadlocks": 2,
      "database_size_bytes": 104857600
    }
  ]
}

Test PostgreSQL Connectivity

# Basic connection test
psql -h localhost -U monitor -d postgres -c "SELECT 1"

# Test with password from environment
PGPASSWORD='your-password' psql -h localhost -U monitor -d postgres -c "SELECT 1"

# Remote server
psql -h postgres.example.com -U monitor -d postgres -c "SELECT 1"

# Check permissions
psql -h localhost -U monitor -d postgres -c "SELECT * FROM pg_stat_database LIMIT 1"

Test Statistics Queries

# Test database stats
psql -U monitor -d postgres -c "SELECT * FROM pg_stat_database WHERE datname = 'postgres'"

# Test connection stats
psql -U monitor -d postgres -c "SELECT state, count(*) FROM pg_stat_activity GROUP BY state"

# Test lock stats
psql -U monitor -d postgres -c "SELECT mode, count(*) FROM pg_locks GROUP BY mode"

# Test replication stats (primary)
psql -U monitor -d postgres -c "SELECT * FROM pg_stat_replication"

Troubleshooting

Plugin Not Collecting Metrics

Check 1: Is PostgreSQL running?

sudo systemctl status postgresql

Check 2: Can agent connect?

psql -h localhost -U monitor -d postgres -c "SELECT 1"

Check 3: Are credentials correct?

# Check config
cat /opt/statusradar/config/agent.yaml | grep -A10 postgresql

Check 4: Is Python package installed?

python3 -c "import psycopg2; print(psycopg2.__version__)"

Check 5: Check agent logs

sudo journalctl -u statusradar-agent -n 100 --no-pager | grep postgresql

Common Errors

"Connection refused"

Error:

ERROR: Plugin postgresql: could not connect to server: Connection refused

Causes:

  1. PostgreSQL not running
  2. Wrong host/port
  3. PostgreSQL not listening on expected interface

Solution:

# Check if PostgreSQL is running
sudo systemctl status postgresql

# Check PostgreSQL port
sudo netstat -tlnp | grep 5432

# Check listen_addresses
sudo -u postgres psql -c "SHOW listen_addresses"

# If listen_addresses = 'localhost', can only connect from localhost
# To allow remote connections, edit postgresql.conf:
# listen_addresses = '*'

"Password authentication failed"

Error:

ERROR: Plugin postgresql: password authentication failed for user "monitor"

Causes:

  1. Wrong password in configuration
  2. User doesn't exist
  3. pg_hba.conf doesn't allow connection

Solution:

-- Check if user exists
\du monitor

-- Reset password
ALTER USER monitor WITH PASSWORD 'new-secure-password';

Update pg_hba.conf:

host    postgres    monitor    127.0.0.1/32    md5

Reload:

sudo systemctl reload postgresql

"Permission denied for table"

Error:

ERROR: Plugin postgresql: permission denied for table pg_stat_database

Cause: User doesn't have pg_monitor or pg_read_all_stats role

Solution:

GRANT pg_monitor TO monitor;
-- or
GRANT pg_read_all_stats TO monitor;

"No module named 'psycopg2'"

Error:

ERROR: No module named 'psycopg2'

Solution:

pip install psycopg2-binary
# Or if using venv:
cd /opt/statusradar && source venv/bin/activate && pip install psycopg2-binary

"SSL connection has been closed unexpectedly"

Error:

ERROR: SSL connection has been closed unexpectedly

Cause: SSL mode mismatch

Solution:

Set appropriate SSL mode:

plugins:
  postgresql:
    ssl_mode: disable  # or require, verify-ca, verify-full

Performance Impact

On PostgreSQL

Negligible impact:

  • Plugin queries pgstat* views (in-memory statistics)
  • No table scans, no locks on user tables
  • Queries execute in < 10ms
  • Read-only operations

Benchmark:

  • PostgreSQL TPS with monitoring: 5,000+ TPS
  • PostgreSQL TPS without monitoring: 5,000+ TPS
  • Overhead: < 0.01%

On Agent

Resource usage:

  • Memory: +15 MB
  • CPU: +3% during collection (0.2-0.5 seconds)
  • Network: +2 KB per collection

Use Cases

1. Transaction Monitoring

Monitor:

  • Total commits (transactions_commits)
  • Total rollbacks (transactions_rollbacks)
  • Commit ratio

Alert on:

  • Commit ratio < 95%
  • Sudden spike in rollbacks

Formula:

commit_ratio = transactions_commits / (transactions_commits + transactions_rollbacks)

2. Connection Pool Management

Monitor:

  • Total connections (connections_total)
  • Active connections (connections_active)
  • Idle connections (connections_idle)
  • Idle in transaction (connections_idle_in_transaction)
  • Max connections limit (max_connections)

Alert on:

  • connections_total > max_connections * 0.8 - Approaching limit
  • connections_idle_in_transaction > 10 - Connection leaks
  • connections_total >= max_connections - No slots available

3. Cache Hit Ratio Optimization

Monitor:

  • Cache hit rate (cache_hit_rate)

Alert on:

  • cache_hit_rate < 99% - Need to increase shared_buffers

Healthy value: > 99%

4. Lock Contention Analysis

Monitor:

  • Total locks (locks_total)
  • Total deadlocks (deadlocks_total)

Alert on:

  • deadlocks_total increasing - Query pattern issues
  • Sudden spike in locks_total - Lock contention

5. Query Performance Monitoring

Monitor:

  • Slow queries count (slow_queries_count)

Alert on:

  • slow_queries_count > 10 - Many slow queries
  • Sustained high slow query count - Optimization needed

Note: Slow queries are defined as queries running > 5 seconds.

6. Replication Lag Monitoring (Standby Servers)

Monitor:

  • Replication lag (replication_lag_seconds)

Alert on:

  • replication_lag_seconds > 60 - Falling behind primary
  • replication_lag_seconds > 300 - Critical lag

Note: Only available on standby servers in recovery mode.

Best Practices

1. Use pg_monitor Role

Recommended:

CREATE USER monitor WITH PASSWORD 'secure-password';
GRANT pg_monitor TO monitor;

This provides all necessary read-only permissions for monitoring.

2. Monitor Key Performance Indicators

Essential metrics:

  • Cache hit rate (cache_hit_rate) - Should be > 99%
  • Connection count (connections_total) - Should be < 80% of max_connections
  • Idle in transaction (connections_idle_in_transaction) - Should be close to 0
  • Slow queries (slow_queries_count) - Should be low
  • Replication lag (replication_lag_seconds) - Should be < 60 seconds for standbys

3. Set Appropriate Alerts

Connection alerts:

connections_total > max_connections * 0.8
connections_idle_in_transaction > 10
connections_total >= max_connections

Performance alerts:

cache_hit_rate < 99
deadlocks_total increasing
slow_queries_count > 10
replication_lag_seconds > 60 (standbys only)

4. Optimize Shared Buffers

Rule of thumb:

  • shared_buffers = 25% of RAM (for dedicated DB server)
  • Monitor cache_hit_rate to verify effectiveness
  • Adjust based on workload
  • Target: cache_hit_rate > 99%

5. Secure Configuration

# Protect agent config
chmod 600 /opt/statusradar/config/agent.yaml

# Use strong password
openssl rand -base64 32

Advanced Configuration

Unix Socket Connection

plugins:
  postgresql:
    enabled: true
    unix_socket: '/var/run/postgresql'
    user: monitor
    password: 'password'
    database: postgres

Multiple Databases

Currently monitors one database per agent instance. To monitor multiple databases:

Option 1: Query all databases via pg_stat_database

  • Plugin automatically collects stats for all databases
  • Dashboard shows aggregated metrics

Option 2: Multiple agents

  • Install separate agent for each critical database
  • Provides isolated monitoring

SSL with Client Certificates

plugins:
  postgresql:
    enabled: true
    host: postgres.example.com
    port: 5432
    user: monitor
    database: postgres
    ssl_mode: verify-full
    ssl_cert: '/path/to/client-cert.pem'
    ssl_key: '/path/to/client-key.pem'
    ssl_ca: '/path/to/ca-cert.pem'

Example Configurations

Development Server

plugins:
  postgresql:
    enabled: true
    host: localhost
    port: 5432
    user: monitor
    password: 'dev-password'
    database: postgres

Production Primary

plugins:
  postgresql:
    enabled: true
    host: localhost
    port: 5432
    user: monitor
    password: 'strong-production-password'
    database: postgres
    connect_timeout: 10

Production Standby

plugins:
  postgresql:
    enabled: true
    host: localhost
    port: 5432
    user: monitor
    password: 'strong-production-password'
    database: postgres
    # Plugin automatically detects standby status

Remote PostgreSQL (SSL)

plugins:
  postgresql:
    enabled: true
    host: postgres-prod.example.com
    port: 5432
    user: monitor
    password: 'secure-password'
    database: postgres
    ssl_mode: require
    connect_timeout: 15

Next Steps