Agent Troubleshooting
This guide helps you diagnose and fix common issues with the StatusRadar Agent.
Quick Diagnostics
Check Agent Status
sudo systemctl status statusradar-agent
Expected output when running:
● statusradar-agent.service - StatusRadar Server Monitoring Agent
   Loaded: loaded
   Active: active (running) since ...
View Recent Logs
sudo journalctl -u statusradar-agent -n 100 --no-pager
Test Agent Manually
# Stop the service
sudo systemctl stop statusradar-agent
# Run manually to see errors
cd /opt/statusradar
python3 statusradar-agent.py
Common Issues
1. Agent Not Starting
Symptoms:
- Service fails to start
- systemctl status shows "failed" or "inactive"
Diagnosis:
sudo journalctl -u statusradar-agent -n 50 --no-pager
Common Causes:
Invalid Token
Error in logs:
ERROR: Authentication failed: Invalid token
ERROR: HTTP 401 Unauthorized
Solution:
- Verify token in /opt/statusradar/config/agent.yaml
- Get correct token from dashboard
- Restart agent:
sudo systemctl restart statusradar-agent
Missing Dependencies
Error in logs:
ModuleNotFoundError: No module named 'requests'
ImportError: No module named 'psutil'
Solution:
cd /opt/statusradar
sudo pip3 install -r requirements.txt
sudo systemctl restart statusradar-agent
Configuration Syntax Error
Error in logs:
yaml.scanner.ScannerError: mapping values are not allowed here
Solution:
- Check YAML syntax in /opt/statusradar/config/agent.yaml
- Use a YAML validator: https://www.yamllint.com/
- Common mistakes:
  - Incorrect indentation (use spaces, not tabs)
  - Missing colons
  - Unquoted strings that contain special characters
# Wrong
plugins:
  redis:
  enabled: true # Wrong indentation
# Correct
plugins:
  redis:
    enabled: true
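Tabs in indentation are hard to spot by eye. As a minimal sketch (the function name `find_tab_indented_lines` is illustrative and not part of the agent), the following reports which lines of a config use a tab in their leading whitespace:

```python
def find_tab_indented_lines(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-space, non-tab character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


sample = "plugins:\n  redis:\n\tenabled: true\n"
print(find_tab_indented_lines(sample))  # [3]
```

To check the real file, pass it the contents of /opt/statusradar/config/agent.yaml. This only catches tabs; it does not replace a full YAML validator.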
Permission Denied
Error in logs:
PermissionError: [Errno 13] Permission denied: '/opt/statusradar/config/agent.yaml'
Solution:
# Fix file permissions
sudo chown -R root:root /opt/statusradar
sudo chmod 755 /opt/statusradar
sudo chmod 600 /opt/statusradar/config/agent.yaml
# Restart the service (it must run as root to read the root-owned config)
sudo systemctl restart statusradar-agent
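To confirm the result of the chmod above, the mode bits can be inspected from Python. This is a sketch only; `config_permission_problems` is a hypothetical helper, and it checks just the two conditions implied by mode 600 (owner can read, nobody else has access):

```python
import os
import stat
import tempfile


def config_permission_problems(path):
    """Return a list of human-readable problems with the file's mode bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    problems = []
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"mode {mode:o} grants access to group or others")
    if not mode & stat.S_IRUSR:
        problems.append(f"mode {mode:o} does not let the owner read the file")
    return problems


if __name__ == "__main__":
    # Demo on a throwaway file; use the real agent.yaml path in practice.
    with tempfile.NamedTemporaryFile() as tmp:
        os.chmod(tmp.name, 0o644)
        print(config_permission_problems(tmp.name))
```

An empty list means the mode is consistent with 600.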
2. Metrics Not Appearing in Dashboard
Symptoms:
- Agent is running but dashboard shows "No data"
- Server appears offline
- Last update time doesn't change
Diagnosis:
Check Agent Logs
sudo journalctl -u statusradar-agent -f
Look for:
INFO: Metrics collected successfully
INFO: Sent metrics to API
Check Network Connectivity
# Test API connectivity
curl -I https://api.statusradar.dev
# Test from agent directory
cd /opt/statusradar
python3 -c "import requests; print(requests.get('https://api.statusradar.dev/health').status_code)"
Expected: 200
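If the check prints something other than 200, the status code usually points at one of the causes covered in this guide. The lookup below is a sketch that covers only the codes this guide mentions (401 and 429); the hint wording and the name `explain_status` are illustrative:

```python
# Status codes mentioned in this guide, mapped to the matching fix.
KNOWN_STATUS_HINTS = {
    200: "API reachable",
    401: "Invalid token: verify the token in agent.yaml",
    429: "Rate limited: increase the collection interval",
}


def explain_status(code):
    """Return a short hint for an HTTP status code returned by the API."""
    return KNOWN_STATUS_HINTS.get(
        code, f"Unexpected HTTP {code}: check the agent logs"
    )


print(explain_status(401))
```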
Common Causes:
Firewall Blocking Outbound HTTPS
Solution:
# Allow HTTPS outbound (Ubuntu/Debian)
sudo ufw allow out 443/tcp
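# CentOS/RHEL: firewalld allows outbound traffic by default, and
# --add-service=https only opens inbound HTTPS, so it does not help here.
# Instead, look for custom direct rules that may block outbound port 443:
sudo firewall-cmd --direct --get-all-rules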
Wrong API URL
Error in logs:
ERROR: Connection failed: Could not resolve host
Solution:
Check /opt/statusradar/config/agent.yaml:
api:
  url: https://api.statusradar.dev # Correct URL
API Rate Limiting
Error in logs:
ERROR: HTTP 429 Too Many Requests
Solution:
- Increase collection interval in config (minimum 60 seconds)
- Contact support if issue persists
Server Already Registered
If you're reinstalling the agent, the server might already exist in the dashboard.
Solution:
- Check dashboard for existing server
- Use existing token or delete old server first
3. Plugin Not Working
Symptoms:
- Plugin-specific metrics missing
- Plugin errors in logs
Diagnosis:
Test Plugin Manually
cd /opt/statusradar
python3 plugins/redis_plugin.py
python3 plugins/mysql_plugin.py
# etc.
Expected output:
Plugin: redis
Enabled: True
Available: True
Collecting metrics...
{
  "used_memory": 1234567,
  "connected_clients": 5,
  ...
}
Common Causes:
Plugin Not Enabled
Solution:
Check /opt/statusradar/config/agent.yaml:
plugins:
  redis:
    enabled: true # Must be true
Service Not Running
Error:
ConnectionRefusedError: [Errno 111] Connection refused
Solution:
# Check if service is running
sudo systemctl status redis
sudo systemctl status mysql
sudo systemctl status postgresql
sudo systemctl status nginx
# etc.
# Start if needed
sudo systemctl start redis
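A service can be active in systemd and still refuse connections on the address the plugin uses. The sketch below tests the TCP port directly. The ports listed are the usual defaults for each service and are assumptions; adjust them to match your configuration:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Default ports (assumed); change these if your services differ.
    defaults = {"redis": 6379, "mysql": 3306, "postgresql": 5432, "nginx": 80}
    for name, port in defaults.items():
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"{name}: port {port} is {state}")
```

A closed port for a service that systemd reports as running usually means the service listens on a different port or interface than the plugin expects.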
Wrong Connection Parameters
Error:
ERROR: Authentication failed
ERROR: Access denied for user
Solution: Verify connection parameters in config:
plugins:
  mysql:
    host: localhost # Check hostname
    port: 3306 # Check port
    user: monitor # Check username
    password: secret # Check password
Missing Permissions
MySQL Example:
-- Grant necessary permissions
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'monitor'@'localhost';
FLUSH PRIVILEGES;
PostgreSQL Example:
-- Grant pg_monitor role
GRANT pg_monitor TO monitor;
Service Configuration Missing
Nginx Example:
Error:
ERROR: HTTP 404 Not Found
Solution: Enable stub_status in nginx:
location /nginx_status {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
}
Reload nginx:
sudo nginx -t
sudo systemctl reload nginx
4. High Resource Usage
Symptoms:
- Agent using excessive CPU or memory
- System slowdown
Diagnosis:
# Check agent resource usage
ps aux | grep statusradar-agent
# Check memory usage
sudo systemctl status statusradar-agent | grep Memory
Solutions:
Reduce Collection Frequency
agent:
  interval: 600 # Increase from 300 to 600 seconds
Disable Unused Plugins
plugins:
  redis:
    enabled: false # Disable if not needed
Limit Docker Monitoring
For systems with many containers:
plugins:
  docker:
    enabled: true
    max_containers: 50 # Limit monitored containers
5. SSL/TLS Errors
Error in logs:
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED]
Solutions:
Update CA Certificates
# Ubuntu/Debian
sudo apt-get update && sudo apt-get install -y ca-certificates
sudo update-ca-certificates
# CentOS/RHEL
sudo yum install -y ca-certificates
sudo update-ca-trust
Python SSL Certificate
# Reinstall certifi
sudo pip3 install --upgrade certifi
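# Or use system certificates (Debian/Ubuntu path shown; CentOS/RHEL uses /etc/pki/tls/certs/ca-bundle.crt)
export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt # honored by the requests library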
6. Time Synchronization Issues
Symptoms:
- Metrics timestamps incorrect
- Authentication failures
Solution:
# Install NTP
sudo apt-get install -y ntp # Ubuntu/Debian
sudo yum install -y ntp # CentOS/RHEL
# Start NTP service (the unit is named ntp on Ubuntu/Debian and ntpd on CentOS/RHEL)
sudo systemctl start ntp
sudo systemctl enable ntp
# Check time sync
timedatectl status
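To estimate how far the local clock has drifted, compare it with the standard HTTP `Date` header that most servers return (for example from `curl -I https://api.statusradar.dev`). This is a sketch; whether the API sends that header is an assumption, and `clock_skew_seconds` is an illustrative name:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header, local_now=None):
    """Return local time minus server time, in seconds.

    date_header is the value of an HTTP Date header,
    e.g. "Tue, 15 Nov 1994 08:12:31 GMT".
    """
    server_time = parsedate_to_datetime(date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


# Fixed inputs so the result is reproducible.
now = datetime(1994, 11, 15, 8, 12, 41, tzinfo=timezone.utc)
print(clock_skew_seconds("Tue, 15 Nov 1994 08:12:31 GMT", now))  # 10.0
```

A skew of a few seconds is normal because of network latency. Minutes or more suggests the clock is not synchronized.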
7. Agent Stops After Some Time
Symptoms:
- Agent runs initially but stops after hours/days
- No errors in logs before stopping
Diagnosis:
# Check if service auto-restart is enabled
systemctl show statusradar-agent | grep Restart
Expected: Restart=on-failure
Solutions:
Update Systemd Service
Edit /etc/systemd/system/statusradar-agent.service:
[Service]
Restart=on-failure
RestartSec=30
Reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart statusradar-agent
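After reloading, the restart policy can be verified from the `systemctl show` output, which prints one KEY=VALUE pair per line. The sketch below parses that text; the function names are illustrative, and the sample string stands in for real command output:

```python
def parse_systemctl_show(output):
    """Parse KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in output.splitlines():
        # Split on the first "=" only; values may contain "=" themselves.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def restart_policy_ok(output, expected="on-failure"):
    """Return True if the unit's Restart= property equals expected."""
    return parse_systemctl_show(output).get("Restart") == expected


sample_output = "Restart=on-failure\nRestartUSec=30s\n"  # illustrative sample
print(restart_policy_ok(sample_output))  # True
```

In practice, feed it the text produced by `systemctl show statusradar-agent`.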
Check System Resources
# Check available memory
free -h
# Check available disk space
df -h /opt/statusradar
Review System Logs
# Check for OOM killer
sudo journalctl -k | grep -i "killed process"
# Check for system errors
sudo journalctl -xe
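When the kernel log is long, it helps to extract only the OOM-killer entries. The sketch below pulls the PID and process name out of "Killed process" lines. The sample line is illustrative, and the exact kernel message format varies between kernel versions:

```python
import re

# Matches e.g. "Killed process 4242 (python3)" in kernel log lines.
OOM_PATTERN = re.compile(r"[Kk]illed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return (pid, process_name) tuples for each OOM-killer line found."""
    return [(int(pid), name) for pid, name in OOM_PATTERN.findall(log_text)]


sample = (
    "kernel: Out of memory: Killed process 4242 (python3) "
    "total-vm:512000kB, anon-rss:300000kB\n"
)
print(find_oom_kills(sample))  # [(4242, 'python3')]
```

If the agent's Python process appears in the result, the stop was caused by memory pressure rather than an agent error, which would explain the absence of errors in the agent log.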
Log Interpretation
Successful Collection
INFO: Collecting system metrics
INFO: Plugin redis: Metrics collected successfully
INFO: Plugin mysql: Metrics collected successfully
INFO: Sending metrics to API
INFO: Metrics sent successfully
Warning Signs
WARNING: Plugin redis: Connection timeout (retrying)
WARNING: API request took 5.2s (slow network)
These are usually transient. Investigate if they persist.
Error Signs
ERROR: Authentication failed
ERROR: Plugin mysql: Access denied
ERROR: Cannot connect to API
These require immediate attention.
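For a quick health summary of a saved log, count the lines at each level. This sketch assumes messages use the `LEVEL: message` form shown in the examples above; `count_log_levels` is an illustrative name:

```python
import re
from collections import Counter

# Assumes the "LEVEL: message" format shown in this guide.
LEVEL_PATTERN = re.compile(r"\b(INFO|WARNING|ERROR):")


def count_log_levels(log_text):
    """Count log lines per level, using the first level marker on each line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample_log = (
    "INFO: Collecting system metrics\n"
    "WARNING: Plugin redis: Connection timeout (retrying)\n"
    "ERROR: Authentication failed\n"
    "ERROR: Cannot connect to API\n"
)
print(count_log_levels(sample_log)["ERROR"])  # 2
```

A non-zero ERROR count is the signal to work through the Common Issues section.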
Advanced Debugging
Enable Debug Logging
Add the following near the top of the agent code (it is Python, so it belongs in statusradar-agent.py, not in agent.yaml):
import logging
logging.basicConfig(level=logging.DEBUG)
Network Debugging
# Monitor network traffic
sudo tcpdump -i any -nn port 443
# Test DNS resolution
nslookup api.statusradar.dev
# Test with curl
curl -v -H "Authorization: Bearer YOUR_TOKEN" \
https://api.statusradar.dev/v1/agent/config
Plugin-Specific Debugging
# Test plugin with verbose output
python3 -c "
import sys
sys.path.insert(0, '/opt/statusradar')
from plugins.redis_plugin import RedisPlugin
plugin = RedisPlugin()
print(plugin.collect())
"
Getting Help
If you can't resolve the issue:
- Collect Information:
  # System info
  uname -a
  python3 --version
  # Agent version
  head -20 /opt/statusradar/statusradar-agent.py | grep VERSION
  # Recent logs
  sudo journalctl -u statusradar-agent -n 200 --no-pager > agent-logs.txt
  # Configuration (remove sensitive data!)
  cat /opt/statusradar/config/agent.yaml
- Contact Support:
  - Email: [email protected]
  - Include: OS version, Python version, error logs, configuration
- Community Forums:
  - GitHub Issues: https://github.com/statusradar/agent/issues
  - Community: https://community.statusradar.dev
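Before sharing the configuration, secrets should be masked. The sketch below replaces the value of any key whose name contains "token", "password", or "secret". Those key names are assumptions about the config layout, so review the output by hand before sending it:

```python
import re

# Assumed substrings of sensitive key names; extend as needed.
SENSITIVE_WORDS = ("token", "password", "secret")
_KEY = "|".join(SENSITIVE_WORDS)

# Groups: 1 = indentation and key, 2 = value, 3 = optional trailing comment.
SECRET_LINE = re.compile(
    rf"^(\s*[\w-]*(?:{_KEY})[\w-]*\s*:\s*)(\S.*?)(\s+#.*)?$",
    re.IGNORECASE,
)


def redact_config(text):
    """Return text with the values of sensitive-looking keys replaced."""
    def _mask(match):
        return match.group(1) + "REDACTED" + (match.group(3) or "")

    return "\n".join(SECRET_LINE.sub(_mask, line) for line in text.splitlines())


example = "plugins:\n  mysql:\n    user: monitor\n    password: secret\n"
print(redact_config(example))
```

The function works line by line and does not parse YAML, so multi-line values and secrets stored under other key names are not caught.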
Preventive Measures
Regular Maintenance
# Update agent (monthly)
curl -sL https://statusradar.dev/install-agent.sh | sudo bash -s update
# Check logs weekly
sudo journalctl -u statusradar-agent -n 100 --no-pager
# Monitor dashboard daily
# https://statusradar.dev/dashboard/servers
Set Up Monitoring
Create an alert for agent offline:
- Go to Dashboard > Alerts
- Create alert for "Server Offline"
- Get notified when agent stops sending metrics
Backup Configuration
# Backup config file
sudo cp /opt/statusradar/config/agent.yaml \
/opt/statusradar/config/agent.yaml.backup
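A single .backup file is overwritten by every later backup. As a sketch of an alternative, the helper below writes a timestamped copy next to the original. The name `backup_config` and the file naming scheme are illustrative:

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_config(path):
    """Copy path to a timestamped sibling file and return the new Path."""
    src = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}.{stamp}.backup")
    shutil.copy2(src, dest)  # copy2 also preserves mode bits and timestamps
    return dest


if __name__ == "__main__":
    # Demo on a throwaway file; point it at the real agent.yaml as root.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "agent.yaml"
        sample.write_text("agent:\n  interval: 300\n")
        print(backup_config(sample).name)
```

Because `shutil.copy2` keeps the permission bits, a config with mode 600 yields a backup with mode 600.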
Frequently Asked Questions
Q: Can I run multiple agents on the same server?
No. One agent per server. Use plugins to monitor multiple services.
Q: What happens if agent is offline temporarily?
Historical data is preserved. Monitoring resumes when the agent comes back online.
Q: How do I completely remove the agent?
See Uninstallation Guide.
Q: Can I monitor remote services?
Yes! Configure plugins with remote hostnames:
plugins:
  redis:
    host: remote-server.example.com
    port: 6379
Q: How much bandwidth does the agent use?
~1-5 KB per minute (basic metrics) + 5-10 KB per enabled plugin.
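To turn these figures into a daily estimate, the sketch below assumes the per-plugin figure is also per minute, which the answer above does not state explicitly. Treat the result as a rough range:

```python
def daily_bandwidth_kb(enabled_plugins):
    """Return (low, high) estimated KB per day.

    Assumes 1-5 KB/min for basic metrics plus 5-10 KB/min per plugin.
    """
    minutes_per_day = 24 * 60
    low = (1 + 5 * enabled_plugins) * minutes_per_day
    high = (5 + 10 * enabled_plugins) * minutes_per_day
    return low, high


print(daily_bandwidth_kb(2))  # (15840, 36000)
```

With two plugins enabled, that is roughly 16 to 36 MB per day under this assumption.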
Q: Is it safe to run as root?
Yes. Agent needs root for system metrics. Follow security best practices.