GreenKube

Multi-Resource Monitoring

GreenKube monitors more than CPU and memory: it also tracks network, disk I/O, storage, pod health, and GPU metrics, so you can see how each workload consumes your cluster's resources.

CPU

  • Usage — Actual CPU seconds consumed per pod
  • Requests vs. Limits — Configuration vs. actual consumption
  • Throttling — Detect pods being CPU-throttled
  • Per-core breakdown — Utilization across individual cores

Memory

  • Working set — Actual memory in use
  • RSS — Resident Set Size
  • Requests vs. Limits — Identify over/under-provisioned workloads
  • OOM risk — Pods approaching their memory limits

Network

  • Bytes transmitted — Outbound network traffic per pod
  • Bytes received — Inbound network traffic per pod
  • Packet rate — Packets per second for anomaly detection
  • Cross-namespace traffic — East-west traffic patterns

Disk I/O

  • Read throughput — Bytes read per second
  • Write throughput — Bytes written per second
  • IOPS — I/O operations per second
  • Latency — Read/write latency percentiles

Storage

  • PVC usage — Persistent Volume Claim utilization
  • Capacity planning — Growth trends and forecasting
  • Orphaned volumes — PVCs not attached to any pod

Pod health

  • Restart count — Track instability across workloads
  • Uptime — Time since last restart
  • Phase — Running, Pending, Failed, Succeeded
  • Container status — Individual container readiness

GPU

  • GPU utilization — Percentage of GPU compute in use
  • GPU memory — VRAM usage
  • GPU power — Watts consumed by GPU
  • GPU temperature — Thermal monitoring
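
Several of the signals listed above (requests vs. usage, throttling, OOM risk) are ratios between a measured value and a configured request or limit. The sketch below shows one way such ratios could be derived. The `PodSample` fields, the function names, and the 0.9 OOM threshold are illustrative assumptions, not GreenKube's actual data model or defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodSample:
    """Hypothetical per-pod snapshot; field names are illustrative only."""
    cpu_usage_cores: float          # average cores used over the window
    cpu_request_cores: float        # 0 means no CPU request is set
    throttled_periods: int          # scheduler periods in which the pod was throttled
    total_periods: int              # scheduler periods elapsed in the window
    memory_working_set_bytes: int
    memory_limit_bytes: int         # 0 means no memory limit is set


def cpu_request_utilization(sample: PodSample) -> Optional[float]:
    """CPU usage as a fraction of the request, or None without a request."""
    if sample.cpu_request_cores <= 0:
        return None
    return sample.cpu_usage_cores / sample.cpu_request_cores


def throttle_ratio(sample: PodSample) -> float:
    """Fraction of scheduler periods in which the pod was throttled."""
    if sample.total_periods == 0:
        return 0.0
    return sample.throttled_periods / sample.total_periods


def oom_risk(sample: PodSample, threshold: float = 0.9) -> bool:
    """True when the working set is at or above `threshold` of the limit.

    The 0.9 default is an assumed value chosen for illustration.
    """
    if sample.memory_limit_bytes <= 0:
        return False
    used = sample.memory_working_set_bytes / sample.memory_limit_bytes
    return used >= threshold
```

A pod that uses a quarter of its CPU request while close to its memory limit would show a low `cpu_request_utilization` and a true `oom_risk`, which suggests lowering the CPU request and raising the memory limit.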

GreenKube collects metrics from multiple sources:

Source            Metrics
Prometheus        CPU, memory, network, disk, GPU
Kubernetes API    Pod status, restarts, node info, HPAs
OpenCost          Cost allocation data
Electricity Maps  Carbon intensity per region

Prometheus ─┐
K8s API ────┼──→ Async Collector ──→ Processor ──→ Storage
OpenCost ───┤                                         │
Elec. Maps ─┘                                   Dashboard/API

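The collection pipeline queries its sources concurrently with asyncio.gather, so a collection cycle takes about as long as the slowest source rather than the sum of all of them. This keeps the overhead on your cluster low.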
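
As a rough illustration of this pattern, the sketch below fans out to several collectors with `asyncio.gather` and merges the results. The `fetch_*` coroutines, their return values, and `collect_all` are hypothetical stand-ins, not GreenKube's real collectors. Passing `return_exceptions=True` is a design choice made here so that one failing source does not abort the whole cycle.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Collector = Callable[[], Awaitable[Dict[str, Any]]]


# Hypothetical collectors: each sleep stands in for a network request.
async def fetch_prometheus() -> Dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"cpu_seconds": 12.5}


async def fetch_kubernetes() -> Dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"restarts": 2}


async def fetch_opencost() -> Dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"cost_usd": 0.42}


async def fetch_electricity_maps() -> Dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"gco2_per_kwh": 56}


async def collect_all(collectors: Dict[str, Collector]) -> Dict[str, Any]:
    """Run every collector concurrently and key the results by source name."""
    names = list(collectors)
    # gather returns results in the same order as the awaitables passed in.
    results = await asyncio.gather(
        *(collectors[name]() for name in names),
        return_exceptions=True,
    )
    merged: Dict[str, Any] = {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            # Record the failure instead of losing the other sources' data.
            merged[name] = {"error": repr(result)}
        else:
            merged[name] = result
    return merged


if __name__ == "__main__":
    snapshot = asyncio.run(collect_all({
        "prometheus": fetch_prometheus,
        "kubernetes": fetch_kubernetes,
        "opencost": fetch_opencost,
        "electricity_maps": fetch_electricity_maps,
    }))
    print(snapshot)
```

Because the four stubs wait concurrently, the whole cycle finishes in roughly the time of one sleep, not four.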

  • Raw metrics — Configurable retention (default: 30 days)
  • Hourly aggregation — Kept for 90 days
  • Daily aggregation — Kept for 1 year
  • Export — CSV/JSON for any time range
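
The retention tiers above imply a downsampling step from raw samples to hourly and daily values. The sketch below shows one possible shape for it. The retention periods come from the list above, with "1 year" taken as 365 days. Using the mean as the aggregation function, and the names `aggregate_hourly` and `resolution_for_age`, are assumptions for illustration; the documentation does not say how GreenKube aggregates.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import fmean
from typing import Dict, Iterable, List, Optional, Tuple

RAW_RETENTION = timedelta(days=30)      # default; configurable per the list above
HOURLY_RETENTION = timedelta(days=90)
DAILY_RETENTION = timedelta(days=365)   # "1 year", assumed to mean 365 days


def resolution_for_age(age: timedelta) -> Optional[str]:
    """Finest resolution still retained for data of the given age."""
    if age <= RAW_RETENTION:
        return "raw"
    if age <= HOURLY_RETENTION:
        return "hourly"
    if age <= DAILY_RETENTION:
        return "daily"
    return None  # older than every tier: no data retained


def aggregate_hourly(
    samples: Iterable[Tuple[datetime, float]],
) -> Dict[datetime, float]:
    """Average raw (timestamp, value) samples into buckets keyed by the hour."""
    buckets: Dict[datetime, List[float]] = defaultdict(list)
    for timestamp, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: fmean(values) for hour, values in sorted(buckets.items())}
```

A query layer could call `resolution_for_age` to decide which table to read for a requested time range, falling back to coarser data as the range reaches further into the past.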