
Architecture Overview

GreenKube is built on Clean Architecture and Hexagonal Architecture principles, ensuring modularity, testability, and extensibility.

  • Async-First: all I/O uses Python asyncio for non-blocking execution
  • Database Agnostic: the repository pattern abstracts storage (PostgreSQL, SQLite, Elasticsearch)
  • Cloud Agnostic: supports AWS, GCP, Azure, OVH, Scaleway via mapping files
  • Resilient: graceful degradation when data sources are unavailable
  • Transparent: clear flagging of estimated vs. measured values
  • Modular: each component is independently testable and replaceable
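The "Transparent" principle means every value carries its provenance. A minimal sketch of what such flagging can look like, using a plain dataclass (GreenKube's real models are Pydantic, and the field names here are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class EnergyReading:
    """Hypothetical reading that records provenance per the 'Transparent' principle."""
    joules: float
    is_estimated: bool  # True when derived from a model, False when measured

def describe(reading: EnergyReading) -> str:
    # Surface the provenance flag alongside the value.
    source = "estimated" if reading.is_estimated else "measured"
    return f"{reading.joules} J ({source})"
```

Downstream consumers (reports, exports) can then distinguish modeled estimates from hardware measurements without guessing.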
1. Presentation Layer
  • Web Dashboard (SvelteKit)
  • REST API (FastAPI)
  • CLI (Typer + Rich)

2. Business Logic (infrastructure-independent)
  • DataProcessor (orchestrator)
  • Carbon Calculator (CO₂e computation)
  • Recommender (analysis engine)
  • BasicEstimator (energy model)
  • Config (12-Factor)

3. Adapters (I/O)
  • Input (collectors): PrometheusCollector, NodeCollector, PodCollector, HPACollector, OpenCostCollector, ElectricityMapsCollector, BoaviztaCollector
  • Output (repositories): PostgresRepository, SQLiteRepository, ElasticsearchRepo, NodeRepository, RecommendationRepo, EmbodiedRepository
src/greenkube/
├── __init__.py                        # Version
├── api/                               # FastAPI server & endpoints
│   ├── app.py                         # Application factory
│   ├── routers/                       # Route handlers
│   └── ...
├── cli/                               # Typer CLI commands
│   ├── main.py                        # CLI entry point
│   └── ...
├── collectors/                        # Input adapters
│   ├── base_collector.py              # Abstract base class
│   ├── prometheus_collector.py
│   ├── node_collector.py
│   ├── pod_collector.py
│   ├── opencost_collector.py
│   ├── electricity_maps_collector.py
│   ├── hpa_collector.py               # HPA detection
│   ├── boavizta_collector.py
│   └── discovery/                     # Service auto-discovery
├── core/                              # Business logic (no external deps)
│   ├── config.py                      # Configuration management
│   ├── factory.py                     # Repository & service factory
│   ├── processor.py                   # Data pipeline orchestrator
│   ├── collection_orchestrator.py     # Async collector runner
│   ├── metric_assembler.py            # Combines data into CombinedMetrics
│   ├── node_zone_mapper.py            # Cloud zone → carbon zone mapping
│   ├── prometheus_resource_mapper.py  # Per-pod resource maps
│   ├── cost_normalizer.py             # Cost normalization per step
│   ├── historical_range_processor.py  # Day-chunked range queries
│   ├── embodied_emissions_service.py  # Boavizta integration
│   ├── calculator.py                  # Carbon emission calculator
│   ├── recommender.py                 # Optimization analysis
│   ├── aggregator.py                  # Metric aggregation
│   ├── scheduler.py                   # Collection scheduling
│   ├── db.py                          # Database initialization
│   ├── migration.py                   # Database schema migrations
│   ├── k8s_client.py                  # Kubernetes client helper
│   └── exceptions.py                  # Custom exceptions
├── energy/                            # Energy modeling
│   └── estimator.py                   # BasicEstimator (power → energy)
├── data/                              # Static data & profiles
│   ├── datacenter_pue_profiles.py
│   ├── instance_profiles.py
│   ├── provider_power_estimates.csv
│   └── electricity_maps_regions_grid_intensity_default.py
├── models/                            # Pydantic data models
│   ├── metrics.py                     # CombinedMetric, CostMetric, etc.
│   ├── node.py                        # NodeInfo, NodeZoneContext
│   ├── k8s.py                         # Kubernetes-specific models
│   ├── boavizta.py                    # Boavizta API response models
│   ├── cli.py                         # CLI-specific models
│   └── region_mapping.py              # Cloud region → carbon zone
├── storage/                           # Output adapters (repositories)
│   ├── base_repository.py             # Abstract interfaces (ABC)
│   ├── postgres_repository.py
│   ├── postgres_node_repository.py
│   ├── postgres_recommendation_repository.py
│   ├── postgres_carbon_intensity_repository.py
│   ├── sqlite_repository.py
│   ├── sqlite_node_repository.py
│   ├── sqlite_recommendation_repository.py
│   ├── sqlite_carbon_intensity_repository.py
│   ├── elasticsearch_repository.py
│   ├── elasticsearch_node_repository.py
│   ├── elasticsearch_carbon_intensity_repository.py
│   └── embodied_repository.py
├── reporters/                         # Report formatting
├── exporters/                         # Data export (CSV, JSON)
└── utils/                             # Utilities

All storage operations go through abstract base classes defined in storage/base_repository.py:

# storage/base_repository.py (simplified; model imports omitted)
from abc import ABC, abstractmethod
from datetime import datetime
from typing import List

class CarbonIntensityRepository(ABC):
    @abstractmethod
    async def get_for_zone_at_time(self, zone: str, timestamp: str) -> float | None: ...

    @abstractmethod
    async def save_history(self, history_data: list, zone: str) -> int: ...

    @abstractmethod
    async def write_combined_metrics(self, metrics: List[CombinedMetric]): ...

    @abstractmethod
    async def read_combined_metrics(self, start_time: datetime, end_time: datetime) -> List[CombinedMetric]: ...


class NodeRepository(ABC):
    @abstractmethod
    async def save_nodes(self, nodes: List[NodeInfo]) -> int: ...

    @abstractmethod
    async def get_snapshots(self, start: datetime, end: datetime) -> List[tuple[str, NodeInfo]]: ...


class RecommendationRepository(ABC):
    @abstractmethod
    async def save_recommendations(self, records: List[RecommendationRecord]) -> int: ...

    @abstractmethod
    async def get_recommendations(self, start: datetime, end: datetime, ...) -> List[RecommendationRecord]: ...

Implementations per backend:

  • PostgreSQL: PostgresRepository, PostgresNodeRepository, PostgresRecommendationRepository
  • SQLite: SQLiteRepository, SQLiteNodeRepository, SQLiteRecommendationRepository
  • Elasticsearch: ElasticsearchRepository, ElasticsearchNodeRepository

The factory instantiates the correct implementation based on configuration:

# core/factory.py (simplified)
def get_repository(config: Config) -> CarbonIntensityRepository:
    if config.DB_TYPE == "postgres":
        return PostgresRepository(config.DB_CONNECTION_STRING)
    elif config.DB_TYPE == "sqlite":
        return SQLiteRepository(config.DB_PATH)
    elif config.DB_TYPE == "elasticsearch":
        return ElasticsearchRepository(config.ELASTICSEARCH_HOSTS)
    raise ValueError(f"Unsupported DB_TYPE: {config.DB_TYPE}")

def get_node_repository(config: Config) -> NodeRepository: ...
def get_recommendation_repository(config: Config) -> RecommendationRepository: ...
def get_embodied_repository(config: Config) -> EmbodiedRepository: ...
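The payoff of the factory is that callers depend only on the abstract return type. A self-contained re-creation of the dispatch with stub classes (class and config attribute names mirror the simplified snippet; the stubs themselves are hypothetical):

```python
# Stub stand-ins for the real repository implementations.
class PostgresRepository:
    def __init__(self, dsn: str) -> None:
        self.dsn = dsn

class SQLiteRepository:
    def __init__(self, path: str) -> None:
        self.path = path

class Config:
    """Minimal stand-in for GreenKube's 12-Factor config object."""
    def __init__(self, db_type: str, db_path: str = "", dsn: str = "") -> None:
        self.DB_TYPE = db_type
        self.DB_PATH = db_path
        self.DB_CONNECTION_STRING = dsn

def get_repository(config: Config):
    # Same dispatch shape as core/factory.py, minus the Elasticsearch branch.
    if config.DB_TYPE == "postgres":
        return PostgresRepository(config.DB_CONNECTION_STRING)
    if config.DB_TYPE == "sqlite":
        return SQLiteRepository(config.DB_PATH)
    raise ValueError(f"Unsupported DB_TYPE: {config.DB_TYPE}")

repo = get_repository(Config("sqlite", db_path="greenkube.db"))
```

Switching backends is then a configuration change, not a code change.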

All collectors follow a consistent async interface:

class PrometheusCollector:
    async def collect(self) -> PrometheusMetric:
        """Fetch all metrics concurrently."""
        results = await asyncio.gather(
            self._query_cpu(),
            self._query_memory(),
            self._query_network_rx(),
            self._query_network_tx(),
            self._query_disk_read(),
            self._query_disk_write(),
            self._query_restarts(),
            self._query_node_labels(),
        )
        return PrometheusMetric(...)
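The key idea in `collect()` is that `asyncio.gather` issues all queries at once and preserves result order. A runnable sketch of the same pattern with two stand-in query helpers (names and return values are hypothetical):

```python
import asyncio

async def _query_cpu() -> float:
    await asyncio.sleep(0)  # stands in for an HTTP round-trip to Prometheus
    return 0.25

async def _query_memory() -> float:
    await asyncio.sleep(0)
    return 512.0

async def collect() -> dict[str, float]:
    # Both queries run concurrently; results come back in call order.
    cpu, memory = await asyncio.gather(_query_cpu(), _query_memory())
    return {"cpu_cores": cpu, "memory_mib": memory}

result = asyncio.run(collect())
```

With real HTTP calls, total latency approaches the slowest single query rather than the sum of all of them.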

GreenKube leverages Python’s asyncio throughout. The DataProcessor delegates to focused collaborators:

DataProcessor.run()
├── CollectionOrchestrator     ← Parallel collection
│     asyncio.gather(
│         prometheus.collect(),
│         opencost.collect(),
│         pod_collector.collect(),
│         node_collector.collect()
│     )
├── PrometheusResourceMapper   ← Build per-pod resource maps
├── BasicEstimator             ← CPU → energy estimation
├── NodeZoneMapper             ← Region → carbon zone
├── CostNormalizer             ← Normalize costs per step
├── CarbonCalculator           ← Energy → CO₂e
├── EmbodiedEmissionsService   ← Boavizta embodied CO₂
├── MetricAssembler            ← Combine into CombinedMetric
└── repository.write()         ← Async batch insert

This design allows GreenKube to efficiently handle large clusters with hundreds of nodes and thousands of pods without blocking.
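The pipeline shape above (concurrent collection, then sequential transformation stages) can be sketched end to end. This is a heavily trimmed, hypothetical mock-up, not GreenKube's actual calculator: the power draw, grid intensity, and helper names are illustrative constants:

```python
import asyncio

async def collect_all() -> dict:
    # Stage 1: run collectors concurrently (stubs returning fixed values).
    async def prometheus() -> dict:
        return {"cpu_seconds": 120.0}

    async def opencost() -> dict:
        return {"cost_usd": 0.42}

    prom, cost = await asyncio.gather(prometheus(), opencost())
    return {**prom, **cost}

def estimate_energy_kwh(cpu_seconds: float, watts_per_core: float = 10.0) -> float:
    # Stage 2: toy CPU -> energy model assuming constant power draw.
    return cpu_seconds * watts_per_core / 3600 / 1000

def to_co2e_grams(energy_kwh: float, grid_intensity: float = 56.0) -> float:
    # Stage 3: energy -> emissions at a fixed grid intensity (gCO2e/kWh).
    return energy_kwh * grid_intensity

async def run() -> float:
    raw = await collect_all()
    return to_co2e_grams(estimate_energy_kwh(raw["cpu_seconds"]))

grams = asyncio.run(run())
```

Only the I/O-bound collection stage needs concurrency; the CPU-bound transformations run sequentially on the gathered data.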

474+ unit tests cover all components:

  • pytest with pytest-asyncio for async tests
  • respx for HTTP request mocking
  • unittest.mock.AsyncMock for async component mocking
  • AAA pattern (Arrange, Act, Assert) throughout
# Run all tests
pytest
# Run with coverage
pytest --cov=greenkube --cov-report=html