Compare commits

4 Commits

Author SHA1 Message Date
Eric FELIXINE
83779cf5d7 fix: telegraf topics, mqtt brokers, docker-compose fixes
- Fix MOSQUITTO_HOST (wrong container name)
- Fix EMQX_PORT (1885 external -> 1883 internal)
- Fix telegraf MQTT topics (city/sensors/#)
- Fix BunkerM dynsec JSON
- Add kepler.yml Traefik config
- Update monitoring script
2026-06-07 20:18:41 -04:00
Eric FELIXINE
7c0cb330d9 chore: update TODO.md timestamp 2026-06-04 10:15
2026-06-04 10:26:34 -04:00
Eric FELIXINE
f45ac0cb6e feat(k8s): add defaults/main.yml, meta/main.yml for all 27 roles + 4 helm templates
- Added defaults/main.yml with production-ready values for all 27 Ansible roles
- Added meta/main.yml with role dependencies (DAG: prereq → namespaces → storage → traefik → cert-manager → services)
- Created 4 missing Helm templates: flink-deployment, kafka-cluster, smartapp-web, smartapp-api
- Fixed YAML syntax error in backup/tasks/main.yml (Velero backupStorageLocation)
- Updated README with domain list, dependencies diagram, and corrected Helm chart names
- All 81 YAML files pass validation
2026-06-04 09:45:16 -04:00
Eric FELIXINE
66ac47b684 docs: add infrastructure snapshot 2026-06-04
2026-06-04 02:26:23 -04:00
71 changed files with 1948 additions and 121 deletions

TODO.md

@@ -1,6 +1,6 @@
# Smart City Digital Twin — TODO List
> Last updated: 2026-06-04 00:30 (documentation finalization)
> Last updated: 2026-06-04 02:00 (finalization)
## ✅ Completed (session 2026-06-03 / 06-04)
@@ -25,10 +25,11 @@
| monitoring-cleanup | Grafana + Loki + Prometheus + InfluxDB + Telegraf removed | To be redeployed via Helm |
| storage-cleanup | MinIO + PostgreSQL + PostGIS + Redis + Zookeeper removed | To be redeployed via Helm |
| misc-cleanup | AgentGateway + Esperotech + Redpanda Console + Docker exporter + Simulator removed | |
| backups | Config backups | Files saved in /home/eric/backups/2026-06-03/ |
| helms-ansible | Generated Helm/Ansible files | 25+ roles in /home/eric/helms/ |
| backups | Config backups | Files saved in /home/eric/backups/2026-06-04/ |
| helms-ansible | Generated Helm/Ansible files | 25+ roles in helms/ |
| helms-readme | K8s deployment README | Architecture, installation, troubleshooting |
| helms-vault | vault.yml template | Encrypted variables for the deployment |
| git-push | Push to Gitea | 2 commits pushed (TODO + helms) |
## 🔴 In progress
@@ -68,44 +69,26 @@
## 📁 Generated Helm / Ansible files
The `helms/` directory (in the Gitea repo) contains the files for a modular Kubernetes deployment via Ansible.
### Structure
```
helms/
├── README.md # Deployment documentation
├── deploy.yml # Main playbook
├── undeploy.yml # Teardown playbook
├── inventory/
│ └── hosts.yml # K8s node inventory
├── group_vars/
│ ├── all.yml # Global variables
│ └── vault.yml # Encrypted variables (template)
├── inventory/hosts.yml # K8s node inventory
├── group_vars/all.yml # Global variables
├── group_vars/vault.yml # Encrypted variables (template)
└── roles/ # 25+ Ansible roles
├── prerequisites/
├── namespaces/
├── storage/
├── traefik/
├── cert-manager/
├── monitoring/
├── databases/
├── kafka/
├── flink/
├── airflow/
├── iot/
├── gitea/
├── jupyterhub/
├── bi/
├── mindsdb/
├── odk/
├── gis/
├── clickhouse/
├── starrocks/
├── trino/
├── deltalake/
├── streamlit/
├── duckdb/
├── nodered/
├── phpipam/
├── smartapp/
└── backup/
```
### Usage
```bash
cd helms/
ansible-playbook deploy.yml --ask-vault-pass
ansible-playbook deploy.yml --tags clickhouse --ask-vault-pass
ansible-playbook undeploy.yml
```
## 📝 Current infrastructure (10 Docker containers)
@@ -123,6 +106,15 @@ helms/
| smart-city-kepler | smart-city-kepler:latest | ✅ Up 2 weeks |
| gitea | gitea/gitea:latest | ✅ Up 2 days |
## 📊 Statistics
- **Docker containers**: 10 (down from 72)
- **Stacks removed**: 6 (OpenFN, Ditto, OpenRemote, Gravitino, FIWARE GIS, Contexus)
- **Unhealthy services**: 0
- **Helm/Ansible files**: 33 files
- **Ansible roles**: 25+
- **Planned K8s namespaces**: 18
## Credentials
- **Gitea**: eric / (see config)


@@ -20,7 +20,7 @@ services:
- smartcity-shared
- traefik-public
ports:
- "1883:1900"
- "1884:1900"
- "2000:2000"
environment:
- MQTT_PORT=1900


@@ -13,7 +13,7 @@ services:
- ditto-mongo-data:/data/db
ditto-policies:
image: eclipse/ditto-policies:latest
image: eclipse/ditto-policies:3.8.0
container_name: smart-city-ditto-policies
restart: unless-stopped
hostname: ditto-policies
@@ -35,7 +35,7 @@ services:
- ditto-policies
ditto-things:
image: eclipse/ditto-things:latest
image: eclipse/ditto-things:3.8.0
container_name: smart-city-ditto-things
restart: unless-stopped
hostname: ditto-things
@@ -74,10 +74,12 @@ services:
- AKKA_REMOTE_CANONICAL_HOSTNAME=ditto-gateway
- AKKA_REMOTE_CANONICAL_PORT=2551
- DITTO_GW_STREAMING_ENABLED=true
- DITTO_GW_MQTT_BROKER=smart-city-mosquitto:1883
- DITTO_GW_MQTT_BROKER=192.168.192.26:1883
- DITTO_GW_MQTT_TOPIC_FILTER=smartcity/#
- DEVOPS_PASSWORD=OvP9WVB09aFDnYPyK52UIg
- JAVA_TOOL_OPTIONS=-Xms512m -Xmx1024m -Dditto.gateway.http.port=8080 -Dditto.gateway.http.api.enabled=true
- DITTO_APIDOC_ENABLED=true
- DITTO_GATEWAY_HTTP_API_ENABLED=true
networks:
traefik-public:
aliases:
@@ -90,9 +92,22 @@ services:
- "traefik.http.routers.ditto.tls.certresolver=letsencrypt"
- "traefik.http.services.ditto.loadbalancer.server.port=8080"
ditto-ui:
image: eclipse/ditto-ui:latest
container_name: smart-city-ditto-ui
restart: unless-stopped
depends_on:
- ditto-gateway
networks:
traefik-public:
aliases:
- ditto-ui
networks:
traefik-public:
external: true
smartcity-shared:
external: true
volumes:
ditto-mongo-data:
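
Review note: the commit message mentions Telegraf topics under `city/sensors/#`, while this hunk keeps `DITTO_GW_MQTT_TOPIC_FILTER=smartcity/#`. Whether the two prefixes are meant to differ is not stated in the diff. The sketch below is a minimal matcher for MQTT topic filters (`+` for one level, `#` for the remaining levels) that can be used to check which concrete topics a filter would cover. The function name and the sample topics are hypothetical, and topics starting with `$` are not treated specially.

```python
def mqtt_topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` is covered by the MQTT `topic_filter`."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level; it matches
            # the parent level itself and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # No '#' seen: the filter and the topic must have the same depth.
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    sample = "city/sensors/temperature/42"  # hypothetical topic name
    for f in ("city/sensors/#", "smartcity/#"):
        print(f, "->", mqtt_topic_matches(f, sample))
```

With these two filters, a message published under `city/sensors/...` would be seen by a `city/sensors/#` subscriber but not by a `smartcity/#` subscriber.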

docker-compose.emqx.yml Normal file

@@ -0,0 +1,29 @@
services:
emqx:
image: emqx/emqx:5.4
container_name: emqx_emqx_1
restart: unless-stopped
networks:
- smartcity-shared
ports:
- "1885:1883"
- "8083:8083"
- "8883:8883"
- "8084:8084"
- "18083:18083"
environment:
- EMQX_NAME=emqx
- EMQX_HOST=emqx_emqx_1
volumes:
- emqx-data:/opt/emqx/data
- emqx-log:/opt/emqx/log
volumes:
emqx-data:
name: smart-city-emqx-data
emqx-log:
name: smart-city-emqx-log
networks:
smartcity-shared:
external: true
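
Review note: the commit message describes the EMQX fix as "1885 external -> 1883 internal", which matches the `"1885:1883"` mapping in this file. Containers attached to `smartcity-shared` reach the broker on the container port, while clients on the host use the published port. The following is a minimal sketch of that rule, assuming the short `HOST:CONTAINER` form only (no bind address, no `/udp` suffix); the helper names are hypothetical.

```python
from typing import NamedTuple


class PortMapping(NamedTuple):
    host_port: int
    container_port: int


def parse_port_mapping(spec: str) -> PortMapping:
    """Parse a short-form Compose port entry such as '1885:1883'."""
    host, container = spec.split(":")
    return PortMapping(int(host), int(container))


def port_for_client(spec: str, same_docker_network: bool) -> int:
    """Pick the port a client should use to reach the service."""
    mapping = parse_port_mapping(spec)
    # Traffic between containers on one network bypasses the published port.
    return mapping.container_port if same_docker_network else mapping.host_port


if __name__ == "__main__":
    print(port_for_client("1885:1883", same_docker_network=True))
    print(port_for_client("1885:1883", same_docker_network=False))
```

By the same rule, the hunk that sets `IOTA_MQTT_PORT=1885` together with `IOTA_MQTT_HOST=emqx_emqx_1` may deserve a second look, since that host name is a container name.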


@@ -23,7 +23,7 @@ services:
- IOTA_REGISTRY_TYPE=memory
# MQTT Listener - EMQX
- IOTA_MQTT_HOST=emqx_emqx_1
- IOTA_MQTT_PORT=1883
- IOTA_MQTT_PORT=1885
- IOTA_PROVIDER_URL=http://smart-city-iot-agent-emqx:4041
- IOTA_DEFAULT_RESOURCE=/
- IOTA_DEFAULT_APIKEY=smartcity-emqx


@@ -13,7 +13,7 @@ services:
- orion-ld
- smart-city-orion-ld
traefik-public:
command: -dbhost smart-city-mongodb -db orion
command: -dbhost smart-city-iot-mongodb -db orion
healthcheck:
test: ["CMD-SHELL", "curl -f http://localhost:1026/version || exit 1"]
interval: 30s


@@ -1,28 +1,16 @@
# Redpanda → InfluxDB Consumer
# Reads Redpanda topics and writes to InfluxDB for Grafana
version: "3.8"
# DISABLED: Redpanda broker not started
# Usage: docker compose -f docker-compose.redpanda-consumer.yml up -d
services:
redpanda-consumer:
image: python:3.11-slim
container_name: smart-city-redpanda-consumer
restart: unless-stopped
restart: "no"
command: >
sh -c "pip install requests && python3 /app/consumer.py"
volumes:
- ./redpanda/consumer.py:/app/consumer.py:ro
environment:
- INFLUX_URL=http://smart-city-influxdb:8086
- INFLUX_TOKEN=my-super-admin-token
- INFLUX_ORG=digitribe
- INFLUX_BUCKET=iot_data
sh -c "echo 'Redpanda consumer désactivé — Redpanda broker non démarré' && sleep infinity"
networks:
- smartcity-shared
healthcheck:
test: ["CMD", "python3", "-c", "import urllib.request; urllib.request.urlopen('http://smart-city-redpanda:9644/public_metrics')"]
interval: 30s
timeout: 10s
retries: 3
networks:
smartcity-shared:


@@ -30,7 +30,7 @@ services:
- ENABLE_BUNKER=1
- EMQX_HOST=emqx_emqx_1
- EMQX_PORT=1883
- MOSQUITTO_HOST=smart-city-digital-twin-martinique-mosquitto-1
- MOSQUITTO_HOST=smart-city-mosquitto-1
- MOSQUITTO_PORT=1883
- BUNKERM_HOST=bunkerm-bunkerm-1
- BUNKERM_PORT=1900


@@ -123,33 +123,58 @@ kubectl get ingress --all-namespaces
| Service | Domain | Namespace | Helm Chart |
|---------|---------|-----------|------------|
| Traefik | traefik.digitribe.fr | traefik | traefik/traefik |
| Airflow | airflow.digitribe.fr | airflow | apache/airflow |
| Kafka | kafka.digitribe.fr | kafka | strimzi/kafka-operator |
| Kafka | kafka-bootstrap.digitribe.fr | kafka | strimzi/kafka-operator |
| Flink | flink.digitribe.fr | flink | apache/flink-kubernetes-operator |
| ClickHouse | clickhouse.digitribe.fr | clickhouse | bitnami/clickhouse |
| StarRocks | starrocks.digitribe.fr | starrocks | starrocks/starrocks-community |
| StarRocks | starrocks.digitribe.fr | starrocks | community/starrocks |
| Trino | trino.digitribe.fr | trino | trinodb/trino |
| Delta Lake | deltalake.digitribe.fr | deltalake | delta-io/delta-lake |
| Streamlit | streamlit.digitribe.fr | streamlit | streamlit/streamlit |
| DuckDB | duckdb.digitribe.fr | duckdb | duckdb/duckdb |
| Delta Lake | deltalake.digitribe.fr | deltalake | custom |
| Streamlit | streamlit.digitribe.fr | streamlit | custom |
| DuckDB | duckdb.digitribe.fr | duckdb | custom |
| EMQX | emqx.digitribe.fr | iot | emqx/emqx-operator |
| Mosquitto | mqtt.digitribe.fr | iot | k8s-at-home/mosquitto |
| Node-RED | nodered.digitribe.fr | iot | k8s-at-home/node-red |
| phpIPAM | phpipam.digitribe.fr | phpipam | phpipam/phpipam |
| ChirpStack | chirpstack.digitribe.fr | iot | chirpstack/chirpstack |
| Gitea | gitea.digitribe.fr | gitea | gitea/gitea |
| Mosquitto | mqtt.digitribe.fr | iot | custom |
| Node-RED | nodered.digitribe.fr | iot | custom |
| phpIPAM | phpipam.digitribe.fr | phpipam | custom |
| Gitea | gitea.digitribe.fr | gitea | gitea-charts/gitea |
| JupyterHub | jupyter.digitribe.fr | jupyterhub | jupyterhub/jupyterhub |
| Zeppelin | zeppelin.digitribe.fr | default | apache/zeppelin |
| Superset | superset.digitribe.fr | superset | apache/superset |
| Metabase | metabase.digitribe.fr | metabase | bitnami/metabase |
| MindsDB | mindsdb.digitribe.fr | mindsdb | bitnami/mindsdb |
| ODK Central | odk.digitribe.fr | odk | odk/odk-central |
| MapStore | mapstore.digitribe.fr | mapstore | geosolutionsit/mapstore |
| GeoServer | geoserver.digitribe.fr | geoserver | kartoza/geoserver |
| FROST | frost.digitribe.fr | iot | fraunhoferiosb/frost-server |
| ODK Central | odk.digitribe.fr | odk | custom |
| MapStore | mapstore.digitribe.fr | gis | custom |
| GeoServer | geoserver.digitribe.fr | gis | custom |
| Smart App | smartapp.digitribe.fr | smartapp | custom |
| Smart App API | api-smartapp.digitribe.fr | smartapp | custom |
| Grafana | grafana.digitribe.fr | monitoring | grafana/grafana |
| MinIO | minio.digitribe.fr | default | bitnami/minio |
| MinIO | minio.digitribe.fr | databases | bitnami/minio |
| PostgreSQL | internal only | databases | bitnami/postgresql-ha |
| Redis | internal only | databases | bitnami/redis-cluster |
## Role dependencies
```
prerequisites → namespaces → storage → traefik → cert-manager
┌─────────────────────┼─────────────────────┐
↓ ↓ ↓
databases monitoring kafka
(postgres, (prometheus, ↓
redis, minio) grafana, loki) flink
↓ ↓ ↓
└─────────────────────┼─────────────────────┘
┌─────────────────────┼─────────────────────┐
↓ ↓ ↓
airflow bi iot
gitea jupyterhub superset metabase emqx mosquitto
odk mindsdb trino nodered phpipam
gis clickhouse streamlit
smartapp deltalake duckdb
backup (Velero)
```
## Useful commands


@@ -0,0 +1,19 @@
---
# Role: airflow
# Default values for Apache Airflow
# Airflow worker replicas
services:
airflow:
replicas: 2
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"
# Airflow log storage
storage_sizes:
airflow: "20Gi"
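
Review note: every `defaults/main.yml` in this change uses Kubernetes quantity strings (`"500m"`, `"1Gi"`). The sketch below shows one way to convert them and total the requests of a role, which can help when checking whether the planned roles fit the cluster. It is a minimal sketch that handles only millicore CPU values and binary memory suffixes; the function names are hypothetical.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity ('500m', '2') to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


_BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity: str) -> int:
    """Convert a binary-suffixed memory quantity ('512Mi', '4Gi') to bytes."""
    for suffix, factor in _BINARY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


def total_requests(replicas: int, cpu: str, memory: str) -> tuple[float, int]:
    """Total (cores, bytes) requested by `replicas` identical pods."""
    return replicas * parse_cpu(cpu), replicas * parse_memory(memory)


if __name__ == "__main__":
    # Values taken from the airflow defaults in this hunk.
    cores, mem_bytes = total_requests(2, "500m", "1Gi")
    print(cores, "cores,", mem_bytes / 2**30, "GiB")
```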


@@ -0,0 +1,13 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Apache Airflow for workflow orchestration on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases
- role: kafka
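
Review note: the `dependencies` lists in the `meta/main.yml` files form the role DAG described in the commit message. Ansible resolves these itself at run time; the sketch below only illustrates how the resulting order (and any accidental cycle) could be checked offline with the standard-library `graphlib` module (Python 3.9+). The dictionary copies a subset of the dependencies visible in this diff and is an assumption about the full graph; roles whose `meta/main.yml` is not shown here appear only as prerequisites.

```python
from graphlib import TopologicalSorter

# Subset of role dependencies copied from the meta/main.yml hunks in this diff.
ROLE_DEPENDENCIES = {
    "prerequisites": [],
    "namespaces": ["prerequisites"],
    "cert-manager": ["traefik"],
    "databases": ["storage", "cert-manager"],
    "kafka": ["storage", "cert-manager"],
    "flink": ["kafka"],
    "airflow": ["databases", "kafka"],
    "iot": ["databases", "kafka"],
    "nodered": ["iot"],
}


def deployment_order(dependencies: dict[str, list[str]]) -> list[str]:
    """Return the roles so that every role comes after its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(" -> ".join(deployment_order(ROLE_DEPENDENCIES)))
```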


@@ -0,0 +1,8 @@
---
# Role: backup
# Default values for Velero backups
# Backup schedule (cron format)
backup:
schedule: "0 2 * * *"
retention: "168" # 7 days, in hours
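
Review note: `backup.retention` is stored as a bare hour count (`"168"`). Velero expresses backup TTLs as Go-style duration strings such as `720h0m0s`, so the value probably needs converting somewhere in the role's tasks. How the tasks actually consume this variable is not shown in this diff, so the sketch below is an assumption, and both helper names are hypothetical.

```python
def hours_to_velero_ttl(hours: str) -> str:
    """Turn an hour count such as '168' into a Go-style duration string."""
    value = int(hours)
    if value <= 0:
        raise ValueError("retention must be a positive number of hours")
    return f"{value}h0m0s"


def hours_to_days(hours: str) -> float:
    """Convert an hour count to days, to cross-check the inline comment."""
    return int(hours) / 24


if __name__ == "__main__":
    print(hours_to_velero_ttl("168"), "=", hours_to_days("168"), "days")
```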


@@ -0,0 +1,11 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Velero backup and disaster recovery solution on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies: []


@@ -11,7 +11,7 @@
values:
configuration:
backupStorageLocation:
name: default
- name: default
provider: aws
bucket: smart-city-backup
config:


@@ -0,0 +1,23 @@
---
# Role: bi
# Default values for Superset and Metabase
services:
superset:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
metabase:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Business Intelligence tools on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,6 @@
---
# Role: cert-manager
# Default values for cert-manager
# Email for Let's Encrypt certificates
acme_email: "admin@digitribe.fr"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy cert-manager for automated TLS certificate management on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: traefik


@@ -0,0 +1,17 @@
---
# Role: clickhouse
# Default values for ClickHouse
services:
clickhouse:
replicas: 2
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"
storage_sizes:
clickhouse: "50Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy ClickHouse columnar database for analytics on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,27 @@
---
# Role: databases
# Default values for PostgreSQL, Redis and MinIO
services:
postgresql:
replicas: 2
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
# Storage sizes
storage_sizes:
postgresql: "50Gi"
redis: "10Gi"
minio: "100Gi"
# Vault passwords (DUMMY values, overridden by group_vars/vault.yml)
vault_postgres_password: "DUMMY_POSTGRES_PASSWORD"
vault_postgres_repmgr_password: "DUMMY_REPMGR_PASSWORD"
vault_redis_password: "DUMMY_REDIS_PASSWORD"
vault_minio_root_user: "DUMMY_MINIO_USER"
vault_minio_root_password: "DUMMY_MINIO_PASSWORD"
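
Review note: these defaults ship `DUMMY_...` placeholders that are expected to be overridden by `group_vars/vault.yml`. If an override is missing, the placeholder would be deployed as a real password. The sketch below shows one way a pre-deployment check could flag leftover placeholders in an already-merged variable dictionary. It is a minimal sketch: the function name is hypothetical, and loading and merging the YAML files is left out.

```python
def find_placeholders(variables: dict, prefix: str = "DUMMY_", path: str = "") -> list[str]:
    """Return dotted keys whose string value still starts with `prefix`."""
    found = []
    for key, value in variables.items():
        full_key = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            # Recurse into nested mappings such as `monitoring:`.
            found.extend(find_placeholders(value, prefix, full_key))
        elif isinstance(value, str) and value.startswith(prefix):
            found.append(full_key)
    return found


if __name__ == "__main__":
    merged = {
        "vault_postgres_password": "DUMMY_POSTGRES_PASSWORD",
        "vault_redis_password": "overridden-in-vault",  # illustrative value
        "monitoring": {"grafana_admin_password": "DUMMY_GRAFANA_ADMIN_PASSWORD"},
    }
    print(find_placeholders(merged))
```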


@@ -0,0 +1,13 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy and manage core database services (PostgreSQL, MySQL, Redis) on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: storage
- role: cert-manager


@@ -0,0 +1,17 @@
---
# Role: deltalake
# Default values for Delta Lake
services:
deltalake:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
storage_sizes:
deltalake: "100Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Delta Lake storage layer for data lakehouse architecture on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,17 @@
---
# Role: duckdb
# Default values for DuckDB
services:
duckdb:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
storage_sizes:
duckdb: "50Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy DuckDB embedded analytical database on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,14 @@
---
# Role: flink
# Default values for Apache Flink
services:
flink:
replicas: 2
resources:
requests:
cpu: "1000m"
memory: "2Gi"
limits:
cpu: "2000m"
memory: "4Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Apache Flink for stream processing on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: kafka


@@ -0,0 +1,140 @@
---
# Role: flink
# Template: flink-deployment.yml.j2
# Deploys an Apache Flink cluster via the Flink Kubernetes Operator
# Variables:
# flink_namespace - Kubernetes namespace (default: flink)
# flink_replicas - Number of TaskManagers (default: 2)
---
apiVersion: v1
kind: Namespace
metadata:
name: {{ flink_namespace | default('flink') }}
labels:
app: flink
version: "1.18"
---
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
name: flink-cluster
namespace: {{ flink_namespace | default('flink') }}
labels:
app: flink
version: "1.18"
spec:
image: flink:1.18-scala_2.12
flinkVersion: v1_18
imagePullPolicy: IfNotPresent
# --- JobManager ---
jobmanager:
resource:
memory: "2048m"
cpu: 1
replicas: 1
# --- TaskManager ---
taskmanager:
resource:
memory: "4096m"
cpu: 2
replicas: {{ flink_replicas | default(2) }}
# --- Flink configuration ---
flinkConfiguration:
taskmanager.numberOfTaskSlots: "2"
state.backend: rocksdb
state.checkpoints.dir: s3://flink-checkpoints
state.savepoints.dir: s3://flink-savepoints
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-cs.{{ flink_namespace | default('flink') }}.svc.cluster.local:2181
web.upload.dir: /tmp/flink-web-upload
---
apiVersion: v1
kind: Service
metadata:
name: flink-jobmanager
namespace: {{ flink_namespace | default('flink') }}
labels:
app: flink
component: jobmanager
version: "1.18"
spec:
type: ClusterIP
selector:
app: flink
component: jobmanager
ports:
- name: rpc
port: 6123
targetPort: 6123
protocol: TCP
- name: blob
port: 6124
targetPort: 6124
protocol: TCP
- name: webui
port: 8081
targetPort: 8081
protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
name: flink-taskmanager
namespace: {{ flink_namespace | default('flink') }}
labels:
app: flink
component: taskmanager
version: "1.18"
spec:
type: ClusterIP
selector:
app: flink
component: taskmanager
ports:
- name: rpc
port: 6122
targetPort: 6122
protocol: TCP
- name: data
port: 6125
targetPort: 6125
protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: flink-webui
namespace: {{ flink_namespace | default('flink') }}
labels:
app: flink
component: webui
version: "1.18"
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/ssl-redirect: "true"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
ingressClassName: nginx
tls:
- hosts:
- flink.digitribe.fr
secretName: flink-tls
rules:
- host: flink.digitribe.fr
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: flink-jobmanager
port:
number: 8081


@@ -0,0 +1,36 @@
---
# Role: gis
# Default values for MapStore, GeoServer and FROST
services:
mapstore:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
geoserver:
replicas: 1
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"
frost:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
storage_sizes:
mapstore: "10Gi"
geoserver: "20Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Geographic Information System (GIS) services on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,17 @@
---
# Role: gitea
# Default values for Gitea
services:
gitea:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
storage_sizes:
gitea: "20Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Gitea - self-hosted Git service on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,27 @@
---
# Role: iot
# Default values for EMQX and Mosquitto
services:
emqx:
replicas: 2
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
mosquitto:
replicas: 1
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"
storage_sizes:
emqx: "10Gi"
mosquitto: "5Gi"


@@ -0,0 +1,13 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy IoT platform services on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases
- role: kafka


@@ -0,0 +1,17 @@
---
# Role: jupyterhub
# Default values for JupyterHub
services:
jupyterhub:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
storage_sizes:
jupyterhub: "20Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy JupyterHub for multi-user notebook environments on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,17 @@
---
# Role: kafka
# Default values for Kafka (Strimzi)
services:
kafka:
replicas: 3
resources:
requests:
cpu: "1000m"
memory: "2Gi"
limits:
cpu: "2000m"
memory: "4Gi"
storage_sizes:
kafka: "100Gi"


@@ -0,0 +1,13 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy and manage Apache Kafka cluster on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: storage
- role: cert-manager


@@ -0,0 +1,295 @@
---
# Role: kafka
# Template: kafka-cluster.yml.j2
# Kafka cluster via the Strimzi Kafka operator
# Variables:
# kafka_namespace - Kubernetes namespace (default: kafka)
# kafka_replicas - Number of Kafka brokers (default: 3)
# kafka_storage_size - Storage size per broker (default: 100Gi)
---
apiVersion: v1
kind: Namespace
metadata:
name: {{ kafka_namespace | default('kafka') }}
labels:
app: kafka
version: "3.6"
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
name: kafka-cluster
namespace: {{ kafka_namespace | default('kafka') }}
labels:
app: kafka
version: "3.6"
spec:
kafka:
version: 3.6.0
replicas: {{ kafka_replicas | default(3) }}
listeners:
- name: plain
port: 9092
type: internal
tls: false
- name: tls
port: 9093
type: internal
tls: true
- name: external
port: 9094
type: ingress
tls: true
configuration:
bootstrap:
host: kafka-bootstrap.digitribe.fr
brokers:
- broker: 0
host: kafka-broker-0.digitribe.fr
- broker: 1
host: kafka-broker-1.digitribe.fr
- broker: 2
host: kafka-broker-2.digitribe.fr
config:
offsets.topic.replication.factor: 3
transaction.state.log.replication.factor: 3
transaction.state.log.min.isr: 2
default.replication.factor: 3
min.insync.replicas: 2
inter.broker.protocol.version: "3.6"
log.message.format.version: "3.6"
storage:
type: jbod
volumes:
- id: 0
type: persistent-claim
size: {{ kafka_storage_size | default('100Gi') }}
class: standard
deleteClaim: false
resources:
requests:
cpu: "1"
memory: "2Gi"
limits:
cpu: "2"
memory: "4Gi"
livenessProbe:
initialDelaySeconds: 30
timeoutSeconds: 5
readinessProbe:
initialDelaySeconds: 10
timeoutSeconds: 5
metricsConfig:
type: jmxPrometheusExporter
valueFrom:
configMapKeyRef:
name: kafka-metrics
key: kafka-metrics-config.yml
template:
pod:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: strimzi.io/name
operator: In
values:
- kafka-cluster-kafka
topologyKey: kubernetes.io/hostname
zookeeper:
replicas: 3
storage:
type: persistent-claim
size: 20Gi
class: standard
deleteClaim: false
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "1"
memory: "2Gi"
livenessProbe:
initialDelaySeconds: 30
timeoutSeconds: 5
readinessProbe:
initialDelaySeconds: 10
timeoutSeconds: 5
entityOperator:
topicOperator:
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "1Gi"
userOperator:
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "1Gi"
kafkaExporter:
topicRegex: ".*"
groupRegex: ".*"
resources:
requests:
cpu: "200m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"
---
apiVersion: v1
kind: ConfigMap
metadata:
name: kafka-metrics
namespace: {{ kafka_namespace | default('kafka') }}
labels:
app: kafka
version: "3.6"
data:
kafka-metrics-config.yml: |
# See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
lowercaseOutputName: true
rules:
# Special cases and very specific rules
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
name: kafka_server_$1_$2
type: GAUGE
labels:
clientId: "$3"
topic: "$4"
partition: "$5"
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
name: kafka_server_$1_$2
type: GAUGE
labels:
clientId: "$3"
broker: "$4:$5"
# Generic per-second counters with 0-2 key/value pairs
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
# Generic gauges with 0-2 key/value pairs
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
# Emulate Prometheus 'Summary' metrics for the exported 'Histogram's
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*), (.+)=(.+)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: SUMMARY
labels:
"$4": "$5"
"$6": "$7"
quantile: 0.95
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: SUMMARY
labels:
"$4": "$5"
quantile: 0.95
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
---
apiVersion: v1
kind: Service
metadata:
name: kafka-bootstrap
namespace: {{ kafka_namespace | default('kafka') }}
labels:
app: kafka
component: bootstrap
version: "3.6"
spec:
type: ClusterIP
selector:
strimzi.io/cluster: kafka-cluster
strimzi.io/name: kafka-cluster-kafka
ports:
- name: tcp-internal
port: 9092
targetPort: 9092
protocol: TCP
- name: tcp-tls
port: 9093
targetPort: 9093
protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kafka-external
namespace: {{ kafka_namespace | default('kafka') }}
labels:
app: kafka
component: external
version: "3.6"
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/backend-protocol: "TCP"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
ingressClassName: nginx
tls:
- hosts:
- kafka-bootstrap.digitribe.fr
secretName: kafka-bootstrap-tls
rules:
- host: kafka-bootstrap.digitribe.fr
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kafka-cluster-kafka-external-bootstrap
port:
number: 9094
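
Review note: the template makes `kafka_replicas` overridable but hard-codes a replication factor of 3, `min.insync.replicas: 2`, and three external broker hosts. Overriding `kafka_replicas` with a smaller value would therefore leave the replication settings unsatisfiable. The sketch below shows the kind of consistency check that could run before rendering the template. It is a minimal sketch under that assumption; the function name and messages are hypothetical.

```python
def check_replication(brokers: int, replication_factor: int, min_insync_replicas: int) -> list[str]:
    """Return a list of problems with a Kafka replication configuration."""
    problems = []
    if replication_factor > brokers:
        problems.append(
            f"replication factor {replication_factor} exceeds broker count {brokers}"
        )
    if min_insync_replicas > replication_factor:
        problems.append(
            f"min.insync.replicas {min_insync_replicas} exceeds "
            f"replication factor {replication_factor}"
        )
    return problems


if __name__ == "__main__":
    print(check_replication(3, 3, 2))  # values from the template defaults
    print(check_replication(1, 3, 2))  # kafka_replicas overridden to 1
```

Setting `min.insync.replicas` equal to the replication factor passes this check but means no broker can be offline for `acks=all` producers, which is why the template's 3/2 pairing is a common choice.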


@@ -0,0 +1,17 @@
---
# Role: mindsdb
# Default values for MindsDB
services:
mindsdb:
replicas: 1
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"
storage_sizes:
mindsdb: "20Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy MindsDB - open-source AI/ML database on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,12 @@
---
# Role: monitoring
# Default values for Prometheus, Grafana, Loki and Promtail
monitoring:
prometheus_retention: "30d"
grafana_admin_password: "DUMMY_GRAFANA_ADMIN_PASSWORD"
storage_sizes:
prometheus: "50Gi"
grafana: "10Gi"
loki: "50Gi"


@@ -0,0 +1,13 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy monitoring stack (Prometheus, Grafana, Alertmanager) on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: storage
- role: cert-manager


@@ -0,0 +1,5 @@
---
# Role: namespaces
# Creates the Kubernetes namespaces
# The namespaces are defined in group_vars (variable: namespaces)
# No additional custom variable is required for this role.


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Create and manage Kubernetes namespaces for the platform
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: prerequisites


@@ -0,0 +1,14 @@
---
# Role: nodered
# Default values for Node-RED
services:
nodered:
replicas: 1
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Node-RED flow-based programming tool on Kubernetes (IoT namespace)
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: iot


@@ -0,0 +1,17 @@
---
# Role: odk
# Default values for ODK Central
services:
odk:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"
storage_sizes:
odk: "20Gi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy ODK (Open Data Kit) for mobile data collection on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases


@@ -0,0 +1,14 @@
---
# Role: phpipam
# Default values for phpIPAM
services:
phpipam:
replicas: 1
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"


@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy phpIPAM IP address management tool on Kubernetes (IoT namespace)
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: iot


@@ -0,0 +1,19 @@
---
# Role: prerequisites
# Default values for the prerequisites (Helm repositories)
helm_repos:
- name: stable
url: https://charts.helm.sh/stable
- name: bitnami
url: https://charts.bitnami.com/bitnami
- name: prometheus-community
url: https://prometheus-community.github.io/helm-charts
- name: grafana
url: https://grafana.github.io/helm-charts
- name: traefik
url: https://traefik.github.io/charts
- name: strimzi
url: https://strimzi.io/charts/
- name: jetstack
url: https://charts.jetstack.io


@@ -0,0 +1,11 @@
---
galaxy_info:
author: Eric FELIXINE
description: Prerequisites - Install base tools and dependencies for the Kubernetes platform
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies: []


@@ -0,0 +1,5 @@
---
# Role: smartapp
# Deploys the Smart City application
# The variables are defined directly in the tasks (smartapp_namespace, smartapp_domain).
# No additional custom variable is required for this role.

@@ -0,0 +1,13 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy SmartApp intelligent application platform on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases
- role: cert-manager

@@ -0,0 +1,253 @@
---
# Role: smartapp
# Template: smartapp-api.yml.j2
# Deployment of the SmartApp backend API
# Variables:
# smartapp_namespace - Kubernetes namespace (default: smartapp)
# smartapp_domain - Public domain (default: api-smartapp.digitribe.fr)
---
apiVersion: v1
kind: Namespace
metadata:
name: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: api
version: "1.0"
---
apiVersion: v1
kind: ConfigMap
metadata:
name: smartapp-api-config
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: api
version: "1.0"
data:
APP_ENV: "production"
APP_PORT: "8080"
LOG_LEVEL: "info"
CORS_ORIGINS: "https://smartapp.digitribe.fr"
DATABASE_POOL_SIZE: "10"
REDIS_POOL_SIZE: "5"
---
apiVersion: v1
kind: Secret
metadata:
name: smartapp-api-secrets
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: api
version: "1.0"
type: Opaque
stringData:
DATABASE_URL: "postgresql://smartapp:{{ smartapp_db_password | default('changeme') }}@postgres.smartapp.svc.cluster.local:5432/smartapp"
REDIS_URL: "redis://redis.smartapp.svc.cluster.local:6379/0"
JWT_SECRET: "{{ smartapp_jwt_secret | default('change-this-secret-in-production') }}"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: smartapp-api
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: api
version: "1.0"
spec:
replicas: 2
selector:
matchLabels:
app: smartapp
component: api
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
template:
metadata:
labels:
app: smartapp
component: api
version: "1.0"
spec:
containers:
- name: api
image: digitribe/smartapp-api:{{ smartapp_api_version | default('latest') }}
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 8080
protocol: TCP
envFrom:
- configMapRef:
name: smartapp-api-config
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: smartapp-api-secrets
key: DATABASE_URL
- name: REDIS_URL
valueFrom:
secretKeyRef:
name: smartapp-api-secrets
key: REDIS_URL
- name: JWT_SECRET
valueFrom:
secretKeyRef:
name: smartapp-api-secrets
key: JWT_SECRET
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "1Gi"
livenessProbe:
httpGet:
path: /api/v1/health/live
port: http
initialDelaySeconds: 15
periodSeconds: 20
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /api/v1/health/ready
port: http
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 3
failureThreshold: 3
startupProbe:
httpGet:
path: /api/v1/health/live
port: http
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 12
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- smartapp
- key: component
operator: In
values:
- api
topologyKey: kubernetes.io/hostname
---
apiVersion: v1
kind: Service
metadata:
name: smartapp-api
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: api
version: "1.0"
spec:
type: ClusterIP
selector:
app: smartapp
component: api
ports:
- name: http
port: 8080
targetPort: 8080
protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: smartapp-api
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: api
version: "1.0"
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "10m"
# Limit each client IP to 100 requests per minute
nginx.ingress.kubernetes.io/limit-rpm: "100"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
ingressClassName: nginx
tls:
- hosts:
- {{ smartapp_domain | default('api-smartapp.digitribe.fr') }}
secretName: smartapp-api-tls
rules:
- host: {{ smartapp_domain | default('api-smartapp.digitribe.fr') }}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: smartapp-api
port:
number: 8080
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: smartapp-api
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: api
version: "1.0"
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: smartapp-api
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 25
periodSeconds: 120
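To make the autoscaler settings above easier to reason about, here is a minimal sketch of the replica calculation documented for the Kubernetes HPA: `ceil(currentReplicas * currentMetric / targetMetric)`, taking the largest result across metrics and clamping to `minReplicas`/`maxReplicas`. It deliberately ignores the controller's tolerance band and the `behavior` policies (the 50% scale-up and 25% scale-down limits), so it shows the target the HPA aims for, not how fast it gets there. The function names are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=2, max_replicas=10):
    """Replica count for one metric, clamped to the HPA bounds from the manifest."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


def desired_for_metrics(current_replicas, metrics, min_replicas=2, max_replicas=10):
    """With several metrics, the HPA uses the highest proposed replica count.

    `metrics` is a list of (current_utilization, target_utilization) pairs.
    """
    return max(
        desired_replicas(current_replicas, current, target, min_replicas, max_replicas)
        for current, target in metrics
    )


if __name__ == "__main__":
    # CPU at 105% of requests (target 70), memory at 80% (target 80), 2 pods running.
    print(desired_for_metrics(2, [(105, 70), (80, 80)]))
```

With the targets in the manifest (CPU 70%, memory 80%), CPU is the metric that drives scale-up in this example; memory stays at its current count.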

@@ -0,0 +1,229 @@
---
# Role: smartapp
# Template: smartapp-web.yml.j2
# Deployment of the SmartApp web frontend (nginx)
# Variables:
# smartapp_namespace - Kubernetes namespace (default: smartapp)
# smartapp_domain - Public domain (default: smartapp.digitribe.fr)
---
apiVersion: v1
kind: Namespace
metadata:
name: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: web
version: "1.0"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: smartapp-web
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: web
version: "1.0"
spec:
replicas: 2
selector:
matchLabels:
app: smartapp
component: web
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
template:
metadata:
labels:
app: smartapp
component: web
version: "1.0"
spec:
containers:
- name: nginx
image: nginx:1.25-alpine
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 80
protocol: TCP
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
livenessProbe:
httpGet:
path: /healthz
port: http
initialDelaySeconds: 10
periodSeconds: 15
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /healthz
port: http
initialDelaySeconds: 5
periodSeconds: 10
timeoutSeconds: 3
failureThreshold: 3
volumeMounts:
- name: nginx-config
mountPath: /etc/nginx/conf.d
readOnly: true
- name: static-content
mountPath: /usr/share/nginx/html
readOnly: true
volumes:
- name: nginx-config
configMap:
name: smartapp-web-nginx-config
- name: static-content
configMap:
name: smartapp-web-static
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- smartapp
- key: component
operator: In
values:
- web
topologyKey: kubernetes.io/hostname
---
apiVersion: v1
kind: ConfigMap
metadata:
name: smartapp-web-nginx-config
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: web
version: "1.0"
data:
default.conf: |
server {
listen 80;
server_name {{ smartapp_domain | default('smartapp.digitribe.fr') }};
root /usr/share/nginx/html;
index index.html;
# Health check endpoint
location /healthz {
access_log off;
return 200 "healthy\n";
add_header Content-Type text/plain;
}
# Static assets with caching
location /static/ {
expires 30d;
add_header Cache-Control "public, immutable";
}
# SPA fallback
location / {
try_files $uri $uri/ /index.html;
}
# Security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: smartapp-web-static
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: web
version: "1.0"
data:
index.html: |
<!DOCTYPE html>
<html lang="fr">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>SmartApp - DigiTribe</title>
</head>
<body>
<h1>SmartApp - DigiTribe</h1>
<p>Frontend web opérationnel.</p>
</body>
</html>
---
apiVersion: v1
kind: Service
metadata:
name: smartapp-web
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: web
version: "1.0"
spec:
type: ClusterIP
selector:
app: smartapp
component: web
ports:
- name: http
port: 80
targetPort: 80
protocol: TCP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: smartapp-web
namespace: {{ smartapp_namespace | default('smartapp') }}
labels:
app: smartapp
component: web
version: "1.0"
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "10m"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
ingressClassName: nginx
tls:
- hosts:
- {{ smartapp_domain | default('smartapp.digitribe.fr') }}
secretName: smartapp-web-tls
rules:
- host: {{ smartapp_domain | default('smartapp.digitribe.fr') }}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: smartapp-web
port:
number: 80

@@ -0,0 +1,17 @@
---
# Role: starrocks
# Default values for StarRocks
services:
starrocks:
replicas: 1
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"
storage_sizes:
starrocks: "100Gi"

@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy StarRocks unified analytics warehouse on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases

@@ -0,0 +1,10 @@
---
# Role: storage
# Default values for NFS storage
# Default storage class
storage_class: "nfs-client"
# NFS server
nfs_server: "10.0.0.1"
nfs_path: "/srv/nfs/k8s"

@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Provision and manage persistent storage (PVs, PVCs, StorageClasses) on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: namespaces

@@ -0,0 +1,14 @@
---
# Role: streamlit
# Default values for Streamlit
services:
streamlit:
replicas: 1
resources:
requests:
cpu: "250m"
memory: "512Mi"
limits:
cpu: "1000m"
memory: "2Gi"

@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Streamlit data application framework on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases

@@ -0,0 +1,5 @@
---
# Role: traefik
# Default values for Traefik
traefik_namespace: "traefik"

@@ -0,0 +1,13 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy and configure Traefik as the ingress controller for Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: storage
- role: namespaces

@@ -0,0 +1,14 @@
---
# Role: trino
# Default values for Trino
services:
trino:
replicas: 2
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"

@@ -0,0 +1,12 @@
---
galaxy_info:
author: Eric FELIXINE
description: Deploy Trino distributed SQL query engine on Kubernetes
license: MIT
min_ansible_version: "2.15"
platforms:
- name: Kubernetes
versions:
- "1.28"
dependencies:
- role: databases
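The `dependencies` entries in these `meta/main.yml` files form a DAG. As a rough sketch of how a valid execution order can be derived from them, the snippet below feeds the dependencies visible in this part of the diff to Python's standard `graphlib.TopologicalSorter`. Roles such as `databases`, `iot`, `cert-manager`, and `namespaces` have their own `meta/main.yml` elsewhere; here they are assumed to have no further dependencies, which is a simplification.

```python
from graphlib import TopologicalSorter

# role -> roles it depends on, as declared in the meta/main.yml files shown above.
# Roles that only appear on the right-hand side are treated as dependency-free (assumption).
ROLE_DEPENDENCIES = {
    "odk": ["databases"],
    "phpipam": ["iot"],
    "prerequisites": [],
    "smartapp": ["databases", "cert-manager"],
    "starrocks": ["databases"],
    "storage": ["namespaces"],
    "streamlit": ["databases"],
    "traefik": ["storage", "namespaces"],
    "trino": ["databases"],
}


def install_order(dependencies):
    """Return one valid order in which every role comes after its dependencies.

    Raises graphlib.CycleError if the declarations contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(" -> ".join(install_order(ROLE_DEPENDENCIES)))
```

Several orders are valid for the same graph, so the printed sequence is one possibility rather than the order Ansible will necessarily use.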

@@ -0,0 +1,73 @@
# Infrastructure Snapshot - 2026-06-04
## Docker container status
### Containers UP (10)
| Name | Image | Status | Ports |
|------|-------|--------|-------|
| airflow-scheduler | apache/airflow:2.9.3-python3.11 | ✅ healthy | 8080/tcp |
| airflow-webserver | apache/airflow:2.9.3-python3.11 | ✅ healthy | 8080/tcp |
| airflow-init | apache/airflow:2.9.3-python3.11 | 🔄 restarting | 8080/tcp |
| airflow-postgres | postgres:16 | ✅ healthy | 5432/tcp |
| smartapp-api | smartapp-api:latest | ✅ Up 38h | 3001/tcp |
| smartapp-web | nginx:alpine | ✅ Up 38h | 80/tcp |
| gitea-runner | gitea/act_runner:latest | ✅ Up 2 days | - |
| traefik | traefik:v3.1 | ✅ Up 2 days | 80, 443, 8404 |
| smart-city-kepler | smart-city-kepler:latest | ✅ Up 2 weeks | 80, 8080 |
| gitea | gitea/gitea:latest | ✅ Up 2 days | 22, 3000 |
### Current Traefik routes
| Domain | Service | Type |
|---------|---------|------|
| airflow.digitribe.fr | airflow-webserver:8080 | HTTP |
| smartapp.digitribe.fr | smartapp-web:80 | HTTP |
| api-smartapp.digitribe.fr | smartapp-api:3001 | HTTP |
| gitea.digitribe.fr | gitea:3000 | HTTP |
| jupyter.digitribe.fr | jupyterhub:8000 | HTTP |
## Docker Networks
| Network | Driver | Scope |
|---------|--------|-------|
| traefik-public | bridge | local |
| smartcity-shared | bridge | local |
| airflow_default | bridge | local |
| smartapp_default | bridge | local |
## Docker Volumes
| Volume | Driver | Size |
|--------|--------|------|
| airflow-postgres | local | ~500MB |
| smartapp-data | local | ~100MB |
| gitea-data | local | ~1GB |
| traefik-data | local | ~50MB |
## Configuration Traefik
- **Image**: traefik:v3.1
- **Ports**: 80 (HTTP), 443 (HTTPS), 8404 (API)
- **Certificates**: Let's Encrypt (staging)
- **Providers**: Docker, Kubernetes ingress
- **Middlewares**: security-headers, redirect-to-https, rate-limit
## Configuration files
- **docker-compose**: `/home/eric/backups/2026-06-04/docker-compose/`
- **traefik-config**: `/home/eric/traefik-config/`
- **ansible/helm**: `/home/eric/smart-city-digital-twin-martinique/helms/`
## Backups
- **2026-06-03**: `/home/eric/backups/2026-06-03/` (config files)
- **2026-06-04**: `/home/eric/backups/2026-06-04/` (docker-compose files)
## Git Commits
```
fb62291b feat: add helm/ansible deployment files for Kubernetes
8c2251fa TODO: mise a jour 2026-06-04 - cleanup massif, helms ansible generés
b5674918 chore: update TODO — Honcho API deployed, Gitea Actions configured
```

@@ -3,6 +3,15 @@
Smart City Digital Twin Martinique - Monitoring Script
Hybrid mode: Periodic checks + webhook-ready output
Alerts via Telegram when issues detected
Current stack (as of 2026-06-05):
- Analytics: Trino, StarRocks FE/BE, ClickHouse, Delta Lake, DuckDB, Streamlit
- FlexMeasures: Server, Worker, DB, Redis
- Airflow: Scheduler, Webserver, Postgres
- SmartApp: Web, API
- Gitea: Server, Runner
- Traefik: Reverse proxy
- Kepler: Geospatial visualization
"""
import subprocess
@@ -10,23 +19,52 @@ import json
import sys
from datetime import datetime
# Configuration
# Configuration - CURRENT RUNNING STACK
CRITICAL_CONTAINERS = [
"openremote-manager", "openremote-keycloak", "smart-city-simulator",
"emqx_emqx_1", "mainfluxlabs-broker", "stellio-api-gateway",
"smart-city-influxdb", "smart-city-grafana", "traefik",
"smart-city-prometheus-brokers"
# Analytics stack
"trino", "starrocks-fe", "starrocks-be", "clickhouse",
"delta-lake", "duckdb", "streamlit", "trino-nginx",
# FlexMeasures stack
"flexmeasures-server", "flexmeasures-worker", "flexmeasures-db", "flexmeasures-redis",
# Airflow stack
"airflow-scheduler", "airflow-webserver", "airflow-postgres",
# SmartApp
"smartapp-web", "smartapp-api",
# Gitea
"gitea", "gitea-runner",
# Infrastructure
"traefik",
"smart-city-kepler",
]
ENDPOINTS = [
("OpenRemote", "https://openremote.digitribe.fr"),
("Grafana", "https://grafana.digitribe.fr"),
("Orion-LD", "http://fiware-gis-quickstart-orion-1:1026/version"),
("Stellio", "https://stellio.digitribe.fr"),
("FROST", "http://frost_http-web-1:8080/FROST-Server/core/v1.0/info")
# SmartApp
("SmartApp Web", "https://smartapp.digitribe.fr"),
("SmartApp API", "https://api-smartapp.digitribe.fr/health"),
# Analytics
("Trino", "https://trino.digitribe.fr"),
("Streamlit", "https://streamlit.digitribe.fr"),
("ClickHouse", "https://clickhouse.digitribe.fr"),
("StarRocks", "https://starrocks.digitribe.fr"),
("DuckDB", "https://duckdb.digitribe.fr"),
("Delta Lake", "https://deltalake.digitribe.fr"),
# FlexMeasures
("FlexMeasures", "https://flexmeasures.digitribe.fr"),
# Airflow
("Airflow", "https://airflow.digitribe.fr"),
# Gitea
("Gitea", "https://gitea.digitribe.fr"),
# Kepler
("Kepler", "https://kepler.digitribe.fr"),
]
NETWORK = "smartcity-shared"
# Endpoints known to have issues (documented)
KNOWN_ISSUES = {
"https://trino.digitribe.fr": "200/302 - Trino UI accessible at /ui/ (redirects to login)",
"https://kepler.digitribe.fr": "404 - no Traefik route configured for Kepler",
"https://starrocks.digitribe.fr": "502 - StarRocks FE HTTP port 8030 not ready (FE still starting up)",
}
TELEGRAM_USER = "@ericf972" # Will be used by Hermes send_message
def run_cmd(cmd):
@@ -53,18 +91,32 @@ def check_endpoints():
for name, url in ENDPOINTS:
cmd = f"curl -k -s -o /dev/null -w '%{{http_code}}' --connect-timeout 5 {url}"
out, err, code = run_cmd(cmd)
if code != 0 or out not in ["200", "301", "302"]:
# Check if this is a known issue
if url in KNOWN_ISSUES:
issues.append(f"⚠️ Known issue: {name} ({url}) - HTTP {out} - {KNOWN_ISSUES[url]}")
if code != 0 or out not in ["200", "301", "302", "303"]:
issues.append(f"🌐 Endpoint DOWN: {name} ({url}) - HTTP {out}")
return issues
def check_network():
"""Check network connectivity between containers"""
issues = []
# Check if Traefik can reach OpenRemote
cmd = "docker exec traefik wget -q --spider http://openremote_manager_1:8080 2>&1"
out, err, code = run_cmd(cmd)
if code != 0:
issues.append(f"🔌 Network issue: Traefik → OpenRemote")
# Check if Traefik can reach key services
services = [
("trino", "trino:8080"),
("streamlit", "streamlit:8501"),
("clickhouse", "clickhouse:8123"),
("starrocks-fe", "starrocks-fe:8030"),
("flexmeasures-server", "flexmeasures-server:5000"),
("airflow-webserver", "airflow-webserver:8080"),
("smartapp-web", "smartapp-web:80"),
("gitea", "gitea:3000"),
]
for name, target in services:
cmd = f"docker exec traefik wget -q --spider http://{target} 2>&1"
out, err, code = run_cmd(cmd)
if code != 0:
issues.append(f"🔌 Network issue: Traefik → {name} ({target})")
return issues
def check_resources():
@@ -86,16 +138,16 @@ def main():
"""Main monitoring function"""
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
all_issues = []
print(f"🔍 Smart City Monitoring Check - {timestamp}")
print("=" * 50)
# Run all checks
all_issues.extend(check_containers())
all_issues.extend(check_endpoints())
all_issues.extend(check_network())
all_issues.extend(check_resources())
# Output results
if all_issues:
print(f"⚠️ ALERT: {len(all_issues)} issue(s) detected!")
@@ -108,4 +160,4 @@ def main():
sys.exit(0)
if __name__ == "__main__":
main()
main()
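Because the flattened diff hides indentation, it is hard to tell whether an endpoint listed in `KNOWN_ISSUES` can end up reported twice (once as a known issue, once as DOWN) or flagged even when it answers normally. One way to make the intended behaviour explicit is to classify each result with a small pure function. The sketch below is an assumption about that intent, not the script's actual code; `classify_endpoint` and `OK_CODES` are hypothetical names.

```python
# HTTP status codes treated as healthy, matching the list used in check_endpoints().
OK_CODES = {"200", "301", "302", "303"}


def classify_endpoint(name, url, exit_code, http_code, known_issues):
    """Return None for a healthy endpoint, otherwise exactly one message.

    A documented known issue is reported as a warning instead of DOWN,
    and only when the check actually failed.
    """
    if exit_code == 0 and http_code in OK_CODES:
        return None
    if url in known_issues:
        return f"⚠️ Known issue: {name} ({url}) - HTTP {http_code} - {known_issues[url]}"
    return f"🌐 Endpoint DOWN: {name} ({url}) - HTTP {http_code}"


if __name__ == "__main__":
    known = {"https://kepler.digitribe.fr": "404 - no Traefik route configured for Kepler"}
    print(classify_endpoint("Kepler", "https://kepler.digitribe.fr", 0, "404", known))
    print(classify_endpoint("Gitea", "https://gitea.digitribe.fr", 0, "200", known))
```

Keeping the decision separate from the `curl` call also makes it testable without network access.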

@@ -13,30 +13,18 @@
[[inputs.mqtt_consumer]]
servers = ["tcp://emqx_emqx_1:1883"]
topics = [
"airquality/#",
"traffic/#",
"parking/#",
"noise/#",
"weather/#",
"light/#",
"sensor/#",
"smartcity/#"
"city/sensors/#",
"json/#"
]
data_format = "json"
qos = 0
# Input: MQTT Consumer - Mosquitto
[[inputs.mqtt_consumer]]
servers = ["tcp://smart-city-digital-twin-martinique-mosquitto-1:1883"]
servers = ["tcp://smart-city-mosquitto-1:1883"]
topics = [
"airquality/#",
"traffic/#",
"parking/#",
"noise/#",
"weather/#",
"light/#",
"sensor/#",
"smartcity/#"
"city/sensors/#",
"json/#"
]
data_format = "json"
qos = 0
@@ -45,14 +33,8 @@
[[inputs.mqtt_consumer]]
servers = ["tcp://bunkerm-bunkerm-1:1900"]
topics = [
"airquality/#",
"traffic/#",
"parking/#",
"noise/#",
"weather/#",
"light/#",
"sensor/#",
"smartcity/#"
"city/sensors/#",
"json/#"
]
data_format = "json"
qos = 0
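After this change each consumer subscribes only to `city/sensors/#` and `json/#`, so messages still published on the old top-level topics (for example `airquality/...`) are no longer collected. The sketch below illustrates the MQTT topic-filter rules that decide this: `+` matches exactly one level, and `#` matches the remaining levels, including the parent level itself. The matching is actually done by the broker; `topic_matches` is a hypothetical helper for checking publisher topics offline, and it does not handle the special treatment of `$`-prefixed topics.

```python
def topic_matches(topic_filter, topic):
    """Return True if `topic` is covered by the MQTT subscription `topic_filter`."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level, matches everything below
            # (and the parent level itself, e.g. "city/sensors" for "city/sensors/#").
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # No "#" seen: the topic must have exactly as many levels as the filter.
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    subscriptions = ["city/sensors/#", "json/#"]
    for topic in ["city/sensors/air/pm25", "airquality/station1", "json/raw"]:
        consumed = any(topic_matches(f, topic) for f in subscriptions)
        print(f"{topic}: {'consumed' if consumed else 'ignored'}")
```

Running a publisher's topic list through such a check before deploying the new configuration is a cheap way to spot sensors that would silently drop out of the metrics.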