Files
zavrsni-rad-otel-app/README.md
Domagoj Andrić 5cbc1d50fd
All checks were successful
CI / test (push) Successful in 56s
CI / test-analytics (push) Successful in 2m11s
CI / build-api (push) Successful in 40s
CI / build-frontend (push) Successful in 2m0s
CI / build-analytics (push) Successful in 40s
Add more changes
2026-05-11 16:04:10 +02:00

391 lines
16 KiB
Markdown

# OTel BI Platform
OpenTelemetry-instrumented business intelligence platform backed by two MSSQL data warehouses (AdventureWorks DW, WideWorldImporters DW), a PostgreSQL write store, a Go analytics engine with a built-in cron scheduler, and an OIDC-secured React frontend.
---
## Architecture
```
Browser (React + TypeScript)
│ OIDC Authorization Code + PKCE
│ Bearer JWT on every API call
otel-bi-api :8000 Python FastAPI — JWT validation, report generation
├── GET /api/config OIDC config for the SPA (unauthenticated)
├── GET /api/health
├── GET /api/telemetry/status
├── GET /api/audit AuditLog rows from PostgreSQL
├── GET /api/exports ExportRecord rows from PostgreSQL
├── GET /api/jobs/aw JobExecution rows from PostgreSQL
├── GET /api/jobs/wwi
├── POST /api/reports/generate fan-out to Go → XLSX + PDF to shared mount
├── /api/aw/** proxied to otel-bi-analytics + PDF export
└── /api/wwi/** proxied to otel-bi-analytics + PDF export
otel-bi-analytics :8080 Go — both MSSQL sources, robfig/cron scheduler
├── GET /aw/sales/** AdventureWorks analytics
├── GET /aw/reps/**
├── GET /aw/products/**
├── GET /aw/anomalies
├── GET /aw/export/** XLSX generation (excelize)
├── GET /wwi/sales/** WideWorldImporters analytics
├── GET /wwi/stock/**
├── GET /wwi/suppliers/**
├── GET /wwi/export/** XLSX generation
└── POST /scheduler/** manual job triggers
PostgreSQL Shared write store
├── audit_log Append-only event trail
├── job_executions Scheduled job run history
├── export_records Per-view export metadata
├── aw_* AW persistence tables
└── wwi_* WWI persistence tables
Grafana Alloy (OTLP/HTTP) Traces + metrics → Tempo / Prometheus
```
---
## Features
| Feature | Description |
|---|---|
| **AW Analytics** | Sales KPIs, 4-year history, linear-regression forecast, rep scores, product demand, anomaly detection |
| **WWI Analytics** | Sales KPIs, reorder recommendations, supplier reliability scores, business events |
| **Scheduled Jobs** | 8 robfig/cron jobs in the Go service; per-job OTel root span; `job_executions` record |
| **Manual Triggers** | `POST /api/aw/jobs/{job}/trigger` and `/api/wwi/jobs/{job}/trigger` |
| **Audit Log** | Append-only `audit_log`; every analytics call, job, export, and report is recorded |
| **Data Export** | Per-view XLSX (Go/excelize) or PDF (Python/reportlab); `export_records` metadata |
| **Full Reports** | `POST /api/reports/generate` aggregates all views; writes `.xlsx` + `.pdf` to a shared mount |
| **Runtime OIDC** | Frontend fetches OIDC config from `GET /api/config` at boot — nothing baked into the image |
| **OTel** | W3C trace propagation end-to-end; auto-instrumentation on HTTP, SQL, and HTTPX layers |
---
## Repository Layout
```
.
├── .gitea/workflows/
│ └── ci.yml Gitea Actions: test + build 3 images on Rocky Linux 10
├── .env.example Docker Compose top-level vars
├── backend/
│ ├── Dockerfile otel-bi-api (Python FastAPI)
│ ├── Dockerfile.analytics otel-bi-analytics (Go)
│ ├── pyproject.toml
│ ├── uv.lock
│ ├── .env.example Python API local dev
│ ├── app/
│ │ ├── core/
│ │ │ ├── audit.py AuditLog, JobExecution, ExportRecord ORM + helpers
│ │ │ ├── config.py Pydantic settings
│ │ │ ├── db.py PostgreSQL engine + session factory
│ │ │ ├── export.py to_pdf_bytes() via reportlab
│ │ │ ├── executor.py ThreadPoolExecutor
│ │ │ ├── otel.py OTel setup
│ │ │ ├── reports.py save_report() → XLSX + PDF
│ │ │ └── security.py JWT validation
│ │ ├── domain/
│ │ │ ├── aw/ AW ORM models + persistence helpers
│ │ │ └── wwi/ WWI ORM models + persistence helpers
│ │ ├── routers/
│ │ │ ├── aw.py /api/aw/** — proxy to Go + PDF exports
│ │ │ ├── wwi.py /api/wwi/** — proxy to Go + PDF exports
│ │ │ └── platform.py /api/config, /api/health, /api/audit, /api/reports
│ │ └── main.py
│ └── analytics/ Go analytics service
│ ├── go.mod
│ ├── .env.example Go analytics local dev
│ └── cmd/server/main.go
│ └── internal/
│ ├── analytics/ AW + WWI query logic
│ ├── config/ env var loading
│ ├── db/ MSSQL (database/sql) + PostgreSQL (pgxpool)
│ ├── export/ XLSX generation (excelize)
│ ├── handler/ HTTP handlers + route registration
│ ├── persistence/ PostgreSQL writes + audit appends
│ └── scheduler/ robfig/cron job definitions + OTel metrics
└── frontend/
├── Dockerfile
├── nginx.conf
├── .env.example build-time vars only
└── src/
├── api/ API client, types, runtime config fetch
├── auth/ OIDC (oidc-client-ts)
└── pages/ Dashboard pages
```
---
## Prerequisites
| Tool | Min Version | Purpose |
|---|---|---|
| Docker + Docker Compose | 24+ | Run the full stack |
| Go | 1.23+ | Analytics service local dev |
| Python | 3.12+ | API local dev |
| uv | latest | Python package manager |
| Node.js | 22+ | Frontend local dev |
| SQL Server | 2019+ | AdventureWorks DW + WideWorldImporters DW |
| PostgreSQL | 15+ | Write store |
---
## Quick Start (Docker Compose)
```bash
git clone ssh://git@git.andric.com.hr:2222/domagoj/zavrsni-rad-otel-app.git
cd zavrsni-rad-otel-app
cp .env.example .env
$EDITOR .env # fill in MSSQL DSNs and PostgreSQL password
docker compose up -d
```
| Service | URL |
|---|---|
| Frontend | http://localhost:8080 |
| API | http://localhost:8000 |
| Grafana | http://localhost:3000 |
---
## Local Development
### Go analytics service
```bash
cd backend/analytics
cp .env.example .env
# edit AW_MSSQL_DSN, WWI_MSSQL_DSN, POSTGRES_DSN
set -a && source .env && set +a
go run ./cmd/server
```
### Python API
```bash
cd backend
cp .env.example .env
# edit POSTGRES_* and ANALYTICS_SERVICE_URL
uv sync
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```
### Frontend
```bash
cd frontend
cp .env.example .env.local
# edit VITE_API_BASE_URL if the API is not on port 8000
npm install
npm run dev
```
---
## Configuration Reference
### Go analytics service (`otel-bi-analytics`)
| Variable | Required | Default | Description |
|---|---|---|---|
| `AW_MSSQL_DSN` | Yes | — | AdventureWorks DSN (go-mssqldb format) |
| `WWI_MSSQL_DSN` | Yes | — | WideWorldImporters DSN |
| `POSTGRES_DSN` | Yes | — | PostgreSQL DSN (pgx format) |
| `PORT` | No | `8080` | HTTP listen port |
| `OTEL_COLLECTOR_ENDPOINT` | No | `http://localhost:4318` | OTLP/HTTP endpoint |
| `OTEL_SERVICE_NAME` | No | `otel-bi-analytics` | |
| `OTEL_SERVICE_NAMESPACE` | No | `final-thesis` | |
| `DEFAULT_TOP_N` | No | `10` | Default ranking list length |
| `FORECAST_HORIZON_DAYS` | No | `30` | Default forecast horizon |
| `DEFAULT_HISTORY_DAYS` | No | `365` | Default sales history look-back |
### Python API (`otel-bi-api`)
| Variable | Required | Default | Description |
|---|---|---|---|
| `ANALYTICS_SERVICE_URL` | No | `http://localhost:8080` | Go analytics base URL |
| `POSTGRES_HOST` | No | `localhost` | |
| `POSTGRES_PORT` | No | `5432` | |
| `POSTGRES_DATABASE` | No | `otel_bi` | |
| `POSTGRES_USERNAME` | No | `otel_bi` | |
| `POSTGRES_PASSWORD` | No | `otel_bi` | |
| `POSTGRES_SSLMODE` | No | `prefer` | Use `require` in production |
| `POSTGRES_CONNECTION_STRING` | No | — | Full DSN override |
| `CORS_ORIGINS` | No | `http://localhost:5173` | Comma-separated allowed origins |
| `REQUIRE_FRONTEND_AUTH` | No | `true` | Enforce JWT on all endpoints |
| `FRONTEND_JWT_ISSUER_URL` | If auth | — | OIDC issuer URL |
| `FRONTEND_JWT_AUDIENCE` | If auth | — | Expected `aud` claim |
| `FRONTEND_JWT_JWKS_URL` | No | — | JWKS URL (derived from issuer if omitted) |
| `FRONTEND_JWT_ALGORITHM` | No | `RS256` | |
| `FRONTEND_REQUIRED_SCOPES` | No | `""` | Space-separated required scopes |
| `FRONTEND_OIDC_CLIENT_ID` | No | `""` | Served to SPA via `GET /api/config` |
| `FRONTEND_OIDC_SCOPE` | No | `openid profile email` | Served to SPA via `GET /api/config` |
| `REPORT_OUTPUT_DIR` | No | `/tmp/otel-bi-reports` | Mount a PVC here in Kubernetes |
| `OTEL_SERVICE_NAME` | No | `otel-bi-api` | |
| `OTEL_SERVICE_NAMESPACE` | No | `final-thesis` | |
| `OTEL_COLLECTOR_ENDPOINT` | No | `http://localhost:4318` | |
| `APP_ENV` | No | `dev` | Set `prod` to disable `/docs` |
| `LOG_LEVEL` | No | `INFO` | |
### Frontend (build-time only)
| Variable | Default | Description |
|---|---|---|
| `VITE_API_BASE_URL` | `http://localhost:8000` | API base URL seen by the browser |
| `VITE_OTEL_COLLECTOR_ENDPOINT` | `http://localhost:4318` | OTLP/HTTP for frontend traces |
| `VITE_OTEL_SERVICE_NAME` | `otel-bi-frontend` | |
| `VITE_OTEL_SERVICE_NAMESPACE` | `final-thesis` | |
OIDC config is **not** a build-time variable — it is fetched at runtime from `GET /api/config`.
---
## Scheduled Jobs
All jobs run inside the Go analytics service via robfig/cron. Each job emits an OTel root span and writes a `job_executions` row to PostgreSQL.
| Job ID | Schedule (UTC) | Description |
|---|---|---|
| `aw.daily.forecast` | 02:00 daily | Sales forecast via linear regression |
| `aw.daily.scores` | 02:30 daily | Rep performance + product demand scores |
| `aw.daily.data_quality` | 03:00 daily | AW data quality checks |
| `aw.daily.anomaly_detection` | 03:30 daily | Revenue anomaly detection |
| `wwi.hourly.reorder` | :00 every hour | Reorder recommendations + stock events |
| `wwi.daily.supplier_scores` | 03:30 daily | Supplier reliability scores |
| `wwi.hourly.events` | :30 every hour | HIGH-urgency stock-level event scan |
| `wwi.daily.data_quality` | 04:00 daily | WWI data quality checks |
**Manual trigger** (authenticated):
```bash
# AW jobs: forecast | scores | data_quality | anomaly_detection
curl -X POST http://localhost:8000/api/aw/jobs/forecast/trigger \
-H "Authorization: Bearer $TOKEN"
# WWI jobs: reorder | supplier_scores | events | data_quality
curl -X POST http://localhost:8000/api/wwi/jobs/reorder/trigger \
-H "Authorization: Bearer $TOKEN"
```
---
## API Reference
All endpoints are on `otel-bi-api` (port 8000). Endpoints marked *proxy* forward the request to `otel-bi-analytics` unchanged.
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/api/config` | No | OIDC config for the SPA |
| GET | `/api/health` | No | Health check |
| GET | `/api/telemetry/status` | Yes | OTel instrumentation info |
| GET | `/api/audit` | Yes | Audit log (`?domain=aw\|wwi&limit=`) |
| GET | `/api/exports` | Yes | Export history (`?domain=aw\|wwi&limit=`) |
| GET | `/api/jobs/aw` | Yes | AW job history |
| GET | `/api/jobs/wwi` | Yes | WWI job history |
| POST | `/api/reports/generate` | Yes | Full XLSX + PDF report |
| GET | `/api/aw/sales/kpis` | Yes | AW KPIs (proxy) |
| GET | `/api/aw/sales/history` | Yes | AW sales history — `?days_back=` (proxy) |
| GET | `/api/aw/sales/forecast` | Yes | AW forecast — `?horizon_days=` (proxy) |
| GET | `/api/aw/reps/scores` | Yes | AW rep scores — `?top_n=` (proxy) |
| GET | `/api/aw/products/demand` | Yes | AW product demand — `?top_n=` (proxy) |
| GET | `/api/aw/export/sales-history` | Yes | `?format=xlsx\|pdf` |
| GET | `/api/aw/export/sales-forecast` | Yes | `?format=xlsx\|pdf` |
| GET | `/api/aw/export/rep-scores` | Yes | `?format=xlsx\|pdf` |
| GET | `/api/aw/export/product-demand` | Yes | `?format=xlsx\|pdf` |
| POST | `/api/aw/jobs/{job}/trigger` | Yes | Trigger AW job immediately |
| GET | `/api/wwi/sales/kpis` | Yes | WWI KPIs (proxy) |
| GET | `/api/wwi/stock/recommendations` | Yes | Reorder recommendations (proxy) |
| GET | `/api/wwi/suppliers/scores` | Yes | Supplier scores — `?top_n=` (proxy) |
| GET | `/api/wwi/export/stock-recommendations` | Yes | `?format=xlsx\|pdf` |
| GET | `/api/wwi/export/supplier-scores` | Yes | `?format=xlsx\|pdf` |
| GET | `/api/wwi/export/business-events` | Yes | `?format=xlsx\|pdf` |
| POST | `/api/wwi/jobs/{job}/trigger` | Yes | Trigger WWI job immediately |
---
## PostgreSQL Schema
Tables are created automatically on API startup via SQLAlchemy `metadata.create_all()`.
| Table | Purpose |
|---|---|
| `audit_log` | Append-only event trail |
| `job_executions` | One row per scheduled job run |
| `export_records` | Per-view download metadata |
| `aw_forecasts` | Persisted AW forecast points |
| `aw_rep_scores` | AW rep score snapshots |
| `aw_product_demand` | AW product demand snapshots |
| `aw_anomaly_runs` | AW anomaly detection results |
| `wwi_reorder_recommendations` | WWI reorder snapshots |
| `wwi_supplier_scores` | WWI supplier score snapshots |
| `wwi_business_events` | Stock-level business events |
---
## CI/CD
`.gitea/workflows/ci.yml` — three independent pipelines on push to `master` or a version tag. Test jobs run inside `rockylinux/rockylinux:10` containers using shell-based checkout.
| Pipeline | Test job | Build job | Image |
|---|---|---|---|
| Python API | `test` (uv + pytest) | `build-api` | `domagoj/otel-bi-api` |
| Go analytics | `test-analytics` (go vet + go test) | `build-analytics` | `domagoj/otel-bi-analytics` |
| Frontend | `test` | `build-frontend` | `domagoj/otel-bi-frontend` |
**Required repository secrets:**
| Secret | Description |
|---|---|
| `REGISTRY_HOST` | Container registry hostname |
| `REGISTRY_USERNAME` | Registry login |
| `REGISTRY_TOKEN` | Registry PAT (packages read + write) |
---
## OTel Coverage
**Frontend:**
- W3C `traceparent` / `tracestate` propagation on all fetch calls
- `@opentelemetry/instrumentation-document-load`
- `@opentelemetry/instrumentation-fetch`
- Manual spans on dashboard data aggregation
**Go analytics service:**
- `otelhttp` HTTP server auto-instrumentation
- Manual spans on all analytics and export functions
- Per-job root span (`trace.WithNewRoot()`) for independent trace trees
- OTel metrics: `scheduler.job.duration_seconds`, `scheduler.job.success_total`, `scheduler.job.failure_total`, `scheduler.job.records_processed_total`
**Python API:**
- FastAPI auto-instrumentation (request span per endpoint)
- SQLAlchemy auto-instrumentation (PostgreSQL)
- HTTPX auto-instrumentation (calls to Go analytics)
- `x-trace-id` / `x-span-id` response headers
- W3C context propagation injected into all Go analytics calls
---
## Read-Only Guarantee (MSSQL)
- `ApplicationIntent=ReadOnly` on all MSSQL DSNs
- SQL layer only executes `SELECT` / `WITH` statements
- All writes target PostgreSQL only
- Use a SQL Server login with `db_datareader` only — no DDL or DML permissions needed
---
## Verification Checklist
1. `GET /api/health``{"status": "ok", "service": "otel-bi-backend"}`
2. `GET /api/config` → OIDC config without a token
3. After login, `GET /api/aw/sales/kpis` → data rows
4. `GET /api/audit` → rows for the KPI call
5. `GET /api/jobs/aw` → job run records (after cron tick or manual trigger)
6. `GET /api/aw/export/sales-forecast?format=xlsx` → downloads `.xlsx`
7. `POST /api/reports/generate` → paths to `.xlsx` and `.pdf` in `REPORT_OUTPUT_DIR`
8. Grafana Tempo → trace spanning `otel-bi-api → otel-bi-analytics → MSSQL / PostgreSQL`