DevOps Guard - DevLogs

Revisiting .NET and C#

Intro

DevOpsGuard is envisioned as a web API service that tracks work items (tickets), computes simple operational KPIs (backlog health, SLA breach rate, risk), ingests ops events (e.g. build failed), and shows everything in a lightweight dashboard.

I realized that my .NET and C# skills were developed during the second year of my degree, but haven’t really been used all that much since. Over the past four days, I took it upon myself to carefully build an application from the ground up and log everything. This will help me revisit this part of my skillset, rebuild it, and come out with a new project under my belt.

Check it out on GitHub

Screenshot of the DevOps Guard dashboard

Goals

I had several constraints/goals I set out for this project:


Environment & Tooling

Sanity Checks

These are some super simple validations I ran to make sure everything was working before I kicked things off.


Project scaffolding (top level)

I created a multi-project solution consisting of domain, infra, API. Here’s what the root surface looks like:

src/
  DevOpsGuard.Domain/        // entities, enums
  DevOpsGuard.Infrastructure/ // EF Core DbContext, configurations, repositories, migrations
  DevOpsGuard.Api/            // minimal API, endpoints, filters, wwwroot dashboard

Key design decisions:


Domain Model

Everything is centered on two entities: WorkItem and MetricsSnapshot.

Here’s a simplified Entities block:

// Work item + lifecycle
public enum Priority { Low, Medium, High, P0 }
public enum WorkItemStatus { Open, InProgress, Blocked, Resolved }

public sealed class WorkItem {
  public Guid Id { get; }
  public string Title { get; private set; }
  public string Service { get; private set; }
  public Priority Priority { get; private set; }
  public DateOnly? DueDate { get; private set; }
  public WorkItemStatus Status { get; private set; } = WorkItemStatus.Open;
  public string? Component { get; private set; }
  public string? Assignee { get; private set; }
  public List<string> Labels { get; } = new();

  public DateTime CreatedAtUtc { get; private set; }
  public DateTime UpdatedAtUtc { get; private set; }
  // domain methods: Rename, ChangePriority, SetStatus, etc.
}

public sealed class MetricsSnapshot {
  public Guid Id { get; init; }
  public DateTime CapturedAtUtc { get; init; }
  public double BacklogHealthPct { get; init; }
  public double SlaBreachRatePct { get; init; }
  public int OverdueCount { get; init; }
  public double RiskAvg { get; init; }
}

I’ve also defined DTOs for create/update/list/response shapes.


Infrastructure: EF Core & configuration

DbContext + mapping

Along the way, I ran into a few compiler/EF expression tree issues:

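Fix: switch to expression lambdas (a statement-bodied lambda cannot be converted to an expression tree) and use an explicit null check instead of ?., since expression trees also reject the null-propagating operator:

v => v == null ? 0 : v.Aggregate(0, (h, s) => HashCode.Combine(h, s == null ? 0 : s.GetHashCode()))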
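The hash expression can be tried outside EF Core to confirm that it really converts to an expression tree. The sketch below assumes only the base class library (System.Linq.Expressions). The variable names are mine, and the null check on each string is written as a conditional because the compiler rejects ?. inside expression trees (CS8072).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// An expression lambda (a single expression, no braces) converts to an expression tree.
// A statement lambda such as `v => { return 0; }` would fail here with CS0834.
Expression<Func<List<string>?, int>> hashExpr =
    v => v == null
        ? 0
        : v.Aggregate(0, (h, s) => HashCode.Combine(h, s == null ? 0 : s.GetHashCode()));

// Compile the tree into a delegate so it can be invoked directly.
Func<List<string>?, int> hash = hashExpr.Compile();

Console.WriteLine(hash(null)); // 0

// Lists with the same labels in the same order hash identically within one process.
var a = new List<string> { "ops", "urgent" };
var b = new List<string> { "ops", "urgent" };
Console.WriteLine(hash(a) == hash(b)); // True
```

The same expression shape is what a value comparer for a list-typed property would take as its hash-code argument. The actual comparer wiring in DevOpsGuard is not shown here.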

Migrations & schema


API: minimal endpoints & typed results

I implemented endpoints with typed results (Results<Created<WorkItemResponse>, BadRequest<string>>) and FluentValidation filters.
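As a rough illustration of the typed-results shape named above, here is a minimal handler sketch. The DTO fields, the POST /workitems route, and the inline checks are assumptions for the example. The real project uses FluentValidation filters, which are not reproduced here.

```csharp
using System;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Http.HttpResults;

// Hypothetical DTO shapes; the project's real request/response records may differ.
public sealed record CreateWorkItemRequest(string? Title, string? Service);
public sealed record WorkItemResponse(Guid Id, string Title, string Service);

public static class WorkItemEndpoints
{
    // The return type lists every possible outcome, so the compiler rejects
    // any result not declared here and OpenAPI can describe both responses.
    public static Results<Created<WorkItemResponse>, BadRequest<string>> Create(
        CreateWorkItemRequest request)
    {
        // Inline checks stand in for the validation filter in this sketch.
        if (string.IsNullOrWhiteSpace(request.Title))
            return TypedResults.BadRequest("Title is required.");
        if (string.IsNullOrWhiteSpace(request.Service))
            return TypedResults.BadRequest("Service is required.");

        var response = new WorkItemResponse(
            Guid.NewGuid(), request.Title.Trim(), request.Service.Trim());

        // 201 Created with a Location header pointing at the new resource.
        return TypedResults.Created($"/workitems/{response.Id}", response);
    }
}

// Wire-up in Program.cs (assumed route):
//   app.MapPost("/workitems", WorkItemEndpoints.Create);
```

Keeping the handler as a plain static method means it can be called directly in a unit test and the Result property inspected, with no need to start the web host.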

Endpoints:

Gotchas I fixed


Validation & consistent errors


Metrics

KPIs:

I initially referenced WorkItemStatus.Done, then renamed the status to Resolved (more dev-friendly) and fixed queries accordingly.

History:
A MetricsSnapshot is stored daily by a hosted background service.

Chart empty?
The dashboard plot relies on stored snapshots, not the live metrics endpoint. If you see an empty chart, call POST /dev/metrics/snapshot (or use the dashboard’s “Capture Snapshot” button) a couple of times after making changes to items.
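For scripting the snapshot call instead of clicking the button, something like the following could work. It assumes the API listens on http://localhost:8080 (as in the demo run-through) and that the endpoint expects the X-API-Key header. The class and method names are illustrative, and "demo-key" is a placeholder.

```csharp
using System;
using System.Net.Http;

// Build the request and show what would be sent.
var request = SnapshotClient.BuildSnapshotRequest(new Uri("http://localhost:8080"), "demo-key");
Console.WriteLine($"{request.Method} {request.RequestUri}");
// prints: POST http://localhost:8080/dev/metrics/snapshot

// To actually send it against a running instance:
//   using var client = new HttpClient();
//   using var response = await client.SendAsync(request);

public static class SnapshotClient
{
    // Creates a POST to the snapshot endpoint with the API key header attached.
    public static HttpRequestMessage BuildSnapshotRequest(Uri baseUri, string apiKey)
    {
        var request = new HttpRequestMessage(
            HttpMethod.Post, new Uri(baseUri, "/dev/metrics/snapshot"));
        request.Headers.Add("X-API-Key", apiKey);
        return request;
    }
}
```

Sending it two or three times, with item changes in between, should give the chart more than one point to draw.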


Events ingest

POST /events/ingest applies simple rule-based updates to a WorkItem:

I verified the flow by ingesting events and observing changes in GET /workitems/{id} and /metrics.
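To make "rule-based updates" concrete, here is one way such a rule table could look. The rules themselves are hypothetical: build_failed appears in the demo run-through, but what it does to an item is my assumption, and deploy_blocked is an invented event name. TrackedItem is a reduced stand-in for WorkItem.

```csharp
using System;
using System.Collections.Generic;

var item = new TrackedItem(Priority.Medium, WorkItemStatus.Open);
var updated = EventRules.Apply(item, "build_failed");
Console.WriteLine($"{updated.Priority} / {updated.Status}"); // High / Open

public enum Priority { Low, Medium, High, P0 }
public enum WorkItemStatus { Open, InProgress, Blocked, Resolved }

// Reduced stand-in for WorkItem: only the fields the rules touch.
public sealed record TrackedItem(Priority Priority, WorkItemStatus Status);

public static class EventRules
{
    // Hypothetical rule table: event type -> transformation of the item.
    private static readonly Dictionary<string, Func<TrackedItem, TrackedItem>> Rules =
        new(StringComparer.OrdinalIgnoreCase)
        {
            // Assumed rule: a failed build raises priority to at least High.
            ["build_failed"] = i => i with { Priority = Max(i.Priority, Priority.High) },
            // Assumed rule: a blocked deploy marks the item Blocked.
            ["deploy_blocked"] = i => i with { Status = WorkItemStatus.Blocked },
        };

    // Unknown event types leave the item unchanged.
    public static TrackedItem Apply(TrackedItem item, string eventType) =>
        Rules.TryGetValue(eventType, out var rule) ? rule(item) : item;

    private static Priority Max(Priority a, Priority b) => a > b ? a : b;
}
```

A lookup table keeps each rule a one-liner and makes unknown events a no-op, which suits a simple ingest endpoint.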


Dashboard (static; no Node)

I served a single HTML page from wwwroot/dashboard/index.html.
Two common pitfalls I hit and fixed:

Features:

Everything is vanilla JS / Fetch API; no frameworks.



Docker Compose & environment

docker/docker-compose.yml orchestrates:

The big Compose pitfall:

Warnings showed:

“The SA_PASSWORD variable is not set. Defaulting to a blank string.”

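Root cause: Compose substitutes variables such as SA_PASSWORD when it parses the file, before env_file is applied (env_file only sets variables inside the container), and it looks for .env in the compose file’s directory. I had .env in the repo root while the compose file lived in /docker.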

Fixes (either works):

I also hit “SQL login failed for ‘sa’” and “container unhealthy” errors when the SA password didn’t apply or when the data volume still held an old password.
Fix: run docker compose -f docker/docker-compose.yml down -v to drop the volume, then up -d with the correct env loaded.

Verification:


API security

(For production, use OAuth/OIDC; API key here is for demo simplicity.)
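For the demo-style API key check, one detail worth sketching is the comparison itself. The helper below is an assumption about how a presented X-API-Key value could be compared against the configured key. It is not the project's actual middleware, and the names are illustrative.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

Console.WriteLine(ApiKeyCheck.Matches("demo-key", "demo-key")); // True
Console.WriteLine(ApiKeyCheck.Matches("wrong", "demo-key"));    // False
Console.WriteLine(ApiKeyCheck.Matches(null, "demo-key"));       // False

public static class ApiKeyCheck
{
    // Returns true only when the presented key equals the expected key.
    public static bool Matches(string? presented, string expected)
    {
        if (string.IsNullOrEmpty(presented) || string.IsNullOrEmpty(expected))
            return false;

        // Hash both values so the spans have equal length, then compare in
        // fixed time to avoid leaking how many leading characters matched.
        byte[] a = SHA256.HashData(Encoding.UTF8.GetBytes(presented));
        byte[] b = SHA256.HashData(Encoding.UTF8.GetBytes(expected));
        return CryptographicOperations.FixedTimeEquals(a, b);
    }
}
```

An endpoint filter or middleware could call this with the header value and return 401 when it fails.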


GHCR container publishing & repo hygiene

I enabled GitHub Container Registry publishing in CI and hit:

Security hygiene:


Testing & developer ergonomics


Common errors I encountered (and how I recognized them)

Each time I hit an error, I:

  1. Read the exact exception (top lines matter)

  2. Mapped it to the responsible layer (routing, model binding, EF, DB, Docker)

  3. Applied the minimal fix, rebuilt, re-tested


Architecture & data diagrams

High-level flow (Mermaid)

[Click here to view the chart on mermaid.js]

flowchart LR
  API -- EF Core --> DB[(SQL Server)]
  subgraph Background
    Svc[Hosted Service: Daily Snapshot]
  end
  API <-->|Auth: X-API-Key| UI
  Svc --> DB
  API -- CSV Export --> UI
  API -- OpenAPI --> UI

ERD

[Click here to view the diagram on mermaid.js]

erDiagram
  WorkItem {
    string   Id PK
    string   Title
    string   Service
    string   Priority
    date     DueDate
    string   Status
    string   Component
    string   Assignee
    string   LabelsCsv
    datetime CreatedAtUtc
    datetime UpdatedAtUtc
  }

  MetricsSnapshot {
    string   Id PK
    datetime CapturedAtUtc
    float    BacklogHealthPct
    float    SlaBreachRatePct
    int      OverdueCount
    float    RiskAvg
  }

What each metric means

Backlog Health %

Interpretation: Team momentum on open work.

How to improve: make small updates, triage regularly, reassign or close stale items.

SLA Breach %

Interpretation: Timeliness adherence for open work.

How to improve: renegotiate due dates, clear blockers, escalate critical items.

Overdue Count

Interpretation: Absolute number of late items.
Use alongside SLA Breach %: a large team might have a low % but a non-trivial count; a small team the inverse.

How to improve: same as SLA Breach: reduce bottlenecks and recalibrate dates.
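To show how the count and the percentage relate, here is a small sketch of both calculations. The formulas are assumptions: "overdue" is taken to mean an unresolved item whose due date is strictly before today, and the percentage is taken over all unresolved items. The project's real definitions may differ, and Item is a reduced stand-in for WorkItem.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var today = new DateOnly(2025, 1, 10);
var items = new List<Item>
{
    new(WorkItemStatus.Open,       new DateOnly(2025, 1, 5)),  // overdue
    new(WorkItemStatus.InProgress, new DateOnly(2025, 1, 20)), // on time
    new(WorkItemStatus.Blocked,    null),                      // no due date
    new(WorkItemStatus.Open,       new DateOnly(2025, 1, 10)), // due today, not late yet
    new(WorkItemStatus.Resolved,   new DateOnly(2025, 1, 1)),  // resolved, ignored
};

var (overdue, breachPct) = Kpis.Compute(items, today);
Console.WriteLine($"Overdue: {overdue}, SLA breach: {breachPct}%");
// prints: Overdue: 1, SLA breach: 25%

public enum WorkItemStatus { Open, InProgress, Blocked, Resolved }

// Reduced stand-in for WorkItem: only the fields these KPIs need.
public sealed record Item(WorkItemStatus Status, DateOnly? DueDate);

public static class Kpis
{
    // Assumed definitions: overdue = unresolved and past due;
    // breach % = overdue / unresolved * 100 (0 when nothing is open).
    public static (int OverdueCount, double SlaBreachRatePct) Compute(
        IEnumerable<Item> items, DateOnly today)
    {
        var open = items.Where(i => i.Status != WorkItemStatus.Resolved).ToList();
        int overdue = open.Count(i => i.DueDate is { } due && due < today);
        double pct = open.Count == 0 ? 0 : overdue * 100.0 / open.Count;
        return (overdue, pct);
    }
}
```

With four open items, one late item is 25%. With forty open items the same 25% would mean ten late items, which is why the two numbers are worth reading together.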

Risk (Average)

Interpretation: Aggregate operational risk from priority and lateness.

Rule of thumb:

How to improve: resolve P0/High first, set and honor realistic due dates, break down large scope.


Demo run through

  1. docker compose -f docker/docker-compose.yml up -d (with docker/.env present)

  2. Browse http://localhost:8080/swagger → Authorize with your API key

  3. POST /dev/seed → creates demo items

  4. GET /workitems → see items; use filters & sorting

  5. POST /events/ingest → e.g., build_failed on an item

  6. GET /metrics → observe KPI changes

  7. POST /dev/metrics/snapshot → capture point(s)

  8. Dashboard http://localhost:8080/dashboard → paste API key, Load

  9. Try Download CSV, Edit, Delete, Mark Resolved, Bump → P0

  10. Toggle auto-refresh; try presets


What I accomplished here