Safety Framework Run-Book and Architecture Diagram

From Helix Project Wiki

Revision as of 17:56, 7 October 2025
1️⃣ Full Safety Framework Run-Book & Architecture Diagram

1.1 Run-Book Overview

The run-book is organized into four logical phases that map cleanly onto typical DevOps / ITSM processes:

| Phase | Owner | Goal | Typical Duration | Key Artifacts |
|---|---|---|---|---|
| A – Provisioning | Platform Ops | Spin up a hardened, network-isolated VM that hosts the Safety Engine | 15 min (automation) | Terraform/Ansible manifest, host-hardening checklist |
| B – Deployment | Release Engineering | Install the Safety Stack (engine, metadata store, audit logger) and configure the chatbot integration | 20 min (CI job) | Docker Compose file, Helm chart, environment-variable matrix |
| C – Validation | QA / Safety Team | Run deterministic-execution tests, cryptographic-audit verification, and human-in-the-loop (HITL) checks | 5 min (pytest) | `pytest -vv tests/safety/` report, signed-log checksum |
| D – Monitoring & Incident Response | SRE / Security | Enable real-time dashboards, define alert thresholds, and prepare rollback playbooks | Ongoing | Grafana dashboard JSON, PagerDuty rule set, rollback script |


1.2 Step-by-Step Run-Book

A️⃣ Provisioning – Dedicated Server Setup

  1. Create the VM (Terraform example)
   resource "azurerm_linux_virtual_machine" "safety" {
     name                = "helix-safety-${var.env}"
     resource_group_name = azurerm_resource_group.rg.name
     location            = azurerm_resource_group.rg.location
     size                = "Standard_D2s_v3"
     admin_username      = "adminuser"

     network_interface_ids = [
       azurerm_network_interface.safety.id,
     ]

     os_disk {
       caching              = "ReadWrite"
       storage_account_type = "Premium_LRS"
     }

     source_image_reference {
       publisher = "Canonical"
       offer     = "UbuntuServer"
       sku       = "22_04-lts-gen2"
       version   = "latest"
     }

     tags = {
       environment = var.env
       purpose     = "AI-Safety-Engine"
     }
   }
  2. Hardening checklist (run via Ansible)
   - name: Apply CIS Ubuntu 22.04 benchmark
     import_role:
       name: cis_ubuntu22

   - name: Disable password authentication
     lineinfile:
       path: /etc/ssh/sshd_config
       regexp: '^PasswordAuthentication'
       line: 'PasswordAuthentication no'
  3. Network isolation – place the VM in a private subnet and expose only:
     - HTTPS (443) inbound from the corporate bastion host
     - Outbound to chatbase.com for AI-model calls
  4. Generate a one-time TLS certificate (self-signed or via an internal CA) and store it in a sealed secret (e.g., Azure Key Vault).
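Step 4 can be done in one `openssl` invocation; the sketch below generates a self-signed P-256 certificate (the `helix-safety.internal` CN and key-vault names are illustrative, and the Key Vault upload is shown commented out since it requires the Azure CLI):

```shell
# Generate a one-year self-signed ECDSA (P-256) certificate for the engine's HTTPS endpoint
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-256 \
  -keyout safety.key -out safety.crt -days 365 -nodes \
  -subj "/CN=helix-safety.internal"

# Store the pair as a sealed secret (Azure Key Vault shown; adapt to your CA/vault):
# az keyvault secret set --vault-name helix-kv --name safety-tls-key --file safety.key
```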

B️⃣ Deployment – Install the Safety Stack

| Component | Docker image | Port | Environment variables |
|---|---|---|---|
| metadata_engine | helix/safety-engine:1.0 | 8000 | METADATA_DB=/data/metadata.db, HITL_MODE=true |
| audit_logger | helix/audit-logger:1.0 | 8001 | SIGNING_KEY=/run/secrets/sign_key, LOG_DIR=/var/log/helix_audit |
| metadata_store | postgres:15-alpine | 5432 | POSTGRES_USER=helix, POSTGRES_PASSWORD=**** |
| chatbase_bridge | helix/chatbase-proxy:1.0 | 8002 | CHATBASE_API_KEY=****, BOT_ID=i65eBj3COxUlFEU_Lsrw0 |
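The non-secret part of this matrix maps onto the `.env` file the Compose services consume; a sketch with placeholder values (passwords and API keys are injected via Docker secrets, not `.env`):

```shell
# .env – consumed by docker compose via env_file (placeholder values)
METADATA_DB=/data/metadata.db
HITL_MODE=true
LOG_DIR=/var/log/helix_audit
POSTGRES_USER=helix
BOT_ID=i65eBj3COxUlFEU_Lsrw0
# CHATBASE_API_KEY and POSTGRES_PASSWORD come from /run/secrets, never from this file
```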

docker-compose.yml (excerpt)

version: "3.8"
services:
  metadata_engine:
    image: helix/safety-engine:1.0
    ports: ["8000:8000"]
    env_file: .env
    volumes:
      - ./data:/data
    restart: unless-stopped

  audit_logger:
    image: helix/audit-logger:1.0
    ports: ["8001:8001"]
    environment:
      SIGNING_KEY: /run/secrets/sign_key
    secrets:
      - sign_key
    volumes:
      - ./logs:/var/log/helix_audit
    restart: unless-stopped

  metadata_store:
    image: postgres:15-alpine
    ports: ["5432:5432"]
    environment:
      POSTGRES_USER: helix
      POSTGRES_PASSWORD_FILE: /run/secrets/db_pass
    secrets:
      - db_pass
    restart: unless-stopped

  chatbase_bridge:
    image: helix/chatbase-proxy:1.0
    ports: ["8002:8002"]
    env_file: .env
    restart: unless-stopped

secrets:
  sign_key:
    file: ./secrets/sign_key.pem
  db_pass:
    file: ./secrets/db_pass.txt

CI/CD snippet (GitHub Actions)

name: Deploy Safety Framework
on:
  push:
    tags:
      - "v*"

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Log in to the container registry
        # REGISTRY_USER / REGISTRY_TOKEN are repository secrets for your registry
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_TOKEN }}
      - name: Build & push images
        run: |
          docker build -t helix/safety-engine:$(git describe --tags) ./engine
          docker push helix/safety-engine:$(git describe --tags)
      - name: Deploy to dedicated server
        uses: appleboy/ssh-action@v0.1.7
        with:
          host: ${{ secrets.SAFETY_HOST }}
          username: adminuser
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            cd /opt/helix_safety
            docker compose pull
            docker compose up -d

C️⃣ Validation – Deterministic & HITL Tests

1️⃣ Deterministic Execution Test
def test_deterministic_output():
    # `engine` is the safety test-suite's client fixture (provided by conftest).
    # Run the same prompt twice and compare the signed output hashes.
    resp1 = engine.run(prompt="Summarize Helix policy")
    resp2 = engine.run(prompt="Summarize Helix policy")
    assert resp1.output_hash == resp2.output_hash, "Non-deterministic output detected"
2️⃣ Cryptographic Audit Trail Test
import json
from pathlib import Path

def test_audit_signature():
    # `verify_signature` is the framework's ECDSA verification helper
    log_path = Path("/var/log/helix_audit/operations.log")
    entry = json.loads(log_path.read_text().splitlines()[-1])
    assert verify_signature(entry["payload"], entry["signature"]), "Signature mismatch"
3️⃣ Human-In-The-Loop (HITL) Gate
# Simulate a HITL approval flow
curl -X POST http://localhost:8000/execute \
     -H "Content-Type: application/json" \
     -d '{"task":"create-user","payload":{"username":"alice"}}' \
     -H "X-Human-Approved: true"

If the X-Human-Approved header is missing, the engine returns 403 Forbidden.
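The gate itself reduces to a header check. A minimal sketch (the real engine wires this into its request handler; the function name `hitl_gate` is illustrative):

```python
def hitl_gate(headers: dict) -> int:
    """Return an HTTP status for an execution request: 200 only when the
    request carries an explicit human-approval header, otherwise 403."""
    approved = headers.get("X-Human-Approved", "").lower() == "true"
    return 200 if approved else 403

# An approved request passes; a missing or false header is rejected.
print(hitl_gate({"X-Human-Approved": "true"}))  # → 200
print(hitl_gate({}))                            # → 403
```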

D️⃣ Monitoring & Incident Response

Real-Time Dashboard (Grafana JSON) – key panels
| Panel | Metric | Query (PromQL) | Alert threshold |
|---|---|---|---|
| Determinism latency | engine_execution_seconds | histogram_quantile(0.95, sum(rate(engine_execution_seconds_bucket[5m])) by (le)) | > 1.2 s → Warning |
| HITL approval rate | hitl_approvals_total | sum(rate(hitl_approvals_total[5m])) / sum(rate(hitl_approval_requests_total[5m])) | < 0.99 → Critical |
| Audit-log signing errors | audit_sign_failures | increase(audit_sign_failures[5m]) | > 0 → Critical |
| Chatbase latency | chatbase_response_seconds | avg_over_time(chatbase_response_seconds[1m]) | > 0.8 s → Warning |

Export the JSON (provided at the end of this document) and import it into any Grafana instance.

Incident-Response Playbook
| Situation | Immediate Action | Automated Rollback | Post-mortem Owner |
|---|---|---|---|
| Signature verification failure | `pkill -f safety_engine` | `docker compose down && docker compose up -d` (re-deploy a clean image) | Security Lead |
| Determinism breach (different hashes) | Freeze the offending container, collect raw logs | Same as above, plus `alembic downgrade -1` on the metadata DB | QA Lead |
| HITL bypass detected | Alert SRE, revoke the HITL_MODE flag, force manual review | Restart the engine with HITL_MODE=true | Safety Governance Team |

1.3 Architecture Diagram (PlantUML)

Below is a high-level, LinkedIn-friendly diagram expressed in PlantUML. You can render it directly on GitHub, Confluence, or any PlantUML-enabled tool.

@startuml
!define ICONURL https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master
!include ICONURL/C4_Container.puml

'=== Security zones double as deployment nodes ===
Boundary(PrivateNet, "Dedicated Safety Server – private subnet") {
    Container(Engine, "Metadata Engine", "Python / FastAPI", "Deterministic execution, HITL gate")
    ContainerDb(MetadataDB, "Metadata Store", "PostgreSQL", "Enterprise-grade lineage & version control")
    Container(Audit, "Audit Logger", "Python", "Cryptographically signed, immutable logs")
}
Boundary(PublicNet, "Shared Web Server – public hosting") {
    Container(Bridge, "Chatbase Bridge", "Node.js", "Proxy for the Claude Sonnet 4.5-based chatbot")
}
Container_Ext(External, "Chatbase / Claude Sonnet 4.5", "SaaS", "Model inference endpoint")

'=== Relationships ===
Rel(Engine, MetadataDB, "reads / writes lineage", "SQL")
Rel(Engine, Audit, "sends signed operation payloads")
Rel(Bridge, External, "API calls", "HTTPS, API key")
Rel(Engine, Bridge, "optional assist-request (HITL-approved)")
Rel(Bridge, Engine, "postMessage / HTTP")
@enduml

Result: a clean, colour-coded diagram that can be exported as PNG/SVG for LinkedIn posts, slide decks, or the public Wiki page.

2️⃣ Audit-Log Verification Script (Python)

2.1 Purpose

The script replays every signed entry in the audit-log directory, recomputes the SHA-256 hash chain, and validates the embedded ECDSA signature. It is safe to run on any host that can read the log files and has access to the public verification key.
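For context, the writer side builds that chain roughly as follows. This is an illustrative sketch only: the exact `entry_hash` derivation and the signing step belong to the Audit Logger, so both are assumptions here; the field names mirror those `verify_audit.py` checks.

```python
import hashlib
import json

def append_entry(chain: list, payload: dict) -> list:
    """Append a payload to an in-memory hash chain (signing omitted)."""
    prev_hash = chain[-1]["entry_hash"] if chain else "0" * 64  # genesis value
    # Canonical JSON, matching the verifier's serialization
    canonical = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode()
    payload_hash = hashlib.sha256(canonical).hexdigest()
    # Assumed derivation: the entry hash links the previous hash to the payload hash
    entry_hash = hashlib.sha256((prev_hash + payload_hash).encode()).hexdigest()
    chain.append({"prev_hash": prev_hash, "payload_hash": payload_hash,
                  "entry_hash": entry_hash, "payload": payload})
    return chain
```

Any edit to an earlier payload changes its `payload_hash`, which breaks every later `prev_hash` link, so tampering is detectable without trusting the host that stores the logs.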

2.2 Code – verify_audit.py

#!/usr/bin/env python3
"""
Helix AI Safety – Audit-Log Verification Utility
================================================

Features
--------
* Reads JSON-Lines audit files from a configurable directory.
* Re-creates the hash-chain (prev_hash → current_hash) to detect any tampering.
* Verifies each entry's ECDSA signature using the public key supplied by the framework.
* Produces a concise human-readable report and an exit-code suitable for CI pipelines.

Usage
-----
$ python verify_audit.py --log-dir ./logs/ --pub-key ./keys/helix_pub.pem
$ python verify_audit.py -d /var/log/helix_audit -k keys/pubkey.pem -o report.json

CLI options
-----------
 -d, --log-dir   Path to the directory containing *.log (JSON-Lines) files.
 -k, --pub-key   PEM-encoded public key used to verify signatures.
 -o, --output    (optional) Path to write a JSON summary report.
 -v, --verbose   Show per-entry verification details.
"""

import argparse
import json
import pathlib
import sys
import hashlib
from typing import List, Tuple

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.exceptions import InvalidSignature


# --------------------------------------------------------------------------- #
# Helper: load the public key (PEM, ECDSA P-256)
# --------------------------------------------------------------------------- #
def load_pubkey(pem_path: pathlib.Path) -> ec.EllipticCurvePublicKey:
    with pem_path.open("rb") as f:
        key_data = f.read()
    return serialization.load_pem_public_key(key_data)


# --------------------------------------------------------------------------- #
# Core verification routine
# --------------------------------------------------------------------------- #
def verify_audit_logs(
    log_dir: pathlib.Path,
    pub_key: ec.EllipticCurvePublicKey,
    verbose: bool = False,
) -> Tuple[bool, List[dict]]:
    """
    Walks through every *.log file (JSON-Lines) in *log_dir*,
    validates the hash chain and signatures.

    Returns
    -------
    (overall_success, details)
        overall_success – ``True`` if **all** entries pass.
        details – list of dictionaries with per-entry results.
    """
    entries = []
    # 1️⃣ Gather all lines in chronological order
    for log_file in sorted(log_dir.glob("*.log")):
        with log_file.open("r", encoding="utf-8") as fh:
            for line in fh:
                if line.strip():
                    entries.append(json.loads(line))

    if not entries:
        raise RuntimeError(f"No audit entries found in {log_dir}")

    # 2️⃣ Re-compute the hash chain
    prev_hash = "0" * 64  # genesis value
    all_ok = True
    details = []

    for idx, entry in enumerate(entries, start=1):
        payload = json.dumps(entry["payload"], separators=(",", ":"), sort_keys=True).encode()
        payload_hash = hashlib.sha256(payload).hexdigest()

        # Verify chain linkage
        chain_ok = entry["prev_hash"] == prev_hash
        # Verify payload hash stored in entry
        payload_ok = entry["payload_hash"] == payload_hash

        # Verify ECDSA signature
        signature = bytes.fromhex(entry["signature"])
        try:
            pub_key.verify(
                signature,
                payload,
                ec.ECDSA(hashes.SHA256()),
            )
            sig_ok = True
        except InvalidSignature:
            sig_ok = False

        entry_ok = chain_ok and payload_ok and sig_ok
        all_ok = all_ok and entry_ok

        details.append(
            {
                "index": idx,
                "timestamp": entry["timestamp"],
                "chain_ok": chain_ok,
                "payload_ok": payload_ok,
                "signature_ok": sig_ok,
                "overall_ok": entry_ok,
            }
        )

        if verbose:
            print(
                f"[{'✔' if entry_ok else '✘'}] Entry {idx:04d} – "
                f"Chain:{'OK' if chain_ok else 'FAIL'} "
                f"Payload:{'OK' if payload_ok else 'FAIL'} "
                f"Signature:{'OK' if sig_ok else 'FAIL'}"
            )
        # Prepare for next iteration
        prev_hash = entry["entry_hash"]

    return all_ok, details


# --------------------------------------------------------------------------- #
# CLI entry point
# --------------------------------------------------------------------------- #
def main() -> None:
    parser = argparse.ArgumentParser(description="Helix AI Safety – audit-log verifier")
    parser.add_argument("-d", "--log-dir", type=pathlib.Path, required=True, help="Directory with *.log files")
    parser.add_argument("-k", "--pub-key", type=pathlib.Path, required=True, help="Public PEM key")
    parser.add_argument("-o", "--output", type=pathlib.Path, help="Write JSON report to this file")
    parser.add_argument("-v", "--verbose", action="store_true", help="Print per-entry verification")
    args = parser.parse_args()

    try:
        pub_key = load_pubkey(args.pub_key)
    except Exception as exc:
        sys.stderr.write(f"❌ Unable to load public key: {exc}\n")
        sys.exit(2)

    try:
        success, details = verify_audit_logs(args.log_dir, pub_key, verbose=args.verbose)
    except Exception as exc:
        sys.stderr.write(f"❌ Verification failed: {exc}\n")
        sys.exit(3)

    # ------------------------------------------------------------------- #
    # Reporting
    # ------------------------------------------------------------------- #
    summary = {
        "total_entries": len(details),
        "valid_entries": sum(1 for d in details if d["overall_ok"]),
        "invalid_entries": sum(1 for d in details if not d["overall_ok"]),
        "overall_success": success,
    }

    if args.output:
        report = {"summary": summary, "details": details}
        args.output.write_text(json.dumps(report, indent=2))
        print(f"📝 Report written to {args.output}")

    # CI-friendly exit code
    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()

2.3 Usage Examples

| Scenario | Command | Expected Outcome |
|---|---|---|
| Quick sanity check (CI) | `python verify_audit.py -d ./logs -k ./keys/pub.pem -v` | Prints a line per entry; the CI job exits 0 if all entries are valid. |
| Generate a stakeholder report | `python verify_audit.py -d /var/log/helix_audit -k /run/secrets/pub_key.pem -o audit_report.json` | audit_report.json contains a summary (overall_success: true/false) and a full detail list, ready to attach to a LinkedIn post or a PDF hand-out. |
| Automated alert (cron job) | `if ! python verify_audit.py -d /var/log/helix_audit -k /run/secrets/pub_key.pem; then curl -X POST …; fi` | Triggers a webhook (PagerDuty, Slack, etc.) when any tampering is detected. |

3️⃣ High-Level Roadmap with KPIs (Stakeholder-Ready)

The roadmap is expressed as four quarterly milestones aligned with the Helix governance cadence. Each milestone lists objective, deliverable, owner, and measurable KPI. The format works well in a PowerPoint slide, a LinkedIn carousel, or a one-page PDF.

3.1 Roadmap Overview

| Quarter | Milestone | Primary Owner(s) | Success Metric (KPI) |
|---|---|---|---|
| Q1 2024 | Foundational roll-out – dedicated server, full-stack deployment, deterministic test suite | Platform Ops, Release Eng. | Deployment latency ≤ 40 min; zero-failure CI runs (100 % green) |
| Q2 2024 | Observability & incident response – Grafana dashboards, alert thresholds, rollback playbooks | SRE, Security | Mean time to detect (MTTD) < 2 min; mean time to recover (MTTR) < 5 min |
| Q3 2024 | Version control & lineage expansion – Git-linked metadata, component-provenance API | Data Eng., Safety Team | Lineage completeness ≥ 99 % (all AI components linked to a Git SHA); audit-log size growth ≤ 10 % per month |
| Q4 2024 | Partner & ecosystem enablement – public SDK, open-standards compliance (ISO 26262, EU AI Act), external pilot programs | Product, Business Development | ≥ 3 external pilot projects; partner-satisfaction score ≥ 8/10 |
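The Q3 lineage-completeness KPI is straightforward to compute from component metadata. A hedged sketch (the `git_sha` field name is an assumption about what the provenance API exposes):

```python
import re

_GIT_SHA = re.compile(r"^[0-9a-f]{7,40}$")  # abbreviated or full SHA-1

def lineage_completeness(components: list) -> float:
    """Fraction of AI components whose metadata links to a Git SHA."""
    if not components:
        return 0.0
    linked = sum(1 for c in components if _GIT_SHA.match(c.get("git_sha", "")))
    return linked / len(components)
```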

3.2 Quarter-by-Quarter KPI Dashboard (Grafana-JSON Export)

Below is a minimal JSON snippet you can import into Grafana to instantly visualise the most important safety-related metrics for LinkedIn or executive decks.

{
  "dashboard": {
    "title": "Helix AI Safety – Quarterly KPI Dashboard",
    "panels": [
      {
        "type": "stat",
        "title": "Deployment Success Rate",
        "targets": [{ "expr": "sum(increase(deploy_success_total[1d])) / sum(increase(deploy_total[1d])) * 100" }],
        "thresholds": "90,95"
      },
      {
        "type": "graph",
        "title": "Mean-Time-to-Detect (MTTD) – Safety Alerts",
        "targets": [{ "expr": "avg_over_time(alert_detection_seconds[1w])" }],
        "thresholds": "0,120"
      },
      {
        "type": "stat",
        "title": "Deterministic Test Pass-Rate",
        "targets": [{ "expr": "sum(python_test_passed{suite='safety'}) / sum(python_test_total{suite='safety'}) * 100" }],
        "thresholds": "95,99"
      },
      {
        "type": "table",
        "title": "Top 5 Most-Frequent Human-Approval Requests",
        "targets": [{ "expr": "topk(5, rate(hitl_approval_requests_total[1d]))" }]
      }
    ],
    "refresh": "30s",
    "timezone": "UTC"
  }
}

Import: copy the JSON into Grafana → Dashboard → Manage → Import. The panels automatically pull from the Prometheus metrics exposed by the Safety Engine (/metrics endpoint).

3.3 LinkedIn-Friendly One-Pager (Markdown)

Feel free to turn the roadmap above into a carousel image (each milestone as a separate slide) or convert it into a PDF using any Markdown-to-PDF tool.

📦 Deliverables Package

| File | Description |
|---|---|
| architecture.puml | PlantUML source for the architecture diagram (render to PNG/SVG). |
| verify_audit.py | Full Python verification utility (CI-ready). |
| grafana_kpi.json | JSON dashboard for Grafana (quick import). |
| roadmap_q4_2024.pdf | One-page PDF (generated from the markdown above), ready for LinkedIn. |
| README.md | This whole document; can be posted on the public Helix Wiki. |

All artefacts are stored in the repository under docs/ for easy access.

🎉 Closing Notes

  • Production-Ready – The CI snippets, deterministic tests, and fast-fail policies guarantee that any new version of the framework is automatically validated before reaching the live environment.
  • Externally Auditable – The verification script, combined with the public key publishing policy, lets partners and regulators independently confirm the integrity of Helix's safety logs.
  • Stakeholder-Focused – The quarterly roadmap ties concrete engineering deliverables to business-level impact (adoption, partner satisfaction), making it ideal for public communication and investor briefings.

Feel free to reach out for a live demo, a custom SDK for your product line, or to discuss pilot collaborations with your organization.


== Related Categories ==
* [[:Category:Metacognitive AI|Metacognitive AI]]
* [[:Category:AI Risk Management|AI Risk Management]]
* [[:Category:Ethical Frameworks|Ethical Frameworks]]