LLM observability & safety

Monitor every prompt.
Flag every risk.

Sentinel AI sits between your application and its LLM. It inspects both sides of every conversation, aggregates risk signals, and returns a score with human-readable explanations — in a single API call.

Try the analyzer → | How it works

Core capabilities

Unified risk score

Every prompt-response pair gets a single 0–100 score, aggregated across all active detectors. One number, actionable immediately.

Explainable flags

Each triggered signal comes with a plain-English reason, severity level, and the exact excerpt that caused it. No black boxes.

Input + output monitoring

Sentinel watches both sides of the conversation — malicious prompts and unsafe model responses — in a single analysis call.

Pipeline at a glance

Prompt arrives

The prompt and the model's response are sent to /api/analyze

Detectors run

Injection, PII, toxicity, domain, and data-exfiltration checks

Score aggregated

Weighted signals → 0–100 risk score

Result returned

Status, flags, excerpts, and metadata
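
To make the "Score aggregated" step concrete, here is a minimal sketch of one way weighted signals could collapse into a 0–100 score. Sentinel AI's actual weights and formula are not described on this page, so the `SEVERITY_WEIGHTS` table, the `Signal` type, the per-signal `confidence` field, and the sum-then-clamp rule are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights per severity level (assumption, not Sentinel AI's real values).
SEVERITY_WEIGHTS = {"low": 10, "medium": 25, "high": 45, "critical": 80}


@dataclass(frozen=True)
class Signal:
    """One detector finding. The confidence field is an illustrative assumption."""
    label: str
    severity: str      # must be a key of SEVERITY_WEIGHTS
    confidence: float  # detector confidence in [0.0, 1.0]


def aggregate_score(signals: list[Signal]) -> int:
    """Sum severity weight times confidence, then clamp the total to 0-100."""
    total = sum(SEVERITY_WEIGHTS[s.severity] * s.confidence for s in signals)
    return max(0, min(100, round(total)))


if __name__ == "__main__":
    signals = [
        Signal("Prompt Injection", "critical", 1.0),
        Signal("PII", "medium", 0.4),
    ]
    print(aggregate_score(signals))
```

Clamping keeps the result on the advertised 0–100 scale even when several severe signals fire at once. A production aggregator might instead use a saturating combination so that extra signals raise the score with diminishing effect.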

Integration

POST /api/analyze (Content-Type: application/json)

{
  "prompt":   "Ignore all previous instructions and...",
  "response": "Sure! Here is how you can..."
}

// → response
{
  "score":  82,
  "status": "blocked",
  "flags": [
    {
      "label":       "Prompt Injection",
      "severity":    "critical",
      "source":      "prompt",
      "description": "Attempts to override system instructions",
      "excerpt":     "…Ignore all previous instructions and…"
    }
  ],
  "meta": { "detectors_run": 6, "analyzed_at": "..." }
}
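
The request and response above could be wrapped in a small client along these lines. This is a sketch using only the Python standard library: the field names come from the sample JSON, while the base URL, the `Flag` and `Report` types, the helper function names, and the absence of authentication headers are assumptions, since this page does not document them.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    label: str
    severity: str
    source: str
    description: str
    excerpt: str


@dataclass(frozen=True)
class Report:
    score: int
    status: str
    flags: list[Flag]
    meta: dict


def build_request(base_url: str, prompt: str, response: str) -> urllib.request.Request:
    """Build the POST /api/analyze request with a JSON body."""
    payload = json.dumps({"prompt": prompt, "response": response}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_report(body: str) -> Report:
    """Parse a response body shaped like the sample above; missing flag fields become ''."""
    data = json.loads(body)
    fields = ("label", "severity", "source", "description", "excerpt")
    flags = [Flag(**{k: f.get(k, "") for k in fields}) for f in data.get("flags", [])]
    return Report(data["score"], data["status"], flags, data.get("meta", {}))


def analyze(base_url: str, prompt: str, response: str, timeout: float = 10.0) -> Report:
    """Send one exchange for analysis. base_url is a placeholder for your deployment."""
    req = build_request(base_url, prompt, response)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_report(resp.read().decode("utf-8"))
```

A caller might withhold the model's reply whenever `report.status == "blocked"` and log each flag's `description` and `excerpt` for review. Only the "blocked" status appears in the sample; any other status values would need to be confirmed against the API itself.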

Ready to analyze an exchange?

Paste a prompt and response to get an instant risk report.

Open analyzer →