VEROSEK SHIELD

12 checks. One layer. Every request scanned.

Input. Output. Tool output. Session drift. Deterministic where possible. Offline ML everywhere else.

01

Where each check runs

Pre-LLM. Post-tool. Post-LLM. Session.

Pre-LLM

CHK-013..016, CHK-024

Post-Tool

CHK-020, CHK-021

Post-LLM

CHK-017..019, CHK-023

Session

CHK-022
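The scan-point assignment above can be pictured as a small lookup table. The sketch below is illustrative only: the dictionary keys and the helper names `check_ids` and `all_check_ids` are assumptions, not part of Shield's API. Only the check numbers come from the list above.

```python
# Hypothetical lookup table: scan point -> check numbers, copied from the list above.
SCAN_POINTS = {
    "pre_llm": [13, 14, 15, 16, 24],
    "post_tool": [20, 21],
    "post_llm": [17, 18, 19, 23],
    "session": [22],
}


def check_ids(scan_point: str) -> list[str]:
    """Return the CHK-xxx identifiers that run at one scan point."""
    return [f"CHK-{n:03d}" for n in SCAN_POINTS[scan_point]]


def all_check_ids() -> list[str]:
    """Return every check identifier across all scan points, sorted."""
    return sorted(cid for point in SCAN_POINTS for cid in check_ids(point))
```

Counting the entries returned by `all_check_ids()` is a quick way to confirm that the four scan points together cover all 12 checks.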

02

Full check catalog

All 12 Shield checks. Every score. Every scan point.

ID | What it detects | Scan point | Phase
--- | --- | --- | ---
CHK-013 | Prompt injection in user input. Offline multilingual classifier | Pre-LLM | S1
CHK-014 | Jailbreak attempt in user input. Same pass: one forward pass produces both scores | Pre-LLM | S1
CHK-015 | PII in user input. Six-language PII engine, four redaction modes: tag / fake / mask / hash | Pre-LLM | S1
CHK-016 | Secrets in user input. Seventeen provider-specific regex patterns (AWS, GitHub, Stripe, PEM, JWT, …) | Pre-LLM | S1
CHK-017 | Toxicity in model response. Offline multilingual toxicity classifier | Post-LLM | S1
CHK-018 | PII in model response. Same PII engine as CHK-015, applied to the response | Post-LLM | S1
CHK-019 | Secrets in model response. Same regex catalog as CHK-016, applied to the response | Post-LLM | S1
CHK-020 | Indirect prompt injection in MCP tool output. Stricter threshold + chunked scan over every tool result | Post-Tool | S2
CHK-021 | PII in MCP tool output. Per-connection redaction mode on tool responses | Post-Tool | S2
CHK-022 | Session-level exfiltration drift. Cumulative PII + URL + byte counters per session with warn / block thresholds | Session | S2
CHK-023 | Grounding / hallucination. Offline verdict model scores the response against its retrieved context | Post-LLM | S3
CHK-024 | Off-topic / scope creep. Per-key topic centroids with margin-based decision bands and small-talk short-circuit | Pre-LLM | S3
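To make the regex-based secrets checks (CHK-016 and CHK-019) concrete, here is a minimal sketch of a provider-specific pattern scan. The four patterns are common public token shapes chosen for illustration; they are an assumption and not Shield's actual seventeen-pattern catalog. The name `find_secrets` is hypothetical.

```python
import re

# Illustrative patterns only; NOT the Shield catalog.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "pem_private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def find_secrets(text: str) -> list[tuple[str, int, int]]:
    """Return (pattern_name, start, end) for every match, ordered by position."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[1])
```

Because the scan is plain pattern matching, the same function can run on user input and on the model response, which mirrors how CHK-019 reuses the CHK-016 catalog.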
03

Profiles

Start in shadow mode. Graduate to strict when the false-positive rate is zero.

profile: none

Trusted internal services

All checks off. Zero overhead.

profile: baseline

Most production keys (default)

PII and secrets checks enforce. Everything else runs log_only: verdicts appear in the trace but never block.

profile: strict

Regulated workloads

Every check enforces except CHK-023 grounding, which stays async log_only because a verdict that arrives after the response cannot retroactively block it.

profile: custom

Fine-grained tuning

Per-check toggle in the admin. Export as YAML for git.
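The four profiles can be read as a function from profile and check ID to a mode. The sketch below rests on several assumptions: the mode names `off` and `enforce` are hypothetical (`log_only` comes from the text), the baseline set treats CHK-021 as one of the PII checks, and an unset check under `custom` defaults to `off`. It is not Shield's configuration schema.

```python
from typing import Optional

ALL_CHECKS = {f"CHK-{n:03d}" for n in range(13, 25)}

# Assumption: "PII + secrets" covers the input, response, and tool-output variants.
PII_AND_SECRETS = {"CHK-015", "CHK-016", "CHK-018", "CHK-019", "CHK-021"}


def check_mode(profile: str, check_id: str,
               custom: Optional[dict[str, str]] = None) -> str:
    """Resolve the mode of one check under a profile (hypothetical helper)."""
    if check_id not in ALL_CHECKS:
        raise ValueError(f"unknown check: {check_id}")
    if profile == "none":
        return "off"
    if profile == "baseline":
        return "enforce" if check_id in PII_AND_SECRETS else "log_only"
    if profile == "strict":
        # Grounding stays log_only because its verdict arrives after the response.
        return "log_only" if check_id == "CHK-023" else "enforce"
    if profile == "custom":
        # Assumption: checks not listed in the custom mapping are off.
        return (custom or {}).get(check_id, "off")
    raise ValueError(f"unknown profile: {profile}")
```

For example, `check_mode("strict", "CHK-023")` resolves to `log_only`, while every other check under `strict` resolves to `enforce`.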

04

Offline scanning

Every signal is local. Every image is air-gap clean.

Heavy scanning runs in an optional verosek-shield-ml container. The gateway never imports torch. If the ML service is unreachable, each check falls back to its documented fail_behavior — fail_closed for prompt injection by default.
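The fallback described above might look like the following sketch. Everything named here is hypothetical: the exception class, the `run_check` helper, the threshold, and `fail_open` as the counterpart to `fail_closed`. The only default taken from the text is `fail_closed` for prompt injection (CHK-013).

```python
from typing import Callable


class MLServiceUnavailable(Exception):
    """Raised by a classifier client when the ML service cannot be reached."""


# Only the prompt-injection default is stated in the text; others are assumed.
FAIL_BEHAVIOR = {"CHK-013": "fail_closed"}


def run_check(check_id: str, text: str, classify: Callable[[str], float],
              threshold: float = 0.5, default_behavior: str = "fail_open") -> dict:
    """Score text with the ML service, or apply the check's fail behavior."""
    try:
        score = classify(text)
    except MLServiceUnavailable:
        behavior = FAIL_BEHAVIOR.get(check_id, default_behavior)
        return {"check": check_id, "score": None,
                "blocked": behavior == "fail_closed", "reason": behavior}
    return {"check": check_id, "score": score,
            "blocked": score >= threshold, "reason": "scored"}
```

Keeping the fallback in the caller means the gateway only needs a thin client for the ML service, which fits the statement that the gateway never imports torch.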

Offline classifiers

Prompt injection, jailbreak, and toxicity detection run entirely on-premises. Nothing about your prompts leaves your network.

Multilingual PII engine

Six languages out of the box, four redaction modes, and a custom-recogniser slot for domain-specific entity types.
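The four redaction modes can be illustrated on a single entity type. This is a simplified sketch, not the PII engine: it recognises only email addresses with a basic regex, the `fake` mode substitutes a fixed placeholder, and the 12-character truncated SHA-256 digest for `hash` is an assumption. The function name `redact` is hypothetical.

```python
import hashlib
import re

# Simplified email recogniser used only for this illustration.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MODES = ("tag", "fake", "mask", "hash")


def redact(text: str, mode: str, entity: str = "EMAIL") -> str:
    """Replace every email address in text according to the redaction mode."""
    if mode not in MODES:
        raise ValueError(f"unknown redaction mode: {mode}")

    def replace(match: re.Match) -> str:
        value = match.group(0)
        if mode == "tag":
            return f"<{entity}>"
        if mode == "fake":
            return "user@example.com"  # fixed stand-in for a generated fake value
        if mode == "mask":
            return "*" * len(value)
        # mode == "hash": stable pseudonym, same input gives the same output
        return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]

    return EMAIL.sub(replace, text)
```

The `hash` mode is the one to pick when downstream systems still need to tell two redacted values apart, since equal inputs map to equal digests.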

Session-level drift

Per-session cumulative counters on PII, URLs, and data volume. Slow exfiltration attempts fail the session, not just the call.
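A minimal sketch of cumulative per-session counters with warn and block thresholds follows. The class names, counter names, and verdict strings are assumptions made for illustration; the threshold values used in any deployment would come from its own configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Thresholds:
    warn: int
    block: int


@dataclass
class SessionDrift:
    """Cumulative counters for one session (hypothetical structure)."""
    limits: dict  # counter name -> Thresholds
    totals: dict = field(default_factory=dict)

    def record(self, pii: int = 0, urls: int = 0, nbytes: int = 0) -> str:
        """Add one call's counts and return 'allow', 'warn', or 'block'."""
        for name, delta in (("pii", pii), ("urls", urls), ("bytes", nbytes)):
            self.totals[name] = self.totals.get(name, 0) + delta
        verdict = "allow"
        for name, limit in self.limits.items():
            total = self.totals.get(name, 0)
            if total >= limit.block:
                return "block"
            if total >= limit.warn:
                verdict = "warn"
        return verdict
```

Because the totals accumulate across calls, a sequence of individually small leaks still crosses the block threshold, which is the behavior the paragraph above describes.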

Grounding verdict

A local scoring model checks every response against its retrieved context — async, so it never adds to the hot-path budget.
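The async pattern above can be sketched with `asyncio`: the response is returned first and the grounding verdict is logged afterwards. The scoring function here is a toy word-overlap stand-in for the local model, and the names `score_grounding` and `handle` are hypothetical.

```python
import asyncio


async def score_grounding(response: str, context: str) -> float:
    """Toy stand-in for the scoring model: share of response words found in context."""
    await asyncio.sleep(0)  # yield control, as a real model call would
    words = response.lower().split()
    known = set(context.lower().split())
    return sum(w in known for w in words) / len(words) if words else 1.0


async def handle(response: str, context: str, trace: list):
    """Return the response immediately; log the CHK-023 verdict in the background."""
    async def log_verdict() -> None:
        score = await score_grounding(response, context)
        trace.append(("CHK-023", round(score, 2)))

    task = asyncio.create_task(log_verdict())  # scheduled, not awaited
    trace.append(("response_sent", None))
    return response, task
```

The caller keeps the returned task so it can be awaited or tracked later. Since the verdict lands in the trace only after the response has been sent, it can inform review but cannot block that response.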

05

FAQ

What security engineers ask first.

See Shield on your own traffic.

Baseline profile. One hour to stand up. Zero traffic blocked until you graduate.