Five agents (Scammer, Victim, on-device Analyzer LLM, Bank Monitor, Regulator) run adversarial fraud episodes under structural information asymmetry. Two adapters are trained: the Analyzer (Qwen2.5-7B + LoRA, 8-rubric GRPO) reaches 99.3% detection at 6.7% FPR; the Scammer (Qwen2.5-0.5B + LoRA, adversarial GRPO) bypasses rules at a 93.75% rate, a 0.5B model beating 70B+ frontier LLMs at detector evasion.
/demo/
Interactive Gradio UI — replay curated episodes or score your own message.
GET /health
OpenEnv liveness probe. Returns {"status": "healthy"}.
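
A liveness check against this endpoint might look like the sketch below. Only the `/health` path and the `{"status": "healthy"}` body come from the description above; the base URL `http://localhost:8000` and the helper names `is_healthy` and `check_health` are assumptions for illustration.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if a /health response body reports status 'healthy'."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Probe GET /health. base_url is a placeholder, not the real deployment."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read().decode("utf-8"))
    except OSError:  # URLError, HTTPError and socket timeouts are OSError subclasses
        return False
```

Separating the body check from the network call keeps the parsing logic testable without a running server.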
GET /metadata
Environment metadata (action / observation schema, version).
GET /schema
Pydantic model JSON schemas for action and observation.
GET /leaderboard
Ranked submissions on chakravyuh-bench-v0.
GET /eval
v2 eval artifact — detection / FPR / F1 / per-difficulty breakdown.
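
For readers unfamiliar with the metrics named here, the following sketch shows how detection rate (recall), FPR and F1 are conventionally derived from confusion-matrix counts. The function name and the counts in the usage line are hypothetical and are not taken from the eval artifact; how the benchmark itself aggregates these values is not stated on this page.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    detection = tp / (tp + fn) if (tp + fn) else 0.0   # recall on scam messages
    fpr = fp / (fp + tn) if (fp + tn) else 0.0         # benign messages flagged as scams
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + detection
    f1 = 2 * precision * detection / denom if denom else 0.0
    return {"detection": detection, "fpr": fpr, "f1": f1}


# Illustrative counts only, not benchmark results.
example = detection_metrics(tp=8, fp=1, tn=9, fn=2)
```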
GET /eval/bootstrap
10k-iteration percentile bootstrap 95% confidence intervals.
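
A percentile bootstrap of this kind can be sketched as follows: resample the per-example outcomes with replacement, recompute the statistic each time, and read the interval off the sorted resampled values. This is a minimal illustration that assumes the statistic is a mean over 0/1 outcomes; the function name, the seed and the choice of statistic are assumptions, not the project's actual evaluation code.

```python
import random


def percentile_bootstrap_ci(outcomes, n_iter=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-example outcomes (e.g. 0/1)."""
    if not outcomes:
        raise ValueError("outcomes must be non-empty")
    rng = random.Random(seed)           # seeded for reproducible intervals
    n = len(outcomes)
    means = []
    for _ in range(n_iter):
        sample = rng.choices(outcomes, k=n)   # resample with replacement
        means.append(sum(sample) / n)
    means.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo = means[int((alpha / 2) * n_iter)]
    hi = means[min(n_iter - 1, int((1 - alpha / 2) * n_iter))]
    return lo, hi
```

With `alpha=0.05` the returned pair is a 95% interval; more iterations reduce Monte Carlo noise in the endpoints but do not narrow the interval itself, which is governed by the sample size.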
POST /diagnose
Score one message and get the full 8-rubric AnalyzerRubricV2 decomposition.
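
A call to this endpoint might be built as in the sketch below. The request field name `message`, the base URL and both helper names are assumptions; the actual request and response models are not given on this page, so check `/schema` or `/docs` before relying on them.

```python
import json
import urllib.request


def build_diagnose_request(base_url: str, message: str) -> urllib.request.Request:
    """Build a JSON POST to /diagnose. The 'message' field name is hypothetical."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/diagnose",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def diagnose(base_url: str, message: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response, whatever its shape is."""
    req = build_diagnose_request(base_url, message)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```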
/docs · /openapi.json
Interactive API explorer + OpenAPI 3.1 schema.