You must be an AI Tinkerers active member to view these talks and demos.
Private AI R&D: Fleet Routing, Security Evals, and Knowing When Your Models Fail
See how to build a private AI fleet on diverse hardware, with self-hosted routing and continuous security testing that demonstrates real-world OWASP LLM Top 10 risks.
I built a heterogeneous LLM inference fleet and evaluation harness for the R&D lab of a boutique AI and security consultancy. The fleet spans AMD, Nvidia, and Apple Silicon hardware and is routed through a self-hosted LiteLLM gateway with named capability lanes. It is tested continuously by Hermia, a local-first security eval TUI I wrote because existing tools weren’t built for this use case.
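To make the idea of named capability lanes concrete, here is a minimal Python sketch. It is not the fleet's actual configuration and does not use LiteLLM's API: the lane names, host names, URLs, and the `LaneRouter` class are all hypothetical, and round-robin selection is an assumed policy chosen only for illustration. The point is that a caller asks for a lane by name, and the router decides which machine serves the request.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class Backend:
    """One inference endpoint in the fleet (all values below are made up)."""
    name: str
    hardware: str  # e.g. "amd", "nvidia", "apple"
    base_url: str


class LaneRouter:
    """Maps a named capability lane to a round-robin over its backends."""

    def __init__(self, lanes: Dict[str, List[Backend]]) -> None:
        for lane, backends in lanes.items():
            if not backends:
                raise ValueError(f"lane {lane!r} has no backends")
        # One independent, endlessly repeating cursor per lane.
        self._cursors: Dict[str, Iterator[Backend]] = {
            lane: cycle(backends) for lane, backends in lanes.items()
        }

    def pick(self, lane: str) -> Backend:
        """Return the next backend for the lane, or raise KeyError."""
        if lane not in self._cursors:
            raise KeyError(f"unknown lane: {lane!r}")
        return next(self._cursors[lane])


# Hypothetical fleet: lane names and hosts are assumptions for illustration.
NVIDIA_BOX = Backend("nvidia-box", "nvidia", "http://nvidia-box.example:8000/v1")
AMD_BOX = Backend("amd-box", "amd", "http://amd-box.example:8000/v1")
MAC_STUDIO = Backend("mac-studio", "apple", "http://mac-studio.example:8080/v1")

FLEET: Dict[str, List[Backend]] = {
    "code": [NVIDIA_BOX, AMD_BOX],   # two machines share this lane
    "fast-chat": [MAC_STUDIO],       # a single machine serves this lane
}

if __name__ == "__main__":
    router = LaneRouter(FLEET)
    for lane in ("code", "code", "fast-chat"):
        backend = router.pick(lane)
        print(f"{lane} -> {backend.name} ({backend.hardware})")
```

Keeping callers on lane names rather than host names means hardware can be added, removed, or swapped behind a lane without touching client code, which is the property a gateway with capability lanes is meant to provide.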
For the demo: a live Hermia run against the fleet, a Grafana leaderboard updating in real time, and a walkthrough of what the security evals actually test and why the results differ across models.