Members-Only
Recent Talks & Demos are for members only
You must be an AI Tinkerers active member to view these talks and demos.
June 25, 2024
·
Los Angeles
Cortex: Hardware LLM Runtime
A technical walkthrough of hardware inference processes, exploring Nvidia and AMD tools, and an early demonstration of Cortex, an open‑source multi‑platform LLM runtime.
Video
Overview
We all know that AI has a hardware problem.
- What happens at a hardware level during inference?
- What exactly are all these tools in the inference ecosystems from Nvidia, AMD…?
- An early preview of Cortex, an open source tool that runs LLMs across multiple platforms
Here’s my cofounder Dan doing a similar talk, but I’m planning for this demo to be shorter & purely technical:
https://www.youtube.com/watch?v=orcPcUzSbOw&ab_channel=HackerHouseTW
Links
Local C++ platform runs GGUF models via OpenAI-compatible API.
Jan is an open-source local LLM inference engine providing an OpenAI-compatible API server.
Tech stack