Local AI.
lm-studio · llama.cpp · open-webui. No cloud required.
Own the stack, own the data. Models run locally on consumer hardware: NUCs, a Mac Mini, whatever is available. The extra latency is a fair trade when it means the data never leaves the room.
↓ build log
2026-03-20
Mistral 7B Q4_K_M running local — notes from the first week
Running Mistral 7B Q4_K_M via llama.cpp on a Mac Mini M2. Fast enough for daily use. Notes on what works, what doesn't, and what I actually use it for.
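
A minimal sketch of that setup through the llama-cpp-python bindings. The post runs llama.cpp itself; the bindings, GGUF filename, and parameters here are assumptions, not the exact invocation.

from llama_cpp import Llama

# hypothetical local path to the Q4_K_M quant
llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload every layer to Metal on Apple silicon
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "One-line summary of llama.cpp?"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])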
2026-01-15
LiteLLM proxy — one endpoint, rotated keys
LiteLLM proxy sitting in front of multiple LLM backends. Key rotation without touching application config. Worth it.
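
The payoff in practice: application code targets one OpenAI-compatible endpoint and never sees a provider key. A minimal sketch, assuming a proxy on localhost:4000; the virtual key and model alias are placeholders.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",  # the one endpoint the app knows about
    api_key="sk-virtual-key",          # proxy-issued; provider keys live on the proxy
)

resp = client.chat.completions.create(
    model="mistral-7b-local",  # alias mapped in the proxy config, not a provider name
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)

Rotate the upstream keys in the proxy config; the application code above never changes.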
→ sibling benches