
Local AI.

lm-studio · llama.cpp · open-webui. No cloud required.

Own the stack, own the data. Models run locally on consumer hardware: NUCs, a Mac Mini, whatever is available. The extra latency of local inference is a feature, not a bug, when it means your data never leaves the room.

2026-03-20

Mistral 7B Q4_K_M running locally — notes from the first week

Running Mistral 7B Q4_K_M via llama.cpp on a Mac Mini M2. Fast enough for daily use. Notes on what works, what doesn't, and what I actually use it for.
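For reference, a minimal sketch of that kind of setup via the llama-cpp-python bindings; the model path, context size, and prompt below are illustrative assumptions, not the exact configuration from the post:

from llama_cpp import Llama

# Load the quantized GGUF model. Path and parameters are placeholders,
# not the author's actual setup.
llm = Llama(
    model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # hypothetical path
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to Metal on Apple Silicon
    verbose=False,
)

# One chat-style completion against the local model.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize llama.cpp in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])

Everything above runs on the machine itself; no network call is involved.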

2026-01-15

LiteLLM proxy — one endpoint, rotated keys

LiteLLM proxy sitting in front of multiple LLM backends. Key rotation without touching application config. Worth it.
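The mechanism, roughly: the proxy exposes one OpenAI-compatible endpoint, applications authenticate with a proxy-issued virtual key, and the real provider keys live in the proxy config where they can be rotated without redeploying anything. A minimal sketch from the application side, assuming a proxy on localhost and the openai Python client; the endpoint, key, and model alias are placeholders:

from openai import OpenAI

# The application only ever sees the proxy endpoint and a proxy-issued
# virtual key; provider keys live in the proxy and rotate there.
# URL, port, key, and model alias are illustrative assumptions.
client = OpenAI(
    base_url="http://localhost:4000",       # hypothetical proxy address
    api_key="sk-virtual-key-from-litellm",  # virtual key, not a provider key
)

resp = client.chat.completions.create(
    model="local-mistral",  # alias mapped to a backend in the proxy config
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)

Swapping or rotating a backend key then happens entirely on the proxy side; this client code never changes.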