Wiki
The technical knowledge base for running frontier-adjacent AI on hardware you own.
Architecture
Capabilities
Local Embeddings & Semantic Search
Embeddings turn text into vectors so meaning can be compared by distance, enabling semantic search and retrieval without keyword matching. A small sentence-transformer running locally delivers this at zero marginal cost.
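As a minimal sketch of "meaning compared by distance", the snippet below ranks a few texts by cosine similarity to a query vector. The three-dimensional vectors are hand-made stand-ins, not output from a real model. In practice a local sentence-transformer would produce vectors with hundreds of dimensions. The names `cosine_similarity`, `search`, and `TOY_CORPUS` are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, corpus, top_k=2):
    """Return the top_k corpus texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:top_k]]


# Assumption: toy vectors standing in for real embeddings.
TOY_CORPUS = {
    "how to reset a router": [0.9, 0.1, 0.0],
    "fixing wifi connection drops": [0.8, 0.3, 0.1],
    "sourdough bread recipe": [0.0, 0.2, 0.9],
}

# Pretend embedding of a query such as "my internet keeps disconnecting".
query = [0.85, 0.2, 0.05]
results = search(query, TOY_CORPUS)
print(results)
```

The query shares no keywords with the networking entries, yet they rank above the unrelated recipe because their vectors point in a similar direction.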
Vision Models Run Locally (Qwen2.5-VL)
Qwen2.5-VL is an open vision-language model that reads images and answers questions about them. Run locally via MLX or LM Studio, it provides private, zero-marginal-cost image tagging, captioning, and visual analysis.
Concepts
Economics
Networking
Runtime
Local Inference on Apple Silicon (MLX)
MLX is Apple's array framework for machine learning on Apple Silicon, exploiting unified memory so a single M-series Mac can hold and serve large language and vision models with no discrete GPU. It is the runtime backbone of a Mac-based sovereign stack.
Quantization for Home Inference
Quantization shrinks a model's weights from 16-bit floats to lower-precision integers, cutting memory footprint several-fold (4-bit weights take roughly a quarter of the 16-bit size) so large models fit on consumer hardware. 4-bit is the workhorse precision for sovereign home setups.
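The arithmetic behind that saving can be sketched as follows. The first function estimates weight storage only. Real quantized formats also store per-group scales, and inference needs extra memory for activations and the KV cache, so actual usage is somewhat higher. The second part shows a simple symmetric 4-bit round trip with a single scale, a deliberately simplified scheme rather than what any particular runtime does. All function names here are hypothetical.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in gigabytes (weights only, no overhead)."""
    return num_params * bits_per_weight / 8 / 1e9


def quantize_int4(weights):
    """Symmetric 4-bit quantization with one shared scale (simplified sketch)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0  # map the largest weight to +/-7
    # Signed 4-bit integers cover the range -8..7.
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


# Assumption: a 7-billion-parameter model, used only as an example size.
PARAMS = 7_000_000_000
fp16_gb = weight_memory_gb(PARAMS, 16)
int4_gb = weight_memory_gb(PARAMS, 4)
print(f"16-bit: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")

weights = [0.7, -0.35, 0.1, 0.0]
codes, scale = quantize_int4(weights)
restored = dequantize(codes, scale)
print(codes, restored)
```

Each restored weight differs from the original by at most half a quantization step, which is the precision traded away for the smaller footprint.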