Search Results
Tag: LLM
Running MiniMax M2.5 Locally on NVIDIA DGX Spark
How I got a 230B-parameter open model running on desktop hardware using NVIDIA DGX Spark, Unsloth quantization, and llama.cpp - matching cloud API performance without cloud dependencies.
Running MiniMax M2.5 Locally with Claude Code
A quick how-to guide for connecting Claude Code to your local MiniMax M2.5 inference server.
Meet Reachy Mini: Building an AI-Powered Conference Badge Reader
How I built a fun conference booth experience combining an open-source robot, vision AI, and Python. Plus: exploring local LLMs as the next step.
Llama and DeepSeek with LibreChat for Conversational AI
A step-by-step guide: deploy Llama and DeepSeek LLMs using SGLang on DataCrunch, and integrate them with LibreChat for seamless conversations.
What are Large Language Models and Key Terminologies
Michael Mueller · 6 min read
Learn about Large Language Models (LLMs), the Transformer architecture, tokenization, embeddings, and their impact on natural language processing.