24 Feb 2026

Running MiniMax M2.5 Locally with Claude Code

By Michael Mueller

LLM

Claude Code

MiniMax

Local Inference

24 Feb 2026

TL;DR: Once you have MiniMax M2.5 running locally (see previous post), here's how to connect Claude Code to it.

If you've got MiniMax M2.5 running via the setup guide, connecting Claude Code is straightforward.

Basic Configuration

Edit your ~/.claude.settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://<your-server>:8080",
    "ANTHROPIC_AUTH_TOKEN": "any-placeholder-value",
    "ANTHROPIC_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_SMALL_FAST_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "MiniMax-M2.5",
    "CLAUDE_CODE_SUBAGENT_MODEL": "MiniMax-M2.5",
    "API_TIMEOUT_MS": "3000000",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
  }
}

Replace <your-server> with your inference server's address.

With Agent Teams and Hooks

If you use agent teams, skills, or hooks:

{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1",
    "ANTHROPIC_BASE_URL": "http://<your-server>:8080",
    "ANTHROPIC_AUTH_TOKEN": "local",
    "ANTHROPIC_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_SMALL_FAST_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "MiniMax-M2.5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "MiniMax-M2.5",
    "CLAUDE_CODE_SUBAGENT_MODEL": "MiniMax-M2.5",
    "API_TIMEOUT_MS": "3000000",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
  },
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "$HOME/.claude/hooks/skill-activation-prompt.sh"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Edit|MultiEdit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "$HOME/.claude/hooks/skill-verification-guard.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Edit|MultiEdit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "$HOME/.claude/hooks/post-tool-use-tracker.sh"
          }
        ]
      }
    ]
  },
  "skipDangerousModePermissionPrompt": true
}

Next Steps

Start Claude Code with your configuration
If prompted, log out and log back in to clear any warnings
You're running entirely on local hardware

This gives you a local, private inference endpoint that works with Claude Code - useful for analyzing sensitive codebases without cloud dependencies.

Resources

Table of Contents

Basic Configuration

With Agent Teams and Hooks

Next Steps

Featured Blogs

Running MiniMax M2.5 Locally on NVIDIA DGX Spark

How I ran a 230B parameter open model on desktop hardware with NVIDIA DGX Spark, Unsloth quantization, and llama.cpp at cloud API speed.

Wave: Bringing Determinism Back to AI-Assisted Development

Wave uses deterministic YAML pipelines with contract-based handoffs so quality checks run every time, instead of hoping the model remembers.

Lore: Shared Context Infrastructure for Claude Code

We built Lore so Claude Code knows your org's conventions and remembers yesterday, without anyone maintaining a giant CLAUDE.md.

Master the AI Native Transformation

174 patterns, 422 pages — #1 Bestseller From Cloud Native to AI Native is FREE for a limited time

Get it For Free!Get it For Free!

Featured Blogs

Running MiniMax M2.5 Locally on NVIDIA DGX Spark

How I ran a 230B parameter open model on desktop hardware with NVIDIA DGX Spark, Unsloth quantization, and llama.cpp at cloud API speed.

Wave: Bringing Determinism Back to AI-Assisted Development

Wave uses deterministic YAML pipelines with contract-based handoffs so quality checks run every time, instead of hoping the model remembers.

Lore: Shared Context Infrastructure for Claude Code

We built Lore so Claude Code knows your org's conventions and remembers yesterday, without anyone maintaining a giant CLAUDE.md.

Master the AI Native Transformation

174 patterns, 422 pages — #1 Bestseller From Cloud Native to AI Native is FREE for a limited time

Get it For Free!Get it For Free!

Continue Exploring

A Pattern Language for Transformation

Browse our interactive library of 119 transformation patterns. Each one describes a specific architectural problem and a tested way to solve it, so your team can talk about real tradeoffs instead of abstract ideas.

Learn MoreLearn More

Free AI Assessment

Take our free diagnostic to see where you stand and get a 90-day plan telling you exactly what to fix first.

Learn MoreLearn More

Join our community

We organize and sponsor engineering events across Europe. Come meet the people building this stuff.

Learn MoreLearn More

Running MiniMax M2.5 Locally with Claude Code

By Michael Mueller

Basic Configuration

With Agent Teams and Hooks

Next Steps

Resources

Running MiniMax M2.5 Locally on NVIDIA DGX Spark

Michael Mueller

Wave: Bringing Determinism Back to AI-Assisted Development

Michael Czechowski

Lore: Shared Context Infrastructure for Claude Code

Michael Mueller

Master the AI Native Transformation

Running MiniMax M2.5 Locally on NVIDIA DGX Spark

Michael Mueller

Wave: Bringing Determinism Back to AI-Assisted Development

Michael Czechowski

Lore: Shared Context Infrastructure for Claude Code

Michael Mueller

Master the AI Native Transformation

You Might Also Like

A Pattern Language for Transformation

Free AI Assessment

Join our community

Quick Links

Waves of Innovation