Use case

Your own AI coding assistant. No cloud, no per-token billing.

Claude Code, VS Code Copilot, and every other AI coding tool can point at any OpenAI-compatible endpoint. PicoCluster Claw gives you one — running on hardware you own, never leaving your desk.

The Jetson Orin Nano runs models like Granite and Qwen through Ollama with CUDA acceleration. ThreadWeaver is the chat interface. LiteLLM proxies every request so any tool that speaks OpenAI's API works immediately, no per-tool config required.

What you get

  • Private by default. Code never leaves your network. No usage logs, no third-party model providers, no API keys to rotate.
  • Always on. No session limits, no context window billing. Ask the same question ten times without watching a token counter.
  • Any client, one endpoint. Point Claude Code, VS Code, Cursor, or any OpenAI-compatible tool at http://claw.local and it works.
  • Real hardware, real benchmarks. We publish performance numbers for every model we test — token/sec, task pass rates, which models handle multi-step tool chains and which fall apart. See the benchmark results.

Pointing Claude Code at your cluster

# In your Claude Code settings (~/.claude/settings.json)
{
  "model": "granite4.1:8b",
  "apiBaseUrl": "http://claw.local/v1",
  "apiKey": "your-claw-token"
}

Any model loaded in Ollama on the Jetson is immediately available as a drop-in. Switch between Granite, Qwen, Nemotron, and others by changing the model name. No restart required.

Which model for coding?

Based on our benchmarks, these are the models we'd recommend for coding assistance on the Claw:

Model Strengths T1–T4 score
granite4.1:8b Structured output, reliable tool use 10/10
qwen2.5:7b Natural language, code generation 9/10
nemotron-3-nano:4b Fast, good reasoning 9/10

See the full benchmarks page for capability comparisons across 30 task types.

Running on your own cluster

# Flash and install in about 20 minutes
curl -sSL https://picocluster.com/install.sh | sh

# Check that inference is running
pc-status

# Pull any model
ollama pull granite4.1:8b

Next: 5-minute quickstart  ·  Benchmark results  ·  MCP tool servers