Open models inference
We run the best open coding models on our own GPUs, and we host the cloud agents that use them. Plug us into the harness you already use, or open a wired-up cloud agent from any device. Open stack. No lock-in.
Use it with the tools you already use
Why teams choose Umans
Infrastructure we own
We run open inference on our own GPUs and host the cloud agents that use them. Mostly in Europe, with US capacity for US customers.
Open models, by design
Open weights you can audit. The model that wrote your code is the same one anyone can inspect and run elsewhere.
Open stack, no lock-in
Plug our endpoint into the harness you already use with BYOK. Mix our inference with proprietary models. Walk away whenever.
Reach your agent from anywhere
Cloud agents are always-on and reachable from your laptop or phone. Same session, any device.
How it works
Two ways to use Umans
Use Umans from your tool
Three ways to plug in. Pick the one that fits your harness.
We're on models.dev. OpenCode, Zed and other modern tools list Umans as a provider out of the box.
umans claudeInstall the umans CLI, then `umans claude` powers Claude Code with our inference. Switch back to `claude` anytime.
Two env vars in any OpenAI- or Anthropic-compatible tool.
Run your agent in our cloud
One click. The agent runs on our infra, already wired up to our inference.
- Close your laptop
The agent keeps working. Come back later, the session is still there.
- Open it from your phone
Same session, any device. Walk in and out of the work.
- Sandboxed in its own box
The agent does its thing on isolated infra. Your local setup stays clean.
Pricing at a glance
- 200 effective requests per rolling 5h window
- Up to 3 concurrent requests
- Full open model lineup
- Unlimited tokens
- Up to 4 concurrent sessions
- Full open model lineup
- Pay per token via service keys
- Centralized usage and billing
- Role management for seats
Plans for individuals. Service keys for automations.
Cloud agents
Spin up a coding agent on our infra in one click. It lands wired up to our inference, with the umans CLI installed and configured. Reach it from your laptop or phone. Same session, any device.
What you can do with one
Set the goal, close your laptop. The agent keeps working through the night.
Open the agent on your phone. Walk through issues, ship the small fix.
Spin up a fresh box, try a wild change, throw it away. Your local stays clean.