Kilo launches KiloClaw, allowing anyone to deploy hosted OpenClaw agents into production in 60 seconds
In the rapidly evolving landscape of artificial intelligence, the distance between a developer’s idea and a functioning agent has historically been measured in hours of configuration, dependency conflicts, and terminal-induced headaches.
That friction point changed today. Kilo, the AI infrastructure startup backed by GitLab co-founder Sid Sijbrandij, has announced the general availability of KiloClaw, a fully managed service designed to deploy a production-ready OpenClaw agent in under 60 seconds.
By eliminating the “SSH, Docker, and YAML” barriers that have gatekept high-end AI agents, Kilo is betting that the next phase of software development—often called "vibe coding"—will be defined not just by the quality of a model, but by the reliability of the infrastructure that hosts it.
Technology: Re-engineering the agentic sandbox
OpenClaw has emerged as a viral phenomenon, amassing over 161,000 GitHub stars by offering a capability that many proprietary tools lack: the ability to actually perform tasks—controlling browsers, managing files, and connecting to over 50 chat platforms like Telegram and Signal.
However, as Kilo co-founder and CEO Scott Breitenother noted in an exclusive interview with VentureBeat, "OpenClaw itself isn't the hard part… getting it running is".
The technical architecture of KiloClaw is a departure from the "Mac Mini on a desk" model that many early adopters have relied on. Instead of requiring users to provision their own hardware or Virtual Private Servers (VPS), KiloClaw runs on a multi-tenant Virtual Machine (VM) architecture powered by Fly.io, a remote-first startup based in Chicago that offers a developer-focused public cloud. This setup provides a level of isolation and security that is difficult for individual developers to replicate.
"What we're doing is making KiloClaw the safest way to claw," Breitenother explained during the interview. "We have a virtual machine that is a hosted OpenClaw instance, and we're handling all that network security, sandboxing, and proxies that an enterprise company would require. We are essentially running multi-tenant, hosted OpenClaw".
To ensure security, KiloClaw utilizes two distinct proxies that sit outside the VM to manage traffic and protect the instance from the open internet. This prevents the common "user error" of accidentally exposing an agent’s API keys or leaving a local instance vulnerable to external attacks. "It's going to be better than [a local setup] in every single way," Breitenother asserted. "If you were to set it up yourself, you'd probably miss a setting and end up with it accidentally on the internet or exposing an API key".
Product: The 'mech suit' and the 3 am crash
A primary pain point for OpenClaw users is the “3 am crash”—the tendency for locally hosted Node.js processes to die silently overnight without health monitoring or auto-restart capabilities. KiloClaw addresses this with built-in process monitoring and a cloud-native "always on" state.
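To make the idea of process monitoring with auto-restart concrete, here is a minimal sketch in Python. It is not Kilo's implementation, which the company has not detailed; the function name supervise, its parameters, and the linear backoff policy are all assumptions chosen for illustration.

```python
import time
from typing import Callable


def supervise(run_once: Callable[[], int],
              max_restarts: int = 3,
              backoff_seconds: float = 1.0,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Run a process repeatedly until it exits cleanly or the budget runs out.

    run_once is a hypothetical hook that starts the process, blocks until it
    exits, and returns its exit code. The return value is the number of
    restarts that were performed.
    """
    restarts = 0
    while True:
        exit_code = run_once()
        if exit_code == 0:
            return restarts  # clean shutdown, nothing to restart
        if restarts >= max_restarts:
            return restarts  # budget exhausted; a real service would alert here
        restarts += 1
        sleep(backoff_seconds * restarts)  # wait a little longer after each crash
```

A caller could wrap a real process by passing something like lambda: subprocess.call([...]) as run_once; a managed service would presumably pair a loop like this with health checks and alerting.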
Unlike standard Kilo Code workflows, which spin up a terminal session only when a developer initiates a command, KiloClaw is persistent. "KiloClaw is just running and listening," said Breitenother. "It's always on, waiting for your WhatsApp message or your Slack message. It has to be always on. That's a different paradigm—always-on infrastructure to engage with".
This persistence allows for a suite of "agentic affordances" that Kilo calls an "exoskeleton for the mind":
Scheduled automations: Users can set cron jobs for the agent to perform research, monitor repositories, or generate reports while the human user is offline.
Persistent memory: Utilizing a "Memory Bank" system, the agent stores context in structured Markdown files within the repository, ensuring it retains the state of a project even if the underlying model is swapped.
Cross-platform command: The agent can be triggered from Slack, Telegram, or a terminal, maintaining a unified execution state across all entry points.
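The persistent memory idea above, context kept in Markdown files rather than inside any one model, can be sketched in a few lines of Python. The directory layout, the file naming, and the functions remember and recall are hypothetical; the actual Memory Bank format is not described here.

```python
import tempfile
from pathlib import Path


def remember(bank_dir: Path, topic: str, note: str) -> Path:
    """Append a bullet to <bank_dir>/<topic>.md, creating the file if needed."""
    bank_dir.mkdir(parents=True, exist_ok=True)
    path = bank_dir / f"{topic}.md"
    if not path.exists():
        path.write_text(f"# {topic}\n\n", encoding="utf-8")
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {note}\n")
    return path


def recall(bank_dir: Path) -> str:
    """Join every Markdown file so the same context can be handed to any model."""
    files = sorted(bank_dir.glob("*.md"))
    return "\n".join(f.read_text(encoding="utf-8") for f in files)


def demo() -> str:
    """Write two notes into a throwaway directory and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        bank = Path(tmp) / "memory-bank"  # hypothetical folder name
        remember(bank, "decisions", "Use PostgreSQL for storage")
        remember(bank, "decisions", "Deploy on Fridays is banned")
        return recall(bank)
```

Because the notes live in plain files, swapping the underlying model does not touch them: the next model simply receives the output of recall as context.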
Breitenother highlighted the shift in the developer’s role during the interview: "We've actually moved our engineers to be product owners. The time they freed up from writing code, they're actually doing much more thinking. They're setting the strategy for the product".
The “gateway” advantage: 500+ models, no lock-in
A core component of the KiloClaw architecture is its native integration with the Kilo Gateway. While the original OpenClaw was initially tied closely to Anthropic's models, KiloClaw allows users to toggle between over 500 different models from providers like OpenAI, Google, and MiniMax, as well as open-weight models like Qwen or GLM.
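One way to picture switching models per task is a small routing table that maps a kind of work to a model. The sketch below is illustrative only: the task labels and model identifiers are made up, and it does not reflect how the Kilo Gateway is actually configured.

```python
from typing import Dict

# Hypothetical identifiers; real model names would come from the gateway's catalog.
DEFAULT_MODEL = "budget-open-weight"
ROUTES: Dict[str, str] = {
    "complex": "frontier-large",
    "routine": "budget-open-weight",
}


def pick_model(task_kind: str,
               routes: Dict[str, str] = ROUTES,
               default: str = DEFAULT_MODEL) -> str:
    """Return the model configured for a task kind, falling back to the default."""
    return routes.get(task_kind, default)
```

Keeping the mapping in data rather than in code means a change of preferred model is a one-line edit to the table.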
"Your preferred model today may not be the same—and honestly shouldn't be the same—a month and a half from now," Breitenother said, emphasizing the speed of the industry. "You may want different models for different tasks. Maybe you use Opus for something complex, or you switch to a tighter-budget open-weight model for routine work".
This flexibility is supported by Kilo's transparent pricing model. The company offers "zero markup" on AI tokens, charging users the exact API rates provided by the model vendors. For power users, this is managed through Kilo Pass, a subscription tier that provides bonus credits (e.g., $199/month for $278.60 in credits) to subsidize high-volume agentic work.
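The Kilo Pass example works out to a 40% bonus. A short Python check of that arithmetic, using only the two figures quoted above (the function names are illustrative):

```python
from decimal import Decimal


def bonus_rate(price: Decimal, credits: Decimal) -> Decimal:
    """Fraction of extra credit received on top of the price paid."""
    return (credits - price) / price


def cost_per_credit_dollar(price: Decimal, credits: Decimal) -> Decimal:
    """Dollars actually paid for each dollar of credit."""
    return price / credits


price = Decimal("199")
credits = Decimal("278.60")
rate = bonus_rate(price, credits)                  # 79.60 / 199 = 0.4
effective = cost_per_credit_dollar(price, credits)  # roughly 0.714
```

In other words, at this tier each dollar of token spend costs a subscriber about 71 cents, before any per-token markup, which Kilo says is zero.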
How to get started with KiloClaw right now
Sign in or register: Navigate to the Kilo Code application on the web (desktop) at app.kilo.ai and sign in using your existing account. Kilo supports several authentication methods, including GitHub and Google OAuth.
Create your instance: Select the "Claw" tab from the side navigation menu to access the KiloClaw dashboard. Click the "Create Instance" button to begin provisioning your agent.
Choose your model: Select a default AI model to power your agent from the dropdown menu. Users can choose from a wide array of options, including free (for the time being) models like MiniMax.
Configure messaging channels (optional): During setup, you can connect your agent to Discord, Telegram, or Slack and communicate with it directly over those channels instead of on the Kilo Code website. To move faster, you can skip this step and add the bot keys for these channels later in the instance settings.
Provision and start: Click "Create and Provision" to set up your virtual machine. Once the instance is provisioned, click "Start" to boot the agent, which typically takes only a few seconds.
Verify and access: Click the "Open" button to enter the OpenClaw interface. For security, you will need to click "Access Code" to generate a one-time verification token that validates your device for the first time.
Begin vibe coding: Once verified, you can begin interacting with your agent directly in the chat interface. The agent will remain running 24/7 on a dedicated virtual machine, listening for commands across all connected platforms.
According to Brendan O'Leary, Developer Relations at Kilo Code and former Developer Evangelist at GitLab, users unsure which model to select should consult PinchBench, an open-source benchmarking tool developed to evaluate models on 23 real-world agentic tasks, such as email sorting and blog post generation.
Benchmarking the agentic era: the launch of PinchBench, a new open-source benchmarking suite specifically for Claw tasks
To help developers navigate the choice between 500+ models, Kilo has also released PinchBench, an open-source benchmark specifically for agentic workloads.
While traditional benchmarks like MMLU or HumanEval test chat prompts in isolation, PinchBench tests agents on 23 real-world, multi-step tasks such as calendar management and multi-source research.
The project was spearheaded by O'Leary, who noted during a demonstration that the benchmark was "kind of inspired by… other little kind of fun benches" like those created by developer YouTuber Theo Browne (@t3dotgg), CEO/Founder of Ping Labs.
O'Leary explained that while existing benchmarks are often highly specialized, he wanted a way to "benchmark the kind of things that we asked OpenClaw to do".
He has personally run the benchmark "hundreds and hundreds of times against OpenClaw" to ensure its accuracy. Taking a page out of Browne's book (er, video playbook?), he also launched a YouTube series, fittingly titled "Will It Claw?", to find out whether KiloClaw can handle various tasks.
To maintain high standards of evaluation for subjective tasks like writing blog posts, O'Leary designed a system where a high-end "judge model"—specifically Claude 4.5 Opus—is used to grade the output of other models. "We actually have… not the model under test, but always Opus… [judge] the output of each of the models," O'Leary stated, adding that the judge model even provides specific notes on execution quality.
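The pattern O'Leary describes, one fixed judge grading every candidate and never the model under test grading itself, can be sketched as follows. This is not PinchBench's code: the Verdict type, grade_all, and the toy judge (a stand-in for a real call to the judge model) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Verdict:
    score: float  # 0.0 (fail) to 1.0 (pass)
    notes: str    # the judge's comments on execution quality


def grade_all(outputs: Dict[str, str],
              judge: Callable[[str], Verdict]) -> Dict[str, Verdict]:
    """Grade every candidate's output with the same judge."""
    return {model: judge(text) for model, text in outputs.items()}


def toy_judge(text: str) -> Verdict:
    """Stand-in for the judge model: rewards blog posts that open with a title."""
    if text.lstrip().startswith("#"):
        return Verdict(score=1.0, notes="has title")
    return Verdict(score=0.5, notes="missing title")
```

Holding the judge constant is what makes the scores comparable across models; only the outputs being judged change.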
The benchmark allows users to view a scatter plot comparing "Cost to Intelligence," identifying which models offer the highest proficiency for the lowest price. This specific visualization is a priority for O'Leary, who noted it is "my favorite graph for looking at models… how much do you spend versus how much is the success rate".
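Reading a cost-versus-success scatter plot usually comes down to finding the models that no other model beats on both axes at once. A minimal Python sketch of that filter follows; the model names and numbers are invented for illustration and are not PinchBench results.

```python
from typing import Dict, List, Tuple


def pareto_frontier(results: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return models that are not dominated on (cost, success rate).

    A model is dominated when another model costs no more and succeeds at
    least as often, and is strictly better on at least one of the two.
    """
    frontier = []
    for name, (cost, success) in results.items():
        dominated = any(
            c <= cost and s >= success and (c < cost or s > success)
            for other, (c, s) in results.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier)


# Made-up data: (cost in dollars per run, fraction of tasks passed).
example = {
    "model-a": (0.50, 0.90),
    "model-b": (0.10, 0.70),
    "model-c": (0.60, 0.85),
    "model-d": (0.10, 0.60),
}
best = pareto_frontier(example)
```

With this sample data, model-c drops out because model-a is both cheaper and more successful, and model-d drops out because model-b passes more tasks at the same cost.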
For those who prefer to host their own infrastructure, O'Leary has made the process entirely transparent, providing a "skill file that people can download" so they can "benchmark their own OpenClaw instance" independently.
"We're doing this work anyway to know which defaults we should recommend," Breitenother added in a separate interview. "We decided to open source it because the individual developer shouldn't have to think about which model is best for the job. We want to give people more and more information".
O'Leary expanded on this philosophy, describing the benchmark as being "kind of like the Olympics in a lot of ways," where tasks range from "very objectively graded" to those requiring a more nuanced assessment.
Industry context: Distinguishing from the growing OpenClaw family of offshoots
KiloClaw enters a market increasingly crowded with OpenClaw variants. Projects like Nanoclaw have gained traction for being lightweight, while companies like Runlayer have targeted the enterprise "Virtual Private Server" niche.
However, Kilo distinguishes itself by refusing to "fork" the code. "It’s not a fork, and that’s what’s important," Breitenother stated. "OpenClaw moves so quickly that we are hosting the actual OpenClaw [version]. It is literally OpenClaw on a really well-tuned, well-set-up managed virtual machine".
This ensures that as the core OpenClaw project evolves, KiloClaw users receive updates automatically without manual "git pull" operations.
This "open core" philosophy extends to the licensing. While KiloClaw is a paid hosted service, the underlying Kilo CLI and core extensions remain MIT-licensed. This allows for community auditing—a critical feature for security-conscious enterprises.
Conclusion: toward an agentic future
The launch of KiloClaw marks a strategic move by Kilo to expand its user base beyond "wonky" developers to enterprise managers and non-technical professionals. By offering a "one-click" path to a production agent, the company is attempting to democratize the "magical moments" of AI.
According to a release provided to VentureBeat by Kilo ahead of the launch, in the first two weeks, more than 3,500 developers joined the waitlist. These early adopters have been "really pushing KiloClaw in all kinds of directions," using it to automate everything from Discord management to repository maintenance.
"Our mission is to build the best all-in-one AI work platform," Breitenother concluded. "Whether you are a developer, a product manager, or a data engineer, we want all of these personas to experience the magic of the exoskeleton for the mind".
KiloClaw is available now, offering 7 days of free compute for all new users. With thousands of developers already having cleared the waitlist, the era of the managed AI agent appears to have arrived—no Mac Mini required.
