CloneDex: Rebuilding OpenAI Codex with VibeKit
by Alan Zabihi, Co-founder & CEO of Superagent Technologies
25 June 2025
OpenAI Codex is one of the most thoughtfully built developer tools out there.
It doesn't just call a language model. It runs coding tasks: spins up a sandbox, installs dependencies, modifies files, runs tests, and opens a pull request. The interface is simple, but the system behind it has real structure: ephemeral containers, Git integration, asynchronous task handling, and safe code execution.
We wanted to understand that structure — and show that you can build something similar using open tools.
CloneDex is our attempt to do that in public. It's a Codex-style agent built with VibeKit, E2B, Inngest, and GitHub OAuth. The user types a task. The agent does the work in a sandbox. The result is a real pull request.
What we're replicating
Codex isn't just inference. It's orchestration.
From a simple input like "add Stripe checkout," it figures out what to do, sets up a container, runs commands, updates files, installs packages, runs tests, and pushes the result to GitHub. It doesn't require the user to copy and paste anything, or manage state across multiple runs. It turns a prompt into a structured coding workflow.
That's the part we wanted to understand. Not how to call a model — how to turn that call into something concrete.
What VibeKit does
You could build something like CloneDex without VibeKit. OpenAI clearly did. But they also had to build everything themselves: container orchestration, shell execution, file syncing, agent logic, GitHub integration, error handling, telemetry.
VibeKit gives you that layer out of the box. It manages:
- Container lifecycle: create, pause, resume, destroy
- Agent execution: run commands, stream logs, collect output
- GitHub automation: clone, branch, commit, pull request
You control the task logic. VibeKit handles the environment that task runs in.
Example: Running a Coding Agent in a Sandbox
Here's how you can use VibeKit to run a coding agent (like Codex) in a secure sandbox and create a pull request:
    import { VibeKit } from "@vibekit/core";
    import { VibeKitConfig } from "@vibekit/core/types";

    const config: VibeKitConfig = {
      agent: {
        type: "codex",
        model: {
          provider: "openai",
          apiKey: process.env.OPENAI_API_KEY!,
          name: "gpt-4.1-codex"
        }
      },
      environment: {
        e2b: {
          apiKey: process.env.E2B_API_KEY!,
          templateId: "nodejs"
        }
      },
      github: {
        token: process.env.GITHUB_TOKEN!,
        repository: "your/repo"
      }
    };

    async function main() {
      const vibekit = new VibeKit(config);

      // Run the agent against the prompt inside the sandbox.
      await vibekit.generateCode({
        prompt: "Add a new endpoint to fetch user data",
        mode: "code"
      });

      // Push the resulting changes and open a pull request.
      const pr = await vibekit.createPullRequest();
      console.log("Pull request created:", pr.html_url);
    }

    main().catch(console.error);
How CloneDex works
CloneDex is a real coding agent. It takes a plain-language task from the user, runs it inside a secure container, and opens a working pull request with the result. There's no demo shell or fake eval — the agent installs packages, writes code, runs tests, and pushes changes to GitHub. Here's how that actually happens behind the scenes.
When a user submits a task in the UI, the request hits your backend and gets queued as a background job using Inngest. This is where everything starts: the backend now owns the job lifecycle, and the user just watches the task progress in real time.
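The hand-off described above can be sketched without any queueing library. What follows is a minimal in-memory stand-in for the background job, not Inngest's API: `JobStore`, `enqueue`, `runNext`, and the status names are hypothetical and exist only for illustration. It shows the backend owning the job status while the UI only reads it.

```typescript
// Hypothetical in-memory stand-in for a background job queue.
// A real deployment would hand this work to Inngest instead.
type JobStatus = "queued" | "running" | "succeeded" | "failed";

interface Job {
  id: string;
  prompt: string;
  status: JobStatus;
  error?: string;
}

class JobStore {
  private jobs = new Map<string, Job>();
  private pending: string[] = [];
  private nextId = 1;

  // Called by the request handler: record the task and return immediately.
  enqueue(prompt: string): Job {
    const job: Job = { id: `job-${this.nextId++}`, prompt, status: "queued" };
    this.jobs.set(job.id, job);
    this.pending.push(job.id);
    return job;
  }

  // Called by the worker: run the oldest queued job with the given handler.
  async runNext(handler: (job: Job) => Promise<void>): Promise<Job | undefined> {
    const id = this.pending.shift();
    if (id === undefined) return undefined;
    const job = this.jobs.get(id)!;
    job.status = "running";
    try {
      await handler(job);
      job.status = "succeeded";
    } catch (err) {
      job.status = "failed";
      job.error = err instanceof Error ? err.message : String(err);
    }
    return job;
  }

  // Called by the UI to read progress.
  get(id: string): Job | undefined {
    return this.jobs.get(id);
  }
}
```

In this sketch the request handler would call `enqueue` and return the job id, and the frontend would read `get(id)` (or subscribe to updates) while a worker drains the queue with `runNext`.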
Inside the job handler, you call into VibeKit. You pass in the user's prompt, a GitHub token, repo metadata, and any secrets the agent might need (like API keys). From that point on, VibeKit handles the entire agent run: spinning up a container, invoking the model, executing shell commands, writing files, and opening a pull request.
The container is created via E2B. It's a clean environment that behaves like a temporary developer workspace — one that can install packages, run shell commands, access secrets, and be safely thrown away at the end. VibeKit injects the repo by cloning it directly inside the sandbox. A new branch is created automatically.
Once the environment is set, VibeKit constructs a system prompt. This includes high-level context (project structure, key files, possibly some config) along with the user's request. The agent then runs (usually GPT-4.1 or Claude Sonnet 4, but the SDK supports multiple providers) and returns a plan for what to do. This often includes shell commands like npm install, file edits, or test invocations.
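As a rough illustration of the prompt assembly described above, here is a sketch that combines a file listing, a few key files, and the user's request into one string. `buildSystemPrompt`, `PromptContext`, and the wording of the prompt are all hypothetical; this post does not show VibeKit's actual prompt format.

```typescript
// Hypothetical shape of the context gathered from the cloned repo.
interface PromptContext {
  fileTree: string[];                // relative paths in the repo
  keyFiles: Record<string, string>;  // path -> contents, e.g. package.json
  request: string;                   // the user's plain-language task
}

// Assemble one prompt string: project structure, key files, then the task.
function buildSystemPrompt(ctx: PromptContext): string {
  const tree = ctx.fileTree.map((p) => `- ${p}`).join("\n");
  const files = Object.keys(ctx.keyFiles)
    .map((path) => `--- ${path} ---\n${ctx.keyFiles[path]}`)
    .join("\n\n");
  return [
    "You are a coding agent working inside a sandboxed checkout of this repository.",
    "Project structure:",
    tree,
    "Key files:",
    files,
    "Task:",
    ctx.request.trim(),
  ].join("\n\n");
}
```

Putting the task last keeps the user's request closest to where the model starts generating. That ordering is a design choice in this sketch, not something the SDK is known to require.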
At this point, VibeKit takes over execution. It runs the agent's commands one by one inside the container. Logs are streamed in real time. If something fails — bad install, broken test — you see it. If everything works, VibeKit creates a commit with the changes and pushes the branch to GitHub. Then it opens a pull request.
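The "one by one, stop when something fails" behavior can be sketched as a small loop. The executor is injected, so the sketch makes no claim about the sandbox API; `runPlan`, `Exec`, and `PlanOutcome` are hypothetical names, and halting on the first non-zero exit code is an assumption about the failure policy.

```typescript
interface CommandResult {
  command: string;
  exitCode: number;
  output: string;
}

// Hypothetical executor: in a real run this would call into the sandbox.
type Exec = (command: string) => Promise<{ exitCode: number; output: string }>;

interface PlanOutcome {
  ok: boolean;
  results: CommandResult[]; // every command that actually ran
  failedAt?: string;        // the command that stopped the run, if any
}

async function runPlan(
  commands: string[],
  exec: Exec,
  onLog: (line: string) => void = () => {},
): Promise<PlanOutcome> {
  const results: CommandResult[] = [];
  for (const command of commands) {
    onLog(`$ ${command}`);
    const { exitCode, output } = await exec(command);
    if (output) onLog(output);
    results.push({ command, exitCode, output });
    // Assumed policy: stop at the first failure so later steps
    // (such as committing) never run on a broken state.
    if (exitCode !== 0) {
      return { ok: false, results, failedAt: command };
    }
  }
  return { ok: true, results };
}
```

A caller would commit and push only when `ok` is true, and would show `failedAt` along with the collected output otherwise.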
Example: Streaming Logs and Real-Time Output
You can stream logs from the agent run to provide real-time feedback in your UI or CLI:
    import { VibeKit } from "@vibekit/core";

    // ...config as above...
    const vibekit = new VibeKit(config);

    await vibekit.generateCode({
      prompt: "Add a health check endpoint",
      mode: "code",
      callbacks: {
        // Forward each log chunk as it arrives.
        onUpdate: (log) => process.stdout.write(log),
        onError: (err) => console.error("Error:", err)
      }
    });
Throughout the process, VibeKit emits structured events. These include container status (starting, ready, done), agent phase (cloning repo, running tests, committing changes), command logs, and error states. You decide how to surface that to the user.
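One way to model the event categories listed above is a discriminated union, which lets the UI handle each kind exhaustively. The type and field names here are hypothetical, and the status and phase values are only the examples named in this post; the actual VibeKit event payloads may differ.

```typescript
// Hypothetical event shapes, based only on the categories named in the text.
type AgentEvent =
  | { kind: "container"; status: "starting" | "ready" | "done" }
  | { kind: "phase"; phase: "cloning repo" | "running tests" | "committing changes" }
  | { kind: "log"; command: string; line: string }
  | { kind: "error"; message: string };

// Turn an event into a single line a UI or CLI could display.
function describeEvent(event: AgentEvent): string {
  switch (event.kind) {
    case "container":
      return `Sandbox ${event.status}`;
    case "phase":
      return `Agent: ${event.phase}`;
    case "log":
      return `[${event.command}] ${event.line}`;
    case "error":
      return `Error: ${event.message}`;
  }
}
```

Because the switch covers every `kind`, adding a new event type later produces a compile-time error in `describeEvent` until the new case is handled.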
Once the task is complete, your frontend gets the result: pull request URL, summary, commit hash. You can display it however you want — a success screen, a toast, a GitHub link.
CloneDex isn't just a model call with pretty output. It's a full agent runtime that produces real results. You give it a prompt. It gives you a working pull request. Everything in between — the container, the commands, the code, and the cleanup — is handled for you.
Why not just call the model?
You can call GPT-4 or Claude and get a response. That gets you code — but not execution. You still need to figure out:
- Where to run the code
- How to install what it needs
- How to commit or revert the result
VibeKit runs the task in a real container, with secrets, GitHub access, and a clean lifecycle. You don't have to build the bridge between model output and actual system change.
What "safe execution" actually means
It's not just about using a sandbox. It's about controlling the lifecycle: knowing when it started, what it ran, what succeeded, and what failed. It means being able to inject secrets, install packages, and track outcomes across retries.
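Tracking outcomes across retries can be as simple as recording every attempt rather than only the final result. The sketch below is a generic retry wrapper, not part of VibeKit; `runWithRetries` and the `Attempt` record are hypothetical names, and retrying immediately without backoff is a simplification.

```typescript
interface Attempt {
  attempt: number;    // 1-based attempt counter
  startedAt: number;  // timestamp from the injected clock
  finishedAt: number;
  ok: boolean;
  error?: string;
}

// Run `task` up to `maxAttempts` times, keeping a record of each attempt.
async function runWithRetries<T>(
  task: (attempt: number) => Promise<T>,
  maxAttempts: number,
  now: () => number = Date.now,
): Promise<{ value?: T; attempts: Attempt[] }> {
  const attempts: Attempt[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const startedAt = now();
    try {
      const value = await task(attempt);
      attempts.push({ attempt, startedAt, finishedAt: now(), ok: true });
      return { value, attempts };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      attempts.push({ attempt, startedAt, finishedAt: now(), ok: false, error });
    }
  }
  // Every attempt failed: return the history without a value.
  return { attempts };
}
```

The returned `attempts` array answers the questions raised above (when a run started, what failed, and what eventually succeeded), whatever the final outcome.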
That's the kind of structure coding agents need. VibeKit gives it to them without requiring you to wire it all yourself.
🗣️ Ismail Pelaseyed (Co-founder & CTO of Superagent.sh): "The agent needs its own world. Somewhere it can install stuff, run code, and crash safely. That's what the sandbox gives it."
You can build your own
CloneDex isn't a product. It's a reference implementation.
You can fork it and use it as a starting point for:
- Onboarding flows with agent-generated PRs
- GitHub bots that handle repetitive coding tasks
- Internal tools that delegate work to agents in sandboxes
🗣️ Ismail: "If you understand CloneDex, you can build your own Codex. Not a copy. Your version, your constraints."
Try it
Demo: clonedex.vercel.app
Source: github.com/superagent-ai/vibekit/tree/main/examples/codex-clone
SDK: github.com/superagent-ai/vibekit
Codex is a working blueprint for the future of coding agents. CloneDex is our way of showing how that blueprint works — and how you can build it yourself.
Let us know what you build.