✦ Every top AI · on your own box

The whole crew,
on your land.

Claude, GPT, Gemini, Perplexity and Grok — one app, working together on a private box that's yours. They cross-check each other, remember your projects, and do the real work. One flat price, and no one but you ever sees a word.
Runs in your own private cloud · Your subscription · your data
Trail Boss · Crew is working
You
Which database for a read-heavy app — Postgres or Mongo?
Claude · GPT · Gemini · Perplexity · Grok
Trail Boss · roped the crew
All five answered. Postgres wins for your read pattern — GPT nailed the schema, Perplexity confirmed the pricing's current, and here's the one trade-off Grok flagged that the others missed…
Ask the crew
✦ The complaint, and the mechanism

Everything wrong with AI coding tools. And what we do about it.

The complaints people actually have about AI and AI coding tools, and the specific thing Wetlether does about each. Read the gripe. Read the fix.
What it does to your AI bill
“The token meter never stops, and I never know what my monthly AI bill will be.”
Claude and GPT run against your flat subscriptions — the metered API is never touched.
Wetlether drives Claude and GPT the way the desktop apps do: an interactive session, not per-token API calls. Interactive use draws on your flat Claude Max / ChatGPT plan; the metered path is deliberately never hit (Claude runs with no API key set, because a set key is what triggers metering). Nothing accrues per token and we add no markup. Add Gemini, Perplexity, or Grok on your own API keys; those bill per use at the provider's rate, not ours.
“An agent looping overnight could rack up a terrifying API bill.”
On Claude and GPT it runs against your flat plan — a long run can't meter you into a hole.
A task that runs for hours on Claude or GPT ticks no meter, because it's on your flat subscription, not per-token API. Any engine you add on your own key bills at that provider's rate — but the workhorse can't surprise you.
“I'm paying for ChatGPT and Claude and Gemini separately and barely using each.”
Run the ones you own on their subscriptions; add the rest on your keys — one app, not five.
Claude and GPT run on the plans you already pay for. Gemini, Perplexity and Grok connect with your own keys. Five engines in one place instead of five tabs, five logins, five seats.
“I hit my usage limit mid-task and I'm locked out for hours.”
Cap out on one engine, keep going on another.
When one provider walls you off, the others you've connected are right there to take the rest of the job. It doesn't lift any single limit — it means one limit isn't the end of your afternoon.
Can you actually trust the answer
“It's almost right, but not quite, and I lose the time I saved catching it.”
Several engines answer; a blinded judge merges them into one.
Stampede runs every engine side by side. Trail Boss adds a synthesis judge that reads them all with the labels hidden and writes one answer — and it rotates which model judges, because a Claude judge favors Claude even blind. Where they agree, move; where they split, that's the exact spot to check.
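Here is roughly what blind judging with a rotating judge looks like in code. It's a minimal sketch, not Wetlether's implementation: the function names `blind_answers` and `pick_judge`, the lowercase engine keys, and the plain round-robin rotation are all assumptions made for illustration.

```python
import random


def blind_answers(answers, seed=None):
    """Hide engine labels and shuffle the order (hypothetical helper).

    answers: dict mapping engine name -> answer text.
    Returns (blinded, key):
      blinded: list of (anonymous_label, text) shown to the judge.
      key:     dict anonymous_label -> engine name, withheld from the judge.
    """
    rng = random.Random(seed)
    engines = list(answers)
    rng.shuffle(engines)  # position must not reveal which engine wrote what
    blinded, key = [], {}
    for i, engine in enumerate(engines):
        label = f"Answer {chr(ord('A') + i)}"
        blinded.append((label, answers[engine]))
        key[label] = engine
    return blinded, key


def pick_judge(engines, turn):
    """Round-robin rotation so no single model judges every question."""
    return engines[turn % len(engines)]


# Example data is made up for the sketch.
answers = {
    "claude": "Use Postgres.",
    "gpt": "Use Postgres with read replicas.",
    "gemini": "Either works for this load.",
}
blinded, key = blind_answers(answers, seed=7)
judges = [pick_judge(sorted(answers), turn) for turn in range(4)]
```

The judge only ever receives `blinded`; `key` is used afterwards to map the verdict back to engines. Rotation limits a judge's bias toward its own style, but it does not remove it.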
“I don't trust the AI to be right, so I end up re-checking everything it hands me.”
Agreement across independent models is a signal one model can't fake.
A lone model sounds equally confident when it's right and when it's wrong, so its confidence tells you nothing. Wetlether runs engines trained by different companies on the same question and shows where they land — agreement is real corroboration; a split is the precise place to look before you ship.
“It invents libraries and API calls that don't exist.”
The other engines don't corroborate the made-up ones — and your box can check.
Ask the crew and a hallucinated package fails to show up in the other answers, so it stands out instead of sliding through. And because it runs on your own box, it can try to actually resolve the import and catch it.
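For a Python project, "try to actually resolve the import" can be as simple as asking the interpreter whether it can find the module. A minimal sketch, assuming the suggested module names have already been pulled out of the answer; `unresolved_imports` and the sample names are hypothetical. It only checks that a module can be located, not that a particular function exists inside it.

```python
import importlib.util


def unresolved_imports(module_names):
    """Return the names that cannot be found in the current environment."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # A dotted name with a missing parent package raises instead of
            # returning None; treat that as unresolved too.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Sample input: two standard-library modules and one invented package name.
suggested = ["json", "hashlib", "totally_made_up_pkg_xyz"]
missing = unresolved_imports(suggested)
print(missing)
```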
“It just agrees with whatever I say.”
Shootout makes the engines argue, with a neutral judge.
Shootout runs two or more engines through rounds of structured debate on your question — each sees the last round and has to push back — then a separate model judges. Sycophancy doesn't survive an opponent whose whole job is to disagree.
“Every model claims it's the best; benchmarks are gamed and I can't tell.”
You get data on which engine wins which kind of question — refined by your own votes.
The Research tab tracks which engine tends to win which type of question, tuned by your votes over time. 'Which is best' becomes your measured experience on your own work, not a leaderboard someone optimized for.
Babysitting, and losing your afternoon to it
“It stops on a yes/no prompt every few minutes; I can't walk away from the keyboard.”
It runs the safe steps itself and only escalates the risky ones to a tap on your phone.
A permission hook reads each action before it happens: read-only tools (search, reading files) run automatically, obviously destructive shell is auto-denied, and only the genuinely risky steps become a tap-to-approve card on your phone. It waits — and defaults to 'deny' if you never answer — so you can start a job and leave.
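The three-way split described above (run, deny, or ask, with silence treated as "deny") can be sketched like this. It's an illustration under stated assumptions, not Wetlether's actual policy: the tool names, the list of destructive command prefixes, and the `"approve"` reply string are all hypothetical.

```python
import shlex

# Hypothetical names and patterns, chosen only for the example.
READ_ONLY_TOOLS = {"search", "read_file"}
DESTRUCTIVE_PREFIXES = (("rm", "-rf"), ("mkfs",), ("dd",))


def classify(tool, command=""):
    """Return 'allow', 'deny', or 'ask' for a proposed action."""
    if tool in READ_ONLY_TOOLS:
        return "allow"
    if tool == "shell":
        try:
            argv = tuple(shlex.split(command))
        except ValueError:
            return "deny"  # unparseable shell input: fail closed
        if any(argv[:len(prefix)] == prefix for prefix in DESTRUCTIVE_PREFIXES):
            return "deny"
    return "ask"  # everything else goes to the phone for approval


def resolve(decision, user_reply=None):
    """Combine the classification with the phone reply, if any.

    An 'ask' with no reply (None) resolves to 'deny'.
    """
    if decision == "ask":
        return "allow" if user_reply == "approve" else "deny"
    return decision


verdicts = [
    resolve(classify("read_file")),
    resolve(classify("shell", "rm -rf /")),
    resolve(classify("shell", "git push origin main")),             # nobody answered
    resolve(classify("shell", "git push origin main"), "approve"),
]
```

A real policy would need far more than prefix matching to recognize destructive commands; the point here is only the shape of the decision and the deny-by-default timeout.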
“If I close my laptop, the long task dies.”
It runs in your cloud, not on your laptop.
The box is an always-on server in your own cloud account. A long job keeps running when you shut the lid or pocket the phone, and pings you when it needs a decision or when it's done. Start it at your desk, approve the one risky step from your phone on the train.
“I can't tell if it's working or silently hung, so I hover over it.”
It shows live activity and pings you — silence never means guessing.
While a turn runs you see real, moving activity, not a frozen screen, and a long job notifies you when it needs you or finishes. You're not stuck watching a spinner deciding whether to kill it.
“Supervising the agent is reactive work: it keeps pulling me back for micro-checks.”
You set how much rope it gets, once — the Regulator.
The Regulator dials how long and deep answers run and how big a crew works the question; the permission policy handles interruptions automatically. You tune it to your risk tolerance instead of re-answering the same prompt all day.
It forgets you
“Every new chat, I re-explain my whole project from scratch.”
It keeps durable notes about your work on your box and reads them back later.
Every conversation is saved as a full transcript on your own box, and the facts it records about your projects and how you work sit in a memory file it re-reads on later turns. A new session isn't a blank slate — it starts with what it already knows about you.
“My conversations disappear into the void; I can't find what I worked out last month.”
Full-text search across every conversation, all stored on your box.
Every transcript is kept on your box and indexed. Search finds any word across all of them and reopens the conversation rendered like the live chat — nothing evaporates, and a one-tap compact trims the live window when it gets heavy.
“Switch models and you start over; every tool has its own memory silo.”
The history lives on your box, not inside one vendor's account.
Because your transcripts and notes sit on your own box, the context isn't trapped in a single provider's product. Change which engine answers and it's still working from the history on your disk.
It talks; it doesn't do the work
“It gives me a plan, but I still have to run everything myself.”
It's a real coding agent on your box — it runs, edits, and ships.
Not a chat wrapper: a live Claude Code session on your own server. It runs code and tests, reads and writes files, uses tools, and deploys — the things you'd do in a desktop terminal, driven from your phone.
“It swears the code works but never actually ran it.”
It runs what it writes on your box before it tells you it's done.
The agent executes the code and tests on your server, reads the result, and fixes what failed — so 'done' means it ran, not that it looks right. It's still bounded by what your tests check, but it isn't claiming success blind.
“It writes the website but I still have to deploy it myself.”
The box ships it live and hands you the URL.
Because it runs real tools on your own cloud, it builds and deploys the site and gives you the live link — not just code to go paste somewhere.
“Its knowledge is a year out of date.”
Perplexity is in the crew for live web.
When a question needs current information, the crew can include an engine built for live search, so an answer isn't frozen at a training cutoff. (Perplexity connects on your own key.)
The actual coding grind
“A rename or signature change means touching thirty files, and I always miss one.”
It edits every call site on your real repo, then runs the tests to prove nothing broke.
Because it works on the actual files on your box, it finds every place a thing is used, changes them together, and runs your suite to confirm the refactor held. You see the diff in Files and the run in the log. It's only as safe as your tests are — it won't catch what they don't cover.
“It hands me tests that don't even compile, let alone pass.”
It runs the tests it writes, on your box, until they're green.
Not a model guessing at test code it can't execute: it writes the test, runs it against your real code, reads the failure, and iterates — so what comes back actually ran. It can't know the behavior you meant if you never said it, but it won't hand you a test that doesn't execute.
“I want a real second opinion on my diff, not one model rubber-stamping it.”
Several engines review the same code and each flags what it sees.
Stampede sends your diff to every engine at once; trained differently, they catch different things: one spots the race condition, another the off-by-one. Trail Boss merges their findings into one review. That's a second, third and fourth set of eyes on your code; the team-PR plumbing still lives on your git host.
“When its fix is wrong, it cycles through the same three wrong fixes forever.”
The box runs the code, so it debugs against reality — and a rival engine breaks the loop.
Instead of guessing, it runs the failing code on your box, reads the real error and state, and adjusts. When one model gets stuck circling, Shootout puts a different engine on it whose job is to disagree — usually what breaks a loop a single model can't climb out of.
“The stack trace is a wall of noise and I don't know where to start.”
It reads the real trace against your actual code — and hits the live web for the obscure ones.
Let it hit the error while running your code, or paste it, and it reads the surrounding files on your box to explain what actually went wrong in plain terms. For the obscure ones no model was trained on, Perplexity pulls live results. It explains; it can't guarantee the fix.
“I'm scared to touch this legacy module; I don't understand what it does.”
It reads the real code and explains it before you change a line.
It reads the actual files on your box and walks you through what a gnarly function does and what depends on it, so you're not editing blind. It can make risky code legible — it can't make it safe.
“I waste twenty minutes hunting for where a feature is actually implemented.”
Ask in plain language and it finds the code on your box.
Describe the behavior — 'where do we send the welcome email?' — and it greps and reads your real repo to point you at the exact file and line, then explains it. No guessing at filenames; it's reading your actual tree.
“Every new codebase is weeks of 'where is anything and why is it like this.'”
It maps your real repo and keeps what you learn on your box.
Point it at the repo on your box and it reads the real structure, explains the architecture, and answers where/why against the actual code. What you work out this week is saved on your box, so it's there next week instead of re-discovered every session.
“Version conflicts and peer-dep errors eat an entire afternoon.”
It runs the installs and builds on your box until it resolves.
Rather than guessing at a manifest, it runs the install and build on your box, reads the real conflict, and iterates on versions. Dependency graphs can be genuinely unsolvable — when that's the case it names exactly what's clashing instead of hand-waving.
“'Works on my machine': half my day is someone's broken local setup.”
The box IS the machine, and it's the same one from any device.
Your box is one consistent cloud environment. It's configured once and it's identical whether you open it from your laptop or your phone — no 'my local' drifting from 'your local.' It can stand up a fresh environment for you; it can't reach into a teammate's separate laptop.
“Merge conflicts are a guessing game and I'm scared I'll pick the wrong side.”
It reads both sides, resolves, then runs the tests to check the result.
It runs git on your box, reads both versions of the conflict with the surrounding code, proposes a resolution, and runs your suite to see if the merged result actually works — a checked resolution instead of a blind pick, bounded by what your tests cover.
“The docs are wrong, and writing new ones is the chore I always skip.”
It drafts docs from the actual code, not from stale guesses.
It reads the real, current code on your box and drafts the README or the function docs from what the code does now — not an outdated comment. You own the final word; it hands you a real first draft instead of a blank page.
“I burn ten minutes naming a variable and it's still bad.”
Ask the crew for options, then it applies the rename across the repo.
Several engines pitch names at once, so you pick from a spread instead of staring at nothing — and once you choose, it renames every occurrence on your real files and runs the tests to confirm it still builds.
“I re-type the same scaffolding (route, controller, test, wiring) for the hundredth time.”
It generates the whole pattern and wires it into your real project.
It writes all the pieces into the right files on your box, matching the shape of what's already there, then runs it. A glyph fires your standard scaffolding instruction with one tap, so you're not re-typing the ask either.
“Third-party API docs are outdated and I can't tell what actually works.”
It checks live docs and can make a real test call from your box.
Perplexity pulls current documentation instead of a model's year-old memory of it, and because it runs on your box it can make an actual test call to see what the API really returns — not what a stale doc claims. Their keys and rate limits are still theirs to grant.
Which AI, and being stuck with one
“One's great at code, another at writing; I waste time switching apps.”
The Conductor routes each question to the right engine automatically.
Conductor mode reads your question and sends it to the best-fit engine — or assembles a crew — using data on which model actually wins which kind of question, refined by your own votes. Or stay in Solo and pick the engine yourself.
“They deprecated the model I actually liked, and I was stranded.”
Up to five independent engines, so no single vendor's roadmap owns you.
Claude, GPT, Gemini, Perplexity and Grok are all supported; if one changes or degrades, you route around it the same day. On Claude you also pick Auto, Sonnet or Opus directly. Gemini, Perplexity and Grok connect on your own keys.
“The app decides which model answers; sometimes I want a specific one.”
You choose — or delegate the choice on purpose.
Solo pins the exact engine and Claude tier you want. Conductor is opt-in for when you'd rather have it routed. Your call, per question.
Your code, your privacy, your keys
“I can't paste our proprietary code into someone's cloud AI; it's a policy violation.”
The box is in your own cloud on your own accounts; your code never routes through us.
There's no Wetlether server that sees your prompts or code — the box runs in your cloud account, on your AI subscriptions. What each provider does under your own account is your agreement with them; nothing of ours sits in the middle holding a copy. (Your own org's policy is still yours to check.)
“They train on everything I type.”
We never see your prompts, so there's nothing for us to train on.
Your conversations run on your accounts and your box, so we can't and don't train on them. Whatever each AI provider does under your account is governed by your agreement with them directly — Wetlether just isn't a middleman adding another set of eyes.
“I'm scared the agent will slurp up my API keys and leak them.”
Your credentials sit outside the sandbox the tools can reach.
The file space the AI reads and writes is walled off from where your credentials live on the box — the tools structurally can't reach into them. A boundary, not a promise to be careful.
“One wrong `rm -rf` from an over-eager agent could wipe my machine.”
It runs on a contained box, and obviously destructive shell is auto-denied.
The agent works on a sandboxed box in your cloud, not your laptop, so the blast radius is contained — and the permission hook auto-denies the obviously destructive commands before they run, escalating only the genuinely risky ones to a tap on your phone. Contained and gated, not trusted to be careful.
“I'm locked into one vendor and can't leave without losing everything.”
Your box, your keys, your cloud — provider-agnostic.
The hosting is bring-your-own, the accounts are yours, and the transcripts live on your disk. Nothing about it is a hostage; you can walk with your data intact.
Control you actually have
“The agent goes way wider than I asked and I lose the afternoon undoing half of it.”
The Regulator caps how far it goes; a tamper-evident log shows exactly what it changed.
The Regulator sets the leash (short and surgical, or long and thorough) and the permission hook gates the edits that matter before they happen. Every action lands in a hash-chained log on your box, so you can see exactly which files it touched and revert them in git instead of reconstructing it from memory.
“There's no record of what the AI actually did on my behalf.”
A tamper-evident activity log of every action, on your box.
Every tool the AI runs is written to a hash-chained log you can read — with an alarm if the chain is ever broken. An audit trail you own, not a black box.
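If you haven't met a hash-chained log before: each entry stores a hash computed over the previous entry's hash plus its own contents, so changing any past entry breaks every link after it. A minimal sketch, assuming SHA-256 and JSON-encoded records; the record fields and the function names are hypothetical, not Wetlether's log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, record):
    """Hash the previous link together with this record's canonical JSON."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(log, record):
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log):
    """Return the index of the first broken entry, or None if the chain holds."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return i
        prev = entry["hash"]
    return None


log = []
append(log, {"tool": "read_file", "path": "app.py"})
append(log, {"tool": "shell", "cmd": "pytest"})
intact = verify(log)       # chain is untouched at this point

log[0]["record"]["path"] = "secrets.env"  # simulate someone editing history
broken_at = verify(log)    # index of the first entry that no longer checks out
```

A chain like this makes edits detectable rather than impossible: anyone able to rewrite the whole file could recompute every hash, so the latest hash has to be kept or checked somewhere the editor can't reach for the alarm to mean anything.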
“I have good prompts and setups but no fast way to reuse them.”
Glyphs fire your most-used instructions with one tap.
The glyph bar is a row of one-tap shortcuts for the instructions you send constantly — approve, go deeper, show me, and your own. Six on the phone, sixteen on desktop.
“Answers are always too long, or too short.”
The Regulator is a length dial, always on.
One control sets how long and deep the answer runs, every time — not a prompt you re-type asking it to 'be brief.'
The stuff nobody else bothers with
“Making an image or a video means opening yet another app.”
Quickdraw makes pictures and short video right in the chat.
Ask for an image or a short clip in plain language and it's generated in the conversation — no separate tool, no prompt-craft.
“Voice is an afterthought that barely works.”
Dictate a message and have answers read back, hands-free.
Voice is a first-class way in: dictate, set a word that fires the message, and have the reply spoken — built for using it on a phone while your hands are busy.
“The phone app is always a crippled version of the desktop.”
The phone IS the product — a full desktop coding session on a small screen.
Wetlether was built phone-first: a real, interactive coding session driven from your phone, not a read-only companion. The box does the heavy lifting so the phone doesn't have to.
“I get the fix in my head on the train and it's gone by the time I'm at my desk.”
Start the actual work from your phone the moment you think of it.
It's a real coding session on your phone, not a note-to-self — you kick off the change from wherever you are and the box runs it. The Cue glyph even queues a half-formed thought to fire when you're ready.
“Standing up the tooling is a weekend project before I write any code.”
An in-app wizard provisions the box; you bring accounts, not server admin.
The provisioning wizard walks you through standing up your box in your own cloud and connecting your AI accounts — you're not hand-assembling a dev environment. Bring your accounts, not your sysadmin skills.
“Everything's built for teams and dashboards; I'm one person who just wants to build.”
It's a personal tool — your box, your work, no team overhead.
Built for one developer working from a phone, not a team dashboard you administer. Shared review pipelines live on your git host; Wetlether is the personal coding session that follows you around.
What it can't do — and what it does instead
“No AI can hold my entire codebase in its head at once.”
It keeps searchable memory on your box and reads your files on demand.
No model holds a whole large codebase at once. So it keeps persistent, searchable memory on your own box and reads your actual files when it needs them — working the relevant part instead of loading the whole thing.
“LLMs hallucinate, and that's never getting fixed.”
No model stops making things up — so you stop trusting one alone.
Cross-model agreement and a judge catch the confident-but-wrong answers far more often than any single model catches its own. It cuts how much slips past you; it doesn't get you to zero.
Don't see your coding complaint?
Type it here — see if, and how, Wetlether handles it.
No account, no tracking. We may keep your question to make Wetlether better.
Pricing

One flat rate. The whole crew.

Start free for 7 days · cancel anytime · never a metered token bill.
Desperado
Free
  • Any one AI, always on
  • Solo — one at a time
  • Runs on your own box
What you get free
Rodeo
$19.99 / mo
  • The whole crew
  • Stampede & Trail Boss
  • Syncs every device
Start free trial
Full Steam
$39.99 / mo
  • Everything, plus:
  • The Conductor
  • The Regulator
Start free trial
✦ The Grand Creative House
Orpheum
$59.99 / mo
  • Everything, plus Quickdraw
  • Make pictures & videos by asking
  • Right in your chat
Start creating