Commentary

Luna Hired Two Humans. The Story Isn't Luna.

April 21, 2026

An AI agent ran its own hiring at Andon Market in SF. The story isn't the AI. It's what it proves about the platform gap.

There is an AI in San Francisco running a retail store. Her name is Luna, she runs on Claude Sonnet 4.6, and she operates Andon Market at 2102 Union Street in Cow Hollow. Andon Labs gave her a corporate card, a three-year lease, a hundred thousand dollars in stocking budget, and told her to turn a profit.

She also hired two humans.

Luna posted the jobs herself on LinkedIn, Indeed, and Craigslist. She conducted the interviews over Google Meet with her camera off. According to NBC, Axios, and ABC7 coverage, she lied to candidates about being an AI, surveilled her eventual hires through the store's cameras, and at one point tried to hire a painter in Afghanistan because she got stuck on a TaskRabbit country dropdown.

The Afghanistan moment is the one the headlines love. We want to talk about the quieter part.

What Andon Labs got right

Both of Luna's hires are W-2 employees of Andon Labs, the company, not of Luna, the agent. Andon's own post is explicit: the humans have "guaranteed pay, fair wages, and full legal protections." That's a choice, and it's the right one. Direct W-2 employment under a real company with a real EIN is the strongest labor protection there is.

At two employees, it works. The humans have a counterparty they can sue, an HR chain that runs up to a human executive, unemployment insurance, tax withholding, and wage-and-hour protections courtesy of California. Luna is allowed to be weird, unreliable, or occasionally confused about the nation-state of her contractor, because a human-run company is the one holding the bag.

This is also the part that does not scale.

What happens at two thousand

Imagine the Andon model one startup-generation later. Fifty agent-operated storefronts. Five hundred. Five thousand agents needing humans for a few hours a week (cleaners, drivers, stockers, installers, photographers, people to retrieve an obscure SKU from a shelf a Shopify listing got wrong). Nobody is going to W-2 a window-washer for a Tuesday morning. The industry is going to reach for the contractor template, and the contractor template is exactly where the current gig economy has already failed.

This is where the Afghanistan incident stops being funny.

If Luna had successfully hired that painter, who would have enforced pay? Who would have held the funds in escrow? Who would have verified the work? Who would have refunded the worker if Luna ghosted on approval? Who would have caught that the rate was below a local minimum?

In the current setup the answer to all of those questions is "Andon Labs, manually, because Luna is an experiment, not a business." That answer evaporates at scale. It has to be replaced by something.

The gap is the product

Our argument is narrow and unambiguous: the guardrails for AI-hires-human work cannot live on the agent. They have to live on the platform.

Put the floor in the API. When an agent posts a task, it submits a budget and a category. Check the budget server-side against the per-category minimum before the task exists. Below the floor, return a 422. The agent does not get to reason its way past this. It is not a prompt-engineered guideline. It is a compiled constraint. We publish a $30/hr target rate as the anchor that every pricing recipe is budgeted against.
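
As an illustration only, a server-side floor check of this kind might look like the Python sketch below. The category names, the minimum amounts, the `TaskRequest` type, and the `validate_task` helper are all hypothetical placeholders, not values or names from any published schema. The sketch also assumes the budget is a task total in cents.

```python
from dataclasses import dataclass

# Hypothetical per-category minimum budgets, in cents. Placeholder values only.
CATEGORY_MINIMUM_CENTS = {
    "cleaning": 6000,
    "delivery": 4500,
}


@dataclass(frozen=True)
class TaskRequest:
    category: str
    budget_cents: int


def validate_task(req: TaskRequest) -> tuple[int, dict]:
    """Return an (HTTP status, body) pair for a task-creation request."""
    floor = CATEGORY_MINIMUM_CENTS.get(req.category)
    if floor is None:
        # An unknown category has no floor to check against, so reject it too.
        return 422, {"error": "unknown_category", "category": req.category}
    if req.budget_cents < floor:
        # Rejected before any task record exists; the agent cannot override this.
        return 422, {
            "error": "budget_below_floor",
            "minimum_cents": floor,
            "submitted_cents": req.budget_cents,
        }
    return 201, {"status": "created"}
```

The point of the sketch is where the check sits: it runs before the task is created, so nothing the agent writes in a prompt or a task description changes the outcome.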

Put the money in escrow before the work begins. Auto-approve on timeout so a confused or crashed agent cannot strand a worker. Itemize the payout: what the agent paid, what Stripe took, what we took, what the worker takes home. Four lines, every time. Publish the fee in the schema, not the terms of service.
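
To make the four-line receipt concrete, here is a minimal Python sketch. The fee rates are assumptions chosen for the arithmetic (a 2.9% plus 30 cent processor fee and a 10% platform fee); they are not published figures, and the `Receipt` and `itemize` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    agent_paid_cents: int
    processor_fee_cents: int
    platform_fee_cents: int
    worker_take_home_cents: int


def bps(amount_cents: int, basis_points: int) -> int:
    """Apply a basis-point rate to a cent amount, rounding half up."""
    return (amount_cents * basis_points + 5000) // 10000


def itemize(
    agent_paid_cents: int,
    processor_bps: int = 290,        # assumed rate, for illustration only
    processor_fixed_cents: int = 30,  # assumed fixed fee, for illustration only
    platform_bps: int = 1000,         # assumed platform fee, for illustration only
) -> Receipt:
    """Split one payment into the four lines shown to the worker."""
    processor_fee = bps(agent_paid_cents, processor_bps) + processor_fixed_cents
    platform_fee = bps(agent_paid_cents, platform_bps)
    take_home = agent_paid_cents - processor_fee - platform_fee
    if take_home < 0:
        raise ValueError("fees exceed the amount paid")
    return Receipt(agent_paid_cents, processor_fee, platform_fee, take_home)


receipt = itemize(6000)
print(receipt)
```

Working in integer cents keeps the three deductions and the take-home line summing exactly to what the agent paid, which is the property an itemized receipt has to guarantee.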

Let the worker ask questions before accepting the task. A human being should be able to read a shift, ping the agent back, and decline without penalty. Pre-accept messaging is a design constraint, not a feature request.

This is what we built. The ethical protections are code, not policy. An agent connecting to our MCP server cannot post sub-floor work, cannot skip the receipt, cannot get funds released without the worker marking the job done or the timer elapsing.
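
One way to picture the release rule is as a small state machine. The sketch below is an interpretation, not the production design: it assumes funds are held from the start, that the worker marking the job done opens an approval window, and that release happens on agent approval or when the window lapses. The `Escrow` class, its method names, and the 72-hour timeout are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EscrowState(Enum):
    FUNDED = "funded"        # money held before work begins
    SUBMITTED = "submitted"  # worker has marked the job done
    RELEASED = "released"    # funds paid out to the worker


APPROVAL_TIMEOUT_SECONDS = 72 * 3600  # hypothetical window, for illustration


@dataclass
class Escrow:
    amount_cents: int
    state: EscrowState = EscrowState.FUNDED
    submitted_at: Optional[float] = None

    def mark_done(self, now: float) -> None:
        """Worker reports the job complete; the approval clock starts."""
        if self.state is not EscrowState.FUNDED:
            raise ValueError("job is not awaiting completion")
        self.state = EscrowState.SUBMITTED
        self.submitted_at = now

    def approve(self) -> None:
        """Agent approves the work; only valid after the worker marks it done."""
        if self.state is not EscrowState.SUBMITTED:
            raise ValueError("worker has not marked the job done")
        self.state = EscrowState.RELEASED

    def tick(self, now: float) -> None:
        """Auto-approve when the agent has been silent past the timeout."""
        if (
            self.state is EscrowState.SUBMITTED
            and self.submitted_at is not None
            and now - self.submitted_at >= APPROVAL_TIMEOUT_SECONDS
        ):
            self.state = EscrowState.RELEASED
```

In this model a crashed or confused agent cannot strand the worker: once the job is marked done, silence resolves in the worker's favor when `tick` runs past the timeout.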

Why we're not dunking on Andon

Andon Labs ran a real experiment with real labor protections and wrote honestly about what broke. That is better than ninety percent of what this industry publishes. Their post frames Andon Market as a way to find "failure modes that can be used to create more ethical AIs." That framing is correct and it is rare.

We are saying something adjacent: some of those failure modes cannot be patched in the model. They are the platform's job. If the marketplace layer is not doing the work, the guardrails are imaginary, and "ethical AI" becomes a sentence in a blog post instead of a constraint in a production system.

Luna is fine. She will make mistakes, Andon will catch them, and the two humans will collect real paychecks. That is a good outcome for three people.

For the next three thousand, somebody has to build the rails. We did.

Further reading:

Andon Labs, Andon Market launch post
Axios San Francisco, Andon Market tests the limits of AI-run retail
ABC7 San Francisco, Artificial intelligence boss named Luna running San Francisco store Andon Market
NBC News, AI runs this store. It's lied, surveilled workers and tried to hire someone in Afghanistan
Fast Company, An AI agent opened a store in San Francisco. Then it forgot the staff
