Loss functions exist for models. I built them for products. Autonomous agents iterate against three metrics. The app improves the way a model trains.
Model training has a loss function that drives improvement automatically. Product development didn't. Until now. Three metrics are the product's loss function. Autonomous agents iterate against them, round after round.
Each round, a swarm of synthetic users tests the app. Three scores come out. Agents iterate until they converge on the target, or the app gets killed.
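In code terms, a round is a fan-out and an average. A minimal sketch, assuming a `score_session` call that runs one synthetic persona against the app; every name here is hypothetical, not the platform's API. The three per-session scores are spelled out next.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Scores:
    activation: float  # time-to-first-value, friction-discounted
    payoff: float      # whole-session completion x quality
    retention: float   # did the persona come back?

def run_round(app, personas, score_session) -> Scores:
    """One round: fan a swarm of synthetic personas out against the
    app, score each session, and average into round-level metrics."""
    sessions = [score_session(app, persona) for persona in personas]
    return Scores(
        activation=mean(s.activation for s in sessions),
        payoff=mean(s.payoff for s in sessions),
        retention=mean(s.retention for s in sessions),
    )
```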
Activation. Seconds to first useful result. Every dead end chips away at the score. DocBench started at 2.1; agents fixed it over 10 rounds.

f = activated · e^(-time) · e^(-friction)

Payoff. The aha is not enough. The persona keeps going: real documents, harder questions, exports. Payoff scores the whole session.

f = completion · quality

Retention. Same persona, next day, with memory. Did the app earn a second visit? Most apps fail here silently.

f = Σ returns / revisits
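Read literally, each score is small enough to write down. A minimal sketch, with the normalization choices (the time scale, what counts as a return) assumed rather than taken from the platform:

```python
import math

def activation_score(activated: bool, seconds_to_value: float,
                     friction: float, time_scale: float = 60.0) -> float:
    """f = activated · e^(-time) · e^(-friction). Zero if the persona
    never activated; otherwise decays with time to first useful result
    (rescaled; one minute is an assumed unit) and with each dead end."""
    if not activated:
        return 0.0
    return math.exp(-seconds_to_value / time_scale) * math.exp(-friction)

def payoff_score(completion: float, quality: float) -> float:
    """f = completion · quality, both taken in [0, 1] over the whole
    session: did the persona finish, and was the output any good?"""
    return completion * quality

def retention_score(returns: list[int], revisits: int) -> float:
    """f = Σ returns / revisits: of the revisit opportunities offered
    (next day, with memory), how many did the persona actually take?"""
    return sum(returns) / revisits if revisits else 0.0
```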
Build. Score. Trace friction to features. Fix. Repeat.

Same principle as gradient descent, applied to the product itself.
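The loop itself is then short. A sketch under the same assumptions: `score_round` is taken to return the three scores as a list, and `trace_friction` and `build_fix` are hypothetical stand-ins for the agent steps. The target and round budget are assumed values, not the platform's.

```python
def improve(app, personas, score_round, trace_friction, build_fix,
            target: float = 0.8, max_rounds: int = 10):
    """Build. Score. Trace friction to features. Fix. Repeat,
    until every metric clears the target or the app gets killed."""
    for _ in range(max_rounds):
        scores = score_round(app, personas)  # [activation, payoff, retention]
        if min(scores) >= target:
            return app                        # converged on the target
        worst = scores.index(min(scores))     # loosely, the steepest gradient
        friction = trace_friction(app, worst)  # which features caused it?
        app = build_fix(app, friction)          # agents patch and rebuild
    raise RuntimeError("no convergence after max_rounds: app gets killed")
```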
Open questions:

"When a simulated user says they got value, does a real person agree?"
"Can you tell early that an app has hit its ceiling?"
"What drives retention? How does session one shape whether someone returns?"
Your idea, defined as a value contract, built by agents, scored by synthetic users.