What do Japanese soccer fans and great AI have in common?
On Sunday in Dallas, Japan and the Netherlands played to a 2-2 draw in the World Cup. After the final whistle, while everyone else filed out, Japanese fans stayed behind, pulled out blue bags, and cleaned up their section of the stadium. They've done this at every World Cup since 1998. Leave a place at least as clean as you found it. That's the right standard for AI.
Let me explain why I keep thinking about this, because the connection isn't obvious until you sit with it.
Why do Japanese soccer fans clean up the stadium after matches?
There's a phrase behind what the Japanese fans do. "A bird leaves nothing behind." It's something kids learn in school, where students clean their own classrooms and hallways as part of growing up. By the time they're at a stadium in another country, the behavior is automatic. It isn't a campaign. It isn't performance. It isn't what they post about. It's what they do. The standard is so internalized that the act of cleaning up after themselves looks unremarkable to them and remarkable to everyone watching.
That's a quiet kind of greatness. And it's exactly what AI is missing right now.
Look, we're in the loud phase of AI. Every company is rolling out tools. Every press release is about new capability. Every product page is selling speed, scale, and the next leap forward. None of it is about cleanup. And cleanup is where the real test lives.
What is "AI residue" and why does it matter?
Here's what I mean. Think about the average professional rolling out AI inside their company in 2026. They get a tool that drafts emails. It works most of the time. But every email has to be read carefully because sometimes the AI invents a customer name, gets a date wrong, or commits to something the company never agreed to. The tool saves time on writing and adds time on verification. Then there's an AI that summarizes meetings. The summary is fine, but the action items are slightly off, and someone has to go back to the transcript to fix them. Then there's an AI that generates code, which has to be tested more carefully than human code because the bugs are subtle. Then there's an AI for the inbox, which categorizes things weirdly enough that the person spends ten minutes a day undoing it.
Now do the math. Did the AI save time, or did it move the work from one place to another? In cases like these, the honest answer is the second one. The AI did the showy part. The cleanup landed on the human.
That's a bird that leaves a lot behind.
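To make "do the math" concrete, here is a minimal Python sketch of the arithmetic. The function name and every number in it are hypothetical, chosen only to illustrate the email-drafting example above. They are not measurements from any real tool.

```python
def net_minutes_saved(tasks_per_day, minutes_saved_per_task,
                      review_minutes_per_task, fixes_per_day, minutes_per_fix):
    """Daily minutes a tool gives back once review and fixes are counted."""
    gross_saved = tasks_per_day * minutes_saved_per_task   # the showy part
    review_cost = tasks_per_day * review_minutes_per_task  # reading every output
    fix_cost = fixes_per_day * minutes_per_fix             # repairing the bad ones
    return gross_saved - review_cost - fix_cost


# Hypothetical email tool: 20 drafts a day, 4 minutes saved on each,
# 3 minutes of careful reading per draft, and 4 drafts that each need a 6-minute fix.
email_tool = net_minutes_saved(20, 4, 3, 4, 6)
print(email_tool)  # -4: the tool costs four minutes a day under these assumptions
```

With these made-up inputs the tool looks like an 80-minute win on paper and ends up slightly negative once the cleanup is counted. Different inputs can flip the sign, which is why the numbers have to come from real use.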
Microsoft studied the modern workday in its Work Trend Index and found that the average worker is interrupted every two minutes, around 275 times a day, and handles 117 emails and 153 chat messages on top of the actual job. That's the baseline mess of knowledge work right now, and every AI tool a company rolls out either adds to it or subtracts from it. Almost nobody is asking which.
What is the standard for good AI?
The Japanese fan standard says you measure a tool by what it leaves behind. A great AI implementation doesn't only produce output. It leaves the process cleaner than it found it. Fewer steps. Less verification. Less anxiety about whether something slipped. Less manual cleanup. When the human walks away from the interaction, the system is in better shape than before AI showed up, not worse.
That's a real standard. And it isn't the one most companies are holding their tools to.
Let me be specific about what good looks like. Good AI removes a category of work entirely, not most of it. It takes the part of the job a person hated, finishes it cleanly, and gives the person back something they didn't have before, like time, or focus, or sleep. It doesn't ask the person to babysit. It doesn't require a checklist for every output. It doesn't produce things the person has to undo. It does the work and it stays inside the lane.
Bad AI looks helpful and isn't. It does sixty percent of the task and asks you to do the other forty, with no warning about which forty. It produces output that looks confident but needs careful review. It adds to the inbox, adds to the queue, adds to the never-ending list of small things to verify. It leaves more behind than it took away. It's the fan who shows up loud, drinks his soda, drops the cup, and walks out.
I wrote recently that real estate doesn't need another dashboard. The same logic lives here. A dashboard displays the work without removing it. Bad AI is a fancier version of the same trap: it displays competence without actually doing the cleanup. Real removal is harder. Real removal leaves nothing for the person to handle afterward.
The hard part for me, watching this play out, is that bad AI gets sold the same way good AI does. Both come with slick demos, big promises, and confident benchmarks. The difference doesn't show up until you've been using the tool for a month and you start to notice that something feels off. The workload didn't actually drop. The team is faster on the visible work and slower on everything else. The cleanup is invisible until you tally it up.
How can leaders evaluate AI tools for their company?
So the discipline this moment requires is to evaluate AI not on the output but on the residue. The residue is what's left in the process after the AI is done. Less residue is the goal. More residue is the failure. And the only way to know which one you've got is to live with the tool for long enough to see what it leaves behind.
And here's the harder part for leaders. Once you know the standard, you have to be willing to act on it. The AI tool everyone in the company is excited about because it makes them feel modern isn't necessarily the AI tool making them more productive. You have to actually count the hours, count the verifications, count the small fixes that show up nowhere in the procurement document. If the residue is high, you cut the tool, even if your team likes it. If the residue is low, you double down, even if it doesn't look as impressive in the slide. The discipline isn't sexy. It's the part that separates leaders who let their company drown in AI clutter from the ones who quietly run leaner because they refuse to pay the cleanup tax.
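One way to do that counting is a plain residue log that the team fills in for a few weeks. The sketch below is an illustration under stated assumptions: the `ResidueEntry` fields, the `kind` labels, the tool names, and the simple "saved versus residue" rule in `verdict` are all hypothetical, not a standard method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ResidueEntry:
    tool: str      # which AI tool created the cleanup (hypothetical names below)
    kind: str      # "verification", "fix", or "undo" (hypothetical labels)
    minutes: int   # time a person spent on it


def residue_by_tool(entries):
    """Sum the count and minutes of logged cleanup work for each tool."""
    totals = defaultdict(lambda: {"count": 0, "minutes": 0})
    for entry in entries:
        totals[entry.tool]["count"] += 1
        totals[entry.tool]["minutes"] += entry.minutes
    return dict(totals)


def verdict(minutes_saved, residue_minutes):
    """Assumed rule of thumb: keep a tool only if it saves more than it leaves behind."""
    return "double down" if minutes_saved > residue_minutes else "cut"


log = [
    ResidueEntry("meeting_summary", "fix", 15),
    ResidueEntry("meeting_summary", "verification", 10),
    ResidueEntry("inbox_sorter", "undo", 10),
    ResidueEntry("meeting_summary", "fix", 20),
]
totals = residue_by_tool(log)
print(totals["meeting_summary"])  # {'count': 3, 'minutes': 45}
print(verdict(30, totals["meeting_summary"]["minutes"]))  # cut
```

The threshold in `verdict` is deliberately blunt. A real review would probably weigh the kind of residue too, since a fix to something a customer already saw costs more than a quick verification.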
I wrote a few weeks back that the biggest waste in business isn't effort, it's duplicated effort. Residue is a cousin of duplication. They both describe work that shouldn't be there, work the system creates and the human has to absorb. Good AI removes both. Bad AI generates both and calls it progress.
I think about this from where I sit. I'm building AI for real estate transactions. The whole point of what we're doing is to take a category of work, the paperwork that sits between the agent and the client, and remove it. Not most of it. The whole thing. Not in a way that requires a checklist on the back end. In a way where the agent walks away from the interaction and the process is in better shape than before. That standard is what we hold ourselves to. It's also harder than the loud version of AI. It takes more discipline to build something that quietly leaves the process cleaner than to build something that looks impressive in a demo.
But that's the standard worth holding. The Japanese fan standard. A bird leaves nothing behind.
If you're a leader looking at AI tools for your company right now, hold them to that test. Don't ask what they produce. Ask what they leave behind. Ask whether the workload actually dropped, not on paper but in lived hours. Ask whether the team is fixing AI output as part of the day. Ask whether the system is cleaner, or whether the cleanup has moved somewhere else and now lives in the calendar nobody talks about.
The best AI implementations of the next ten years won't be the loudest. They'll be the ones that leave the process better than they found it, again and again, quietly, until people stop noticing the AI is doing it. That's the sign you got it right. The thing the Japanese fans do at every World Cup is so normal to them that they don't think about it. The best AI in your company should feel the same way. Helpful, present, and quietly leaving things cleaner.
Good job, Japanese soccer fans. You set the standard. The rest of us, in AI and otherwise, have some catching up to do.
*Judd Hoffman is CEO and Co-Founder of Ethica AI, building AI-powered tools for real estate transaction workflows.*
