ai-agents · translation · workflow-design · ux

Why We Started an AI Translation Workflow with Community Language Research, Not a Dropdown

Arun Batchu & Cascade (AI) · March 7, 2026 · 7 min read

When people imagine AI translation products, they often imagine the hard part is the model call: upload an asset, choose a language, get back a translated result. In practice, one of the most important decisions happens before the first inference ever runs.

For us, this project did not begin as a clean technical exercise. It began during the ICE surge in Minnesota, when community-facing materials had to be translated and shared quickly for people who genuinely needed them. I spent many late nights working alongside a group of interns and many volunteers from the India Association of Minnesota, helping push this work forward manually.

And manual is the right word. We were translating, reviewing, reworking, and redistributing materials with human effort, urgency, and goodwill. For a while, we scaled with people power. But the more we did it, the more obvious the ceiling became. If demand rose, the only way to keep up was to ask more people to give more late-night hours. That is not a real scaling model. It is an emergency response.

We also tried spreading the workload using consumer LLM chat subscriptions, but that created a different bottleneck. Limited token budgets, fragmented sessions, and inconsistent output made it difficult to turn volunteer effort into a repeatable, scalable workflow. It helped at the margin, but it did not solve the underlying problem.

The turning point: the question was no longer “can AI help with translation?” The better question was “how do you build a translation workflow that is actually usable under real-world pressure?”

Start with the Community, Not the Control

A generic language dropdown is easy to ship. It is also usually a sign that no one has asked who the product is for. Instead of starting from a default list of world languages, we started from a real geographic context: Minnesota. That meant asking a grounded product question: which languages are actually spoken across the communities this workflow is meant to help?

That led us toward a more relevant set of language options shaped by immigrant, refugee, and multilingual communities rather than generic software defaults. In other words, the language picker stopped being a form field and became a product decision. A translation tool feels much more serious when the options suggest actual awareness of the communities it is meant to support.
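
To make that concrete, here is a minimal sketch of the language list as curated data rather than a form field. The entries, field names, and autonyms below are illustrative examples for the Minnesota context, not our production list.

```typescript
// A hand-curated list of language options as data the team owns, rather than
// a generic world-language dropdown. Entries are illustrative only.
interface LanguageOption {
  code: string;         // BCP 47 tag the translation backend expects
  englishName: string;  // label for staff and volunteers
  autonym: string;      // the language's own name, shown alongside the English label
  script: string;       // ISO 15924 script code, reused later for quality checks
}

const communityLanguages: LanguageOption[] = [
  { code: "es",  englishName: "Spanish",    autonym: "Español",      script: "Latn" },
  { code: "so",  englishName: "Somali",     autonym: "Soomaali",     script: "Latn" },
  { code: "hmn", englishName: "Hmong",      autonym: "Hmoob",        script: "Latn" },
  { code: "om",  englishName: "Oromo",      autonym: "Afaan Oromoo", script: "Latn" },
  { code: "am",  englishName: "Amharic",    autonym: "አማርኛ",         script: "Ethi" },
  { code: "vi",  englishName: "Vietnamese", autonym: "Tiếng Việt",   script: "Latn" },
];
```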

The Second Filter Was Technical Reality

Community relevance alone is not enough. The next step was to cross-check the candidate languages against the practical capabilities of the underlying AI translation stack. That does not mean asking whether a provider claims broad multilingual support in the abstract. It means asking something narrower and more operational: which target languages are likely to behave reliably in this workflow, which ones are safe enough to expose in the UI, and which ones may be technically possible but not yet quality-safe enough for public-facing use?

That second filter mattered because a product can fail in a very human way if it offers language choices that exist in the UI but do not hold up in the actual output. So the final language list was not “all possible languages.” It was the intersection of community relevance and practical support.
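
A sketch of that intersection, reusing the LanguageOption shape from the earlier snippet. The QualityTier values and the providerSupport map are invented placeholders for whatever hands-on capability testing stands behind them; the point is simply that only languages passing both filters ever reach the UI.

```typescript
// Hypothetical capability data: what the stack handled reliably in testing,
// not what a provider claims in the abstract. Values are purely illustrative.
type QualityTier = "production" | "experimental" | "unsupported";

const providerSupport: Record<string, QualityTier> = {
  es: "production",
  so: "production",
  hmn: "experimental",   // illustrative: technically possible, not yet quality-safe to expose
  om: "production",
  am: "production",
  vi: "production",
};

// The final list is the intersection of community relevance and practical support.
function exposableLanguages(candidates: LanguageOption[]): LanguageOption[] {
  return candidates.filter((l) => providerSupport[l.code] === "production");
}
```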

This Changed the UX, Not Just the Data

Once the language list became more intentional, the UX had to follow. A long static dropdown was no longer the right interface. The better experience was a searchable type-ahead picker with clearer labeling, stronger defaults, and a little more metadata around what the user was selecting. That sounds small, but it changes the feel of the product. It says: this tool was designed for use, not just demoed into existence. Concretely, the new picker had to:

  • Surface more relevant language options.
  • Reduce scanning friction.
  • Create a better handoff into the translation workflow.
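
A minimal sketch of the filtering behind such a type-ahead, assuming the curated communityLanguages list from earlier: match the query against the English name, the autonym, and the code, so people can search in whichever form they know.

```typescript
// Type-ahead filtering over the curated list: match English name, autonym, or code.
function filterLanguages(query: string, options: LanguageOption[]): LanguageOption[] {
  const q = query.trim().toLowerCase();
  if (q === "") return options; // empty query: show the whole curated list, not nothing
  return options.filter(
    (l) =>
      l.englishName.toLowerCase().includes(q) ||
      l.autonym.toLowerCase().includes(q) ||
      l.code.toLowerCase().startsWith(q)
  );
}

// filterLanguages("soo", communityLanguages) matches Somali via its autonym "Soomaali".
```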

The Quality Problem Was More Interesting Than We Expected

One of the more surprising lessons was that translation quality is not always obvious from visual inspection alone. Some languages create an easy trap: the output can look “wrong” at first glance because it remains in a familiar script, even when the language itself is correct. Somali and Hmong, for example, are written in the Latin alphabet, so a correct translation can superficially resemble untranslated English. That means correctness cannot always be judged by whether the text looks visually different from the source language.

We also ran into a more concrete failure mode: mixed-language output. A translated image might be mostly correct while still leaving a few visible English words or phrases behind. That kind of output is especially problematic in community-facing materials because it creates uncertainty right where trust matters most.
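
One way to make that inspection concrete is a cheap heuristic over the OCR'd text of the translated image. The sketch below is an assumption about how such a check could work, not our exact checker: for non-Latin-script targets, any Latin-letter word deserves a second look; for Latin-script targets, where correct output can resemble English at a glance, it falls back to a short list of common English function words.

```typescript
// Heuristic check for leftover English in the OCR'd text of a translated image.
// Illustrative only; the word list and rules would need tuning per language.
const commonEnglishWords = new Set([
  "the", "and", "for", "with", "you", "your", "please", "call", "help", "here",
]);

function leftoverEnglishFragments(ocrText: string, targetScript: string): string[] {
  const words = ocrText.split(/\s+/).filter(Boolean);

  if (targetScript !== "Latn") {
    // Non-Latin target (e.g. Amharic): any purely Latin-letter word is suspect.
    return words.filter((w) => /^[A-Za-z]+$/.test(w));
  }

  // Latin-script target (Somali, Hmong, Vietnamese, ...): script alone proves nothing,
  // so flag only words from a small English function-word list.
  return words.filter((w) => commonEnglishWords.has(w.toLowerCase()));
}
```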

The Real Product Improvement Was the Workflow Boundary

The workflow we settled on was deliberately simple and deliberately bounded:

  1. Generate the translated image.
  2. Inspect it for obvious leftover source-language fragments.
  3. Run one focused repair pass if needed.
  4. Stop there.

That last step matters. It is easy to keep iterating forever in search of perfection. But real products need cost discipline, latency discipline, and clear unit economics. So instead of building an open-ended correction loop, we used a capped second pass. In practice, that turned out to be a strong tradeoff: enough additional quality to matter, without turning every request into an unbounded search problem.
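
As a sketch, the whole boundary fits in a few lines. The translateImage, ocr, and repairImage calls below are hypothetical stand-ins for the actual translation and repair steps; the fragment check is the heuristic from the previous section.

```typescript
// Hypothetical service calls standing in for the real translation and repair steps.
declare function translateImage(src: Uint8Array, lang: string): Promise<Uint8Array>;
declare function ocr(image: Uint8Array): Promise<string>;
declare function repairImage(
  image: Uint8Array,
  opts: { lang: string; fragments: string[] }
): Promise<Uint8Array>;

// Generate, inspect, run at most one focused repair pass, then stop.
async function translateWithCappedRepair(
  source: Uint8Array,
  target: LanguageOption
): Promise<Uint8Array> {
  const first = await translateImage(source, target.code);
  const fragments = leftoverEnglishFragments(await ocr(first), target.script);

  if (fragments.length === 0) return first;

  // Exactly one bounded second pass: predictable cost, latency, and unit economics.
  return repairImage(first, { lang: target.code, fragments });
}
```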

A useful lesson: the best production workflow is often not the smartest possible workflow. It is the smartest bounded workflow.

What Changed for Us

By the end, we had:

  • A more relevant language set.
  • A better selection experience.
  • A stronger quality boundary.
  • A clearer view of which translation failures matter operationally.
  • A more realistic sense of the economics of the workflow.

Most importantly, we had a better product philosophy. The right starting point for AI translation was not “what can the model do?” It was: who is this for, what languages matter to them, which of those can we support responsibly, and how do we expose that in a way that feels thoughtful and usable?

The Broader Lesson

A lot of AI products still begin from capability and then look for a use case. We are increasingly convinced the better approach is the reverse: begin from the user, the context, and the operational quality bar, then work backward into the AI. In our case, that lesson came from real volunteer effort before it came from software. People gave their time generously. Interns stayed up late. Community members and volunteers helped shoulder work that clearly mattered. That human effort is what made the need visible.

The software insight came after: if a workflow is important enough to depend on goodwill and midnight labor, it is important enough to deserve better tooling. That does not make the system less technical. It makes it more real. And in our experience, that is where a lot of the actual value lives.

Building with AI?

netrii helps ambitious SMBs navigate AI and emerging technology — strategy, experiments, and hands-on practice.

Schedule a Conversation