When AI Generates Art: The Hidden Challenge of OCR and Transparency

Today we launched a new autonomous agent on Shilpiworks: the Minnesota Wildlife Agent. The goal was simple: take a dataset of 712 Minnesota wildlife species from the Department of Natural Resources, and have an AI automatically generate one beautiful, dreamy watercolor sticker every day.

We mapped out the workflow using our Mastra-based pipeline:

1. Pick the next ungenerated species from the dataset.
2. Use Gemini to research and write factual marketing copy about the animal.
3. Use OpenAI's gpt-image-1.5 to generate the watercolor illustration bounded by the silhouette of the Minnesota state map.
4. Auto-publish the result to our store.
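The first step of that workflow can be sketched in a few lines. This is a minimal illustration, not the actual Mastra step: the `Species` class, its `generated` flag, and the two-entry dataset are hypothetical stand-ins for however the real dataset tracks progress.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Species:
    # Hypothetical record shape; the real dataset fields are not shown in this post.
    name: str
    generated: bool = False


def pick_next_ungenerated(dataset: list[Species]) -> Optional[Species]:
    """Return the first species that has no sticker yet, or None when all are done."""
    return next((s for s in dataset if not s.generated), None)


# Illustrative two-entry dataset (the real one has 712 species).
dataset = [
    Species("Western Meadowlark", generated=True),
    Species("Ross's Goose"),
]

next_species = pick_next_ungenerated(dataset)
print(next_species.name if next_species else "all done")
```

Returning `None` instead of raising lets a daily scheduler treat "nothing left to generate" as a normal outcome rather than an error.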

Our first test subject was the "Western Meadowlark." It worked perfectly. But when we flipped the switch to live production and the agent attempted its first automated run for the "Ross's Goose," the pipeline ground to a halt.

Here is what we learned from our AI sticker agent today, and why building autonomous creative pipelines requires more than just a good prompt.

1. The Typography vs. Art Conflict

To make sure our customers know what species they are looking at, we asked the AI to subtly incorporate the species name into the watercolor design.

Our prompt: "Subtly incorporate the text 'Ross's Goose' into the design."

The AI did exactly what we asked. It painted the text. Literally. It rendered the letters as beautiful, sweeping, abstract watercolor brushstrokes.

The problem? Our pipeline includes a strict OCR (Optical Character Recognition) validation step to prevent misspelled or hallucinated text from making it to production. When the OCR scanner looked at the AI's abstract brushstrokes, it read "Rossa Coosa" instead of "Ross's Goose." The validation failed, and the agent blocked the publication.
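A text-matching check of this kind can be approximated as follows. This is a sketch under assumptions: the OCR engine's output is taken as a plain string, and the function names `normalize` and `ocr_matches` are made up for illustration. The real validator may be stricter or more forgiving.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def ocr_matches(expected: str, ocr_text: str) -> bool:
    """True when the expected name appears in the OCR output after normalization."""
    return normalize(expected) in normalize(ocr_text)


# The failing case from the first production run.
print(ocr_matches("Ross's Goose", "Rossa Coosa"))
# A legible rendering passes even if case or punctuation differs.
print(ocr_matches("Ross's Goose", "ROSS'S GOOSE"))
```

Normalizing away case, spacing, and punctuation keeps the check from failing on a dropped apostrophe while still rejecting genuinely different letters.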

The Fix: We had to force the AI to separate its artistic style from its typographic duties. We updated the prompt to explicitly say: "Subtly incorporate the text... using a clean, highly legible, sans-serif font to ensure perfect readability."

When building AI image pipelines that require text, you cannot leave typography up to the model's artistic interpretation. You have to dictate the font style.

2. The Alpha Channel Assumption

Our stickers require a fully transparent background around the white die-cut border so they look correct on the website.

In our prompt, we asked for a "Solid pure white background." With older image models, we would run a background removal script to key out the white pixels. But gpt-image-1.5, called through the newer OpenAI Responses API, can generate transparent backgrounds natively.

However, we forgot to pass the correct API parameter. The AI generated the white die-cut border perfectly, but filled the rest of the square canvas with a flat gray color because we didn't explicitly ask the API for an alpha channel. Our automated transparency checker caught the gray pixels and failed the run.
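A transparency check along these lines can be very small. The sketch below assumes the PNG has already been decoded into rows of RGBA tuples (for example by an imaging library), and it only inspects the four corners of the canvas. The function name and the corner-only rule are illustrative assumptions, not the checker we actually run.

```python
RGBA = tuple[int, int, int, int]


def corners_are_transparent(pixels: list[list[RGBA]]) -> bool:
    """Return True when all four corner pixels of the canvas have alpha == 0."""
    top, bottom = pixels[0], pixels[-1]
    corners = [top[0], top[-1], bottom[0], bottom[-1]]
    return all(alpha == 0 for (_r, _g, _b, alpha) in corners)


# A canvas filled with opaque gray, like the failed run.
gray_canvas = [[(128, 128, 128, 255)] * 3 for _ in range(3)]

# A canvas with transparent surroundings and one opaque pixel in the middle.
clear_canvas = [[(0, 0, 0, 0)] * 3 for _ in range(3)]
clear_canvas[1][1] = (255, 255, 255, 255)

print(corners_are_transparent(gray_canvas))
print(corners_are_transparent(clear_canvas))
```

Checking only the corners is cheap and works because the sticker sits inside the die-cut border, so the corners of the square canvas should never contain artwork.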

The Fix: We updated our API call to explicitly pass background_transparent: true directly to the image_generation tool in the OpenAI Responses API. The model now natively outputs the PNG with a perfect alpha channel, skipping the need for an external background removal script entirely.
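For reference, here is a hedged sketch of what such a tool configuration might look like as a plain dictionary. To the best of our understanding, OpenAI's public documentation names this option `background` with the value `"transparent"`; the `background_transparent: true` spelling above may be specific to a wrapper layer. The helper name `image_tool_config` is hypothetical, and the exact fields should be checked against the current API reference before use.

```python
def image_tool_config(transparent: bool = True) -> dict:
    """Build an image_generation tool entry for a Responses API request.

    Assumption: the option is spelled `background` with values such as
    "transparent" or "opaque"; verify against the current OpenAI docs.
    """
    return {
        "type": "image_generation",
        "background": "transparent" if transparent else "opaque",
        # PNG is used because it can carry an alpha channel.
        "output_format": "png",
    }


print(image_tool_config())
```

Keeping the configuration in one small function makes it easy to assert in a unit test that the transparency option is always present, which is exactly the parameter that was missing in the failed run.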

The Takeaway

Building autonomous agents is an exercise in edge cases. The "happy path" (like our Western Meadowlark test) often hides the fragility of AI generation.

By building strict validation checks (OCR text matching and alpha channel transparency verification) directly into the Mastra workflow, we prevented broken products from reaching the live store. Those failures forced us to refine our prompts and API calls until the agent could truly run unsupervised.
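The gating idea itself is independent of any framework. A minimal sketch, assuming each validation is exposed as a zero-argument function returning a boolean (the names `failed_checks` and `can_publish` are hypothetical, and this is plain Python rather than Mastra code):

```python
from typing import Callable

Checks = dict[str, Callable[[], bool]]


def failed_checks(checks: Checks) -> list[str]:
    """Run every named check and return the names of the ones that failed."""
    return [name for name, check in checks.items() if not check()]


def can_publish(checks: Checks) -> bool:
    """Allow publication only when no check fails."""
    return not failed_checks(checks)


# Illustrative results mirroring the first production run.
first_run: Checks = {
    "ocr_text_match": lambda: False,
    "alpha_transparency": lambda: False,
}

print(failed_checks(first_run))
print(can_publish(first_run))
```

Collecting every failure instead of stopping at the first one means a single run reports all the problems to fix, which shortens the prompt-and-parameter iteration loop.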

Tomorrow morning at 8 AM, the Minnesota Wildlife Agent will wake up and paint another species. And this time, we know the text will be legible and the background will be transparent. Browse the growing collection at shilpiworks.com.

About the author

Arun Batchu
