Embed-Free Search with the Vercel AI SDK

Arun Batchu & Cascade (AI) · March 29, 2026 · 6 min read

We wanted the Netrii assistant to answer questions from our own site content — blog posts, expert profiles, and wisdom products — without introducing a retrieval stack we would have to babysit forever. That meant no embedding pipeline, no vector database, and no background indexing job. Just structured content, a small search helper, and the Vercel AI SDK wiring to let the model ask for what it needs.

The core idea: if the source material is already structured markdown or typed objects, start with a searchable text index before you reach for embeddings.

What we actually built

The assistant has three pieces. First, a compact system prompt that explains what Netrii is and how the assistant should behave. Second, a `searchContent` tool that scans the site's own data directly. Third, a renderer that keeps internal links in the same browsing context while letting external links open normally. That is enough to make the assistant feel grounded without adding infrastructure.
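The renderer's link handling can be sketched as a single predicate: internal links stay in the in-app browsing context, everything else opens normally. The `isInternalLink` helper and host name below are hypothetical, not the actual Netrii code:

```ts
// Decide whether a link should stay in the in-app browsing context
// or open as a normal external navigation. The host name is an assumption.
const SITE_HOST = 'netrii.com'

function isInternalLink(href: string): boolean {
  // Root-relative paths are always internal.
  if (href.startsWith('/')) return true
  try {
    // Absolute URLs are internal only when they point at our own host.
    return new URL(href).hostname === SITE_HOST
  } catch {
    // Bare fragments and malformed hrefs throw; treat them as external.
    return false
  }
}
```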

```ts
// Build one lowercase haystack per post from its structured fields.
const searchable = [
  post.title,
  post.excerpt,
  post.tags.join(' '),
  ...post.sections.flatMap(section => section.content ?? section.items ?? [])
].join(' ').toLowerCase()

// A post matches on the exact phrase, or when every query word appears.
if (searchable.includes(query) || queryWords.every(word => searchable.includes(word))) {
  results.push(post)
}

// The caller then takes the top matches: results.slice(0, 5)
```
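Expanded into a self-contained helper, the same idea looks roughly like this. The `Post` shape mirrors the fields named in the snippet; the surrounding `searchPosts` function is a hypothetical reconstruction, not the exact Netrii code:

```ts
interface Section {
  content?: string
  items?: string[]
}

interface Post {
  title: string
  excerpt: string
  tags: string[]
  sections: Section[]
}

// Return up to five posts whose concatenated text matches the query,
// either as an exact phrase or with every query word present somewhere.
function searchPosts(posts: Post[], rawQuery: string): Post[] {
  const query = rawQuery.trim().toLowerCase()
  const queryWords = query.split(/\s+/).filter(Boolean)
  if (queryWords.length === 0) return []

  const results = posts.filter(post => {
    const searchable = [
      post.title,
      post.excerpt,
      post.tags.join(' '),
      ...post.sections.flatMap(section => section.content ?? section.items ?? [])
    ].join(' ').toLowerCase()

    return searchable.includes(query) ||
      queryWords.every(word => searchable.includes(word))
  })

  return results.slice(0, 5)
}
```

Because everything is derived from the typed content objects at call time, there is no index to rebuild and nothing to drift out of sync.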

That tiny index turned out to be enough for our use case because the content is already curated. A blog post is not an arbitrary document blob; it has a title, excerpt, tags, and sections. An expert profile has a name, role, and biography. A wisdom product has a description and metadata. Those fields are already semantically useful — they just need to be searchable.

Why not embeddings first?

  • We wanted low operational overhead. No embedding model choice, no reindexing pipeline, no separate datastore, no sync failures.
  • We wanted exactness. For site content, exact titles, phrases, and topic names matter more than fuzzy semantic similarity.
  • We wanted cheap cold starts. The search helper can derive its index from the repo's own data files at runtime.
  • We wanted debuggability. When a result is missing, it is easy to see whether the text was indexed, whether the query matched, or whether the prompt forgot to search.

Embeddings are useful when you have a large, messy corpus with lots of paraphrase and you need semantic recall. But for a content site with a small number of first-party artifacts, they can be a detour. We did not need approximate relevance as much as we needed predictable retrieval from a known set of files.

What the Vercel AI SDK made easy

  1. Tool calling. `streamText()` lets the model decide when to search instead of forcing us to pre-search every question.
  2. Message conversion. `convertToModelMessages()` bridges the UI chat state and the model input cleanly.
  3. Streaming answers. The assistant can answer directly when it already has enough context, and call the search tool when it needs grounding.
  4. Simple serverless shape. The retrieval helper and the LLM live in the same route, which keeps the architecture easy to reason about.
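At its core, the tool the model calls is just a description plus an `execute` function. The sketch below strips away the AI SDK's `tool()` wrapper and schema wiring so the retrieval logic can be shown (and exercised) on its own; the `siteData` entries and field names are illustrative stand-ins for the repo's content files:

```ts
// A stripped-down searchContent tool. In the real route this object is
// wrapped with the AI SDK's tool() helper plus a schema for `query`;
// here only the execute logic is shown, over stand-in site data.
const siteData = [
  { type: 'post', title: 'Embed-Free Search', text: 'search without embeddings or a vector database' },
  { type: 'expert', title: 'Arun Batchu', text: 'ai strategy and emerging technology' }
]

const searchContent = {
  description: 'Search Netrii site content: posts, experts, and products.',
  execute: async ({ query }: { query: string }) => {
    const q = query.trim().toLowerCase()
    const words = q.split(/\s+/).filter(Boolean)
    return siteData
      .filter(item => {
        const haystack = `${item.title} ${item.text}`.toLowerCase()
        return haystack.includes(q) || words.every(word => haystack.includes(word))
      })
      .slice(0, 5)
  }
}
```

Keeping `execute` a plain async function means the retrieval layer can be unit-tested without spinning up a model at all.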

That combination matters more than it sounds. The SDK gives you the shape of the conversation, but it does not force a retrieval architecture on you. That leaves room for a deliberately boring search layer when boring is exactly what you want.

The guardrails that mattered

  • Search first, then answer. If a question might relate to site content, the assistant should search before it speaks from general knowledge.
  • No hallucinated content. If the search returns nothing, the assistant should say the site does not cover that topic yet instead of improvising.
  • Respect the site boundary. Off-topic questions should get a short, friendly redirect back to Netrii topics.
  • Render safely. Any HTML coming from the model must be sanitized before display.
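For the last guardrail, a vetted sanitizer such as DOMPurify is the right production tool; this escape-everything fallback just illustrates the minimum bar before any model output is interpolated into HTML:

```ts
// Escape model-produced text before it reaches the DOM. This is the
// blunt fallback; a real renderer should prefer a vetted sanitizer
// (e.g. DOMPurify) when it needs to preserve safe markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
}
```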

The most important lesson here is that search quality is only half the problem. The other half is teaching the model when to trust the site index and when to stay humble. We learned that a good system prompt is not a replacement for retrieval — it is the instruction layer that makes retrieval useful.

When embed-free search is enough

If your content is first-party, structured, and relatively compact, embed-free search is often the right default. It is especially good when you need exact topic recall, predictable maintenance, and a system that a future maintainer can understand in one sitting. If the corpus grows into something much bigger and fuzzier, you can always move to a hybrid or vector-backed approach later.

That is the real reason we liked the pattern: it is small enough to ship, simple enough to debug, and flexible enough to evolve if the site outgrows it.
