The Safeguarding Resource and Support Hub (RSH) works to make the aid sector safer for the people it serves and the staff it employs. It offers free, contextualised, multilingual resources, e-learning, mentorship and expert guidance to prevent and address sexual exploitation, abuse and sexual harassment. They genuinely make the world a better place, and I appreciate their work.
During their website redesign, they described an interesting pattern in how their users actually search. Two factors shape it. First, many users don't have the digital literacy we tend to assume in European or Western contexts: search is not a tool they've used confidently for years. Second, and more importantly, many of them don't yet have the words "abuse," "exploitation," or "harassment" in their vocabulary.
In technical terms: the existing keyword search needed to be extended with a semantic (vector) search, combined into a hybrid setup that returns relevant results not only on exact keyword matches but also on synonyms, longer questions, and descriptive phrasing. A further requirement was that it had to work in Arabic, Urdu and Bangla as well.
How does vector search work?
Put simply: we take pieces of text and represent each one as a point in a high-dimensional space. (The arrow pointing from the origin to that point is a vector, and the process of producing it is called embedding.) The closer two such vectors sit in this space, the closer the meanings of the texts they came from. Think of how a library organises books by subject: books on similar topics end up next to each other on the same shelf. When someone searches, we turn the query into a vector too, and look for the vectors nearest to it, much like reaching for the right shelf.
In a library, a classification system like the Universal Decimal Classification decides which book goes on which shelf. In our case, a large language model plays that role: it gives us the function that turns text into a vector.
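To make the geometry concrete, here is a minimal Python sketch. The three-dimensional vectors are toy stand-ins for real embeddings (which have hundreds or thousands of dimensions), and the texts and numbers are invented for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means closer meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; a real model would produce these vectors.
embeddings = {
    "reporting workplace harassment": [0.90, 0.10, 0.20],
    "how to raise a concern about a colleague": [0.85, 0.15, 0.25],
    "annual budget spreadsheet": [0.10, 0.90, 0.30],
}

query_vector = [0.88, 0.12, 0.22]  # pretend this is the embedded user query

# The nearest vector wins, even with zero keyword overlap between the texts.
best = max(embeddings, key=lambda text: cosine_similarity(query_vector, embeddings[text]))
```

Note that "reporting workplace harassment" and "how to raise a concern about a colleague" share no keywords, yet their vectors sit close together; that closeness is exactly what the search exploits.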
While we're at it — what is RAG?
Imagine you want to learn something from the books in a library. Instead of searching for the answer yourself, you ask the librarian. They pull the five most relevant volumes off the shelf, summarise the answer for you, and point you to the specific passages.
RAG stands for Retrieval-Augmented Generation. The Retrieval part is the librarian picking the relevant books; the Augmented Generation part is them weaving those books and their own knowledge into an answer. Google does something similar with the AI summaries it now shows above search results.
RAG wasn't part of the brief here, but it fits naturally on top of a vector search architecture — an additional layer where results aren't just listed, but interpreted for the user.
How do you fit a long, multi-topic document — sometimes 50 pages — into a single fixed-size vector?
There are a few approaches. Let me illustrate them with deliberately exaggerated examples.
Truncation. Embed only a fixed-size chunk from the start of the text. Fast and cheap — but imagine trying to find Orwell's 1984 based only on its first chapter.
LLM-based summarisation. Generate a fixed-size, search-optimised summary of the whole document, and embed that. Which book does this sentence summarise: "In a totalitarian future society, a man rebels against the system and fails." 1984? Brave New World? Fahrenheit 451? On that summary alone, it could be any of them.
Sampling. Split the text into fixed-size chunks, pick a few, embed each, and average them into a single vector. But take the Wikipedia article on World War II: military operations, economic context, political alliances, the Holocaust, cultural impact, technological developments, aftermath. Each section is a dense semantic core on its own. The average comes out to something like "a major historical event" — uselessly generic.
All three suffer from the same flaw: the semantic signal gets blurred, and neither the whole nor the important specifics come through. You can refine any of them further, but every refinement costs another step and more resources.
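The blurring is easy to demonstrate numerically. A toy sketch with two-dimensional stand-in embeddings (real ones have far more dimensions, but the geometry is the same):

```python
import math

def cos(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Two topically unrelated chunks of the same document, as toy embeddings.
military_chunk = [1.0, 0.0]
culture_chunk = [0.0, 1.0]

# Sampling/averaging collapses them into one document vector.
document_vector = [(m + c) / 2 for m, c in zip(military_chunk, culture_chunk)]

query = [1.0, 0.0]  # a query squarely about the military chunk

chunk_match = cos(query, military_chunk)      # perfect match on the chunk itself
blurred_match = cos(query, document_vector)   # noticeably weaker on the average
```

The more diverse the chunks, the closer the averaged vector drifts toward that useless "a major historical event" middle ground.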
I wanted a solution where the information stays intact — if not entirely, then in as much depth as possible. Drupal doesn't ship with anything like this, so I built it from scratch. Every document gets split into fixed-size chunks, each chunk gets its own vector, and all chunks are linked to a shared "parent" record that holds the document's metadata (title, author, and so on).
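The actual implementation is Drupal-specific, but the data model can be sketched in a few lines of Python. The class names and the 500-character chunk size below are assumptions for the example, not the project's real values:

```python
from dataclasses import dataclass

CHUNK_SIZE = 500  # characters per chunk; the real size is a tuning choice

@dataclass
class ParentDocument:
    """Shared record holding the document's metadata."""
    doc_id: str
    title: str
    author: str

@dataclass
class Chunk:
    parent_id: str       # link back to the shared parent record
    text: str
    vector: list = None  # filled in later by the embedding service

def split_into_chunks(doc: ParentDocument, body: str) -> list:
    """Fixed-size chunks; each one points back at the parent's metadata."""
    return [
        Chunk(parent_id=doc.doc_id, text=body[i:i + CHUNK_SIZE])
        for i in range(0, len(body), CHUNK_SIZE)
    ]
```

Because every chunk carries its own vector, a 50-page document can match a query on page 37 just as sharply as on page 1, and the parent link means the hit still surfaces as the document the user actually wants.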
When a user types a query into the search bar, two things happen in parallel:
The keyword search runs the usual way over document content and metadata — fast, and unbeatable on exact matches.
The semantic search embeds the user's query into the same vector space as the content, then finds the chunks whose meaning sits closest. The parent document of each matching chunk enters the result list with a score reflecting the strength of the semantic match.
The two result lists are then merged — and this is where the solution becomes genuinely project-specific. How you weight keyword versus vector hits isn't a matter of a universal formula: the nature of the content, the way users actually search, and the goal of the project all shape what the right ratio is. For legal texts, exact matches matter far more than they would on a lifestyle blog.
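As a rough sketch of one common merging strategy, weighted linear fusion: both result lists are assumed to carry scores normalised to 0..1, and the 0.6/0.4 split below is purely illustrative; the real weighting is tuned to the project's content and users.

```python
def merge_results(keyword_hits, vector_hits, keyword_weight=0.6):
    """Weighted linear fusion of two {doc_id: score} maps.

    Assumes both score sets are normalised to 0..1. The keyword_weight
    default is an illustrative assumption, not a universal formula.
    """
    merged = {}
    for doc_id, score in keyword_hits.items():
        merged[doc_id] = keyword_weight * score
    for doc_id, score in vector_hits.items():
        merged[doc_id] = merged.get(doc_id, 0.0) + (1 - keyword_weight) * score
    # Documents found by both searches accumulate score from both lists.
    return sorted(merged, key=merged.get, reverse=True)
```

A document that appears in both lists gets a boost from each, which is usually the behaviour you want: agreement between the two searches is a strong relevance signal.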
What hybrid search gives you
Keyword search is unbeatable on exact wording, identifiers and names. Vector search is unbeatable on paraphrases and conceptual overlap. Combining the two gives you the strengths of both.
With the right embedding model, vector search also crosses language boundaries. For RSH, that means an English query can surface relevant Arabic, Urdu or Bangla material without a separate translation step. Which languages are supported depends on the embedding model you pick.
If the embedding service goes down, the search degrades gracefully rather than breaking: keyword search is always available, and embedding catches up when the service comes back.
The result was convincing: the search is fast, the hits are genuinely relevant, and the multilingual side works as expected. Most importantly, the people who don't quite know what they're searching for — the people who need help the most — won't be left without it.
The new search is not yet running on the live site; the public RSH website continues to use the old search engine.