Vector Search Is Not All You Need

Author: Murphy

Retrieval Augmented Generation (RAG) has revolutionized open-domain question answering, enabling systems to produce human-like responses to a wide array of queries. At the heart of RAG lies a retrieval module that scans a vast corpus to find relevant context passages, which are then processed by a neural generative module – often a pre-trained language model like GPT-3 – to formulate a final answer.

While this approach has been highly effective, it's not without its limitations.

One of the most critical components, the vector search over embedded passages, has inherent constraints that can hamper the system's ability to reason in a nuanced manner. This is particularly evident when questions require complex multi-hop reasoning across multiple documents.

Vector search refers to searching for information using vector representations of data. It involves two key steps:

  1. Encoding data into vectors

First, the data being searched is encoded into numeric vector representations. For text such as passages or documents, this is done with embedding models like BERT or RoBERTa, which convert text into dense vectors of continuous numbers representing its semantic meaning. Images, audio, and other formats can also be encoded into vectors using appropriate deep learning models.
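For instance, here is a minimal sketch of this encoding step using the sentence-transformers library (one popular choice; the model name and passages are illustrative):

```python
# A minimal sketch of the encoding step; any embedding model would serve.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder

passages = [
    "Paris is the most populous city in France.",
    "The earliest chemical evidence of wine was found in Georgia.",
]

# encode() returns one dense vector per passage (384 dimensions for this model)
embeddings = model.encode(passages)
print(embeddings.shape)  # (2, 384)
```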

  2. Searching using vector similarity

Once data is encoded into vectors, searching involves finding vectors similar to the vector representation of the search query. This relies on distance metrics like cosine similarity to quantify how close two vectors are and rank results. The vectors with the smallest distance (highest similarity) are returned as the most relevant search hits.
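A minimal sketch of this ranking step, assuming a small in-memory list of passages:

```python
# Similarity-based ranking over a tiny illustrative in-memory index.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

passages = [
    "Paris is the most populous city in France.",
    "French wine exports grew steadily last decade.",
    "Berlin is the capital of Germany.",
]
passage_vecs = model.encode(passages)            # the "index"
query_vec = model.encode("What is the capital of France?")

# cosine similarity between the query vector and every passage vector
scores = passage_vecs @ query_vec / (
    np.linalg.norm(passage_vecs, axis=1) * np.linalg.norm(query_vec)
)

# rank passages from most to least similar
for i in np.argsort(scores)[::-1]:
    print(f"{scores[i]:.3f}  {passages[i]}")
```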

The key advantage of vector search is the ability to search for semantic similarity, not just literal keyword matches. The vector representations capture conceptual meaning, allowing more relevant yet linguistically distinct results to be identified. This enables a higher quality of search compared to traditional keyword matching.

However, transforming data into vectors and searching in high-dimensional semantic space also comes with limitations. Balancing the tradeoffs of vector search is an active area of research.

In this article, we'll dissect the limitations of vector search, exploring why it struggles to capture diverse relationships and intricate interconnections between documents. We'll also delve into alternative techniques, such as knowledge graph prompting, that promise to overcome these shortcomings.

Understanding the strengths and weaknesses of our current AI tools is essential as they become increasingly integrated into our lives. This article aims to provide a balanced view of where vector search shines and where it falls short in augmenting the reasoning capabilities of large language models.

Semantic Gap Between Questions and Answers

In vector search, both the input question and passages in the corpus are encoded as dense vector representations. Relevant context is retrieved by finding passages with the highest semantic similarity to the question vector.

However, questions often have an indirect relationship to the actual answers they seek.

The vector for "What is the capital of France?" will not necessarily have high similarity to a passage stating "Paris is the most populous city in France".

This semantic gap means that passages containing the answer may be overlooked.

The embeddings fail to capture the inferential link between the question and answer.
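One way to see this concretely is to measure the similarities directly. The sketch below uses illustrative sentences; the actual scores depend entirely on the embedding model, so treat it as a way to probe the gap, not as fixed numbers:

```python
# Illustrative only: the numbers depend entirely on the embedding model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

question = "What is the capital of France?"
answer_passage = "Paris is the most populous city in France."   # implies the answer
lexical_echo = "Many countries in Europe have a capital city."  # echoes wording, no answer

q, a, e = model.encode([question, answer_passage, lexical_echo])
print("question vs answer passage:", float(util.cos_sim(q, a)))
print("question vs lexical echo:  ", float(util.cos_sim(q, e)))
```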

Passage Granularity Matters

In vector search systems, passages are typically represented by a single embedding vector. The granularity of these passages can vary.

If the passage is very large, like an entire document, it may encompass multiple concepts. Parts of the passage may be relevant, while other parts are not.

But with a single vector representing the entire passage, it is impossible to distinguish relevant sections from irrelevant ones. The passage as a whole may exhibit only weak similarity to the question vector.

Conversely, using sentence-level chunks can help isolate concepts. But this increases the number of vectors in the index, adding computational overhead.

There are inherent trade-offs between precision and tractability when choosing passage sizes.
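The toy sketch below contrasts the two extremes, using a naive sentence splitter purely for illustration:

```python
# Toy contrast between document-level and sentence-level chunking.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

document = (
    "Paris is the most populous city in France. "
    "The city is a major hub for fashion and art. "
    "French wine exports grew steadily last decade."
)

# Option A: one vector for the whole document (small index, mixed concepts)
doc_vec = model.encode(document)

# Option B: one vector per sentence (isolated concepts, larger index)
sentences = [s.strip() for s in document.split(". ") if s.strip()]  # naive splitter
sent_vecs = model.encode(sentences)

print("document-level vectors:", 1, "| dims:", doc_vec.shape)
print("sentence-level vectors:", len(sent_vecs))
```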

Struggle With Complex Reasoning

Some questions demand synthesizing facts spread across multiple documents.

For example, "What is the earliest historical record of winemaking?" may require piecing together dates from various sources.

Vector search is ill-equipped for such multi-hop reasoning. Each passage is scored independently against the question. There is no mechanism to jointly analyze or connect information across separate results.

As questions get more complex, simple similarity search reaches its limits. The system struggles to collect and contextualize facts from different passages.
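A common partial workaround is iterative retrieval, where the first hop's result is appended to the query for a second search. The sketch below, over a hypothetical three-passage corpus, shows the idea, though it still falls well short of true joint reasoning:

```python
# A naive two-hop retrieval loop over a hypothetical three-passage corpus.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

passages = [
    "The earliest chemical evidence of wine was found in Georgia.",
    "Residue in Georgian clay jars has been dated to around 6000 BC.",
    "Roman authors later documented viticulture in detail.",
]
passage_vecs = model.encode(passages)

def retrieve(query: str, k: int = 1) -> list[str]:
    # Each passage is scored independently against the query (the core limitation)
    q = model.encode(query)
    scores = passage_vecs @ q / (
        np.linalg.norm(passage_vecs, axis=1) * np.linalg.norm(q)
    )
    return [passages[i] for i in np.argsort(scores)[::-1][:k]]

query = "What is the earliest historical record of winemaking?"
hop1 = retrieve(query)[0]
hop2 = retrieve(query + " " + hop1)[0]  # reformulate with the first result
print(hop1, hop2, sep="\n")
```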

Black Box Model Workings

In standard vector search pipelines, it is often unclear why particular passages were retrieved: the rankings depend on the inner workings of the semantic similarity model.

This lack of transparency makes results difficult to explain, verify, and improve. It also limits deployability for business-critical applications.

For increased oversight, the ranking algorithms should provide some interpretability into why certain passages are deemed relevant.

Modeling Diverse Relationships

A core limitation of standard vector search is its singular focus on semantic similarity.

However, real-world reasoning requires modeling diverse relationships between content.

Knowledge Graph Prompting: A New Approach for Multi-Document Question Answering

Knowledge graph prompting (KGP) overcomes this by explicitly encoding these varied connections into an interconnected graph structure. Specifically:

  • Topical relationships – Passages are linked if they share rare or key keywords, capturing similarity in the topics discussed.
  • Semantic relationships – Passage embeddings are compared to connect passages with semantic proximity, even when they share no terms. This goes beyond surface-level topic matching.
  • Structural relationships – Passages are connected to the specific sections, pages, or documents they appear in, encoding the contextual hierarchy.
  • Temporal relationships – Passages discussing time-ordered events are linked chronologically, representing the flow of events.
  • Entity relationships – Coreference links are added between passages referencing the same real-world entities, enabling entity-centric reasoning.

By incorporating these diverse signals beyond just semantic similarity, KGP provides a richer substrate for reasoning about interconnected information.
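As a rough illustration, the sketch below builds a small passage graph with topical and semantic edges using networkx. The stop-word filter and the 0.5 similarity threshold are illustrative assumptions, not the exact construction from any particular KGP implementation:

```python
# Builds topical and semantic edges over three toy passages.
import itertools
import networkx as nx
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

passages = {
    "p1": "Paris is the most populous city in France.",
    "p2": "The capital of France hosts major museums.",
    "p3": "Georgian clay jars held the earliest known wine.",
}

G = nx.MultiGraph()  # MultiGraph allows several relation types per passage pair
G.add_nodes_from(passages)
vecs = {pid: model.encode(text) for pid, text in passages.items()}
stop_words = {"the", "of", "a", "in", "is"}

for a, b in itertools.combinations(passages, 2):
    # Topical edge: the passages share a (crudely filtered) keyword
    words_a = set(passages[a].lower().replace(".", "").split())
    words_b = set(passages[b].lower().replace(".", "").split())
    shared = (words_a & words_b) - stop_words
    if shared:
        G.add_edge(a, b, relation="topical", keywords=sorted(shared))
    # Semantic edge: embedding similarity above the assumed threshold
    sim = float(vecs[a] @ vecs[b] /
                (np.linalg.norm(vecs[a]) * np.linalg.norm(vecs[b])))
    if sim > 0.5:
        G.add_edge(a, b, relation="semantic", similarity=round(sim, 3))

print(G.edges(data=True))
```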

Structural Relationships

Knowledge graph prompting's modeling of structural relationships merits further discussion. By linking passages to the specific documents or sections they appear in, the contextual hierarchy of the information is encoded.

This enables explicit reasoning about the section a given fact is contained in, the document it originates from, and the site it was published on.

In contrast, standard vector search lacks any notion of these structural relationships: passages are treated atomically, without any surrounding context.

Encoding the hierarchical document structure provides useful inductive biases for determining importance, validity, and relevance when reasoning across passages.
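A minimal sketch of the structural edges described above, with hypothetical document and section names:

```python
# Passages attached to hypothetical section and document nodes.
import networkx as nx

G = nx.DiGraph()

corpus = [
    ("doc1", "Introduction", "p1", "Paris is the most populous city in France."),
    ("doc1", "Economy", "p2", "French wine exports grew steadily last decade."),
    ("doc2", "History", "p3", "Georgian clay jars held the earliest known wine."),
]

for doc, section, pid, text in corpus:
    sec_node = f"{doc}/{section}"
    G.add_edge(doc, sec_node, relation="has_section")
    G.add_edge(sec_node, pid, relation="contains")
    G.nodes[pid]["text"] = text

# The hierarchy makes questions like "which section does p1 come from?" explicit
print(list(G.predecessors("p1")))  # ['doc1/Introduction']
```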

Temporal Relationships

The explicit modeling of temporal relationships in KGP also provides significant advantages. Ordering passages chronologically, based on the events they describe, enables reasoning about unfolding narratives and timelines.

This capability is wholly absent in isolated vector search: similarity scores do not factor in temporal dynamics, so retrieved passages are disconnected snapshots lacking narrative flow.

Knowledge graph prompting overcomes this limitation by chaining events based on their relative timing, unlocking richer reasoning capabilities.
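A sketch of this temporal chaining, with hard-coded years standing in for a real date or event extraction step:

```python
# Hard-coded years stand in for a real date/event extractor.
import networkx as nx

passages = [
    ("p1", "Residue in Georgian jars shows winemaking around 6000 BC.", -6000),
    ("p2", "Egyptian tomb paintings depict winemaking around 1500 BC.", -1500),
    ("p3", "Roman authors documented viticulture by 100 AD.", 100),
]

G = nx.DiGraph()
for pid, text, year in passages:
    G.add_node(pid, text=text, year=year)

# Chain each passage to the next one in chronological order
ordered = sorted(passages, key=lambda p: p[2])
for (a, _, _), (b, _, _) in zip(ordered, ordered[1:]):
    G.add_edge(a, b, relation="precedes")

print(list(nx.topological_sort(G)))  # ['p1', 'p2', 'p3']
```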

Entity Relationships

Knowledge graph prompting's ability to connect entity references is a powerful asset. Linking passages that discuss the same real-world entities, concepts, or people allows focused reasoning around those shared elements.

With standard vector search, these entity links are not directly modeled; valuable knowledge about entities is lost in the passage embeddings.

KGP preserves this signal, enabling entity-centric exploration of the knowledge graph, which provides structural advantages when aggregating facts about specific entities across documents.
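A sketch of such entity edges, using spaCy's named entity recognizer as a stand-in for a full coreference and entity-linking pipeline:

```python
# spaCy NER stands in for a full coreference/entity-linking pipeline.
import itertools
import networkx as nx
import spacy

nlp = spacy.load("en_core_web_sm")  # install via: python -m spacy download en_core_web_sm

passages = {
    "p1": "Paris is the most populous city in France.",
    "p2": "The Louvre in Paris draws millions of visitors.",
    "p3": "Georgian clay jars held the earliest known wine.",
}

# Extract the named entities mentioned in each passage
entities = {pid: {ent.text for ent in nlp(text).ents} for pid, text in passages.items()}

G = nx.Graph()
G.add_nodes_from(passages)
for a, b in itertools.combinations(passages, 2):
    shared = entities[a] & entities[b]
    if shared:
        G.add_edge(a, b, relation="entity", entities=sorted(shared))

print(G.edges(data=True))  # p1 and p2 should link via the shared entity "Paris"
```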

Conclusion

Vector search enables efficient approximate matching based on semantic similarity. However, it has clear limitations when used in isolation for the retrieval step in RAG systems.

Employing hybrid approaches that combine vector search with graph-based knowledge representation, multi-step reasoning modules, and transparent ranking algorithms can help overcome these weaknesses.
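As a rough sketch of what one such hybrid step might look like, the function below assumes a passage graph `G` like those sketched earlier and a vector-search helper `retrieve`, both hypothetical:

```python
# `G` is a passage graph like those sketched above; `retrieve` is a
# vector-search helper returning passage ids. Both are assumptions here.
def hybrid_retrieve(query, G, retrieve, k=3, hops=1):
    seeds = set(retrieve(query, k))      # step 1: seed with vector search
    expanded, frontier = set(seeds), seeds
    for _ in range(hops):                # step 2: expand along graph edges
        frontier = {n for pid in frontier for n in G.neighbors(pid)} - expanded
        expanded |= frontier
    return expanded                      # seeds plus their graph neighborhood
```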

As always, there is no single solution – leveraging a diverse toolkit of techniques is key to robust retrieval for real-world question answering.


