A Simple Strategy to Improve LLM Query Generation
In March 2024, I wrote a tutorial on Real Python detailing the steps to build a retrieval-augmented generation (RAG) chatbot using LangChain. The chatbot, formally named the Hospital System Chatbot, uses Neo4j to retrieve data from a synthetic hospital system dataset with information about patients, patient reviews, hospital locations, visits, insurance payers, and physicians.
The Hospital System Chatbot agent is hosted via FastAPI, and can be accessed through a Streamlit app – all packaged up with Docker Compose:

All code is available on GitHub, and anyone can read about the project in detail on Real Python.
GitHub – hfhoffman1144/langchain_neo4j_rag_app: A knowledge graph RAG app using LangChain and…
Since writing the tutorial, I've maintained the project on GitHub to improve the Hospital System Chatbot's capabilities and make it more production-ready. For instance, I've added unit tests, refactored code to address deprecated features, created deployments with GitHub Actions, and most recently integrated dynamic few-shot prompting to improve Cypher query generation.
In this tutorial, we'll focus on improving the Hospital System Chatbot's ability to generate (Cypher) queries through dynamic few-shot prompting. This technique will take us closer to a more accurate, production-ready chatbot that we can seamlessly update over time.
Chatbot Overview
Let's start with a brief overview of the Hospital System Chatbot. For a deep dive into the chatbot or a primer on Neo4j and LangChain, see the original Real Python tutorial.
At its core, the Hospital System Chatbot is a LangChain agent with access to multiple tools to help it answer questions about a synthetic hospital system dataset, originally derived from a popular health care dataset on Kaggle. Here's a summary of the project's current features:
- Tool calling: The chatbot agent uses multiple tools such as LangChain chains for RAG and (simulated) API calls.
- RAG over unstructured data: The chatbot can answer questions about patient experiences based on their reviews. Patient reviews are embedded using OpenAI embedding models and stored in a Neo4j vector index. Currently, the RAG over unstructured data is bare-bones and doesn't implement advanced RAG techniques like reranking or query transformation.
- RAG over structured data (Text-to-Cypher): The chatbot can answer questions about structured hospital system data stored in a Neo4j graph database. If the chatbot agent thinks it can respond to an input natural language query by querying the Neo4j graph, it will try to generate and run a Cypher query.
- Serving via FastAPI and Streamlit: The chatbot agent is hosted via an asynchronous FastAPI endpoint and accessible through a Streamlit app.
Everything is packaged up and orchestrated with Docker Compose – more details about how to run the chatbot are in the README.
langchain_neo4j_rag_app/README.md at main · hfhoffman1144/langchain_neo4j_rag_app
In this tutorial, we'll focus on improving RAG over structured data. To see an example, suppose we ask the chatbot the following question:

To answer the question "how many hospitals are in the hospital system?", the underlying LangChain agent first has to invoke the query generation tool. Here's what that looks like in the API logs:

In this case, the agent invoked the explore_hospital_database tool, which is a LangChain chain that generates Cypher queries and runs them on a Neo4j database.
To correctly answer our question, the explore_hospital_database tool has to generate and run a Cypher query like this:
MATCH (h:Hospital)
RETURN COUNT(h) AS total_hospitals
This simple Cypher query counts the number of hospital nodes in the Neo4j database. If the tool can successfully generate and run this query in the database, the results are returned to the agent and displayed to the user.
For anyone interested in diving deeper into Text-To-Cypher, or exploring how to integrate LLMs with Neo4j, I highly recommend reading Tomaz Bratanic's articles.
In the remainder of this tutorial, we'll focus on improving the agent's ability to generate Cypher queries through dynamic few-shot prompting. To motivate why we need this, let's first explore some major limitations of database query generation.
The Problem With Query Generation
LLM query generation tasks like Text-To-SQL and Text-To-Cypher are popular because of their potential to abstract away complex query logic. For non-technical stakeholders, LLM query generation can help answer critical questions without needing to understand a query language, request a report from an analyst, or wait for someone to build a dashboard. Instead, the stakeholder prompts a chatbot with a natural language query, and the results are returned in a digestible text summary.
At a high level, a database query generation workflow looks something like this:

Here's a breakdown of the steps:
- Natural Language Query: The user first enters a natural language query, like Show me the average billing amount by state, into an interface like a chatbot.
- Prompt: The user's natural language query is injected into a prompt that tells the LLM to generate a database query to answer the natural language query. The prompt needs to include information about the database such as schema definitions, column definitions, and example queries.
- LLM: The prompt is given to an LLM that attempts to generate and run a query in the database.
- Summarized Response: If the query executes, the LLM returns the results to the user in a digestible text summary.
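The first two steps above can be sketched in a few lines of Python. The template, schema string, and function names below are illustrative placeholders, not the project's actual prompt:

```python
# Minimal sketch of steps 1-2: inject the user's natural language query (and a
# schema description) into a static prompt template. The template and schema
# here are illustrative placeholders.
PROMPT_TEMPLATE = """Task: Generate a Cypher query for a Neo4j graph database.
Schema:
{schema}
The question is:
{question}"""

def build_prompt(schema: str, question: str) -> str:
    """Fill the static template with the schema and the user's query."""
    return PROMPT_TEMPLATE.format(schema=schema, question=question)

prompt = build_prompt(
    schema="(:Visit)-[:AT]->(:Hospital)",
    question="Show me the average billing amount by state",
)
# Steps 3-4 would send `prompt` to an LLM, execute the generated query, and
# summarize the results for the user.
```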
The main issue with this approach is that the prompt must contain enough information about the database to generate an accurate query. This task is challenging because most real-world databases have complex designs. For example, transactional databases and data warehouses often store hundreds or thousands of interconnected tables, while a graph database can have hundreds of node and relationship types.
Because of this, it's challenging, if not impossible, for a static prompt to contain enough information about the database and example queries for the LLM to consistently generate accurate queries. Even if we can fit a complete database description within the LLM's context window, it might still struggle due to a lack of relevant query examples.
To see this shortcoming, consider the Neo4j schema for the hospital system dataset:

Each node and relationship can also have properties. For instance, the Visit node has the following properties:

This Neo4j database is relatively simple with six distinct node and relationship types. Although you can still write complex queries on this schema, a real-world database will be much more intricate and require more complex queries. Given the schema information and a few examples, we might expect an LLM to perform well at generating queries for this database.
Now suppose we create the following prompt for the LLM that generates Cypher queries for this data:
Task:
Generate Cypher query for a Neo4j graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note:
Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything other than
for you to construct a Cypher statement. Do not include any text except
the generated Cypher statement. Make sure the direction of the relationship is
correct in your queries. Make sure you alias both entities and relationships
properly (e.g. [c:COVERED_BY] instead of [:COVERED_BY]). Do not run any
queries that would add to or delete from
the database. Make sure to alias all statements that follow as with
statement (e.g. WITH v as visit, c.billing_amount as billing_amount)
If you need to divide numbers, make sure to
filter the denominator to be non zero.
Warning:
- Never return a review node without explicitly returning all of the properties
besides the embedding property
- Make sure to use IS NULL or IS NOT NULL when analyzing missing properties.
- You must never include the
statement "GROUP BY" in your query.
- Make sure to alias all statements that
follow as with statement (e.g. WITH v as visit, c.billing_amount as
billing_amount)
- If you need to divide numbers, make sure to filter the denominator to be non
zero.
String category values:
Test results are one of: 'Inconclusive', 'Normal', 'Abnormal'
Visit statuses are one of: 'OPEN', 'DISCHARGED'
Admission Types are one of: 'Elective', 'Emergency', 'Urgent'
Payer names are one of: 'Cigna', 'Blue Cross', 'UnitedHealthcare', 'Medicare',
'Aetna'
If you're filtering on a string, make sure to lowercase the property and filter
value.
A visit is considered open if its status is 'OPEN' and the discharge date is
missing.
Use state abbreviations instead of their full name. For example, you should
change "Texas" to "TX", "Colorado" to "CO", "North Carolina" to "NC", and so on.
The question is:
{question}
This prompt has two parameters – schema and question. The schema parameter should contain all of the node and relationship definitions in the graph, and the question parameter is the user's natural language query. As you can see, there are no example queries in this prompt.
With this prompt, we can ask the chatbot basic questions about the hospital:

The chatbot successfully generated a query to count the number of hospitals in the system using only the above prompt. Now, let's see what happens when we ask the chatbot a slightly more complicated question:

In response to this question, the Cypher generation chain generated and attempted to execute this Cypher query:
MATCH (v:Visit)
WHERE v.status = 'closed' AND v.admission_type = 'emergency'
RETURN AVG(duration(between(date(v.admission_date), date(v.discharge_date))))
AS average_duration_days
There are two issues with this query:
- The LLM ignored the information in the prompt stating that Visit statuses are one of: 'OPEN', 'DISCHARGED'. Instead, it tried to filter the visit status to closed.
- The LLM tried to use a function called between(), which doesn't exist in Neo4j.
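For comparison, a query consistent with the schema and the documented status values might look something like this (one possible formulation, assuming we want the average visit duration in days; note the lowercased string filters and the valid duration.between() function):

```cypher
MATCH (v:Visit)
WHERE lower(v.status) = 'discharged' AND lower(v.admission_type) = 'emergency'
RETURN AVG(duration.between(date(v.admission_date), date(v.discharge_date)).days)
AS average_duration_days
```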
These results are quite concerning, and this isn't an isolated example. Without any Cypher examples in the prompt, the Cypher generation chain has no chance of generating accurate queries in many scenarios:

We can improve the Cypher chain's query generation accuracy by including example questions and corresponding Cypher queries in the prompt. Here's an updated version of the prompt:
Task:
Generate Cypher query for a Neo4j graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note:
Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything other than
for you to construct a Cypher statement. Do not include any text except
the generated Cypher statement. Make sure the direction of the relationship is
correct in your queries. Make sure you alias both entities and relationships
properly (e.g. [c:COVERED_BY] instead of [:COVERED_BY]). Do not run any
queries that would add to or delete from
the database. Make sure to alias all statements that follow as with
statement (e.g. WITH v as visit, c.billing_amount as billing_amount)
If you need to divide numbers, make sure to
filter the denominator to be non zero.
Example queries for this schema:
# Who is the oldest patient and how old are they?
MATCH (p:Patient)
RETURN p.name AS oldest_patient,
duration.between(date(p.dob), date()).years AS age
ORDER BY age DESC
LIMIT 1
# Which physician has billed the least to Cigna?
MATCH (p:Payer)<-[c:COVERED_BY]-(v:Visit)-[t:TREATS]-(phy:Physician)
WHERE lower(p.name) = 'cigna'
RETURN phy.name AS physician_name, SUM(c.billing_amount) AS total_billed
ORDER BY total_billed
LIMIT 1
# How many non-emergency patients in North Carolina have written reviews?
match (r:Review)<-[:WRITES]-(v:Visit)-[:AT]->(h:Hospital)
where lower(h.state_name) = 'nc' and lower(v.admission_type) <> 'emergency'
return count(*)
# Which state had the largest percent increase in Cigna visits from 2022 to 2023?
MATCH (h:Hospital)<-[:AT]-(v:Visit)-[:COVERED_BY]->(p:Payer)
WHERE lower(p.name) = 'cigna' AND v.admission_date >= '2022-01-01' AND
v.admission_date < '2024-01-01'
WITH h.state_name AS state, COUNT(v) AS visit_count,
SUM(CASE WHEN v.admission_date >= '2022-01-01' AND
v.admission_date < '2023-01-01' THEN 1 ELSE 0 END) AS count_2022,
SUM(CASE WHEN v.admission_date >= '2023-01-01' AND
v.admission_date < '2024-01-01' THEN 1 ELSE 0 END) AS count_2023
WITH state, visit_count, count_2022, count_2023,
(toFloat(count_2023) - toFloat(count_2022)) / toFloat(count_2022) * 100
AS percent_increase
RETURN state, percent_increase
ORDER BY percent_increase DESC
LIMIT 1
Warning:
- Never return a review node without explicitly returning all of the properties
besides the embedding property
- Make sure to use IS NULL or IS NOT NULL when analyzing missing properties.
- You must never include the
statement "GROUP BY" in your query.
- Make sure to alias all statements that
follow as with statement (e.g. WITH v as visit, c.billing_amount as
billing_amount)
- If you need to divide numbers, make sure to filter the denominator to be non
zero.
String category values:
Test results are one of: 'Inconclusive', 'Normal', 'Abnormal'
Visit statuses are one of: 'OPEN', 'DISCHARGED'
Admission Types are one of: 'Elective', 'Emergency', 'Urgent'
Payer names are one of: 'Cigna', 'Blue Cross', 'UnitedHealthcare', 'Medicare',
'Aetna'
If you're filtering on a string, make sure to lowercase the property and filter
value.
A visit is considered open if its status is 'OPEN' and the discharge date is
missing.
Use state abbreviations instead of their full name. For example, you should
change "Texas" to "TX", "Colorado" to "CO", "North Carolina" to "NC", and so on.
The question is:
{question}
This new prompt includes four example questions along with their corresponding Cypher statements. Let's see how this improves answers to the previous questions:

And the other two questions:

While the examples in the updated prompt helped the Cypher generation chain answer a few more questions, it still struggles with many others.
We could continue adding more example questions to the prompt, but there are two fundamental flaws with fixed prompts for query generation:
- Many of the example queries included in the prompt are not relevant to the current natural language query given by the user.
- If we continue to add example queries to the prompt, we'll incur more costs and eventually surpass the LLM's context window.
If we want the LLM to generate reliable queries, we need a better strategy – this is where dynamic few-shot prompting comes in.
Dynamic Few-Shot Prompting
In database query generation, dynamic few-shot prompting is a technique that updates the prompt with example database queries relevant to the user's current natural language query. The following diagram illustrates the general workflow for dynamic few-shot prompting in query generation:

Dynamic few-shot prompting adds a few additional steps on top of the original query generation workflow:
- Example Database Queries: Natural language queries are embedded and stored along with database queries as metadata in a vector index, allowing us to search for semantically relevant natural language queries and extract their corresponding database query.
- Vector Index: At inference time, we run the user's natural language query through the vector index to extract semantically similar natural language queries and their respective database queries. Ideally, this gives us a set of example database queries relevant to answering the current natural language query.
- Prompt: We inject the natural language query and relevant example database queries into the prompt, creating a prompt that changes with the user's natural language query.
When done correctly, dynamic few-shot prompting overcomes the limitations of our original query generation strategy. Namely, we don't have to store hundreds or thousands of static (and potentially irrelevant) example database queries in the prompt. Instead, the examples are updated based on the current natural language query.
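To make this concrete, here's a minimal, self-contained sketch of dynamic example selection. It substitutes a toy word-overlap similarity for a real embedding model and vector index, and the example store below is illustrative:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": a bag of lowercased words. A real system would use an
    # embedding model (e.g., OpenAI embeddings) and a vector index.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Example store: natural language questions with their Cypher queries as metadata.
EXAMPLES = [
    {"question": "how many hospitals are in the system?",
     "cypher": "MATCH (h:Hospital) RETURN COUNT(h) AS total_hospitals"},
    {"question": "who is the oldest patient and how old are they?",
     "cypher": "MATCH (p:Patient) RETURN p.name ORDER BY p.dob LIMIT 1"},
    {"question": "which physician has billed the least to cigna?",
     "cypher": "MATCH (v:Visit)-[t:TREATS]-(phy:Physician) RETURN phy.name LIMIT 1"},
]

def select_examples(user_query: str, k: int = 2) -> list[dict]:
    """Return the k stored examples most similar to the user's query."""
    q = embed(user_query)
    ranked = sorted(EXAMPLES, key=lambda ex: cosine(q, embed(ex["question"])),
                    reverse=True)
    return ranked[:k]

def build_prompt(user_query: str) -> str:
    """Inject only the relevant examples into the final prompt."""
    examples = "\n\n".join(f"# {ex['question']}\n{ex['cypher']}"
                           for ex in select_examples(user_query))
    return f"Example queries for this schema:\n{examples}\n\nThe question is:\n{user_query}"
```

Because examples are chosen per query, the prompt stays small even as the example store grows – adding a new question/Cypher pair is just appending to the store.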
Dynamic Few-Shot Prompting: a prompting technique that dynamically selects examples based on an input and formats them into a final prompt for a model.
Moreover, this strategy allows us to iteratively improve our chatbot's query generation accuracy. If the chatbot generates an incorrect database query for a given natural language query, and we know the correct database query, we can add the example to our vector index.
To see how this works in more detail, let's look at the code changes made to the Hospital System Chatbot to implement dynamic few-shot prompting.
Implementation
Let's walk through the main code updates to the Hospital System Chatbot – the complete code is available on GitHub.
Out of the box, LangChain does not support dynamic few-shot prompting for Cypher query generation, so we have to modify the [GraphCypherQAChain](https://python.langchain.com/v0.2/docs/integrations/graphs/neo4j_cypher/) class – a class that creates a Text-To-Cypher chain. We first need to add a retriever attribute to GraphCypherQAChain that will retrieve example database queries to inject into the prompt:
# chatbot_api/src/langchain_custom/graph_qa/cypher.py
...
class GraphCypherQAChain(Chain):
"""Chain for question-answering against a graph by generating Cypher statements.
*Security note*: Make sure that the database connection uses credentials
that are narrowly-scoped to only include necessary permissions.
Failure to do so may result in data corruption or loss, since the calling
code may attempt commands that would result in deletion, mutation
of data if appropriately prompted or reading sensitive data if such
data is present in the database.
The best way to guard against such negative outcomes is to (as appropriate)
limit the permissions granted to the credentials used with this tool.
See https://python.langchain.com/docs/security for more information.
"""
graph: GraphStore = Field(exclude=True)
cypher_generation_chain: Union[LLMChain, Runnable]
qa_chain: Union[LLMChain, Runnable]
graph_schema: str
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
top_k: int = 10
"""Number of results to return from the query"""
return_intermediate_steps: bool = False
"""Whether or not to return the intermediate steps along with the final answer."""
return_direct: bool = False
"""Whether or not to return the result of querying the graph directly."""
cypher_query_corrector: Optional[CypherQueryCorrector] = None
"""Optional cypher validation tool"""
use_function_response: bool = False
"""Whether to wrap the database context as tool/function response"""
cypher_example_retriever: Optional[VectorStoreRetriever] = None
"""Optional retriever to augment the prompt with example Cypher queries"""
node_properties_to_exclude: Optional[list[str]] = None
"""Optional list of node properties to exclude from context in the QA prompt"""
...
Here, we've added a cypher_example_retriever attribute that is an optional LangChain [VectorStoreRetriever](https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/). We'll use cypher_example_retriever to inject semantically relevant Cypher queries into our prompt.
Next, we need to modify the .from_llm() classmethod that we'll use to instantiate GraphCypherQAChain:
# chatbot_api/src/langchain_custom/graph_qa/cypher.py
...
class GraphCypherQAChain(Chain):
"""Chain for question-answering against a graph by generating Cypher statements.
*Security note*: Make sure that the database connection uses credentials
that are narrowly-scoped to only include necessary permissions.
Failure to do so may result in data corruption or loss, since the calling
code may attempt commands that would result in deletion, mutation
of data if appropriately prompted or reading sensitive data if such
data is present in the database.
The best way to guard against such negative outcomes is to (as appropriate)
limit the permissions granted to the credentials used with this tool.
See https://python.langchain.com/docs/security for more information.
"""
...
cypher_example_retriever: Optional[VectorStoreRetriever] = None
"""Optional retriever to augment the prompt with example Cypher queries"""
...
@classmethod
def from_llm(
cls,
llm: Optional[BaseLanguageModel] = None,
*,
qa_prompt: Optional[BasePromptTemplate] = None,
cypher_prompt: Optional[BasePromptTemplate] = None,
cypher_llm: Optional[BaseLanguageModel] = None,
cypher_example_retriever: Optional[VectorStoreRetriever] = None,
qa_llm: Optional[Union[BaseLanguageModel, Any]] = None,
exclude_types: List[str] = [],
include_types: List[str] = [],
validate_cypher: bool = False,
qa_llm_kwargs: Optional[Dict[str, Any]] = None,
cypher_llm_kwargs: Optional[Dict[str, Any]] = None,
use_function_response: bool = False,
function_response_system: str = FUNCTION_RESPONSE_SYSTEM,
node_properties_to_exclude: Optional[list[str]] = None,
**kwargs: Any,
) -> GraphCypherQAChain:
"""Initialize from LLM."""
...
if cypher_example_retriever is not None:
cypher_generation_chain = (
{
"example_queries": itemgetter("question")
| cypher_example_retriever
| RunnableLambda(format_retrieved_documents),
"schema": itemgetter("schema"),
"question": itemgetter("question"),
}
| CYPHER_GENERATION_PROMPT_USE
| cypher_llm
| StrOutputParser()
)
else:
cypher_generation_chain = LLMChain(
llm=cypher_llm or llm, # type: ignore[arg-type]
**use_cypher_llm_kwargs, # type: ignore[arg-type]
)
...
return cls(
graph_schema=graph_schema,
qa_chain=qa_chain,
cypher_generation_chain=cypher_generation_chain,
cypher_query_corrector=cypher_query_corrector,
use_function_response=use_function_response,
cypher_example_retriever=cypher_example_retriever,
node_properties_to_exclude=node_properties_to_exclude,
**kwargs,
)
The primary change to .from_llm() is that we create a different cypher_generation_chain if the user specifies a cypher_example_retriever attribute. This chain passes the question input through cypher_example_retriever to extract relevant example Cypher queries from the vector index. We format the example queries with a utility function called format_retrieved_documents(), and pass the schema, question, and example_queries into the prompt.
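A plausible sketch of format_retrieved_documents(), assuming each retrieved document stores the question as page_content and the Cypher query under a cypher metadata key (the Doc class below is a minimal stand-in for LangChain's Document, so the snippet runs on its own):

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Minimal stand-in for LangChain's Document class (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_retrieved_documents(docs: list[Doc]) -> str:
    """Render retrieved (question, cypher) pairs as commented example queries,
    matching the '# question' / 'query' layout used in the static prompt."""
    return "\n\n".join(
        f"# {doc.page_content.strip()}\n{doc.metadata['cypher'].strip()}"
        for doc in docs
    )
```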
If the code that creates cypher_generation_chain looks unfamiliar, check out the LangChain docs on the LangChain Expression Language (LCEL).
These are the primary changes to GraphCypherQAChain needed to implement dynamic few-shot prompting. Next, we create an instance of GraphCypherQAChain to use as a tool for the hospital agent:
# chatbot_api/src/chains/hospital_cypher_chain.py
import os
from langchain_community.graphs import Neo4jGraph
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain_community.vectorstores.neo4j_vector import Neo4jVector
from langchain_openai import OpenAIEmbeddings
from src.langchain_custom.graph_qa.cypher import GraphCypherQAChain
NEO4J_URI = os.getenv("NEO4J_URI")
NEO4J_USERNAME = os.getenv("NEO4J_USERNAME")
NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD")
HOSPITAL_QA_MODEL = os.getenv("HOSPITAL_QA_MODEL")
HOSPITAL_CYPHER_MODEL = os.getenv("HOSPITAL_CYPHER_MODEL")
NEO4J_CYPHER_EXAMPLES_INDEX_NAME = os.getenv("NEO4J_CYPHER_EXAMPLES_INDEX_NAME")
NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY = os.getenv(
"NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY"
)
NEO4J_CYPHER_EXAMPLES_NODE_NAME = os.getenv("NEO4J_CYPHER_EXAMPLES_NODE_NAME")
NEO4J_CYPHER_EXAMPLES_METADATA_NAME = os.getenv("NEO4J_CYPHER_EXAMPLES_METADATA_NAME")
graph = Neo4jGraph(
url=NEO4J_URI,
username=NEO4J_USERNAME,
password=NEO4J_PASSWORD,
)
graph.refresh_schema()
cypher_example_index = Neo4jVector.from_existing_graph(
embedding=OpenAIEmbeddings(),
url=NEO4J_URI,
username=NEO4J_USERNAME,
password=NEO4J_PASSWORD,
index_name=NEO4J_CYPHER_EXAMPLES_INDEX_NAME,
node_label=NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY.capitalize(),
text_node_properties=[
NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY,
],
text_node_property=NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY,
embedding_node_property="embedding",
)
cypher_example_retriever = cypher_example_index.as_retriever(search_kwargs={"k": 8})
After importing dependencies and loading environment variables, we instantiate a Neo4jGraph object that connects to our Neo4j database. We then define cypher_example_index – a Neo4jVector object that connects to the vector index storing example Cypher queries.
Here's a description of each argument to .from_existing_graph():
- embedding – The embedding model used to embed our example natural language queries. We'll use OpenAIEmbeddings().
- url, username, password – Credentials needed to connect to the Neo4j database.
- index_name – The name of the vector index. We'll name our index questions.
- node_label – The label to give the nodes that store example queries. We'll label these nodes Question.
- text_node_property – The name of the node property that stores the natural language query/question. We'll use question.
- embedding_node_property – The name of the property that stores the embedded question.
If the index_name doesn't already exist in the graph, .from_existing_graph() creates a new one. Otherwise, it references the existing index.
We then create a retriever object, cypher_example_retriever, from cypher_example_index that we'll use to search for semantically relevant questions. By passing search_kwargs={"k": 8} to .as_retriever(), we instantiate a retriever that fetches the 8 questions most similar to the input. The k parameter can be tuned to optimize query generation performance.
After this, we define the prompt used by the Cypher generation LLM:
# chatbot_api/src/chains/hospital_cypher_chain.py
...
cypher_generation_template = """
Task:
Generate Cypher query for a Neo4j graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note:
Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything other than
for you to construct a Cypher statement. Do not include any text except
the generated Cypher statement. Make sure the direction of the relationship is
correct in your queries. Make sure you alias both entities and relationships
properly (e.g. [c:COVERED_BY] instead of [:COVERED_BY]). Do not run any
queries that would add to or delete from
the database. Make sure to alias all statements that follow as with
statement (e.g. WITH v as visit, c.billing_amount as billing_amount)
If you need to divide numbers, make sure to
filter the denominator to be non zero.
Example queries for this schema:
{example_queries}
Warning:
- Never return a review node without explicitly returning all of the properties
besides the embedding property
- Make sure to use IS NULL or IS NOT NULL when analyzing missing properties.
- You must never include the
statement "GROUP BY" in your query.
- Make sure to alias all statements that
follow as with statement (e.g. WITH v as visit, c.billing_amount as
billing_amount)
- If you need to divide numbers, make sure to filter the denominator to be non
zero.
String category values:
Test results are one of: 'Inconclusive', 'Normal', 'Abnormal'
Visit statuses are one of: 'OPEN', 'DISCHARGED'
Admission Types are one of: 'Elective', 'Emergency', 'Urgent'
Payer names are one of: 'Cigna', 'Blue Cross', 'UnitedHealthcare', 'Medicare',
'Aetna'
If you're filtering on a string, make sure to lowercase the property and filter
value.
A visit is considered open if its status is 'OPEN' and the discharge date is
missing.
Use state abbreviations instead of their full name. For example, you should
change "Texas" to "TX", "Colorado" to "CO", "North Carolina" to "NC", and so on.
The question is:
{question}
"""
cypher_generation_prompt = PromptTemplate(
input_variables=["schema", "example_queries", "question"],
template=cypher_generation_template,
)
The only difference between this prompt and the original is that we've included a parameter, example_queries, that we'll use to hold the example Cypher queries retrieved from the vector index. We also need to define a prompt to summarize the Cypher query results as an answer to the user's question, but we won't cover that here since it's the same as in the original chain.
Next, we instantiate the Cypher generation chain:
# chatbot_api/src/chains/hospital_cypher_chain.py
...
hospital_cypher_chain = GraphCypherQAChain.from_llm(
cypher_llm=ChatOpenAI(model=HOSPITAL_CYPHER_MODEL, temperature=0),
qa_llm=ChatOpenAI(model=HOSPITAL_QA_MODEL, temperature=0),
cypher_example_retriever=cypher_example_retriever,
node_properties_to_exclude=["embedding"],
graph=graph,
verbose=True,
qa_prompt=qa_generation_prompt,
cypher_prompt=cypher_generation_prompt,
validate_cypher=True,
top_k=100,
)
Here, we've instantiated a GraphCypherQAChain object and passed in the cypher_example_retriever to retrieve semantically relevant Cypher query examples and inject them into the prompt.
Lastly, we include hospital_cypher_chain as a tool for the final chatbot agent.
# chatbot_api/src/chains/hospital_cypher_chain.py
import os
from typing import Any
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, tool
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents.format_scratchpad.openai_tools import (
format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser
from src.chains.hospital_review_chain import reviews_vector_chain
from src.chains.hospital_cypher_chain import hospital_cypher_chain
from src.tools.wait_times import (
get_current_wait_times,
get_most_available_hospital,
)
HOSPITAL_AGENT_MODEL = os.getenv("HOSPITAL_AGENT_MODEL")
agent_chat_model = ChatOpenAI(
model=HOSPITAL_AGENT_MODEL,
temperature=0
)
...
@tool
def explore_hospital_database(question: str) -> str:
"""
Useful for answering questions about patients,
physicians, hospitals, insurance payers, patient review
statistics, and hospital visit details. Use the entire prompt as
input to the tool. For instance, if the prompt is "How many visits
have there been?", the input should be "How many visits have
there been?".
"""
return hospital_cypher_chain.invoke(question)
...
agent_tools = [
explore_patient_experiences,
explore_hospital_database,
get_hospital_wait_time,
find_most_available_hospital,
]
agent_prompt = ChatPromptTemplate.from_messages(
[
(
"system",
"""
You are a helpful chatbot designed to answer questions
about patient experiences, patient data, hospitals,
insurance payers, patient review statistics, hospital
visit details, wait times, and availability for
stakeholders in a hospital system.
""",
),
("user", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad"),
]
)
agent_llm_with_tools = agent_chat_model.bind_tools(agent_tools)
hospital_rag_agent = (
{
"input": lambda x: x["input"],
"agent_scratchpad": lambda x: format_to_openai_tool_messages(
x["intermediate_steps"]
),
}
| agent_prompt
| agent_llm_with_tools
| OpenAIToolsAgentOutputParser()
)
hospital_rag_agent_executor = AgentExecutor(
agent=hospital_rag_agent,
tools=agent_tools,
verbose=True,
return_intermediate_steps=True,
)
Our chatbot agent can now call the explore_hospital_database tool with dynamic few-shot prompting to answer questions requiring Cypher queries.
Adding Cypher Examples
To add Cypher examples to the questions index, we can use the vector store's .add_texts() method:
cypher_example_index = Neo4jVector.from_existing_graph(
embedding=OpenAIEmbeddings(),
url=NEO4J_URI,
username=NEO4J_USERNAME,
password=NEO4J_PASSWORD,
index_name=NEO4J_CYPHER_EXAMPLES_INDEX_NAME,
node_label=NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY.capitalize(),
text_node_properties=[
NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY,
],
text_node_property=NEO4J_CYPHER_EXAMPLES_TEXT_NODE_PROPERTY,
embedding_node_property="embedding",
)
question = "who is the oldest patient and how old are they?"
cypher_example = """
MATCH (p:Patient)
RETURN p.name AS oldest_patient,
duration.between(date(p.dob), date()).years AS age
ORDER BY age DESC
LIMIT 1
"""
node_ids = cypher_example_index.add_texts(  # returns a list of node IDs, one per text
    texts=[question],
    metadatas=[{"cypher": cypher_example}],
)
Here, we connect to our Neo4j Cypher example index using Neo4jVector.from_existing_graph(). We then add an example question and its corresponding Cypher query using .add_texts(). The texts argument specifies the property we want to embed – the user's natural language question in this case. We also store the Cypher query as metadata so that when we search for semantically similar questions, we can extract their corresponding Cypher queries.
After running this, we should see a new Question node in the Neo4j database:

Question node in the Neo4j database. Image by Author.
This Question node has properties id, question, cypher, and embedding. The embedding property is the vectorized version of the question, and it's what we search over when looking for questions relevant to the user's input query.
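To illustrate what that search does, here's a toy sketch – assumed for illustration, not the project's implementation – that ranks stored questions against a user query by cosine similarity and returns the attached Cypher. A bag-of-words vector stands in for the OpenAI embeddings:

```python
# Toy illustration of the vector search over Question nodes: embed the
# user's query, rank stored questions by cosine similarity, and return
# the Cypher queries attached as metadata to the best matches.
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


index = [
    {
        "question": "who is the oldest patient and how old are they?",
        "cypher": "MATCH (p:Patient) RETURN p.name AS oldest_patient ORDER BY p.dob LIMIT 1",
    },
    {
        "question": "how many visits have there been?",
        "cypher": "MATCH (v:Visit) RETURN count(v)",
    },
]
for node in index:
    node["embedding"] = embed(node["question"])


def retrieve_examples(query, k=1):
    """Return the k most similar (question, cypher) pairs for a query."""
    query_embedding = embed(query)
    ranked = sorted(
        index, key=lambda n: cosine(query_embedding, n["embedding"]), reverse=True
    )
    return [(n["question"], n["cypher"]) for n in ranked[:k]]
```

A query like "how old is the oldest patient?" ranks the oldest-patient example first, so its Cypher query is the one injected into the prompt.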
Now that we've built a Cypher generation chain that leverages dynamic few-shot prompting, we need an easy way to provide it with examples – that's what we'll cover next.
The Self-Service Portal
We need an easy way to provide the Cypher generation chain with examples. Of course, we could load a predefined dataset of example queries, but we also want a system that lets us correct query generation errors as they happen, without having to run or change any code. We can accomplish this with a self-service portal.
langchain_neo4j_rag_app/cypher_example_portal at main · hfhoffman1144/langchain_neo4j_rag_app
The self-service portal is a UI that allows us to upload Cypher queries to the vector index. To see how this might work, let's ask the chatbot a question that it currently can't answer:

If we know how to answer this question, we can use the self-service portal to update the Cypher example vector index:

In the first input box, we enter the question for which the chatbot can't generate a correct Cypher query. To show that the LLM isn't just copying example queries, we've slightly changed the question's wording.
In the second input box, we enter the Cypher query that helps the LLM answer this question. Next, we press the Validate button to execute input checks:

The first input check informs us that the example question doesn't exist in the vector index, so we're not adding a duplicate. The self-service portal then executes the query to ensure it's valid. Lastly, the portal displays an existing example question that has the highest semantic similarity to the example question we want to upload. This allows us to verify that we're not uploading an example question with the same semantic meaning as an existing question.
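A simplified sketch of those two checks might look like the following. The helper name is hypothetical, difflib stands in for embedding-based similarity, and the query-execution check against Neo4j is omitted:

```python
# Sketch of the portal's input checks: reject exact duplicates, and surface
# the most similar existing question so a human can judge semantic overlap.
from difflib import SequenceMatcher

existing_questions = [
    "who is the oldest patient and how old are they?",
    "how many visits have there been?",
]


def validate_new_example(question):
    """Return a validation result for a candidate example question."""
    if question in existing_questions:
        return {"ok": False, "reason": "duplicate question"}
    # Show the closest existing question so a human can decide whether the
    # new example is semantically redundant.
    closest = max(
        existing_questions,
        key=lambda q: SequenceMatcher(None, question.lower(), q.lower()).ratio(),
    )
    return {"ok": True, "closest_existing": closest}


result = validate_new_example("who is the youngest patient?")
```

Keeping a human in the loop for the similarity check is deliberate: an automatic threshold could silently reject examples that are worded similarly but require different queries.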
After we press the Upload button, the example question is added to the index:

Now we can go back to the chatbot and ask the same question it previously wasn't able to answer:

Nice! The Cypher example retriever successfully extracted and injected our new example into the prompt, allowing the Cypher generation chain to generate the correct query. We can also ask slight variations of the example question to see how the chatbot responds:

By adding one example to the index, we've enabled the chatbot to correctly respond to a new class of questions. Imagine how much we could improve the performance by adding more examples!
Considerations and Closing Remarks
In this tutorial, we improved the Hospital System Chatbot by implementing dynamic few-shot prompting for Cypher query generation. This enhancement allows the chatbot to generate more accurate and relevant Cypher queries by dynamically selecting and injecting example queries into the prompt based on the user's current natural language query.
Key Considerations:
- Context Window Limitations: While dynamic few-shot prompting helps alleviate the issue of context window limitations by only including relevant examples, it's still essential to monitor and manage the size of the prompt to ensure it remains within the LLM's context window.
- Quality of Examples: The accuracy of query generation heavily relies on the quality and relevance of the example queries stored in the vector index. Continuously curating and updating the example set is crucial for maintaining and improving performance.
- Security: Ensure that the database connection used by the chatbot has narrowly scoped credentials to prevent any unauthorized data access or modifications. Implementing robust security measures is essential for protecting sensitive data.
- User Feedback Loop: Implementing a self-service portal allows users to provide feedback and correct errors in query generation, creating a continuous improvement loop that enhances the chatbot's capabilities over time.
- Performance Monitoring: Implementing monitoring tools to track the performance and accuracy of the query generation process can help identify areas for improvement and ensure the chatbot meets user expectations.
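As a concrete illustration of the context-window point, here's a minimal sketch of one way to keep retrieved examples within a prompt budget. The len(text) // 4 token estimate is a rough heuristic; a real system would use the model's tokenizer:

```python
# Greedily keep relevance-ranked examples until an approximate token
# budget is exhausted, so the few-shot prompt stays within the context window.
def fit_examples_to_budget(ranked_examples, max_tokens=200):
    """Keep the top-ranked examples whose estimated tokens fit the budget."""
    kept, used = [], 0
    for example in ranked_examples:
        cost = len(example) // 4  # rough tokens-per-character heuristic
        if used + cost > max_tokens:
            break
        kept.append(example)
        used += cost
    return kept


ranked = [
    "Question: How many visits have there been?\nCypher: MATCH (v:Visit) RETURN count(v)",
    "Question: Who is the oldest patient?\nCypher: MATCH (p:Patient) RETURN p.name ORDER BY p.dob LIMIT 1",
]
trimmed = fit_examples_to_budget(ranked, max_tokens=20)
```

Because the examples arrive ranked by relevance, trimming from the tail drops the least useful examples first.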
By implementing dynamic few-shot prompting and continuously iterating on the system, we can build a more accurate, reliable, and production-ready Hospital System Chatbot. This approach not only enhances the chatbot's current performance but also provides a framework for ongoing improvement and scalability.
References
- Build an LLM RAG Chatbot With LangChain – https://realpython.com/build-llm-rag-chatbot-with-langchain/