From Ephemeral to Persistence with LangChain: Building Long-Term Memory in Chatbots


In a previous article I wrote about how I created a conversational chatbot with OpenAI. However, if you have used chatbot interfaces like ChatGPT or Claude, you will have noticed that when a session is closed and reopened, the memory is retained and you can continue the conversation from where you left off. That is exactly the experience I want to create in this article.

I will use LangChain as my foundation: it provides excellent tools for managing conversation history, and it scales well if you want to move to more complex applications by building chains.


Code availability

Code for recreating everything in this article can be found at https://github.com/deepshamenghani/langchain_openai_persistence.


Single Q&A bot with LangChain and OpenAI

I will start by creating a loop that reads questions from the user and assigns them to the variable humaninput. For now, instead of calling an LLM, I will simply copy the user's input into the result variable and print it. Once I connect with the OpenAI API, I will update this while loop to print the client's responses instead.

Python">while True:
    humaninput = input(">> ")
    result = humaninput    
    print(result)
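
The loop above runs forever. If you want a clean way to quit, one small tweak (my own addition, not part of the original script) is to break on an exit keyword:

while True:
    humaninput = input(">> ")
    # Type "exit" or "quit" to leave the loop.
    if humaninput.strip().lower() in ("exit", "quit"):
        break
    result = humaninput
    print(result)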

Set up the environment

I will import the following packages after installing them using pip install package_name.
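
For reference, the install commands would look roughly like this (package names inferred from the imports used throughout this article):

pip install langchain langchain-openai langchain-community python-dotenv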

from langchain_openai import ChatOpenAI
import os
import dotenv

Connect with OpenAI

I will set up an account on the OpenAI Platform and generate a unique API key, which I will store in my .env file as OPENAI_API.
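
For illustration, the .env file contains a single line like this (the value shown is a placeholder, not a real key):

OPENAI_API=sk-your-key-here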

I will now connect with the OpenAI client to start creating the skeleton of the chatbot.

dotenv.load_dotenv()
llmclient = ChatOpenAI(openai_api_key=os.getenv("OPENAI_API"))

Create chat prompts

I will now create a chat prompt template that takes a human message and turns it into a template to pass to the LLM client for an answer. A chat prompt provides context, instructions, and structure for the input I send to the model. Prompt templates are reusable structures, so they make it easy to maintain consistency across conversations and to inject dynamic content into the prompt.

Here I will create a simple ChatPromptTemplate which expects a variable called humaninput. In the messages list, I will include a HumanMessagePromptTemplate which represents a message from the user in the conversation.

from langchain_core.prompts import HumanMessagePromptTemplate, ChatPromptTemplate

prompt = ChatPromptTemplate(
    input_variables=["humaninput"],  
    messages=[
        HumanMessagePromptTemplate.from_template("{humaninput}") 
    ]
)
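
To sanity-check the template, you can format it with a sample input. This quick check (my own illustration, not part of the final script) shows that the template produces a single human message:

# Formatting the template with a sample input yields one HumanMessage.
print(prompt.format_messages(humaninput="What is LangChain?"))
# [HumanMessage(content='What is LangChain?')]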

Create a chain

Consider a chain the basic building block of LangChain: it combines multiple components into a single, cohesive workflow. While I am creating a simple chain for this scenario, chains really shine when building complex, multi-step processes, all while maintaining a modular structure that is easy to reuse.

In my specific case, the chain consists of three components:

  1. The prompt template I created earlier.
  2. The LLM that will receive the prompt and generate a response.
  3. A string output parser that guarantees a consistent output format to work with.

This is what the chain should look like now:

apply prompt -> chat model -> string output

from langchain_core.output_parsers import StrOutputParser

chain = prompt | llmclient | StrOutputParser()

Update the result

Now that I have the chain, I can invoke it to get the output. Calling chain.invoke() runs humaninput through the entire sequence of components defined in the chain. So instead of echoing the human input, I will now print the answer I receive from the LLM client.

while True:
    humaninput = input(">> ")
    result = chain.invoke({"humaninput": humaninput})   
    print(result)

Bring it all together

This is what the code looks like all together now. When I run it, I can interact and get answers directly from the client. The code to create a single conversation chatbot is in the file called main_base.py.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import HumanMessagePromptTemplate, ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
import os
import dotenv

dotenv.load_dotenv()
llmclient = ChatOpenAI(openai_api_key=os.getenv("OPENAI_API"))

prompt = ChatPromptTemplate(
    input_variables=["humaninput"],  
    messages=[
        HumanMessagePromptTemplate.from_template("{humaninput}")  
    ]
)

chain = prompt | llmclient | StrOutputParser()

while True:
    humaninput = input(">> ")
    result = chain.invoke({"humaninput": humaninput})   
    print(result)

It answered the first question correctly, which is a great start. But when I ask the second question, which is a follow-up to the first, it makes up a random answer because it has no conversation history. Let's resolve this by creating a conversational bot next.


Conversational bot

Initialize memory

LangChain provides several memory classes. I will use ConversationBufferMemory to retain all previous interactions without any summarization. The memory_key="messagememory" argument sets the key under which the history will be accessed, and return_messages=True tells the memory to return the history as a list of messages.

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="messagememory", return_messages=True)

Note that ConversationBufferMemory stores the entire conversation history, which might not always be ideal for very long conversations and can lead to performance issues. There are other classes available that store the conversation summary instead, which might work well for specific scenarios.
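
For example, ConversationSummaryMemory keeps a running summary instead of the full transcript. A minimal sketch, assuming you reuse the llmclient defined earlier to generate the summaries:

from langchain.memory import ConversationSummaryMemory

# Stores a rolling summary of the conversation rather than every message.
memory = ConversationSummaryMemory(
    llm=llmclient,
    memory_key="messagememory",
    return_messages=True,
)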

Update Prompt Template with Message Placeholder

I will now introduce a message placeholder for conversation history in my prompt template. I will use MessagesPlaceholder which allows the prompt to adapt dynamically as the conversation progresses.

from langchain_core.prompts import MessagesPlaceholder

prompt = ChatPromptTemplate(
    input_variables=["humaninput"],
    messages=[
        MessagesPlaceholder(variable_name="messagememory"),  
        HumanMessagePromptTemplate.from_template("{humaninput}")  
    ]
)

Update Chain to Load Memory

The memory now needs to be injected into the prompt chain. I will use RunnablePassthrough to retrieve the conversation history, incorporate it into the prompt, send it to the language model, and then format the output. This is what the chain should look like now:

load memory -> apply prompt -> chat model -> string output

from langchain_core.runnables import RunnablePassthrough

chain = (
    RunnablePassthrough.assign(
        messagememory=lambda x: memory.load_memory_variables({})["messagememory"]
    )
    | prompt  
    | llmclient  
    | StrOutputParser()  
)
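
To see exactly what the lambda retrieves, you can inspect the memory directly. After one exchange, the check below (my own illustration) returns the history as a list of messages:

# Returns something like:
# {"messagememory": [HumanMessage(content="..."), AIMessage(content="...")]}
print(memory.load_memory_variables({}))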

Save interaction to memory

Now within the while loop, every time there is an interaction, I will save it to memory. I will use memory.save_context() to save the interaction to ConversationBufferMemory.

while True:
    humaninput = input(">> ")
    result = chain.invoke({"humaninput": humaninput})   
    print(result)
    memory.save_context({"humaninput": humaninput}, {"output": result})

Bring it all together

This is what the code looks like all together now. When I run it, I can interact and have an ongoing conversation. The code to create an ongoing conversation chatbot is in the file called main_conversation.py.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import HumanMessagePromptTemplate, ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain.memory import ConversationBufferMemory
from langchain_core.runnables import RunnablePassthrough
import os
import dotenv

dotenv.load_dotenv()
llmclient = ChatOpenAI(openai_api_key=os.getenv("OPENAI_API"))

memory = ConversationBufferMemory(memory_key="messagememory", return_messages=True)

prompt = ChatPromptTemplate(
    input_variables=["humaninput"],
    messages=[
        MessagesPlaceholder(variable_name="messagememory"),  
        HumanMessagePromptTemplate.from_template("{humaninput}")  
    ]
)

chain = (
    RunnablePassthrough.assign(
        messagememory=lambda x: memory.load_memory_variables({})["messagememory"]
    ) 
    | prompt  
    | llmclient 
    | StrOutputParser()  
)

while True:
    humaninput = input(">> ")
    result = chain.invoke({"humaninput": humaninput})   
    print(result)
    memory.save_context({"humaninput": humaninput}, {"output": result})    

It answered each question correctly. Now let me close this session and restart it by rerunning the code.

Clearly, it has no recollection of the past session. But with any of the popular AI chatbots, the expectation is that you can continue the conversation whenever you return to it. So, let's accomplish that next.


Persistent Memory Across Sessions

I am now going to import FileChatMessageHistory to store the entire chat history in an external JSON file called messagememory.json, saved locally in the same folder. When I reopen the session, I will reload the chat from this file and pick up where I left off.

from langchain_community.chat_message_histories import FileChatMessageHistory

memory = ConversationBufferMemory(
    chat_memory=FileChatMessageHistory("messagememory.json"),
    memory_key="messagememory",
    return_messages=True
)

This simple change turns an ephemeral interaction into a persistent one that carries across sessions, providing long-term context.
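
If you open messagememory.json after a couple of exchanges, the stored history looks roughly like this (the exact schema can vary across LangChain versions):

[
  {"type": "human", "data": {"content": "What is LangChain?", "additional_kwargs": {}}},
  {"type": "ai", "data": {"content": "LangChain is a framework for ...", "additional_kwargs": {}}}
]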

However, it is important to note that if you're dealing with multiple users or very long conversations, managing these JSON files becomes unwieldy; database storage is worth considering for better performance and scalability. Storing conversations in plain files may also require an extra layer of security.
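
As a sketch of the database route, LangChain's SQLChatMessageHistory can back the same memory object. The session_id and SQLite path below are my own placeholder choices:

from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import SQLChatMessageHistory

# Swap the JSON file for a SQLite-backed history; session_id keeps
# separate conversations apart, e.g. one per user.
memory = ConversationBufferMemory(
    chat_memory=SQLChatMessageHistory(
        session_id="user_1",
        connection_string="sqlite:///messagememory.db",
    ),
    memory_key="messagememory",
    return_messages=True,
)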

Bring it all together

This is what the code looks like all together now. When I run it, I can interact and have an ongoing conversation across multiple sessions. The code to create a persistent conversation chatbot is in the file called main_persistence.py.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import HumanMessagePromptTemplate, ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain.memory import ConversationBufferMemory
from langchain_core.runnables import RunnablePassthrough
from langchain_community.chat_message_histories import FileChatMessageHistory
import os
import dotenv

dotenv.load_dotenv()
llmclient = ChatOpenAI(openai_api_key=os.getenv("OPENAI_API"))

memory = ConversationBufferMemory(
    chat_memory=FileChatMessageHistory("messagememory.json"),
    memory_key="messagememory",
    return_messages=True
)

prompt = ChatPromptTemplate(
    input_variables=["humaninput"],
    messages=[
        MessagesPlaceholder(variable_name="messagememory"),  
        HumanMessagePromptTemplate.from_template("{humaninput}")  
    ]
)

chain = (
    RunnablePassthrough.assign(
        messagememory=lambda x: memory.load_memory_variables({})["messagememory"]
    ) 
    | prompt  
    | llmclient 
    | StrOutputParser()  
)

while True:
    humaninput = input(">> ")
    result = chain.invoke({"humaninput": humaninput})   
    print(result)
    memory.save_context({"humaninput": humaninput}, {"output": result})    

It answered each question correctly. Now let me close this session and restart it by rerunning the code.

It is able to continue the conversation now even after restarting the session.

What's next!

In this article, I walked through the process of building a simple Q&A bot with LangChain and OpenAI, evolved it into a conversational bot with memory, and finally transformed it into a persistent bot that retains conversations across sessions. From here, you could expand its capabilities further, for example by swapping in summary-based memory or moving the history to database-backed storage, as discussed above.

I hope this tutorial showed you how LangChain drastically simplifies the process of building a conversational bot with long-term memory. Happy coding!


Code for recreating everything in this article can be found at https://github.com/deepshamenghani/langchain_openai_persistence.

If you'd like, connect with me on LinkedIn.


All images in this article are by the author unless mentioned otherwise.
