Buffer and Window Memory
Implement ConversationBufferMemory and ConversationBufferWindowMemory to keep the last N turns in context, and measure how window size affects coherence and cost.
Buffer Memory: Keep Everything
Buffer memory is the simplest strategy: store every message from every turn in a list and include the complete history in every API call. This preserves the full context, so the model can reference anything said at any point. The downside is that the prompt grows linearly with the number of turns and has no bound of its own, so per-call cost and latency keep rising and a long session will eventually exceed the model's context window. For short sessions, such as completing a single task, buffer memory is appropriate and the easiest option to implement.
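The growth described above can be made concrete with a small library-free sketch. The `BufferMemory` class and the `rough_tokens` helper are hypothetical stand-ins written for this illustration, not LangChain APIs, and the word count is only a crude proxy for real tokenization.

```python
from dataclasses import dataclass, field


@dataclass
class BufferMemory:
    """Hypothetical stand-in: keeps every message and returns all of them."""
    messages: list = field(default_factory=list)

    def add_turn(self, user_text: str, ai_text: str) -> None:
        # One turn = one human message plus one AI message.
        self.messages.append(('human', user_text))
        self.messages.append(('ai', ai_text))

    def load(self) -> list:
        # Buffer memory never drops anything: the full history comes back.
        return list(self.messages)


def rough_tokens(messages) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return sum(len(text.split()) for _, text in messages)


memory = BufferMemory()
sizes = []
for turn in range(1, 6):
    memory.add_turn(f'question number {turn}', f'answer number {turn}')
    # Size of the history that would be sent with the next API call.
    sizes.append(rough_tokens(memory.load()))

print(sizes)       # → [6, 12, 18, 24, 30]
print(sum(sizes))  # → 90, total history sent across the five calls
```

Each call carries a history that is one turn longer than the last, so the per-call size grows linearly while the total sent over a session grows roughly with the square of the number of turns.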
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    return_messages=True,  # return Message objects, not a string
    memory_key='history',  # key to inject into prompt template
)

# Add messages manually
memory.chat_memory.add_user_message('What is a neural network?')
memory.chat_memory.add_ai_message('A neural network is a system of layers...')

# Load what will be injected
print(memory.load_memory_variables({}))

Integrating Buffer Memory with a Chain
To get buffer-style memory in an LCEL chain, wrap the chain in RunnableWithMessageHistory; ConversationChain is the legacy alternative. A per-session chat history object holds the messages, and a MessagesPlaceholder in the prompt template marks where they are injected. After each invocation, the wrapper automatically appends the new human and AI messages to that history.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# One history object per session ID, created on first use
store = {}

def get_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chain = (
    ChatPromptTemplate.from_messages([
        ('system', 'You are helpful.'),
        MessagesPlaceholder('history'),  # past messages are injected here
        ('human', '{input}'),
    ])
    | ChatOpenAI(model='gpt-4o-mini')
    | StrOutputParser()
)

# Wrap the chain so history is loaded before each call and saved after it
with_memory = RunnableWithMessageHistory(
    chain, get_history,
    input_messages_key='input',
    history_messages_key='history',
)