The langchain-postgres package contains implementations of core LangChain abstractions using Postgres.
The package is released under the MIT license.
Feel free to use the abstractions as provided, or modify and extend them as appropriate for your own application.
The package currently only supports the psycopg3 driver.
pip install -U langchain-postgres

Changes in 0.0.6:
- Remove langgraph as a dependency, as it was causing dependency conflicts.
- The base interface for the checkpointer changed in langgraph, so the existing implementation would have broken regardless.
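The examples below assume a connection string and an embeddings object. The snippet that follows is an illustrative setup sketch, not part of the library: the connection string value is a placeholder (it must use the psycopg3 driver), and FakeEmbedding is defined here only so the examples are self-contained.

# Illustrative setup assumed by the examples below (values are placeholders).
from langchain_core.embeddings import DeterministicFakeEmbedding

# The connection string must use the psycopg (v3) driver, i.e. the
# "postgresql+psycopg://" scheme.
CONNECTION_STRING = "postgresql+psycopg://user:password@localhost:5432/langchain"

# Stand-in for any Embeddings implementation; a deterministic fake keeps the
# snippets runnable without an external embedding model.
def FakeEmbedding():
    return DeterministicFakeEmbedding(size=1536)  # matches embedding_length=1536 below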
An HNSW index can be configured as follows:

from langchain_postgres import PGVector, EmbeddingIndexType
PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
embedding_length=1536,
embedding_index=EmbeddingIndexType.hnsw,
embedding_index_ops="vector_cosine_ops",
)

- Embedding length is required for the HNSW index.
- Allowed values for embedding_index_ops are described in the pgvector HNSW documentation.
You can set the ef_construction and m parameters for the HNSW index.
Refer to the pgvector HNSW Index Options documentation to better understand these parameters.
from langchain_postgres import PGVector, EmbeddingIndexType
PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
embedding_length=1536,
embedding_index=EmbeddingIndexType.hnsw,
embedding_index_ops="vector_cosine_ops",
ef_construction=200,
m=48,
)

An IVFFlat index can be used instead of HNSW:

from langchain_postgres import PGVector, EmbeddingIndexType
PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
embedding_length=1536,
embedding_index=EmbeddingIndexType.ivfflat,
embedding_index_ops="vector_cosine_ops",
)

- Embedding length is required for the IVFFlat index.
- Allowed values for embedding_index_ops are described in the pgvector IVFFlat documentation.
Binary quantization is supported with the HNSW index:

from langchain_postgres import PGVector, EmbeddingIndexType
PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
embedding_length=1536,
embedding_index=EmbeddingIndexType.hnsw,
embedding_index_ops="bit_hamming_ops",
binary_quantization=True,
binary_limit=200,
)

- Works only with an HNSW index with bit_hamming_ops.
- binary_limit increases the limit in the inner binary search. A higher value will increase recall at the cost of speed.

Refer to the pgvector Binary Quantization documentation to better understand this feature.
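As an illustration only, the sketch below assumes the PGVector instance above is assigned to a vectorstore variable and runs an ordinary similarity search against the binary-quantized index; the query text is arbitrary.

# Hypothetical usage sketch (variable name and query are illustrative).
# binary_limit (set above) is the limit used in the inner binary search;
# k controls how many results are ultimately returned.
results = vectorstore.similarity_search("hello world", k=4)
for doc in results:
    print(doc.page_content)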
Partitioning of the embeddings table is also supported:

from langchain_postgres import PGVector
PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
enable_partitioning=True,
)

- Creates partitions of the langchain_pg_embedding table by collection_id. Useful when there are a large number of embeddings across different collections.
Refer to the pgvector Partitioning documentation for more information.
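As an illustrative sketch only, the snippet below creates two stores with made-up collection names and partitioning enabled; the point is that each collection's embeddings land in their own partition of langchain_pg_embedding, keyed by collection_id.

# Illustrative sketch: the collection names are made up.
docs_store = PGVector(
    collection_name="docs",
    embeddings=FakeEmbedding(),
    connection=CONNECTION_STRING,
    enable_partitioning=True,
)
faq_store = PGVector(
    collection_name="faqs",
    embeddings=FakeEmbedding(),
    connection=CONNECTION_STRING,
    enable_partitioning=True,
)
# Each store's rows go to the partition for its collection_id.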
Iterative index scans are supported:

from langchain_postgres import PGVector, EmbeddingIndexType, IterativeScan
PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
embedding_length=1536,
embedding_index=EmbeddingIndexType.hnsw,
embedding_index_ops="vector_cosine_ops",
iterative_scan=IterativeScan.relaxed_order
)

- iterative_scan can be set to IterativeScan.relaxed_order or IterativeScan.strict_order, or disabled with IterativeScan.off.
- Requires an HNSW or IVFFlat index.
Refer to the pgvector Iterative Scan documentation to better understand this feature.
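Iterative scans are mainly useful for filtered queries, where a non-iterative index scan may return too few rows after filtering. The sketch below is illustrative: it assumes the PGVector instance above is assigned to vectorstore, and the metadata filter is a made-up example of the filter argument accepted by similarity_search.

# Illustrative sketch (variable name and filter values are made up).
# With iterative_scan=IterativeScan.relaxed_order, the index scan continues
# until enough rows satisfy the filter, improving recall on filtered queries.
results = vectorstore.similarity_search(
    "hello world",
    k=10,
    filter={"topic": "animals"},  # hypothetical metadata filter
)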
Additional iterative scan options can be tuned:

from langchain_postgres import PGVector, EmbeddingIndexType, IterativeScan
PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
embedding_length=1536,
embedding_index=EmbeddingIndexType.hnsw,
embedding_index_ops="vector_cosine_ops",
iterative_scan=IterativeScan.relaxed_order,
max_scan_tuples=40000,
scan_mem_multiplier=2
)

- max_scan_tuples controls when the scan ends when iterative_scan is enabled.
- scan_mem_multiplier specifies the maximum amount of memory to use for the scan.
Refer to the pgvector Iterative Scan Options documentation to better understand these parameters.
Full text search can be used by specifying the full_text_search parameter:
from langchain_postgres import PGVector
vectorstore = PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
)
vectorstore.similarity_search(
"hello world",
full_text_search=["foo", "bar & baz"]
)

This adds the following condition to the WHERE clause:

AND document_vector @@ to_tsquery('foo | bar & baz')

Full text search can also be used with retrievers:
from langchain_postgres import PGVector
vectorstore = PGVector(
collection_name="test_collection",
embeddings=FakeEmbedding(),
connection=CONNECTION_STRING,
)
retriever = vectorstore.as_retriever(
search_kwargs={
"full_text_search": ["foo", "bar & baz"]
}
)

Refer to the Postgres Full Text Search documentation for more information.
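As a short, illustrative follow-up, the retriever configured above can be invoked like any other LangChain retriever; the query string is arbitrary.

# Illustrative usage of the retriever configured above.
# invoke() returns a list of Documents matching both the vector search and
# the full_text_search condition.
docs = retriever.invoke("hello world")
for doc in docs:
    print(doc.page_content)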
The chat message history abstraction helps persist chat message history in a Postgres table.
PostgresChatMessageHistory is parameterized using a table_name and a session_id.
The table_name is the name of the table in the database where
the chat messages will be stored.
The session_id is a unique identifier for the chat session. It can be assigned
by the caller using uuid.uuid4().
import uuid
from langchain_core.messages import SystemMessage, AIMessage, HumanMessage
from langchain_postgres import PostgresChatMessageHistory
import psycopg
# Establish a synchronous connection to the database
# (or use psycopg.AsyncConnection for async)
conn_info = ... # Fill in with your connection info
sync_connection = psycopg.connect(conn_info)
# Create the table schema (only needs to be done once)
table_name = "chat_history"
PostgresChatMessageHistory.create_tables(sync_connection, table_name)
session_id = str(uuid.uuid4())
# Initialize the chat history manager
chat_history = PostgresChatMessageHistory(
table_name,
session_id,
sync_connection=sync_connection
)
# Add messages to the chat history
chat_history.add_messages([
SystemMessage(content="Meow"),
AIMessage(content="woof"),
HumanMessage(content="bark"),
])
print(chat_history.messages)

See the example for the PGVector vectorstore here.
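For quick reference, a minimal, illustrative sketch of basic PGVector usage follows; it builds on the setup assumed at the top of this README, and the documents, metadata, and ids are made up.

# Minimal, illustrative PGVector usage (documents and ids are made up).
from langchain_core.documents import Document
from langchain_postgres import PGVector

vectorstore = PGVector(
    collection_name="test_collection",
    embeddings=FakeEmbedding(),
    connection=CONNECTION_STRING,
)

# Index a couple of documents, then run a similarity search over them.
vectorstore.add_documents(
    [
        Document(page_content="there are cats in the pond", metadata={"topic": "animals"}),
        Document(page_content="the library opens at nine", metadata={"topic": "places"}),
    ],
    ids=["doc-1", "doc-2"],
)
print(vectorstore.similarity_search("cats", k=1))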