Skip to content
38 changes: 38 additions & 0 deletions Async explainer.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
# Design Rationale: Mixed Interface PGVectorStore

This document outlines the design choices behind the PGVectorStore integration for LangChain, focusing on its dual interface that supports both synchronous and asynchronous usage patterns.

## Motivation: Performance through Asynchronicity

Database interactions are often I/O-bound, making asynchronous programming crucial for performance.

- **Non-Blocking Operations:** Asynchronous code prevents the application from stalling while waiting for database responses, improving throughput and responsiveness.
- **Asynchronous Foundation (`asyncio` and Drivers):** Built upon Python's `asyncio`, the integration is designed to work with asynchronous PostgreSQL drivers to handle database operations efficiently. While compatible drivers are supported, the `asyncpg` driver is specifically recommended due to its high performance in concurrent scenarios. You can explore its benefits ([link](https://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/)) and performance benchmarks ([link](https://fernandoarteaga.dev/blog/psycopg-vs-asyncpg/)) for more details.

This foundation ensures the core database interactions are fast and scalable.

## The Two-Class Approach: Enabling a Mixed Interface

To cater to different application architectures while maintaining performance, we provide two classes:

1. **`AsyncPGVectorStore` (Core Asynchronous Implementation):**
* This class contains the pure `async/await` logic for all database operations using `asyncpg`.
* It's designed for **direct use within asynchronous applications**. Users working in an `asyncio` environment can `await` its methods for maximum efficiency and direct control within the event loop.
* It represents the fundamental, non-blocking way of interacting with the database.

2. **`PGVectorStore` (Synchronous Interface / Asynchronous Internals):**
* This class provides both asynchronous & synchronous APIs.
* When one of its methods is called, it internally invokes the corresponding `async` method from `AsyncPGVectorStore`.
* It **manages the execution of this underlying asynchronous logic**, handling the necessary `asyncio` event loop interactions (e.g., starting/running the coroutine) behind the scenes.
* This allows users of synchronous codebases to leverage the performance benefits of the asynchronous core without needing to rewrite their application structure.

## Benefits of this Dual Interface Design

This two-class structure provides significant advantages:

- **Interface Flexibility:** Developers can **choose the interface that best fits their needs**:
* Use `PGVectorStore` for easy integration into existing synchronous applications.
* Use `AsyncPGVectorStore` for optimal performance and integration within `asyncio`-based applications.
- **Ease of Use:** `PGVectorStore` offers a familiar synchronous programming model, hiding the complexity of managing async execution from the end-user.
- **Robustness:** The clear separation helps prevent common errors associated with mixing synchronous and asynchronous code incorrectly, such as blocking the event loop from synchronous calls within an async context.
- **Efficiency for Async Users:** `AsyncPGVectorStore` provides a direct path for async applications, avoiding any potential overhead from the sync-to-async bridging layer present in `PGVectorStore`.