A FastAPI-based web service that uses Playwright to fetch and process web content. This service provides a robust API for web scraping with support for proxies, media blocking, and API key authentication.
- Fast, asynchronous web scraping using Playwright
- Optional API key authentication
- Proxy support
- Media blocking capabilities
- Docker support
- CI/CD with GitHub Actions
- Interactive API documentation (Swagger UI)
- Clone the repository:

      git clone git@github.com:watercrawl/playwright.git
      cd playwright

- Set up environment variables:

      cp .env.example .env

- Edit the `.env` file with your settings:

      AUTH_API_KEY=your-secret-api-key
      PORT=8000
      HOST=0.0.0.0

- Build and run with Docker Compose:

      docker compose up --build

The service will be available at http://localhost:8000

Access the interactive API documentation at http://localhost:8000/docs
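
After starting the container, a script may want to wait until the service accepts requests before sending work to it. The following is a minimal sketch that polls the `/health/readiness` endpoint listed in this README. It assumes the probe answers with a 2xx status once the service is ready, which this README does not state explicitly; `http_probe` and `wait_until_ready` are hypothetical helper names, not part of the project.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx HTTP error.
        return False


def wait_until_ready(
    base_url: str = "http://localhost:8000",
    attempts: int = 30,
    delay: float = 1.0,
    probe: Callable[[str], bool] = http_probe,
) -> bool:
    """Poll the readiness endpoint until it succeeds or attempts run out."""
    url = base_url.rstrip("/") + "/health/readiness"
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # back off before the next try
    return False
```

Calling `wait_until_ready()` with the defaults checks `http://localhost:8000/health/readiness` up to 30 times, one second apart. The `probe` parameter exists only so the polling logic can be exercised without a running server.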
Alternatively, pull and run the published Docker image:

    docker pull watercrawl/playwright:latest
    docker run -p 8000:8000 -e AUTH_API_KEY=your-secret-key watercrawl/playwright

The API documentation is available through Swagger UI at the `/docs` endpoint. It provides:
- Interactive API documentation
- Request/response examples
- Try-it-out functionality
- OpenAPI specification
- `GET /health/liveness` - Liveness probe
- `GET /health/readiness` - Readiness probe
- `POST /html` - Fetch HTML content from a URL
Example request body:

    {
      "url": "https://example.com",
      "proxy": {
        "type": "http",
        "host": "proxy.example.com",
        "port": 8080,
        "username": "user",
        "password": "pass"
      },
      "block_media": true,
      "user_agent": "custom-user-agent",
      "locale": "en-US",
      "extra_headers": {
        "Custom-Header": "value"
      }
    }

When `AUTH_API_KEY` is set in the environment, the API requires authentication using the `X-API-Key` header:

    curl -X POST http://localhost:8000/html \
      -H "Content-Type: application/json" \
      -H "X-API-Key: your-secret-api-key" \
      -d '{"url": "https://example.com"}'

To run the service locally without Docker:

- Create a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Install Playwright browsers:

      playwright install chromium

- Run the application:

      uvicorn main:app --reload

- Access the API documentation:
  - Open http://localhost:8000/docs in your browser
  - Try out the endpoints directly from the Swagger UI
  - View the OpenAPI specification at `/openapi.json`

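
The `POST /html` call shown earlier with `curl` can also be issued from Python. The sketch below uses only the standard library and the field names from the example request body in this README. `build_html_request` and `fetch_html` are hypothetical helper names. The response format is not documented here, so the sketch returns the raw response text without parsing it.

```python
import json
import urllib.request
from typing import Any, Optional


def build_html_request(
    base_url: str,
    target_url: str,
    api_key: Optional[str] = None,
    **options: Any,
) -> urllib.request.Request:
    """Build a POST /html request; `options` are extra body fields
    such as block_media, user_agent, locale, proxy, or extra_headers."""
    payload = {"url": target_url, **options}
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Only needed when the server was started with AUTH_API_KEY set.
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        base_url.rstrip("/") + "/html",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def fetch_html(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the response body as text (unparsed)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With the service running, a call might look like `fetch_html(build_html_request("http://localhost:8000", "https://example.com", api_key="your-secret-api-key", block_media=True))`.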
| Variable | Description | Default |
|---|---|---|
| AUTH_API_KEY | API key for authentication | None (disabled) |
| PORT | Server port | 8000 |
| HOST | Server host | 0.0.0.0 |
| PYTHONUNBUFFERED | Python unbuffered output | 1 |
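
To illustrate how the defaults in the table behave, here is a minimal sketch of reading these variables and applying the optional API key check. It is not the project's actual code: `Settings`, `load_settings`, and `is_authorized` are hypothetical names, and treating an empty `AUTH_API_KEY` as unset is an assumption of this sketch.

```python
import hmac
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    auth_api_key: Optional[str]
    port: int
    host: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the table's defaults."""
    return Settings(
        # Assumption: an empty value disables authentication, like an unset one.
        auth_api_key=env.get("AUTH_API_KEY") or None,
        port=int(env.get("PORT", "8000")),
        host=env.get("HOST", "0.0.0.0"),
    )


def is_authorized(settings: Settings, provided_key: Optional[str]) -> bool:
    """Accept every request when no key is configured; otherwise
    compare the X-API-Key value in constant time."""
    if settings.auth_api_key is None:
        return True
    if provided_key is None:
        return False
    return hmac.compare_digest(
        provided_key.encode("utf-8"), settings.auth_api_key.encode("utf-8")
    )
```

Passing a plain dictionary instead of `os.environ` makes the behavior easy to check, for example `load_settings({"PORT": "9000"})` yields port 9000 with authentication disabled.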
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.