Read2Me is a FastAPI application that fetches content from provided URLs, processes the text, converts it into speech using Microsoft Azure's Edge TTS or the local TTS models Kokoro TTS (via Kokoro FastAPI) or Chatterbox, and tags the resulting MP3 files with metadata. You can either turn the full text into audio or have an LLM convert the seed text into a podcast. Currently, Ollama and any OpenAI-compatible API are supported. You can install the provided Chromium Extension in any Chromium-based browser (e.g. Chrome or Microsoft Edge) to send the current URL or any text to the server, and to add sources and keywords for automatic fetching.
This is currently a beta version, but I plan to extend it to support other content types (e.g., EPUB) in the future and to provide more robust support for languages other than English. The default Azure Edge TTS already supports other languages and tries to autodetect the language from the text, but quality may vary depending on the language.
- Fetches and processes content from HTML URLs and saves it as a markdown file.
- Converts text to speech using Microsoft Azure's Edge TTS (currently selecting at random from the available multilingual voices to easily handle multiple languages; see the sketch after this list).
- Tags MP3 files with metadata, including the title, author, and publication date, if available.
- Adds a cover image with the current date to the MP3 files.
- For Wikipedia URLs, uses the wikipedia Python library to extract article content.
- Automatically retrieves new articles from specified sources at defined intervals (currently hard-coded to twice a day, at 5 AM and 5 PM local time). Sources and keywords can be specified via text files.
- Turns any seed text (a URL or manually entered text) into a podcast.
- Chrome Extension available on the Chrome Web Store: READ2ME Browser Companion. If you prefer installing the Extension from source, it's available in this repository as well.
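
To illustrate the random multilingual voice selection mentioned above, here is a minimal sketch using the edge-tts library. This is not the app's exact code; in particular, filtering on "Multilingual" in the voice ShortName is an assumption:

```python
import asyncio
import random

import edge_tts

async def speak(text: str, out_path: str = "out.mp3") -> None:
    # List all available voices and keep the multilingual ones
    # (assumption: such voices carry "Multilingual" in their ShortName).
    voices = await edge_tts.list_voices()
    multilingual = [v["ShortName"] for v in voices if "Multilingual" in v["ShortName"]]
    # Pick one at random, then synthesize and save straight to MP3.
    voice = random.choice(multilingual)
    await edge_tts.Communicate(text, voice).save(out_path)

asyncio.run(speak("Hello from Read2Me!"))
```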
Clone the repository:

```bash
git clone https://github.com/WismutHansen/READ2ME.git
cd read2me
```

Install the Python dependencies:

```bash
uv sync
```

Activate the venv and install Playwright:

```bash
source .venv/bin/activate && playwright install
# or .venv\Scripts\activate && playwright install (on Windows)
```

Note: ffmpeg is required for converting WAV files into MP3 when using either of the local TTS engines.
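
If you want to verify your ffmpeg setup, a plain WAV-to-MP3 conversion looks like this (file names are placeholders):

```bash
ffmpeg -i input.wav -b:a 192k output.mp3
```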
Set up environment variables:

Rename the `.env.example` file in the root directory to `.env` and edit its content to your preference:

```bash
OUTPUT_DIR=Output                          # Directory to store output files
SOURCES_FILE=sources.json                  # File containing sources to retrieve articles from twice a day
IMG_PATH=front.jpg                         # Path to the image file to use as cover
OLLAMA_BASE_URL=http://localhost:11434     # Standard port for Ollama
OPENAI_BASE_URL=http://localhost:11434/v1  # Example for Ollama's OpenAI-compatible endpoint
OPENAI_API_KEY=skxxxxxx                    # Your OpenAI API key if you use the official OpenAI API
MODEL_NAME=llama3.2:latest
LLM_ENGINE=Ollama                          # Valid options: Ollama, OpenAI
```

You can use either Ollama or any OpenAI-compatible API for title and podcast script generation (a summary function is also coming soon).
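
As a quick smoke test of the LLM settings, you can call the configured endpoint directly with the official openai client. The base URL, API key, and model name below simply mirror the example `.env` values; adjust them to your setup:

```python
from openai import OpenAI

# Point the client at Ollama's OpenAI-compatible endpoint (values from .env).
client = OpenAI(base_url="http://localhost:11434/v1", api_key="skxxxxxx")

response = client.chat.completions.create(
    model="llama3.2:latest",
    messages=[{"role": "user", "content": "Reply with one word: ready?"}],
)
print(response.choices[0].message.content)
```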
Start the backend:

```bash
uv run main.py
```

Start the frontend:

```bash
cd frontend && cp .env.local.example .env.local && pnpm install && pnpm run dev
```

You can access the frontend at http://localhost:3000.
Send a POST request to http://localhost:7788/v1/url/full with a JSON body containing the URL:
```json
{
  "url": "https://example.com/article"
}
```

You can use curl or any API client like Postman to send this request like this:

```bash
curl -X POST http://localhost:7788/v1/url/full \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article", "tts-engine": "edge"}'
```

The repository also contains a working Chromium Extension that you can install in any Chromium-based browser (e.g. Google Chrome) when developer settings are enabled.
Processing URLs:

The application periodically checks the `tasks.json` file for new jobs to process. It fetches the content for a given URL, extracts the text, converts it to speech, and saves the resulting MP3 files with appropriate metadata.
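
Conceptually, the processing loop behaves like the sketch below. This is an illustration only: the actual task schema and pipeline live in the application, and the field names here are hypothetical:

```python
import json
import time
from pathlib import Path

TASKS_FILE = Path("tasks.json")

def process(task: dict) -> None:
    # Placeholder for the real pipeline: fetch, extract text, TTS, tag MP3.
    print(f"Processing {task.get('url')} with {task.get('tts-engine', 'edge')}")

while True:
    if TASKS_FILE.exists():
        for task in json.loads(TASKS_FILE.read_text()):
            process(task)
    time.sleep(60)
```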
Specify sources and keywords for automatic retrieval:

Create a file called `sources.json` in your current working directory with the URLs of websites that you want to monitor for new articles. You can also set global keywords and per-source keywords to be used as filters for automatic retrieval. If you set "*" for a source, all of its new articles will be retrieved. Here is an example structure:
```json
{
  "global_keywords": ["globalkeyword1", "globalkeyword2"],
  "sources": [
    {
      "url": "https://example.com",
      "keywords": ["keyword1", "keyword2"]
    },
    {
      "url": "https://example2.com",
      "keywords": ["*"]
    }
  ]
}
```

The location of both files is configurable in the `.env` file.
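
To make the filter semantics concrete, here is a rough sketch of how the matching could work, assuming case-insensitive substring matching against article titles; the app's actual logic may differ:

```python
def matches(title: str, source_keywords: list[str], global_keywords: list[str]) -> bool:
    # "*" on a source means: retrieve every new article from it.
    if "*" in source_keywords:
        return True
    title_lower = title.lower()
    # Per-source and global keywords both act as filters.
    return any(kw.lower() in title_lower for kw in source_keywords + global_keywords)

print(matches("keyword1 makes headlines", ["keyword1", "keyword2"], []))  # True
print(matches("anything at all", ["*"], []))                              # True
```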
- POST /v1/url/full

  Adds a URL to the processing list.

  Request body:

  ```json
  { "url": "https://example.com/article", "tts-engine": "edge" }
  ```

  Response:

  ```json
  { "message": "URL added to the processing list" }
  ```
- POST /v1/url/podcast
- POST /v1/text/full
- POST /v1/text/podcast
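
For the text endpoints, the request presumably carries the raw text instead of a URL. The `text` field name below is an assumption, so check the interactive API docs (FastAPI serves them at /docs) for the exact schema:

```python
import requests

# Hypothetical body: the exact field name for raw text is an assumption.
response = requests.post(
    "http://localhost:7788/v1/text/podcast",
    json={"text": "Some seed text to turn into a podcast.", "tts-engine": "edge"},
    timeout=30,
)
print(response.status_code, response.json())
```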
- FastAPI: Web framework for building APIs.
- Uvicorn: ASGI server implementation for serving FastAPI applications.
- edge-tts: Microsoft Azure Edge Text-to-Speech library.
- mutagen: Library for handling audio metadata.
- Pillow: Python Imaging Library (PIL) for image processing.
- trafilatura: Library for web scraping and text extraction.
- requests: HTTP library for sending requests.
- BeautifulSoup: Library for parsing HTML and XML documents.
- pdfminer: Library for extracting text from PDF documents.
- python-dotenv: Library for managing environment variables.
- newspaper4k: Library for extracting articles from news websites.
- wikipedia: Library for extracting information from Wikipedia articles.
- schedule: Library for scheduling tasks; used to schedule automatic news retrieval twice a day (see the sketch after this list).
- and many more, though I plan to reduce the dependencies a bit by removing redundancies.
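
For reference, the twice-daily retrieval described above can be expressed with the schedule library roughly as follows (a sketch, not the app's exact code; `fetch_new_articles` is a placeholder):

```python
import time

import schedule

def fetch_new_articles() -> None:
    # Placeholder for the app's source polling and task creation.
    print("Checking sources for new articles...")

# Hard-coded times matching the README: 5 AM and 5 PM local time.
schedule.every().day.at("05:00").do(fetch_new_articles)
schedule.every().day.at("17:00").do(fetch_new_articles)

while True:
    schedule.run_pending()
    time.sleep(60)
```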
- Fork the repository.
- Create a new branch:

  ```bash
  git checkout -b feature/your-feature-name
  ```

- Make your changes and commit them:

  ```bash
  git commit -m 'Add some feature'
  ```

- Push to the branch:

  ```bash
  git push origin feature/your-feature-name
  ```

- Submit a pull request.
This project is licensed under the Apache License, Version 2.0 (January 2004).
- Language detection and voice selection based on the detected language (currently only works for edge-tts).
- Add support for handling PDF files
- Add support for LLM-based text processing, such as podcast transcripts, with local LLMs through Ollama or the OpenAI API
- Add support for Chatterbox TTS
- Add support for automatic image captioning using local vision models or the OpenAI API
I would like to thank the following repositories and authors for their inspiration and code:
- F5-TTS - A great open weights TTS model!
- StyleTTS2 - A great open source TTS engine, and really fast when using NVIDIA/CUDA
- Piper TTS - Another good local TTS engine that also works on low-spec systems
- AlwaysReddy - Thanks to these guys, I got Piper TTS working in my project (now removed in favor of better TTS models)
- rvc-python - For improving generated speech
- edge-tts - Best free online TTS engine
- Kokoro TTS - The fastest local TTS model with awesome audio quality!
- kokoro FastAPI - OpenAI-API compatible FastAPI server for Kokoro TTS
- Chatterbox - The best TTS model for English by far (as of May 2025), thanks to the great work by resemble.ai!
