An end-to-end grammar and punctuation correction system built with a FastAPI backend and a Streamlit frontend. The service combines rule-based checks, multiple Hugging Face generative models, and structure-aware text processing to deliver high-quality corrections for everything from quick sentences to long, formatted documents.
- **FastAPI Backend**
  - LanguageTool integration with intelligent chunking to keep corrections fast and consistent.
  - Multiple transformer models (`prithivida/grammar_error_correcter_v1`, `vennify/t5-base-grammar-correction`, `pszemraj/flan-t5-large-grammar-synthesis`, `grammarly/coedit-large`) loaded with GPU support when available.
  - Dynamic model selection (`auto`, `lightweight`, `best`) plus a text2text pipeline fallback for premium quality.
  - Text structure analysis (word/sentence/paragraph counts, formatting detection) to adapt processing strategy, batching, and prompts automatically (see the sketch after this list).
  - Inline diff highlighting and similarity scoring to explain each correction.
  - Health and model metadata endpoints for observability.
- **Streamlit Frontend**
  - Rich UI with live metrics, sample texts, and clipboard helpers.
  - Sidebar controls for model preferences, advanced options, and API health checks.
  - Progress indicators, error handling, and download options (plain text, report, JSON).
  - Designed to support both quick edits and multi-paragraph document reviews.
- **Deployment Ready**
  - Dockerfile configured with Java (for LanguageTool) and caching for large Hugging Face models.
  - Environment variables for Hugging Face caching (`HF_HOME`, `TRANSFORMERS_CACHE`) pre-set in the container.
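The structure analysis and similarity scores above can be pictured with a short sketch. This is illustrative rather than the actual backend code: the function names and the segmentation threshold are assumptions, though `difflib.SequenceMatcher` is a standard way to produce similarity ratios like those returned in the API response.

```python
import re
from difflib import SequenceMatcher

def analyze_structure(text: str) -> dict:
    """Derive counts like those in the API's text_analysis block."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = text.split()
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "paragraph_count": len(paragraphs),
        # hypothetical threshold: long multi-paragraph inputs get chunked
        "needs_segmentation": len(paragraphs) > 1 and len(words) > 200,
    }

def similarity(original: str, corrected: str) -> float:
    """Similarity ratio in [0, 1], like the per-model scores in the response."""
    return SequenceMatcher(None, original, corrected).ratio()

print(analyze_structure("First paragraph.\n\nSecond one. With two sentences."))
print(similarity("She go to school.", "She goes to school."))
```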
```
Grammar/
├── grammer_backend.py    # FastAPI application with correction pipeline
├── grammer_frontend.py   # Streamlit UI for interacting with the API
├── grammerly.py          # Extended experimental backend (not used by default)
├── requirements.txt      # Python dependencies
└── Dockerfile            # Container definition (serves FastAPI by default)
```
**Note:** `grammerly.py` contains an advanced, experimental service with WebSockets and background tasks. The primary production entrypoints are `grammer_backend.py` (API) and `grammer_frontend.py` (UI).
- Python 3.12+
- (Optional) CUDA-capable GPU for faster inference
- Java runtime (installed automatically in Docker; required by LanguageTool when running natively)
```bash
git clone <your-repo-url>
cd Grammar
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
```

Start the API:

```bash
uvicorn grammer_backend:app --host 0.0.0.0 --port 8000 --reload
```

The server exposes:
- `POST /correct/` – main correction endpoint
- `GET /models/` – model status (loaded models, device, pipeline availability)
- `GET /health/` – quick health check
On first run, large transformer weights may download from Hugging Face. Ensure you have sufficient disk space and a stable connection.
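Once the server is up (and any first-run downloads have finished), a quick smoke test from Python:

```python
import requests

# /health/ returns a small status payload; /models/ lists loaded models
print(requests.get("http://localhost:8000/health/", timeout=10).json())
print(requests.get("http://localhost:8000/models/", timeout=10).json())
```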
Run the Streamlit frontend:

```bash
streamlit run grammer_frontend.py
```

By default, the UI expects the API at `http://localhost:8000`. Adjust `API_BASE_URL` in `grammer_frontend.py` if you deploy elsewhere.
**`POST /correct/`**

**Request body**

```jsonc
{
  "text": "Your input text here.",
  "use_best_model": true,
  "model_preference": "auto",        // "auto" | "lightweight" | "best"
  "preserve_formatting": true,
  "processing_mode": "intelligent"   // currently accepted but reserved for future use
}
```

**Response outline**
```json
{
  "original": "...",
  "text_analysis": {
    "word_count": 123,
    "sentence_count": 10,
    "paragraph_count": 3,
    "structure_type": "multi_paragraph",
    "needs_segmentation": true,
    "avg_paragraph_length": 41.0
  },
  "results": {
    "language_tool": {
      "corrected": "...",
      "highlighted": "<span>...</span>",
      "similarity": 0.94
    },
    "best_pipeline": {
      "corrected": "...",
      "highlighted": "...",
      "similarity": 0.97
    },
    "huggingface": {
      "pszemraj/flan-t5-large-grammar-synthesis": {
        "corrected": "...",
        "highlighted": "...",
        "similarity": 0.98
      }
    }
  },
  "final_best": "Best end-to-end correction",
  "models_used": [
    "prithivida/grammar_error_correcter_v1",
    "vennify/t5-base-grammar-correction",
    "pszemraj/flan-t5-large-grammar-synthesis",
    "grammarly/coedit-large"
  ],
  "device": "cuda",
  "processing_strategy": {
    "chunks_used": true,
    "structure_type": "multi_paragraph",
    "batch_size_used": 2
  }
}
```

**`GET /models/`**

Returns a list of loaded models, device information, and whether the premium pipeline is available.

**`GET /health/`**

Simple status payload with the number of loaded models and the current device.
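Putting the request and response together, a minimal Python client might look like this. The field names are taken from the examples above; the generous timeout allows for long documents and first-run model loading:

```python
import requests

API = "http://localhost:8000"  # adjust if deployed elsewhere

payload = {
    "text": "She go to school every days.",
    "use_best_model": True,
    "model_preference": "auto",
    "preserve_formatting": True,
}

resp = requests.post(f"{API}/correct/", json=payload, timeout=300)
resp.raise_for_status()
data = resp.json()

print(data["final_best"])              # best end-to-end correction
print(data["device"])                  # "cuda" or "cpu"
for name, result in data["results"]["huggingface"].items():
    print(name, result["similarity"])  # per-model similarity scores
```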
- **Settings Sidebar**
  - Toggle premium models, choose quality vs. speed (`auto`, `lightweight`, `best`).
  - Optional similarity scores and comparison toggles.
  - API health check button to retrieve backend status.
- **Text Input Panel**
  - Sample snippets for quick testing.
  - Word/character/sentence counters update live.
  - Clear text and clipboard helper buttons.
- **Correction Flow**
  - Visual progress bar and status message while waiting on the API.
  - Metrics showing processing time, model count, and device used.
  - Expandable cards for original text, best correction, LanguageTool results, best pipeline, and per-model outputs.
  - Download buttons for plain text, a formatted report, or the raw JSON.
Build and run the FastAPI service in a container:
```bash
docker build -t grammar-api .
docker run -p 8000:8000 grammar-api
```

- The Dockerfile installs Java for LanguageTool and sets the Hugging Face cache directories to `/app/huggingface`.
- To include the Streamlit UI in a container, add another stage or run Streamlit separately against the containerized API.
- Large model downloads fail: configure Hugging Face authentication or pre-download models into `HF_HOME`/`TRANSFORMERS_CACHE` (see the sketch after this list).
- LanguageTool errors: ensure Java is available. In constrained environments, the code falls back to `LanguageToolPublicAPI`, but rate limits may apply.
- API timeouts: long documents can exceed the default 5-minute timeout on the frontend. Increase the Streamlit request timeout or process in smaller chunks.
- GPU not detected: verify CUDA drivers; the backend will run on CPU if GPU is unavailable, but processing slows down.
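For the first item, one way to pre-download weights is with `huggingface_hub`. This is a sketch: the model IDs are the ones listed in the features above, and the files land in the standard Hugging Face cache, whose location `HF_HOME` controls (the Docker image uses `/app/huggingface`).

```python
from huggingface_hub import snapshot_download

# Set HF_HOME before running this to control where the weights are cached.
for model_id in [
    "prithivida/grammar_error_correcter_v1",
    "vennify/t5-base-grammar-correction",
    "pszemraj/flan-t5-large-grammar-synthesis",
    "grammarly/coedit-large",
]:
    snapshot_download(model_id)
```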
- Fork the repository and create a new branch.
- Run the formatting and linting tools you adopt (e.g., `ruff`, `black`, `mypy`) before sending a PR.
- Document changes clearly, especially when adding models or altering the correction workflow.
- Add or update tests/notebooks where relevant.
- Add automated tests (unit + integration) for the chunking logic and API endpoints.
- Expose authentication or rate limiting if deploying publicly.
- Package the frontend with the backend for a single deployment target (e.g., Docker Compose).
- Document Hugging Face model licensing considerations before commercial use.
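As a starting point for the first roadmap item, here is a minimal `pytest` sketch using FastAPI's `TestClient`. It assumes the app object is exposed as `app` in `grammer_backend.py` (as the `uvicorn` command above implies) and that only `text` is required in the request body:

```python
from fastapi.testclient import TestClient

from grammer_backend import app  # app name matches the uvicorn command above

client = TestClient(app)  # note: importing the backend may load models

def test_health():
    assert client.get("/health/").status_code == 200

def test_correct_returns_final_best():
    resp = client.post("/correct/", json={"text": "She go to school."})
    assert resp.status_code == 200
    assert "final_best" in resp.json()
```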