Movie recommendation system with Python. Implements content-based filtering (TF-IDF + cosine similarity), collaborative filtering with matrix factorization (TruncatedSVD), and a hybrid approach. Evaluates with Precision@K, Recall@K, and NDCG. Includes rating distribution plots, top movies, and sample recommendations.
- Synthetic ratings dataset with users, movies, and multi-label genres
- Content-based: TF‑IDF on genres + item–item cosine similarity
- Collaborative: user–item matrix factorization (TruncatedSVD)
- Hybrid: weighted combination of content & collaborative scores
- Evaluation: Precision@K, Recall@K, NDCG@K (per-user & macro)
- Visuals: rating distribution, top movies, sample recommendations
- Outputs: metrics JSON, recommendation files, and charts
movie-recommendation-system/
├─ README.md
├─ LICENSE
├─ requirements.txt
├─ data/
│ └─ generate_ratings.py
├─ src/
│ ├─ build_recommender.py
│ └─ utils.py
└─ outputs/
└─ figures & reports (auto-created)
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
pip install -r requirements.txtpython data/generate_ratings.py --users 800 --movies 1200 --seed 42 --outdir data
# Produces: data/ratings.csv, data/movies.csvpython src/build_recommender.py --ratings data/ratings.csv --movies data/movies.csv --outdir outputs --k 10 --alpha 0.6 --seed 42Outputs
outputs/metrics.json– Precision@K, Recall@K, NDCG@K (content, collab, hybrid)outputs/recs_user_*.txt– sample recommendations for a few usersoutputs/ratings_hist.pngoutputs/top_movies.png
| Method | Precision@10 | Recall@10 | NDCG@10 |
|---|---|---|---|
| Collaborative | 0.0024 | 0.0107 | 0.0059 |
| Content-based | 0.0026 | 0.0129 | 0.0073 |
| Hybrid (α=0.6) | 0.0024 | 0.0107 | 0.0060 |
➡️ On this sparse synthetic dataset, content-based slightly outperforms collaborative and hybrid.
User 1
Movie 959 41.08
Movie 541 36.28
Movie 952 34.83
Movie 95 34.81
Movie 40 33.30
User 2
Movie 915 47.02
Movie 122 38.27
Movie 77 38.18
Movie 1097 37.65
Movie 952 37.04
User 3
Movie 800 52.00
Movie 410 45.77
Movie 9 45.59
Movie 738 41.34
Movie 713 41.00
- Positive items are defined as ratings ≥ 4 for evaluation.
- The hybrid weight
alphacontrols emphasis on collaborative (α→1) vs. content (α→0).