@stephantul
Contributor

This PR adds a configurable pad token to the training pipeline. Previous versions always assumed the pad token was [PAD], which is almost always the case but is not guaranteed. The user can now configure the pad token directly; [PAD] remains the default.
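
To illustrate the idea, here is a minimal sketch of what a configurable pad token with a [PAD] default can look like. It is not the code from this PR: `resolve_pad_id`, `pad_batch`, `DEFAULT_PAD_TOKEN`, and the toy vocabulary are hypothetical names made up for the example, and the actual change in model2vec/train/base.py may be structured differently.

```python
from __future__ import annotations

# Default taken from the PR description; every other name here is hypothetical.
DEFAULT_PAD_TOKEN = "[PAD]"


def resolve_pad_id(vocab: dict[str, int], pad_token: str = DEFAULT_PAD_TOKEN) -> int:
    """Return the id of `pad_token`, failing loudly if the vocabulary lacks it."""
    try:
        return vocab[pad_token]
    except KeyError:
        raise ValueError(f"pad token {pad_token!r} is not in the vocabulary") from None


def pad_batch(sequences: list[list[int]], pad_id: int) -> list[list[int]]:
    """Right-pad every sequence with `pad_id` to the length of the longest one."""
    longest = max((len(seq) for seq in sequences), default=0)
    return [seq + [pad_id] * (longest - len(seq)) for seq in sequences]


# Toy vocabulary (assumed data) whose pad token is "<pad>" rather than "[PAD]".
vocab = {"<pad>": 0, "hello": 1, "world": 2}
pad_id = resolve_pad_id(vocab, pad_token="<pad>")
batch = pad_batch([[1, 2], [1]], pad_id)
```

Raising an error when the token is missing, rather than silently falling back to some other id, is one possible design choice: it surfaces a misconfigured pad token before training starts.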

@stephantul stephantul marked this pull request as ready for review September 7, 2025 18:32
@stephantul stephantul requested a review from Pringled September 7, 2025 18:32
codecov bot commented Sep 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
model2vec/train/base.py 100.00% <100.00%> (ø)
@stephantul stephantul merged commit 55b955a into main Sep 8, 2025
7 checks passed
@stephantul stephantul deleted the configure-pad-token branch September 8, 2025 11:57