
Conversation

@HAOCHENYE (Collaborator)

No description provided.

@HAOCHENYE HAOCHENYE requested a review from Copilot December 24, 2025 04:32
@HAOCHENYE HAOCHENYE changed the title [Patch] Skip calculate loss of empty <think></think> (ಥ﹏ಥ) Dec 24, 2025
Copilot AI (Contributor) left a comment


Pull request overview

This PR introduces a temporary monkey patch to skip loss calculation for empty <think></think> tags in training data. The patch is controlled via the XTUNER_SKIP_EMPTY_THINK environment variable and is explicitly designed to be disposable rather than a permanent solution.

Key Changes:

  • Added _hardcode_patch.py with _SkipEmptyThink mixin class that filters empty thinking tags from labels
  • Modified multiple tokenize function classes through runtime base class injection
  • Wrapped tokenization methods to apply label processing post-tokenization
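The exact masking logic lives in `_hardcode_patch.py`; as a rough illustration of the idea, here is a minimal sketch assuming the common convention that -100 marks label positions excluded from the loss. The token strings and the `mask_empty_think` helper are hypothetical, not the PR's actual implementation:

```python
from typing import List

IGNORE_INDEX = -100  # conventional ignore index for cross-entropy loss


def mask_empty_think(tokens: List[str], labels: List[int]) -> List[int]:
    """Set labels to IGNORE_INDEX wherever '<think>' is immediately
    followed by '</think>', i.e. an empty thinking block."""
    masked = list(labels)
    for i in range(len(tokens) - 1):
        if tokens[i] == "<think>" and tokens[i + 1] == "</think>":
            masked[i] = IGNORE_INDEX
            masked[i + 1] = IGNORE_INDEX
    return masked


tokens = ["Hi", "<think>", "</think>", "Bye"]
labels = [1, 2, 3, 4]
print(mask_empty_think(tokens, labels))  # [1, -100, -100, 4]
```

In the actual patch this post-processing is applied to labels after tokenization, via the wrapped tokenization methods described above.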

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

Changed files:

  • xtuner/v1/datasets/_hardcode_patch.py — implements the core monkey patching logic with the _SkipEmptyThink mixin and a label processing wrapper for filtering empty <think></think> sequences
  • xtuner/v1/datasets/__init__.py — imports the hardcode patch module to activate the monkey patching when the environment variable is set
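A sketch of what env-var-gated activation might look like; the variable name `XTUNER_SKIP_EMPTY_THINK` comes from the PR, but the accepted truthy values and the helper below are assumptions for illustration:

```python
import os


def skip_empty_think_enabled(env: dict) -> bool:
    """True when XTUNER_SKIP_EMPTY_THINK enables the patch.
    Which values count as truthy here is an assumption."""
    return env.get("XTUNER_SKIP_EMPTY_THINK", "0").lower() in {"1", "true"}


# In __init__.py the gated import itself would apply the monkey
# patch as a side effect, roughly:
#   if skip_empty_think_enabled(os.environ):
#       from . import _hardcode_patch  # noqa: F401
print(skip_empty_think_enabled({"XTUNER_SKIP_EMPTY_THINK": "1"}))  # True
```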


logger = get_logger()


# Need to inherit ABC otherwise the assignment of `__bases__` will faild
Copilot AI Dec 24, 2025

Typo in comment: "faild" should be "fail".

Suggested change
# Need to inherit ABC otherwise the assignment of `__bases__` will faild
# Need to inherit ABC otherwise the assignment of `__bases__` will fail
class _SkipEmptyThink(ABC):
    def __init__(self, tokenizer: PreTrainedTokenizer, **kwargs):
        # For `__mro__` compatibility
        super().__init__(tokenizer=tokenizer)
Copilot AI Dec 24, 2025

The __init__ method accepts **kwargs but never uses or forwards them. This could silently ignore important configuration parameters that were intended for the parent class or other mixins in the MRO chain. Consider using super().__init__(tokenizer=tokenizer, **kwargs) to properly forward any additional arguments.

Suggested change
super().__init__(tokenizer=tokenizer)
super().__init__(tokenizer=tokenizer, **kwargs)
@HAOCHENYE HAOCHENYE force-pushed the yehc/fucking-hard-code branch 3 times, most recently from 46633ad to f607e1a Compare December 24, 2025 05:13
@HAOCHENYE HAOCHENYE force-pushed the yehc/fucking-hard-code branch from f607e1a to cd748ba Compare December 24, 2025 05:14
@hhaAndroid hhaAndroid merged commit 2ce4c13 into InternLM:main Dec 24, 2025
4 checks passed
