@monster29000

No description provided.

@RahulVadisetty91 RahulVadisetty91 left a comment

re4544

The function num_tokens_from_string hardcodes the model "gpt-3.5-turbo" through tiktoken.encoding_for_model("gpt-3.5-turbo"). That encoding belongs to OpenAI models, so it is not the right tokenizer for BART and the counts it returns may not match what the BART model actually sees. Since you are using facebook/bart-large-cnn for summarization, count tokens with that model's own tokenizer instead.
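A minimal sketch of what that could look like, assuming the Hugging Face transformers package is available in this project. The helper name load_bart_tokenizer and the extra tokenizer parameter are my own suggestions, not something already in the PR:

```python
from typing import Protocol, Sequence


class SupportsEncode(Protocol):
    """Anything with an encode(text) method that returns token ids."""

    def encode(self, text: str) -> Sequence[int]: ...


def load_bart_tokenizer(model_name: str = "facebook/bart-large-cnn"):
    """Hypothetical helper: load the tokenizer that matches the summarization model.

    transformers is imported lazily so the counting function below can be
    used and tested without the package installed.
    """
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_name)


def num_tokens_from_string(text: str, tokenizer: SupportsEncode) -> int:
    """Count tokens as the supplied tokenizer sees them.

    With a Hugging Face tokenizer, encode() adds special tokens by default,
    so the count includes them.
    """
    return len(tokenizer.encode(text))


# Intended usage (downloads the tokenizer files on first call):
#   tokenizer = load_bart_tokenizer()
#   n = num_tokens_from_string("Some text to summarize.", tokenizer)
```

Passing the tokenizer in, rather than loading it inside the function, avoids reloading it on every call and keeps the count tied to whichever model the summarizer is actually using.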
