Conversation

@dpressel
Owner

Add support for OPT. It is:

  • a decoder-only model with learned positional embeddings up to 2k positions
  • the same checkpoint structure as BART, minus the encoder
  • a GPT2-style byte-level tokenizer with a different vocabulary
  • ReLU activations instead of GeLU
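Two of the bullets above (learned positions capped at 2k, ReLU in place of GeLU) can be sketched framework-free. This is a toy illustration, not the real model: the table values, the width `D`, and the helper names are placeholders, and in the actual checkpoint the position table is a trained weight matrix.

```python
import random

random.seed(0)
MAX_POS, D = 2048, 4   # OPT caps learned positions at 2k; D is a toy width

# "Learned" position table: random here; in the real model these rows
# come from the checkpoint rather than a sinusoidal formula.
pos_table = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(MAX_POS)]

def embed_positions(seq_len):
    # A learned table is a plain row lookup, so unlike sinusoidal
    # embeddings it cannot extrapolate past its 2k rows.
    assert seq_len <= MAX_POS, "cannot attend past the learned table"
    return [pos_table[i] for i in range(seq_len)]

def relu(vec):
    # OPT's feed-forward blocks use ReLU where GPT2/BART use GeLU.
    return [max(0.0, x) for x in vec]

hidden = embed_positions(8)
activated = [relu(v) for v in hidden]
```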

The HuggingFace Tokenizers library cannot consume the tokenizer_config.json provided in the OPT repo, so I created a tokenizer.json, using the GPT2 one as a template and adding a post-processor to the tokenizer. My tokenizer.json is available from https://www.dropbox.com/s/ut8qj4nynhkq4cd/tokenizer.json?dl=1; once saved locally as tokenizer.json, it can be used with the opt_completer example.
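The post-processor mentioned above can be expressed with the Tokenizers library's `TemplateProcessing`. This is a minimal sketch with a toy word-level vocabulary standing in for OPT's real byte-level BPE model; the exact template (prepending the `</s>` BOS token, as the reference OPT tokenizer does) and the token ids here are assumptions for illustration, not the contents of the actual tokenizer.json.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.processors import TemplateProcessing

# Toy vocabulary; the real tokenizer.json carries a GPT2-style
# byte-level BPE vocab instead (assumption: ids chosen for the demo).
vocab = {"[UNK]": 0, "</s>": 2, "hello": 4, "world": 5}
tok = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
tok.pre_tokenizer = Whitespace()

# The kind of post-processor the PR adds: prepend the `</s>` BOS token
# to every single-sequence encoding.
tok.post_processor = TemplateProcessing(
    single="</s> $A",
    special_tokens=[("</s>", vocab["</s>"])],
)

enc = tok.encode("hello world")
print(enc.ids)  # → [2, 4, 5]
```

Serializing such a tokenizer with `tok.save("tokenizer.json")` produces a file the Tokenizers library can load directly, which is the shape the opt_completer example expects.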

dpressel added 4 commits July 25, 2022 00:23
Still need to switch to tokenizers lib for tok-ing
