@danielhanchen
Contributor

  1. Fixes RMS LayerNorm downcasting prematurely. The downcast is moved to the very end.

  2. Fixes the embedding matrix scaling, where the normalizer was being upcast to float32. The normalizer must instead be in float16 or bfloat16.
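
To make the two points concrete, here is a minimal pure-Python sketch of the rounding effects involved. It is not the code from this PR: the helper names (`to_f16`, `to_bf16`, `rms_norm_early_cast`, `rms_norm_late_cast`), the per-element `scale`, the sample inputs, and `hidden_size = 3072` are all assumptions made for illustration, and Python floats stand in for the float32 intermediate.

```python
import math
import struct

def to_f16(x):
    """Round a Python float to the nearest float16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_bf16(x):
    """Round to bfloat16 via round-to-nearest-even on the float32 bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def rms_norm_ref(x, scale, eps=1e-6):
    """Full-precision RMS norm followed by a per-element scale (no rounding)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [(v / rms) * s for v, s in zip(x, scale)]

def rms_norm_early_cast(x, scale, cast, eps=1e-6):
    """Premature downcast: round right after normalizing, then scale and round again."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [cast(cast(v / rms) * s) for v, s in zip(x, scale)]

def rms_norm_late_cast(x, scale, cast, eps=1e-6):
    """Downcast at the very end: a single rounding step on the finished result."""
    return [cast(r) for r in rms_norm_ref(x, scale, eps)]

def max_err(a, b):
    """Largest absolute element-wise difference between two lists."""
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical inputs, chosen to be exactly representable in float16.
x = [0.5, -1.25, 2.0, 0.75]
scale = [1.5, 0.75, 1.25, 1.125]

ref = rms_norm_ref(x, scale)
print("early-cast error:", max_err(rms_norm_early_cast(x, scale, to_f16), ref))
print("late-cast error: ", max_err(rms_norm_late_cast(x, scale, to_f16), ref))

# Embedding normalizer sqrt(hidden_size); hidden_size is an assumed example value.
hidden_size = 3072
normalizer = math.sqrt(hidden_size)
print("float16 normalizer: ", to_f16(normalizer))   # 55.4375
print("bfloat16 normalizer:", to_bf16(normalizer))  # 55.5
```

With float16, the late-cast result is the nearest representable value to the full-precision result, so its error can never exceed the early-cast error. The last two lines show how far the normalizer moves once it is rounded to the low-precision dtype, which is why its dtype changes the scaled embeddings.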

@pengchongjin
Contributor

Thanks, @danielhanchen !
