From the course: LLaMa for Developers
Using a vendor for serving LLaMA - Llama Tutorial
- [Instructor] So far, we've worked with Llama both in a chat setting on Hugging Face and running it locally in a Colab. In this video, we're going to discuss how we can use a vendor to access Llama. So let's go into a GitHub project. I'm here on the ray-project's llmperf-leaderboard. What this repo does is benchmark how quickly Llama models are served by different vendors. As you can see here, eight different vendors are benchmarked, and all of them offer a solution for running Llama. More recently, the Groq set of chips and APIs has been shown to be among the fastest in the industry, so I'm going to go ahead and try those out. I'm going to open up console.groq.com and hit Login with Google. Currently, GroqCloud is free, but it has a pretty low rate limit, so it probably shouldn't be used in production. So let's go ahead and create an API key by clicking on API Keys on the left. I have one here already,…
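Once you have an API key from the console, calling Llama through Groq is a plain HTTPS request, since GroqCloud exposes an OpenAI-compatible chat-completions endpoint. Here's a minimal sketch using only the Python standard library; the endpoint URL is Groq's documented one, but the model name `llama3-8b-8192` is an assumption — check the console for the models currently offered:

```python
# Sketch: single-turn chat completion against GroqCloud's
# OpenAI-compatible endpoint. Assumes GROQ_API_KEY is set in the
# environment; the model name below may change, so verify it in
# the Groq console before relying on it.
import json
import os
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "llama3-8b-8192") -> dict:
    """Build the JSON body for a single-turn chat completion."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def ask_llama(prompt: str) -> str:
    """Send the prompt to Groq and return the model's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        # The response follows the OpenAI schema: the reply text lives
        # under choices[0].message.content.
        return json.load(resp)["choices"][0]["message"]["content"]


# Usage (requires a valid key and network access):
#   print(ask_llama("Explain quantization in one sentence."))
```

Because the request and response shapes match the OpenAI API, the official `openai` or `groq` Python clients can be dropped in later with only the base URL and key changed.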
Contents
- (Locked) Resources required to serve LLaMA (4m 35s)
- (Locked) Quantizing LLaMA (4m 7s)
- (Locked) Using TGI for serving LLaMA (2m 40s)
- (Locked) Using vLLM for serving LLaMA (5m 27s)
- (Locked) Using DeepSpeed for serving LLaMA (4m 13s)
- (Locked) Explaining LoRA and S-LoRA (1m 59s)
- (Locked) Using a vendor for serving LLaMA (3m 16s)