From the course: Advanced Guide to ChatGPT, Embeddings, and Other Large Language Models (LLMs)


Cost projecting and deploying LLMs to production

- When we think about cost projecting with large language models, we're generally talking about either closed-source or open-source models. So, let's start with closed-source. When you're deploying a closed-source LLM, like GPT-4 or Claude or Cohere, you're generally being charged by the token, or more precisely, by batches of tokens. So, deploying an API implementation of a prompt on an LLM is more or less just a matter of counting the number of input and output tokens and matching that count against the pricing. As an example, let's think about OpenAI's embedding product. When we used OpenAI's embeddings in previous lessons, what I didn't tell you is that they charge, at least at the time of recording, $0.0004 for every 1,000 tokens that you embed using the engine that we did, Ada-002. Now, if we assume an average of 500 tokens per document, which is roughly a page of text, then the cost per document would be $0.0002. So, to embed a million documents, it would cost approximately $200.…
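The arithmetic above can be sketched as a small cost-projection helper. This is a minimal sketch, not an official OpenAI utility; the $0.0004-per-1,000-tokens rate and the 500-token-per-document average come from the lesson, and the function name `estimate_embedding_cost` is ours for illustration. Real pricing changes over time, so check the provider's current rates before budgeting.

```python
# Sketch of the cost projection described in the lesson.
# Assumed pricing from the transcript (time of recording): $0.0004 per 1,000 tokens
# for the Ada-002 embedding engine. Verify current rates before relying on this.
PRICE_PER_1K_TOKENS = 0.0004


def estimate_embedding_cost(num_documents: int,
                            avg_tokens_per_doc: int = 500,
                            price_per_1k_tokens: float = PRICE_PER_1K_TOKENS) -> float:
    """Return the estimated dollar cost of embedding `num_documents`.

    Cost = total tokens / 1,000 * price per 1,000 tokens.
    """
    total_tokens = num_documents * avg_tokens_per_doc
    return total_tokens / 1000 * price_per_1k_tokens


# One ~500-token document (roughly a page of text):
per_doc = estimate_embedding_cost(1)        # 0.0002 dollars
# One million such documents:
per_million = estimate_embedding_cost(1_000_000)  # 200.0 dollars
print(f"Per document: ${per_doc}")
print(f"Per million documents: ${per_million}")
```

In practice, you would replace the fixed per-document average with actual token counts (for example, from a tokenizer) to get a tighter projection before committing to an embedding run.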
