From the course: AWS Certified Machine Learning Engineer Associate (MLA-C01) Cert Prep
Cost tradeoffs of AWS GenAI services
(soft gentle music) (soft gentle music ends)

- [Instructor] Hello, guys. In today's lesson, we're going to talk about the cost trade-offs of AWS generative AI services.

Let's start with responsiveness and availability. On responsiveness, AWS GenAI services offer varying response times, depending on the model size and the configuration. For real-time applications like chatbots, it's important to prioritize lower latency, though this can increase costs because of the higher resource consumption. On availability, high availability is typically ensured, but mission-critical applications that require multi-region deployments, for example, will also cost more. The trade-off here is paying more to get better performance for your most critical applications. Regarding the cost impact, keep in mind that lower latency and higher availability come with a price. While these features boost performance, they also raise the overall expenses.

Now let's cover performance, redundancy, and regional coverage. On performance, the model size, the use of hardware accelerators like AWS Trainium-based instances, and the model configuration can all have a significant impact on both performance and cost. Larger models often provide better results, but they require more resources, which leads to increased costs. For applications that need fault tolerance, built-in redundancy improves reliability. However, it also adds to the expenses, so it's important to weigh the need for high reliability against your budget. Also, not every AWS service is available in every Region. Deploying your application closer to your user base can improve latency, but it may involve higher regional pricing.

Let's now dive into the pricing models and the customization options for the GenAI services. Many services follow a token-based pricing model, where you are charged based on the number of tokens processed. This means that the more text or data you process, the higher the cost, so it's important to understand and monitor your usage in order to estimate your costs. If your application needs guaranteed capacity and performance, you can choose provisioned throughput instead. This approach provides consistent performance but may increase your costs, so it's best suited to mission-critical applications. Services like Amazon Bedrock also offer customization options, which allow you to tailor models to your business needs. This customization can improve long-term performance, but it comes with added costs, so it's important to balance the initial investment against the potential for greater value over time.
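The pricing trade-off described in the lesson can be sketched with a little arithmetic. The snippet below is a minimal illustration, not an official AWS calculator: the class and function names are made up for this example, and every rate in it is a hypothetical placeholder rather than a real AWS price. It estimates a token-based (on-demand) bill, estimates a provisioned throughput bill, and finds the request volume at which the two cost the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical on-demand rates, in USD per 1,000 tokens."""
    input_per_1k: float
    output_per_1k: float


def on_demand_cost(pricing: TokenPricing, input_tokens: int, output_tokens: int) -> float:
    """Token-based cost: you pay for every input and output token processed."""
    return (
        (input_tokens / 1000) * pricing.input_per_1k
        + (output_tokens / 1000) * pricing.output_per_1k
    )


def provisioned_cost(hourly_rate: float, hours: float) -> float:
    """Provisioned throughput cost: a flat hourly rate, regardless of usage."""
    return hourly_rate * hours


def break_even_requests(
    pricing: TokenPricing,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    hourly_rate: float,
    hours: float,
) -> float:
    """Number of requests at which on-demand spend equals provisioned spend."""
    per_request = on_demand_cost(
        pricing, input_tokens_per_request, output_tokens_per_request
    )
    if per_request <= 0:
        raise ValueError("per-request cost must be positive")
    return provisioned_cost(hourly_rate, hours) / per_request


if __name__ == "__main__":
    # All numbers below are assumptions chosen only for illustration.
    pricing = TokenPricing(input_per_1k=0.003, output_per_1k=0.015)
    per_request = on_demand_cost(pricing, input_tokens=1000, output_tokens=500)
    monthly_provisioned = provisioned_cost(hourly_rate=20.0, hours=730)
    threshold = break_even_requests(pricing, 1000, 500, 20.0, 730)

    print(f"On-demand cost per request: ${per_request:.4f}")
    print(f"Provisioned cost per month: ${monthly_provisioned:,.2f}")
    print(f"Break-even volume: {threshold:,.0f} requests per month")
```

Below the break-even volume, token-based pricing is the cheaper option in this simplified model. Above it, the flat provisioned rate wins on cost as well as on guaranteed capacity. Real pricing varies by model and Region, so check the current rates before relying on any such estimate.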