From the course: AWS Certified Data Engineer Associate (DEA-C01) Cert Prep

Unlock this course with a free trial

Join today to access over 25,300 courses taught by industry experts.

Column compression

Column compression

- [Instructor] There are a number of ways that we can optimize the performance of a Redshift cluster. In this lesson, we'll talk about compressing the data to reduce its size, and disk IO, which will increase query performance. Recall that Redshift stores data in columnar format. As a result, you can reduce the storage size and query time by applying a compression encoding to each column according to what works best for the type of data stored in the column. Redshift supports a number of different compression encodings, and you could specify the one to use for each column in a table. When you specify raw for the encoding, no compression is used. This is common for boolean, real, and double precision numbers. AZ64 works well for compressing integer, decimals, and date types. Byte dictionary creates a separate dictionary for the values in a column, thereby not needing to store repeated values in their full size. Delta just stores a difference between values that follow each other in the…

Contents