From the course: Building a Privacy Program in the Age of GenAI

The real meaning of "big data"

- [Instructor] To use a cliche, data is like oxygen. Just as you and I need oxygen to breathe, companies need data to survive and thrive. There is a key difference, though. Oxygen stays at about 21% of the air we breathe, but data grows, keeps growing, and then grows some more. Once data enters a company's systems, growth is the only way to go, it seems. But why is that? Rather than just blaming big tech companies, many of which are on my resume, by the way, let me offer you a more detailed theory.

This is about efficiency and velocity. Companies are accountable to shareholders, who demand efficiency because efficiency, in theory, leads to growth, higher margins, and bigger profits. This means that the innovators who work for these companies often resist processes and bureaucracies they deem unnecessary. Engineers, for example, insist on easy access to data and tools. This often leads to multiple copies of data, loose controls, and weak checkpoints.

Data, as you can see on this slide, spreads throughout the company's systems like fire through dry grass. Engineers working on AI, for example, feel that their programs need access to their own dedicated datasets. Security teams, likewise, believe that keeping the company's data safe requires them to have their own copies of the data. This leads to a massive proliferation of data and of the tools that create even more copies of it.

Consumers, like you and me, also play a part. We demand velocity, which simply means fast service. When we open a retail app, for example, we want the best deals and the fastest shipping possible. When we open a rideshare app, we want our ride to show up as quickly as possible. Collectively, we have built an ecosystem that necessitates and facilitates the replication of data. This is how what starts as data becomes big data, as this funnel shows, going from the ingestion point on the far left to the data warehouses on the far right.
