This is the fifth and final part of a five-part story about how to democratize your data using a modern data lake built for the cloud.
With this final part, we discuss the steps needed to ensure your built-for-the-cloud data lake is up to par for your business needs.
Building a modern data lake requires technology that can easily store data in raw form, provide immediate exploration of that data, refine it in a consistent and managed way, and support a broad range of operational analytics.
These steps should be followed when getting started:
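As a rough illustration of that store / explore / refine flow, here is a minimal sketch using only Python's standard library. This is not any particular vendor's API: sqlite3 stands in for the lake's query layer, and the event data, table names, and fields are invented for this example.

```python
import json
import sqlite3

# 1. Store: land raw events in their native (JSON) form -- no upfront schema.
raw_events = [
    '{"user": "alice", "action": "click", "ms": 120}',
    '{"user": "bob", "action": "view", "ms": 340}',
    '{"user": "alice", "action": "view", "ms": 95}',
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE raw_lake (event TEXT)")  # raw zone: one opaque blob per row
db.executemany("INSERT INTO raw_lake VALUES (?)", [(e,) for e in raw_events])

# 2. Explore: inspect the raw data immediately, before committing to a schema.
sample = db.execute("SELECT event FROM raw_lake LIMIT 1").fetchone()[0]
print("sample raw event:", sample)

# 3. Refine: project the fields analysts need into a managed, typed table.
db.execute("CREATE TABLE events (user TEXT, action TEXT, ms INTEGER)")
for (blob,) in db.execute("SELECT event FROM raw_lake"):
    e = json.loads(blob)
    db.execute("INSERT INTO events VALUES (?, ?, ?)",
               (e["user"], e["action"], e["ms"]))

# 4. Analyze: the refined table supports ordinary operational analytics.
for user, n in db.execute(
        "SELECT user, COUNT(*) FROM events GROUP BY user ORDER BY user"):
    print(user, n)
```

In a real cloud data lake the raw zone would be an object store and the refined tables would live in a warehouse or lakehouse engine, but the progression from raw blobs to curated, queryable tables is the same.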
This is part 4 of a five-part story about how to democratize your data using a modern data lake built for the cloud.
Here we discuss the benefits of a modern, cloud-built data lake and how it can help reduce infrastructure costs and leverage multiple forms of data to gain further insights for business decisions.
The first two generations of data lakes were constructed either by using open source technologies (like Apache Hadoop) or by customizing an object store from a cloud storage provider (Amazon Web Services, Microsoft Azure, or Google Cloud Platform).
These earlier approaches created multiple issues:
This is part 3 of a five-part story about how to democratize your data using a modern data lake built for the cloud.
Here we introduce the modernization of a data lake and what it means to have a modern architecture for this new data lake implementation.
Some of today’s most valuable data doesn’t come with a predefined structure, and that’s where data lakes shine.
Modern data lakes are more versatile and often take the form of a cloud-based analytics layer that optimizes query performance against data stored in a data warehouse or an external object store for deeper…
This is part 2 of a five-part story about how to democratize your data using a modern data lake built for the cloud.
Here we discuss why the modern data lake emerged and what business needs it arose to address.
Data lakes have arisen to solve a growing problem: the need for a scalable, low-cost data repository that allows organizations to easily store all data types from a diverse set of sources, and then analyze that data to make evidence-based decisions.
Data lakes are an ideal way to gather, store, and analyze enormous amounts of data in one location. The modern cloud data lake leverages the power, flexibility, and near-infinite scalability of the cloud.
The term data lake was coined to describe a new type of data repository for storing massive amounts of raw data in its native form, in…
To deliver on its commitment to safer and more reliable transportation, Uber relies heavily on data-driven decisions at every level. Since 2014, the company has developed a Big Data solution that ensures reliability, scalability, and ease of use, and it is now focused on increasing the platform's speed and efficiency.
Let’s take a deep dive into Uber’s Big Data platform and learn how they’re expanding their ecosystem to become more reliable and efficient.
Before 2014, Uber’s limited data could fit into a few traditional online transaction processing (OLTP) databases (MySQL and PostgreSQL in their case).
To leverage data, engineers had to access each database or…
I'm a data engineer looking to broaden my knowledge and am passionate about Big Data. I also enjoy blogging about data and big data infrastructures!