AWS launches fully-managed data lake service
Amazon Web Services has launched a new, fully-managed service for customers to build, secure, and manage data lakes.
AWS Lake Formation aims to simplify and automate the creation of a data lake, including data collection, cleaning, and cataloguing, as well as data availability for analytics.
According to AWS, “Customers can easily bring their data into a data lake from a variety of sources using pre-defined templates, automatically classify and prepare the data, and centrally define granular data access policies to govern access by the different groups within an organisation.”
AWS Lake Formation also automates manual, time-consuming steps such as provisioning and configuring storage, crawling the data to extract schema and metadata tags, optimising the partitioning of the data, and transforming the data into formats such as Apache Parquet and ORC that are well suited to analytics.
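To give a sense of what partitioning means in practice, the sketch below groups records under Hive-style date partition keys, a layout commonly used for data stored in Amazon S3. It is an illustration only: the `event_date` field and the `partition_path` and `group_by_partition` helpers are hypothetical names, and this is not how AWS Lake Formation itself chooses or optimises partitions.

```python
from collections import defaultdict
from datetime import date


def partition_path(record):
    """Build a Hive-style partition key from a record's date (hypothetical field)."""
    d = record["event_date"]
    return f"year={d.year}/month={d.month:02d}/day={d.day:02d}"


def group_by_partition(records):
    """Group records by partition key so each group could be written as one file set."""
    groups = defaultdict(list)
    for record in records:
        groups[partition_path(record)].append(record)
    return dict(groups)


# Illustrative data only.
records = [
    {"event_date": date(2019, 8, 8), "order_id": 1},
    {"event_date": date(2019, 8, 8), "order_id": 2},
    {"event_date": date(2019, 8, 9), "order_id": 3},
]
partitions = group_by_partition(records)
```

Queries that filter on date can then read only the matching partitions rather than scanning the whole dataset.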
AWS Lake Formation cleans and deduplicates data using machine learning to improve data consistency and quality.
AWS Lake Formation simplifies data access and security by providing a single, centralised place to set up and manage data access policies, governance, and auditing across Amazon S3 and multiple analytics engines.
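The idea of a single place for access policies and auditing can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the `Grant` and `PolicyStore` classes, the principal and table names, and the column-level check are hypothetical and do not represent the AWS Lake Formation API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    principal: str
    database: str
    table: str
    columns: frozenset  # an empty set means "all columns"


class PolicyStore:
    """Hypothetical central store: every access check goes through one place."""

    def __init__(self):
        self._grants = []
        self.audit_log = []  # (principal, resource, allowed) tuples

    def grant(self, principal, database, table, columns=()):
        self._grants.append(Grant(principal, database, table, frozenset(columns)))

    def can_select(self, principal, database, table, column):
        allowed = any(
            g.principal == principal
            and g.database == database
            and g.table == table
            and (not g.columns or column in g.columns)
            for g in self._grants
        )
        # Record every decision, allowed or not, for later auditing.
        self.audit_log.append((principal, f"{database}.{table}.{column}", allowed))
        return allowed


# Illustrative usage: analysts may read two columns of sales.orders.
store = PolicyStore()
store.grant("analysts", "sales", "orders", ["order_id", "total"])
analysts_see_total = store.can_select("analysts", "sales", "orders", "total")
analysts_see_email = store.can_select("analysts", "sales", "orders", "email")
```

Because every analytics engine would consult the same store, a policy change takes effect everywhere at once and each decision leaves an audit record.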
AWS Lake Formation also reduces the time spent hunting for datasets by providing a central, searchable catalogue that describes the available datasets and their appropriate business use.
Customers will be able to analyse data through AWS analytics and machine learning services such as Amazon Redshift, Amazon Athena, and AWS Glue.
AWS expects to add support for Amazon EMR, Amazon QuickSight, and Amazon SageMaker in the coming months.
“Our customers tell us that Amazon S3 is the ideal place to house their data lakes, which is why AWS hosts more data lakes than anyone else – with tens of thousands and growing every day. They’ve also told us that they want it to be easier and faster to set up and manage their data lakes,” says AWS vice president of databases, analytics and machine learning, Raju Gulabani.
“That’s why we built AWS Lake Formation, so customers can spend more time learning from their data and innovating, rather than wrestling that data into functioning data lakes. AWS Lake Formation is available today and we’re excited to see how customers use it as one of the building blocks for growing and transforming their businesses and customer experience.”
AWS says there are no additional charges for using AWS Lake Formation; customers pay only for the underlying AWS services they use.
AWS Lake Formation is available today in Asia Pacific (Tokyo), Europe (Ireland), US East (Ohio), US East (N. Virginia), and US West (Oregon), with additional regions coming soon.