Data Ingestion with AWS Glue
In this section, we will:
- Configure role permissions for the resources we use.
- Create a Data Catalog from our cleaned dataset with an AWS Glue crawler.
- Transform the CSV dataset into Apache Parquet format with an AWS Glue job.
- Create a Data Catalog for the data converted to Apache Parquet format.
- Check the schema information.
Our goal is to prepare the data for querying with Amazon Athena.
- Configure role for AWS Glue
- Create Data Catalog
- Transform to Parquet
- Transform to Parquet-2
- Create New Data Catalog
- Check Schema Information
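
The steps in this section can also be scripted rather than performed in the console. The following is a minimal sketch of that idea using boto3, not the procedure this section follows. Every resource name in it (the role, database, bucket paths, script location, crawler and job names, and the `--source_path` / `--target_path` job arguments) is a hypothetical placeholder, and it assumes the role already exists and that a conversion script has been uploaded to the script location.

```python
# Sketch only: all names and paths below are hypothetical placeholders.
ROLE_NAME = "GlueWorkshopRole"            # assumed to exist with Glue and S3 permissions
DATABASE = "workshop_db"                  # Data Catalog database the crawlers write to
CSV_PATH = "s3://example-bucket/cleaned/"
PARQUET_PATH = "s3://example-bucket/parquet/"
SCRIPT_PATH = "s3://example-bucket/scripts/csv_to_parquet.py"


def crawler_params(name, s3_path):
    """Build keyword arguments for glue.create_crawler."""
    return {
        "Name": name,
        "Role": ROLE_NAME,
        "DatabaseName": DATABASE,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def job_params(name):
    """Build keyword arguments for glue.create_job (a Spark ETL job)."""
    return {
        "Name": name,
        "Role": ROLE_NAME,
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": SCRIPT_PATH,
            "PythonVersion": "3",
        },
        # Custom arguments passed to the (hypothetical) conversion script.
        "DefaultArguments": {
            "--source_path": CSV_PATH,
            "--target_path": PARQUET_PATH,
        },
        "GlueVersion": "4.0",
    }


def main():
    # Imported here so the parameter builders above can be used without boto3.
    import boto3

    glue = boto3.client("glue")

    # Catalog the cleaned CSV data.
    glue.create_crawler(**crawler_params("csv-crawler", CSV_PATH))
    glue.start_crawler(Name="csv-crawler")

    # Convert CSV to Parquet.
    glue.create_job(**job_params("csv-to-parquet"))
    glue.start_job_run(JobName="csv-to-parquet")

    # Catalog the Parquet output. Crawler and job runs are asynchronous;
    # waiting for each one to finish is omitted from this sketch.
    glue.create_crawler(**crawler_params("parquet-crawler", PARQUET_PATH))
    glue.start_crawler(Name="parquet-crawler")

    # Check schema information for every table in the database.
    for table in glue.get_tables(DatabaseName=DATABASE)["TableList"]:
        for col in table["StorageDescriptor"]["Columns"]:
            print(table["Name"], col["Name"], col["Type"])


if __name__ == "__main__":
    main()
```

Because crawler and job runs are asynchronous, a real script would poll for completion between steps, for example with `get_crawler` and `get_job_run`, before moving on.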