Upload cleaned dataset

In this step, we upload the cleaned and transformed dataset back to our S3 bucket.

Upload to Cloud9

  1. Access the Cloud9 admin interface
    • Click Open IDE.

  1. Open the File menu and click Upload Local Files.
    • Drag and drop the cleaned folder, which contains the cleaned data you downloaded in the previous step.
    • Ensure that the data looks like the image after uploading.

Datalake

Make sure the upload has finished before proceeding to the next step.
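
One way to confirm that the folder arrived intact is to count its files from the Cloud9 terminal. This is a minimal sketch, assuming the folder was uploaded to the current working directory as ./cleaned. The function name count_local_files is hypothetical and only used for illustration.

```shell
# Count the regular files under a directory (defaults to ./cleaned).
# count_local_files is an illustrative name, not part of the workshop.
count_local_files() {
  dir="${1:-./cleaned}"
  # find prints one line per file; wc -l counts the lines;
  # tr strips the padding that some wc implementations add.
  find "$dir" -type f | wc -l | tr -d ' '
}

# Usage in the Cloud9 terminal:
#   count_local_files ./cleaned
```

If the number is lower than the number of files in the folder on your computer, the upload to Cloud9 is probably still running or was interrupted.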

You can also download the cleaned data and the directory structure below, either to use directly or for reference.

  1. In the Cloud9 terminal, run the following command to upload the cleaned dataset to the S3 bucket. Replace yourname-0000-datalake with the name of your own bucket.
aws s3 cp ./cleaned s3://yourname-0000-datalake/cleaned --recursive
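
To preview what the command above would transfer, the same copy can be run with the AWS CLI's --dryrun flag, which prints the planned operations without uploading anything. This is a sketch under stated assumptions: BUCKET holds the placeholder bucket name from the command above, and the function name preview_upload is hypothetical.

```shell
# Placeholder bucket name from the step above; replace it with your own bucket.
BUCKET="yourname-0000-datalake"

# Show what would be uploaded without transferring any data.
# preview_upload is an illustrative name, not part of the workshop.
preview_upload() {
  aws s3 cp ./cleaned "s3://${BUCKET}/cleaned" --recursive --dryrun
}

# Usage in the Cloud9 terminal:
#   preview_upload
```

If the listed destinations look right, run the original command without --dryrun to perform the upload.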

  1. Verify that the data has been successfully uploaded to the S3 bucket before proceeding to the next step.
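
Besides checking in the S3 console, the verification can be done from the Cloud9 terminal by comparing the number of local files with the number of objects under the cleaned/ prefix. This is a minimal sketch, assuming the placeholder bucket name from the upload command and that ./cleaned is in the current directory. The function names count_s3_objects and verify_upload are hypothetical.

```shell
# Placeholder bucket name from the upload step; replace it with your own bucket.
BUCKET="yourname-0000-datalake"

# Count the objects under the cleaned/ prefix.
# aws s3 ls --recursive prints one line per object.
count_s3_objects() {
  aws s3 ls "s3://${BUCKET}/cleaned/" --recursive | wc -l | tr -d ' '
}

# Compare the local file count with the object count in S3.
# Returns non-zero when the two counts differ.
verify_upload() {
  local_count="$(find ./cleaned -type f | wc -l | tr -d ' ')"
  remote_count="$(count_s3_objects)"
  if [ "$local_count" = "$remote_count" ]; then
    echo "OK: $remote_count objects uploaded"
  else
    echo "MISMATCH: $local_count local files, $remote_count objects in S3"
    return 1
  fi
}

# Usage in the Cloud9 terminal:
#   verify_upload
```

A matching count only shows that every file has a corresponding object. It does not compare file contents, so treat it as a quick sanity check.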
