Amazon Redshift and AWS Glue

If you work with petabyte-scale data, this is for you. It is a complete guide covering theory, best practices and practical exercises, with a main focus on the Amazon Redshift service.

Paweł Hajduk
BigData Cloud Architect & DevOps

For whom?

  • Anyone interested in Big Data, in particular those working with petabyte-scale data.

Any requirements?

Previous experience with big data is a plus. No previous experience with AWS is required.

What are the topics?

  1. Introducing AWS - a brief introduction to cloud computing, with emphasis on the AWS ecosystem and the services available there.
  2. OLTP vs OLAP.
  3. Introduction to Amazon Redshift.
  4. Row vs Column-oriented DB.
  5. Redshift - explaining cluster types.
  6. Monitoring of the cluster.
  7. Multilevel security: data protection, IAM, internal privileges and logs.
  8. A recipe for designing tables.
  9. Sorting: Blocks and Zone Maps, explaining compound and interleaved sorting.
  10. Deep dive into Redshift architecture.
  11. Ingesting data.
  12. Copying data - options and best practices.
  13. Explaining ETL (Extract, Transform and Load) with AWS Glue and AWS Data Pipeline examples.
  14. Explaining ELT (Extract, Load and Transform) with an Amazon Redshift Spectrum example.
  15. Explaining BI (Business Intelligence) with an Amazon QuickSight example.
  16. Introduction to Data Lake.
  17. Practical exercises with Amazon Athena, AWS Lake Formation and AWS Glue.
