In this lab, you build several data pipelines that ingest data from a publicly available dataset into BigQuery.
Overview
Syllabus
- GSP290
- Overview
- Setup
- Task 1. Ensure that the Dataflow API is successfully enabled
- Task 2. Download the starter code
- Task 3. Create a Cloud Storage bucket
- Task 4. Copy files to your bucket
- Task 5. Create the BigQuery dataset
- Task 6. Build a Dataflow pipeline
- Task 7. Data ingestion
- Task 8. Review pipeline Python code
- Task 9. Run the Apache Beam pipeline
- Task 10. Data transformation
- Task 11. Run the Apache Beam pipeline
- Task 12. Data enrichment
- Task 13. Review pipeline Python code
- Task 14. Run the Apache Beam pipeline
- Task 15. Data lake to Mart
- Task 16. Review pipeline Python code
- Task 17. Run the Apache Beam pipeline
- Test your understanding
- Congratulations!