What you'll learn:
- Apache Beam
- ETL
- Python
- Google Cloud
- Dataflow
- Google Cloud Storage
- BigQuery
This course introduces Apache Beam, one of the Apache Software Foundation's newer frameworks for developing data pipelines, and explains why it is becoming popular alongside Google Dataflow. In summary, the course covers the following topics:
1. How Apache Beam works internally
2. What its benefits are
3. How to use it for development via Google Colab, without installing anything on your local machine
4. Its main functions
5. How to configure the Apache Beam Python SDK locally
6. How to deploy a batch pipeline on Google Dataflow
This course is dynamic: it will be updated whenever possible.
It is important to remember that this course does not teach Python, but it does use it. You should be comfortable with Python basics: defining functions, creating objects, and working with data types.
Also, if you want to follow section 4, which covers deploying a pipeline on Google Dataflow, you will need a free GCP account. Signing up is simple, but it requires a credit card!
I kindly ask you to consider the effort that went into putting this course together and to leave a good rating at the end. Even though the course is simple, it was made with the intent of sharing knowledge at a low price. Thanks, and I hope you enjoy it!
___________________________________________________________________________________________________________
Requirements:
· Basic knowledge of Python
· Python 3.7 or later installed locally (needed from section 4 onward)
· Free GCP account (needed from section 4 onward)
Schedule:
· Section 2 – Concepts
· Section 3 – Main Functions
· Section 4 – Apache Beam on Google Dataflow