Google tech talk #6: Distributed ML Pipelines with TensorFlow

GDG Cloud Seattle
Thu, Jun 25, 2020, 9:00 AM (PDT)

About this event

Make sure to register and attend the event here:

Start date: June 25, 9am PDT / 12pm EDT

Welcome to session 6 of the Beam Learning Months!

Production ML workloads often require very large compute and system resources, which leads to distributed processing on clusters. Whether the infrastructure is on premises or cloud-based, its cost demands the most efficient possible use of resources. This makes distributed processing pipeline frameworks such as Apache Flink ideal for ML workloads.
In addition, production ML must address issues of modern software methodology, as well as issues unique to ML. Different types of ML have different requirements, often driven by the different data lifecycles and sources of ground truth. Implementations often suffer from limitations in modularity, scalability, and extensibility.
In this talk, we discuss production ML applications and review TensorFlow Extended (TFX), Flink, Apache Beam, and Google's experience with ML in production.

All previous sessions were recorded and can be watched on YouTube:

* Session 1: May 6th, Interactive Introduction to Apache Beam
* Session 2: May 13th, Best practices to a production-ready pipeline
* Session 3: May 20th, Introduction to the Spark Runner
* Session 4: May 27th, The Best of Both Worlds: Unlocking the Power of Apache Beam with Apache Flink
* Session 5: Jun 3rd, Feature Powered by Apache Beam – Beyond Lambda