Join us for an Open data lake day on Google Open Source Live!
May 6, 2021, 4:00 – 6:00 PM
About this event
Google data analytics experts will share updates on everything from Open Table Formats to Breaking Down Innovation and Cloud Barriers.
Throughout the event, speakers will answer your questions via the Live Q&A Forum. We’ll wrap up with an After Party on Google Meet, where you can connect with the speakers and other attendees.
Event: Open data lake day on Google Open Source Live
Date: Thursday, May 6 at 9:00 am - 11:00 am PST
9:00 am Opening
TJ Laher, Product Marketing Manager (Google)
Priyanka Vergadia, Developer Advocate (Google)
9:03 am Keynote: Open Data Lakes: Breaking Down Innovation and Cloud Barriers with Open Source
Amr Awadallah oversees Cloud Developer Relations, including the overall Cloud Information Experience, Advocacy, and Developer Program Engineering. Before joining Google in November 2019, Amr co-founded Cloudera in 2008 and, as its CTO, spent 11 years working closely with enterprise customers and developers around the world. He also served as vice president of product intelligence engineering at Yahoo! and ran one of the very first organizations to use Hadoop for big data analysis and business intelligence. Amr received his PhD in EE from Stanford University (his thesis was on leveraging virtual machines to implement internet-scale applications) and his bachelor's and master's degrees from Cairo University, Egypt.
Amr Awadallah, VP for Cloud Developer Relations (Google)
9:25 am Session 2: Scale out Deep Learning with Spark
Recent advances in the Apache Spark ecosystem make it a great framework for distributing Deep Learning jobs. You’ll learn why, and how to distribute a TensorFlow job using Spark.
Brad Miro, Developer Programs Engineer (Google)
9:47 am Session 3: Open Table Formats and Apache Iceberg Improvements
Table formats (Hudi, Iceberg, Delta Lake) are being adopted by enterprises. In this talk, we will dive into what they are and how they work, and introduce the new changes in Apache Iceberg.
Roderick Yao, Engineer & Product Manager (Google)
10:09 am Session 4: Scaling Hive Metastore with gRPC and Cloud Spanner
This session introduces improvements to Hive Metastore: new options to use gRPC as the access interface and Cloud Spanner as the backend database, which enhance RPC functionality, authentication and authorization, and database scalability.