Open data lake day on Google Open Source Live

Join us for an Open data lake day on Google Open Source Live!

May 6, 2021, 4:00 – 6:00 PM

Key Themes

Cloud, Data

About this event

Google data analytics experts will share updates on everything from Open Table Formats to Breaking Down Innovation and Cloud Barriers.

Throughout the event, speakers will answer your questions via the Live Q&A Forum. We’ll wrap up with an After Party on Google Meet, where you can connect with the speakers and other attendees.

--------------------------------------------------------------------------------------------------------------

[How to Register]

Register to attend here: http://goo.gle/OpenDataLakeday

--------------------------------------------------------------------------------------------------------------

Event: Open data lake day on Google Open Source Live

Date: Thursday, May 6, 9:00 am - 11:00 am PT

Agenda (PT):

9:00 am Opening

TJ Laher, Product Marketing Manager (Google)

Priyanka Vergadia, Developer Advocate (Google)

9:03 am Keynote: Open Data Lakes: Breaking Down Innovation and Cloud Barriers with Open Source

Amr Awadallah oversees Cloud Developer Relations, including the overall Cloud Information Experience, Advocacy, and Developer Program Engineering. Before joining Google in November 2019, Amr co-founded Cloudera in 2008 and, as its CTO, spent 11 years working closely with enterprise customers and developers around the world. He also served as vice president of product intelligence engineering at Yahoo!, where he ran one of the very first organizations to use Hadoop for big data analysis and business intelligence. Amr received his PhD in electrical engineering from Stanford University (his thesis was on leveraging virtual machines to implement internet-scale applications) and his bachelor's and master's degrees from Cairo University, Egypt.

Amr Awadallah, VP for Cloud Developer Relations (Google)

9:25 am Session 2: Scale out Deep Learning with Spark

Recent advances in the Apache Spark ecosystem make it a great framework for distributing deep learning jobs. You’ll learn why, as well as how to distribute a TensorFlow job using Spark.

Brad Miro, Developer Programs Engineer (Google)

9:47 am Session 3: Open Table Formats and Apache Iceberg Improvements

Table formats (Hudi, Iceberg, Delta Lake) are being adopted by enterprises. In this talk, we will dive into what they are and how they work, and introduce the latest changes in Apache Iceberg.

Roderick Yao, Engineer & Product Manager (Google)

10:09 am Session 4: Scaling Hive Metastore with gRPC and Cloud Spanner

This session introduces improvements to Hive Metastore: new options to use gRPC as the access interface and Cloud Spanner as the backend database. Together they enhance RPC functionality, authentication and authorization, and database scalability.

Zhou Fang, Software Engineer (Google)

10:30 am After Party on Google Meet

11:00 am End

Organizers

  • Csaba Toth

    SportsBoard

    GDG lead, WTM ambassador

  • Jennifer Brookshire

    The Penny Hoarder

    Software Tester

  • Estefania Flores

    ambassador

  • Saige Shafer

    ambassador

  • Grace Ann Aranico

    ambassador
