Multimodality with Gemini (Unleashing the Power of Text, Videos, Images, etc.)

Apr 22, 1:00 – 3:00 PM


Our final workshop on multimodality with Gemini!

Gemini is the most capable and general model Google has ever built. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information, including text, code, images, and video. 

This talk dives into the exciting world of Gemini, a cutting-edge foundation model developed by Google. Discover how Gemini seamlessly integrates text and image processing, enabling you to:

  • Generate realistic images from text descriptions.
  • Analyze and understand the content of images.
  • Perform cross-modal tasks like image captioning and visual question answering.
  • Explore the potential of multimodality for various applications, from creative content generation to advanced information retrieval.

Join us to unlock the power of Gemini and push the boundaries of AI!



Monday, April 22, 2024
1:00 PM – 3:00 PM UTC




  • Henry Ruiz

    GDE, Machine Learning


  • Agien Petra

    Community Manager


  • Noella Mbongeya

    MBY logistic company

    WTM Ambassador

  • Ida Delphine

    GDG Organizer

  • Nangah Amandine

    University of Bamenda

    GDG Co-organizer

  • Nui Lewis

    Graphic Designer

  • Agien Petra

    GDG Co-organizer

  • Ghany Elisha

    Graphic Designer

  • Kinyuy Kelly

    GDG Co-organizer

