GenAI in the Modern Enterprise and LLMs on GKE

Google Norway, 6 Bryggegata, Oslo, 0250

GDG Cloud Oslo, Norway

Mar 25, 3:30 – 5:30 PM (UTC)


Key Themes

AI, Cloud

About this event

16:30: Welcome, food, and mingle

17:00: TALK 1, by Abdel, 30 min

Generative AI adoption starts from business needs, not technological aspects.

Enterprises constantly strive for a competitive edge through technology, and LLM solutions offer unique potential. However, that potential is unlocked only once we clearly understand that our business requirements transcend our current technical capabilities.

Let's roll up our sleeves and learn hands-on how to build, test, and deploy cutting-edge, powerful Gen AI applications in the modern enterprise, in a serverless environment, using Java, AI orchestration frameworks, and multiple LLMs, with a concrete, real-world production use case as a backdrop.

The workshop empowers the enterprise Java developer to unlock new, creative possibilities for their Java apps and to build features in novel ways. It caters to the seasoned Java developer as much as to the curious newcomer to GenAI, and is crafted as a follow-along workshop. What you will leave this session with:

  • a well-balanced, end-to-end, multi-modal RAG application built in Java, ready to run in the cloud and serve as a reference architecture for a modern generative AI enterprise app

  • an idempotent solution built in BOTH Spring AI and LangChain4j, today's dominant Java AI orchestration frameworks

  • Gen AI apps deployed to Cloud Run, a serverless environment

  • multiple LLMs, deployed across:

      • managed environments: Google Vertex AI

      • local environments: Ollama and Testcontainers

      • Kubernetes: vLLM, an optimized LLM serving engine

  • the full codebase, configuration, and deployment instructions.
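The Cloud Run deployment mentioned above can be sketched with standard gcloud commands. This is a minimal illustration, not the workshop's actual setup; the service name, project ID placeholder, and region are hypothetical:

```shell
# Build a container image for the Java app and deploy it to Cloud Run.
# "genai-rag-demo", PROJECT_ID, and the region are illustrative placeholders.
gcloud builds submit --tag gcr.io/PROJECT_ID/genai-rag-demo

gcloud run deploy genai-rag-demo \
  --image gcr.io/PROJECT_ID/genai-rag-demo \
  --region europe-north1 \
  --allow-unauthenticated
```

Cloud Run scales the container to zero when idle, which is what makes the serverless framing above attractive for spiky Gen AI workloads.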

17:30: TALK 2, by Mofi, 30 min

Serving All The LLMs on GKE

In this session we will talk about all* the ways to serve a large language model on GKE: from smaller models that fit on a single GPU to models that require multiple GPU nodes. We will use open-source tools like vLLM, TGI, and Ollama to serve a number of open models, such as Gemma, Llama, and DeepSeek, in a GKE cluster using both GPUs and TPUs. We will also discuss optimization techniques for speed and cost.
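A single-GPU case of the serving setup described above can be sketched as a Kubernetes Deployment running the vLLM OpenAI-compatible server. The model choice, resource sizes, and names here are illustrative assumptions, not details from the talk:

```shell
# Minimal sketch: serve an open model with vLLM on a GKE GPU node.
# Deployment name, model, and GPU count are illustrative, not from the talk.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-gemma
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-gemma
  template:
    metadata:
      labels:
        app: vllm-gemma
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest
        args: ["--model", "google/gemma-2b-it"]
        ports:
        - containerPort: 8000   # vLLM's OpenAI-compatible API
        resources:
          limits:
            nvidia.com/gpu: "1"
EOF
```

Larger models that exceed a single GPU's memory would instead use vLLM's tensor-parallel options across multiple GPUs, which is the multi-node territory the session covers.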

18:00: More mingle.

When

Tuesday, March 25, 2025
3:30 PM – 5:30 PM (UTC)

Agenda

3:30 PM: Networking and Food
4:00 PM: Presentations
5:00 PM: Mingle and discussions

Speakers

  • Abdelfettah Sghiouar

    Google

    Google Cloud Engineer

  • Mofi Rahman

    Google

    Developer Relations Engineer

Host

  • Rustam Mehmandarov

    Google Developer Expert for Cloud

    Senior Software Engineer

Organizers

  • Rustam Mehmandarov

    GDG Organizer

  • Leonard Sheng Sheng Lee

    Co-organizer