Building and hosting LLM-based applications using the GCP serverless stack

Google Pittsburgh - Bakery Square, 6425 Penn Avenue, Pittsburgh, 15206

This session will explain how to architect, build, and deploy generative AI applications using LLMs on GCP. Two applications will be presented and discussed in depth: 1) Resume Chatbot, 2) CrossFit Workout Scheduler. Attendees will see live demonstrations of the working applications and learn about retrieval-augmented generation, few-shot learning, developer tooling, and more.

Sep 28, 2023, 7:00 – 9:00 PM


Key Themes

AI, Cloud, Data, Enterprise/Business Solutions, Machine Learning, Vertex AI, Web

About this event

Part 1. Resume Chatbot (by Roman Kharkovski)

Watch the live demo and dive into the architecture and code of the "Resume Chatbot", a project that lets users query resumes stored as PDF files in plain English. It uses an array of large language models (LLMs) and related services, including Google's PaLM, Gen AI Enterprise Search, Vertex AI, and OpenAI's ChatGPT. The application is written in Python with FastAPI, hosted on Cloud Run on Google Cloud Platform (GCP), protected by IAP, and stores its data in GCS and Firestore. It uses the LlamaIndex framework and LangChain for optimized data extraction.

We will discuss common obstacles and solutions for using LLMs for querying private datasets in a secure GCP environment. Participants will gain insights into how queries, ranging from specific skill assessments to contrasting multiple resumes, are processed and responded to. The session will provide a hands-on overview of the application's web UI, an understanding of its backend components, and a walkthrough of its deployment strategy on GCP. 
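As a rough illustration of the retrieval-augmented generation idea mentioned above, the sketch below picks the resume chunk most similar to a question and places it in a prompt. It is a toy written for this page, not the Resume Chatbot's code: the project uses LlamaIndex and LangChain, whereas this example substitutes a simple bag-of-words cosine similarity, and all function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks ranked by similarity to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=1):
    """Assemble an LLM prompt that contains only the retrieved context."""
    context = "\n".join(retrieve(question, chunks, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical resume snippets, for illustration only.
RESUME_CHUNKS = [
    "Candidate A has five years of Python and FastAPI experience.",
    "Candidate B is a certified CrossFit coach.",
]
```

Calling `build_prompt("Who knows Python?", RESUME_CHUNKS)` would produce a prompt containing only the first snippet; a real system would then send that prompt to an LLM.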

The project is available on GitHub under the Apache 2.0 license.

Part 2. CrossFit Workout Scheduler (by Misha Kharkovski)

Generative AI opens many possibilities for optimizing and streamlining applications and microservices, for both personal and enterprise use cases. Watch this live demo of “WoDCal”, a CrossFit workout-interpreting microservice that retrieves workout data from an API, uses Google PaLM 2 to interpret the workout’s duration, and then posts the result to a Google Calendar event. The whole flow runs without user intervention in a Cloud Function triggered by a Pub/Sub event. The project was coded in Python and deployed on Google Cloud.
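The flow described above could look roughly like the following sketch. It assumes the Pub/Sub-triggered background-function convention in which the message body arrives base64-encoded under the event's `data` key. The payload field `workout` and the two callables standing in for the PaLM 2 call and the Google Calendar call are hypothetical; this is not WoDCal's actual code.

```python
import base64
import json


def decode_pubsub_event(event):
    """Decode the base64-encoded JSON body of a Pub/Sub-triggered event."""
    raw = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(raw)


def handle_workout_event(event, context, estimate_minutes, create_event):
    """Decode the message, estimate the duration, and create a calendar entry.

    estimate_minutes and create_event are injected placeholders for the
    LLM call and the calendar API call (both assumptions in this sketch).
    """
    payload = decode_pubsub_event(event)
    workout = payload["workout"]  # hypothetical field name
    minutes = estimate_minutes(workout)
    return create_event(workout, minutes)
```

Passing the two external steps in as arguments keeps the handler easy to test locally with simple stand-in functions before wiring up real services.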

Details of the project will be discussed, including few-shot learning, the limitations of estimating durations for certain types of workouts, and general deployment tips to help developers deploy their own generative-AI-assisted projects.
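To give a sense of what few-shot learning means in this context, here is a minimal sketch of a prompt that shows the model a couple of solved examples before the new workout. The example workouts, the prompt wording, and the helper names are all made up for illustration and are not taken from the WoDCal project.

```python
import re

# Hypothetical labeled examples: (workout description, duration in minutes).
EXAMPLES = [
    ("AMRAP 12: 10 burpees, 15 air squats", 12),
    ("EMOM 10: 3 power cleans", 10),
]


def build_few_shot_prompt(workout, examples=EXAMPLES):
    """Build a prompt with solved examples followed by the new workout."""
    parts = ["Estimate the workout duration in minutes."]
    for text, minutes in examples:
        parts.append(f"Workout: {text}\nMinutes: {minutes}")
    # Leave the final answer blank so the model completes it.
    parts.append(f"Workout: {workout}\nMinutes:")
    return "\n\n".join(parts)


def parse_minutes(reply):
    """Extract the first integer from the model's reply, or None if absent."""
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else None
```

Workouts with a fixed time cap (AMRAP, EMOM) are easy to label this way; "for time" workouts have no stated duration, which is one place where estimates are likely to be less reliable.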



Thursday, September 28, 2023
7:00 PM – 9:00 PM UTC


  • Roman Kharkovski


    Principal Architect

  • Misha Kharkovski

    University of Pittsburgh


  • Derek Gordon


    GDG Organizer

  • Chris Pearlman


    GDG Organizer

