Streaming NLP infrastructure on Dataflow

Jul 19, 2022, 10:00 – 11:00 PM


Key Themes

Cloud, Data, Machine Learning

About this event

This tech talk is one session at the Beam Summit. Go to the event website for details and to RSVP:

Trustpilot is an e-commerce reviews platform delivering millions of new reviews to businesses each week. We are using Apache Beam on GCP Dataflow to deliver real-time streaming inferences with the latest NLP transformer models.

Our talk will touch on:

Infrastructure setup to enable Python Beam to interface with Kafka for streaming data

Taking advantage of Beam’s unified programming model to enable batch jobs for backfilling via BigQuery

Working with GPUs on Dataflow to speed up local model inference

MLOps: Using Dataflow as part of a continuous evaluation model monitoring setup


  • Bill

    GDG Organizer
