Online AI Talk: Insights on Data Challenge in Deep Learning Projects

Online event

Sep 23, 2020, 5:00 – 6:00 PM

About this event

This is an online AI talk. You can join from anywhere via Zoom; please register and attend here:
https://learn.xnextcon.com/event/eventdetails/W20092310

Abstract:
Data is the most precious resource in deep learning research. As such, it should be handled carefully at every step: data gathering, data annotation, data QA, and data versioning. However, even if you manage to perform all of these tasks in the best possible way, data still holds challenges that can dramatically affect your model's performance.

In this talk, we discuss the fact that your data is most likely biased and that this bias affects the performance of your model. We will show how to identify data bias and what can be done to address it. In particular, we focus on class imbalance. We provide illustrative experiments to accompany these ideas. Our experiments focus on an object detection task, which has additional complexities beyond vanilla classification tasks. We explore how different data balancing methods (data resampling and loss reweighting) affect the performance of minority and majority classes in such settings.
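
To make the two balancing methods concrete, here is a minimal sketch, not the speakers' implementation. It assumes one label per sample and uses the common inverse-frequency heuristic; the function names and the toy `car`/`bicycle` labels are hypothetical. The same class weights can scale each class's loss term (loss reweighting) or, expanded per sample, act as sampling probabilities (data resampling). In object detection a single image can contain objects of several classes, so image-level resampling needs more care than this sketch shows.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Uses n_samples / (n_classes * class_count), so the average weight
    over all samples is 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def per_sample_weights(labels, class_weights):
    """Expand class weights to one weight per sample (for weighted resampling)."""
    return [class_weights[y] for y in labels]


# Hypothetical toy dataset: 90 majority-class and 10 minority-class samples.
labels = ["car"] * 90 + ["bicycle"] * 10
class_weights = inverse_frequency_weights(labels)
sample_weights = per_sample_weights(labels, class_weights)
```

With these weights, each class contributes the same total weight, whether the weights are applied inside the loss or used to draw training samples.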

In addition, we will look at the diminishing returns of annotated data. Deep learning models are notorious for their endless appetite for training data, and acquiring high-quality annotated data consumes a relatively large amount of resources. Monitoring how quickly the benefit of additional data diminishes provides a way to assess how much data is needed at the different stages of the project lifecycle, and even to predict whether the current model architecture will be able to achieve the target metric. This knowledge effectively provides a tool for optimal management of time, manpower, and computing resources.
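
One way to monitor this effect is to fit a curve to validation error measured at several training-set sizes and extrapolate. The sketch below is an illustration under stated assumptions, not the method presented in the talk: it assumes the error follows a pure power law, error = a * n^(-b), with no irreducible error floor, and it uses synthetic points generated from that law rather than real measurements. The function names are hypothetical.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * n**(-b) by least squares in log-log space.

    Returns (a, b). Assumes all sizes and errors are positive.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def samples_needed(a, b, target_error):
    """Solve a * n**(-b) = target_error for n and round up."""
    return math.ceil((a / target_error) ** (1.0 / b))


# Synthetic learning curve generated from a known power law (a=2.0, b=0.25),
# standing in for validation errors measured at growing dataset sizes.
sizes = [1000, 2000, 4000, 8000]
errors = [2.0 * n ** -0.25 for n in sizes]

a, b = fit_power_law(sizes, errors)
n_needed = samples_needed(a, b, target_error=0.2)
```

If the extrapolated `n_needed` is far beyond what the annotation budget allows, that is a hint that more data alone may not reach the target metric with the current architecture. Real learning curves often flatten toward a nonzero floor, so an estimate from this simple form should be treated as optimistic.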

Finally, we will discuss the features needed for a dataset management tool that can help identify and tackle the data challenge in your deep learning projects. We will demonstrate the effectiveness of using such a tool on popular computer vision tasks.

Social networking with speakers and attendees takes place on Slack 30 minutes before and after the event. Join Slack via the invitation: https://bit.ly/3gi7bjf . There are two channels:
#jobs for job postings from speakers, partners, and sponsor companies; you can ask hiring managers questions right in the channel.
#events for event Q&A and for mixing and networking with speakers and fellow attendees.

When

Wednesday, September 23, 2020
5:00 PM – 6:00 PM UTC

Organizer

  • Bill

    GDG Organizer