Setting Up a Hadoop Cluster & Practicing MapReduce

GDG Sfax

This workshop provides hands-on experience in setting up a Hadoop cluster, from installation to configuration, and demonstrates the power of MapReduce for distributed data processing. Participants will learn to deploy Hadoop on a cluster, understand its ecosystem, and implement basic MapReduce programs to process large datasets. By the end of the session, attendees will have a solid foundation in

Dec 8, 5:00 – 8:00 PM (UTC)

28 RSVP'd

Key Themes

Data

About this event

This comprehensive workshop is designed to give participants practical experience in setting up a Hadoop cluster and working with MapReduce for big data processing. The session will begin with an introduction to Hadoop, explaining its architecture and key components such as HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), and MapReduce.

Participants will learn, step by step, how to install and configure Hadoop on both single-node and multi-node clusters, so that they understand what it takes to set up a distributed computing environment. The session will cover the configuration of core Hadoop components, including HDFS and YARN, with an emphasis on optimizing cluster performance and troubleshooting common issues.

Once the cluster is up and running, the workshop will focus on MapReduce programming. Attendees will explore how MapReduce works, its role in distributed data processing, and how to write and execute MapReduce programs that process large-scale datasets efficiently. They will practice creating MapReduce jobs in Java and gain an understanding of how data is split and processed across the cluster, as well as techniques for debugging and optimizing these jobs.
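To give a feel for the programming model before the session, here is a minimal in-memory sketch of the classic word-count job, written in plain Java. It deliberately does not use the Hadoop API: the class name `WordCountSketch` and the methods `map`, `shuffle`, `reduce`, and `run` are hypothetical names chosen for illustration, and the "splits" are simply strings in a list rather than HDFS blocks. It only shows how input is split, mapped to key/value pairs, grouped by key, and reduced.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: simulates the map, shuffle, and reduce phases in memory.
// A real Hadoop job would run these phases on different nodes of the cluster.
public class WordCountSketch {

    // Map phase: turn one input split (here, one line) into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : split.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key, with keys kept sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the list of values for one key into a single count.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map over every split, shuffle the results, then reduce per key.
    static Map<String, Integer> run(List<String> splits) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String split : splits) {
            mapped.addAll(map(split));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(mapped).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList(
                "Hadoop stores data in HDFS",
                "YARN schedules Hadoop jobs");
        System.out.println(run(splits));
    }
}
```

In this sketch each string plays the role of one input split handled by one mapper. On a real cluster the same three phases apply, but the framework distributes the splits, moves the grouped data between nodes, and runs the reducers in parallel.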

By the end of the workshop, participants will have gained the skills necessary to:

Set up and configure a Hadoop cluster (single-node and multi-node).

Understand the architecture and components of Hadoop.

Implement MapReduce jobs to process large datasets in a distributed environment.

Optimize and troubleshoot Hadoop clusters and MapReduce jobs.

This workshop is ideal for anyone looking to get hands-on experience with Hadoop and MapReduce, whether for academic purposes, data engineering roles, or exploring big data technologies in a practical setting.

Organizers

  • Anoir Feki

    Events & Relationships Manager

  • Habib M. Kammoun

    University of Sfax

    GDG co-Organizer

  • Yasmine Chelly

    Datagram

    GDG Mentor

  • Imen Selmi

    Media Manager

  • Imen Masmoudi

    Logistics Manager

  • Mohamed Dridi

    misfat

    Logistics Manager

  • Mohamed Moussa

    Media Manager

  • Ameni Elabed

    GDG Member

  • Ayman Ktari

    centroid solution

    Events Manager

  • Maryem Aloulou

    Events & Relationships Manager
