Big Data and Hadoop Training

Big Data & Hadoop Training in Kolkata

Join the data revolution. Learn Big Data and Hadoop, which has almost become the de facto standard for storing, processing and analyzing hundreds of terabytes, and even petabytes, of data.

Our Big Data Hadoop course is designed to equip you with the technical knowledge needed to build a big data organization that can reveal insights from all types of data and systems.

Data Brio Academy is the authorised training partner of Webel (Govt. of West Bengal Enterprise) and AIMA (All India Management Association).

Course Objectives

At the end of the course, participants should be able to:

  • Master the concepts of HDFS and the MapReduce framework
  • Understand Hadoop 2.x Architecture
  • Set up a Hadoop Cluster and write Complex MapReduce programs
  • Learn the data loading techniques using Sqoop and Flume
  • Perform Data Analytics using Pig, Hive and YARN
  • Implement HBase and MapReduce Integration
  • Implement Advanced Usage and Indexing
  • Schedule jobs using Oozie
  • Implement Best Practices for Hadoop Development
  • Work on a Real Life Project on Big Data Analytics

Who should go for this course?

Research and technology firms predict that 2017 will be the year Hadoop finally becomes a cornerstone of the business technology agenda. To stay ahead in the game, Hadoop has become a must-know technology for the following professionals:

  • Analytics Professionals
  • BI/ETL/DW Professionals
  • Project Managers
  • Testing Professionals
  • Mainframe Professionals
  • Software Developers and Architects
  • Graduates aiming to build a career in Big Data

Participants’ takeaways

  • Understand Big Data and Hadoop Ecosystem
  • Hadoop distributed file system (HDFS)
  • Using the MapReduce API and writing algorithms
  • Best practices for developing and debugging MapReduce programs
  • Advanced MapReduce concepts and algorithms
  • Managing and Monitoring Hadoop cluster
  • Importing and Exporting data using Sqoop
  • Hive, HBase and Pig for analysis
  • R for Hadoop for analytics


Prerequisites

  • Java / C++ / Python programming knowledge
  • Linux / UNIX

Learning Modules

MODULE 1 : Big Data

  • Introduction to Big Data
  • What Is Big Data?
  • Types & elements of big data
  • Business Applications
  • Technologies Used

    • Distributed & Parallel computing
    • Virtualization & its importance to big data

    MODULE 2 : Hadoop Ecosystem
    Introduction to Hadoop Ecosystem

    • Introduction to Hadoop Ecosystem
    • Hortonworks Sandbox Setup

    Introduction to HDFS

    • Namenode/datanode
    • Jobtracker/tasktracker
    • Data Replication
    • File Read/Write

    Lab 1: HDFS Commands
    MODULE 4: MapReduce
    MapReduce Essentials

    • MR Daemons
    • MR Framework
    • MR API
    • Mapper/Reducer class
    • Writing Mapper/Reducer
    • Combiners & Partitioners
    • Testing & Debugging

    Lab 2: MapReduce Example 1
    Lab 3: MapReduce Example 2

    MODULE 5 : MapReduce using Hadoop Streaming API
    Introduction to Streaming API

    Lab 4: Hadoop Streaming API using R
    Lab 5: Hadoop Streaming API using Python
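
As a rough illustration of what a Python streaming job can look like, here is a minimal word-count sketch. It is not taken from the course material. Hadoop Streaming feeds input lines to the mapper on standard input and expects tab-separated key/value lines back; the function names below (`map_lines`, `reduce_lines`, `run_locally`) are hypothetical, and the local `sorted` call only stands in for the framework's shuffle-and-sort step.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts per word.

    The input must already be sorted by key, which the framework
    arranges between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Simulate the job on one machine (assumption: sorted() replaces the shuffle)."""
    mapped = sorted(map_lines(lines))
    return list(reduce_lines(mapped))
```

In an actual job the mapper and reducer would typically be two small scripts that read `sys.stdin` and print their records, passed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options.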

    MODULE 6 : Pig, Hive & HBase
    Introduction to Pig

    • Pig Latin
    • Structure
    • Functions
    • Expression
    • Relational operation Schema

    Lab 6: Pig
    Introduction to Hive

    • Hive architecture
    • Loading/Querying data into a table

    Lab 7: Hive
    Introduction to HBase

    • HBase data model
    • HBase Vs RDBMS

    Lab 8: HBase

    MODULE 7 : Sqoop, Oozie & Flume
    Introduction to Sqoop, Oozie, Flume

    • Sqoop Commands
    • Oozie architecture & workflow
    • Packaging & deploying an Oozie Workflow application
    • Flume architecture & components

    Lab 9: Sqoop
    Lab 10: Sqoop Advanced
    Lab 11: Oozie
    Lab 12: Flume

    MODULE 8 : Overview of Hadoop Advanced Concepts & Administration
    Introduction to Spark, Storm & Kafka

    Introduction to Ambari
    Lab 13: Ambari

    Big Data Hadoop Advance Course

    Big Data success requires professionals who can prove their mastery with the tools and techniques of the Hadoop stack. However, experts predict a major shortage of advanced analytics skills over the next few years.

    Our Certified Professional program delivers the most rigorous and recognized Big Data credential. Data Brio Academy certifies true specialists who have demonstrated their abilities to execute at the highest level on both traditional exams and hands-on challenges with live data sets. Our certification program is both a tool managers can use to verify expertise and a resource for finding or cultivating the talent they need to launch and scale their Big Data projects.

    Be a Certified Developer for Apache Hadoop

    Know the Methods Used by Top Developers

    Individuals who achieve Certified Developer for Apache Hadoop accreditation have demonstrated their technical knowledge, skill, and ability to write, maintain, and optimize Apache Hadoop development projects.

    Big Data Advance : Course Content

    Module 1: Big Data and Application

    • Business application of Big Data
    • Understanding the Hadoop Ecosystem

    Module 2: Managing an Enterprise Wide Big Data Ecosystem

    • Big Data Technology Foundations
    • Big Data Management Systems – Databases and Warehouses

    Module 3: Storing and Processing Data – HDFS and MapReduce

    • Storing & Customizing MapReduce Execution
    • Writing a MapReduce Program Using Streaming
    • Testing and Debugging MapReduce Applications

    Module 4: Increasing Efficiency with Hadoop Tools: Hive and Pig

    • Exploring Hive
    • Advanced Querying with Hive
    • Analyzing Data with Pig

    Module 5: Additional Hadoop Tools: Sqoop, Flume, YARN and Storm

    • Efficiently transferring Bulk data Using Sqoop
    • Flume
    • Beyond MapReduce – YARN
    • An Introduction to Oozie
    • Storm on YARN

    Module 6: Leveraging NoSQL, Hadoop Security, Cloud and Real Time

    • Hello NoSQL
    • Working with NoSQL
    • Hadoop Security

    Module 7: R Hadoop implementation for advanced analytics

    Why Data Brio ?

    • Learn directly from industry practitioners with more than 22 years of corporate experience at companies such as Dell R&D, Infosys, Perot Systems and Tektronix
    • Certification from WEBEL
    • Only institute with transparent faculty profiles directly from industry. Get the business perspective instead of learning just the tools & theories
    • Unique training methodology with hands-on sessions, real-time case studies, assignments with data sets and projects with end to end life cycle
    • End-to-end life cycle experience on a real-time project. Internship provision in our parent company, Business Brio. Business Brio (member of NASSCOM and CII, the Confederation of Indian Industry) is an award-winning company that offers consulting and project services in Big Data and Analytics to clients around the globe
    • 100% Placement assistance through dedicated placement cell (resume workshop, interview guidance and placement opportunities)


