Big Data and Hadoop Training

Big Data & Hadoop Training in Kolkata

Join the data revolution. Learn Big Data and Hadoop, which has almost become the de facto standard for storing, processing and analyzing hundreds of terabytes, and even petabytes, of data.

Our Big Data Hadoop course is designed to equip you with the technical knowledge needed to build a big data organization that can draw insights from all types of data and systems.

The Big Data and Hadoop certification is issued by WEBEL.

Course Objectives

At the end of the course, participants should be able to:

  • Master the concepts of HDFS and MapReduce framework
  • Understand Hadoop 2.x Architecture
  • Set up a Hadoop cluster and write complex MapReduce programs
  • Learn the data loading techniques using Sqoop and Flume
  • Perform Data Analytics using Pig, Hive and YARN
  • Implement HBase and MapReduce Integration
  • Implement Advanced Usage and Indexing
  • Schedule jobs using Oozie
  • Implement best practices for Hadoop development
  • Work on a Real Life Project on Big Data Analytics

Who should go for this course?

Research and technology houses predict that 2017 will be the year when Hadoop finally becomes a cornerstone of the business technology agenda. To stay ahead in the game, Hadoop has become a must-know technology for the following professionals:

  • Analytics Professionals
  • BI / ETL / DW Professionals
  • Project Managers
  • Testing Professionals
  • Mainframe Professionals
  • Software Developers and Architects
  • Graduates aiming to build a career in Big Data

Participants’ Takeaways

  • Understand Big Data and Hadoop Ecosystem
  • Hadoop distributed file system (HDFS)
  • Using the MapReduce API and writing algorithms
  • Best practices for developing and debugging MapReduce programs
  • Advanced MapReduce concepts and algorithms
  • Managing and monitoring a Hadoop cluster
  • Importing and Exporting data using Sqoop
  • Hive, HBase and Pig for analysis
  • R with Hadoop for analytics

Pre-requisites

  • Java / C++ / Python programming knowledge
  • Linux / UNIX

Attractive Discounts

  • Early Bird Discount (For registrations up to 7 days before course commencement)
  • Group discount (For group registration of 3 or more students)

Learning Modules

Module 1

Introduction to Big Data

  • History of Data Management—Evolution of Big Data
  • Types and elements of Big Data
  • Application of Big Data in the Business Context
  • Careers in Big Data

Technologies for handling Big Data

  • Distributed and Parallel Computing for Big Data
  • Virtualization and its Importance to Big Data
  • Introducing Hadoop – Architecture of Hadoop cluster, Installation and configuration

Module 2

Big Data Technology Foundations

  • Exploring the Big Data Stack
  • RDBMSs and Big Data Environment
  • Storing Data in Hadoop (HDFS and HBase)
  • Combining HDFS and HBase for Effective Data Storage
  • Architecture of HDFS

Processing your data with MapReduce

  • Getting to Know MapReduce
  • Designing MapReduce Implementations

Customizing MapReduce Execution

  • Controlling MapReduce Execution with Input Format
  • Reading and organizing Data with Custom Record Reader and Output Formats
  • Optimizing Your MapReduce Execution with a Combiner
  • Controlling Reducer Execution with Partitioners
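
To make the roles of the combiner and the partitioner concrete, here is a minimal sketch in plain Python. It does not use the Hadoop API: the function names (map_words, combine, partition, run_job) and the sample input are hypothetical, chosen only to illustrate how a word count might flow from mapper to combiner to partitioner to reducer.

```python
from collections import defaultdict
import zlib


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-sum counts on the map side to shrink shuffle traffic."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def partition(key, num_reducers):
    """Partitioner: a stable hash of the key decides which reducer gets it."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_job(splits, num_reducers=2):
    # One bucket per reducer, mapping each key to its list of partial counts
    # (this stands in for the shuffle phase).
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        pairs = [pair for line in split for pair in map_words(line)]
        for key, value in combine(pairs):
            buckets[partition(key, num_reducers)][key].append(value)
    # Reducer: sum the partial counts for each key.
    result = {}
    for bucket in buckets:
        for key, values in bucket.items():
            result[key] = sum(values)
    return result


# Two input splits, as if processed by two separate map tasks.
splits = [["big data big"], ["data hadoop", "big hadoop"]]
counts = run_job(splits)
```

Because the partitioner hashes only the key, every partial count for a given word reaches the same reducer, which is what lets each reducer total its keys independently.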

Writing a MapReduce Program in Java

  • Basic MapReduce API Concepts
  • Writing MapReduce Drivers, Mappers, and Reducers in Java
  • Speeding Up Hadoop Development by Using Eclipse

Module 3

Monitoring Management

  • Managing HDFS with tools like fsck and dfsadmin
  • Using the HDFS and JobTracker Web UIs
  • Commissioning and decommissioning of nodes
  • Hands-on Exercise

Sqoop

  • Importing and exporting data from RDBMS
  • Case studies

Module 4

Pig, Hive, HBase

  • Pig philosophy and architecture
  • Pig Latin and the Grunt shell
  • Loading Data, Data types and schemas
  • Intro to UDF and Scripts
  • Pig Latin Details: structure, functions, expressions, relational operators
  • HBase vs. RDBMS
  • HBase Master and Region Servers
  • Pipeline
  • Intro to ZooKeeper
  • Data Modeling
  • Column Families and Regions
  • Write pipeline / Read
  • Catalog Tables
  • Hands-on Exercise

Big Data Hadoop Advance Course

Big Data success requires professionals who can prove their mastery of the tools and techniques of the Hadoop stack. However, experts predict a major shortage of advanced analytics skills over the next few years.

Our Certified Professional program delivers the most rigorous and recognized Big Data credential. Data Brio Academy certifies true specialists who have demonstrated their abilities to execute at the highest level on both traditional exams and hands-on challenges with live data sets. Our certification program is both a tool managers can use to verify expertise and a resource for finding or cultivating the talent they need to launch and scale their Big Data projects.

Be a Certified Developer for Apache Hadoop

Know the Methods Used by Top Developers

Individuals who achieve Certified Developer for Apache Hadoop accreditation have demonstrated their technical knowledge, skill, and ability to write, maintain, and optimize Apache Hadoop development projects.

Big Data Advance: Course Content

Module 1: Big Data and Application

  • Business application of Big Data
  • Understanding the Hadoop Ecosystem

Module 2: Managing an Enterprise Wide Big Data Ecosystem

  • Big Data Technology Foundations
  • Big Data Management Systems – Databases and Warehouses

Module 3: Storing and Processing Data – HDFS and MapReduce

  • Storing & Customizing MapReduce Execution
  • Writing a MapReduce Program Using Streaming
  • Testing and Debugging MapReduce Applications
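
As a rough illustration of the streaming style, here is a minimal Python sketch. In Hadoop Streaming the mapper and reducer are ordinary programs that read lines from standard input and write tab-separated key/value lines to standard output, with the framework sorting by key in between. The sketch below models that with two generator functions and a call to sorted() in place of the framework; the function names and sample input are hypothetical.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts of consecutive lines that share a key.

    Assumes the input is already sorted by key, which the framework
    guarantees for the lines it hands to a streaming reducer.
    """
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(value) for _, value in group)}"


# sorted() stands in for the shuffle-and-sort step between the two phases.
mapped = sorted(mapper(["big data", "big hadoop"]))
reduced = list(reducer(mapped))
```

In a real job, each function body would instead loop over sys.stdin and print its results, and the two scripts would be passed to the streaming jar as the mapper and reducer commands.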

Module 4: Increasing Efficiency with Hadoop Tools: Hive and Pig

  • Exploring Hive
  • Advanced Querying with Hive
  • Analyzing Data with Pig

Module 5: Additional Hadoop Tools: Sqoop, Flume, YARN and Storm

  • Efficiently transferring Bulk data Using Sqoop
  • Flume
  • Beyond MapReduce – YARN
  • An Introduction to Oozie
  • Storm on YARN

Module 6: Leveraging NoSQL, Hadoop Security, on Cloud and Real Time

  • Hello NoSQL
  • Working with NoSQL
  • Hadoop Security

Module 7: R Hadoop implementation for advanced analytics

Why Data Brio?

  • Learn directly from industry practitioners with more than 22 years of corporate experience with companies like Dell R&D, Infosys, Perot Systems, Tektronix, etc.
  • Certification from WEBEL
  • Only institute with transparent faculty profiles directly from industry. Get the business perspective instead of learning just the tools & theories
  • Unique training methodology with hands-on sessions, real-time case studies, assignments with data sets and projects with end to end life cycle
  • End-to-end life cycle experience of a real-time project. Internship provision in our parent company, Business Brio. Business Brio (member of NASSCOM and CII - Confederation of Indian Industry) is an award-winning company that offers consulting and project services in Big Data and Analytics to clients around the globe.
  • 100% Placement assistance through dedicated placement cell (resume workshop, interview guidance and placement opportunities)
