Big Data Essentials Bootcamp
Course Specifications
Course Number:
035002
Course Length:
5 days
Course Description
Overview:
The Big Data Essentials Bootcamp covers the technologies essential to building modern Big Data applications. Such applications usually include scalable storage and computation (Hadoop), fast scalable databases (NoSQL), and interactive and real-time analytics, as well as machine learning (Spark). The course covers all of these areas.
Building a successful Big Data system takes more than knowing how to run the appropriate technologies; it also requires mastering the right architectural approach. The Big Data Essentials Bootcamp contains practical labs and best practices for architecting Big Data systems.
Course Objectives:
In this course, you will investigate and employ all key Big Data technologies in use today.
You will:
- Master the best practices of Hadoop programming in Pig, Hive, and Java.
- Enhance Big Data solutions for real-time data access with the help of NoSQL technologies, such as HBase and Cassandra.
- Grasp the essence of NoSQL data modeling in contrast to SQL data modeling.
- Utilize Spark for interactive data analytics with Scala or SQL.
- Implement near-real-time analytical processing with Spark Streaming.
Target Student:
This course is for developers who implement Big Data systems, architects who plan and design them, and technical managers who want a solid foundation for their work.
Prerequisites:
To ensure your success in this course, you should be familiar with at least one programming language and be comfortable working with a command-line interface.
Course-specific Technical Requirements
Lab Environments
The use of web-based lab environments is required to deliver this course. Lab access is included with the courseware purchase. Please contact customerservice@logicaloperations.com for information on how to access the labs for instructor preparation and for classroom use.
Instructor Preparation
Train the Trainer (TTT) preparation prior to delivering this course is highly recommended. TTT services can be arranged through Logical Operations. Please contact customerservice@logicaloperations.com for more information.
Course Content
Lesson 1: Big Data Overview
Big Data
Big Data Use Cases
Designing a Big Data System
Technologies: Hadoop
Technologies: NoSQL
Analytics
Putting It All Together
Lesson 2: Hadoop Introduction
Introduction to Hadoop
The Future of Hadoop
Lesson 3: HDFS and MapReduce Primer
HDFS
MapReduce
YARN
Future of Hadoop Processing Engines
Lesson 4: Hive
Hadoop, Hive, and SQL
Hive Design and Architecture
HiveQL
First Look at Hive
Hive Partitions
Hive Joins
Hive UDFs
Text Analytics with Hive
Lesson 5: Hive 2
Data Access
Feature Generation
Filter/Search/Transpose
Binning and Smoothing
Tez
Lesson 6: Pig
Understand Apache Pig
Pig Concepts/History
Pig by Example
Pig as an ETL Pipeline
Lesson 7: Hadoop Cluster Planning
Planning Hadoop Hardware
Planning Software Install
Lesson 8: Hadoop Install and Configure
Different Installation Configurations in Hadoop
Install Hadoop
Configure Hadoop Cluster
Common Configuration Properties
Making Installation and Configuration Easier
Hadoop Advanced Configuration
Lesson 9: Hadoop Data Ingest
Flume
Sqoop
REST
Import Best Practices
Lesson 10: NoSQL Intro
RDBMS and NoSQL
ACID in NoSQL
CAP Theorem
NoSQL Stores
Columnar Storage
Lesson 11: Cassandra Intro
Introduction & Architecture
Cassandra Use Cases
Data Organization
First Look at Cassandra
Replication & Consistency
Lesson 12: Cassandra Data Modeling 1
Keyspaces and Tables
CQL Queries
Indexing
Lesson 13: Cassandra Data Modeling 2
Collections
Composite Keys
Time Series Data
Counters
Lightweight Transactions
Lesson 14: Cassandra Data Modeling Labs
MyFlix (Netflix)
YouTube
Online Shopping (Amazon)
User Activity (Facebook)
Lesson 15: Scala Primer
Introduction
Collections
Functions/Methods
Class/Object/Trait
Lesson 16: Introduction to Spark
Introduction
Spark vs. Hadoop
A First Look at Spark
Lesson 17: Spark Data Model 1
Data Model Overview
RDD Concepts
Spark Workflow
Working with RDDs
Key-Value Pairs
Caching
Lesson 18: Spark Data Model 2
DataFrames
Working with DataFrames
Spark SQL
Datasets
Spark and Hive
Data Formats
Lesson 19: Spark API/Applications
Core API
Building and Running Applications
Application Lifecycle
Logging & Debugging
Lesson 20: Machine Learning Primer
Machine Learning Concepts
Machine Learning Vocabulary
Text Mining
Recommendations
Lesson 21: Spark Streaming
Streaming
Spark Streaming Overview
Architecture
Programming
Structured Streaming
Transformations
Apache Kafka
SKU | 035002SE |
---|---|
Weight | 0.0000 |
Coming Soon | No |
Days of Training | 5 |
Audience | Student |
Product Family | Partnerware |
Product Type | Digital Courseware |
Electronic | Yes |
ISBN | No |
Language | English |
Page Count | No |
Curriculum Library | No |
Year | No |
Manufacturer's Product Code | No |
Current Revision | 1.0 |
Revision Notes | No Revision Information Available |
Original Publication Date | 2017-05-18 |