Grootste online IT opleider

Beste klantenservice

Veel e-learning in prijs verlaagd

Na betaling, direct starten

Training: Hadoop Operations

€ 399,00
€ 482,79 Incl. BTW

Bestellen namens een bedrijf?

Duur: 42 uur |

Taal: Engels (US) |

Online toegang: 90 dagen |

In Onbeperkt Leren

Gegevens

In deze Hadoop training / cursus maakt u op een uitgebreide manier kennis met Hadoop clusters. Zo komt het ontwerpen van Hadoop clusters aan bod alsook Hadoop in de cloud aan bod. Vervolgens gaat de cursus verder met het deployen en beveiligen van Hadoop clusters. Aan het einde van de cursus komt capacity management, cloudera manager en performance tuning van Hadoop clusters aan bod.

Onderwerpen die onder andere aan bod komen zijn EC2, AWS, EMR job, Hive daemons, Install Hadoop, NameNode failure, DataNode, YARN, HDFS, event management, JogHistoryServer logs, performance tuning en nog veel meer.

Resultaat

Na het volgen van deze training heeft u uitgebreide kennis verkregen op het gebied van Hadoop clusters.

Voorkennis

Basiskennis van cloud computing, big data en databases is een pré.

Doelgroep

Softwareontwikkelaar, Webontwikkelaar, Databasebeheerders

Inhoud

Hadoop Operations

42 uur

Designing Hadoop Clusters

  • start the course
  • describe the principles of supercomputing
  • recall the roles and skills needed for the Hadoop engineering team
  • recall the advantages and shortcomings of using Hadoop as a supercomputing platform
  • describe the three axioms of supercomputing
  • describe the dumb hardware and smart software, and the share nothing design principles
  • describe the design principles for move processing not data, embrace failure, and build applications not infrastructure
  • describe the different rack architectures for Hadoop.
  • describe the best practices for scaling a Hadoop cluster.
  • recall the best practices for different types of network clusters
  • recall the primary responsibilities for the master, data, and edge servers
  • recall some of the recommendations for a master server and edge server
  • recall some of the recommendations for a data server
  • recall some of the recommendations for an operating system
  • recall some of the recommendations for hostnames and DNS entries
  • describe the recommendations for HDD
  • calculate the correct number of disks required for a storage solution
  • compare the use of commodity hardware with enterprise disks
  • plan for the development of a Hadoop cluster
  • set up flash drives as boot media
  • set up a kickstart file as boot media
  • set up a network installer
  • identify the hardware and networking recommendations for a Hadoop cluster

Hadoop in the Cloud

  • start the course
  • describe how cloud computing can be used as a solution for Hadoop
  • recall some of the most come services of the EC2 service bundle
  • recall some of the most common services that Amazon offers
  • describe how the AWS credentials are used for authentication
  • create an AWS account
  • describe the use of AWS access keys
  • describe AWS identification and access management
  • set up AWS IAM
  • describe the use of SSH key pairs for remote access
  • set up S3 and import data
  • provision a micro instance of EC2
  • prepare to install and configure a Hadoop cluster on AWS
  • create an EC2 baseline server
  • create an Amazon machine image
  • create an Amazon cluster
  • describe what the command line interface is used for
  • use the command line interface
  • describe the various ways to move data into AWS
  • recall the advantages and limitations of using Hadoop in the cloud
  • recall the advantages and limitations of using AWS EMR
  • describe EMR End-user connections and EMR security levels
  • set up an EMR cluster
  • run an EMR job from the web console
  • run an EMR job with Hue
  • run an EMR job with the command line interface
  • write an Elastic MapReduce script for AWS

Deploying Hadoop Clusters

  • start the course
  • describe the configurations management tools
  • simulate a configuration management tool
  • build an image for a baseline server
  • build an image for a DataServer
  • build an image for a master server
  • provision an admin server
  • describe the layout and structure of the Hadoop cluster
  • provision a Hadoop cluster
  • distribute configuration files and admin scripts
  • use init scripts to start and stop a Hadoop cluster
  • configure a Hadoop cluster
  • configure logging for the Hadoop cluster
  • build images for required servers in the Hadoop cluster
  • configure a MySQL database
  • build the Hadoop clients
  • configure Hive daemons
  • test the functionality of Flume, Sqoop, HDFS, and MapReduce
  • test the functionality of Hive and Pig
  • configure Hcatalog daemons
  • configure Oozie
  • configure Hue and Hue users
  • install Hadoop on to the admin server

Hadoop Cluster Availability

  • start the course
  • describe how Hadoop leverages fault tolerance
  • recall the most common causes for NameNode failure
  • recall the uses for the Checkpoint node
  • test the availability for the NameNode
  • describe the operation of the NameNode during a recovery
  • swap to a new NameNode
  • recall the most common causes for DataNode failure
  • test the availability for the DataNode
  • describe the operation of the DataNode during a recovery
  • set up the DataNode for replication
  • identify and recover from a missing data block scenario
  • describe the functions of Hadoop high availability
  • edit the Hadoop configuration files for high availability
  • set up a high availability solution for NameNode
  • recall the requirements for enabling an automated failover for the NameNode
  • create an automated failover for the NameNode
  • recall the most common causes for YARN task failure
  • describe the functions of YARN containers
  • test YARN container reliability
  • recall the most common causes of YARN job failure
  • test application reliability
  • describe the system view of the Resource Manager configurations set for high availability
  • set up high availability for the Resource Manager
  • move the Resource Manager HA to alternate master servers

Securing Hadoop Clusters

  • start the course
  • describe the four pillars of the Hadoop security model
  • recall the ports required for Hadoop and how network gateways are used
  • install security groups for AWS
  • describe Kerberos and recall some of the common commands
  • diagram Kerberos and label the primary components
  • prepare for a Kerberos installation
  • install Kerberos
  • configure Kerberos
  • describe how to configure HDFS and YARN for use with Kerberos
  • configure HDFS for Kerberos
  • configure YARN for Kerberos
  • describe how to configure Hive for use with Kerberos
  • configure Hive for Kerberos
  • describe how to configure Pig, Sqoop, and Oozie for use with Kerberos
  • configure Pig and HTTPFS for use with Kerberos
  • configure Oozie for use with Kerberos
  • configure Hue for use with Kerberos
  • describe how to configure Flume for use with Kerberos
  • describe the security model for users on a Hadoop cluster
  • describe the use of POSIX and ACL for managing user access
  • create access control lists
  • describe how to encrypt data in motion for Hadoop, Sqoop, and Flume
  • encrypt data in motion
  • describe how to encrypt data at rest
  • recall the primary security threats faced by the Hadoop cluster
  • describe how to monitor Hadoop security
  • configure Hbase for Kerberos

Operating Hadoop Clusters

  • start the course
  • monitor and improve service levels
  • deploy a Hadoop release
  • describe the purpose of change management
  • describe rack awareness
  • write configuration files for rack awareness
  • start and stop a Hadoop cluster
  • write init scripts for Hadoop
  • describe the tools fsck and dfsadmin
  • use fsck to check the HDFS file system
  • set quotas for the HDFS file system
  • install and configure trash
  • manage an HDFS DataNode
  • use include and exclude files to replace a DataNode
  • describe the operations for scaling a Hadoop cluster
  • add a DataNode to a Hadoop cluster
  • describe the process for balancing a Hadoop cluster
  • balance a Hadoop cluster
  • describe the operations involved for backing up data
  • use distcp to copy data from one cluster to another
  • describe MapReduce job management on a Hadoop cluster
  • perform MapReduce job management on a Hadoop cluster
  • plan an upgrade of a Hadoop cluster

Stabilizing Hadoop Clusters

  • start the course
  • describe the importance of event management
  • describe the importance of incident management
  • describe the different methodologies used for root cause analysis
  • recall what Ganglia is and what it can be used for
  • recall how Ganglia monitors Hadoop clusters
  • install Ganglia
  • describe Hadoop Metrics2
  • install Hadoop Metrics2 for Ganglia
  • describe how to use Ganglia to monitor a Hadoop cluster
  • use Ganglia to monitor a Hadoop cluster
  • recall what Nagios is and what it can be used for
  • install Nagios
  • manage Nagios contact records
  • manage Nagios Push
  • use Nagios commands
  • use Nagios to monitor a Hadoop cluster
  • use Hadoop Metrics2 for Nagios
  • describe how to manage logging levels
  • describe how to configure Hadoop jobs for logging
  • describe how to configure log4j for Hadoop
  • describe how to configure JogHistoryServer logs
  • configure Hadoop logs
  • describe the problem management lifecycle
  • recall some of the best practices for problem management
  • describe the categories of errors for a Hadoop cluster
  • conduct a root cause analysis on a major problem
  • use different monitoring tools to identify problems, failures, errors and solutions

Capacity Management for Hadoop Clusters

  • start the course
  • compare the differences of availability versus performance
  • describe different strategies of resource capacity management
  • describe how schedulers perform various resource management
  • set quotas for the HDFS file system
  • recall how to set the maximum and minimum memory allocations per container
  • describe how the fair scheduling method allows all applications to get equal amounts of resource time
  • describe the primary algorithm and the configuration files for the Fair Scheduler
  • describe the default behavior of the Fair Scheduler methods
  • monitor the behavior of Fair Share
  • describe the policy for single resource fairness
  • describe how resources are distributed over the total capacity
  • identify different configuration options for single resource fairness
  • configure single resource fairness
  • describe the minimum share function of the Fair Scheduler
  • configure minimum share on the Fair Scheduler
  • describe the preemption functions of the Fair Scheduler
  • configure preemption for the Fair Scheduler
  • describe dominant resource fairness
  • write service levels for performance
  • use the fail scheduler with multiple users

Performance Tuning of Hadoop Clusters

  • start the course
  • recall the three main functions of service capacity
  • describe different strategies of performance tuning
  • list some of the best practices for network tuning
  • install compression
  • describe the configuration files and parameters used in performance tuning of the operating system
  • describe the purpose of Java tuning
  • recall some of the rules for tuning the datanode
  • describe the configuration files and parameters used in performance tuning of memory for daemons
  • describe the purpose of memory tuning for YARN
  • recall why the Node Manager kills containers
  • performance tune memory for the Hadoop cluster
  • describe the configuration files and parameters used in performance tuning of HDFS
  • describe the sizing and balancing of the HDFS data blocks
  • describe the use of TestDFSIO
  • performance tune HDFS
  • describe the configuration files and parameters used in performance tuning of YARN
  • configure Speculative execution
  • describe the configuration files and parameters used in performance tuning of MapReduce
  • tune up MapReduce for performance reasons
  • describe the practice of benchmarking on a Hadoop cluster
  • describe the different tools used for benchmarking a cluster
  • perform a benchmark of a Hadoop cluster
  • describe the purpose of application modeling
  • optimize memory and benchmark a Hadoop cluster

Cloudera Manager and Hadoop Clusters

  • start the course
  • describe what cluster management entails and recall some of the tools that can be used
  • describe different tools from a functional perspective
  • describe the purpose and functionality of Cloudera Manager
  • install Cloudera Manager
  • use Cloudera Manager to deploy a cluster
  • use Cloudera Manager to install Hadoop
  • describe the different parts of the Cloudera Manager Admin Console
  • describe the Cloudera Manager internal architecture
  • use Cloudera Manager to manage a cluster
  • manage Cloudera Manager's services
  • manage hosts with Cloudera Manager
  • set up Cloudera Manager for high availability
  • user Cloudera Manager to manage resources
  • use Cloudera Manager's monitoring features
  • manage logs through Cloudera Manager
  • improve cluster performance with Cloudera Manager
  • install and configure Impala
  • install and configure Sentry
  • implement security administration using Hive
  • perform backups, snapshots, and upgrades using Cloudera Manager
  • configure Hue with My SQL
  • import data using Hue
  • use Hue to run a Hive job
  • use Hue to edit Oozie workflows and coordinators
  • format HDFS, create an HDFS directory, import data, run a WordCount, and view the results

Opties bij cursus

Wij bieden, naast de training, in sommige gevallen ook diverse extra leermiddelen aan. Wanneer u zich gaat voorbereiden op een officieel examen dan raden wij aan om ook de extra leermiddelen te gebruiken die beschikbaar zijn bij deze training. Het kan voorkomen dat bij sommige cursussen alleen een examentraining en/of LiveLab beschikbaar is.

Examentraining (proefexamens)

In aanvulling op deze training kunt u een speciale examentraining aanschaffen. De examentraining bevat verschillende proefexamens die het echte examen dicht benaderen. Zowel qua vorm als qua inhoud. Dit is de ultieme manier om te testen of u klaar bent voor het examen. 

LiveLab

Als extra mogelijkheid bij deze training kunt u een LiveLab toevoegen. U voert de opdrachten uit op de echte hardware en/of software die van toepassing zijn op uw Lab. De LiveLabs worden volledig door ons gehost in de cloud. U heeft zelf dus alleen een browser nodig om gebruik te maken van de LiveLabs. In de LiveLab omgeving vindt u de opdrachten waarmee u direct kunt starten. De labomgevingen bestaan uit complete netwerken met bijvoorbeeld clients, servers, routers etc. Dit is de ultieme manier om uitgebreide praktijkervaring op te doen.

Waarom Icttrainingen.nl?

Via ons opleidingsconcept bespaar je tot 80% op trainingen

Start met leren wanneer je wilt. Je bepaalt zelf het gewenste tempo

Spar met medecursisten en profileer je als autoriteit in je vakgebied.

Ontvang na succesvolle afronding van je cursus het certificaat van deelname van Icttrainingen.nl

Krijg inzicht in uitgebreide voortgangsinformatie van jezelf of je medewerkers

Kennis opdoen met interactieve e-learning en uitgebreide praktijkopdrachten door gecertificeerde docenten

Bestelproces

Zodra wij uw order en betaling hebben verwerkt, zetten wij uw trainingen klaar en kunt u aan de slag. Heeft u toch nog vragen over ons orderproces kunt u onderstaande button raadplegen.

lees meer over het orderproces

hoe werkt aanvragen met STAP

Wat is inbegrepen?

Certificaat van deelname ja
Voortgangsbewaking ja
Award Winning E-learning ja
Geschikt voor mobiel ja
Kennis delen Onbeperkte toegang tot onze community met IT professionals
Studieadvies Onze consultants zijn beschikbaar om je te voorzien van studieadvies
Studiemateriaal Gecertificeerde docenten met uitgebreide kennis over de onderwerpen
Service Service via chat, telefoon, e-mail (razendsnel)

Platform

Na bestelling van je training krijg je toegang tot ons innovatieve leerplatform. Hier vind je al je gekochte (of gevolgde) trainingen, kan je eventueel cursisten aanmaken en krijg je toegang tot uitgebreide voortgangsinformatie.

Life Long Learning

Meerdere cursussen volgen? Misschien is ons Life Long Learning concept wel wat voor u

lees meer

Neem contact op

Studieadvies nodig? Neem contact op!


Contact