HADOOP QUIZ DESCRIPTION

The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix operations.

  • Machine learning
     

  • Pattern recognition
     

  •  Statistical classification
     

  •  Artificial intelligence

________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer.

  • Partitioner
     

  • OutputCollector
     

  •  Reporter
     

  •  All of the mentioned
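As a rough illustration of the collection facility this question refers to, here is a minimal plain-Java sketch. The interface mimics the single `collect(key, value)` idea; the class and sink here are simplified stand-ins, not Hadoop's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the OutputCollector idea: one collect(key, value) facility
// that both map and reduce code use to emit output pairs.
public class CollectorSketch {
    public interface OutputCollector<K, V> {
        void collect(K key, V value);
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>();
        // Back the collector with a simple in-memory list for illustration.
        OutputCollector<String, Integer> out = (k, v) -> sink.add(k + "=" + v);
        out.collect("hadoop", 1); // a Mapper or Reducer would call this per pair
        out.collect("hive", 2);
        System.out.println(sink); // [hadoop=1, hive=2]
    }
}
```

In Hadoop's old `mapred` API this is exactly the generalization the framework provides: map and reduce code never write files directly, they hand pairs to the collector.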

________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.

  • Map Parameters
     

  • JobConf
     

  • MemoryConf
     

  •  None of the mentioned

Point out the wrong statement.

  • Reducer has 2 primary phases
     

  •  Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures
     

  •  It is legal to set the number of reduce-tasks to zero if no reduction is desired
     

  • The framework groups Reducer inputs by keys (since different mappers may have output the same key) in the sort stage
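The grouping described in the last option can be sketched in plain Java, with no Hadoop dependencies. The class and method names below are illustrative, not part of the Hadoop API; the sketch groups intermediate pairs by key (as the sort stage does) and then folds each group into one summed value, word-count style.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the Reducer's sort/group and reduce phases.
public class GroupAndReduceSketch {
    // Group intermediate (key, value) pairs by key, as the framework's
    // sort stage does before invoking the Reducer.
    static Map<String, List<Integer>> group(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce each group of values to a single sum, like a word-count Reducer.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> out = new TreeMap<>();
        grouped.forEach((k, vs) -> out.put(k, vs.stream().mapToInt(Integer::intValue).sum()));
        return out;
    }

    public static void main(String[] args) {
        // Pairs as two different mappers might have emitted them.
        List<Map.Entry<String, Integer>> pairs = List.of(
            Map.entry("hadoop", 1), Map.entry("hive", 1), Map.entry("hadoop", 1));
        System.out.println(reduce(group(pairs))); // {hadoop=2, hive=1}
    }
}
```

Grouping must happen first precisely because different mappers may emit the same key; the reduce step only sees one value list per key.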

What was Hadoop written in?

  • Java (software platform)
     

  •  Perl
     

  • Java (programming language)
     

  •  Lua (programming language)

Point out the correct statement.

  •  Hadoop does need specialized hardware to process the data
     

  •  Hadoop 2.0 allows live stream processing of real-time data
     

  •  In the Hadoop programming framework output files are divided into lines or records
     

  • None of the mentioned

A ________ node acts as the Slave and is responsible for executing a Task assigned to it by the JobTracker.

  • MapReduce
     

  • Mapper
     

  •  TaskTracker
     

  •  JobTracker

According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?

  • Big data management and data mining
     

  •  Data warehousing and business intelligence
     

  •  Management of Hadoop clusters
     

  •  Collecting and storing unstructured data

Which of the following platforms does Hadoop run on?

  • Bare metal
     

  •  Debian
     

  • Cross-platform
     

  •  Unix-like

_______ is the most popular high-level Java API in the Hadoop ecosystem.

  • Scalding
     

  •  HCatalog
     

  • Cascalog
     

  •  Cascading

__________ is a general-purpose computing model and runtime system for distributed data analytics.

  •  MapReduce
     

  •  Drill
     

  •  Oozie
     

  • None of the mentioned

Facebook tackles big data with _______ based on Hadoop.

  • ‘Project Prism’
     

  •  ‘Prism’
     

  •  ‘Project Big’
     

  • ‘Project Data’

The __________ part of MapReduce is responsible for processing one or more chunks of data and producing the output results.

  • Maptask
     

  •  Mapper
     

  •  Task execution
     

  • All of the mentioned

Point out the correct statement.

  • Applications can use the Reporter to report progress
     

  •  The Hadoop MapReduce framework spawns one map task for each InputSplit generated by the InputFormat for the job
     

  •  The intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format
     

  •  All of the mentioned

IBM and ________ have announced a major initiative to use Hadoop to support university courses in distributed computer programming.

  • Google Latitude
     

  • Android (operating system)
     

  • Google Variations
     

  • Google

_________ maps input key/value pairs to a set of intermediate key/value pairs.

  • Mapper
     

  •  Reducer
     

  •  Both Mapper and Reducer
     

  •  None of the mentioned
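The mapping this question describes can be sketched in plain Java without Hadoop on the classpath. The class below is an illustrative stand-in: it maps one input record (a line of text) to a set of intermediate (word, 1) pairs, as a word-count Mapper would.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of a word-count Mapper: one input record in,
// a set of intermediate (word, 1) pairs out.
public class MapperSketch {
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) out.add(Map.entry(word, 1)); // emit (word, 1)
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(map("Hadoop runs MapReduce jobs"));
        // [hadoop=1, runs=1, mapreduce=1, jobs=1]
    }
}
```

In real Hadoop the pairs would be handed to the framework (via a collector/context) rather than returned, but the key-to-intermediate-pairs shape is the same.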

The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to ____________

  • SQL
     

  •  JSON
     

  •  XML
     

  •  All of the mentioned

Hadoop is a framework that works with a variety of related tools. Common cohorts include ____________

  • MapReduce, Hive and HBase
     

  •  MapReduce, MySQL and Google Apps
     

  • MapReduce, Hummer and Iguana
     

  •  MapReduce, Heron and Trumpet

_______ jobs are optimized for scalability but not latency.

  • MapReduce
     

  •  Drill
     

  •  Oozie
     

  • Hive

Which of the following genres does Hadoop produce?

  • Distributed file system
     

  •  JAX-RS
     

  •  Java Message Service
     

  •  Relational Database Management System

_________ is the default Partitioner for partitioning key space.

  • HashPar
     

  •  Partitioner
     

  •  HashPartitioner
     

  •  None of the mentioned
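The default partitioner's logic is simple enough to sketch in plain Java: mask off the sign bit of `hashCode()` and take the remainder modulo the number of reduce tasks, so equal keys always land on the same Reducer. The class name below is illustrative; only the formula reflects what Hadoop's `HashPartitioner` does.

```java
// Plain-Java sketch of HashPartitioner's partitioning rule.
public class HashPartitionerSketch {
    // Masking with Integer.MAX_VALUE clears the sign bit, so the
    // result of the remainder is always in [0, numReduceTasks).
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition.
        System.out.println(getPartition("hadoop", 4) == getPartition("hadoop", 4)); // true
    }
}
```

The sign-bit mask matters because `hashCode()` can be negative, and a plain `%` on a negative value would yield a negative partition index.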

As companies move past the experimental phase with Hadoop, many cite the need for additional capabilities, including _______________

  • Improved data storage and information retrieval
     

  •  Improved extract, transform and load features for data integration
     

  •  Improved data warehousing functionality
     

  •  Improved security, workload management, and SQL support

All of the following accurately describe Hadoop, EXCEPT ____________

  • Open-source
     

  •  Real-time
     

  •  Java-based
     

  •  Distributed computing approach

Although the Hadoop framework is implemented in Java, MapReduce applications need not be written in ____________

  • Java
     

  •  C
     

  •  C#
     

  •  None of the mentioned

_______ function is responsible for consolidating the results produced by each of the Map() functions/tasks.

  •  Reduce
     

  •  Map
     

  •  Reducer
     

  • All of the mentioned

Hadoop achieves reliability by replicating the data across multiple hosts and hence does not require ________ storage on hosts.

  • RAID
     

  •  Standard RAID levels
     

  •  ZFS
     

  •  Operating system

Point out the correct statement.

  • Hadoop is an ideal environment for extracting and transforming small volumes of data
     

  • Hadoop stores data in HDFS and supports data compression/decompression
     

  • The Giraph framework is less useful than a MapReduce job for solving graph and machine learning problems
     

  • None of the mentioned

Mapper implementations are passed the JobConf for the job via the ________ method.

  •  JobConfigure.configure
     

  •  JobConfigurable.configure
     

  •  JobConfigurable.configurable
     

  •  None of the mentioned

_______ is a utility which allows users to create and run jobs with any executables as the mapper and/or the reducer.

  • Hadoop Strdata
     

  •  Hadoop Streaming
     

  •  Hadoop Stream
     

  •  None of the mentioned

Point out the wrong statement.

  • Elastic MapReduce (EMR) is Facebook’s packaged Hadoop offering
     

  • Amazon Web Service Elastic MapReduce (EMR) is Amazon’s packaged Hadoop offering
     

  •  Scalding is a Scala API on top of Cascading that removes most Java boilerplate
     

  • All of the mentioned