HADOOP QUIZ

Total Questions: 30
Max Time: 15:00

The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix operations.
- Machine learning
- Pattern recognition
- Statistical classification
- Artificial intelligence

________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer.
- Partitioner
- OutputCollector
- Reporter
- All of the mentioned

________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.
- Map Parameters
- JobConf
- MemoryConf
- None of the mentioned

Point out the wrong statement.
- Reducer has 2 primary phases
- Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures
- It is legal to set the number of reduce-tasks to zero if no reduction is desired
- The framework groups Reducer inputs by keys (since different mappers may have output the same key) in the sort stage

What was Hadoop written in?
- Java (software platform)
- Perl
- Java (programming language)
- Lua (programming language)

Point out the correct statement.
- Hadoop does need specialized hardware to process the data
- Hadoop 2.0 allows live stream processing of real-time data
- In the Hadoop programming framework output files are divided into lines or records
- None of the mentioned

A ________ node acts as the Slave and is responsible for executing a Task assigned to it by the JobTracker.
- MapReduce
- Mapper
- TaskTracker
- JobTracker

According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?
- Big data management and data mining
- Data warehousing and business intelligence
- Management of Hadoop clusters
- Collecting and storing unstructured data

Which of the following platforms does Hadoop run on?
- Bare metal
- Debian
- Cross-platform
- Unix-like

________ is the most popular high-level Java API in the Hadoop ecosystem.
- Scalding
- HCatalog
- Cascalog
- Cascading

________ is a general-purpose computing model and runtime system for distributed data analytics.
- MapReduce
- Drill
- Oozie
- None of the mentioned

Facebook tackles big data with ________ based on Hadoop.
- ‘Project Prism’
- ‘Prism’
- ‘Project Big’
- ‘Project Data’

________ part of MapReduce is responsible for processing one or more chunks of data and producing the output results.
- Maptask
- Mapper
- Task execution
- All of the mentioned

Point out the correct statement.
- Applications can use the Reporter to report progress
- The Hadoop MapReduce framework spawns one map task for each InputSplit generated by the InputFormat for the job
- The intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format
- All of the mentioned

IBM and ________ have announced a major initiative to use Hadoop to support university courses in distributed computer programming.
- Google Latitude
- Android (operating system)
- Google Variations
- Google

________ maps input key/value pairs to a set of intermediate key/value pairs.
- Mapper
- Reducer
- Both Mapper and Reducer
- None of the mentioned

The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to ________
- SQL
- JSON
- XML
- All of the mentioned

Hadoop is a framework that works with a variety of related tools. Common cohorts include ________
- MapReduce, Hive and HBase
- MapReduce, MySQL and Google Apps
- MapReduce, Hummer and Iguana
- MapReduce, Heron and Trumpet

________ jobs are optimized for scalability but not latency.
- MapReduce
- Drill
- Oozie
- Hive

Which of the following genres does Hadoop produce?
- Distributed file system
- JAX-RS
- Java Message Service
- Relational Database Management System

________ is the default Partitioner for partitioning key space.
- HashPar
- Partitioner
- HashPartitioner
- None of the mentioned

As companies move past the experimental phase with Hadoop, many cite the need for additional capabilities, including ________
- Improved data storage and information retrieval
- Improved extract, transform and load features for data integration
- Improved data warehousing functionality
- Improved security, workload management, and SQL support

All of the following accurately describe Hadoop, EXCEPT ________
- Open-source
- Real-time
- Java-based
- Distributed computing approach

Although the Hadoop framework is implemented in Java, MapReduce applications need not be written in ________
- Java
- C
- C#
- None of the mentioned

________ function is responsible for consolidating the results produced by each of the Map() functions/tasks.
- Reduce
- Map
- Reducer
- All of the mentioned

Hadoop achieves reliability by replicating the data across multiple hosts and hence does not require ________ storage on hosts.
- RAID
- Standard RAID levels
- ZFS
- Operating system

Point out the correct statement.
- Hadoop is an ideal environment for extracting and transforming small volumes of data
- Hadoop stores data in HDFS and supports data compression/decompression
- The Giraph framework is less useful than a MapReduce job to solve graph and machine learning
- None of the mentioned

Mapper implementations are passed the JobConf for the job via the ________ method.
- JobConfigure.configure
- JobConfigurable.configure
- JobConfigurable.configurable
- None of the mentioned

________ is a utility which allows users to create and run jobs with any executables as the mapper and/or the reducer.
- Hadoop Strdata
- Hadoop Streaming
- Hadoop Stream
- None of the mentioned

Point out the wrong statement.
- Elastic MapReduce (EMR) is Facebook’s packaged Hadoop offering
- Amazon Web Service Elastic MapReduce (EMR) is Amazon’s packaged Hadoop offering
- Scalding is a Scala API on top of Cascading that removes most Java boilerplate
- All of the mentioned