HADOOP QUIZ
Total Questions: 30
Max Time: 15:00

1. Point out the wrong statement.
   a) Hadoop's processing capabilities are huge, and its real advantage lies in the ability to process terabytes and petabytes of data
   b) Hadoop uses a programming model called “MapReduce”; all programs should conform to this model in order to work on the Hadoop platform
   c) The programming model, MapReduce, used by Hadoop is difficult to write and test
   d) All of the mentioned

2. The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix operations.
   a) Machine learning
   b) Pattern recognition
   c) Statistical classification
   d) Artificial intelligence

3. What was Hadoop written in?
   a) Java (software platform)
   b) Perl
   c) Java (programming language)
   d) Lua (programming language)

4. Point out the correct statement.
   a) MapReduce tries to place the data and the compute as close as possible
   b) Map Task in MapReduce is performed using the Mapper() function
   c) Reduce Task in MapReduce is performed using the Map() function
   d) All of the mentioned

5. Above the file systems comes the ________ engine, which consists of one Job Tracker, to which client applications submit MapReduce jobs.
   a) MapReduce
   b) Google
   c) Functional programming
   d) Facebook

6. Hadoop achieves reliability by replicating the data across multiple hosts and hence does not require ________ storage on hosts.
   a) RAID
   b) Standard RAID levels
   c) ZFS
   d) Operating system

7. ________ is a platform for constructing data flows for extract, transform, and load (ETL) processing and analysis of large datasets.
   a) Pig Latin
   b) Oozie
   c) Pig
   d) Hive

8. What was Hadoop named after?
   a) Creator Doug Cutting’s favorite circus act
   b) Cutting’s high school rock band
   c) The toy elephant of Cutting’s son
   d) A sound Cutting’s laptop made during Hadoop development

9. ________ has the world’s largest Hadoop cluster.
   a) Apple
   b) Datamatics
   c) Facebook
   d) None of the mentioned

10. Mapper implementations are passed the JobConf for the job via the ________ method.
   a) JobConfigure.configure
   b) JobConfigurable.configure
   c) JobConfigurable.configurable
   d) None of the mentioned

11. Point out the wrong statement.
   a) A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner
   b) The MapReduce framework operates exclusively on <key, value> pairs
   c) Applications typically implement the Mapper and Reducer interfaces to provide the map and reduce methods
   d) None of the mentioned

12. ________ is the most popular high-level Java API in the Hadoop ecosystem.
   a) Scalding
   b) HCatalog
   c) Cascalog
   d) Cascading

13. Sun also has the Hadoop Live CD ________ project, which allows running a fully functional Hadoop cluster using a live CD.
   a) OpenOffice.org
   b) OpenSolaris
   c) GNU
   d) Linux

14. ________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer.
   a) Partitioner
   b) OutputCollector
   c) Reporter
   d) All of the mentioned

15. The right number of reduces seems to be ________
   a) 0.90
   b) 0.80
   c) 0.36
   d) 0.95

16. According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?
   a) Big data management and data mining
   b) Data warehousing and business intelligence
   c) Management of Hadoop clusters
   d) Collecting and storing unstructured data

17. Input to the ________ is the sorted output of the mappers.
   a) Reducer
   b) Mapper
   c) Shuffle
   d) All of the mentioned

18. ________ is the default Partitioner for partitioning key space.
   a) HashPar
   b) Partitioner
   c) HashPartitioner
   d) None of the mentioned

19. IBM and ________ have announced a major initiative to use Hadoop to support university courses in distributed computer programming.
   a) Google Latitude
   b) Android (operating system)
   c) Google Variations
   d) Google

20. ________ hides the limitations of Java behind a powerful and concise Clojure API for Cascading.
   a) Scalding
   b) HCatalog
   c) Cascalog
   d) All of the mentioned

21. ________ is a utility which allows users to create and run jobs with any executables as the mapper and/or the reducer.
   a) Hadoop Strdata
   b) Hadoop Streaming
   c) Hadoop Stream
   d) None of the mentioned

22. Which of the following platforms does Hadoop run on?
   a) Bare metal
   b) Debian
   c) Cross-platform
   d) Unix-like

23. All of the following accurately describe Hadoop, EXCEPT ________
   a) Open-source
   b) Real-time
   c) Java-based
   d) Distributed computing approach

24. ________ is a general-purpose computing model and runtime system for distributed data analytics.
   a) MapReduce
   b) Drill
   c) Oozie
   d) None of the mentioned

25. Point out the wrong statement.
   a) Reducer has 2 primary phases
   b) Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures
   c) It is legal to set the number of reduce-tasks to zero if no reduction is desired
   d) The framework groups Reducer inputs by keys (since different mappers may have output the same key) in the sort stage

26. Which of the following genres does Hadoop produce?
   a) Distributed file system
   b) JAX-RS
   c) Java Message Service
   d) Relational Database Management System

27. The ________ part of MapReduce is responsible for processing one or more chunks of data and producing the output results.
   a) Maptask
   b) Mapper
   c) Task execution
   d) All of the mentioned

28. ________ maps input key/value pairs to a set of intermediate key/value pairs.
   a) Mapper
   b) Reducer
   c) Both Mapper and Reducer
   d) None of the mentioned

29. The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to ________
   a) SQL
   b) JSON
   c) XML
   d) All of the mentioned

30. Which of the following phases occur simultaneously?
   a) Shuffle and Sort
   b) Reduce and Sort
   c) Shuffle and Map
   d) All of the mentioned
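
Review sketch: several questions above refer to the same flow, in which a mapper turns input key/value pairs into intermediate pairs, a partitioner assigns each key to a reducer, and each reducer receives its keys in sorted order. The following is a minimal in-memory illustration of that flow in plain Python, not Hadoop's API. The names map_fn, partition, reduce_fn and run_job are hypothetical, and zlib.crc32 is only a stand-in for the key hash a hash-based partitioner would use.

```python
import zlib
from collections import defaultdict


def map_fn(_key, line):
    """Map: one input record -> intermediate (word, 1) pairs."""
    for word in line.split():
        yield word.lower(), 1


def partition(key, num_reducers):
    """Stand-in for hash partitioning: stable hash modulo reducer count."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_fn(key, values):
    """Reduce: all values collected for one key -> one output pair."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    # One bucket per (hypothetical) reducer, mapping key -> list of values.
    buckets = [defaultdict(list) for _ in range(num_reducers)]

    # Map phase: each record is processed independently of the others.
    for offset, line in enumerate(lines):
        for key, value in map_fn(offset, line):
            # Shuffle: route the pair to the reducer that owns this key.
            buckets[partition(key, num_reducers)][key].append(value)

    # Sort + reduce: each reducer walks its own keys in sorted order.
    output = []
    for bucket in buckets:
        for key in sorted(bucket):
            output.append(reduce_fn(key, bucket[key]))
    return output


if __name__ == "__main__":
    lines = ["Hadoop runs MapReduce", "MapReduce runs jobs"]
    for word, count in sorted(run_job(lines)):
        print(word, count)
```

Because every occurrence of a key hashes to the same bucket, each word is counted by exactly one reducer, whatever value num_reducers takes. Which bucket a given word lands in depends on the hash, so the sketch sorts the combined output before printing.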