APACHE PIG QUIZ DESCRIPTION

Point out the correct statement.

  • Invoke the Grunt shell using the “enter” command
     

  •  Pig does not support jar files
     

  • Both the run and exec commands are useful for debugging because you can modify a Pig script in an editor and then rerun the script in the Grunt shell without leaving the shell
     

  •  All of the mentioned
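
For context on the correct option: both run and exec execute a script from the Grunt shell, but run executes it in the current Grunt context so the script's aliases stay visible afterwards, while exec runs it in a separate context. A minimal sketch, with illustrative script and alias names:

```pig
grunt> run myscript.pig
grunt> DUMP some_alias;   -- works: run keeps the script's aliases in the Grunt session
grunt> exec myscript.pig  -- runs in a separate scope; its aliases are not retained
```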

Pig operates mainly in how many modes?

  • Two
     

  • Three
     

  • Four
     

  • Five

Point out the wrong statement.

  • Pig can invoke code in languages like Java only
     

  •  Pig enables data workers to write complex data transformations without knowing Java
     

  •  Pig’s simple SQL-like scripting language is called Pig Latin, and appeals to developers already familiar with scripting languages and SQL
     

  • Pig is complete, so you can do all required data manipulations in Apache Hadoop with Pig

You can specify parameter names and parameter values in which of the following ways?

  •  As part of a command line.
     

  • In a parameter file, as part of a command line
     

  • With the declare statement, as part of a Pig script
     

  •  All of the mentioned
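
The three mechanisms listed above can be sketched as follows; the parameter name `input`, the file `params.txt`, and the script `myscript.pig` are illustrative:

```pig
-- 1. On the command line:   pig -param input=/data/logs myscript.pig
-- 2. From a parameter file: pig -param_file params.txt myscript.pig
-- 3. Inside the script, with the declare preprocessor statement:
%declare input '/data/logs'
a = LOAD '$input' AS (line:chararray);
```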

Pig Latin is _______ and fits very naturally in the pipeline paradigm while SQL is instead declarative.

  • functional
     

  • procedural
     

  •  declarative
     

  • all of the mentioned
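
The procedural answer reflects how a Pig Latin script reads as a step-by-step pipeline, with each statement naming an intermediate relation. A small sketch with illustrative file and field names:

```pig
raw     = LOAD 'input.txt' AS (user:chararray, age:int);
adults  = FILTER raw BY age >= 18;
grouped = GROUP adults BY user;
counts  = FOREACH grouped GENERATE group AS user, COUNT(adults) AS n;
STORE counts INTO 'output';
```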

The ________ class mimics the behavior of the Main class but gives users a statistics object back.

  • PigRun
     

  • PigRunner
     

  • None of the mentioned

  • RunnerPig
     

Use the __________ command to run a Pig script that can interact with the Grunt shell (interactive mode).

  • fetch
     

  •  declare
     

  •  run
     

  • all of the mentioned

Which of the following scripts is used to check for scripts that have failed jobs?

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate (Chararray) j#'STATUS' as status, j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME'

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, (Long) r#'NUMBER_RED

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'QUEUE_NAME' as queue;
    c = group b by (id,

  • None of the mentioned

The ________ abstract class has three main methods for loading data, and for most use cases it suffices to extend it.

  • Load
     

  • LoadFunc
     

  • FuncLoad
     

  • None of the mentioned

__________ returns a list of HDFS files to ship to the distributed cache.

  • relativeToAbsolutePath()
     

  •  setUdfContextSignature()
     

  • getCacheFiles()
     

  •  getShipFiles()

The _______ operator is used to review the schema of a relation.

  • DUMP
     

  •  DESCRIBE
     

  •  STORE
     

  •  EXPLAIN
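
A quick sketch of the difference between the operators listed above, with illustrative relation and file names:

```pig
a = LOAD 'students.txt' AS (name:chararray, gpa:double);
DESCRIBE a;  -- reviews the schema, e.g. a: {name: chararray, gpa: double}
DUMP a;      -- prints the data itself, not the schema
```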

You can run Pig in interactive mode using the ______ shell.

  • Grunt
     

  •  FS
     

  • HDFS
     

  •  None of the mentioned

Which of the following finds the running time of each script (in seconds)?

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#…

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = for a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, 
        …

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'QUEUE_NAME' as queue;
    c = group b by (id,

  • All of the mentioned
     

PigUnit runs in Pig’s _______ mode by default.

  •  local
     

  • tez
     

  • mapreduce
     

  • none of the mentioned

Which of the following commands is used to show the values of keys used in Pig?

  • set
     

  • declare
     

  •  display
     

  •  all of the mentioned

______ is a framework for collecting and storing script-level statistics for Pig Latin.

  • Pig Stats
     

  •  PStatistics
     

  •  Pig Statistics
     

  •  None of the mentioned

In comparison to SQL, Pig uses ______________

  • Lazy evaluation
     

  •  ETL
     

  •  Supports pipeline splits
     

  •  All of the mentioned
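
One concrete contrast: a single Pig pipeline can split into several downstream pipelines, which SQL expresses less naturally. A sketch with illustrative names:

```pig
a = LOAD 'data' AS (x:int);
-- the relation a feeds two independent branches of the pipeline
b = FILTER a BY x > 0;
c = FILTER a BY x <= 0;
STORE b INTO 'positives';
STORE c INTO 'negatives';
```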

Which of the following finds scripts that generate more than three MapReduce jobs?

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = group a by (j#'PIG_SCRIPT_ID', j#'USER', j#'JOBNAME');
    c = for b generate group.$1, group.$2, COUNT(a);

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = display a by (j#'PIG_SCRIPT_ID', j#'USER', j#'JOBNAME');
    c = foreach b generate group.$1, group.$2, COUNT(a

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = group a by (j#'PIG_SCRIPT_ID', j#'USER', j#'JOBNAME');
    c = foreach b generate group.$1, group.$2, COUNT(a);

  • None of the mentioned

Which of the following is the default mode?

  • Mapreduce
     

  • Tez
     

  • Local
     

  •  All of the mentioned

Which of the following is the shortcut for the DUMP operator?

  • de alias
     

  •  d alias
     

  • q
     

  • None of the mentioned
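
Recent Pig versions provide one-letter debugging shortcuts in the Grunt shell; assuming those shortcuts, a session might look like this (alias name illustrative):

```pig
grunt> a = LOAD 'data' AS (x:int);
grunt> \d a    -- shortcut for DUMP a;
grunt> \de a   -- shortcut for DESCRIBE a;
grunt> \q      -- shortcut for QUIT
```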

Which of the following has methods to deal with metadata?

  • LoadPushDown
     

  •  LoadMetadata
     

  •  LoadCaster
     

  • All of the mentioned

Which of the following scripts determines the number of scripts run by user and queue on a cluster?
 

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate (Chararray) j#'STATUS' as status, j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME'

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, (Long) r#'NUMBER_RED

  • a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
    b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'QUEUE_NAME' as queue;
    c = group b by (id,

  • None of the mentioned
     

Which of the following operators is used to view the MapReduce execution plans?

  • DUMP
     

  • DESCRIBE
     

  • STORE
     

  • EXPLAIN
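
A short sketch of EXPLAIN in use (names illustrative):

```pig
a = LOAD 'data' AS (x:int, y:int);
b = GROUP a BY x;
EXPLAIN b;  -- prints the logical, physical, and MapReduce execution plans for b
```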

The __________ method enables the RecordReader associated with the InputFormat provided by the LoadFunc to be passed to the LoadFunc.

  • getNext()
     

  •  relativeToAbsolutePath()
     

  • prepareToRead()
     

  •  all of the mentioned

The loader should use the ______ method to communicate the load information to the underlying InputFormat.

  •  relativeToAbsolutePath()
     

  •  setUdfContextSignature()
     

  •  getCacheFiles()
     

  • setLocation()

A loader implementation should implement __________ if casts (implicit or explicit) from DataByteArray fields to other types need to be supported.

  • LoadPushDown
     

  •  LoadMetadata
     

  •  LoadCaster
     

  • All of the mentioned

$ pig -x tez_local … will enable ________ mode in Pig.

  • Mapreduce
     

  •  Tez
     

  • Local
     

  •  None of the mentioned
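
For reference, the -x flag selects the execution mode at start-up; a sketch of the common invocations (shown as comments, since they are shell commands rather than Pig Latin):

```pig
-- pig -x local       run against the local file system in a single JVM
-- pig -x mapreduce   the default: run against a Hadoop cluster
-- pig -x tez         run on a Hadoop cluster using the Tez engine
-- pig -x tez_local   run the Tez engine locally
```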

Which of the following file contains user defined functions (UDFs)?

  • script2-local.pig
     

  • pig.jar
     

  •  tutorial.jar
     

  • excite.log.bz2

___________ method will be called by Pig both in the front end and back end to pass a unique signature to the Loader.

  •  relativeToAbsolutePath()
     

  •  setUdfContextSignature()
     

  • getCacheFiles()
     

  •  getShipFiles()

Which of the following command can be used for debugging?

  • exec
     

  • execute
     

  • error
     

  • throw