Apache-hadoop-developer Practice Questions

By Author: Kristi McLeod

Question: 1

Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching application
containers and monitoring application resource usage.

A. ResourceManager
B. NodeManager
C. ApplicationMaster
D. ApplicationMasterService
E. TaskTracker
F. JobTracker

Answer: B

Reference: Apache Hadoop YARN - Concepts & Applications

Question: 2

You want to run Hadoop jobs on your development workstation for testing before you submit
them to your production cluster.
Which mode of operation in Hadoop allows you to most closely simulate a production cluster while
using a single machine?

A. Run all the nodes in your production cluster as virtual machines on your development workstation.
B. Run the hadoop command with the -jt local and the -fs file:/// options.
C. Run the DataNode, TaskTracker, NameNode and JobTracker daemons on a single machine.
D. Run simldooop, the Apache open-source software for simulating Hadoop clusters.

Answer: C

Question: 3

You have the following key-value pairs as output from your Map task:
(the, 1) (fox, 1) (faster, 1) (than, 1) (the, 1) (dog, 1)
How many keys will be passed to the Reducer's reduce method?

A. Six
B. Five
C. Four
D. Two
E. One
F. Three

Answer: B

Explanation:
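The two (the, 1) pairs share one key, so the framework groups them into a single reduce call
with the values (1, 1). That leaves five distinct keys: the, fox, faster, than, and dog.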
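Note: The grouping step can be sketched in plain Python. This is only a simulation of what the
shuffle does with the map output from the question, not Hadoop code; the function name
shuffle_and_group is a hypothetical helper introduced for illustration.

```python
from itertools import groupby


def shuffle_and_group(pairs):
    """Sort (key, value) pairs by key and collect the values for each distinct key."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=lambda kv: kv[0])
    ]


# Map output taken from the question.
map_output = [("the", 1), ("fox", 1), ("faster", 1), ("than", 1), ("the", 1), ("dog", 1)]

grouped = shuffle_and_group(map_output)
for key, values in grouped:
    # Each iteration corresponds to one call of the reduce method.
    print(key, values)

print("distinct keys:", len(grouped))
```

Six pairs go in, but the duplicate key collapses into one group, so the loop body runs five times.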

Question: 4

Which project gives you a distributed, scalable data store that allows you random, realtime
read/write access to hundreds of terabytes of data?

A. HBase
B. Hue
C. Pig
D. Hive
E. Oozie
F. Flume
G. Sqoop

Answer: A

Explanation:
Use Apache HBase when you need random, realtime read/write access to your Big Data.
Note: This project's goal is the hosting of very large tables -- billions of rows X millions of columns --
atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned,
column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured
Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google
File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
Features:
Linear and modular scalability.
Strictly consistent reads and writes.
Automatic and configurable sharding of tables.
Automatic failover support between RegionServers.
Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
Easy to use Java API for client access.
Block cache and Bloom Filters for real-time queries.
Query predicate push down via server side Filters.
Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options.
Extensible jruby-based (JIRB) shell.
Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX.
Reference: http://hbase.apache.org/ (when would I use HBase? First sentence)

Question: 5

Which one of the following statements describes a Pig bag, tuple, and map, respectively?

A. Unordered collection of maps, ordered collection of tuples, ordered set of key/value pairs
B. Unordered collection of tuples, ordered set of fields, set of key/value pairs
C. Ordered set of fields, ordered collection of tuples, ordered collection of maps
D. Ordered collection of maps, ordered collection of bags, and unordered set of key/value pairs

Answer: B

Question: 6

Which HDFS command copies an HDFS file named foo to the local filesystem as localFoo?

A. hadoop fs -get foo localFoo
B. hadoop -cp foo localFoo
C. hadoop fs -ls foo
D. hadoop fs -put foo localFoo

Answer: A

Question: 7

You are developing a MapReduce job for sales reporting. The mapper will process input keys
representing the year (IntWritable) and input values representing product identifiers (Text).
Identify what determines the data types used by the Mapper for a given job.

A. The key and value types specified in the JobConf.setMapInputKeyClass and
JobConf.setMapInputValuesClass methods
B. The data types specified in HADOOP_MAP_DATATYPES environment variable
C. The mapper-specification.xml file submitted with the job determines the mapper's input key and
value types.
D. The InputFormat used by the job determines the mapper's input key and value types.

Answer: D

Explanation:
The input types fed to the mapper are controlled by the InputFormat used.
The default input format, "TextInputFormat," will load data in as (LongWritable, Text) pairs.
The long value is the byte offset of the line in the file. The Text object holds the string
contents of the line of the file.
Note: The data types emitted by the reducer are identified by setOutputKeyClass() and
setOutputValueClass(). By default, it is assumed that these are the output types of the mapper
as well. If this is not the case, the setMapOutputKeyClass() and setMapOutputValueClass()
methods of the JobConf class will override these.
Reference: Yahoo! Hadoop Tutorial, THE DRIVER METHOD
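Note: The (byte offset, line) records described above can be illustrated with a small Python
simulation. It does not call Hadoop; text_records is a hypothetical helper and the sample bytes
are made-up input, used only to show how the key of each record is the offset at which its
line starts.

```python
def text_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, one per line of the input."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is where the line starts; the value is the line without its terminator.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


# Hypothetical two-line input file: year, tab, product identifier.
sample = b"2012\twidget\n2013\tgadget\n"

records = list(text_records(sample))
for key, value in records:
    print(key, repr(value))
```

The first line is 12 bytes long including its newline, so the second record carries the key 12.
A mapper reading such records sees an integer offset and a text line, whatever the file contains.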

Question: 8

All keys used for intermediate output from mappers must:

A. Implement a splittable compression algorithm.
B. Be a subclass of FileInputFormat.
C. Implement WritableComparable.
D. Override isSplitable.
E. Implement a comparator for speedy sorting.

Answer: C

Explanation:
The MapReduce framework operates exclusively on <key, value> pairs, that is, the framework views
the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as the
output of the job, conceivably of different types.
The key and value classes have to be serializable by the framework and hence need to implement the
Writable interface. Additionally, the key classes have to implement the WritableComparable interface
to facilitate sorting by the framework.
Reference: MapReduce Tutorial
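Note: The two requirements on a key, that it can be serialized and that it can be ordered, can
be sketched in Python. This is an analogy and not the Hadoop Java interface: the class
YearProductKey, its fields, and the byte layout are all assumptions made for illustration.

```python
import struct
from dataclasses import dataclass

HEADER = ">iH"  # big-endian 4-byte year, then 2-byte length of the product string


@dataclass(frozen=True, order=True)
class YearProductKey:
    """Hypothetical composite key: ordered by year, then by product."""
    year: int
    product: str

    def write(self) -> bytes:
        """Serialize the key so it can be moved between tasks."""
        encoded = self.product.encode("utf-8")
        return struct.pack(HEADER, self.year, len(encoded)) + encoded

    @classmethod
    def read_fields(cls, data: bytes) -> "YearProductKey":
        """Rebuild a key from the bytes produced by write()."""
        year, length = struct.unpack_from(HEADER, data, 0)
        start = struct.calcsize(HEADER)
        return cls(year, data[start:start + length].decode("utf-8"))


keys = [
    YearProductKey(2013, "gadget"),
    YearProductKey(2012, "widget"),
    YearProductKey(2012, "anvil"),
]

# Serialization round trip, then the sort that ordering makes possible.
restored = [YearProductKey.read_fields(k.write()) for k in keys]
ordered = sorted(restored)
print(ordered)
```

Without the ordering, the sort in the last step would fail, which is why serialization alone is
not enough for keys.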

Question: 9

Review the following data and Pig code:

What command to define B would produce the output (M,62,95102) when invoking the DUMP
operator on B?

A. B = FILTER A BY (zip == '95102' AND gender == 'M');
B. B = FOREACH A BY (gender == 'M' AND zip == '95102');
C. B = JOIN A BY (gender == 'M' AND zip == '95102');
D. B = GROUP A BY (zip == '95102' AND gender == 'M');

Answer: A
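Explanation:
FILTER selects the tuples of a relation that satisfy a condition. The data for this question is
not reproduced here, so the Python sketch below uses hypothetical rows, assuming the field
order (gender, age, zip) suggested by the expected output. Only the tuple (M, 62, 95102) comes
from the question; every other row is invented for illustration.

```python
# Hypothetical contents of relation A as (gender, age, zip) tuples.
A = [
    ("M", 66, "95103"),
    ("F", 58, "95103"),
    ("F", 68, "95102"),
    ("M", 62, "95102"),
]

# Python equivalent of filtering A on zip == '95102' AND gender == 'M'.
B = [row for row in A if row[2] == "95102" and row[0] == "M"]

print(B)
```

Only rows that meet both conditions survive, and each surviving row keeps all of its fields.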

Question: 10

Assuming the following Hive query executes successfully:

Which one of the following statements describes the result set?

A. A bigram of the top 80 sentences that contain the substring "you are" in the lines column of the
inputdata table.
B. An 80-value ngram of sentences that contain the words "you" or "are" in the lines column of the
inputdata table.
C. A trigram of the top 80 sentences that contain "you are" followed by a null space in the lines
column of the inputdata table.
D. A frequency distribution of the top 80 words that follow the subsequence "you are" in the lines
column of the inputdata table.

Answer: D
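Explanation:
The query itself is not reproduced here, so the following is only a Python sketch of the kind of
result option D describes: a frequency distribution of the words that follow "you are", limited
to the top 80. The function words_following and the sample lines are assumptions made for
illustration and are not taken from the question.

```python
from collections import Counter


def words_following(lines, context, top_k):
    """Count the words that come right after the given word sequence."""
    context = tuple(context)
    n = len(context)
    counts = Counter()
    for line in lines:
        words = line.lower().split()
        for i in range(len(words) - n):
            if tuple(words[i:i + n]) == context:
                counts[words[i + n]] += 1
    # Most frequent first, at most top_k entries.
    return counts.most_common(top_k)


# Hypothetical values of the lines column.
lines = [
    "you are welcome",
    "you are here and you are welcome",
    "they are late",
]

result = words_following(lines, ["you", "are"], 80)
print(result)
```

The result pairs each following word with its count, which matches the "frequency distribution"
wording rather than a list of sentences or n-grams.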
