CCD-410: Cloudera Certified Developer for Apache Hadoop (CCDH) Exam
25 Questions

1. In the reducer, the MapReduce API provides you with an iterator over Writable values. What does calling the next() method return?
2. Analyze each scenario below and identify which best describes the behavior of the default partitioner.
3. In a large MapReduce job with m mappers and n reducers, how many distinct copy operations will there be in the sort/shuffle phase?
4. You want to perform analysis on a large collection of images. You want to store this data in HDFS and process it with MapReduce but you also want to give your data analysts and data scientists the ability to process the data directly from HDFS with an interpreted high-level programming language like Python. Which format should you use to store this data in HDFS?
5. Which project gives you a distributed, scalable data store that allows random, real-time read/write access to hundreds of terabytes of data?
6. MapReduce v2 (MRv2/YARN) is designed to address which two issues?
7. Identify the MapReduce v2 (MRv2/YARN) daemon responsible for launching application containers and monitoring application resource usage.
8. You have just executed a MapReduce job. Where is intermediate data written to after being emitted from the Mapper's map method?
9. You want to populate an associative array in order to perform a map-side join. You've decided to put this information in a text file, place that file into the DistributedCache, and read it in your Mapper before any records are processed. Identify which method in the Mapper you should use to implement code for reading the file and populating the associative array.
10. What data does a Reducer's reduce method process?
11. You have user profile records in your OLTP database that you want to join with web logs you have already ingested into the Hadoop file system. How will you obtain these user records?
12. Table metadata in Hive is:
13. Determine which best describes when the reduce method is first called in a MapReduce job.
14. On a cluster running MapReduce v1 (MRv1), a TaskTracker heartbeats into the JobTracker on your cluster, and alerts the JobTracker it has an open map task slot. What determines how the JobTracker assigns each map task to a TaskTracker?
15. You need to move a file titled weblogs into HDFS. When you try to copy the file, you can't. You know you have ample space on your DataNodes. Which action should you take to relieve this situation and store more files in HDFS?
16. Which describes how a client reads a file from HDFS?
17. What types of algorithms are difficult to express in MapReduce v1 (MRv1)?
18. You have written a Mapper which invokes the following five calls to the OutputCollector.collect method: output.collect(new Text("Apple"), new Text("Red")); output.collect(new Text("Banana"), new Text("Yellow")); output.collect(new Text("Apple"), new Text("Yellow")); output.collect(new Text("Cherry"), new Text("Red")); output.collect(new Text("Apple"), new Text("Green")); How many times will the Reducer's reduce method be invoked?
19. You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt, .third.txt and #data.txt. How many files will be processed by the FileInputFormat.setInputPaths() command when it's given a Path object representing this directory?
20. All keys used for intermediate output from mappers must:
21. Your client application submits a MapReduce job to your Hadoop cluster. Identify the Hadoop daemon on which the Hadoop framework will look for an available slot to schedule a MapReduce operation.
22. To process input key-value pairs, your mapper needs to load a 512 MB data file into memory. What is the best way to accomplish this?
23. A client application creates an HDFS file named foo.txt with a replication factor of 3. Identify which best describes the file access rules in HDFS if the file has a single block that is stored on data nodes A, B and C.
24. Identify which best defines a SequenceFile.
25. You use the hadoop fs -put command to write a 300 MB file using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another user see when trying to access this file?
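
The reasoning behind question 18 can be checked with a small plain-Java sketch. It does not use the Hadoop API: the class ReduceCallSketch and the helper groupByKey are hypothetical names introduced here for illustration. It assumes the default grouping behavior, in which the framework calls reduce once per distinct intermediate key and passes all values for that key together. In a real job the order of values within a key is not guaranteed; the insertion order shown here is only a side effect of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceCallSketch {

    // Group (key, value) pairs by key, roughly the way the shuffle/sort
    // phase presents them to reduce(): keys in sorted order, each with
    // the list of all values emitted for it. (Hypothetical helper, not
    // part of Hadoop.)
    static Map<String, List<String>> groupByKey(String[][] pairs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The five pairs emitted by the Mapper in question 18.
        String[][] emitted = {
            {"Apple", "Red"},
            {"Banana", "Yellow"},
            {"Apple", "Yellow"},
            {"Cherry", "Red"},
            {"Apple", "Green"}
        };

        Map<String, List<String>> grouped = groupByKey(emitted);

        // One simulated reduce() call per distinct key.
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            System.out.println("reduce(" + entry.getKey() + ", " + entry.getValue() + ")");
        }
        System.out.println("reduce invocations: " + grouped.size());
    }
}
```

Running main prints one line per distinct key (Apple, Banana, Cherry) followed by the number of simulated reduce invocations, which equals the number of distinct keys rather than the number of collect calls.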