Cloudera CCD-410 Cloudera Certified Developer for Apache Hadoop (CCDH) Practice Test 2
25 Questions

1. All keys used for intermediate output from mappers must:
2. Assuming default settings, which best describes the order of data provided to a reducer's reduce method:
3. Given a directory of files with the following structure: line number, tab character, string. Example:
   1	abialkjfjkaoasdfjksdlkjhqweroij
   2	kadfjhuwqounahagtnbvaswslmnbfgy
   3	kjfteiomndscxeqalkzhtopedkfsikj
   You want to send each line as one record to your Mapper. Which InputFormat should you use to complete the line: conf.setInputFormat(____.class); ?
4. You have user profile records in your OLTP database that you want to join with web logs you have already ingested into the Hadoop file system. How will you obtain these user records?
5. In the reducer, the MapReduce API provides you with an iterator over Writable values. What does calling the next() method return?
6. What data does a Reducer's reduce method process?
7. In a MapReduce job, you want each of your input files processed by a single map task. How do you configure a MapReduce job so that a single map task processes each input file regardless of how many blocks the input file occupies?
8. Your client application submits a MapReduce job to your Hadoop cluster. Identify the Hadoop daemon on which the Hadoop framework will look for an available slot to schedule a MapReduce operation.
9. How are keys and values presented and passed to the reducers during a standard sort and shuffle phase of MapReduce?
10. Identify which best defines a SequenceFile.
11. Which process describes the lifecycle of a Mapper?
12. What is a SequenceFile?
13. You use the hadoop fs -put command to write a 300 MB file using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another user see when trying to access this file?
14. What types of algorithms are difficult to express in MapReduce v1 (MRv1)?
15. Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching application containers and monitoring application resource usage.
16. Your cluster's HDFS block size is 64 MB. You have a directory containing 100 plain text files, each of which is 100 MB in size. The InputFormat for your job is TextInputFormat. Determine how many Mappers will run.
17. You want to run Hadoop jobs on your development workstation for testing before you submit them to your production cluster. Which mode of operation in Hadoop allows you to most closely simulate a production cluster while using a single machine?
18. To process input key-value pairs, your mapper needs to load a 512 MB data file in memory. What is the best way to accomplish this?
19. A client application creates an HDFS file named foo.txt with a replication factor of 3. Identify which best describes the file access rules in HDFS if the file has a single block that is stored on data nodes A, B and
20. MapReduce v2 (MRv2/YARN) splits which major functions of the JobTracker into separate daemons?
21. Which best describes what the map method accepts and emits?
22. You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat: the mapper applies a regular expression over input values and emits key-value pairs with the key consisting of the matching text, and the value containing the filename and byte offset. Determine the difference between setting the number of reducers to one and setting the number of reducers to zero.
23. You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt, .third.txt and #data.txt. How many files will be processed by the FileInputFormat.setInputPaths() command when it's given a path object representing this directory?
24. You wrote a map function that throws a runtime exception when it encounters a control character in input data. The input supplied to your mapper contains twelve such characters in total, spread across five file splits. The first four file splits each have two control characters and the last split has four control characters. Identify the number of failed task attempts you can expect when you run the job with mapred.map.max.attempts set to 4:
25. Which project gives you a distributed, scalable data store that allows you random, realtime read/write access to hundreds of terabytes of data?
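
Study note for question 16: the number of map tasks follows from the number of input splits, so the arithmetic can be checked with a short script. This is a simplified sketch, not Hadoop's actual code. It assumes the files are uncompressed and splittable, that the split size equals the 64 MB block size (default minimum and maximum split settings), and it models the 10% slack that FileInputFormat allows on the last split of a file. The names count_splits and SPLIT_SLOP are chosen for this illustration.

```python
# Simplified model of how FileInputFormat divides one splittable file.
# Assumption: split size == HDFS block size (default min/max split settings).
SPLIT_SLOP = 1.1  # the final split of a file may be up to 10% larger than split_size
MB = 1024 * 1024


def count_splits(file_size, split_size):
    """Return the number of input splits produced for a single file."""
    splits = 0
    remaining = file_size
    # Carve off full-size splits while more than 1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left (if anything) becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


# Scenario from question 16: 100 files of 100 MB each, 64 MB blocks.
file_sizes = [100 * MB] * 100
total_splits = sum(count_splits(size, 64 * MB) for size in file_sizes)
print(total_splits)  # 200: each file yields a 64 MB split and a 36 MB split
```

Splits never span files under TextInputFormat, so each 100 MB file contributes two splits on its own rather than the 36 MB tails being packed together.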