You've written a MapReduce job that will process 500 million input records and generate 500 million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a significant amount of intermediate data that it needs to transfer between mappers and reducers, which is a potential bottleneck. A custom implementation of which interface is most likely to reduce the amount of intermediate data transferred across the network?
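
The scenario describes map-side aggregation of intermediate pairs before the shuffle. In Hadoop MapReduce this is the role of a combiner, which is normally written as a Reducer implementation and registered with Job.setCombinerClass. The sketch below is a Hadoop-free illustration of that idea, not the Hadoop API: the class CombinerSketch and the methods mapRecords and combine are hypothetical names, and it assumes the job is a simple per-key count, where the operation is commutative and associative.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Hadoop-free sketch of map-side combining (assumed count job).
class CombinerSketch {

    // "Map" step: emit one (key, 1) pair per input record.
    static List<Map.Entry<String, Integer>> mapRecords(List<String> records) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String record : records) {
            out.add(new SimpleEntry<String, Integer>(record, 1));
        }
        return out;
    }

    // "Combine" step: sum the values per key on the mapper side, so that at
    // most one pair per distinct key would leave this mapper for the shuffle.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            combined.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Skewed toy input standing in for one mapper's share of the records.
        List<String> records = Arrays.asList("a", "b", "a", "a", "c", "b");
        List<Map.Entry<String, Integer>> mapOutput = mapRecords(records);
        Map<String, Integer> combined = combine(mapOutput);

        System.out.println("pairs before combining: " + mapOutput.size());
        System.out.println("pairs after combining: " + combined.size());
        System.out.println(combined);
    }
}
```

The saving grows with the number of repeated keys per mapper, which is why skewed data tends to benefit. In a real job the framework may run the combiner zero, one, or several times, so the reducer must still produce correct results without it.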

This question is part of a 54-question practice quiz for CCD-410: Cloudera Certified Developer for Apache Hadoop (CCDH) Exam.

