New posts in mapreduce

Hadoop speculative task execution

How to get the input file name in the mapper in a Hadoop program?

MultipleOutputFormat in Hadoop

Is gzip format supported in Spark?

Find all duplicate documents in a MongoDB collection by a key field

Data block size in HDFS, why 64MB?

Integration testing Hive jobs

Reading file as single record in Hadoop

Why is the final reduce step extremely slow in this MapReduce? (HiveQL, HDFS MapReduce)

Kotlin - How to convert a list of objects into a single one after map operation?

Hadoop Word count: receive the total number of words that start with the letter "c"

Hadoop FileAlreadyExistsException: Output directory hdfs://<namenode public dns>:9000/input already exists

Remove Duplicates from MongoDB

What is Hive: Return Code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

Hadoop DistributedCache is deprecated - what is the preferred API?

Count lines in large files

MongoDB Stored Procedure Equivalent

Oozie: Launch Map-Reduce from Oozie <java> action?

Hadoop truncated/inconsistent counter name

MongoDB aggregation comparison: group(), $group and MapReduce