New posts in hadoop

What's the purpose of a JOIN where no column from 2nd table is being used?

pytz.exceptions.UnknownTimeZoneError when loading pytz with zipimport in Python

Spark on yarn concept understanding

data block size in HDFS, why 64MB?

Hive insert query like SQL

Namenode not getting started

math operations between queries Impala SQL

HDFS error: could only be replicated to 0 nodes, instead of 1

Hive: how to show all partitions of a table?

Integration testing Hive jobs

Reading file as single record in hadoop

Hive query performance for high cardinality field

"Permission denied" errors when starting a single node cluster in Hadoop

Hadoop Word count: receive the total number of words that start with the letter "c"

Hadoop FileAlreadyExistsException: Output directory hdfs://<namenode public dns>:9000/input already exists

How to delete a directory from a Hadoop cluster that has a comma (,) in its name?

How to get ID of a map task in Spark?

Differences between Amazon S3 and S3n in Hadoop

What is Hive: Return Code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

Is there any way to get the column names along with the output when executing a query in Hive?