New posts in hadoop

What's the purpose of a JOIN where no column from 2nd table is being used?

sql hadoop join hive hql

pytz.exceptions.UnknownTimeZoneError when loading pytz with zipimport in Python

python hadoop timezone pytz

Spark on yarn concept understanding

hadoop apache-spark hdfs hadoop-yarn

data block size in HDFS, why 64MB?

database hadoop mapreduce block hdfs

Hive insert query like SQL

sql hadoop hive hiveql

Namenode not getting started

math operations between queries Impala SQL

sql hadoop impala hue

HDFS error: could only be replicated to 0 nodes, instead of 1

amazon-ec2 hadoop

Hive: how to show all partitions of a table?

Integration testing Hive jobs

java testing hadoop mapreduce hive

Reading file as single record in hadoop

java hadoop mapreduce

Hive query performance for high cardinality field

sql hadoop hive query-optimization

"Permission denied" errors when starting a single node cluster in Hadoop

Hadoop Word count: receive the total number of words that start with the letter "c"

java hadoop mapreduce

Hadoop FileAlreadyExistsException: Output directory hdfs://<namenode public dns>:9000/input already exists

ubuntu hadoop mapreduce

How to delete a directory from a Hadoop cluster that has a comma (,) in its name?

file hadoop comma

How to get ID of a map task in Spark?

scala hadoop apache-spark hadoop-yarn

Differences between Amazon S3 and S3n in Hadoop

hadoop amazon-s3 hdfs

What is Hive: Return Code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

hadoop mapreduce hive

Is there any way to get the column names along with the output when executing a query in Hive?

hadoop hive rdbms