New posts in hadoop

Large-scale data processing: HBase vs Cassandra [closed]

nosql hadoop cassandra hbase data-processing

There are 0 datanode(s) running and no node(s) are excluded in this operation

ubuntu hadoop amazon-ec2 hdfs hadoop2

Parquet vs ORC vs ORC with Snappy

hadoop hive parquet snappy orc

Is it better to use the mapred or the mapreduce package to create a Hadoop Job?

hadoop mapreduce

Could not start ZK at requested port of 2181, while export HBASE_MANAGES_ZK=false

linux hadoop hbase zookeeper cloudera

How can I include a Python package with a Hadoop streaming job?

Is it possible to manage 20 TB of data using MySQL?

mysql database hadoop hbase

Python read file as stream from HDFS

python hadoop subprocess hdfs

How many mappers and reducers will get created for a partitioned table in Hive?

hadoop hive mapreduce reduce mapper

How do I find out the version of Zookeeper I am running?

apache2 versions hadoop

What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism?

performance apache-spark hadoop apache-spark-sql

How to know Hive and Hadoop versions from command prompt?

Hadoop MapReduce secondary sorting

hadoop mapreduce hadoop-partitioning

How to use Sqoop in Java Program?

java hadoop sqoop

How to define a custom partitioner for Spark RDDs with equally sized partitions, where each partition has an equal number of elements?

scala hadoop apache-spark

Is there a .NET equivalent to Apache Hadoop? [closed]

c# .net hadoop mapreduce

How do I output the results of a HiveQL query to CSV?

database hadoop hive hiveql

Hadoop JBOD disk configuration on HP Smart Array 410/i disk controller

hp hp-proliant hadoop hp-smart-array storage

winutils error: Error while running Spark on Windows

scala apache-spark hadoop apache-spark-sql

Best choice for NTP client configuration

centos ntp hadoop