How can I access S3/S3n from a local Hadoop 2.6 installation?

For some reason, hadoop-aws-[version].jar, the jar that contains the implementation of NativeS3FileSystem, is not on Hadoop's classpath by default in versions 2.6 and 2.7. Add it to the classpath by putting the following line in hadoop-env.sh, which is located at $HADOOP_HOME/etc/hadoop/hadoop-env.sh:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*

This assumes you are using Apache Hadoop 2.6 or 2.7.
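
With the jar on the classpath, a quick smoke test is to pass the s3n credentials as generic Hadoop -D options (the bucket name below is a placeholder; you can also put these properties in core-site.xml instead):

bin/hadoop fs -Dfs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY -Dfs.s3n.awsSecretAccessKey=YOUR_SECRET_KEY -ls s3n://mybucket/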

By the way, you can check Hadoop's classpath using:

bin/hadoop classpath
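
If you want to verify that the hadoop-aws jar is actually being picked up, a quick sketch (the --glob flag, which expands wildcard entries, should be available on 2.6+):

bin/hadoop classpath --glob | tr ':' '\n' | grep hadoop-aws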

Alternatively, if you are using PySpark, you can pull the AWS jars in at startup and set the credentials on the Hadoop configuration directly:

import os
# Pull in the AWS SDK and hadoop-aws jars; this must be set before the SparkContext is created
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.amazonaws:aws-java-sdk:1.10.34,org.apache.hadoop:hadoop-aws:2.6.0 pyspark-shell'

import pyspark
sc = pyspark.SparkContext("local[*]")

from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)

# Point s3:// URIs at NativeS3FileSystem and supply credentials on the Hadoop configuration
hadoopConf = sc._jsc.hadoopConfiguration()
myAccessKey = input()  # prompt for credentials rather than hard-coding them
mySecretKey = input()
hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
hadoopConf.set("fs.s3.awsAccessKeyId", myAccessKey)
hadoopConf.set("fs.s3.awsSecretAccessKey", mySecretKey)

df = sqlContext.read.parquet("s3://myBucket/myKey")
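
Equivalently, you can pass the same --packages coordinates on the command line when launching the shell, instead of setting PYSPARK_SUBMIT_ARGS from Python:

pyspark --packages com.amazonaws:aws-java-sdk:1.10.34,org.apache.hadoop:hadoop-aws:2.6.0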

@Ashrith's answer worked for me with one modification: I had to use $HADOOP_PREFIX rather than $HADOOP_HOME when running v2.6 on Ubuntu, perhaps because $HADOOP_HOME is being deprecated in favor of $HADOOP_PREFIX.

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:${HADOOP_PREFIX}/share/hadoop/tools/lib/*
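
If you're unsure which of the two variables your installation actually sets, a quick check:

env | grep -E 'HADOOP_(HOME|PREFIX)'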

Having said that, neither worked for me on my Mac with v2.6 installed via Homebrew. In that case, I'm using this extremely kludgy export:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$(brew --prefix hadoop)/libexec/share/hadoop/tools/lib/*
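
To confirm the jars really live under that Homebrew path before relying on the export (the libexec layout is Homebrew's and may vary by version):

ls $(brew --prefix hadoop)/libexec/share/hadoop/tools/lib/hadoop-aws-*.jar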