hadoop copy a local file system folder to HDFS
Solution 1:
You could try:
hadoop fs -put /path/in/linux /hdfs/path
or even
hadoop fs -copyFromLocal /path/in/linux /hdfs/path
By default both put
and copyFromLocal
would upload directories recursively to HDFS.
Solution 2:
In Short
hdfs dfs -put <localsrc> <dest>
In detail with an example:
Checking source and target before placing files into HDFS
[cloudera@quickstart ~]$ ll files/
total 132
-rwxrwxr-x 1 cloudera cloudera 5387 Nov 14 06:33 cloudera-manager
-rwxrwxr-x 1 cloudera cloudera 9964 Nov 14 06:33 cm_api.py
-rw-rw-r-- 1 cloudera cloudera 664 Nov 14 06:33 derby.log
-rw-rw-r-- 1 cloudera cloudera 53655 Nov 14 06:33 enterprise-deployment.json
-rw-rw-r-- 1 cloudera cloudera 50515 Nov 14 06:33 express-deployment.json
[cloudera@quickstart ~]$ hdfs dfs -ls
Found 1 items
drwxr-xr-x - cloudera cloudera 0 2017-11-14 00:45 .sparkStaging
Copy files HDFS using -put
or -copyFromLocal
command
[cloudera@quickstart ~]$ hdfs dfs -put files/ files
Verify the result in HDFS
[cloudera@quickstart ~]$ hdfs dfs -ls
Found 2 items
drwxr-xr-x - cloudera cloudera 0 2017-11-14 00:45 .sparkStaging
drwxr-xr-x - cloudera cloudera 0 2017-11-14 06:34 files
[cloudera@quickstart ~]$ hdfs dfs -ls files
Found 5 items
-rw-r--r-- 1 cloudera cloudera 5387 2017-11-14 06:34 files/cloudera-manager
-rw-r--r-- 1 cloudera cloudera 9964 2017-11-14 06:34 files/cm_api.py
-rw-r--r-- 1 cloudera cloudera 664 2017-11-14 06:34 files/derby.log
-rw-r--r-- 1 cloudera cloudera 53655 2017-11-14 06:34 files/enterprise-deployment.json
-rw-r--r-- 1 cloudera cloudera 50515 2017-11-14 06:34 files/express-deployment.json