How to export a Hive table into a CSV file?
Solution 1:
Run the query from the command line and convert the tab delimiters to commas:
hive -e 'select * from your_Table' | sed 's/[\t]/,/g' > /home/yourfile.csv
You can also set the property hive.cli.print.header=true before the SELECT to ensure that a header row is written to the file along with the data.
For example:
hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[\t]/,/g' > /home/yourfile.csv
If you don't want to write to the local file system, pipe the output of the sed command back into HDFS using the hadoop fs -put command.
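For example (the HDFS destination path here is just a placeholder; adjust it for your cluster):
hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[\t]/,/g' | hadoop fs -put - /user/yourname/yourfile.csv
Passing - as the source tells hadoop fs -put to read from stdin instead of a local file.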
It may also be convenient to SFTP your files using something like Cyberduck, or you can use scp to copy them from a terminal / command prompt.
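For example, from your local machine (the hostname and path are placeholders):
scp yourname@your-hadoop-node:/home/yourfile.csv .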
Solution 2:
If you're using Hive 0.11 or later, you can use the INSERT statement with the LOCAL keyword.
Example:
insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;
Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.
Using this approach means you don't need to worry about the format of the source tables, can export based on an arbitrary SQL query, and can choose your own delimiters and output formats.
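For example, to stitch the export above into a single CSV on the client (assuming the staging directory contains only the exported files):
cat /home/carter/staging/* > /home/carter/hugetable.csv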
Solution 3:
That should work for you.
Tab separated:
hive -e 'select * from some_table' > /home/yourfile.tsv
Comma separated:
hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv
Solution 4:
You cannot specify a delimiter for the query output directly, but after generating the report (as you did), you can change the delimiter to a comma. The output comes with the default delimiter \001 (an invisible character).
hadoop fs -cat /user/data/output/test/* |tr "\01" "," >>outputwithcomma.csv
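If you want to confirm the delimiter before converting, you can dump the first row with od, which renders non-printing bytes in octal so the \001 separators become visible:
hadoop fs -cat /user/data/output/test/* | head -n 1 | od -c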