How to export a Hive table into a CSV file?

Solution 1:

Run the query from the hive CLI and convert the tab-separated output to commas with sed:

hive -e 'select * from your_Table' | sed 's/[\t]/,/g'  > /home/yourfile.csv

You can also set the property hive.cli.print.header=true before the SELECT to ensure that a header row is created along with the data and copied to the file. For example:

hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[\t]/,/g'  > /home/yourfile.csv

If you don't want to write to the local file system, pipe the output of the sed command back into HDFS using the hadoop fs -put command.
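For example, hadoop fs -put reads from stdin when you pass - as the source, so something along these lines should work (the HDFS destination path here is just a placeholder):

hive -e 'select * from your_Table' | sed 's/[\t]/,/g' | hadoop fs -put - /user/yourname/yourfile.csv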

It may also be convenient to SFTP to your files using something like Cyberduck, or you can use scp to copy them via a terminal / command prompt.
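For instance, an scp copy might look like this (the user and host names are placeholders for your Hadoop gateway):

scp yourname@your-hadoop-gateway:/home/yourfile.csv .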

Solution 2:

If you're using Hive 0.11 or later, you can use the INSERT OVERWRITE DIRECTORY statement with the LOCAL keyword.

Example:

insert overwrite local directory '/home/carter/staging'
row format delimited fields terminated by ','
select * from hugetable;

Note that this may create multiple files, and you may want to concatenate them on the client side after the export completes.
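For example, Hive typically writes the output as files named 000000_0, 000001_0, and so on, so a plain cat is usually enough to stitch them together (the output file name is arbitrary):

cat /home/carter/staging/* > /home/carter/hugetable.csv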

With this approach you don't need to worry about the format of the source tables, you can export the result of an arbitrary SQL query, and you can choose your own delimiters and output formats.
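For instance, a sketch that exports a filtered query with a pipe delimiter instead (the column names and WHERE clause here are made up for illustration):

insert overwrite local directory '/home/carter/staging'
row format delimited fields terminated by '|'
select id, name from hugetable where year = 2014;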

Solution 3:

This should work for you:

  • tab separated

    hive -e 'select * from some_table' > /home/yourfile.tsv
  • comma separated

    hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv

Solution 4:

You cannot specify a delimiter for the query output directly; after generating the report (as you did), you can change the delimiter to a comma.

The output comes with the default delimiter \001 (an invisible control character):

hadoop fs -cat /user/data/output/test/* | tr "\001" "," >> outputwithcomma.csv
