Loading Data from a .txt file to Table Stored as ORC in Hive

LOAD DATA only copies (or, without LOCAL, moves) the files into the table's data directory. Hive does not transform the data while loading it into a table.

So, in this case the input file /home/user/test_details.txt needs to be in ORC format if you are loading it into an ORC table.

A possible workaround is to create an intermediate (staging) table with STORED AS TEXTFILE, LOAD DATA into it, and then copy the data from that table into the ORC table.

Here is an example:

CREATE TABLE test_details_txt( visit_id INT, store_id SMALLINT) STORED AS TEXTFILE;
CREATE TABLE test_details_orc( visit_id INT, store_id SMALLINT) STORED AS ORC;

-- Load into Text table
LOAD DATA LOCAL INPATH '/home/user/test_details.txt' INTO TABLE test_details_txt;

-- Copy to ORC table
INSERT INTO TABLE test_details_orc SELECT * FROM test_details_txt;
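
The text table above uses Hive's default field delimiter (Ctrl-A, \001). If the input file is comma-separated, the text table needs a matching ROW FORMAT clause. The comma-separated layout is an assumption, because the original file's layout is not shown. A minimal sketch, using the hypothetical staging table name test_details_csv and assuming test_details_orc already exists and is empty:

```sql
-- Assumption: /home/user/test_details.txt has one row per line, fields separated by commas.
-- test_details_csv is a hypothetical staging table name used only for illustration.
CREATE TABLE test_details_csv (visit_id INT, store_id SMALLINT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Copies the local file as-is into the staging table's directory.
LOAD DATA LOCAL INPATH '/home/user/test_details.txt' INTO TABLE test_details_csv;

-- The INSERT ... SELECT is the step that actually rewrites the rows as ORC.
INSERT INTO TABLE test_details_orc SELECT * FROM test_details_csv;

-- Compare row counts, then drop the staging table if they match.
SELECT COUNT(*) FROM test_details_csv;
SELECT COUNT(*) FROM test_details_orc;
DROP TABLE test_details_csv;
```

If the delimiter does not match the file, Hive typically still loads it without error but returns NULL for columns it cannot parse, so it is worth checking a few rows before dropping the staging table.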

Steps:

  1. Create a table stored as TEXTFILE (the default format), or in whichever format matches the input file.
  2. Load the data into the text table.
  3. Create the ORC table with CREATE TABLE ... STORED AS ORC AS SELECT * FROM text_table.
  4. Query the ORC table to confirm the data can be read.

Example:

CREATE TABLE text_table(line STRING);

LOAD DATA INPATH 'path_of_file' OVERWRITE INTO TABLE text_table;

CREATE TABLE orc_table STORED AS ORC AS SELECT * FROM text_table;

SELECT * FROM orc_table;   -- the data can now be read

Since Hive does not transform the input during LOAD DATA, the file format must match the table's storage format: either the file is already in ORC format, or the text file is loaded into a text table first.
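
To check which format a table expects before loading into it, DESCRIBE FORMATTED prints the table's storage classes. A short sketch, assuming the test_details_txt and test_details_orc tables from the first example exist:

```sql
-- For an ORC table, the storage section lists the ORC classes
-- (org.apache.hadoop.hive.ql.io.orc.OrcSerde, OrcInputFormat, OrcOutputFormat).
DESCRIBE FORMATTED test_details_orc;

-- For a text table, the InputFormat is org.apache.hadoop.mapred.TextInputFormat.
DESCRIBE FORMATTED test_details_txt;
```

If the InputFormat shown does not match the format of the file on disk, LOAD DATA will still succeed, but later queries against the table are likely to fail or return garbage.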