Generate insert SQL statements from a CSV file
I need to import a csv file into Firebird and I've spent a couple of hours trying out some tools and none fit my needs.
The main problem is that all the tools I've tried, like EMS Data Import and Firebird Data Wizard, expect my CSV file to contain all the information needed by my table.
I need to write some custom SQL in the insert statement. For example, my CSV file contains the city name, but my database already has all the cities in another (normalized) table, so I need a subselect in the insert statement to look up the city and write its ID. I also have a stored procedure to create GUIDs.
My insert statement would be something like this:
INSERT INTO PERSON (ID, NAME, CITY_ID) VALUES((SELECT NEW_GUID FROM CREATE_GUID), :NAME, (SELECT CITY_ID FROM CITY WHERE NAME = :CITY_NAME))
How can I approach this?
It's a bit crude, but for one-off jobs I sometimes use Excel.
If you import the CSV file into Excel, you can create a formula which creates an INSERT statement by using string concatenation in the formula. So - if your CSV file has 3 columns that appear in columns A, B, and C in Excel, you could write a formula like...
="INSERT INTO MyTable (Col1, Col2, Col3) VALUES (" & A1 & ", " & B1 & ", " & C1 & ")"
Then you can replicate the formula down all of your rows, and copy, and paste the answer into a text file to run against your database.
Like I say - it's crude - but it can be quite a 'quick and dirty' way of getting a job done!
Well, if it's a CSV and this is a one-time process, open the file in Excel, write formulas to populate your data in any way you desire, then write a simple Concat formula to construct your SQL and copy that formula down for every row. You will get a large number of SQL statements which you can execute anywhere you want.
Fabio,
I've done what Vaibhav has done many times, and it's a good "quick and dirty" way to get data into a database.
If you need to do this a few times, or on some type of schedule, then a more reliable way is to load the CSV data "as-is" into a work table (e.g. customer_dataload) and then use standard SQL statements to populate the missing fields.
(I don't know Firebird syntax - but something like...)
UPDATE person
SET id = (SELECT newguid() FROM createguid);
UPDATE person
SET cityid = (SELECT cityid FROM cities WHERE person.cityname = cities.cityname);
etc.
Usually, it's much faster (and more reliable) to get the data INTO the database and then fix the data than to try to fix the data during the upload. You also get the benefit of transactions to allow you to ROLLBACK if it does not work!!
You could import the CSV file into a table as is, then write an SQL query that does all the required transformations on the imported table and inserts the result into the target table.
So something like:
<(load the CSV file into temp_table - n, city_name)>
insert into target_table
select t.n, c.city_id as city
from temp_table t, cities c
where t.city_name = c.city_name
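To make the staging-table idea concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in for Firebird, so the connection call and some SQL details would differ with a real Firebird driver. The table and column names follow the example above; the sample rows and city IDs are made up for illustration.

```python
import csv
import io
import sqlite3

# In-memory SQLite database as a stand-in for the real Firebird database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (city_id INTEGER PRIMARY KEY, city_name TEXT);
    CREATE TABLE temp_table (n TEXT, city_name TEXT);
    CREATE TABLE target_table (n TEXT, city INTEGER);
    INSERT INTO cities (city_id, city_name) VALUES (1, 'New York'), (2, 'Boston');
""")

# Step 1: load the CSV "as is" into the staging table (sample data, hypothetical).
csv_data = io.StringIO("Bob,New York\nSteven,Boston\nMarie,Los Angeles\n")
conn.executemany(
    "INSERT INTO temp_table (n, city_name) VALUES (?, ?)",
    csv.reader(csv_data),
)

# Step 2: transform inside the database with INSERT ... SELECT and a join.
conn.execute("""
    INSERT INTO target_table (n, city)
    SELECT t.n, c.city_id
    FROM temp_table t
    JOIN cities c ON c.city_name = t.city_name
""")
conn.commit()

# Rows that made it into the target table.
rows = conn.execute("SELECT n, city FROM target_table ORDER BY n").fetchall()

# Staged rows whose city has no match; the inner join silently skips these.
unmatched = conn.execute("""
    SELECT t.n, t.city_name
    FROM temp_table t
    LEFT JOIN cities c ON c.city_name = t.city_name
    WHERE c.city_id IS NULL
""").fetchall()

print(rows)
print(unmatched)
```

One thing to watch with this approach: a row whose city name is not found in the lookup table simply does not get inserted, so it is worth running a check like the second query before dropping the staging table.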
Nice tip about using Excel, but I also suggest getting comfortable with a scripting language like Python, because for some tasks it's easier to just write a quick Python script to do the job than to hunt for the function you need in Excel or for a pre-made tool that does the job.
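As a rough illustration of the scripting route, here is a short Python sketch that turns name/city rows into the INSERT statement from the question. It assumes a two-column CSV with no header row; the helper names sql_quote and insert_statements are made up for this example, and the sample data is read from a string so the snippet runs on its own.

```python
import csv
import io


def sql_quote(value):
    """Return value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def insert_statements(rows):
    """Yield one INSERT statement per (name, city_name) row, skipping blank lines."""
    for row in rows:
        if not row:
            continue
        name, city_name = row
        yield (
            "INSERT INTO PERSON (ID, NAME, CITY_ID) VALUES ("
            "(SELECT NEW_GUID FROM CREATE_GUID), "
            f"{sql_quote(name.strip())}, "
            f"(SELECT CITY_ID FROM CITY WHERE NAME = {sql_quote(city_name.strip())}));"
        )


# Sample input; the csv module also copes with quoted fields containing commas.
sample = io.StringIO("Bob,New York\nJane,San Francisco\n")
statements = list(insert_statements(csv.reader(sample)))
for statement in statements:
    print(statement)
```

To run it against a real file, replace the io.StringIO sample with open("name-city.csv", newline="") and redirect the output to a .sql file. Doubling the single quotes matters for names like O'Brien, which would otherwise break the generated statement.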
I'd do this with awk.
For example, if you had this information in a CSV file:
Bob,New York
Jane,San Francisco
Steven,Boston
Marie,Los Angeles
The following command, run in the same directory as your CSV file (named name-city.csv in this example), will give you what you want.
$ awk -F, '{ print "INSERT INTO PERSON (ID, NAME, CITY_ID) VALUES ((SELECT NEW_GUID FROM CREATE_GUID), '\''"$1"'\'', (SELECT CITY_ID FROM CITY WHERE NAME = '\''"$2"'\''))" }' name-city.csv
Type awk --help for more information.