Problems when updating an MySQL database, python
In the example below there is two dataframes, df and df1. I also have a MySQL database with two tables. Tables1 and Table2, both has the same columns as df.
What I want to do is:
- Use df1 to locate the correct row in table1,
- Move the row to table2 and update the columns 'REMOVED' (From False to True) and 'REMOVED_DATE (from None to '2022-01-10).
- Remove the row from table1
Below you see my effort. But it dosen't work, and I can imagine there is a better solution available. Any help would be much appreciated.
df=pd.DataFrame({'ID':[25, 10],
'DATE':['2021-11-13', '2021-12-03'],
'LOC': ['NY', 'ML'],
'MAIL':['[email protected]', '[email protected]'],
'NR': ['100000', '200000'],
'REMOVED': [False, False],
'REMOVED_DATE':[None,None]})
df1=pd.DataFrame({'ID':[25],
'REMOVED':[True],
'REMOVED_DATE':['2022-01-11']})
cols = "`,`".join([str(i) for i in df.columns.tolist()]) #use df to mirror the columns in the tables in the database.
for i, row in df1.iterrows():
id=row['ID']
sql=f"SELECT ID FROM table1 WHERE id={id}" # Check if ID is in table1.
cursor.execute(sql)
z=cursor.fetchall()
if not z:
print('insert') # if not, move on.
else: #If ID in, update, move and delete
sql=f"""UPDATE table1 SET (`" +cols + "`) VALUES (" + "%s,"*(len(row)-1) + "%s) WHERE id={id}""" # Update the whole row where ID match.
cursor.execute(sql, tuple(row))
mydb.commit()
sql=f"""INSERT INTO table2 SELECT * FROM table1 WHERE id={id}""" #insert it into table2
cursor.execute(str(sql), tuple(row))
mydb.commit()
sql=f"""DELETE from table1 WHERE id={id}""" # Delete it from table1
cursor.execute(sql)
mydb.commit()
You don't need to update table1
-- there's nothing in the specification that indicates this is needed, and you're going to delete it anyway.
Your code doesn't change REMOVED
and REMOVED_DATE
when it's copying from table1
to table2
. You can do this by replacing SELECT *
with selecting all the columns other than those, and substituting the new values you want in the SELECT
list.
You should also use parametrized queries rather than substituting values with f-strings.
cols = ",".join(f"`{col}`" for col in df.columns.tolist())
new_cols = ",".join('%s' if col in ('REMOVED', 'REMOVED_DATE') else f"`{col}`" for col in df.columns.tolist())
insert_sql=f"""INSERT INTO table2 ({cols}) SELECT {new_cols} FROM table1 WHERE id=%s""" #insert it into table2
delete_sql = """DELETE from table1 WHERE id=%s"""
select_sql = "SELECT ID FROM table1 WHERE id=%s"
for i, row in df1.iterrows():
id=row['ID']
cursor.execute(select_sql, (id,))
row = cursor.fetchone()
if not row:
print('insert') # if not, move on.
else: #If ID in, update, move and delete
cursor.execute(insert_sql, (False, '2022-01-10', id))
cursor.execute(delete_sql, (id,))
mydb.commit()
You should only commit once for each set of operations when moving and deleting, so that they're an atomic set.