Map values and update a Pandas DataFrame
I have a dataframe
that looks like this
df1 = pd.DataFrame({'casting_id': ['V4CA', 'V4CA', 'V4CB', 'V4CL', 'V4CD'],
'ele_1': [3.232, 3.208, 3.178, 3.187, 3.344],
'ele_2': [2.06, 1.926, 1.87, 1.947, 1.816],
'ele_3': [0.717, 0.714, 0.7, 0.72, 0.666],
'ele_4': [0.089, 0.087, 0.084, 0.085, 0.069],
'ele_5': [0.03, 0.042, 0.043, 0.039, 0.072],
'ele_6': [0.958, 0.959, 0.964, 0.968, 0.132],
'ele_7': [0.257, 0.266, 0.272, 0.301, 0.186],
'ele_8': [0.025, 0.033, 0.032, 0.028, 0.062]})
casting_id | ele_1 | ele_2 | ele_3 | ele_4 | ele_5 | ele_6 | ele_7 | ele_8 | |
---|---|---|---|---|---|---|---|---|---|
0 | V4CA | 3.232 | 2.06 | 0.717 | 0.089 | 0.03 | 0.958 | 0.257 | 0.025 |
1 | V4CA | 3.208 | 1.926 | 0.714 | 0.087 | 0.042 | 0.959 | 0.266 | 0.033 |
2 | V4CB | 3.178 | 1.87 | 0.7 | 0.084 | 0.043 | 0.964 | 0.272 | 0.032 |
3 | V4CL | 3.187 | 1.947 | 0.72 | 0.085 | 0.039 | 0.968 | 0.301 | 0.028 |
4 | V4CD | 3.344 | 1.816 | 0.666 | 0.069 | 0.072 | 0.132 | 0.186 | 0.062 |
There is another dataframe which has a standard for each element against the casting_id
df2 = pd.DataFrame({'id': [1, 2, 3, 4, 5],
'ele_id': ['ele_1', 'ele_2', 'ele_3', 'ele_4', 'ele_5'],
'max': [0.8, 0.8, 0.5, 0.3, 0.2],
'casting_id': ['V4CA', 'V4CB', 'V4CC', 'V4CD', 'V4CE'],
'min': [0.5, 0.3, 0.1, 0.1, 0.0]})
id | ele_id | max | casting_id | min |
---|---|---|---|---|
1 | ele_1 | 0.8 | V4CA | 0.5 |
2 | ele_2 | 0.8 | V4CB | 0.3 |
3 | ele_3 | 0.5 | V4CC | 0.1 |
4 | ele_4 | 0.3 | V4CD | 0.1 |
5 | ele_5 | 0.2 | V4CE | 0 |
I want to form a dataframe that uses the mapping and add the color value in each cell.
For example: For V4CA
the min
value is 0.5 and max
is 0.8 for ele_1
.
In main Dataframe, the first row has ele_1
above max, so its value should change to [3.232, 'Red']
Value above max = red, below min = green, between min and max = black.
I have tried something like this:
def func(num, casting_id,ele_id):
try:
# print(ele_id)
if casting_id in [None]:
return [num, "black"]
rec = min_max_df.loc[(min_max_df["casting_id"]==casting_id) & (min_max_df["ele_id"]==ele_id)]
if rec.empty:
return [num, "black"]
_min = rec["min"][0]
_max = rec["max"][0]
color = "black" if _min<num<_max else "green" if num>_max else "red"
return [num, color]
except Exception as e:
return num
And tried df.apply(lambda x: x.apply(func,args=(x.casting_id, x.name, list(x.index))), axis=1)
But I am not getting values as expected.
Solution 1:
Use melt
to flatten your first dataframe then merge
with the second one. Finally, process color with select
and pivot
your new dataframe:
# Step 1: flat and merge
out = df1.melt('casting_id', var_name='ele_id', ignore_index=False).reset_index() \
.merge(df2, on=['casting_id', 'ele_id'], how='left')
# Step 2a: compute color
out['color'] = np.select([out['value'] < out['min'], out['value'] > out['max']],
choicelist=['green', 'red'], default='black')
# Step 2b: create a list from value and color
out['value'] = out[['value', 'color']].apply(list, axis=1)
# Step 3: reshape to your original dataframe
out = out.pivot(['index', 'casting_id'], 'ele_id', 'value') \
.reset_index('casting_id').rename_axis(index=None, columns=None)
Output:
>>> out
casting_id ele_1 ele_2 ele_3 ele_4 ele_5 ele_6 ele_7 ele_8
0 V4CA [3.232, red] [2.06, black] [0.717, black] [0.089, black] [0.03, black] [0.958, black] [0.257, black] [0.025, black]
1 V4CA [3.208, red] [1.926, black] [0.714, black] [0.087, black] [0.042, black] [0.959, black] [0.266, black] [0.033, black]
2 V4CB [3.178, black] [1.87, red] [0.7, black] [0.084, black] [0.043, black] [0.964, black] [0.272, black] [0.032, black]
3 V4CL [3.187, black] [1.947, black] [0.72, black] [0.085, black] [0.039, black] [0.968, black] [0.301, black] [0.028, black]
4 V4CD [3.344, black] [1.816, black] [0.666, black] [0.069, green] [0.072, black] [0.132, black] [0.186, black] [0.062, black]
Setup:
df1 = pd.DataFrame({'casting_id': ['V4CA', 'V4CA', 'V4CB', 'V4CL', 'V4CD'],
'ele_1': [3.232, 3.208, 3.178, 3.187, 3.344],
'ele_2': [2.06, 1.926, 1.87, 1.947, 1.816],
'ele_3': [0.717, 0.714, 0.7, 0.72, 0.666],
'ele_4': [0.089, 0.087, 0.084, 0.085, 0.069],
'ele_5': [0.03, 0.042, 0.043, 0.039, 0.072],
'ele_6': [0.958, 0.959, 0.964, 0.968, 0.132],
'ele_7': [0.257, 0.266, 0.272, 0.301, 0.186],
'ele_8': [0.025, 0.033, 0.032, 0.028, 0.062]})
df2 = pd.DataFrame({'id': [1, 2, 3, 4, 5],
'ele_id': ['ele_1', 'ele_2', 'ele_3', 'ele_4', 'ele_5'],
'max': [0.8, 0.8, 0.5, 0.3, 0.2],
'casting_id': ['V4CA', 'V4CB', 'V4CC', 'V4CD', 'V4CE'],
'min': [0.5, 0.3, 0.1, 0.1, 0.0]})
Solution 2:
Given:
import pandas as pd
df = pd.DataFrame(
columns=['casting_id', 'ele_1', 'ele_2', 'ele_3', 'ele_4', 'ele_5', 'ele_6', 'ele_7', 'ele_8'],
data=[
['V4CA', 3.232, 2.06, 0.717, 0.089, 0.03, 0.958, 0.257, 0.025],
['V4CA', 3.208, 1.926, 0.714, 0.087, 0.042, 0.959, 0.266, 0.033],
['V4CB', 3.178, 1.87, 0.7, 0.084, 0.043, 0.964, 0.272, 0.032],
['V4CL', 3.187, 1.947, 0.72, 0.085, 0.039, 0.968, 0.301, 0.028],
['V4CD', 3.344, 1.816, 0.666, 0.069, 0.072, 0.132, 0.186, 0.062],
]
)
min_max_df = pd.DataFrame(
columns=['id', 'ele_id', 'max', 'casting_id', 'min'],
data=[
[1, 'ele_1', 0.8, 'V4CA', 0.5],
[2, 'ele_2', 0.8, 'V4CB', 0.3],
[3, 'ele_3', 0.5, 'V4CC', 0.1],
[4, 'ele_4', 0.3, 'V4CD', 0.1],
[5, 'ele_5', 0.2, 'V4CE', 0],
]
)
you can solve with a one-liner:
pd.DataFrame(
data=[
list(row[:3]) + [[row[3], "black" if pd.isna(row[4]) or row[6] <= row[3] <= row[5] else ['green', 'red'][row[3] > row[5]]]]
for row in df.reset_index().melt(
id_vars=["casting_id", 'index'],
var_name="ele_id",
value_name="value"
).merge(
min_max_df, left_on=["casting_id", "ele_id"], right_on=["casting_id", "ele_id"], how='left',
).itertuples(name=None, index=False)
],
columns=["casting_id", 'index', "ele_id", 'value'],
).pivot(
values='value', index=["casting_id", 'index'], columns='ele_id'
).reset_index().sort_values('index').drop(columns=['index'])
This is what you get:
casting_id ele_1 ele_2 ele_3 ele_4 ele_5 ele_6 ele_7 ele_8
0 V4CA [3.232, red] [2.06, black] [0.717, black] [0.089, black] [0.03, black] [0.958, black] [0.257, black] [0.025, black]
1 V4CA [3.208, red] [1.926, black] [0.714, black] [0.087, black] [0.042, black] [0.959, black] [0.266, black] [0.033, black]
2 V4CB [3.178, black] [1.87, red] [0.7, black] [0.084, black] [0.043, black] [0.964, black] [0.272, black] [0.032, black]
4 V4CL [3.187, black] [1.947, black] [0.72, black] [0.085, black] [0.039, black] [0.968, black] [0.301, black] [0.028, black]
3 V4CD [3.344, black] [1.816, black] [0.666, black] [0.069, green] [0.072, black] [0.132, black] [0.186, black] [0.062, black]