Map values and update a Pandas DataFrame

I have a dataframe that looks like this

df1 = pd.DataFrame({'casting_id': ['V4CA', 'V4CA', 'V4CB', 'V4CL', 'V4CD'],
                    'ele_1': [3.232, 3.208, 3.178, 3.187, 3.344],
                    'ele_2': [2.06, 1.926, 1.87, 1.947, 1.816],
                    'ele_3': [0.717, 0.714, 0.7, 0.72, 0.666],
                    'ele_4': [0.089, 0.087, 0.084, 0.085, 0.069],
                    'ele_5': [0.03, 0.042, 0.043, 0.039, 0.072],
                    'ele_6': [0.958, 0.959, 0.964, 0.968, 0.132],
                    'ele_7': [0.257, 0.266, 0.272, 0.301, 0.186],
                    'ele_8': [0.025, 0.033, 0.032, 0.028, 0.062]})

	casting_id	ele_1	ele_2	ele_3	ele_4	ele_5	ele_6	ele_7	ele_8
0	V4CA	3.232	2.06	0.717	0.089	0.03	0.958	0.257	0.025
1	V4CA	3.208	1.926	0.714	0.087	0.042	0.959	0.266	0.033
2	V4CB	3.178	1.87	0.7	0.084	0.043	0.964	0.272	0.032
3	V4CL	3.187	1.947	0.72	0.085	0.039	0.968	0.301	0.028
4	V4CD	3.344	1.816	0.666	0.069	0.072	0.132	0.186	0.062

There is another dataframe which has a standard for each element against the casting_id

df2 = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                    'ele_id': ['ele_1', 'ele_2', 'ele_3', 'ele_4', 'ele_5'],
                    'max': [0.8, 0.8, 0.5, 0.3, 0.2],
                    'casting_id': ['V4CA', 'V4CB', 'V4CC', 'V4CD', 'V4CE'],
                    'min': [0.5, 0.3, 0.1, 0.1, 0.0]})

id	ele_id	max	casting_id	min
1	ele_1	0.8	V4CA	0.5
2	ele_2	0.8	V4CB	0.3
3	ele_3	0.5	V4CC	0.1
4	ele_4	0.3	V4CD	0.1
5	ele_5	0.2	V4CE	0

I want to form a dataframe that uses the mapping and add the color value in each cell.

For example: For V4CA the min value is 0.5 and max is 0.8 for ele_1. In main Dataframe, the first row has ele_1 above max, so its value should change to [3.232, 'Red']

Value above max = red, below min = green, between min and max = black.

I have tried something like this:

def func(num, casting_id,ele_id):  
    try:
   
        # print(ele_id)
        if casting_id in [None]:
            return [num, "black"]
        rec = min_max_df.loc[(min_max_df["casting_id"]==casting_id) & (min_max_df["ele_id"]==ele_id)]
        if rec.empty:
            return [num, "black"]
        _min = rec["min"][0]
        _max = rec["max"][0]
        color = "black" if _min<num<_max else "green" if num>_max else "red"
        return [num, color]
    except Exception as e:
        return num

And tried df.apply(lambda x: x.apply(func,args=(x.casting_id, x.name, list(x.index))), axis=1)

But I am not getting values as expected.

Solution 1:

Use melt to flatten your first dataframe then merge with the second one. Finally, process color with select and pivot your new dataframe:

# Step 1: flat and merge
out = df1.melt('casting_id', var_name='ele_id', ignore_index=False).reset_index() \
         .merge(df2, on=['casting_id', 'ele_id'], how='left')

# Step 2a: compute color
out['color'] = np.select([out['value'] < out['min'], out['value'] > out['max']], 
                         choicelist=['green', 'red'], default='black')

# Step 2b: create a list from value and color
out['value'] = out[['value', 'color']].apply(list, axis=1)

# Step 3: reshape to your original dataframe
out = out.pivot(['index', 'casting_id'], 'ele_id', 'value') \
         .reset_index('casting_id').rename_axis(index=None, columns=None)

Output:

>>> out
  casting_id           ele_1           ele_2           ele_3           ele_4           ele_5           ele_6           ele_7           ele_8
0       V4CA    [3.232, red]   [2.06, black]  [0.717, black]  [0.089, black]   [0.03, black]  [0.958, black]  [0.257, black]  [0.025, black]
1       V4CA    [3.208, red]  [1.926, black]  [0.714, black]  [0.087, black]  [0.042, black]  [0.959, black]  [0.266, black]  [0.033, black]
2       V4CB  [3.178, black]     [1.87, red]    [0.7, black]  [0.084, black]  [0.043, black]  [0.964, black]  [0.272, black]  [0.032, black]
3       V4CL  [3.187, black]  [1.947, black]   [0.72, black]  [0.085, black]  [0.039, black]  [0.968, black]  [0.301, black]  [0.028, black]
4       V4CD  [3.344, black]  [1.816, black]  [0.666, black]  [0.069, green]  [0.072, black]  [0.132, black]  [0.186, black]  [0.062, black]

Setup:

df1 = pd.DataFrame({'casting_id': ['V4CA', 'V4CA', 'V4CB', 'V4CL', 'V4CD'],
                    'ele_1': [3.232, 3.208, 3.178, 3.187, 3.344],
                    'ele_2': [2.06, 1.926, 1.87, 1.947, 1.816],
                    'ele_3': [0.717, 0.714, 0.7, 0.72, 0.666],
                    'ele_4': [0.089, 0.087, 0.084, 0.085, 0.069],
                    'ele_5': [0.03, 0.042, 0.043, 0.039, 0.072],
                    'ele_6': [0.958, 0.959, 0.964, 0.968, 0.132],
                    'ele_7': [0.257, 0.266, 0.272, 0.301, 0.186],
                    'ele_8': [0.025, 0.033, 0.032, 0.028, 0.062]})

df2 = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                    'ele_id': ['ele_1', 'ele_2', 'ele_3', 'ele_4', 'ele_5'],
                    'max': [0.8, 0.8, 0.5, 0.3, 0.2],
                    'casting_id': ['V4CA', 'V4CB', 'V4CC', 'V4CD', 'V4CE'],
                    'min': [0.5, 0.3, 0.1, 0.1, 0.0]})

Solution 2:

Given:

import pandas as pd

df = pd.DataFrame(
    columns=['casting_id', 'ele_1', 'ele_2', 'ele_3', 'ele_4', 'ele_5', 'ele_6', 'ele_7', 'ele_8'],
    data=[
        ['V4CA', 3.232, 2.06, 0.717, 0.089, 0.03, 0.958, 0.257, 0.025],
        ['V4CA', 3.208, 1.926, 0.714, 0.087, 0.042, 0.959, 0.266, 0.033],
        ['V4CB', 3.178, 1.87, 0.7, 0.084, 0.043, 0.964, 0.272, 0.032],
        ['V4CL', 3.187, 1.947, 0.72, 0.085, 0.039, 0.968, 0.301, 0.028],
        ['V4CD', 3.344, 1.816, 0.666, 0.069, 0.072, 0.132, 0.186, 0.062],
    ]
)

min_max_df = pd.DataFrame(
    columns=['id', 'ele_id', 'max', 'casting_id', 'min'],
    data=[
        [1, 'ele_1', 0.8, 'V4CA', 0.5],
        [2, 'ele_2', 0.8, 'V4CB', 0.3],
        [3, 'ele_3', 0.5, 'V4CC', 0.1],
        [4, 'ele_4', 0.3, 'V4CD', 0.1],
        [5, 'ele_5', 0.2, 'V4CE', 0],
        ]
)

you can solve with a one-liner:

pd.DataFrame(
    data=[
        list(row[:3]) + [[row[3], "black" if pd.isna(row[4]) or row[6] <= row[3] <= row[5] else ['green', 'red'][row[3] > row[5]]]]
        for row in df.reset_index().melt(
            id_vars=["casting_id", 'index'],
            var_name="ele_id",
            value_name="value"
        ).merge(
            min_max_df, left_on=["casting_id", "ele_id"], right_on=["casting_id", "ele_id"], how='left',
        ).itertuples(name=None, index=False)
    ],
    columns=["casting_id", 'index', "ele_id", 'value'],
).pivot(
    values='value', index=["casting_id", 'index'], columns='ele_id'
).reset_index().sort_values('index').drop(columns=['index'])

This is what you get:

       casting_id           ele_1           ele_2           ele_3           ele_4           ele_5           ele_6           ele_7           ele_8
0            V4CA    [3.232, red]   [2.06, black]  [0.717, black]  [0.089, black]   [0.03, black]  [0.958, black]  [0.257, black]  [0.025, black]
1            V4CA    [3.208, red]  [1.926, black]  [0.714, black]  [0.087, black]  [0.042, black]  [0.959, black]  [0.266, black]  [0.033, black]
2            V4CB  [3.178, black]     [1.87, red]    [0.7, black]  [0.084, black]  [0.043, black]  [0.964, black]  [0.272, black]  [0.032, black]
4            V4CL  [3.187, black]  [1.947, black]   [0.72, black]  [0.085, black]  [0.039, black]  [0.968, black]  [0.301, black]  [0.028, black]
3            V4CD  [3.344, black]  [1.816, black]  [0.666, black]  [0.069, green]  [0.072, black]  [0.132, black]  [0.186, black]  [0.062, black]

PyMC3/Arviz: CDF value from trace

Mocking a script with no `if __name__=="__main__":` block

Can some One explain && and || operators for the following code please? [duplicate]

Can I reduce size of array in CUDA

Module not found, did you mean "*js"?

Warning: Different store and key passwords not supported for PKCS12 KeyStores. Ignoring user-specified -keypass value. in Android Studio

How to determine the exact lib/include variable generated by find_packge() in config mode? [duplicate]

Why we should use a URL for the Go module name?

How to place independent groups of radio buttons on same row and place well panels around the groups in order to delineate them?

Format string with dictionary in C#

Window Function not working in HackerRank MySQL-8.0.20? [duplicate]

count, print out and delete rows that do not contain a digit