Python: pandas: match row value to column name/ key's value
Problem
"How to match a row value to a column name and take that intersecting value in pandas"
Context
We have a pandas df like this:
df = pd.DataFrame([{'name': 'john', 'john': 1, 'mac': 10}, {'name': 'mac', 'john': 2, 'mac': 20}], columns=["name", "john", "mac"])
Looking like this:
name | john | mac
john | 1 | 10
mac | 2 | 20
Desired output
name | john | mac | value
john | 1 | 10 | 1
mac | 2 | 20 | 20
In words, the column "value"
should take the number from the corresponding column where name intersects.
So, if name == 'john'
, then take the value from column 'john'
So, if name == 'mac'
, then take the value from column 'mac'
Tried so far
Bunch of lambdas (none successful).
Specifications
Python: 3.5.2
Pandas: 0.18.1
Solution 1:
Deprecation Notice:
lookup
was deprecated in v1.2.0
You could use DataFrame.lookup
, which accepts the row and column labels to use:
In [66]: df
Out[66]:
name john mac
0 john 1 10
1 mac 2 20
In [67]: df["value"] = df.lookup(df.index, df.name)
In [68]: df
Out[68]:
name john mac value
0 john 1 10 1
1 mac 2 20 20
Note that this will have problems with duplicate row labels (which could be trivially worked around by adding a reset_index). It should be faster than calling apply
, which can be pretty slow, but if your frames aren't too large both should work well enough.
Solution 2:
well imo lambda is the way to go, but you can make it very short such has:
df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df = df.set_index('name')
df
Out[64]:
john mac
name
john 5 10
mac 10 15
df['values'] = df.apply(lambda x: x[x.name], axis=1)
In[68]: df
Out[68]:
john mac values
name
john 5 10 5
mac 10 15 15
I did set the index to name for convinience but you could do it without it such has:
df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df['values'] = df.apply(lambda x: x[x['name']], axis=1)
df
Out[71]:
name john mac values
0 john 5 10 5
1 mac 10 15 15