Remove when 2 columns are duplicated, but keep based on value of a third column (pandas)
Solution 1:
You could use duplicated(keep='last') to flag the last occurrence of each (Product No., Barcode) pair, then extend that selection to every row of the same Input ID group using groupby + transform('any'):
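# True only on the last row of each (Product No., Barcode) pair
is_last = ~df[['Product No.', 'Barcode']].duplicated(keep='last')
# keep every row whose Input ID group contains at least one such row
df[is_last.groupby(df['Input ID']).transform('any')]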
output:
    Input ID  Barcode  Product No.
3          2      225          111
4          2      225          111
5          2      225          111
6          2      225          111
9          4      226          222
10         4      226          222
13         6      227          222
14         6      227          222
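
To try this end to end, here is a minimal, self-contained sketch. The input frame is an assumption: the original data is not shown in this answer, so the rows below were chosen only to be consistent with the output above. The names is_last, keep and result are illustrative.

```python
import pandas as pd

# Hypothetical input, chosen only to be consistent with the output shown above.
df = pd.DataFrame({
    "Input ID": [1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "Barcode": [225] * 7 + [226] * 4 + [227] * 4,
    "Product No.": [111] * 7 + [222] * 8,
})

# True on the last row of each (Product No., Barcode) pair.
is_last = ~df[["Product No.", "Barcode"]].duplicated(keep="last")

# Spread that flag over the whole Input ID group: a group is kept
# if any of its rows is the last occurrence of its pair.
keep = is_last.groupby(df["Input ID"]).transform("any")

result = df[keep]
print(result)
```

With this assumed input, the Input ID groups 1, 3 and 5 are dropped, because a later group repeats the same (Product No., Barcode) pair, and groups 2, 4 and 6 are kept in full.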