How to filter column names from multiindex dataframe for a specific condition?


import pandas as pd

df1 = pd.DataFrame(
{
    "empid" : [1,2,3,4,5,6],
    "empname" : ['a', 'b','c','d','e','f'],
    "empcity" : ['aa','bb','cc','dd','ee','ff']
})
df1

df2 = pd.DataFrame(
{
    "empid" : [1,2,3,4,5,6],
    "empname" : ['a', 'b','m','d','n','f'],
    "empcity" : ['aa','bb','cc','ddd','ee','fff']
})
df2

df_all = pd.concat([df1.set_index('empid'),df2.set_index('empid')],axis='columns',keys=['first','second'])
df_all

df_final = df_all.swaplevel(axis = 'columns')[df1.columns[1:]]
df_final

orig = df1.columns[1:].tolist()
print (orig)
['empname', 'empcity']

df_final = (df_all.stack()
                  .assign(comparions=lambda x: x['first'].eq(x['second']))
                  .unstack()
                  .swaplevel(axis = 'columns')
                  .reindex(orig, axis=1, level=0))
print (df_final)

How can I get the list of level-0 column names for which the comparions column contains False in df_final? (Assume there are more than 300 level-0 columns like this.)


Solution 1:

First, select the comparions columns from the second level with DataFrame.xs, then use DataFrame.all to test whether every value in each of them is True:

s = df_final.xs('comparions', level=1, axis=1).all()
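To see what this step returns in isolation, here is a minimal sketch on a hypothetical two-row frame (the `toy` data is made up for illustration and is not the df_final from the question). It assumes the same column layout: original column names on level 0 and a `comparions` flag on level 1.

```python
import pandas as pd

# Hypothetical toy frame; tuple keys become a two-level column MultiIndex
toy = pd.DataFrame({
    ("empname", "first"): ["a", "b"],
    ("empname", "comparions"): [True, False],
    ("empcity", "first"): ["aa", "bb"],
    ("empcity", "comparions"): [True, True],
})

# Keep only the 'comparions' columns; level 1 is dropped from the result
flags = toy.xs("comparions", level=1, axis=1)

# One boolean per level-0 name: True only if every row matched
all_true = flags.all()
print(all_true)
```

Here `empname` comes out as False because one of its rows differs, while `empcity` comes out as True.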

Then invert the mask so that it marks the columns with at least one False, and use it to filter the index:

L = s.index[~s].tolist()
print (L)
['empname', 'empcity']
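If only the list of differing columns is needed, the stack/unstack round trip can probably be skipped by comparing the two halves of df_all directly. The following is a sketch of that alternative, not the approach used above; the names `equal`, `all_equal` and `mismatched` are introduced here for illustration, and it assumes both frames share the same index and column names.

```python
import pandas as pd

df1 = pd.DataFrame({
    "empid": [1, 2, 3, 4, 5, 6],
    "empname": ["a", "b", "c", "d", "e", "f"],
    "empcity": ["aa", "bb", "cc", "dd", "ee", "ff"],
})
df2 = pd.DataFrame({
    "empid": [1, 2, 3, 4, 5, 6],
    "empname": ["a", "b", "m", "d", "n", "f"],
    "empcity": ["aa", "bb", "cc", "ddd", "ee", "fff"],
})

df_all = pd.concat(
    [df1.set_index("empid"), df2.set_index("empid")],
    axis="columns",
    keys=["first", "second"],
)

# Element-wise comparison of the two level-0 blocks
equal = df_all["first"].eq(df_all["second"])

# True for columns where every row matches
all_equal = equal.all()

# Names of columns with at least one mismatch
mismatched = all_equal[~all_equal].index.tolist()
print(mismatched)
```

With the sample data from the question, both empname and empcity contain at least one differing row, so both names are returned.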
