Lambda function for creating two different columns in pandas dataframes

I have a pandas DataFrame with a text column containing HTML, from which I want to derive two fields: a count of the tags it contains, and the clean text with all tags removed. I am using BeautifulSoup for both. For example:

df_ads['content_elements_cnt'] = df_ads['content'].apply(lambda x: dict(Counter([element.name for element in BeautifulSoup(x).html if element.name != None])))
df_ads['content_refined'] = df_ads['content'].apply(lambda x : BeautifulSoup(x).text)

Is it possible to encapsulate the two statements above in a single function and call it through apply to generate both columns? I want to instantiate BeautifulSoup and loop over the data only once. In other words, is there an efficient way to do these two operations together?


Solution 1:

You could use a helper function and return a Series:

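from collections import Counter
import pandas as pd
from bs4 import BeautifulSoup

def bs_extract(x):
    # Parse once and reuse the same soup for both derived values
    soup = BeautifulSoup(x)
    return pd.Series({'content_elements_cnt': dict(Counter([element.name for element in soup.html if element.name is not None])),
                      'content_refined': soup.text})

# apply returns a DataFrame here, with one column per key of the returned Series
df_ads[['content_elements_cnt', 'content_refined']] = df_ads['content'].apply(bs_extract)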

NB: the code is untested (no input was provided).
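
Since no input was provided, here is a minimal runnable sketch of the same idea on made-up sample data. It makes a few assumptions that are not in the question: the parser is set explicitly to "html.parser" (bs4 warns when none is given), the tags are counted with find_all(True) on the guess that you want every tag at any depth (iterating soup.html only visits the direct children of the html element), and the helper returns a plain tuple instead of a Series, which avoids building one Series per row. The name bs_extract_tuple and the sample rows are purely illustrative.

```python
from collections import Counter

import pandas as pd
from bs4 import BeautifulSoup


def bs_extract_tuple(html):
    # Hypothetical helper: parse once, return (tag counts, plain text).
    soup = BeautifulSoup(html, "html.parser")
    # find_all(True) matches every tag at any depth, not only direct children.
    tag_counts = dict(Counter(tag.name for tag in soup.find_all(True)))
    return tag_counts, soup.get_text()


# Made-up sample data, for illustration only.
df_ads = pd.DataFrame({"content": ["<p>Hello <b>world</b></p>",
                                   "<div><p>a</p><p>b</p></div>"]})

# zip(*...) transposes the per-row tuples into one sequence per column.
df_ads["content_elements_cnt"], df_ads["content_refined"] = zip(
    *df_ads["content"].map(bs_extract_tuple)
)

print(df_ads)
```

Either way, each row is parsed only once. Whether the tuple variant is noticeably faster than returning a Series depends on your data, so it is worth timing both on a sample.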