Unexpected behavior by str_remove_* in stringr package

Solution 1:

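str_remove() treats its pattern as a regular expression by default, so the parentheses in your current_house values are interpreted as regex groups rather than literal characters, and those rows are left unchanged. Wrap the pattern in stringr::fixed() to match it literally: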

library(data.table)
library(stringr)

setDT(df)  # convert df to a data.table by reference
df[, temp2 := str_remove(temp, current_house)           # initial, not working
  ][, temp3 := str_remove(temp, fixed(current_house))   # working
  ][]
#                                        temp                           current_house                                   temp2   temp3
#                                      <char>                                  <char>                                  <char>  <char>
#  1:                            Lazard 528 2                                  Lazard                                   528 2   528 2
#  2:                              KPMG 525 1                                    KPMG                                   525 1   525 1
#  3:                              KPMG 525 1                                    KPMG                                   525 1   525 1
#  4:                              KPMG 524 4                                    KPMG                                   524 4   524 4
#  5:                              KPMG 524 4                                    KPMG                                   524 4   524 4
#  6:                              KPMG 524 4                                    KPMG                                   524 4   524 4
#  7:                              KPMG 524 4                                    KPMG                                   524 4   524 4
#  8: Development and Investment Bank of T... Development and Investment Bank of T... Development and Investment Bank of T...   524 4
#  9: Development and Investment Bank of T... Development and Investment Bank of T... Development and Investment Bank of T...   524 4
# 10: Development and Investment Bank of T... Development and Investment Bank of T... Development and Investment Bank of T...   524 4
# ---                                                                                                                                
# 31:            LionTree Advisors, LLC 500 1                  LionTree Advisors, LLC                                   500 1   500 1
# 32:       Ping'an Securities Co.,Ltd. 496 1             Ping'an Securities Co.,Ltd.                                   496 1   496 1
# 33:       Ping'an Securities Co.,Ltd. 496 1             Ping'an Securities Co.,Ltd.                                   496 1   496 1
# 34:       Ping'an Securities Co.,Ltd. 496 1             Ping'an Securities Co.,Ltd.                                   496 1   496 1
# 35:       Ping'an Securities Co.,Ltd. 496 1             Ping'an Securities Co.,Ltd.                                   496 1   496 1
# 36:       Ping'an Securities Co.,Ltd. 496 1             Ping'an Securities Co.,Ltd.                                   496 1   496 1
# 37:       Ping'an Securities Co.,Ltd. 496 1             Ping'an Securities Co.,Ltd.                                   496 1   496 1
# 38:    Guotai Junan Securities Co Ltd 496 1          Guotai Junan Securities Co Ltd                                   496 1   496 1
# 39:    Guotai Junan Securities Co Ltd 496 1          Guotai Junan Securities Co Ltd                                   496 1   496 1
# 40:                               EY 493 16                                      EY                                  493 16  493 16
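To see the mechanism in isolation, here is a minimal sketch on a single string. The values `house` and `x` are hypothetical (the full house names are truncated in the output above), chosen only because the name contains parentheses:

```r
library(stringr)

# Hypothetical values, not taken from the question's data
house <- "Acme Bank (AB)"
x <- "Acme Bank (AB) 524 4"

# Regex interpretation: "(AB)" is a group that matches "AB", so the
# pattern looks for "Acme Bank AB", which does not occur in x.
regex_result <- str_remove(x, house)

# fixed(): every character of house, parentheses included, is literal.
fixed_result <- str_remove(x, fixed(house))

print(regex_result)
print(fixed_result)
```

With the regex interpretation nothing is removed, while the fixed() version strips the name and leaves the trailing part with its leading space.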

You might also want to wrap str_remove() in trimws(), since removing the name leaves a leading space in temp3:

head(df$temp3)
# [1] " 528 2" " 525 1" " 525 1" " 524 4" " 524 4" " 524 4"

df[, temp3 := trimws(str_remove(temp, fixed(current_house)))]
head(df$temp3)
# [1] "528 2" "525 1" "525 1" "524 4" "524 4" "524 4"
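Since df itself is not shown here, the following is a small self-contained sketch of the same pipeline. The table `dt` and its two rows are made up for illustration; the point is that fixed() accepts a vector, so each row's temp is matched against that row's current_house:

```r
library(data.table)
library(stringr)

# Hypothetical rows; the second house name contains parentheses
dt <- data.table(
  temp          = c("KPMG 525 1", "Acme Bank (AB) 524 4"),
  current_house = c("KPMG", "Acme Bank (AB)")
)

# Literal, row-wise removal followed by whitespace trimming
dt[, temp3 := trimws(str_remove(temp, fixed(current_house)))]

print(dt$temp3)
```

Both rows should end up with only the trailing numbers, regardless of whether the house name contains regex metacharacters.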