Find length and placement of alphabet characters in character string R

I have a dataframe composed of 3 columns:

df <- data.frame(column1 = c("ab 34, 35, 36", "cb 23", "df 45, gh 46", "gh 21"),
                 column2 = c("ID_27", "ID_28", "ID_29", "ID_30"),
                 column3 = c("area51", "area52", "area53", "area54"))
> df
        column1 column2 column3
1 ab 34, 35, 36   ID_27  area51
2         cb 23   ID_28  area52
3  df 45, gh 46   ID_29  area53
4         gh 21   ID_30  area54

I need to identify the length of the letter prefixes in column1. This means being able to identify and filter rows similar to row 3 here which contains more than two letters, but in a large data set. Eg (output after filter).

column1 column2 column3
3 df 45, gh 46   ID_29  area53

Ideally I would also like to get the position of the first letter in each prefix. This could be achieved by putting the length and position values into two additional but separate columns in the dataframe, or by filtering it. Either way which achieves the same result is fine. Both tidyverse and older approaches welcome.

If anyone wants to go further, the idea is to attach the two letter prefix to the number codes which follow it, up until the letter prefix changes like the example above. The following step is to then paste or copy these new values in separate rows while simultaneously duplicating the values in the other columns.


df <- data.frame(column1 = c("ab 34, 35, 36", "cb 23", "df 45, gh 46", "gh 21"),
                 column2 = c("ID_27", "ID_28", "ID_29", "ID_30"),
                 column3 = c("area51", "area52", "area53", "area54"))

library(dplyr)
library(stringr)

df %>% filter(str_count(column1, "[a-z]+") > 1)

# A tibble: 1 x 3
# Groups:   column3 [1]
  column1      column2 column3
  <fct>        <fct>   <fct>  
1 df 45, gh 46 ID_29   area53 

Here's a solution that gets you pretty far toward your ultimate goal: "if anyone wants to go further, the idea is to attach the two letter prefix to the number codes which follow it, up until the letter prefix changes like the example above. The following step is to then paste or copy these new values in separate rows while simultaneously duplicating the values in the other columns."

library(dplyr)
library(stringr)
library(tidyr)
df %>% 
  mutate(
    # isolate the prefix in a list:
    prefix = str_extract_all(column1,"[a-z]+")) %>%
  # cast the list values in `prefix` into long format:
  unnest_longer(prefix) %>%
    # extract the digits in a list:
  mutate(
    digits = str_extract_all(column1,"\\d+")) %>% 
  # cast the list values in `digits` into long format:
  unnest_longer(digits) %>%
  # paste `prefix`, `digits` together:
  mutate(column1 = paste0(prefix, digits)) %>%
  # remove obsolete columns:
  select(-c(prefix, digits))
# A tibble: 9 × 3
  column1 column2 column3
  <chr>   <chr>   <chr>  
1 ab34    ID_27   area51 
2 ab35    ID_27   area51 
3 ab36    ID_27   area51 
4 cb23    ID_28   area52 
5 df45    ID_29   area53 
6 df46    ID_29   area53 
7 gh45    ID_29   area53 
8 gh46    ID_29   area53 
9 gh21    ID_30   area54