Structure of variables not recognised when dataframe is a tibble

I have written a function that checks the type of an input variable and then computes descriptive statistics accordingly: mean and sd for numeric variables, frequencies and proportions for factors.

However, when the data frame is a tibble, the method I use to identify the type of the variable doesn't seem to work. Here is some toy data:

library(dplyr)  # provides tibble() and glimpse()
set.seed(123)
df <- tibble(a = round(rnorm(5),1),
             b = factor(letters[1:5]))
glimpse(df)

# Output
# Rows: 5
# Columns: 2
# $ a <dbl> -0.6, -0.2, 1.6, 0.1, 0.1
# $ b <fct> a, b, c, d, e

Now if we ask R what type each column is, using the is.*() family of functions, both checks return FALSE:

is.numeric(df[,"a"])
# [1] FALSE

is.factor(df[,"b"])
# [1] FALSE

But if we convert the tibble to a plain data.frame, the columns are identified correctly:

df <- as.data.frame(df)

is.numeric(df[,"a"])
# [1] TRUE

is.factor(df[,"b"])
# [1] TRUE

Of course I could just convert the tibble to a data.frame inside my function, but I am curious how to get the same result directly from the tibble, or whether there is some equivalent workaround.


The answer is to use [[ to extract the column, which gives consistent results for both a tibble and a data.frame. To tell the two apart, let's call the tibble df_tib and the data.frame df_dat.

df_tib <- df              # df as first created in the question, i.e. still a tibble
df_dat <- data.frame(df)  # plain data.frame copy

is.numeric(df_tib[['a']])
#[1] TRUE
is.numeric(df_dat[['a']])
#[1] TRUE

is.factor(df_tib[['b']])
#[1] TRUE
is.factor(df_dat[['b']])
#[1] TRUE
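
To connect this back to the function described in the question, here is a minimal sketch of a type-dependent summary built on [[. The name describe_var() and its output layout are hypothetical (the original function was not posted), and the toy data is a plain data.frame so the example runs in base R; because [[ extracts the column itself, the same call should behave identically on a tibble.

```r
# Hypothetical helper, not from the original post: summarise one column
# according to its type, extracting it with [[ so that tibbles and
# data.frames are treated the same way.
describe_var <- function(data, var) {
  x <- data[[var]]  # [[ returns the column vector, never a one-column table
  if (is.numeric(x)) {
    c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))
  } else if (is.factor(x)) {
    counts <- table(x)
    data.frame(level = names(counts),
               n     = as.vector(counts),
               prop  = as.vector(counts) / sum(counts))
  } else {
    stop("unsupported column type: ", class(x)[1])
  }
}

# Same values as the toy data in the question
toy <- data.frame(a = c(-0.6, -0.2, 1.6, 0.1, 0.1),
                  b = factor(letters[1:5]))

num_summary <- describe_var(toy, "a")  # named vector with mean and sd
fct_summary <- describe_var(toy, "b")  # one row per factor level
```

Stopping with an error for other column types is just one possible design choice; returning NULL or a generic summary would work equally well.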

The issue arises because a data.frame and a tibble behave differently when subset with [.

df_tib[, 'a']

# A tibble: 5 x 1
#      a
#  <dbl>
#1  -0.6
#2  -0.2
#3   1.6
#4   0.1
#5   0.1

df_dat[, 'a']
#[1] -0.6 -0.2  1.6  0.1  0.1

Subsetting df_tib with [ always returns a tibble, whereas subsetting df_dat with [ and a single column drops the result to a vector by default. is.factor() and is.numeric() return FALSE when given a whole data.frame or tibble, regardless of what its columns contain.
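
The dropping behaviour can be reproduced in base R alone. The sketch below uses a plain data.frame with the same values as the toy data (the names toy and one_col are illustrative): passing drop = FALSE keeps the one-column data.frame, which mimics what a tibble returns from [, and the type checks then fail in the same way. It also shows one way to test every column at once with vapply().

```r
toy <- data.frame(a = c(-0.6, -0.2, 1.6, 0.1, 0.1),
                  b = factor(letters[1:5]))

# Default: a single selected column is dropped to a bare vector
dropped <- toy[, "a"]
is.numeric(dropped)      # TRUE

# drop = FALSE keeps the data.frame wrapper, like subsetting a tibble with [
one_col <- toy[, "a", drop = FALSE]
is.data.frame(one_col)   # TRUE
is.numeric(one_col)      # FALSE: the object is a data.frame, not a numeric vector

# Checking all columns at once; each element of a data.frame is a column
col_is_numeric <- vapply(toy, is.numeric, logical(1))
col_is_factor  <- vapply(toy, is.factor,  logical(1))
```

Conversely, a tibble should give a bare vector if you ask for it explicitly with df_tib[, "a", drop = TRUE], although [[ is usually the clearer choice.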