Modifying Contingency Tables

I am using R.

With the following data:

set.seed(123)

v1 <- c("2010-2011","2011-2012", "2012-2013", "2013-2014", "2014-2015") 
v2 <- c("A", "B", "C", "D", "E")
v3 <- c("Z", "Y", "X", "W" )

data_1 = data.frame(var_1 = rnorm(871, 10,10), var_2 = rnorm(871, 5,5))

data_1$dates <- as.factor(sample(v1, 871, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

data_1$types <- as.factor(sample(v2, 871, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

data_1$types2 <- as.factor(sample(v3, 871, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))


data_2 = data.frame(var_1 = rnorm(412, 10,10), var_2 = rnorm(412, 5,5))

data_2$dates <- as.factor(sample(v1, 412, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

data_2$types <- as.factor(sample(v2, 412, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

data_2$types2 <- as.factor(sample(v3, 412, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))

data_3 = data.frame(var_1 = rnorm(332, 10,10), var_2 = rnorm(332, 5,5))

data_3$dates <- as.factor(sample(v1, 332, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))

data_3$types <- as.factor(sample(v2, 332, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))

data_3$types2 <- as.factor(sample(v3, 332, replace=TRUE, prob=c(0.3, 0.5, 0.1, 0.1)))

I then combined them all into a single data frame ("problem_data"):

data_1 <- data.frame(name="data_1", data_1)
data_2 <- data.frame(name="data_2", data_2)
data_3 <- data.frame(name="data_3", data_3)

problem_data <- rbind(data_1, data_2, data_3)

I then made the following contingency table:

library(memisc) 
summary <- xtabs(~dates+name+types+types2, problem_data)
t = ftable(summary, row.vars=1, col.vars=2:4)
show_html(t)

enter image description here

Using only the xtabs() and the ftable() commands, is it possible to modify the above contingency table so that it looks like this?

  • data_1 : 2010-2011","2011-2012", "2012-2013", "2013-2014", "2014-2015"
  • data_2: 2010-2011","2011-2012", "2012-2013", "2013-2014", "2014-2015"
  • etc

enter image description here

I have tried different permutations within the xtabs() command:

# permutation 1
summary1 <- xtabs(~dates+name+types+types2, problem_data)
t1 = ftable(summary1, row.vars=1, col.vars=2:4)
show_html(t1)

# permutation 2
summary2 <- xtabs(~name+dates+types+types2, problem_data)
t2 = ftable(summary2, row.vars=1, col.vars=2:4)
show_html(t2)

# permutation 3
summary3 <- xtabs(~types+name+dates+types2, problem_data)
t3 = ftable(summary3, row.vars=1, col.vars=2:4)
show_html(t3)

# permutation 4
summary4 <- xtabs(~types2+dates+name+types2, problem_data)
t4 = ftable(summary4, row.vars=1, col.vars=2:4)
show_html(t4)

But so far, nothing seems to be working.

Can someone please show me how to do this?

Thanks!


The degree of customization allowed by the memisc package is not enough for your requirements. For this kind of tasks, I would suggest using packages like flextable. Here is the code

library(flextable)
library(dplyr)
library(tidyr)

x <- ftable(xtabs(~name+dates+types+types2, problem_data), row.vars = 1:2, col.vars = 3:4)

ft <- as.data.frame(x) |> 
  arrange(name, dates, types, types2) |> 
  pivot_wider(names_from = c(types, types2), values_from = Freq)

header <- names(ft)
header1 <- replace(sub("([^_]+)_([^_]+)", "\\1", header), 1:2, c("", "types"))
names(header1) <- header
header2 <- replace(sub("([^_]+)_([^_]+)", "\\2", header), 2L, "types2")
names(header2) <- header

flextable(ft) |> 
  merge_v(j = "name") |> # create a merged first column
  delete_part("header") |> # remove the old header
  add_header(values = header2) |> 
  add_header(values = header1) |> # recreate headers
  merge_h(part = "header") |> # merge accordingly
  align(align = "left", part = "all") |> 
  hline_top(border = officer::fp_border(width = 1L), part = "header") |> 
  hline_bottom(border = officer::fp_border(width = 1L), part = "header") |> 
  hline_bottom(border = officer::fp_border(width = 1L), part = "body") |> 
  fix_border_issues() |> 
  fit_to_width(15L) |> # set the table width to your desired one. I use 15 inches for demonstration.
  save_as_html(path = "test.html")

The output ("test.html") looks like this

result

If using memisc, AFAIK, this is the best result you can get

x <- ftable(xtabs(~name+dates+types+types2, problem_data), row.vars = 1:2, col.vars = 3:4)
memisc::show_html(x)

memisc


You need to set one attribute, see: https://adv-r.hadley.nz/vectors-chap.html?q=attr()#getting-and-setting

attr(t, "col.vars")$name <- c(
    "data 1: along list of years",
    "data 2: another one",
    "data 3: yadada"
)

Editing to provide a bit more on attributes.

In R you build object on top of vector with the help of attributes.

See this example:

df <- data.frame( x = 1:2,
                  y = LETTERS[1:2])
attributes(df)

When you do names(df) this very close of doing attr(df, "names").

names(df) is what you can call a getting function it allow you to get a value from an object.

if you do:

names(df) <- c("foo", "bar")

you are just changing the names but not making any kind of other modification in the object, this is called a setting. I am probably getting the terminology a bit off but this is the idea you should get.

After that when you want to understand how an object is made in R you can use attributes and structure to get an idea. The documentation on ftable also helped here (https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/ftable). See:

This information is then re-arranged as a matrix whose rows and columns correspond to unique combinations of the levels of the row and column variables (as specified by row.vars and col.vars, respectively).

After that you just need to call the attr(object, "name_of_attributes"). Here it give us a list with 3 vectors (name, types and types2). I assume you can do more complex stuff with the function (loike types3, types4 etc..). Last was just doing the same logic with names() to just "set" the modification you wanted.

You can probably fine tune it way more!