How to create a circular bar plot using ggplot

I am trying to create a circular bar plot from my data, but I have not even managed to organize the data frame for it. I have 121 seed analyses from 3 different years (column named campana) and from 4 regions of a province (column named zona). I would like to make a graph like the one in the image, with the different areas (zonas) in place of the letters A-B-C-D and stacked bars showing the incidence (frequency) of each fungus in each year. Each fungus frequency is in a column named after the fungus, such as A.padwickii, Microdochium, Bipolaris, etc. I don't expect you to do all the work, just the first step of helping me organize the data correctly.

tibble::tribble(
     ~muestra_n, ~campana, ~zona, ~inc_pan, ~percent_gran_m_pan, ~peso_pan, ~percent_perdida_peso, ~severidad, ~phoma, ~microdochium, ~a_padwickii, ~bipolaris, ~curvularia, ~fusarium, ~exserohilum,
             16,        2,     2,       98,                14.6,      2.85,      9.90007401924501,        0.1,      0,            11,            0,          0,           0,         0,            0,
             18,        2,     2,     97.1,                 7.8,      1.34,      29.9991559044484,       0.44,     13,             0,            0,          0,           0,         0,            0,
             19,        2,     2,      100,                14.3,      1.35,    0.0288073746879383,       0.66,      0,            16,            0,          0,          18,         0,            0,
             20,        2,     1,      100,                 9.6,      1.49,      1.64930877284015,       0.09,      0,            12,            1,          2,          44,         0,            0,
             21,        2,     1,      100,                17.9,      3.04,      1.46085792877853,       0.56,      0,             9,            0,          0,           1,         0,            0,
             22,        2,     1,      100,                37.4,       2.1,      5.60829881602581,        0.6,     41,             8,            0,          0,           1,         0,            1
     )

[Image: example circular bar plot with stacked bars, grouped under the letters A-B-C-D]


Solution 1:

There is not enough information to know for sure, but hopefully a few tips will move you in the right direction. In my experience, polar plots need a lot of custom label positioning that depends on where each label fits rather than on any standard layout, which makes them considerably more complicated to build.

library(tidyverse)
YOUR_DATA %>%

  # this step reshapes your data to a format more suited for ggplot2; each
  # fungus measurement gets its own row (default columns `name` and `value`)
  pivot_longer(phoma:last_col()) %>%

  # this sets up the main plot
  ggplot(aes(name, value)) +
  geom_col() +

  # these add the lines and text for zonas. If you have many of them, it 
  # might make sense to figure these out as a separate data frame and feed
  # these into lines like `geom_segment(data = SUMMARY) + geom_text(data = 
  # SUMMARY)`
  annotate("segment", x = c(0.5,2.5,4.5), xend = c(2.4,4.4,7), y = -10, yend = -10) +
  annotate("text", x = c(1.5,3.5,5.5), y = -20, label = LETTERS[1:3]) +

  # This adds space in the middle, changing it from a pie chart to a donut
  scale_y_continuous(limits = c(-100, NA)) +
  coord_polar() +   # converts into polar coordinates
  theme_void()      # gets rid of grid lines and axis labels

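To see what the reshaping step does on its own, here is a minimal sketch on two rows taken from the question's sample data, keeping only three of the fungus columns. The names `mini` and `mini_long` are just for illustration; `name` and `value` are the default output column names of pivot_longer().

```r
library(tidyr)
library(tibble)

# Two rows copied from the question's sample data (only some columns kept)
mini <- tribble(
  ~muestra_n, ~campana, ~zona, ~phoma, ~microdochium, ~curvularia,
          18,        2,     2,     13,             0,           0,
          20,        2,     1,      0,            12,          44
)

# One row per sample-fungus pair; the fungus column names move into `name`
# and the frequencies into `value`
mini_long <- pivot_longer(mini, phoma:last_col())
print(mini_long)
```

Each sample now spans three rows, one per fungus, which is the shape ggplot2 expects for mapping a fungus to an aesthetic.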

Solution 2:

As Jon Spring said, polar plots can be very difficult to customize in the way you've described. They are also often difficult to read and interpret, but that's another matter. For these reasons, most plotting systems don't make it easy to create heavily customized polar plots.

To achieve what you want, we'll need to do a lot of manual work, including:

  1. Reshape the data into "long" format to work with ggplot.
  2. Custom calculations to correctly rotate each fungus text label.
  3. Custom calculations for the "zone" secondary x-axis.
  4. An obscure use of ggplot's summary functions to ensure that text labels are placed correctly at the top of each bar stack.

The following code implements all of this. I've duplicated the data under a second value of "campana" to check that everything is plotted correctly when the bars stack. Comments are included inline:

library(tidyverse)

# reshape the data to "long" format
df.long <- df %>% # YOUR DATA
  select(zona, campana, phoma:exserohilum) %>% 
  pivot_longer(-c(zona, campana), names_to = 'fungus', values_to = 'freq') %>% 
  filter(freq > 0)

# duplicate the data and change campana so that we have some stacks to work with
df.long <- df.long %>% 
  bind_rows(mutate(df.long, campana = 3))

df.long.summary <- df.long %>% 
  group_by(combined_label = paste(zona, fungus, sep = '_'), fungus, campana, zona) %>% # a label that combines zona and fungus for the x-axis
  summarize(freq = sum(freq), .groups = 'drop') %>% # totals; drop the grouping so max(x) below is computed over all rows
  mutate(
    x = as.numeric(factor(combined_label)), # the numerical position of each x-axis label
    campana = factor(campana), # optional, but ggplot produces better colors this way
    angle = (max(x) + 0.45 - x) / max(x) * 2*pi * 180/pi + 90, # calculate an angle for each text label
    angle = angle %% 360, # constrain to [0-360]
    flip = angle > 90 & angle < 270, # angles that fall in this range will need to be flipped for easier reading
    angle = ifelse(flip, angle + 180, angle), # flip the angle
    hjust = ifelse(flip, 1.1, -0.1) # flip the hjust parameter to match
  )

# create secondary x-axis parameters for "zona"
df.zona <- df.long.summary %>% 
  group_by(zona) %>% 
  summarize(
    zona_x = min(x) - 0.4,
    zona_xend = max(x) + 0.4,
    label_x = mean(x),
    angle = 90 - (max(x) - zona_x) / max(x) * 2*pi * 180/pi + 90
  )

plot.fungi <- df.long.summary %>% 
  ggplot(data = ., aes(x = x, y = freq)) +
  geom_col(position = 'stack', aes(fill = campana)) +
  geom_text(aes(label = fungus, x = x, y = freq, angle = angle, hjust = hjust), stat = 'summary', fun = sum) +
  geom_segment(data = df.zona, aes(x = zona_x, xend = zona_xend), y = -10, yend = -10) +
  geom_text(data = df.zona, aes(label = zona, x = label_x, y = -12, angle = 0, vjust = sign(angle) * 2)) +
  coord_polar(clip = 'off') +
  scale_y_continuous(limits = c(-100, NA)) +
  theme_void()
print(plot.fungi)

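The label-angle arithmetic is easier to follow in isolation. This is a small sketch that wraps the same formula in a helper; `label_angle` is a hypothetical name, not part of ggplot2 or of the code above, and 2*pi * 180/pi is written as 360 since the two are equal.

```r
# Hypothetical helper (illustration only): same arithmetic as the mutate() step
label_angle <- function(x, n = max(x)) {
  # angle of the label for bar x out of n bars, constrained to [0, 360)
  angle <- ((n + 0.45 - x) / n * 360 + 90) %% 360
  # labels in this range would read upside down, so they are flipped
  flip <- angle > 90 & angle < 270
  data.frame(
    x = x,
    angle = ifelse(flip, angle + 180, angle),
    hjust = ifelse(flip, 1.1, -0.1)
  )
}

res <- label_angle(1:4)
print(res)
```

With four bars, the labels at positions 3 and 4 fall in the 90-270 degree range, so they are rotated by a further 180 degrees and right-aligned (hjust = 1.1) to stay readable.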

It would be much simpler to create a Cartesian (non-polar) plot of the same data, which in my opinion is also easier to read:

# reshape the data to "long" format
df.long <- df %>% 
  select(zona, campana, phoma:exserohilum) %>% 
  pivot_longer(-c(zona, campana), names_to = 'fungus', values_to = 'freq') %>% 
  filter(freq > 0)

df.bar <- df.long %>% 
  mutate(
    campana = factor(campana),
    zona = paste0('Zona ', zona)
  ) %>% 
  ggplot(data = ., aes(x = fungus, y = freq, fill = campana)) +
  geom_col(position = 'stack') +
  facet_grid(~zona, scales = 'free_x', space = 'free_x') + # the `facets` argument is deprecated
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1),
    strip.background.x = element_rect(fill = '#cccccc', color = NA)
  )
print(df.bar)

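One thing to decide with the full 121 samples: geom_col() with stacking adds up every row, so several samples from the same zona and campana are summed. If an average incidence per zona, campana and fungus is what you want instead (that is an assumption on my part about the statistic), a sketch of the summarizing step could look like this. It uses six rows and three fungus columns from the question's sample; `samples` and `mean_inc` are illustrative names. Zeros are kept here, unlike the filter(freq > 0) step above, so that the mean is taken over all samples.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Six rows from the question's sample data, three fungus columns only
samples <- tribble(
  ~campana, ~zona, ~phoma, ~microdochium, ~curvularia,
         2,     2,      0,            11,           0,
         2,     2,     13,             0,           0,
         2,     2,      0,            16,          18,
         2,     1,      0,            12,          44,
         2,     1,      0,             9,           1,
         2,     1,     41,             8,           1
)

# Long format, then one mean per zona / campana / fungus combination
mean_inc <- samples %>%
  pivot_longer(-c(zona, campana), names_to = "fungus", values_to = "freq") %>%
  group_by(zona, campana, fungus) %>%
  summarize(mean_freq = mean(freq), .groups = "drop")

print(mean_inc)
```

The result has one row per bar segment and can be passed to ggplot() in place of the long data, with mean_freq mapped to y.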