ggplot2 histogram showing proportion of group by bin instead of count

Assuming a dataset made of two groups:

library(ggplot2)

dataA <- rnorm(200, 3, sd = 2)
dataB <- rnorm(500, 5, sd = 3)
all <- data.frame(
  dataset = c(rep('A', length(dataA)), rep('B', length(dataB))),
  value   = c(dataA, dataB)
)

We can plot the histogram with the two groups like this:

ggplot(all, aes(value, fill = dataset)) +
  geom_histogram(bins = 50, position = 'stack')

I would like the same kind of plot, but with each bin showing the proportion of each group rather than the count.

I found the following way to do it by calculating the proportion manually for each group:

ggplot(all, aes(x = value, fill = dataset)) +
  geom_histogram(
    aes(y = c(
      ..count..[..group.. == 1] / (..count..[..group.. == 1] + ..count..[..group.. == 2]),
      ..count..[..group.. == 2] / (..count..[..group.. == 1] + ..count..[..group.. == 2])
    )),
    position = 'stack', bins = 50
  ) +
  ylab('proportion')

This gives the expected result, but it is a very inelegant solution. I am probably missing something: is there a better way to obtain the same (or a similar) result?


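You might be looking for position = 'fill' instead of 'stack'. It stacks the bars and then rescales each bin to a total height of 1, so the y-axis shows each group's proportion within that bin.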

library(ggplot2)
set.seed(42)

dataA <- rnorm(200, 3, sd = 2)
dataB <- rnorm(500, 5, sd = 3)

all <- data.frame(
  dataset = c(rep('A', length(dataA)), rep('B', length(dataB))),
  value   = c(dataA, dataB)
)

ggplot(all, aes(value, fill = dataset)) +
  geom_histogram(bins = 50, position = 'fill')
#> Warning: Removed 14 rows containing missing values (geom_bar).

Created on 2022-01-15 by the reprex package (v2.0.1)
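
To see what position = 'fill' is plotting, here is a rough base R sketch that computes the per-bin proportions by hand. The bin edges are an assumption (50 equal-width bins over the data range); ggplot2 places its own edges a little differently, so the numbers will be close to the plot but not identical. The names breaks, bin, counts and props are just illustrative.

```r
set.seed(42)

dataA <- rnorm(200, 3, sd = 2)
dataB <- rnorm(500, 5, sd = 3)
all <- data.frame(
  dataset = c(rep('A', length(dataA)), rep('B', length(dataB))),
  value   = c(dataA, dataB)
)

# Assumed binning: 50 equal-width bins spanning the observed range
breaks <- seq(min(all$value), max(all$value), length.out = 51)
bin    <- cut(all$value, breaks = breaks, include.lowest = TRUE)

# Rows are bins, columns are the two groups
counts <- table(bin = bin, dataset = all$dataset)

# Divide every row by its total: the share of A and B within each bin
props <- prop.table(counts, margin = 1)

head(round(props, 2))
```

Bins that contain no observations come out as NaN (0 / 0), which is presumably also why ggplot2 reports removed rows in the warning above.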


library(ggplot2)
library(scales)

dataA <- rnorm(200, 3, sd = 2)
dataB <- rnorm(500, 5, sd = 3)
all <- data.frame(
  dataset = c(rep('A', length(dataA)), rep('B', length(dataB))),
  value   = c(dataA, dataB)
)

ggplot(all, aes(value, fill = dataset)) +
  # after_stat() replaces stat(), which is deprecated in recent ggplot2 releases
  geom_histogram(aes(y = after_stat(count / sum(count)))) +
  scale_y_continuous(labels = scales::percent_format())
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2022-01-15 by the reprex package (v2.0.1)
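
Note that count / sum(count) is not the same normalisation as position = 'fill'. As far as I can tell, sum(count) is taken over all bars in the layer, so each bar shows its share of all observations rather than its share within the bin. A small sketch with made-up counts (three hypothetical bins, two groups) shows the difference:

```r
# Hypothetical counts, not taken from the data above
counts <- matrix(
  c(10, 30,
    20, 20,
     5, 15),
  ncol = 2, byrow = TRUE,
  dimnames = list(bin = c('bin1', 'bin2', 'bin3'), dataset = c('A', 'B'))
)

# Share of all observations: analogous to count / sum(count)
share_of_total <- counts / sum(counts)

# Share within each bin: analogous to position = 'fill'
share_within_bin <- counts / rowSums(counts)

share_of_total
share_within_bin
```

With the first normalisation all bars together add up to 1; with the second, every bin adds up to 1, which is what the question asks for.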