How to create a dendrogram with colored branches?

I would like to create a dendrogram in R that has colored branches, like the one shown below. [image]

So far I have used the following commands to create a standard dendrogram:

d <- dist(as.matrix(data[,29]))              # compute the distance matrix
hc <- hclust(d)                              # apply hierarchical clustering
plot(hc, labels=data[,1], main="", xlab="")  # plot the dendrogram

How should I modify this code to obtain the desired result?

Thanks in advance for your help.


Solution 1:

You could use the dendextend package, which is designed for tasks such as this:

# install the package:
if (!require('dendextend')) install.packages('dendextend'); library('dendextend')

## Example:
dend <- as.dendrogram(hclust(dist(USArrests), "ave"))
d1 <- color_branches(dend, k = 5, col = c(3, 1, 1, 4, 1))
plot(d1) # selective coloring of branches :)
d2 <- color_branches(d1, k = 5) # auto-coloring 5 clusters of branches.
plot(d2)
# More examples are in ?color_branches


You can find many more examples in the package's presentations and vignettes, linked from the "usage" section at the following URL: https://github.com/talgalili/dendextend
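To connect this back to the code in the question, here is a minimal sketch of the same dist / hclust workflow with color_branches added. It assumes dendextend is installed. The data frame `dat` is a hypothetical stand-in built from USArrests (labels in the first column, numeric values in the rest), and k = 3 is an arbitrary choice of cluster count, not something taken from the question.

```r
library(dendextend)

# Hypothetical stand-in for the asker's data:
# first column holds the labels, the remaining columns are numeric.
dat <- data.frame(name = rownames(USArrests), USArrests,
                  stringsAsFactors = FALSE)

m <- as.matrix(dat[, -1])   # numeric part only
rownames(m) <- dat$name     # row names become the leaf labels

hc <- hclust(dist(m))       # same steps as in the question
dend <- as.dendrogram(hc)   # color_branches works on dendrogram objects

# Color the branches of the 3 clusters obtained by cutting the tree
# (k = 3 is an arbitrary choice for this sketch).
dend <- color_branches(dend, k = 3)

plot(dend)
```

The main change compared with the original code is the conversion with as.dendrogram before coloring; the labels are carried over through the row names of the matrix instead of the labels argument of plot.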

Solution 2:

You can use dendrapply from base R (see ?dendrapply for its help page).

For instance:

# Generate data
set.seed(12345)
desc.1 <- c(rnorm(10, 0, 1), rnorm(20, 10, 4))
desc.2 <- c(rnorm(5, 20, .5), rnorm(5, 5, 1.5), rnorm(20, 10, 2))
desc.3 <- c(rnorm(10, 3, .1), rnorm(15, 6, .2), rnorm(5, 5, .3))

data <- cbind(desc.1, desc.2, desc.3)

# Create dendrogram
d <- dist(data) 
hc <- as.dendrogram(hclust(d))

# Function to color branches
colbranches <- function(n, col)
  {
  a <- attributes(n) # Find the attributes of current node
  # Color edges with requested color
  attr(n, "edgePar") <- c(a$edgePar, list(col=col, lwd=2))
  n # Don't forget to return the node!
  }

# Color the first sub-branch of the first branch in red,
# the second sub-branch in orange and the second branch in blue
hc[[1]][[1]] = dendrapply(hc[[1]][[1]], colbranches, "red")
hc[[1]][[2]] = dendrapply(hc[[1]][[2]], colbranches, "orange")
hc[[2]] = dendrapply(hc[[2]], colbranches, "blue")

# Plot
plot(hc)

Which gives:

[image: colored dendrogram]
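If you would rather not pick the sub-branches by hand, the same edgePar idea can be driven by cutree. The following is a rough sketch in base R only, under a few assumptions: the toy matrix `m`, the choice k = 3, the color vector `cluster_cols` and the helper `color_by_cluster` are all made up for illustration. It uses a plain recursive function rather than dendrapply, so that each node can look at all the leaf labels below it.

```r
# Toy data (hypothetical): 20 observations, 3 variables
set.seed(1)
m <- matrix(rnorm(60), ncol = 3,
            dimnames = list(paste0("obs", 1:20), NULL))

hc <- hclust(dist(m))
groups <- cutree(hc, k = 3)              # named vector: label -> cluster id
cluster_cols <- c("red", "orange", "blue")

# Recursively walk the tree; color the edge above a node when all
# leaves below that node belong to the same cluster.
color_by_cluster <- function(node) {
  if (!is.leaf(node)) {
    for (i in seq_along(node)) {
      node[[i]] <- color_by_cluster(node[[i]])
    }
  }
  cl <- unique(groups[labels(node)])     # clusters found under this node
  if (length(cl) == 1) {
    attr(node, "edgePar") <- c(attr(node, "edgePar"),
                               list(col = cluster_cols[cl], lwd = 2))
  }
  node                                   # return the (possibly modified) node
}

dend <- color_by_cluster(as.dendrogram(hc))
plot(dend)
```

Edges that sit above a merge of two different clusters keep the default color, so only the branches inside each cluster are highlighted.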