Drawing sequence logos in D3

How would I go about drawing a sequence logo with D3?

From Wikipedia:

a sequence logo is a graphical representation of the sequence conservation of nucleotides (in a strand of DNA/RNA) or amino acids (in protein sequences). A sequence logo consists of a stack of letters at each position. The relative sizes of the letters indicate their frequency in the sequences. The total height of the letters depicts the information content of the position, in bits.
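
To make "information content of the position, in bits" concrete, here is a minimal sketch of how letter heights are commonly derived from frequencies. It assumes the frequencies at a position sum to 1 and ignores the small-sample correction that some tools apply. The function name `letterHeights` and its parameters are hypothetical and do not come from either fiddle.

```javascript
// Assumption: `freqs` maps each letter to its relative frequency at ONE position,
// and `alphabetSize` is 4 for DNA/RNA or 20 for proteins.
function letterHeights(freqs, alphabetSize) {
  // Shannon entropy of the position in bits; zero frequencies contribute nothing
  const entropy = Object.values(freqs)
    .filter((f) => f > 0)
    .reduce((h, f) => h - f * Math.log2(f), 0);

  // Information content: maximum possible entropy minus the observed entropy
  const information = Math.log2(alphabetSize) - entropy;

  // Each letter gets a share of the column height proportional to its frequency
  const heights = {};
  for (const [letter, f] of Object.entries(freqs)) {
    heights[letter] = f * information;
  }
  return { information, heights };
}

// A fully conserved DNA position versus a completely uniform one
const conserved = letterHeights({ A: 1, C: 0, G: 0, T: 0 }, 4);
const uniform = letterHeights({ A: 0.25, C: 0.25, G: 0.25, T: 0.25 }, 4);
```

A fully conserved position gets the maximum column height (log2 of the alphabet size), while a uniform position collapses to zero height.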

An example:

[image: example sequence logo]

Typically, the data comes as a matrix whose row names are the characters (amino acids or nucleotides) and whose columns are the positions in the sequence.

So with a character space of A, B, C and sequences of length 4, the matrix would look something like this:

  1     2     3     4
A 0.1   0.8   0.2   0.1
B 0.3   0.2   0.3   0.05
C 0     0.1   0.4   0.4

The values in the matrix correspond to the relative height of each character at each position.
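
For drawing, it usually helps to turn this row-per-letter matrix into one array per position. The sketch below does that with the values from the matrix above. The object layout and the helper name `matrixToColumns` are assumptions made for illustration, not a required D3 format.

```javascript
// The matrix from the question: one row per letter, one value per position
const matrix = {
  A: [0.1, 0.8, 0.2, 0.1],
  B: [0.3, 0.2, 0.3, 0.05],
  C: [0, 0.1, 0.4, 0.4],
};

// Hypothetical helper: build one array of { letter, bits } entries per position
function matrixToColumns(m) {
  const letters = Object.keys(m);
  const length = m[letters[0]].length;
  const columns = [];
  for (let i = 0; i < length; i++) {
    columns.push(
      letters
        .map((letter) => ({ letter: letter, bits: m[letter][i] }))
        // ascending order, so the tallest letter is last in each column
        .sort((a, b) => a.bits - b.bits)
    );
  }
  return columns;
}

const columns = matrixToColumns(matrix);
```

Sorting ascending means that if the letters are stacked from the bottom up, the tallest letter ends up on top of each column, which is how sequence logos are usually drawn.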


Solution 1:

As inspiration, I started with the stacked bar chart: http://bl.ocks.org/3886208

A crude implementation is here: http://jsfiddle.net/QcPZ9/

One of the more confusing parts is:

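data.forEach(function(d) {
    // d is one column (position): an array of { letter, bits } entries
    var y0 = 0; // running height of the stack so far
    d.bits = d.map(function(entry) {
        // y0 is where this letter starts; y1 is where it ends (y0 after adding its bits)
        return { bits: entry.bits, letter: entry.letter, y0: y0, y1: y0 += +entry.bits };
    });
    // the last letter's y1 is the total height of the column
    d.bitTotal = d.bits[d.bits.length - 1].y1;
});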

Basically, for each sequence position (what ends up being a column), it computes the total number of bits. It also keeps a running sum of the preceding letters' bits, which gives each letter the y-offsets (y0 and y1) needed for stacking.
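
The same stacking step can be tried without D3 or the fiddle. This is a minimal standalone sketch: the sample data and the names `stackColumn` and `stackedSample` are made up for illustration, and it returns new objects instead of attaching properties to the input arrays.

```javascript
// Hypothetical sample: one array per position, each entry a letter and its height in bits
const sample = [
  [{ letter: "C", bits: 0 }, { letter: "A", bits: 0.1 }, { letter: "B", bits: 0.3 }],
  [{ letter: "C", bits: 0.1 }, { letter: "B", bits: 0.2 }, { letter: "A", bits: 0.8 }],
];

// Compute y0/y1 offsets for every letter in one column, plus the column total
function stackColumn(column) {
  let y0 = 0; // running height of the stack so far
  const letters = column.map((entry) => {
    const y1 = y0 + Number(entry.bits);
    const item = { letter: entry.letter, bits: entry.bits, y0: y0, y1: y1 };
    y0 = y1; // the next letter starts where this one ends
    return item;
  });
  // An empty column has a total height of 0
  const bitTotal = letters.length ? letters[letters.length - 1].y1 : 0;
  return { letters: letters, bitTotal: bitTotal };
}

const stackedSample = sample.map(stackColumn);
```

Each letter's y0 equals the previous letter's y1, so the pairs can be fed to a linear y-scale in the same way as in the stacked bar chart example.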

As a whole, this could be improved by using a defined symbol or graphic for the letters, instead of a font hack.

Solution 2:

Here is the solution:

This is a fork of the cmonkey implementation: http://jsfiddle.net/fgborja/rMArY/

I made some adjustments to the characters using Inkscape. The 'sequencelogo' font is embedded as glyphs in JavaScript.

function sequencelogoFont() {
    var font = svg.append("defs").append("font")
        .attr("id", "sequencelogo");
    //...
    font.append("glyph")
        .attr("unicode", "A")
        .attr("vert-adv-y", "50")
        .attr("d", "M500 767l-120 -409h240zM345 948h310l345 -1000h-253l-79 247h-338l-77 -247h-253l345 1000v0z");
    //...
}

It becomes more portable if you convert the SVG font to TTF, WOFF and EOT and reference those files as sources in the CSS file.

(plus amino acid logos)