I’ve seen very few scientific journal articles make use of interactive visualizations the way, for example, the New York Times does. Datasets are large and complex, and giving scientists a way to browse the data can add a lot to their understanding of it.
As an example, I looked at our dataset of boys with DMD who had been exome sequenced in our Genetic Modifier study. Using R/Bioconductor, I computed the average depth of coverage for each of the 79 exons of the muscle form of the DMD gene. The goal was to create an interactive, heatmap-style rendering of the data. Rows are samples, columns are exons, and each cell represents the average sequencing coverage for that exon. Most interesting (and easy to see) are the exonic deletions, where coverage is 0 because that region is missing.
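To make the shape of the data concrete, here is a minimal sketch of the per-exon averaging step. It is not the R/Bioconductor code I actually used. It is a plain Python illustration on made-up toy data, and the function names, sample names, and exon coordinates are all hypothetical.

```python
from statistics import mean

def mean_depth_per_exon(depth, exons):
    """Average per-base read depth over each exon.

    depth: list of per-base depths along the gene (index = position offset).
    exons: list of (start, end) half-open intervals into `depth`.
    """
    return [mean(depth[start:end]) for start, end in exons]

def coverage_matrix(samples, exons):
    """Build the heatmap input: one row per sample, one column per exon."""
    return {name: mean_depth_per_exon(depth, exons)
            for name, depth in samples.items()}

def deleted_exons(row, threshold=0):
    """Return 1-based exon numbers whose mean coverage is at or below threshold."""
    return [i + 1 for i, cov in enumerate(row) if cov <= threshold]

# Toy data (assumption): 3 exons instead of 79, 16 bases instead of a real gene.
exons = [(0, 4), (6, 10), (12, 16)]
samples = {
    "sample_A": [30, 32, 28, 30, 0, 0, 20, 22, 18, 20, 0, 0, 40, 40, 40, 40],
    "sample_B": [30, 30, 30, 30, 0, 0, 0, 0, 0, 0, 0, 0, 10, 12, 8, 10],
}

matrix = coverage_matrix(samples, exons)
for name, row in matrix.items():
    print(name, row, "deleted exons:", deleted_exons(row))
```

In this toy example, sample_B has no reads over the second exon, so that cell is 0 and shows up as a deletion. In the real heatmap, the same matrix has 79 columns, one per DMD exon.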
The visualization is here: mutation explorer.
I’ll try to explain the code in a later post.