## Monday, May 4, 2015

### Forestogram


Hierarchical clustering results are typically visualized as a dendrogram. A typical bottom-up algorithm treats each individual data point as its own cluster and successively merges pairs of clusters until all of them have been merged into a single cluster. Several distance measures can be used to find the next pair of clusters to merge. For example, with Euclidean distance, the pair of clusters with the smallest distance between them is merged next.
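The merging loop described above can be sketched in a few lines of base R. This is a deliberately naive illustration, not how `hclust` is implemented: the function name `naive_agglomerate`, the toy points, and the choice of single linkage (the distance between two clusters is the distance between their closest members) are assumptions made for the example.

```r
# Naive bottom-up clustering sketch: single linkage, Euclidean distance.
# naive_agglomerate is a hypothetical helper written for this illustration.
naive_agglomerate <- function(x) {
  d <- as.matrix(dist(x))                 # pairwise Euclidean distances
  clusters <- as.list(seq_len(nrow(x)))   # every point starts as its own cluster
  heights <- numeric(0)
  while (length(clusters) > 1) {
    best <- c(1, 2)
    best_d <- Inf
    for (i in seq_len(length(clusters) - 1)) {
      for (j in (i + 1):length(clusters)) {
        # Single linkage: distance between the closest members of the two clusters.
        dij <- min(d[clusters[[i]], clusters[[j]]])
        if (dij < best_d) {
          best_d <- dij
          best <- c(i, j)
        }
      }
    }
    # Merge the closest pair and record the distance at which it happened.
    clusters[[best[1]]] <- c(clusters[[best[1]]], clusters[[best[2]]])
    clusters[[best[2]]] <- NULL
    heights <- c(heights, best_d)
  }
  heights
}

# Five toy points in the plane (made up for this sketch).
x <- matrix(c( 0, 0,
               0, 1,
               4, 0,
               4, 1.5,
              10, 0), ncol = 2, byrow = TRUE)
h <- naive_agglomerate(x)
print(h)  # merge heights: 1, 1.5, 4, 6
```

Each pass merges one pair, so five points need four merges. The recorded heights are what a dendrogram draws on its vertical axis.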

A simple dendrogram can be generated in R from the mtcars dataset with the hclust function, using the following commands:

```r
hc = hclust(dist(mtcars))
plot(hc, hang = -1)
```
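Beyond the plot, the object returned by `hclust` stores the merge sequence and the merge heights, and `cutree` turns the tree into a flat partition. The short sketch below only inspects those components; the choice of three clusters is arbitrary and made for illustration.

```r
hc <- hclust(dist(mtcars))

# One row per merge step: negative entries are single observations,
# positive entries refer to the cluster formed at that earlier step.
head(hc$merge)

# Height (distance) at which each merge happened.
head(hc$height)

# Cut the tree into 3 disjoint clusters and count their sizes.
groups <- cutree(hc, k = 3)
table(groups)
```

A dataset with n observations always yields n - 1 merges, so `hc$merge` has 31 rows for the 32 cars in mtcars.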

### Forestogram

Here is a simple example from the upcoming Forestogram package for R. It shows a 3D representation of a row dendrogram and a column dendrogram on top of a colormap. Either dendrogram can be recovered by projecting the plot onto the row or column side. The forestogram is also cut at a chosen height to obtain a partition into disjoint clusters. The R package is still under development, but the source code is available on GitHub at Forestogram.
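For comparison, base R can already draw the flat, 2D counterpart of this idea: `heatmap` places a row dendrogram and a column dendrogram in the margins of a colour image. The sketch below uses only base R and mtcars, not the Forestogram code, and it clusters rows and columns independently. The forestogram example, by contrast, appears to place row and column merges on one shared height axis.

```r
# Base R only: heatmap() runs hclust() on the rows and on the columns
# independently and draws both trees around the colour image.
m <- as.matrix(mtcars)
hm <- heatmap(m, scale = "column")

# The returned list holds the row and column orderings used in the plot.
str(hm[c("rowInd", "colInd")])
```

The forestogram lifts these two side trees into the third dimension, above the colormap.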

In the forestogram example below, the clustering algorithm is replaced by pre-calculated hypothetical merge values for the rows and columns.

Left-click and drag to rotate the plot, and use the mouse scroll wheel to zoom in or out.

Here's the R code to generate this plot:


```r
# Example of tree drawing.
library(rgl)
source('Forestogram.R')

# Matrix size.
n_row = 3
n_col = 4
size = c(n_row, n_col)

# Index of the row or column to merge.
merge_matrix = MergeMatrix(size,
                           c(-1, -2,
                             -1, -2,
                              1, -3,
                              2, -4,
                              1, -3))

height_vector = c(1, 1.3, 4, 7, 10)
rowcol_vector = c(0, 1, 1, 1, 0)

data = matrix(c(-1,  -2,  -3,  -4,
                -5,  -6,  -7,  -8,
                -9, -10, -11, -50),
              nrow = n_row, ncol = n_col, byrow = TRUE)

Forestogram(size, merge_matrix,
            height_vector,
            rowcol_vector,
            data,
            cut_height = 3,
            draw_cut = TRUE,
            draw_side_tree = TRUE,
            draw3D = TRUE,
            draw2D_grid = TRUE,
            line_width = 4,
            line_width_2D = 1,
            base_contour_width = 1,
            cut_base_contour_width = 1)
```
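To make the hand-written inputs easier to read, the sketch below lays them out as a table in plain R, without `Forestogram.R`. It rests on an assumption that the post does not state: that `rowcol_vector` marks each merge as a row merge (0) or a column merge (1). The counts are consistent with that reading, since a 3 x 4 matrix needs 2 row merges and 3 column merges. The names `merges`, `steps`, and `below_cut` are hypothetical.

```r
# Same values as in the example, held in a plain matrix: one merge pair per row.
merges <- matrix(c(-1, -2,
                   -1, -2,
                    1, -3,
                    2, -4,
                    1, -3), ncol = 2, byrow = TRUE)
heights <- c(1, 1.3, 4, 7, 10)
rowcol  <- c(0, 1, 1, 1, 0)   # ASSUMED meaning: 0 = row merge, 1 = column merge

# One line per merge step, labelled with the side it is assumed to act on.
steps <- data.frame(step   = seq_along(heights),
                    side   = ifelse(rowcol == 0, "row", "column"),
                    a      = merges[, 1],
                    b      = merges[, 2],
                    height = heights)
print(steps)

# Merges that happen below the cut height used in the plot (cut_height = 3).
below_cut <- steps[steps$height < 3, ]
print(below_cut)
```

Under this reading, only the first row merge and the first column merge fall below the cut at height 3. Every later merge is separated by the cut.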