Hierarchical clustering results are typically visualized as a dendrogram. A typical bottom-up (agglomerative) algorithm treats each data point as its own cluster and successively merges pairs of clusters until all of them have been merged into a single cluster. Several distance measures can be used to choose the next pair of clusters to merge. For example, with the Euclidean distance, the algorithm merges the pair of clusters with the smallest distance between them.
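The bottom-up procedure described above can be sketched with base R's `hclust`. This is only an illustration, not part of the Forestogram package: the five points are made-up values, and single linkage is one assumed choice among several linkage methods.

```r
# Five 2-D points, one observation per row (illustrative values only).
points <- matrix(c( 0, 0,
                    0, 1,
                    5, 5,
                    5, 6,
                   10, 0),
                 ncol = 2, byrow = TRUE)
rownames(points) <- paste0("p", 1:5)

# Pairwise Euclidean distances between observations.
d <- dist(points, method = "euclidean")

# Agglomerative clustering: start with one cluster per point and repeatedly
# merge the two closest clusters (single linkage is an assumed choice here).
hc <- hclust(d, method = "single")

# Each row of hc$merge is one merge step: a negative entry is a single
# observation, a positive entry refers to the cluster formed at that step.
print(hc$merge)
print(hc$height)

# Draw the dendrogram, then cut it at height 3 to get disjoint clusters.
plot(hc)
clusters <- cutree(hc, h = 3)
print(clusters)
```

With n observations there are always n - 1 merges, so `hc$merge` has four rows here. Cutting at a given height keeps only the merges below it, which is what turns the tree into a partition.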
Forestogram
Here is a simple example from the upcoming Forestogram package for R. It shows a 3D representation of a row dendrogram and a column dendrogram on top of a colormap. Either dendrogram can be recovered by projecting the forestogram onto the row side or the column side. The forestogram is also cut to obtain a partition into disjoint clusters. The R package is still under development, but the source code is available on GitHub at Forestogram.
In this example, the clustering algorithm is replaced by pre-calculated hypothetical values for merging rows and columns.
Left-click and drag to rotate the plot, and use the mouse scroll wheel to zoom in and out.
Here's the R code to generate this plot:
# Example of tree drawing.
library(rgl)
source('Forestogram.R')

# Matrix size.
n_row = 3
n_col = 4
size = c(n_row, n_col)

# Index of the row or column to merge (one pair per merge step).
merge_matrix = MergeMatrix(size,
                           c(-1, -2,
                             -1, -2,
                              1, -3,
                              2, -4,
                              1, -3))

# Height of each merge step.
height_vector = c(1, 1.3, 4, 7, 10)

# Flags whether each merge step applies to rows or to columns.
rowcol_vector = c(0, 1, 1, 1, 0)

# Values shown in the colormap under the trees.
data = matrix(c(-1,  -2,  -3,  -4,
                -5,  -6,  -7,  -8,
                -9, -10, -11, -50),
              nrow = n_row, ncol = n_col, byrow = TRUE)

Forestogram(size, merge_matrix,
            height_vector,
            rowcol_vector,
            data,
            cut_height = 3,
            draw_cut = TRUE,
            draw_side_tree = TRUE,
            draw3D = TRUE,
            draw2D_grid = TRUE,
            line_width = 4,
            line_width_2D = 1,
            base_contour_width = 1,
            cut_base_contour_width = 1)
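For comparison, a flat 2D analogue of the same idea can be sketched with base R alone: rows and columns are clustered separately and both dendrograms are drawn beside a colormap with `heatmap`. This does not use the Forestogram package, and the trees come from `hclust` with its default settings (an assumption made for illustration) rather than from the hand-written merge values above.

```r
# Same data matrix as in the example above.
data <- matrix(c(-1,  -2,  -3,  -4,
                 -5,  -6,  -7,  -8,
                 -9, -10, -11, -50),
               nrow = 3, ncol = 4, byrow = TRUE)

# Cluster the rows, then the columns (by transposing the matrix).
row_tree <- hclust(dist(data))
col_tree <- hclust(dist(t(data)))

# Number of merge steps in each tree.
print(nrow(row_tree$merge))
print(nrow(col_tree$merge))

# Colormap with the row dendrogram on the left and the column dendrogram on top.
heatmap(data,
        Rowv = as.dendrogram(row_tree),
        Colv = as.dendrogram(col_tree),
        scale = "none")
```

A matrix with 3 rows and 4 columns gives two row merges and three column merges, five in total, which matches the five entries of `height_vector` and `rowcol_vector` in the example above.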