Title:
|
Biclustering Gene Expression Data using MSR Difference Threshold |
Author:
|
Sumam, Mary Idicula; Shyama, Das
|
Abstract:
|
Biclustering is the simultaneous clustering of both rows
and columns of a data matrix. A measure called Mean Squared
Residue (MSR) is used to simultaneously evaluate the coherence
of rows and columns within a submatrix. In this paper, a novel
algorithm is developed for biclustering gene expression data
using the newly introduced concept of an MSR difference threshold.
In the first step, high-quality bicluster seeds are generated using
the K-Means clustering algorithm. Then more genes and conditions
(nodes) are added to the bicluster. Before a node is added, the MSR
of the bicluster, X, is calculated. After the node is added, the
MSR is calculated again as Y. The added node is deleted if Y minus X is
greater than the MSR difference threshold or if Y is greater than
the MSR threshold, which depends on the dataset. The MSR
difference threshold is different for the gene list and the condition
list, and it also depends on the dataset. Proper values should be
identified through experimentation in order to obtain biclusters
of high quality. The results obtained on a benchmark dataset
clearly indicate that this algorithm performs better than many of the
existing biclustering algorithms. |
URI:
|
http://dyuthi.cusat.ac.in/purl/4100
|
Date:
|
2009 |
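
The node-addition rule described in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of MSR (the mean of the squared residues a_ij - a_iJ - a_Ij + a_IJ over the submatrix) and shows the acceptance test for a candidate gene only. The function names `msr` and `try_add_gene`, the parameter names, and the toy matrix are hypothetical and are not taken from the paper. The K-Means seed generation step is omitted.

```python
def msr(data, rows, cols):
    """Mean squared residue of the submatrix picked out by rows and cols."""
    sub = [[data[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: value minus row mean minus column mean plus overall mean.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue * residue
    return total / (n_rows * n_cols)


def try_add_gene(data, rows, cols, gene, msr_threshold, diff_threshold):
    """Return the gene list with `gene` added, or unchanged if it is rejected.

    Both thresholds are assumed to be chosen by experimentation for the
    dataset at hand, as the abstract suggests.
    """
    x = msr(data, rows, cols)            # MSR before adding the node
    candidate = rows + [gene]
    y = msr(data, candidate, cols)       # MSR after adding the node
    if y - x > diff_threshold or y > msr_threshold:
        return rows                      # node deleted again
    return candidate


# Toy expression matrix (illustrative values only): rows 0-2 follow a
# purely additive pattern, row 3 does not.
toy = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [9.0, 0.0, 7.0],
]
seed_rows, all_cols = [0, 1], [0, 1, 2]
kept = try_add_gene(toy, seed_rows, all_cols, 2, msr_threshold=1.0, diff_threshold=0.5)
rejected = try_add_gene(toy, seed_rows, all_cols, 3, msr_threshold=1.0, diff_threshold=0.5)
```

In this toy case the coherent row 2 is kept and the incoherent row 3 is deleted. The same test would apply to candidate conditions by extending the column list instead, with its own difference threshold.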