Tutorials: Topology-preserving gene selection and clustering

Topology-Preserving Selection and Clustering (TPSC)

SVD

Singular value decomposition (SVD) shows promise for recognizing biologically meaningful features in microarray data (Alter et al., 2000; Holter et al., 2000). It linearly transforms expression data from a genes × samples hyperspace to a greatly reduced eigengenes × eigensamples space, capturing a few characteristic variables that represent the essential patterns of temporal change in gene expression. Although SVD is powerful at recognizing dominant expression patterns, its effectiveness appears to depend largely on the choice of data pre-processing (Holter et al., 2000). Such a linear method, if applied directly to complex microarray data, may lead to loss of information. It is therefore appealing to first apply a non-linear method (i.e., SOM) for data pre-processing, followed by a linear method (i.e., SVD) for dominant pattern recognition.

Decomposition of SVD

The output matrix A can be decomposed by SVD as follows: A = U S V^T, where U is an M×N matrix whose columns are the left singular vectors (eigensamples), V^T is an N×N matrix whose rows are the right singular vectors (eigenvectors), and S is an N×N diagonal matrix of singular values, whose on-diagonal entries (eigenexpressions) are in descending order.
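
The decomposition above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the random matrix A and the sizes M = 50 and N = 6 are hypothetical stand-ins for the real output matrix and are not taken from the tutorial. The "fraction" vector is likewise an assumption: it is one common way to gauge how dominant each pattern is, not a quantity defined in the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the output matrix A (M rows, N sample columns).
M, N = 50, 6
A = rng.normal(size=(M, N))

# Thin SVD: U is M x N, s holds the N singular values, Vt is N x N.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Place the singular values on the diagonal of an N x N matrix S.
S = np.diag(s)

# Assumed measure of dominance: share of the total squared singular values.
fraction = s**2 / np.sum(s**2)

print("shapes:", U.shape, S.shape, Vt.shape)
print("A equals U S V^T:", np.allclose(A, U @ S @ Vt))
print("singular values descending:", bool(np.all(np.diff(s) <= 0)))
```

Passing full_matrices=False requests the reduced form, which gives U the M×N shape described above; with the default setting NumPy would return an M×M matrix U instead.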
