Functional Genomics of Developing Barley Seeds
Tools for Analysis
Visual exploration of gene expressions
Download ANSI-C source code for HiT-MDS-2 data embedding software.
Multidimensional Scaling (MDS) is a powerful dimension reduction technique for embedding high-dimensional data into a low-dimensional target space. The embedding reconstructs the similarity or distance relationships among the source data as faithfully as possible in the target space. High-Throughput MDS (HiT-MDS-2) is particularly suited to large data sets, such as multi-parallel gene expression probes. For screening coexpressed genes and for validating cluster centroids, such as those derived from k-means or neural gas, HiT-MDS-2 with correlation-based comparison of gene profiles is a valuable tool for visual data inspection.
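The correlation-based profile comparison mentioned above can be sketched as follows. This is only an illustration of a (1 - Pearson correlation) dissimilarity matrix, not code from the HiT-MDS-2 archive; the function names and the toy profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def dissimilarity_matrix(profiles):
    """Symmetric matrix of pairwise (1 - Pearson correlation) values."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = 1.0 - pearson(profiles[i], profiles[j])
    return d

# Hypothetical toy profiles (one value per time point).
profiles = [
    [1.0, 2.0, 3.0, 4.0],  # rising
    [2.0, 4.0, 6.0, 8.0],  # rising with a larger amplitude
    [4.0, 3.0, 2.0, 1.0],  # falling
]
D = dissimilarity_matrix(profiles)
# Coexpressed profiles (0 and 1) get a dissimilarity of 0,
# anti-correlated profiles (0 and 2) get the maximum value of 2.
```

A matrix of this kind holds the relationships that a correlation-based embedding tries to reproduce as distances between points in the 2D target space.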
Download source code:
The code is free for use under the GNU General Public License.
F-1: 4824 temporal gene expression profiles (upper left: z-score normalized graphs),
embedded by HiT-MDS-2 in 2D (right panel). HiT-MDS-2 was run with (1 - Pearson correlation) as the dissimilarity measure.
Visual browsing (link 1) helps compile an ordered list of coexpressed genes (link 2 to lower left panel).
The data set (without clusters) is enclosed in the source archive.
The HiT-MDS-2 tool was developed for the inspection of gene expression data. It was first applied to gene expression in developing barley endosperm, described here.
Neural Gas (NG) is a prototype-based Hebbian learning method suitable for online clustering. It has good convergence properties and yields excellent quantization while being very tolerant to initialization. See the Wikipedia notes for further details. To derive faithful gene expression clusters, Pearson correlation similarity should be considered instead of Euclidean distance. NG is part of the SOM-Toolbox.
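A minimal sketch of the rank-based neural gas update is given below; it is not the SOM-Toolbox implementation. For each sample, the prototypes are ranked by (1 - Pearson correlation), and the prototype at rank k moves toward the sample with strength eps * exp(-k / lambda). Combining a correlation-based ranking with the standard move toward the sample is a simplification chosen for brevity. The function names, the decay schedules, and the toy data are all assumptions.

```python
import math
import random

def zscore(profile):
    """Scale a profile to zero mean and unit (population) standard deviation."""
    n = len(profile)
    mean = sum(profile) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in profile) / n)
    return [(v - mean) / std for v in profile]

def pearson(x, y):
    """Pearson correlation; constant profiles are treated as uncorrelated."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def neural_gas(data, k, epochs=30, eps=(0.5, 0.01), lam=(1.0, 0.01), seed=0):
    """Online neural gas with k prototypes; eps and lam are (start, end) pairs."""
    rng = random.Random(seed)
    protos = [list(p) for p in rng.sample(data, k)]  # initialize from the data
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            t = step / total
            e = eps[0] * (eps[1] / eps[0]) ** t  # learning rate, exponential decay
            l = lam[0] * (lam[1] / lam[0]) ** t  # neighborhood range, exponential decay
            # Rank prototypes from most to least similar to the sample.
            ranked = sorted(range(k), key=lambda i: 1.0 - pearson(x, protos[i]))
            for rank, i in enumerate(ranked):
                h = math.exp(-rank / l)
                protos[i] = [w + e * h * (v - w) for w, v in zip(protos[i], x)]
            step += 1
    return protos

# Hypothetical toy data: ten noisy "rising" and ten noisy "peak" profiles.
noise = random.Random(1)
shapes = ([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 4.0, 2.0, 0.0])
data = [zscore([v + noise.uniform(-0.1, 0.1) for v in shape])
        for shape in shapes for _ in range(10)]
protos = neural_gas(data, k=2)
```

Because every prototype is updated according to its rank rather than its position on a fixed grid, the result depends only weakly on which samples are drawn for initialization.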
Self-Organizing Map (SOM) is a method for simultaneous data clustering and dimension reduction. It is similar to neural gas, but restricted to a low-dimensional lattice of connected nodes that is suitable for display. See the Wikipedia notes for further details. As with neural gas, Pearson correlation similarity should be considered for gene expression data. One-dimensional SOMs are very useful for creating spreadsheets ordered by gene expression profile. SOM is part of the SOM-Toolbox.
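The use of a one-dimensional SOM for ordering a spreadsheet can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SOM-Toolbox code: nodes sit on a chain, the best matching unit (BMU) is chosen by (1 - Pearson correlation), neighbors on the chain are updated with a Gaussian weight, and the rows are finally sorted by BMU index. All names, schedules, and the toy data are hypothetical.

```python
import math
import random

def zscore(profile):
    """Scale a profile to zero mean and unit (population) standard deviation."""
    n = len(profile)
    mean = sum(profile) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in profile) / n)
    return [(v - mean) / std for v in profile]

def pearson(x, y):
    """Pearson correlation; constant profiles are treated as uncorrelated."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def best_matching_unit(x, weights):
    """Index of the node whose weight vector correlates best with x."""
    return min(range(len(weights)), key=lambda i: 1.0 - pearson(x, weights[i]))

def train_som_1d(data, nodes, epochs=30, eps=(0.5, 0.01), sigma_end=0.3, seed=0):
    """Train a chain of `nodes` units; the neighborhood width shrinks over time."""
    rng = random.Random(seed)
    weights = [list(p) for p in rng.sample(data, nodes)]
    sigma_start = nodes / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            t = step / total
            e = eps[0] * (eps[1] / eps[0]) ** t
            s = sigma_start * (sigma_end / sigma_start) ** t
            bmu = best_matching_unit(x, weights)
            for i in range(nodes):
                # Gaussian neighborhood measured along the chain.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * s * s))
                weights[i] = [w + e * h * (v - w) for w, v in zip(weights[i], x)]
            step += 1
    return weights

def order_by_bmu(data, weights):
    """Row indices of `data`, sorted by position of their BMU on the chain."""
    return sorted(range(len(data)), key=lambda i: best_matching_unit(data[i], weights))

# Hypothetical toy data: noisy rising, peak, and falling profiles.
noise = random.Random(1)
shapes = ([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 4.0, 2.0, 0.0], [4.0, 3.0, 2.0, 1.0, 0.0])
data = [zscore([v + noise.uniform(-0.1, 0.1) for v in shape])
        for shape in shapes for _ in range(5)]
weights = train_som_1d(data, nodes=4)
order = order_by_bmu(data, weights)  # row order for a spreadsheet
```

Writing the rows out in the sequence given by `order` places profiles that share a node, or that sit on neighboring nodes, next to each other in the spreadsheet.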