Data-Driven Modeling using Spherical Self-Organizing Feature Maps
Publisher: Dissertation
Pub date: 2006
Pages: 156
ISBN-10: 1581123191
ISBN-13: 9781581123197
Categories: Computer Science, Computers, Science
Abstract
Researchers and data analysts increasingly rely on graphical tools to help them model their data, generate hypotheses, and gain deeper insight into experimentally acquired data. Recent advances in technology have made available improved and novel modeling and analysis media that facilitate intuitive, task-driven exploratory analysis and manipulation of the displayed graphical representations. To utilize these emerging technologies, researchers must be able to transform experimentally acquired data vectors into a visual form or secondary representation that has a simple structure and is easily transferable into the media. It is also essential that the representation can be modified or manipulated within the display environment.
This thesis presents a data-driven modeling technique that utilizes the basic learning strategy of an unsupervised clustering algorithm, the self-organizing feature map (SOFM), to adaptively learn topological associations inherent in the data and preserve them within the topology imposed by its predefined spherical lattice, thereby transforming the data into a 3D tessellated form. Because the tessellated graphical forms originate from a sphere, computing their transformation parameters on re-orientation within an interactive, task-driven graphical display medium is simplified. A variety of data sets, including six sets of scattered 3D coordinate data, chaotic attractor data, the more commonly used Fisher’s Iris flower data, medical numeric data, and geographic and environmental data, are used to illustrate the data-driven modeling and visualization mechanism.
The modeling algorithm is first applied to scattered 3D coordinate data to understand the influence of the spherical topology on data organization. Two cases are examined: one in which the integrity of the spherical lattice is maintained during learning, and a second in which the inter-node connections in the spherical lattice are adaptively changed during learning. The analysis uses scattered coordinate data of freeform objects, both those with topology equivalent to a sphere and those whose topology is not. Experiments demonstrate that it is possible to get reasonably good results, with the degree of resemblance, determined by an average of the total normalized error measure, ranging from 6.2 × 10⁻⁵ to 1.1 × 10⁻³. The experimental analysis using scattered coordinate data facilitates an understanding of the algorithm and provides evidence of the topology-preserving capability of the spherical self-organizing feature map.
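The first case above, in which the spherical lattice is kept intact during learning, can be illustrated with a minimal sketch. It is not the thesis's implementation: the Fibonacci-spiral node placement, the Gaussian neighborhood over great-circle angle, the linear decay schedules, and all function and parameter names (`fibonacci_sphere`, `train_spherical_som`, `lr0`, `sigma0`) are assumptions made for illustration, and `mean_quantization_error` is a generic SOM fit measure, not the total normalized error measure reported in the abstract.

```python
import math
import random


def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (a stand-in for a tessellated sphere)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_spherical_som(data, n_nodes=64, epochs=15, lr0=0.5, sigma0=1.0, seed=0):
    """Fit a fixed spherical lattice to 3D points; returns (lattice, weights)."""
    rng = random.Random(seed)
    lattice = fibonacci_sphere(n_nodes)      # fixed node positions on the unit sphere
    weights = [list(p) for p in lattice]     # weight vectors start on the sphere itself
    # Great-circle angles between lattice nodes define the (unchanging) topology.
    arc = [[math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q)))))
            for q in lattice] for p in lattice]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            progress = step / total_steps
            lr = lr0 * (1.0 - progress) + 0.01       # assumed linear decay
            sigma = sigma0 * (1.0 - progress) + 0.1  # neighborhood width in radians
            # Best-matching unit: node whose weight vector is closest to the sample.
            bmu = min(range(n_nodes), key=lambda j: squared_distance(weights[j], x))
            for j in range(n_nodes):
                # Neighbors on the sphere follow the winner, weighted by lattice distance.
                h = math.exp(-arc[bmu][j] ** 2 / (2.0 * sigma ** 2))
                for k in range(3):
                    weights[j][k] += lr * h * (x[k] - weights[j][k])
            step += 1
    return lattice, weights


def mean_quantization_error(data, weights):
    """Average distance from each sample to its nearest weight vector."""
    return sum(math.sqrt(min(squared_distance(w, x) for w in weights))
               for x in data) / len(data)
```

Calling `train_spherical_som` on points sampled from a closed surface such as an ellipsoid returns the fixed lattice together with the deformed node positions; connecting the deformed nodes with the lattice's own neighbor relations would give the tessellated form. The second case, where inter-node connections change during learning, is not covered by this sketch.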
The algorithm is then applied to abstract, seemingly random, numeric data. Unlike 3D coordinate data, where the SOFM lattice lies in the same coordinate frame (domain) as the input vectors, numeric data is abstract. The criterion for deforming the spherical lattice is therefore determined using mathematical and statistical functions as measures of information, tailored to reflect meaningful, tangible inter-vector relationships or associations embedded in the spatial data that reveal some physical aspect of the data. These measures are largely application-dependent and need to be defined by the data analyst or an expert. Interpretation of the resulting 3D tessellated graphical representation or form (glyph) is more complex and task-dependent than that of scattered coordinate data. Very simple measures are used in this analysis in order to facilitate discussion of the underlying mechanism for transforming abstract numeric data into 3D graphical forms or glyphs. Several data sets are used in the analysis to illustrate how novel characteristics hidden in the data, and not easily apparent in the string of numbers, can be reflected via 3D graphical forms.
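One simple way to picture how a scalar measure could deform the spherical lattice into a glyph is to push each node outward along its own radius. The sketch below is purely illustrative: the abstract does not specify the deformation rule, so the radial displacement, the min-max normalization, and the names `deform_glyph` and `alpha` are all assumptions.

```python
def deform_glyph(lattice, measure, alpha=0.5):
    """Displace unit-sphere nodes radially by a per-node scalar measure.

    lattice: list of unit vectors (x, y, z), one per node.
    measure: list of scalars, one per node (hypothetical, analyst-defined).
    alpha:   maximum radial displacement as a fraction of the unit radius.
    """
    if len(lattice) != len(measure):
        raise ValueError("lattice and measure must have the same length")
    lo, hi = min(measure), max(measure)
    span = hi - lo
    vertices = []
    for p, m in zip(lattice, measure):
        # Min-max normalize so the glyph shape does not depend on the measure's units.
        t = (m - lo) / span if span > 0 else 0.0
        radius = 1.0 + alpha * t
        vertices.append(tuple(radius * c for c in p))
    return vertices
```

Under this assumed rule, nodes with larger measure values bulge outward while a constant measure leaves the sphere unchanged, so the shape of the glyph directly reflects how the chosen measure varies over the map.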
The proposed data-driven modeling approach provides a viable mechanism to generate 3D tessellated representations of data that can be easily transferred to a graphical modeling and analysis medium for interactive and intuitive exploration.
About the Author
Dr. Sangole is currently a post-doctoral fellow in the School of Physical and Occupational Therapy at McGill University, Montreal, Quebec. Her research includes studying motor control of the hand in patients with hemiparesis following a stroke, with the objective of identifying motor compensatory strategies in the upper limb. Her overall research interests include investigating motor control issues in rehabilitation, specifically neuromuscular control of movement.
She is a mechanical engineer by training, and her past research experience included interactive design, deformable modeling, scientific data visualization and Computer Aided Design (CAD) data exchange. Driven by a personal interest in rehabilitation, she moved to Galveston, Texas to do research at the University of Texas Medical Branch and the Transitional Learning Center, a post-acute rehabilitation facility for persons with brain injury. After completing a 2-year postdoctoral stint in Texas, she returned to Canada to continue her academic career.