STREAM MODELING and VISUALIZATION

PIs:

Margaret H. Dunham

Vijay Kumar

Students:

Donya Quick

Charlie Isaksson

In today’s world, sensors are everywhere.  They collect information about traffic on roads and traffic in networks.  They collect information about rivers to predict flooding and oceans to predict tsunamis.  They are used in military aircraft and vehicles to collect information about the surroundings.  Satellites orbiting the earth obtain information concerning the environment and have been used to confirm global warming and holes in the ozone layer.  In smart homes sensors gather information about behavior of the elderly and sick to identify potential health problems in real time.  Indeed the ubiquitous use of sensors will only increase in the future. 

What is done (or should be done) with all of the gathered data?  Much research has examined the design of Data Stream Management Systems (DSMS) to collect, preprocess, and query this stream data.  Stream query mechanisms let one query fast-arriving stream data for the most recent information, and so offer timely information for modern database applications.  Data aggregation seems to be the second most popular research topic.  Individual researchers have also examined subsets of the data from specific domains to develop algorithms and techniques targeted to a specific application.

The objective of our research is to develop modeling and visualization techniques that can be applied to streaming data obtained from most sensors.  What is needed is a higher level approach to processing sensor data, one that provides actionable intelligence to the domain experts monitoring the collected data.  We thus assume that the end users of the sensor data are domain experts rather than sensor, database, or computer science experts.  They cannot be expected to look at the data at a low level of detail, nor can they be expected to issue queries to examine it.  Instead, the information needed to make decisions should be pushed to them.

Our research centers around the following areas:

·         Stream Abstraction:  We propose a hierarchical approach to managing stream data created by sensors.  This new technique facilitates implementation of diverse software solutions to the many different types of data and requirements presented by sensor systems.  At the same time it facilitates software reuse for many of the individual components of the system.

·         Stream Modeling:  Markov Chains (MC) have been extensively used in many applications.  However, the static nature of MCs does not fit into this dynamic environment.  We have developed a dynamic modeling technique based on MCs, called the Extensible Markov Model (EMM), as a modeling tool for these complex spatiotemporal environments.  There are many advantages to the use of EMM for data stream modeling:

o        Scalability – EMMs grow at a sublinear rate.  Our experiments indicate that the EMM size may be only 1% of what it would be if the growth were linear.

o        Continued Learning – EMM dynamically “learns” the stream data model.  When EMM has learned the model, the growth stops.  It starts again when the model changes.  This makes the EMM ideal for a dynamically changing environment such as traffic (Web or automobile). 

o        Concept Drift – As the stream data arrives, changes may occur over time.  The EMM allows this concept drift to be detected, and its graph structure facilitates the easy deletion of obsolete states.

·         Stream Visualization:  Our visualization is at the domain expert level and captures both time and ordering.  In addition, it is easily “read” for many sets of sensors.
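
To make the Stream Modeling item above more concrete, the following is a minimal sketch of a Markov chain whose state set grows as new kinds of readings arrive.  It is an illustration under stated assumptions, not the project's EMM implementation: the class name ExtensibleMarkovSketch, the nearest-centroid matching rule, the Euclidean distance, and the threshold parameter are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


class ExtensibleMarkovSketch:
    """Toy Markov chain that adds a state when a reading matches no existing one.

    Assumption: a reading is matched to the nearest stored state vector and
    reuses that state only if the Euclidean distance is within `threshold`.
    """

    def __init__(self, threshold):
        self.threshold = threshold      # hypothetical cutoff for reusing a state
        self.centroids = []             # one representative vector per state
        self.counts = defaultdict(int)  # (from_state, to_state) -> transitions seen
        self.current = None             # state matched by the previous reading

    def _nearest(self, vector):
        """Return (state index, distance) of the closest state, or (None, inf)."""
        best, best_dist = None, math.inf
        for idx, centroid in enumerate(self.centroids):
            dist = math.dist(vector, centroid)
            if dist < best_dist:
                best, best_dist = idx, dist
        return best, best_dist

    def update(self, vector):
        """Consume one reading; grow the model only if no state is close enough."""
        state, dist = self._nearest(vector)
        if state is None or dist > self.threshold:
            self.centroids.append(tuple(vector))
            state = len(self.centroids) - 1
        if self.current is not None:
            self.counts[(self.current, state)] += 1
        self.current = state
        return state

    def transition_probability(self, src, dst):
        """Estimate P(dst | src) from the transition counts observed so far."""
        total = sum(n for (s, _), n in self.counts.items() if s == src)
        return self.counts.get((src, dst), 0) / total if total else 0.0


# Made-up two-dimensional readings that alternate between two regimes.
readings = [(10, 5), (11, 5), (40, 30), (41, 29), (10, 6), (40, 31)]
model = ExtensibleMarkovSketch(threshold=3.0)
states = [model.update(r) for r in readings]
```

In this sketch the six made-up readings produce only two states, because later readings fall close to states that already exist.  That is the sense in which the number of states can grow more slowly than the number of readings and stop growing once the stream's behavior has been learned.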