Flood Prediction

Since June 2001, the SMU Database Research Group has been working on using data mining techniques to make predictions from spatially and temporally distributed data.

Challenge:

Our specific application is to predict flood events on a river from data about current and past weather conditions. The techniques we have explored so far include time-series analysis, neural networks, and Markov models.

Research:

We have focused on applying the principles of Hidden Markov Models (HMMs) and derived models to the problem above. An HMM is made up of two stochastic (or probabilistic) processes: a hidden Markov chain that moves between internal states, and an observable process in which each state emits a symbol. Together they produce a sequence of observable symbols. HMMs can be trained to 'recognize' a particular sequence of symbols using specialized algorithms. In one of our proposals for solving the flood problem, several HMMs are trained to recognize the weather conditions that precede a flood event. The hope is that they will be able to predict a flood before it happens based on the current weather conditions.
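The recognition idea above can be sketched with the standard forward algorithm, which computes how likely a sequence of observed symbols is under a given HMM. This is a minimal sketch, not our implementation: the class name, the weather symbols (0 = dry, 1 = light rain, 2 = heavy rain), and every probability are made-up assumptions chosen only for illustration.

```java
final class HmmForward {
    // Hypothetical "pre-flood" model: 2 hidden states, 3 weather symbols.
    static final double[] PRE_PI = {0.5, 0.5};
    static final double[][] PRE_A = {{0.7, 0.3}, {0.2, 0.8}};
    static final double[][] PRE_B = {{0.5, 0.4, 0.1}, {0.1, 0.3, 0.6}};

    // Hypothetical "normal conditions" model with the same shape.
    static final double[] NORM_PI = {0.8, 0.2};
    static final double[][] NORM_A = {{0.9, 0.1}, {0.5, 0.5}};
    static final double[][] NORM_B = {{0.8, 0.15, 0.05}, {0.3, 0.5, 0.2}};

    /**
     * Scaled forward algorithm: returns log P(obs | model).
     * pi = initial state probabilities, a = state transition matrix,
     * b[state][symbol] = emission probabilities.
     */
    static double logLikelihood(double[] pi, double[][] a, double[][] b, int[] obs) {
        int n = pi.length;
        double[] alpha = new double[n];
        double logProb = 0.0;
        for (int t = 0; t < obs.length; t++) {
            double[] next = new double[n];
            double scale = 0.0;
            for (int j = 0; j < n; j++) {
                double prior;
                if (t == 0) {
                    prior = pi[j];
                } else {
                    prior = 0.0;
                    for (int i = 0; i < n; i++) {
                        prior += alpha[i] * a[i][j];
                    }
                }
                next[j] = prior * b[j][obs[t]];
                scale += next[j];
            }
            if (scale == 0.0) {
                return Double.NEGATIVE_INFINITY; // sequence impossible under this model
            }
            // Normalize to avoid underflow; the log of each scale factor sums to log P(obs).
            for (int j = 0; j < n; j++) {
                next[j] /= scale;
            }
            alpha = next;
            logProb += Math.log(scale);
        }
        return logProb;
    }

    public static void main(String[] args) {
        int[] recentWeather = {1, 2, 2, 2}; // light rain followed by heavy rain
        double pre = logLikelihood(PRE_PI, PRE_A, PRE_B, recentWeather);
        double norm = logLikelihood(NORM_PI, NORM_A, NORM_B, recentWeather);
        System.out.println("log-likelihood under pre-flood model: " + pre);
        System.out.println("log-likelihood under normal model:    " + norm);
        System.out.println(pre > norm ? "closer to pre-flood pattern" : "closer to normal pattern");
    }
}
```

In this sketch, a flood warning would correspond to the pre-flood model scoring the recent weather sequence higher than the normal model. In practice the model parameters would come from training on historical data rather than being written by hand.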

Proposals:

We are also attempting to model the water level of a river using the internal states of an HMM. The weather conditions that influence the water level are allowed to influence the transitions between the states in the HMM. The hope is that, given a set of weather conditions that precede a flood, the HMM will make the state transitions that lead to a 'flood state'.
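One simple way to picture weather-influenced transitions is to keep a separate transition matrix for each weather condition and apply the one that matches each time step. The sketch below does only that and is not our actual model: the three water-level states, the two weather conditions, and all probabilities are hypothetical values picked for illustration.

```java
final class WeatherDrivenChain {
    // Hypothetical water-level states.
    static final int LOW = 0, ELEVATED = 1, FLOOD = 2;
    // Hypothetical weather conditions.
    static final int DRY = 0, RAIN = 1;

    // One transition matrix per weather condition (made-up numbers; each row sums to 1).
    static final double[][][] A_BY_WEATHER = {
        { // DRY: water level tends to fall
            {0.95, 0.05, 0.00},
            {0.40, 0.55, 0.05},
            {0.10, 0.50, 0.40}
        },
        { // RAIN: water level tends to rise
            {0.50, 0.45, 0.05},
            {0.05, 0.55, 0.40},
            {0.00, 0.20, 0.80}
        }
    };

    /** One step of the chain: next[j] = sum over i of dist[i] * a[i][j]. */
    static double[] step(double[] dist, double[][] a) {
        int n = dist.length;
        double[] next = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                next[j] += dist[i] * a[i][j];
            }
        }
        return next;
    }

    /** Propagates a state distribution through a weather sequence. */
    static double[] propagate(double[] start, double[][][] aByWeather, int[] weather) {
        double[] dist = start.clone();
        for (int w : weather) {
            dist = step(dist, aByWeather[w]);
        }
        return dist;
    }

    public static void main(String[] args) {
        double[] start = {1.0, 0.0, 0.0}; // river starts in the LOW state
        double[] afterRain = propagate(start, A_BY_WEATHER, new int[]{RAIN, RAIN, RAIN});
        double[] afterDry = propagate(start, A_BY_WEATHER, new int[]{DRY, DRY, DRY});
        System.out.println("P(flood state) after three rainy steps: " + afterRain[FLOOD]);
        System.out.println("P(flood state) after three dry steps:   " + afterDry[FLOOD]);
    }
}
```

With these made-up matrices, a run of rainy steps moves probability mass toward the flood state, while dry steps keep it near the low state. That is the behavior the proposal hopes a trained model will show on real data.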

The proposals are quite simple conceptually but, as they say in speech recognition research, the devil is in the details. Both models have been implemented in Java. We are currently testing them on simple data sets and looking for ways to improve their performance.

In preliminary tests, the models have performed well on best-case or trivial scenarios but have produced large errors on the more realistic data sets. By making some modifications to the models, we hope to reduce these errors so that we can eventually decide how appropriate HMMs are for modeling river conditions. HMMs are well suited to modeling real-world processes that are characterized by periods of steady behavior and periods of gradual change.

This research project was initially motivated by the SIVAM project -- an effort to monitor the conditions of the Amazon River in Brazil. Our long-term goal is to adapt some of our models to make flood predictions on the Amazon River. In the meantime, we hope to submit some papers describing our models and their performance.