by Craig S. Mullins
Have you heard about stream computing? Basically, it involves the ingestion of data -
By analyzing large streams of data and looking for trends, patterns, and "interesting" data, stream computing can solve problems that were not practical to solve using traditional computing methods. Another useful way of thinking about this is as RTAP -
Consider a healthcare example. IBM and the University of Ontario Institute of Technology (UOIT) are using an IBM stream computing product, InfoSphere Streams, to help doctors detect subtle changes in the condition of critically ill premature babies. The software ingests a constant stream of biomedical data, such as heart rate and respiration, along with clinical information about the babies. Monitoring premature babies as a patient group is especially important as certain life-
But a stream computing solution can monitor the stream of healthcare data constantly. As a result, many types of early diagnoses can be made that would take medical professionals much longer to reach. For example, an overly rhythmic heartbeat can indicate problems (like infections); a normal heartbeat is more variable. Analyzing an ECG stream can highlight this pattern and alert medical professionals to a problem that might otherwise go undetected for a long period. Detecting the problem early can allow doctors to treat an infection before it causes great harm.
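To make the heartbeat example concrete, here is a minimal Python sketch of the idea: watch a sliding window of beat-to-beat intervals and raise an alert when they stop varying. This is not how InfoSphere Streams or the UOIT project implements it; the function name, window size, and threshold are hypothetical values chosen for illustration and have no clinical meaning.

```python
from collections import deque
from statistics import pstdev
from typing import Iterable, Iterator, Tuple


def low_variability_alerts(
    rr_intervals_ms: Iterable[float],
    window_size: int = 5,        # hypothetical: number of recent beats to compare
    min_stdev_ms: float = 10.0,  # hypothetical threshold, not a clinical value
) -> Iterator[Tuple[int, float]]:
    """Yield (index, spread) whenever the last `window_size` beat-to-beat
    intervals vary by less than `min_stdev_ms` (population standard deviation)."""
    window = deque(maxlen=window_size)  # old intervals fall off automatically
    for index, interval in enumerate(rr_intervals_ms):
        window.append(interval)
        if len(window) == window_size:
            spread = pstdev(window)
            if spread < min_stdev_ms:
                yield index, spread


# Made-up intervals (milliseconds): variable at first, then suspiciously regular.
readings = [410, 455, 390, 470, 425, 430, 431, 430, 429, 430, 430]
alert_positions = [index for index, _ in low_variability_alerts(readings)]
```

Because the function is a generator, it handles each interval as it arrives and keeps only the current window in memory. Nothing needs to be stored first, which is the essential difference between stream processing and querying persisted data.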
A stream computing application can get quite complex. Continuous applications, composed of individual operators, can be interconnected and operate on multiple data streams. Again, think about the healthcare example. There can be multiple streams (blood pressure, heart rate, temperature, etc.) from multiple patients (because infections travel from patient to patient), with multiple possible diagnoses.
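The notion of small operators wired together over several streams can be sketched with ordinary Python generators. This is only an analogy, not the InfoSphere Streams programming model; the `Reading` type, the operator names, and the limits are all assumptions invented for this example.

```python
import heapq
from typing import Callable, Dict, Iterable, Iterator, NamedTuple


class Reading(NamedTuple):
    timestamp: int  # seconds since monitoring started
    patient: str
    vital: str      # e.g. "heart_rate" or "temperature"
    value: float


def merge_streams(*streams: Iterable[Reading]) -> Iterator[Reading]:
    """Operator: interleave several time-ordered streams into one."""
    return heapq.merge(*streams, key=lambda r: r.timestamp)


def keep(predicate: Callable[[Reading], bool],
         stream: Iterable[Reading]) -> Iterator[Reading]:
    """Operator: pass along only the readings that satisfy the predicate."""
    return (r for r in stream if predicate(r))


def patients_flagged(stream: Iterable[Reading]) -> Iterator[str]:
    """Operator: emit each patient once, the first time a reading for them arrives."""
    seen = set()
    for r in stream:
        if r.patient not in seen:
            seen.add(r.patient)
            yield r.patient


# Hypothetical limits for illustration only; these are not clinical values.
LIMITS: Dict[str, float] = {"heart_rate": 200.0, "temperature": 38.0}

heart = [Reading(1, "A", "heart_rate", 150.0), Reading(3, "A", "heart_rate", 210.0)]
temp = [Reading(2, "B", "temperature", 36.8), Reading(4, "B", "temperature", 39.1)]

# Wire the operators together: merge -> filter -> report.
out_of_range = keep(lambda r: r.value > LIMITS[r.vital], merge_streams(heart, temp))
flagged = list(patients_flagged(out_of_range))
```

Each operator consumes one stream and produces another, so more streams, patients, or checks can be added by connecting additional operators rather than rewriting the application.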
Consider a second example: law enforcement. A stream computing application can monitor a stream of video data produced by a surveillance camera. Much of the stream will not be interesting. It becomes interesting when a person shows up in the video. The stream computing application can constantly analyze the video stream, performing scene detection and face identification. When something “interesting” is found, that section of video can be captured and retained. The face might even be matched automatically against a database of known criminals.
As mentioned earlier, the IBM product for stream computing is called InfoSphere Streams. It runs on xSeries blades (up to 125 x86 blades) using Linux. It is based on three main abstractions:
The data streams into the system, which is built as a series of cascading steps. Each step progressively refines the analysis, looking for information, patterns, trends, and diagnoses. IBM's stream computing offerings and research are the result of more than 20 years of IBM information management expertise, five years of development by IBM Research, and more than 200 patents.
The ability to process millions of data points per second and perform advanced analytics on the data stream can help to usher in a shift in the way we manage and deal with vast amounts of data.
Not all data can be, or even needs to be, persisted in a database. The future is here and it might be time for us to re-
From Database Trends and Applications, May 2010.
© 2012 Craig S. Mullins,