Data Stream Management Systems - A Technology
for Network Monitoring and Traffic Analysis?
Prof. Vera Goebel and Prof. Thomas
Plagemann
University of Oslo, Department of Informatics, Oslo, Norway
Date & time: Tuesday, June 14, 2005, 14:00-18:00
Location: Faculty of Electrical Engineering and
Computing, room: White Hall
ABSTRACT
In the last decade, a new class of data-intensive applications,
like sensor networks, network traffic analysis, financial tickers,
web or telecommunications transaction log analysis, has become
widely recognized. These applications require support for on-line
analysis of rapidly changing data streams. However, traditional
database management systems (DBMSs) have no pre-defined notion
of time and cannot handle data on-line (i.e., in main memory without
storing the data on disk) in near real-time. During the last five
years, Data Stream Management Systems (DSMSs) have been developed
to handle transient data streams on-line and to process continuous
queries on these data streams. The first prototypes and systems
are now becoming available.
The fundamental difference between a classical DBMS and a DSMS
is the data stream model. Instead of processing a query over a
persistent set of data that is stored in advance on disk, queries
are performed in DSMSs over a data stream. In a data stream, data
elements arrive on-line and stay only for a limited time period
in memory. Consequently, the DSMS has to handle the data elements
before the buffer is overwritten by new incoming data elements.
The order in which the data elements arrive cannot be controlled
by the system. Once a data element has been processed it cannot
be retrieved again without storing it explicitly. The size of
data streams is potentially unbounded and can be thought of as
an open-ended relation. In DSMSs, continuous queries continuously
evaluate the arriving data elements. Standard operator types
that are supported by most existing DSMSs are filtering, mapping,
aggregates, and joins. Since continuous streams may not end, intermediate
results of continuous queries are often generated over a predefined
window and then either stored, updated, or used to generate a
new data stream of intermediate results. Window techniques are
especially important for aggregation and join queries. Examples
of DSMSs include STREAM, GigaScope, and TelegraphCQ.
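The window idea described above can be illustrated with a minimal sketch in Python. It is not taken from STREAM, GigaScope, or TelegraphCQ; the Packet record, its field names, and the function tumbling_window_bytes are hypothetical names chosen for illustration. The sketch assumes that timestamps arrive in non-decreasing order and that windows are fixed-size and non-overlapping. It filters the stream, aggregates bytes per source within each window, and emits one result per window as a new stream of intermediate results.

```python
from collections import namedtuple
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical stream element: arrival time in seconds, source address, size in bytes.
Packet = namedtuple("Packet", ["ts", "src", "length"])


def tumbling_window_bytes(
    packets: Iterable[Packet], window: int, min_len: int = 0
) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Continuous query sketch: bytes per source over fixed-size windows.

    Each element is seen exactly once and is not stored after processing;
    only the per-window totals are kept in memory.
    Assumes timestamps are non-decreasing (an assumption of this sketch).
    """
    window_start = None
    totals: Dict[str, int] = {}
    for p in packets:
        if p.length < min_len:          # filtering operator
            continue
        if window_start is None:        # align the first window to a multiple of `window`
            window_start = p.ts - (p.ts % window)
        # Close every window that ended before this element arrived.
        while p.ts >= window_start + window:
            yield window_start, totals
            totals = {}
            window_start += window
        # Aggregation operator: running sum per source within the window.
        totals[p.src] = totals.get(p.src, 0) + p.length
    if window_start is not None:        # flush the last window of a finite test stream
        yield window_start, totals


if __name__ == "__main__":
    stream = [
        Packet(0, "a", 100),
        Packet(1, "b", 50),
        Packet(2, "a", 10),   # dropped by the filter when min_len=20
        Packet(5, "a", 200),
        Packet(12, "b", 70),
    ]
    for start, result in tumbling_window_bytes(stream, window=5, min_len=20):
        print(start, result)
```

A real data stream would not end, so the final flush only matters for finite test input. Sliding windows and windowed joins follow the same pattern but keep more state per window.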
Many traffic analysis tasks are solved with tools that are developed
in an ad-hoc, incremental, and cumbersome way rather than with
systematic solutions that are easy to reuse and understand. The
huge amount of data that has to be managed and analyzed, together
with the fact that many different analysis tasks are performed
over a small set of network trace formats, motivated us to study
whether DBMSs, and especially DSMSs, might be useful for network
monitoring and for developing traffic analysis tools.
TUTORIAL GOAL
The goal of this tutorial is to share with the participants
our insights into using DBMSs and DSMSs to systematically address
network monitoring and traffic analysis problems. We will explain
the main concepts of DSMSs and give an overview of the state of
the art and the existing prototypes. Since we have gained practical
experience with the DSMSs TelegraphCQ and STREAM, we will go into
more depth with these systems. Furthermore, we will show how off-line
analysis is systematically addressed in the IntraBase system, which
is based on PostgreSQL.
Thus, participants should be able to evaluate whether current
or future DSMSs might be helpful for their needs and how
traditional DBMSs might be used for traffic analysis.
TUTORIAL OUTLINE
- Introduction:
  - What are DSMS?
  - DSMS vs. DBMS
  - Why do we need DSMS (now)?
- Network monitoring and traffic analysis
  - Traditional script-based approach
  - Requirements on systematic approaches
- Network monitoring with TelegraphCQ
  - Four simple tasks
  - T-RAT
  - Performance study
  - Lessons learned
- Off-line Traffic Analysis in IntraBase
- DSMS Concepts, Issues and Prototypes
- Summary and outlook
INTENDED AUDIENCE
Engineers and researchers who are interested in network monitoring
and traffic analysis and in this new technology, whether they simply
want to learn about it or want to understand whether and how
DSMSs and DBMSs might be used in their current and future work.
A basic understanding of the core notions of relational DBMSs
will be helpful, but is not mandatory.
SPEAKERS' BIOGRAPHIES
Vera Goebel received her Diploma in Computer
Science from the University of Erlangen-Nuernberg (Germany)
in 1989, and her PhD from the University of Zurich (Switzerland)
in 1994. From 1994 to 1996 she was a PostDoc at UniK - Center
of Technology at Kjeller and Telenor (Norway). Since 1997,
Vera Goebel has been a Professor at the Department of Informatics
of the University of Oslo, where she works in the Distributed
Multimedia Systems Group. Her research interests include
database systems, operating systems, middleware, QoS, and
distributed systems. She has published over 50 refereed
papers in her field, and she has served as a member and chair
of programme committees of many major international conferences
and workshops.
Thomas Plagemann received his Diploma in
Computer Science from the University of Erlangen-Nuernberg
(Germany) in 1990, and his Doctor of Technical Science from the
Swiss Federal Institute of Technology (ETH) Zurich (Switzerland)
in 1994. In 1995, he was honored with the Medal of the ETH
Zurich for his doctoral thesis, in which he developed the
Da CaPo communication subsystem. From 1994 to 1996 Thomas
Plagemann was a researcher at UniK - Center of Technology
at Kjeller and Telenor (Norway). Since 1996, Thomas Plagemann
has been a Professor at the University of Oslo, where he heads
the Distributed Multimedia Systems Group. His research interests
include multimedia middleware, QoS, operating system support
for distributed multimedia systems, and data stream management
systems for network monitoring. He has published over 60
refereed papers in his field, he is a member of the Editorial
Board of the ACM Transactions on Multimedia Computing, Communications,
and Applications, and he has served as a member and chair
of programme committees of many major international conferences
and workshops. Furthermore, he has given several successful
tutorials at international events, such as MIPS 2004, ACM Multimedia
2002, DAIS 2001, PROMS 2001, IDMS'99, and ConTEL'99.