Suite of Benchmark Applications for Stream Processing Systems

lucarin91/StreamBenchmarks

 
 

License: LGPL v3

StreamBenchmarks

This repository includes a set of seven streaming applications taken from the literature and from existing repositories (e.g., here), which have been cleaned up properly. All the applications can be run in a uniform manner, and each run collects throughput and latency statistics in different ways. The applications are:

  • FraudDetection (FD) -> applies a Markov model to calculate the probability that a credit card transaction is fraudulent
  • SpikeDetection (SD) -> detects spikes in a stream of sensor readings using a moving-average operator over 1,000 events and a filter based on a fixed threshold
  • TrafficMonitoring (TM) -> processes a stream of events emitted from taxis in the city of Beijing. One operator identifies the road on which each vehicle is traveling, and another operator updates the average speed of vehicles for each road
  • WordCount (WC) -> counts the number of instances of each word present in a text file
  • Yahoo! Streaming Benchmark (YSB) -> emulates an advertisement application. The goal is to compute 10-second windowed counts of advertisement campaigns that have the same type
  • LinearRoad (LR) -> emulates a tolling system for vehicle expressways. The system uses variable tolling, which accounts for traffic congestion and accident proximity when calculating toll charges
  • VoipStream (VS) -> used in the evaluation of Blockmon. It detects telemarketing users by analyzing call detail records with a set of Bloom filters
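
As an illustration of the kind of logic these applications implement, the following is a minimal Python sketch of the SpikeDetection idea: a moving average over the most recent events followed by a fixed-threshold filter. It is not the code from this repository. The function name `detect_spikes`, the default `threshold` value, the choice to include the current reading in the average, and the flagging rule (deviation from the moving average greater than `threshold` times the average) are all assumptions made for the example.

```python
from collections import deque


def detect_spikes(readings, window=1000, threshold=0.03):
    """Yield (index, value, moving_avg) for readings flagged as spikes.

    Hypothetical sketch: the flagging rule and the default threshold
    are assumptions, not taken from the repository's implementation.
    """
    recent = deque()  # the last `window` readings, current one included
    total = 0.0       # running sum of the values held in `recent`
    for i, value in enumerate(readings):
        recent.append(value)
        total += value
        if len(recent) > window:
            # Evict the oldest reading so the window keeps its size.
            total -= recent.popleft()
        avg = total / len(recent)
        # Fixed-threshold filter on the relative deviation (assumed rule).
        if abs(value - avg) > threshold * abs(avg):
            yield i, value, avg


if __name__ == "__main__":
    stream = [10.0] * 5 + [20.0]
    for index, value, avg in detect_spikes(stream, window=3, threshold=0.25):
        print(f"spike at event {index}: value={value}, moving average={avg:.2f}")
```

Keeping a running sum makes each update constant-time regardless of the window size, which matters for a window of 1,000 events on a high-rate stream.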

The seven applications are available in three different Stream Processing Systems (SPSs):

  • Apache Storm
  • Apache Flink
  • WindFlow (link)

The same applications (except YSB) are also available for BriskStream, a research SPS for multicores. They can be found here. If you need to test YSB in BriskStream, send me an email and I will share the source code with you.

Dataset files are quite large. For some applications, the scripts to generate them have been provided in this repository. For the other applications, send me an email.

Contributors

The main developer and maintainer of this repository is Gabriele Mencagli. Other authors of the source code are two former Master's students in my group: Alessandra Fais and Andrea Cardaci.

Languages

  • C++ 74.9%
  • C 13.6%
  • HTML 3.0%
  • Java 2.2%
  • Python 2.1%
  • SWIG 1.6%
  • Other 2.6%