Pi1 - Lehrstuhl Praktische Informatik I
Laboratory for Dependable Distributed Systems
University of Mannheim

About


We offer several sets of behavior reports on this site. All behavior reports are generated by CWSandbox. The binaries of the reference set are carefully selected from our malware database and are all labeled by several anti-virus products. Each application data set consists of binaries that were collected on the internet within 24 hours. They are not filtered and are labeled only with the result of Kaspersky Anti-Virus.

If you have any questions, please contact Ben Stock.

Data sets

Our reference data set is extracted from our large database of malware binaries maintained at the CWSandbox. The malware binaries have been collected over a period of three years from a variety of sources. From the overall database, we select binaries that have been assigned to a known class of malware by the majority of six independent anti-virus products. We append the overall anti-virus label to the filename of each report. Although anti-virus labels suffer from inconsistency, we expect the selection using different scanners to be reasonably consistent and accurate. To compensate for the skewed distribution of classes, we discard classes with fewer than 20 samples and restrict the maximum contribution of each class to 300 binaries. The selected malware binaries are then executed and monitored using CWSandbox, resulting in a total of 3,131 behavior reports in MIST format. A listing of the contained malware classes is provided here. You can download the data set in the original CWSandbox encoding, in the sequential CWSandbox encoding, and in the MIST encoding.

Malware class    #
ADULTBROWSER     262
ALLAPLE          300
BANCOS           48
CASINO           140
DORFDO           65
EJIK             168
FLYSTUDIO        33
LDPINCH          43
LOOPER           209
MAGICCASINO      174
PODNUHA          300
POSION           26
PRONDIALER       98
RBOT             101
ROTATOR          300
SALITY           85
SPYGAMES         139
SWIZZOR          78
VAPSUP           45
VIKING_DLL       158
VIKING_DZ        68
VIRUT            202
WOIKOINER        50
ZHELATIN         41
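
The selection procedure for the reference data set can be illustrated with a short Python sketch. It is not the code used to build the data set. The function names, the input layout (a mapping from a binary identifier to the class labels reported by the scanners), the reading of "majority" as more than half of the scanners, and the use of random sampling to enforce the 300-binary cap are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def majority_label(labels):
    """Return the class named by more than half of the scanners, else None.

    `labels` holds one entry per scanner; None means "no class assigned".
    Treating "majority" as a strict majority is an assumption.
    """
    votes = Counter(label for label in labels if label is not None)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count > len(labels) / 2 else None


def select_reference_set(scan_results, min_size=20, max_size=300, seed=0):
    """Group binaries by majority label, then balance the class sizes.

    `scan_results` is a hypothetical mapping: binary id -> list of labels.
    """
    by_class = defaultdict(list)
    for binary_id, labels in scan_results.items():
        label = majority_label(labels)
        if label is not None:
            by_class[label].append(binary_id)

    rng = random.Random(seed)
    selected = {}
    for label, ids in by_class.items():
        if len(ids) < min_size:
            continue  # discard classes with fewer than min_size samples
        ids = sorted(ids)
        if len(ids) > max_size:
            # Random sampling is an assumption; the text only states the cap.
            ids = rng.sample(ids, max_size)
        selected[label] = ids
    return selected
```

With the default thresholds, every class in the result holds between 20 and 300 binaries, which matches the range of counts in the class listing above.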


The application data set consists of seven chunks of malware binaries obtained from the anti-malware vendor Sunbelt Software. The binaries correspond to malware collected during seven consecutive days in August 2009 and originate from a variety of sources. Sunbelt Software uses these very samples to create and update signatures for their VIPRE anti-malware product as well as for their security data feed ThreatTrack. The complete application data set consists of 33,698 behavior reports in MIST format. We also append the results of Kaspersky Anti-Virus (thanks to Virustotal) to the filename of the reports. Statistics for the data set and the characteristics of the contained behavior reports are provided here. The data set can be downloaded in several encodings.

Data set description
Collection period            August 1-7, 2009
Collection location          Sunbelt Software
Data set size (kilobytes)    21,808,644
Number of reports            33,698

Data set statistics            min.     avg.      max.
Reports per day                3,760    4,814     6,746
Instructions per report        15       11,921    103,039
Size per report (kilobytes)    1        647       5,783
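
Two of the averages in the statistics follow directly from the totals in the data set description. The following Python sketch reproduces them as a quick consistency check. The constant and function names are chosen here for illustration only; the figures are the ones stated above.

```python
# Totals taken from the data set description.
NUM_REPORTS = 33_698
NUM_DAYS = 7
TOTAL_SIZE_KB = 21_808_644


def average(total, count):
    """Arithmetic mean of `total` spread over `count` items."""
    return total / count


reports_per_day = average(NUM_REPORTS, NUM_DAYS)
kb_per_report = average(TOTAL_SIZE_KB, NUM_REPORTS)

print(f"avg. reports per day: {reports_per_day:.0f}")
print(f"avg. size per report: {kb_per_report:.0f} KB")
```

The minimum and maximum values cannot be derived this way, since they depend on the individual reports.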


Downloads


The malware binaries are not accessible over this web interface. Please contact Philipp Trinius if you are interested in the malware binaries.


Data Set                  Hashes      CWS Reports    Sequential Reports    MIST Reports    Malheur
Reference data set        download    download       download              download        download
Application data set 1    download    download       download              download        download
Application data set 2    download    download       download              download        download
Application data set 3    download    download       download              download        download
Application data set 4    download    download       download              download        download
Application data set 5    download    download       download              download        download
Application data set 6    download    download       download              download        download
Application data set 7    download    download       download              download        download