SIEVE: Cybersecurity Log Dataset Collection for SIEM Event Classification
For a detailed description of the dataset, refer to:
P. Artioli, G. Pellegrini, A. Magrì, V. Dentamaro, S. Galantucci, G. Semeraro, “SIEVE: Generating a Cybersecurity Log Dataset Collection for SIEM Event Classification,” Computer Networks [LINK TO DOI].
Dataset format
The datasets are distributed in CSV format, for immediate use in machine learning pipelines, with the following header columns:
- category: the event category, capturing the action taken as described by the source (e.g., authentication-success, http-request-success, process-started, user-deletion)
- log: the raw log entry
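As a minimal sketch of how a file in this layout might be read, assuming a comma-delimited file with a `category,log` header row and standard CSV quoting for log entries that contain commas. The helper name `load_sieve_rows` and the two sample rows are hypothetical and are not taken from the dataset:

```python
import csv
import io


def load_sieve_rows(handle):
    """Read (category, log) pairs from an open CSV file handle."""
    reader = csv.DictReader(handle)
    # Fail early if the header does not match the two documented columns.
    missing = {"category", "log"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [(row["category"], row["log"]) for row in reader]


# Hypothetical sample standing in for a real SIEVE file.
sample = io.StringIO(
    "category,log\n"
    'authentication-success,"sshd[1024]: Accepted password for alice, port 22"\n'
    'process-started,"systemd[1]: Started example.service"\n'
)
rows = load_sieve_rows(sample)
print(rows[0][0])
```

For a file on disk, the same function could be called with a handle opened as `open(path, newline="", encoding="utf-8")`. The encoding is an assumption, and `newline=""` is the setting the Python `csv` module recommends for file objects.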
The datasets include 30 balanced event classes, manually assigned by a panel of cybersecurity experts following the Elastic Common Schema event categorization guidelines. To reach consensus and resolve conflicts, the experts performed two rounds of blind re-evaluation on a randomly sampled 20 percent of the patterns, resulting in a Krippendorff's alpha of 0.82 (substantial agreement).
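The class count and balance stated above can be checked on a loaded file with a few lines of code. The sketch below assumes that "balanced" means an equal number of entries per class, which this description does not define precisely. The `class_balance` helper and the toy labels are hypothetical:

```python
from collections import Counter


def class_balance(labels):
    """Return (number of classes, smallest class size, largest class size)."""
    counts = Counter(labels)
    if not counts:
        return 0, 0, 0
    return len(counts), min(counts.values()), max(counts.values())


# Hypothetical labels; with a real file these would be the `category` values.
labels = (
    ["authentication-success"] * 3
    + ["process-started"] * 3
    + ["user-deletion"] * 3
)
n_classes, smallest, largest = class_balance(labels)
# Under the equal-count assumption, a balanced dataset has smallest == largest.
is_balanced = smallest == largest
print(n_classes, smallest, largest, is_balanced)
```

On a SIEVE file, `n_classes` would be expected to equal 30. A small gap between `smallest` and `largest` may still be consistent with the description if the balancing is approximate.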
To access the SIEVE dataset, send an email request to sieve.requests@bvtech.com containing:
- Your name and contact information
- Your affiliation (university, research institution, or company)
- A brief description of the intended use of the dataset
- A confirmation that you will cite the SIEVE dataset in any resulting publications or applications, as follows:
P. Artioli, G. Pellegrini, A. Magrì, V. Dentamaro, S. Galantucci, G. Semeraro, “SIEVE: Generating a Cybersecurity Log Dataset Collection for SIEM Event Classification,” Computer Networks [LINK TO DOI].
Project funded by the European Regional Development Fund Puglia POR Puglia 2014 - 2020 - Axis I - Specific Objective 1a - Action 1.1 (R&D), and with the support of the University of Bari and the Massachusetts Institute of Technology (MIT).