S2-group/msr4sa-systematic-mapping-study-rep-pkg

MSR Approaches for Software Architecture – a Systematic Mapping Study

This is the replication package of the systematic mapping study on mining software repositories (MSR) approaches for software architecture, published in the Information and Software Technology journal in 2025. The title of the study is "Mining Software Repositories for Software Architecture – a Systematic Mapping Study".

The repository contains all the material required to replicate the study, including the complete list of papers and the data from every analysis phase.

The study has been designed, performed, and reported by the following researchers:

- Mohamed Soliman - Paderborn University, Paderborn, Germany
- Michel Albonico - Federal University of Technology, Paraná, IntelAgir - Brazil
- Ivano Malavolta - Vrije Universiteit Amsterdam, S2 Group - The Netherlands
- Andreas Wortmann - University of Stuttgart, ISW - Germany

For further information, interested researchers can email any of the investigators listed above.

Repository Structure

The directory is structured as follows:

scripts/dblp_search.py          Script for initial paper crawling.
data/starting_set.csv           Spreadsheet with the first studies.
data/snowballing.csv            Spreadsheet with the papers from snowballing.
data/thematic_analysis.csv      Spreadsheet with the thematic analysis.
data/system_names.csv           Spreadsheet with the list of systems.
data/graph_generation.xlsx      Spreadsheet used for graph generation.
Petersen - SMS checklist.pdf    Calculation of the quality score of this study according to Petersen et al.

The spreadsheet for graph generation is also available online.

Crawling Papers

To replicate our paper crawling, follow the steps below.

First, change to the scripts folder:

$ cd scripts/

Then download the latest DBLP snapshot and extract it into the scripts folder (the file is large, over 4 GB). Also download the compatible document type definition (DTD) file, which is usually the first DTD file listed after the XML file you downloaded.
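The extraction step can also be done from Python. The following is a minimal sketch, assuming the snapshot is distributed as a gzip archive; the function name extract_snapshot and the file names in the usage note are illustrative and are not part of this repository. It copies the data in chunks so the multi-gigabyte file never has to fit in memory.

```python
import gzip
import shutil
from pathlib import Path


def extract_snapshot(archive: Path, target: Path) -> Path:
    """Decompress a gzip archive to `target`, copying in 1 MiB chunks.

    Hypothetical helper: it only illustrates the "extract it into the
    scripts folder" step and is not taken from dblp_search.py.
    """
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst, length=1024 * 1024)
    return target
```

For example, extract_snapshot(Path("dblp.xml.gz"), Path("dblp.xml")) would write the XML file next to the script, assuming the archive was saved under that name.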

Next, set the parameters (snapshot file, year range, etc.) at the beginning of the dblp_search.py file. We plan to move these settings to a properties file soon.

Once everything is set, run the dblp_search.py script to select the papers. This takes a while given the file size.

$ python3 dblp_search.py > papers.csv
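For readers who want a feel for what such a crawl involves, here is a minimal sketch of a streaming filter over DBLP-style XML. It is not the code of dblp_search.py: the settings YEAR_RANGE and KEYWORDS, the record tags, the output columns, and the sample records are all placeholders chosen for illustration. The sketch reads records one at a time, keeps those whose year falls in the range and whose title contains a keyword, and writes them as CSV.

```python
import csv
import io
import sys
import xml.etree.ElementTree as ET

# Placeholder settings (assumptions), standing in for the ones set at the
# beginning of dblp_search.py. They are not the values used in the study.
YEAR_RANGE = (2006, 2024)
KEYWORDS = ("architecture",)
RECORD_TAGS = {"article", "inproceedings"}


def select_papers(source, year_range=YEAR_RANGE, keywords=KEYWORDS):
    """Yield (key, year, title) for records matching year range and keywords."""
    first, last = year_range
    # iterparse streams the document, so the whole file is never parsed at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag not in RECORD_TAGS:
            continue
        title_elem = elem.find("title")
        title = "".join(title_elem.itertext()) if title_elem is not None else ""
        year_text = elem.findtext("year", default="")
        if year_text.isdigit() and first <= int(year_text) <= last:
            if any(word in title.lower() for word in keywords):
                yield elem.get("key", ""), int(year_text), title
        elem.clear()  # drop the record's children once it has been inspected


def write_csv(rows, out):
    """Write the selected rows to `out` with a header line."""
    writer = csv.writer(out)
    writer.writerow(["key", "year", "title"])
    writer.writerows(rows)


# Tiny made-up sample in the shape of DBLP records (not real entries).
SAMPLE = b"""<dblp>
<article key="journals/x/A1"><title>Mining architecture decisions</title><year>2019</year></article>
<inproceedings key="conf/y/B2"><title>Testing compilers</title><year>2020</year></inproceedings>
<article key="journals/x/C3"><title>Software Architecture recovery</title><year>1999</year></article>
</dblp>"""

if __name__ == "__main__":
    write_csv(select_papers(io.BytesIO(SAMPLE)), sys.stdout)
```

Note that the real DBLP dump uses character entities declared in its DTD, which the standard-library parser does not resolve by default. This is why the DTD file has to sit next to the XML file. A DTD-aware parser, such as lxml's iterparse with load_dtd=True, is one common way to handle this.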

The snowballing was conducted manually, with the assistance of the Zotero tool.


This repository is licensed under the MIT License.