MSR Approaches for Software Architecture – a Systematic Mapping Study

This is the replication package for the systematic mapping study on MSR approaches for software architecture, titled "Mining Software Repositories for Software Architecture – a Systematic Mapping Study" and published in the Information and Software Technology journal in 2025.

The repository contains all the material required to replicate the study, including the complete list of papers and all data analysis phases.

The study has been designed, performed, and reported by the following researchers:

- Mohamed Soliman - Paderborn University, Paderborn, Germany
- Michel Albonico - Federal University of Technology, Paraná, IntelAgir - Brazil
- Ivano Malavolta - Vrije Universiteit Amsterdam, S2 Group - The Netherlands
- Andreas Wortmann - University of Stuttgart, ISW - Germany

For further information, interested researchers can email any of the investigators listed above.

Repository Structure

The directory is structured as follows:

scripts/dblp_search.py          Script for initial paper crawling.
data/starting_set.csv           Spreadsheet with the first studies.
data/snowballing.csv            Spreadsheet with the papers from snowballing.
data/thematic_analysis.csv      Spreadsheet with the thematic analysis.
data/system_names.csv           Spreadsheet with the list of systems.
data/graph_generation.xlsx      Spreadsheet used for graph generation.
Petersen - SMS checklist.pdf    Calculation of the quality score of this study according to Petersen et al.

The spreadsheet for graph generation is also available online.

Crawling Papers

If you are interested in replicating our paper crawling, the following details can help.

First, move to the scripts folder:

$ cd scripts/

Then download the latest DBLP snapshot and extract it into the scripts folder (the file is larger than 4 GB). Also download the compatible document type definition (DTD) file, which is usually the first one listed after the downloaded XML file.

Next, configure the settings (snapshot file, year range, etc.) at the beginning of the dblp_search.py file. We plan to move these settings to a properties file shortly.

Once everything is set, run the dblp_search.py script to select the papers (this takes a while given the file size).

$ python3 dblp_search.py > papers.csv
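
For readers who want a feel for this kind of selection before running the full crawl, here is a minimal sketch of streaming over DBLP-style XML and keeping records by year range and title keyword. It is not the code of dblp_search.py: the setting names (YEAR_RANGE, KEYWORDS, RECORD_TAGS), their values, the sample records, and the output columns are all assumptions made for illustration. The full snapshot also uses character entities declared in the DTD, which this standard-library sketch does not resolve.

```python
import csv
import io
import sys
import xml.etree.ElementTree as ET

# Hypothetical settings; the real ones are defined at the top of dblp_search.py.
YEAR_RANGE = (2006, 2024)
KEYWORDS = ("architecture",)  # lowercase substrings searched in the title
RECORD_TAGS = {"article", "inproceedings"}

# Tiny made-up sample in the shape of DBLP records (keys and titles are invented).
SAMPLE = b"""<dblp>
<article key="journals/x/A1"><title>Mining architecture decisions</title><year>2019</year></article>
<inproceedings key="conf/y/B2"><title>Fast sorting</title><year>2020</year></inproceedings>
<article key="journals/x/C3"><title>Architecture recovery</title><year>1999</year></article>
</dblp>"""


def select_papers(xml_source, year_range=YEAR_RANGE, keywords=KEYWORDS):
    """Yield (key, year, title) for records inside the year range whose title has a keyword."""
    first, last = year_range
    # iterparse streams the file, so a multi-gigabyte snapshot is never fully loaded.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag not in RECORD_TAGS:
            continue  # child elements (title, year, ...) are handled with their record
        title_elem = elem.find("title")
        title = "".join(title_elem.itertext()) if title_elem is not None else ""
        year_text = elem.findtext("year", default="")
        if year_text.isdigit() and first <= int(year_text) <= last:
            if any(keyword in title.lower() for keyword in keywords):
                yield elem.get("key", ""), int(year_text), title
        elem.clear()  # drop the processed record's children to limit memory use


def write_csv(rows, out):
    """Write the selected records as CSV with a header row."""
    writer = csv.writer(out)
    writer.writerow(["key", "year", "title"])
    writer.writerows(rows)


if __name__ == "__main__":
    write_csv(select_papers(io.BytesIO(SAMPLE)), sys.stdout)
```

Run as a script, the sketch prints the CSV to standard output, so it can be redirected to a file in the same way as the command above.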

Snowballing was conducted manually, with the support of the Zotero tool.

This repository is licensed under the MIT license.
