The Software Evolution Repository (SER) public area collects and makes readily available artifacts relevant to research and education in software evolution. SER is associated with and maintained by the SOCCER Laboratory. As collecting software evolution artifacts is very costly, the goal of SER is to share them and thus minimize the cost of running experiments. The public area contains hundreds of artifacts from many C, C++, and Java software systems, such as Mozilla, Eclipse, and JBoss, together with excerpts from bug tracking systems.
This area contains data from open source programs such as Linux, Mozilla, or Eclipse. Depending on the data set, various artifacts are made available, including releases, bug reports, snapshots, reverse-engineered class diagrams, and metrics.
Reverse-engineered class diagrams are usually in one of two formats, plain graph or AOL; see the papers:
Giuliano Antoniol, Gerardo Casazza, Massimiliano Di Penta, and Roberto Fiutem. Object-Oriented Design Patterns Recovery. Journal of Systems and Software, 59:181-196, November 2001
Segla Kpodjedo, Filippo Ricca, Philippe Galinier, Giuliano Antoniol, and Yann-Gael Gueheneuc. Studying Software Evolution of Large Object-oriented Systems using an ETGM algorithm. Journal of Software Maintenance - Research and Practice, September 2010
Eclipse releases, bi-weekly snapshots, and resolved and fixed bug reports (2006-2009), both parsed and raw
Stable and unstable from 1.0 to 2.4.X
Mozilla releases, bi-weekly snapshots, and resolved and fixed bug reports (parsed and raw); reverse-engineered class diagrams are also available.
This technical report details techniques for generating test input data that trigger divide-by-zero exceptions in Java code
violation complete data analysis
This technical report presents the complete analysis (figures, data, tables, etc.) for the
server-before-client violations. The analysis was performed on three Java systems, Xerces, Ant, and ArgoUML, with two releases each,
for six releases overall.
Data from papers and technical reports
Null pointer exceptions (NPE) are a common problem; these data are from the paper: Daniele Romano, Massimiliano Di Penta, and Giuliano Antoniol. An Approach for Search Based Testing of Null Pointer Exceptions. In 4th IEEE International Conference on Software Testing, Verification and Validation (ICST 2011), Berlin, Germany, pages 160-169, June 22-24, 2011
TIDIER is an advanced identifier splitting technique; the dataset contains identifiers, dictionaries, and a manually defined oracle; see also: Latifa Guerrouj, Massimiliano Di Penta, Giuliano Antoniol, and Yann-Gael Gueheneuc. TIDIER: An Identifier Splitting Approach using Speech Recognition Techniques. Journal of Software Maintenance and Evolution: Research and Practice (JSME), 2011
Mutation testing helps produce high-quality test input data; the mutants of the sample program used for the GECCO 2007 ant colony paper are made available here; see:
Kamel Ayari, Salah Bouktif, and Giuliano Antoniol. Automatic Mutation Test Input Data Generation via Ant Colony. In GECCO 2007, SBSE track, London, UK, pages 1074-1081
Non-protected exception raising data; see: Neelesh Bhattacharya, Abdelilah Sakti, Giuliano Antoniol, Yann-Gael Gueheneuc, and Gilles Pesant. Divide-by-zero Exceptions Raising via Branch Coverage. In Proceedings of the 3rd International Symposium on Search-based Software Engineering (SSBSE), September 2011
MC/DC is a test coverage criterion imposed by standards such as DO-178B; these data are from the paper: Zeina Awedikian, Kamel Ayari, and Giuliano Antoniol. MC/DC automatic test input data generation. In GECCO, Montreal, Quebec, Canada, July 8-12, pages 1657-1664, July 2009
Renaming identifiers, methods, or classes is a common practice in software evolution; this dataset is from the paper: Laleh Eshkevari, Venera Arnaoudova, Massimiliano Di Penta, Rocco Oliveto, Yann-Gael Gueheneuc, and Giuliano Antoniol. An Exploratory Study of Identifier Renamings. In Proceedings of the Working Conference on Mining Software Repositories (MSR), Honolulu, Hawaii, pages 33-42, May 21-22, 2011
This dataset contains several releases (including the reverse-engineered class diagrams) of the Mozilla ECMAScript engine (Rhino); manually tagged bug data are also available via M. Eaddy's website.
This replication package contains all the information concerning the study of the impact of context on
identifier splitting and expansion. All the artifacts used in the empirical study are available: source code, oracle, and questionnaires.
This replication package contains all the information concerning the study of identifier splitting and expansion.
The compared tools include our Samurai re-implementation, TIDIER, and TRIS, our fastest and most accurate
identifier expansion tool. The package consists of source code, dictionaries, an oracle, and frequency tables.
This replication package contains all the information concerning the study of the context impact of Anti Patterns (AP) on
All the artifacts used in the empirical study are available: source code, AP, and refactoring examples.
Renaming identifiers, methods, or classes is a common practice in software evolution; this dataset supersedes the older renaming dataset.
The dataset contains the evolution history of five open source Java projects used to evaluate the accuracy and completeness of REPENT. On the
available data, REPENT exhibits a precision of about 88% and a recall of 79%. The tarball contains all the artifacts useful for a replication study.
- WordPress Analysis Replication Data
This replication package contains all the information concerning the study of WordPress interferences.
All the detected interferences, as well as static and dynamic data, are available. The source code of WordPress 3.6 and 3.7 and of the plugins
is not included in the package, as it can be downloaded from the net.
- Code Reviews Analysis Replication Data
This replication package contains all the information concerning the study of code review.
It consists of data extracted by mining the Gerrit repositories of six open source Java projects.
- Android energy data - Replication Data
This replication package contains all the information concerning the study of Android energy data.
It consists of data extracted by measuring energy on the BeagleBone board, together with the app code.
- Android ADAGO - Replication Data
This replication package contains all the information concerning the study of the Android ADAGO recommendation system.
It comprises eight categories of applications, 144 applications, and execution information. For more details, see the
ADAGO technical report.
Tons of Java applications: It contains almost 50,000 open source projects (mainly in Java) comprising around 4 million source files and another 4 million binary artefacts.
Testing related resources
Software-artifact Infrastructure Repository (SIR) for experimentation.
PROMISE org and repository
Several data sets for defect prediction, effort estimation, text mining and so on.