Martin Shepperd is Professor of Software Technology and
Modelling at Brunel University
London in the Department
of Computer Science. He is also a Fellow of the British
Computer Society.
His research publications can be viewed (and many downloaded) from Google Scholar or ResearchGate.
He is a member of the Brunel
Software Engineering Laboratory (BSEL). His research
interests include empirical software engineering and machine
learning.
M. Shepperd, C. Mair, and M. Jørgensen, “An Experimental Evaluation of a
De-biasing Intervention for Professional Software Developers,”
in the 33rd ACM Symposium on Applied Computing (SAC'18),
Pau, France, 2018.
This paper describes a series of experiments with professional
developers examining whether training can reduce the
impact of the anchoring bias on productivity estimates.
In 2014, in conjunction with Tracy Hall and David Bowes, we
published a paper entitled "Researcher bias: The use of machine
learning in software defect prediction", which describes the results
of a meta-analysis of computational experiments in software
engineering showing that research group is a far more important
explanatory variable than the choice of algorithm under
investigation. This triggered a response from Tantithamthavorn,
McIntosh, Hassan, and Matsumoto, "Comments on 'Researcher bias: The
use of machine learning in software defect prediction'" (2016), in
which they suggest the effect is weaker than we propose due to
problems of collinearity. We have responded (in a recently accepted
rejoinder for TSE) indicating that, whilst we appreciate their
interest and the scientific discourse, we disagree with their
arguments because (i) they use a small subset of our data and (ii)
with categorical variables and a sparse design matrix, some levels
will almost invariably be a linear combination of others. In such
cases the corresponding coefficients are simply not estimated, but
the remaining estimates are unaffected, unlike the case of
collinearity between continuous predictors.
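As a toy illustration of this collinearity point (my own sketch, not data or code from either paper), the following uses numpy to show that when a dummy-coded level is an exact copy of another column, the design matrix loses rank, yet dropping the aliased column leaves the estimates for the remaining predictors unchanged:

    # Illustrative sketch: an aliased dummy column is not estimable, but the
    # other coefficient estimates are unaffected by dropping it.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 12
    x_cont = rng.normal(size=n)                                   # a continuous covariate
    group_a = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=float)
    classifier_x = group_a.copy()                                 # group A only ever uses classifier X,
                                                                  # so the two dummies coincide (aliased)
    y = 2.0 + 1.5 * x_cont + 3.0 * group_a + rng.normal(scale=0.1, size=n)

    X_full = np.column_stack([np.ones(n), x_cont, group_a, classifier_x])  # rank deficient
    X_drop = np.column_stack([np.ones(n), x_cont, group_a])                # aliased column removed

    print(np.linalg.matrix_rank(X_full), "independent columns out of", X_full.shape[1])

    beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)        # minimum-norm solution
    beta_drop, *_ = np.linalg.lstsq(X_drop, y, rcond=None)

    # Intercept and continuous-covariate estimates are identical; only the
    # aliased pair is affected (their coefficients sum to the reduced estimate).
    print(beta_full[:2], beta_drop[:2])
    print(beta_full[2] + beta_full[3], beta_drop[2])

With continuous predictors, by contrast, near-collinearity inflates the variance of all the estimates in the collinear set rather than simply aliasing one of them.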
Recently I have been working with my former PhD student Dr Boyce Sigweni on realistic ways to validate software project prediction systems. Our initial ideas were published at EASE 2016 and we were very pleased to be awarded the prize for best short paper. Our point is that cross-validation techniques that fail to account for time are biased: they consistently underestimate the true variance and the measures of location of the model errors. Our approach is called Grow-One-At-a-Time (GOAT)!
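To give a flavour of the idea, here is a minimal, hypothetical sketch of a grow-one-at-a-time style validation loop; the data, estimator and error measure are placeholders rather than those from the paper, and it assumes the projects are already sorted by completion date:

    # Toy sketch of a time-respecting, grow-one-at-a-time validation loop.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 3))             # project features, in completion-date order
    y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=30)

    errors = []
    for i in range(5, len(y)):               # need a few completed projects before predicting
        model = LinearRegression().fit(X[:i], y[:i])   # train only on earlier projects
        pred = model.predict(X[i:i + 1])[0]            # predict the next project in time
        errors.append(abs(pred - y[i]))

    print("median absolute error:", np.median(errors))

Unlike conventional leave-one-out or k-fold cross-validation, no project later than the one being predicted ever appears in the training set.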
Our work on replication and blind analysis with Dr Davide Fucci (Oulu) as the lead author recently received the Best Full Paper Award at ESEM 2016.
D. Fucci, G. Scanniello, S. Romano, M. Shepperd, B. Sigweni, F. Uyaguari, et al., "An External Replication on the Effects of Test-driven Development Using a Multi-site Blind Analysis Approach," 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Ciudad Real, Spain, 2016.
Recently I have completed an analysis of imbalanced learners for dealing with highly imbalanced software defect data sets. This work was with Prof Qinbao Song and Yuchen Guo (Xi'an Jiaotong University). The paper "A Comprehensive Investigation of the Role of Imbalanced Learning for Software Defect Prediction" is under review; a draft is available and comments are welcome.
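By way of a small, hypothetical illustration (not the study's actual pipeline), one simple imbalanced-learning treatment is to re-weight the minority class, with the Matthews correlation coefficient (MCC) as a performance measure that copes better with heavy imbalance than accuracy:

    # Illustrative sketch: plain vs class-weighted classifier on imbalanced data.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import matthews_corrcoef
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=20,
                               weights=[0.95, 0.05], random_state=0)   # ~5% "defective" modules
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    plain = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    weighted = RandomForestClassifier(class_weight="balanced", random_state=0).fit(X_tr, y_tr)

    print("plain    MCC:", matthews_corrcoef(y_te, plain.predict(X_te)))
    print("weighted MCC:", matthews_corrcoef(y_te, weighted.predict(X_te)))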
Click here for details of 2018-19 final year projects (FYPs).
For examples of problematic empirical research we can study the careers of Dr Johnny Researcher and his colleague Prof Suzie Important-Person, including two of their recent publications. The first [1] shows some nice data trawling and statistical mumbo jumbo, whilst [2] is a classic work of flawed and under-powered experimental design.
[1] J. Researcher and S. Important-Person, "An Empirical Study of How X47 is a Strong Determinant of Y," IEEE Transactions on Sophistry, vol. 11, no. 4, December 2015.
Email: [first name] DOT [surname] @brunel.ac.uk
Twitter: @ProfMShepperd
Blog: empiricalsoftwareengineering.wordpress.com