Professor Martin Shepperd


I am Professor of Software Technology and Modelling in the Department of Computer Science at Brunel University London. I am also a Fellow of the British Computer Society.

My research publications can be viewed (and many downloaded) from Google Scholar or ResearchGate.

I am a member of the Brunel Software Engineering Laboratory (BSEL). My research interests include empirical software engineering and machine learning. In 2014, together with Tracy Hall and David Bowes, I published a paper entitled "Researcher bias: The use of machine learning in software defect prediction". It describes a meta-analysis of computational experiments in software engineering, which showed that research group is a far more important explanatory variable than the choice of algorithm under investigation. This triggered a response in 2016 from Tantithamthavorn, McIntosh, Hassan, and Matsumoto, "Comments on 'Researcher bias: The use of machine learning in software defect prediction'", in which they suggest the effect is weaker than we propose because of problems of collinearity. We have responded (in a recently accepted rejoinder for TSE), indicating that, whilst we appreciate their interest and the scientific discourse, we disagree with their arguments because (i) they use a small subset of our data and (ii) collinearity for categorical variables with a sparse matrix will almost invariably mean that some levels are a linear combination of others. In such cases the corresponding coefficients are not estimated, but the remaining estimates are not affected, unlike in the case of collinearity for continuous predictors.

Recently I have been working with my former PhD student Dr Boyce Sigweni on realistic ways to validate software project prediction systems. Our initial ideas were published at EASE 2016, and we were very pleased to be awarded a prize for the best short paper. Our point is that cross-validation techniques that fail to account for time are biased, in that they consistently underestimate the true variance and measures of location for the model errors. Our approach is called Grow-One-At-a-Time (GOAT)!
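To make the time-ordering idea concrete, here is a minimal sketch of one time-respecting validation scheme: each project is predicted using only the projects that finished before it, so the training set grows by one project at each step. This is an illustration in the spirit of the idea above, not the GOAT procedure as published. The function names, the "end" and "effort" fields, the toy data and the naive mean-effort predictor are all assumptions made for the example.

```python
from statistics import mean


def grow_one_at_a_time(projects, predict, min_train=1):
    """Predict each project from earlier-finished projects only.

    `projects` is a list of dicts with hypothetical keys "end"
    (completion time) and "effort" (actual effort). Returns the
    absolute errors in chronological order.
    """
    # Order by completion time so the training set never contains
    # information from the future. Ties are not treated specially here.
    ordered = sorted(projects, key=lambda p: p["end"])
    errors = []
    for i in range(min_train, len(ordered)):
        train = ordered[:i]      # everything completed so far
        target = ordered[i]      # the next project to be predicted
        estimate = predict(train, target)
        errors.append(abs(target["effort"] - estimate))
    return errors


def mean_effort(train, target):
    """Naive placeholder predictor: mean effort of past projects."""
    return mean(p["effort"] for p in train)


# Toy data, deliberately listed out of chronological order.
projects = [
    {"end": 3, "effort": 30.0},
    {"end": 1, "effort": 10.0},
    {"end": 2, "effort": 20.0},
    {"end": 4, "effort": 40.0},
]

errors = grow_one_at_a_time(projects, mean_effort)
print(errors)
```

A conventional k-fold split of the same data would let later projects inform predictions for earlier ones, which is the source of the bias described above.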

Our work on replication and blind analysis with Dr Davide Fucci (Oulu) as the lead author recently received the Best Full Paper Award at ESEM 2016.

D. Fucci, G. Scanniello, S. Romano, M. Shepperd, B. Sigweni, F. Uyaguari, et al., "An External Replication on the Effects of Test-driven Development Using a Multi-site Blind Analysis Approach," 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Ciudad Real, Spain, 2016.

Recently I've completed an analysis of imbalanced learners for dealing with highly imbalanced software defect data sets. This work was with Prof Qinbao Song and Yuchen Guo (both from Xi'an Jiaotong University). The paper "A Comprehensive Investigation of the Role of Imbalanced Learning for Software Defect Prediction" is under review; however, a draft is available. Comments welcome.


Click here for details of 2017-18 FYPs.

Research Grants

My recent EPSRC project Comparing Software Defect Classifiers Correctly explores the impact of using statistically meaningful measures such as the φ correlation coefficient, otherwise known as the Matthews correlation coefficient. The impact of using such statistics will be evaluated on previously published studies and recommendations made available to the research community.
I was principal investigator for the EPSRC funded project MeLLow. Click here for the workshop webpages.
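For readers unfamiliar with the φ (Matthews) correlation coefficient mentioned above, the following is a minimal sketch of how it is computed from a binary confusion matrix. The function name and the example counts are made up for illustration. Returning 0.0 when the denominator is zero is a common convention, assumed here rather than taken from the project.

```python
from math import sqrt


def matthews_corrcoef(tp, fp, fn, tn):
    """Phi / Matthews correlation coefficient for a 2x2 confusion matrix."""
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Undefined when a whole row or column is empty; 0.0 is a
        # common convention (assumed here).
        return 0.0
    return numerator / denominator


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical imbalanced data set: 10 defective, 90 non-defective modules.
# Classifier A finds half of the defects, with five false alarms.
a = dict(tp=5, fp=5, fn=5, tn=85)
# Classifier B predicts "non-defective" for every module.
b = dict(tp=0, fp=0, fn=10, tn=90)

print(accuracy(**a), matthews_corrcoef(**a))
print(accuracy(**b), matthews_corrcoef(**b))
```

Both hypothetical classifiers have the same accuracy, yet only the first has a positive φ, which illustrates why the choice of measure matters for imbalanced defect data.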

Upcoming Conferences

I am serving on the programme committee for:

Refereeing and Editorial Work

I'm a reviewer for a number of journals, some of which are listed by Publons. I'm also an Associate Editor of the Springer journal Empirical Software Engineering and a Board Member for the Elsevier journal Information & Software Technology.

Problematic Science

For examples of problematic empirical research we can study the careers of Dr Johnny Researcher and his colleague Prof Suzie Important-Person, including two of their recent publications. The first [1] shows some nice data trawling and statistical mumbo jumbo, whilst [2] is a classic work of flawed and under-powered experimental design.

[1] Researcher, J. and Important-Person, S., "An Empirical Study of How X47 is a Strong Determinant of Y", IEEE Transactions on Sophistry, Vol. 11, No. 4, December 2015.
[2] Researcher, J. and Important-Person, S., "Experimental Investigations of Programming Productivity Factors", IEEE Transactions on Nescient Research, Vol. 11, No. 4, April 2016.

Prof. Martin Shepperd
Dept. of Computer Science
Room SJ023b
St John's Building
Brunel University London
Uxbridge, UB8 3PH
United Kingdom

Email: [first name] DOT [surname]
Twitter: @ProfMShepperd

Last updated: 20.7.2017