A common problem with empirical studies, especially those using multi-company data sets such as the ISBSG Repository, is that researchers often treat data simply as numbers. Little thought is given to what the data means, or to how the data should be prepared so that only comparable things are compared.
A related problem is that data preparation is often not documented carefully, so other researchers do not know exactly what the analysed data means. Using examples drawn from several studies that used the ISBSG Repository, I will highlight the need for: