Relativity in Software Engineering Measurements
Posted on August 21, 2008 by Sneha Latha
The term “relativity” may seem out of place in software engineering measurement, since it is closely associated with the famous 20th-century physicist Albert Einstein and his landmark theory of relativity. The basic principle behind the theory is that physical measurements are not absolute but relative. A similar phenomenon has been observed in the measurement of various attributes of software products, processes and projects. The number of lines of code (LOC), owing to the absence of a standard definition, differs from language to language. Similarly, Halstead’s software science metrics vary depending on the implementation language used.
Introduction
You cannot control what you cannot measure. Measurement is a necessity for evaluating and hence controlling various aspects of software and software related activities, like quality of the resulting software product under construction or the productivity of the project or improvement of the software processes.
All streams of engineering and physical sciences make use of measurements to quantify various characteristics of their products. Similarly a need for measures to appraise various aspects of software product, processes and projects has been realized to make it an exact science. Since software has no physical attributes, conventional measures are of no use and accordingly a number of measures called metrics have been defined to quantify things like the size, complexity, reliability of a software product and other aspects pertaining to the activities in software development.
Some of these characteristics can be measured directly while those that cannot be measured directly have indirect measures. In case of a software product, metrics simply provide the scale for quantifying qualities. Actual measurement must be performed on a given software system in order to use metrics for quantifying characteristics of the given software.
Classification of software measurement
In software measurement activity there are three classes of entities of interest.
1. Processes: Any software related activity that takes place over time;
2. Products: Any artifact, deliverable or document that arises out of a process; and
3. Resources: Items that are inputs to processes.
There is a distinction between internal and external attributes of these entities [6].
· Internal attributes of a product, process or resource are those that can be measured purely in terms of the product, process or resource itself. For example, length is an internal attribute of any software document, while elapsed time is an internal attribute of any software process; and
· External attributes of a product, process or resource are those that can be measured only with respect to how the product, process or resource relates to other entities in its environment. For example, the reliability of a program (a product) depends not just on the program itself but also on the compiler, machine and user. Productivity is an external attribute of a resource, namely people (either as individuals or groups). It clearly depends on many aspects of the processes and on the quality of the products delivered.
Software managers and software users would like to measure and predict external attributes. Unfortunately, these can only be measured indirectly. For example, the productivity of personnel is most commonly measured as a ratio of size of code delivered (an internal product attribute) to effort (an internal process attribute). The problems with this over-simplistic measure of productivity have been well documented. Similarly, the “quality” of a software system (a very high-level external product attribute) is often defined as a ratio of faults found to size measured in KLOC [2]. While reasonable for developers, this measure of quality cannot be said to be valid from the point of view of the user [6].
What is measurement?
Measurement is defined as the process by which numbers or symbols are assigned to attributes of entities in the real world in such a way as to describe them according to clearly defined rules [3], [4]. An entity may be an object, such as a person or a software specification, or an event, such as a journey or the testing phase of a software project. An attribute is a feature or property of the entity, such as the height or blood pressure (of a person), length or functionality (of a specification), cost (of a journey) or duration (of the testing phase).
For the measured values to be unambiguous, a model must be defined for the entities being measured. The model reflects a specific viewpoint. The need for good models is particularly relevant in software engineering measurement. For example, even as simple a measure of program length as lines of code (LOC) requires a well-defined model of programs that enables us to identify unique lines unambiguously. Similarly, for a measure of the effort spent on, say, the unit testing process, we would need an agreed “model” of the process that at least makes clear when the process begins and ends.
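To illustrate why LOC needs an explicit model, here is a minimal Python sketch that counts the same snippet under three counting rules. The function name `count_loc`, the rule names and the sample snippet are all assumptions made for illustration; none of them is a standard definition of LOC.

```python
def count_loc(source: str, rule: str) -> int:
    """Count lines of code under one of three hypothetical counting rules."""
    lines = source.splitlines()
    if rule == "physical":
        # Every line in the file counts, including blanks and comments.
        return len(lines)
    if rule == "non_blank":
        # Skip lines that contain only whitespace.
        return sum(1 for line in lines if line.strip())
    if rule == "non_comment":
        # Skip blank lines and lines that hold only a '#' comment.
        return sum(
            1 for line in lines
            if line.strip() and not line.strip().startswith("#")
        )
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical program text used only to exercise the three rules.
SAMPLE = """# compute a total
total = 0

for x in [1, 2, 3]:
    # accumulate
    total += x
print(total)
"""

counts = {rule: count_loc(SAMPLE, rule)
          for rule in ("physical", "non_blank", "non_comment")}
print(counts)  # {'physical': 7, 'non_blank': 6, 'non_comment': 4}
```

The same seven-line text yields 7, 6 or 4 "lines of code" depending on the rule, which is why a measured LOC value means little unless the counting model is stated alongside it.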
There are two broad types of measurement, direct and indirect. Direct measurement of an attribute is measurement that does not depend on the measurement of any other attribute. Indirect measurement of an attribute is measurement that involves the measurement of one or more other attributes. It turns out that while some attributes can be measured directly, we get more sophisticated measurement if we measure indirectly [6].
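The distinction can be sketched in a few lines of Python. In this sketch, size, effort and defect count are treated as directly measured values, and productivity and defect density are derived from them. The class name `ProjectData`, the field names and all the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProjectData:
    """Directly measured attributes (all values used below are hypothetical)."""
    size_kloc: float             # size of delivered code, in thousands of lines
    effort_person_months: float  # effort spent on the project
    defects_found: int           # defects recorded against the code


def productivity(p: ProjectData) -> float:
    """Indirect measure: size delivered per unit of effort."""
    return p.size_kloc / p.effort_person_months


def defect_density(p: ProjectData) -> float:
    """Indirect measure: defects found per KLOC."""
    return p.defects_found / p.size_kloc


project = ProjectData(size_kloc=12.0, effort_person_months=8.0, defects_found=30)
print(productivity(project))    # 1.5 (KLOC per person-month)
print(defect_density(project))  # 2.5 (defects per KLOC)
```

Because both derived values divide by or multiply a size figure, any ambiguity in how size is counted carries straight through to the indirect measures.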
Uses of measurements: Assessment and prediction
There are two broad uses of measurements – for assessment and for prediction. Predictive measurement of an attribute A will generally depend on a mathematical model relating A to some existing measures of attributes like A1, A2…An.
Accurate predictive measurement is inevitably dependent on careful measurement of attributes A1, A2…An. For example, an accurate estimate of the project resources cannot be obtained by simply applying a cost estimation model with fixed parameters. However, careful measurement of key attributes of completed projects could lead to accurate resource predictions for future projects. Similarly, it is possible to get accurate predictions of the reliability of software in operation but these again depend on careful data collection relating to failure times during alpha-testing [5].
For predictive measurement, the model alone is not sufficient. Additionally, we need to define the procedures for determining model parameters and interpreting the results.
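As a rough sketch of what "determining model parameters" can look like, the Python below calibrates a simple power-law cost model, effort = a * size^b, from data on completed projects and then uses it for a prediction. The model form, the function names `fit_power_law` and `predict_effort`, and the historical figures are assumptions made for illustration; they are not taken from any of the cited works.

```python
import math


def fit_power_law(sizes, efforts):
    """Least-squares fit of log(effort) = log(a) + b * log(size).

    Returns (a, b) for the assumed model effort = a * size ** b.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_effort(a, b, size):
    """Apply the calibrated model to a new project size."""
    return a * size ** b


# Hypothetical completed projects: (size in KLOC, effort in person-months).
history = [(5, 14), (12, 38), (20, 70), (33, 120), (50, 200)]
a, b = fit_power_law([s for s, _ in history], [e for _, e in history])
print(f"calibrated model: effort = {a:.2f} * size^{b:.2f}")
print(f"predicted effort for 25 KLOC: {predict_effort(a, b, 25):.1f} person-months")
```

The point of the sketch is that the parameters come from an organization's own carefully measured history rather than from fixed, published constants, and that the prediction is only as good as those underlying measurements.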
Measurement activities must have clear objectives
The basic definition of measurement suggests that any measurement activity must proceed with very clear objectives or goals. First, you need to know whether you want to measure for assessment or for prediction. Next, you need to know exactly which entities are the subject of interest and then you need to decide which attributes of the chosen entities are the really significant ones. The definition of measurement makes clear the need to specify both an entity and an attribute before any measurement can be undertaken [6].
Relativity in software measurements
Software measurement, like measurement in any other discipline, must adhere to the science of measurement if it is to gain widespread acceptance and validity. Software measures must also be standardized so that they leave no room for confusion or ambiguity of the kind seen with LOC: depending on how you count the lines of source code, you are likely to get a different value. Similar concerns were raised by Churcher and Shepperd [9] and by Hitz and Montazeri [10] about Chidamber and Kemerer’s metrics suite [11] for object-oriented design. Likewise, syntax-sensitive metrics such as Halstead’s software science metrics [7] give different values for the same algorithm, depending on the implementation language used. Halstead shows that the length of a program can be approximated by the expression:
N = n1 log2 n1 + n2 log2 n2
and program volume V may be defined as:
V = N log2 n, where
n1 is the number of distinct operators, n2 is the number of distinct operands and n = n1 + n2.
It should be noted that V will vary with the programming language and represents the volume of information (in bits) required to specify a program [8].
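A small Python sketch makes the language dependence concrete. It evaluates the two expressions above for two sets of operator and operand counts, taking n = n1 + n2 as the vocabulary and using the estimated length as N. The function name `halstead` and the counts are hypothetical; they stand in for the same algorithm written in two different languages.

```python
import math


def halstead(n1: int, n2: int) -> dict:
    """Estimated length and volume from distinct operator/operand counts.

    n1: number of distinct operators, n2: number of distinct operands.
    """
    n = n1 + n2                                           # vocabulary
    est_length = n1 * math.log2(n1) + n2 * math.log2(n2)  # N
    volume = est_length * math.log2(n)                    # V = N log2 n
    return {"n": n, "N": est_length, "V": volume}


# Hypothetical counts for one algorithm in two languages.
lang_a = halstead(n1=8, n2=8)
lang_b = halstead(n1=4, n2=4)
print(lang_a)  # {'n': 16, 'N': 48.0, 'V': 192.0}
print(lang_b)  # {'n': 8, 'N': 16.0, 'V': 48.0}
```

With these assumed counts, halving the number of distinct operators and operands cuts the volume by a factor of four, even though the program specifies the same behavior.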
The phenomenon of relativity in software measurement occurs because of a number of reasons. Metrics could play a significant role in enhancing the quality of software products, by improving the productivity and accordingly tuning the software processes. At the same time, unless care is taken, they can prove to be harmful too.
Reasons other than language/platform differences causing relativity in software measurements are misinterpretation of data, improper/incomplete definition of metrics, lack of communication and training of individuals employed for collecting metrics data and use of commercial tools to collect metrics. In the following sections, we show some of the situations where this can have a negative impact on the metrics program.
Misinterpretation of metrics data
Misinterpretation of data can lead one to draw wrong conclusions. For example, if a programmer’s defect density increases despite quality improvement effort, he/she might wrongly conclude that improvements are doing more harm than good and may revert to old ways of working.
Improper/incomplete metrics definition
Vague or ambiguous metric definitions cause different practitioners to interpret them differently. For example, time spent on fixing a bug found during testing may be classified as test effort by one person, as coding effort by another and as rework by a third. The same concerns have been raised by Churcher and Shepperd [9] about the CK metrics suite [11].
Lack of communication and training
Lack of communication and training may lead to problems if the participants in the metrics program do not understand what exactly they are supposed to do. If people responsible for collecting metrics data do not understand the measurements and haven’t been trained to perform their tasks, the data they collect will not be reliable.
Tools being used to collect metrics data
Most organizations use automated tools supplied by different vendors for collecting and interpreting data. Because each tool may implement a given metric in its own way, metric values are likely to differ from one vendor’s tool to another’s.
From the above observations, it should be clear that while a sensible metrics program can help manage software projects and organizations effectively, care must be taken to counter the negative impact of the relativity phenomenon in software measurements.
Conclusions
Introducing a software metrics program in a software development organization is a good step toward raising the quality of software products, increasing productivity and improving software processes. There are hundreds of aspects of software products, projects and processes that can be measured to yield valuable information. However, care has to be taken in deciding what should and should not be measured. Because of the inherent phenomenon of relativity in software measurement, inferences are likely to differ from language to language, person to person and project to project.
The phenomenon of relativity in software measurements can be harmful if proper care is not taken while implementing a metrics program. One should remember that an inverse relationship exists between quality and quantity unless care is taken. If collected metrics, such as LOC, are used to measure the productivity of a programmer and decisions on reward or punishment are taken accordingly, the result could be fatal: a person producing fewer LOC will be pushed to change his/her behavior, thereby reducing the quality of the resulting code.
Therefore, while initiating a software metrics plan, utmost attention has to be paid to evaluating the psychological impact it may have on the employees of the organization. In software measurement, it is essential that all measurements have a sufficient scientific basis and are free from the effects of relativity and ambiguity. For this, we need a framework of standard measures to make a metrics program in an organization successful.
References:
1. Fenton N. E., “Software Metrics: A Rigorous Approach”, London: Chapman & Hall, 1991.
2. Inglis J., “Standard Software Quality Metrics”, AT&T Tech. Journal, Vol. 65, No. 2, pp 113-118, 1985.
3. L. Finkelstein, “A Review of the Fundamental Concepts of Measurement”, Measurement, Vol. 2 No. pp 25-34, 1984.
4. F. S. Roberts, “Measurement Theory with Applications to Decision Making, Utility and Social Science”, Reading, MA: Addison-Wesley, 1979.
5. S. Brocklehurst, P. Y. Chan, B. Littlewood et al., “Recalibrating Software Reliability Models”, IEEE Transactions on Software Engineering, Vol. 16, No. 4, pp 458-470, Apr. 1990.
6. Fenton N., “Software Measurement: A Necessary Scientific Basis”, IEEE Transactions on Software Engineering, Vol. No. 3 March 1994.
7. Halstead M., “Elements of Software Science”, North Holland, 1977.
8. Pressman Roger S., “Software Engineering: A Practitioner’s Approach”, 5th edition, McGraw-Hill, 2001.
9. Churcher N. I. and M. J. Shepperd, “Comments on ‘A Metric Suite for Object-Oriented Design’”, IEEE Transactions on Software Engineering, Vol. 21, No. 3, pp 263-265, Mar. 1995.
10. Martin Hitz and Behzad Montazeri, “Chidamber and Kemerer’s Metrics Suite: A Measurement Theory Perspective”, IEEE Transactions on Software Engineering, Vol. 22, No. 4, April 1996.
11. S. R. Chidamber and C. F. Kemerer, “A Metrics Suite for Object-Oriented Design”, IEEE Transactions on Software Engineering, Vol. 20, No. 6, pp 476-493, June 1994.
R. K. Pandey works at the University Institute of Computer Science and Applications. He can be reached at: rkpandey18@rediffmail.com. Vinay Tiwari is available at Vinaytiwari999@yahoo.com.
