Software Engineering Methodologies

Software Metrics

Michael L. Collard, Ph.D.

Department of Computer Science, The University of Akron

Definitions

Measure - a quantitative indication of the extent, amount, dimension, capacity, or size of some attribute of a product or process

Metric - a quantitative measure of the degree to which a system, component, or process possesses a given attribute. "A handle or guess about a given attribute."

  • Measure, e.g., Number of errors
  • Metric, e.g., Number of errors found per person-hour expended

Why Measure Software?

  • Determine the quality of the current product or process
  • Predict qualities of a product/process
  • Improve the quality of a product/process

Example Metrics

  • Defect rates
  • Error rates
  • Measured by:
    • individual
    • module
    • stage of development
  • Errors should be categorized by origin, type, cost

Metric Classification

  • Products
    • Explicit results of software development activities
    • Deliverables, documentation, by-products
  • Processes
    • Activities related to the production of software
  • Resources
    • Inputs into the software development activities
    • Hardware, knowledge, people

Product vs. Process

  • Process Metrics
    • Provide insight into the process paradigm, software engineering tasks, work products, and milestones
    • Lead to long-term process improvement
  • Product Metrics
    • Assess the state of the project
    • Track potential risks
    • Uncover problem areas
    • Adjust workflow or tasks
    • Evaluate the team's ability to control quality

Types of Measures

  • Direct measures (internal attributes)
    • Cost, effort, LOC, speed, memory
  • Indirect measures (external attributes)
    • Functionality, quality, complexity, efficiency, reliability, maintainability

Size Oriented Metrics

  • Size of the software produced
  • Lines of Code (LOC), KLOC, MLOC
  • Effort measured in person-months
  • Errors/KLOC
  • Defects/KLOC
  • Cost/LOC
  • Documentation Pages/KLOC
  • LOC is programmer & language dependent
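
A minimal Python sketch (not part of the original slides) of how these size-oriented ratios are computed, using hypothetical project numbers:

    # Size-oriented metrics from hypothetical project data (all values assumed).
    loc = 12_500          # total lines of code
    errors = 134          # errors found during development
    defects = 29          # defects reported after release
    cost = 168_000.00     # total project cost (dollars)
    doc_pages = 365       # pages of documentation
    effort = 24           # person-months

    kloc = loc / 1000
    print(f"Errors/KLOC:      {errors / kloc:.2f}")
    print(f"Defects/KLOC:     {defects / kloc:.2f}")
    print(f"Cost/LOC:         {cost / loc:.2f}")
    print(f"Doc pages/KLOC:   {doc_pages / kloc:.2f}")
    print(f"LOC/person-month: {loc / effort:.1f}")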

LOC Issues

  • What about blank lines?
  • What about comments?
  • SLOC - Statement Lines of Code: count statements only, not blanks or comments
  • What about generated code?
  • What about different languages?
  • Do we count all the languages?
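
The answers to these questions are counting-rule policies. A minimal Python sketch of a line counter (assuming a C-like language where whole-line comments start with //; block comments, generated code, and multi-language projects are ignored for simplicity):

    # Minimal LOC counter sketch: each counting rule below is a policy choice.
    def count_loc(source: str) -> dict:
        physical = blank = comment = sloc = 0
        for line in source.splitlines():
            physical += 1
            stripped = line.strip()
            if not stripped:
                blank += 1            # policy: blank lines are not code
            elif stripped.startswith("//"):
                comment += 1          # policy: whole-line comments are not code
            else:
                sloc += 1             # everything else counts as a statement line
        return {"physical": physical, "blank": blank,
                "comment": comment, "sloc": sloc}

    example = "\n".join([
        "// compute absolute value",
        "int abs(int x) {",
        "",
        "    if (x < 0) return -x;   // negate",
        "    return x;",
        "}",
    ])
    print(count_loc(example))   # {'physical': 6, 'blank': 1, 'comment': 1, 'sloc': 4}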

LOC Metrics

  • Easy to use
  • Easy to compute
  • Can compute LOC of existing systems
  • Often lose cost and requirements traceability
  • Language & programmer dependent

Function Oriented Metrics

  • Function Point Analysis [Albrecht ’79, ’83]
  • International Function Point Users Group (IFPUG)
  • Indirect measure
  • Derived using empirical relationships based on countable (direct) measures of the software system (domain and requirements)

Computing Function Points

  • Number of user inputs:
    • Distinct input from the user
  • Number of user outputs:
    • Reports, screens, error messages, etc.
  • Number of user inquiries:
    • Online input that generates some result
  • Number of files:
    • Logical files (database)
  • Number of external interfaces:
    • Data files/connections used as an interface to other systems

Compute Function Points

  • FP = Total Count × [0.65 + 0.01 × ∑(Fi)]
  • Total Count is the sum of all the counts, each multiplied by its weighting factor
  • Each organization determines its weighting factors from empirical data
  • Fi (i = 1 to 14) are the complexity adjustment values, each rated from 0 (no influence) to 5 (essential)
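
A minimal Python sketch of this calculation (not from the slides), assuming the commonly cited "average" weighting factors and hypothetical counts and Fi ratings:

    # Sketch of FP = Total Count * [0.65 + 0.01 * sum(Fi)].
    # The weights are the commonly published "average" weighting factors (assumed here);
    # an organization would normally calibrate its own weights from empirical data.
    weights = {"inputs": 4, "outputs": 5, "inquiries": 4, "files": 10, "interfaces": 7}
    counts  = {"inputs": 24, "outputs": 16, "inquiries": 9, "files": 4, "interfaces": 2}   # hypothetical

    total_count = sum(counts[k] * weights[k] for k in counts)

    # Fourteen complexity adjustment values Fi, each rated 0 (no influence) to 5 (essential);
    # the ratings below are hypothetical.
    fi = [3, 2, 0, 4, 3, 3, 2, 1, 2, 3, 1, 2, 0, 2]
    assert len(fi) == 14

    fp = total_count * (0.65 + 0.01 * sum(fi))
    print(f"Total Count = {total_count}, sum(Fi) = {sum(fi)}, FP = {fp:.1f}")   # FP = 247.4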

Complexity Adjustment

  • Is reliable backup and recovery required?
  • Is networking involved?
  • Is distributed processing used?
  • Is performance critical?
  • Will the system run in an existing, heavily-utilized operational environment?
  • Does the system require online data entry?
  • Does the online data entry require multiple screens or operations?

Complexity Adjustment (cont)

  • Are the master files updated online or offline?
  • Are the inputs, outputs, files, or inquiries complex?
  • Is the internal processing complex?
  • Is the code designed to be reusable?
  • Are conversions and installations included in the design?
  • Is the system designed for multiple installations in different organizations?
  • Is the application designed to facilitate change and ease of use by the user?

Using FP

  • Errors/FP
  • Defects/FP
  • Cost/FP
  • Documentation Pages/FP
  • FP/person month

Using FP

  • FP and LOC metrics are relatively accurate predictors of effort and cost
  • Need a baseline of historical information to use them properly
  • LOC is language-dependent; FP is largely language-independent
  • Productivity factors: People, problem, process, product, and resources
  • We cannot easily reverse engineer FP from existing systems

Complexity Metrics

  • LOC - roughly a function of complexity
  • Language and programmer-dependent
  • Halstead’s Software Science (entropy measures)

Halstead’s Software Science

  • n1 - number of distinct operators
  • n2 - number of distinct operands
  • N1 - total number of operators
  • N2 - total number of operands

Example

  • Distinct operators (in a small code fragment, not shown): if, (, ), {, }, >, <, =, *, ;
  • Distinct operands: k, 20, 30, x
  • n1 = 10
  • n2 = 4
  • N1 = 13
  • N2 = 7

Halstead's Metrics

  • Experimentally verified [1970s]
  • Length: N = N1 + N2 = 13 + 7 = 20
  • Vocabulary: n = n1 + n2 = 10 + 4 = 14
  • Estimated length: N̂ = n1 log2(n1) + n2 log2(n2) = 10 log2(10) + 4 log2(4) ≈ 33.2 + 8 ≈ 41
    • Close estimate of length for well-structured programs
  • Purity ratio: PR = N̂ / N = 41 / 20 ≈ 2

Program Complexity

  • Volume: V = N log2 n = 20 log2 14 = 20 * 3.8 = 76
    • Number of bits to provide a unique designator for each of the n items in the program vocabulary.
  • Program effort: E = V / L = 76 / 20 = 3.8
    • L = V* / V
    • V* is the volume of most compact design implementation
    • A good measure of program understandability
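
A minimal Python sketch that reproduces the Halstead calculations above from the four basic counts (the effort step is omitted because V*, the volume of the most compact implementation, is not given in the example):

    import math

    # Halstead's measures for the example counts above.
    n1, n2 = 10, 4   # distinct operators, distinct operands
    N1, N2 = 13, 7   # total operators, total operands

    N = N1 + N2                                       # length            = 20
    n = n1 + n2                                       # vocabulary        = 14
    N_hat = n1 * math.log2(n1) + n2 * math.log2(n2)   # estimated length  ~ 41.2
    PR = N_hat / N                                    # purity ratio      ~ 2.05
    V = N * math.log2(n)                              # volume            ~ 76.1

    print(f"N = {N}, n = {n}, N^ = {N_hat:.1f}, PR = {PR:.2f}, V = {V:.1f}")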

McCabe's Complexity

  • McCabe's metrics are based on a control-flow representation of the program
  • A control flow graph (CFG) is used to depict the control flow
  • Nodes represent processing tasks (one or more code statements)
  • Edges represent control flow between nodes

Cyclomatic Complexity

  • The number of independent paths through the graph (the basis set)
  • V(G) = E – N + 2
    • E is the number of flow graph edges
    • N is the number of flow graph nodes

Example

[Figure: control flow graph (CFG) for the example code]

Computing Basis Set

  • Every path through the code is a combination of basis paths
  • A: 1, 7
  • B: 1, 2, 6, 1, 7
  • C: 1, 2, 3, 4, 5, 2, 6, 1, 7
  • D: 1, 2, 3, 5, 2, 6, 1, 7

Formula: V(G) = E – N + 2

  • V(G1) = E – N + 2
  • V(G1) = 9 – N + 2
  • V(G1) = 9 – 7 + 2
  • V(G1) = 4
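
A minimal Python sketch of the same calculation from an explicit edge list; the edges below are an assumed reconstruction of the example CFG (consistent with basis paths A-D, giving E = 9 and N = 7):

    # Cyclomatic complexity from a control flow graph: V(G) = E - N + 2.
    edges = [(1, 7), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5), (5, 2), (2, 6), (6, 1)]

    def cyclomatic_complexity(edges):
        nodes = {n for edge in edges for n in edge}   # N = number of distinct nodes
        return len(edges) - len(nodes) + 2            # E - N + 2

    print(cyclomatic_complexity(edges))   # 4, matching V(G1) above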

Meaning: V(G) = E – N + 2

  • V(G) is the number of regions of the planar flow graph (the enclosed regions plus the outer region)
  • The number of regions increases with the number of decision paths and loops
  • A quantitative measure of testing difficulty and an indication of ultimate reliability

Easier Calculation

  • Number of conditions + 1
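
A minimal Python sketch of this shortcut, counting branching keywords and boolean connectors in C-like source; it is deliberately crude (it ignores strings, comments, default cases, and the ternary operator):

    import re

    # Approximate V(G) as (number of decisions) + 1 by counting branch keywords and && / ||.
    DECISIONS = re.compile(r"\b(?:if|for|while|case)\b|&&|\|\|")

    def approx_complexity(source: str) -> int:
        return len(DECISIONS.findall(source)) + 1

    snippet = "\n".join([
        "int classify(int k) {",
        "    if (k < 0 && k > -100) return -1;",
        "    for (int i = 0; i < k; i++) {",
        "        if (i % 7 == 0) k--;",
        "    }",
        "    return k;",
        "}",
    ])
    print(approx_complexity(snippet))   # 5: if, &&, for, if, plus one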

Interpretation

  Complexity    Meaning
  1 - 10        Structured, well-written code; high testability; low cost and effort
  10 - 20       Complex code; medium testability; medium cost and effort
  20 - 40       Very complex code; low testability; high cost and effort
  > 40          Not testable at all; very high cost and effort

McClure's Complexity Metric

Complexity = C + V

  • C is the number of comparisons in a module
  • V is the number of control variables referenced in the module
  • Similar to McCabe's, but also takes control variables into account

Model for Metrics and Software Quality

  • FURPS
  • Functionality - features of the system
  • Usability – aesthetics, gestalt, documentation
  • Reliability – frequency of failure, security
  • Performance – speed, throughput
  • Supportability – maintainability

FURPS: Functionality

  • features of the system
  • Capability
  • Size & Generality of Feature Set
  • Reusability
  • Compatibility, Interoperability, Portability
  • Security
  • Safety & Exploitability

FURPS: Usability

  • aesthetics, gestalt, documentation
  • Human Factors
  • Aesthetics
  • Consistency
  • Documentation
  • Responsiveness

FURPS: Reliability

  • frequency of failure, security
  • Availability
  • Failure Frequency, Robustness/Durability/Resilience
  • Failure Extent & Time-Length
  • Recoverability/Survivability
  • Predictability
  • Stability
  • Accuracy
  • Frequency/Severity of Error

FURPS: Performance

  • speed, throughput
  • Speed
  • Efficiency
  • Resource Consumption
  • Throughput
  • Capacity
  • Scalability

FURPS: Supportability

  • Serviceability
  • Maintainability
  • Sustainability
  • Testability
  • Flexibility - Modifiability, Configurability, Adaptability, Extensibility, Modularity
  • Installability
  • Localizability