Software Engineering Methodologies

Refactoring

Michael L. Collard, Ph.D.

Department of Computer Science, The University of Akron

Why Design Must Change

  • UML inconsistencies, e.g., model behind class diagram and model behind sequence diagram
  • UML errors, i.e., invalid UML
  • Missing parts of the design
  • Growing experience with UML
  • Feedback from other developers
  • Improvements: e.g., better identifier names, decoupling, improving cohesion
  • Added features

When Does Design Change?

  • During the "design-only" period
  • After release to start development (coding)
  • After development (coding)

Factors for Successful Design Change

  • Multiple passes before development (coding)
  • Feedback from multiple perspectives
  • Format and tools used to create/edit UML
  • Design integrated with implementation phases
  • Adding features as an integrated part of the process

Mathematics: Factor

  • fac.tor
    • One of two or more quantities that divides a given quantity without a remainder, e.g., 2 and 3 are factors of 6; a and b are factors of ab
    • Multiple equivalent forms: 2 × 3 = 6 and 1 × 6 = 6, x^2 - 4 = (x - 2)(x + 2)
    • The best form depends on the problem to solve
  • fac.tor.ing
    • "To determine or indicate explicitly the factors of"

SE: Factoring

  • fac.tor
    • The individual items that combined form a complete software system:
      • identifiers
      • contents of methods
      • contents of classes
      • relationship of a class to other classes
    • Many different ways to factor a software system
  • fac.tor.ing
    • Determining the items, at design time, that make up a software system

Restructuring: [Opdyke92]

A program restructuring operation to support the design, evolution, and reuse of object-oriented frameworks that preserves the behavioural aspects of the program

Refactoring: [Fowler99]

Process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure

  • Does not change the behavior of the program
  • Type of software transformation
  • Source-to-source, remain inside the same language, e.g., Java to Java, C++ to C++
  • Originally designed for object-oriented languages, but can also apply to other programming paradigms, e.g., functions

Testing Requirements

  • Testing must be in place to make sure that the behavior of the software does not change
  • Requires unit testing
  • No refactoring (or changes of any kind) before unit testing is in place
  • During refactoring, the current, expected behavior of the software must not change
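
A minimal sketch of putting a unit test in place before any refactoring starts. The function `shipping_cost`, its pricing rule, and the helper `run_tests` are hypothetical, invented only for illustration. The tests pin the current, expected behavior so that each later refactoring step can be checked against it.

```python
import unittest


def shipping_cost(weight_kg):
    # Hypothetical code under test: flat fee plus a per-kilogram charge.
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return 5.0 + 1.5 * weight_kg


class ShippingCostTest(unittest.TestCase):
    # These tests record the current behavior; they must pass both
    # before and after every refactoring step.
    def test_typical_weight(self):
        self.assertEqual(shipping_cost(2), 8.0)

    def test_rejects_non_positive_weight(self):
        with self.assertRaises(ValueError):
            shipping_cost(0)


def run_tests():
    # Run the suite programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShippingCostTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests().wasSuccessful()` after each change gives a quick yes/no answer on whether the behavior was preserved.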

Legacy Code

  • Often applied to old systems with "old" technology (e.g., COBOL)
  • A great deal of the software's behavior is undocumented:
    • Not stated (or not clearly stated) in written policies
    • Not described in code documentation
    • Not even covered by a test suite
  • Any code that does not have a unit-testing suite is considered legacy code

Relationship to Process

  • Not the same as "cleaning up code" (which may cause changes to the behavior of the program)
    • Note: "Cleaning up code" is not effective
  • Changes to a small context (e.g., individual method) or the entire program
  • An essential element of most workflows and processes in agile
  • Viewpoint is that design is part of an iterative process

Refactoring: Rename Method

Refactoring: Rename Method Timeline

  1. Copy the declaration/definition of the old method to the new method
  2. Compile. You can also commit at this point.
  3. Change the body of the old method to call the new method
  4. Compile and be sure to commit
  5. Find all calls to the old method and replace them with calls to the new method
  6. Compile after each change. Commit at least once.
  7. Remove the old method declaration and definition (if not a public interface)
  8. Compile and commit
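
The intermediate state of this timeline might look like the following sketch. The class `Account` and the method names `get_bal` (old) and `available_balance` (new) are hypothetical. In Python, "compile" corresponds to running the test suite after each step.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance

    def available_balance(self):
        # Steps 1-2: the new method holds a copy of the old body.
        return self._balance

    def get_bal(self):
        # Steps 3-4: the old method now only forwards to the new one,
        # so existing callers keep working while they are migrated.
        return self.available_balance()


account = Account(100)
# Steps 5-6: callers are switched one at a time from get_bal() to
# available_balance(); both return the same value in the meantime.
assert account.get_bal() == account.available_balance() == 100
```

Once no caller uses `get_bal` (and it is not part of a public interface), steps 7-8 delete it.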

Some Examples

  • Introduce Explaining Variable
  • Rename Method
  • Move Method
  • Pull Up Method
  • Change Value to Reference
  • Remove Parameter
  • Extract Hierarchy

Refactoring: Split Temporary Variable
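
A small before/after sketch of this refactoring, assuming a hypothetical function that reports the perimeter and area of a rectangle. The function and variable names are illustrative only.

```python
def rectangle_report_before(height, width):
    # One temporary variable is reused for two unrelated values.
    temp = 2 * (height + width)
    perimeter_text = f"perimeter={temp}"
    temp = height * width
    return f"{perimeter_text}, area={temp}"


def rectangle_report_after(height, width):
    # Each value gets its own variable, assigned exactly once.
    perimeter = 2 * (height + width)
    area = height * width
    return f"perimeter={perimeter}, area={area}"
```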

Refactoring: Pull-Up Method

  • Methods with identical results in subclasses
  • Move them to the superclass
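
A sketch of the result, assuming hypothetical classes `Employee`, `Engineer`, and `Salesperson` in which both subclasses originally defined an identical `badge_label` method.

```python
class Employee:
    def __init__(self, name):
        self.name = name

    def badge_label(self):
        # After the refactoring the method lives once, in the superclass.
        return f"{self.name} ({type(self).__name__})"


class Engineer(Employee):
    pass  # previously defined its own identical badge_label()


class Salesperson(Employee):
    pass  # previously defined its own identical badge_label()
```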

Refactoring: Replace Inheritance with Delegation (UML)

  • A subclass uses only part of a superclass interface or does not want to inherit data.
  • Create a field for the superclass, adjust methods to delegate to the superclass, and remove the subclassing

Refactoring: Replace Inheritance with Delegation (Code)
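
One possible illustration, using a stack built on Python's `list`. The class names `InheritingStack` and `Stack` are hypothetical; the first shows the starting point, the second the result of the refactoring.

```python
class InheritingStack(list):
    # Before: inherits the whole list interface (insert, sort, ...),
    # although a stack only needs push, pop, and an emptiness check.
    def push(self, item):
        self.append(item)


class Stack:
    # After: holds a list in a field and delegates to it, exposing
    # only the operations a stack should have.
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items
```

With delegation, callers can no longer reach list operations such as `insert` that would break the stack discipline.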

Levels of Software Changes

  • High Level
    • Features to be added to a system
    • e.g., New feature
  • Intermediate Level
    • Change design (factoring)
    • e.g., Move a member function
  • Low Level
    • Change lines of code
    • e.g., Changes in (at least) two classes

Why: Design Preservation

  • Code changes often lead to a loss of the original design
  • Loss of design is cumulative:
    • Difficulty comprehending design → Difficulty preserving design → Design decays more rapidly → Difficulty comprehending design
  • Refactoring improves the design of existing code

Why: Comprehension

  • Developers are most concerned with getting the program to work, not about future developers
  • Refactoring makes existing code more readable
  • Increases comprehension of individual pieces of existing code, which leads to higher-level comprehension of the system
  • Often applied in stages

Why: Debugging

  • Greater program comprehension leads to easier debugging
  • Increased readability leads to the discovery of possible errors
  • Developers can put back into the code the understanding gained during debugging

Why: Faster Programming

  • A counterintuitive argument made by Fowler
  • Good design is essential for rapid development
  • A poor design allows for quick progress at first but soon slows the process down
    • More time is spent debugging
    • Changes take longer as you try to understand the system and find duplicate code
  • Supported by Lehman's laws

When: During Activities Such As

  • Adding Functionality
  • Comprehension of an existing program
  • Preparation for additional functionality
  • Debugging
  • Code Review
  • Preparation for suggestions to other programmers
  • Looking for ideas to improve
  • Becomes the primary granularity of change in a system

Catalog

  • Collected by Fowler
  • Refactoring entry composed of:
    • Name
    • Summary
    • Motivation
    • Mechanics
    • Examples
  • Recent edition examples are in JavaScript

Categories

  • Composing Methods
  • Organizing Data
  • Moving Features Between Objects
  • Simplifying Conditional Expressions
  • Simplifying Method Calls
  • Dealing with Generalization
  • Big Refactorings

Composing Methods: Extract Method

Composing Methods: Inline Method
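
A brief sketch, assuming a hypothetical delivery-rating rule. The helper's body is as clear as its name, so the call is replaced by the body.

```python
def more_than_five_late_deliveries(late_deliveries):
    return late_deliveries > 5


def rating_before(late_deliveries):
    return 2 if more_than_five_late_deliveries(late_deliveries) else 1


def rating_after(late_deliveries):
    # The helper has been inlined; it can be deleted once no other
    # caller uses it.
    return 2 if late_deliveries > 5 else 1
```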

Replace Nested Conditional with Guard Clauses
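
A before/after sketch, assuming a hypothetical pay calculation with three special cases. The amounts are placeholders chosen only to make the branches distinguishable.

```python
def pay_amount_before(is_dead, is_separated, is_retired):
    if is_dead:
        result = 0
    else:
        if is_separated:
            result = 50
        else:
            if is_retired:
                result = 80
            else:
                result = 100
    return result


def pay_amount_after(is_dead, is_separated, is_retired):
    # Each special case returns immediately (a guard clause), leaving
    # the normal case as the final, unindented line.
    if is_dead:
        return 0
    if is_separated:
        return 50
    if is_retired:
        return 80
    return 100
```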

Inline Temp
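
A short sketch, assuming a hypothetical `Order` class with a `base_price()` method. The temporary variable is assigned once from a simple expression and used once, so it is replaced by the expression itself.

```python
class Order:
    def __init__(self, base_price):
        self._base_price = base_price

    def base_price(self):
        return self._base_price


def is_large_order_before(order):
    base_price = order.base_price()  # temp assigned once, used once
    return base_price > 1000


def is_large_order_after(order):
    # The temp has been replaced by the expression it held.
    return order.base_price() > 1000
```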

Replace Temp with Query
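
A sketch of the idea, assuming a hypothetical `Order` class and discount rule. The temporary variable `base_price` becomes a method, so other methods of the class can reuse the same computation.

```python
class Order:
    def __init__(self, quantity, item_price):
        self.quantity = quantity
        self.item_price = item_price

    def price_before(self):
        # The temp holds the result of an expression.
        base_price = self.quantity * self.item_price
        if base_price > 1000:
            return base_price * 0.95
        return base_price * 0.98

    def base_price(self):
        # The query that replaces the temp.
        return self.quantity * self.item_price

    def price_after(self):
        if self.base_price() > 1000:
            return self.base_price() * 0.95
        return self.base_price() * 0.98
```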

Fowler's Catalog

  • List of each refactoring and links to individual refactorings
  • Primary name - second edition
  • Secondary names - first edition
  • Code examples are in JavaScript
  • May not include all second-edition refactorings

Composing Methods

Simplifying Conditional Expressions

Organizing Data I

Organizing Data II

Moving Object Features

Simplifying Method Calls I

Simplifying Method Calls II

Dealing with Generalization

Dealing with Generalization II

Big Refactorings

  • Tease Apart Inheritance
    • Split an inheritance hierarchy that is doing two jobs at once
  • Convert Procedural Design to Objects
  • Separate Domain from Presentation
    • GUI classes that contain domain logic
  • Extract Hierarchy
    • Create a hierarchy of classes from a single class where the class contains many conditional statements

Convert Procedural Design

  1. Take each record type and turn it into a dumb data object with accessors
  2. Take all procedural code and put it into a single class
  3. Take each long method and apply Extract Method and related refactorings to break it down. As you break down the procedures, use Move Method to move each one to the appropriate dumb data class
  4. Continue until you have removed all behavior from the original class
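
The end state of these steps might look like the sketch below. The record layout, the procedure `order_total_procedural`, and the classes `OrderLine` and `Order` are all hypothetical; the intermediate single class of step 2 is omitted for brevity.

```python
from dataclasses import dataclass


def order_total_procedural(order_record):
    # Before: a free-standing procedure operating on a plain record.
    return sum(qty * price for qty, price in order_record["lines"])


@dataclass
class OrderLine:
    # After steps 1 and 3: the record is a data class, and the part of
    # the procedure that concerns one line has been moved onto it.
    quantity: int
    unit_price: float

    def total(self):
        return self.quantity * self.unit_price


@dataclass
class Order:
    lines: list

    def total(self):
        return sum(line.total() for line in self.lines)
```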

Extract Method

  1. Create a new method and name it after the intention of the method (name it by what it does, not by how it does it)
  2. Copy the extracted code from the source method into the new target method
  3. Scan the extracted code for references to any variables that are local in scope to the source method

Extract Method II

  1. See whether any temporary variables are used only within this extracted code. If so, declare them in the target method as temporary variables
  2. Look to see whether any local-scope variables are modified by the extracted code (see Split Temporary Variable and Replace Temp with Query)
  3. Pass into the target method, as parameters, the local-scope variables that are read by the extracted code

Extract Method III

  1. Compile when you have dealt with all the locally-scoped variables
  2. Replace the extracted code in the source method with a call to the target method
  3. Compile and test
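
The Extract Method mechanics can be sketched on a small example. The function names and the report format below are hypothetical. The local variables `name` and `outstanding` are only read by the extracted code, so they are passed into the target method as parameters.

```python
def owing_report_before(name, amounts):
    lines = ["*** Customer Owes ***"]
    outstanding = sum(amounts)
    # Code to be extracted: formats the details.
    lines.append(f"name: {name}")
    lines.append(f"amount: {outstanding}")
    return lines


def format_details(name, outstanding):
    # Target method, named for what it does. The locals it reads from
    # the source method arrive as parameters.
    return [f"name: {name}", f"amount: {outstanding}"]


def owing_report_after(name, amounts):
    lines = ["*** Customer Owes ***"]
    outstanding = sum(amounts)
    # The extracted code is replaced by a call to the target method.
    lines.extend(format_details(name, outstanding))
    return lines
```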

Challenges

  • Preservation of documentary structure (comments, white space)
  • Preprocessed code (C, C++)
  • Integration with test suite
  • Discovery of possible refactorings
  • Creation of task-specific refactorings

Database Refactoring

  • Change database schema
  • To prepare for new fields, remove unused parts, and improve organization
  • Problem: It is often tricky to make schema changes

Changing Published Interfaces

  • Interfaces where you do not control all of the source code that uses the interface
  • Must support (for at least a time period) both old and new interfaces
  • Do not publish interfaces unless you have to
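
One common way to support both interfaces for a time is to keep the old function as a thin wrapper that warns and forwards to the new one. In this sketch the function names `getUser` (old) and `fetch_user` (new) and the returned dictionary are hypothetical; `warnings.warn` and `DeprecationWarning` are from the Python standard library.

```python
import warnings


def fetch_user(user_id):
    # New published interface.
    return {"id": user_id}


def getUser(user_id):
    # Old published interface, kept for a transition period. It warns
    # callers and delegates to the new interface.
    warnings.warn(
        "getUser() is deprecated; use fetch_user() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return fetch_user(user_id)
```

Callers that have not migrated keep working, and the warning tells them what to change before the old interface is removed.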