Software Engineering (SE)

Follow a TDD process to implement the project below. An invitation link to create the GitHub Classroom repository is on the Brightspace course page.

Project

The input for a source-code static analysis tool can come from multiple sources, including solitary files (e.g., main.cpp), directories of source-code files (e.g., src/), source-code archives (e.g., project.tar.gz, file.zip), and standard input (stdin, i.e., std::cin in C++). In addition, solitary files and source-code archives can include a URL, e.g., https://mlcollard.net/fragment.cpp

To perform the code analysis, the source code is wrapped in a single XML element with metadata about the source code, including filename, programming language, url, hash, and LOC (lines of code). For the source code:

The attribute hash is the SHA-1 hash of the source: 160 bits (20 bytes), represented as 40 hexadecimal characters, i.e., the form Git uses. The xmlns:code="http://mlcollard.net/code" is not an attribute but an XML namespace declaration, which XMLWrapper handles automatically.
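As a quick illustration of the 40-hex-character form described above, here is a minimal sketch of a shape check. The helper name looksLikeSHA1Hex() is hypothetical and not part of the provided project code, and the Rules do not say that the option hash must be validated; the sketch only shows what a well-formed value looks like.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of the provided project code).
// Returns true when text has the shape of a SHA-1 hash in hex:
// exactly 40 hexadecimal characters (160 bits = 20 bytes = 40 hex digits).
bool looksLikeSHA1Hex(const std::string& text) {
    if (text.size() != 40)
        return false;
    for (unsigned char c : text) {
        if (!std::isxdigit(c))
            return false;
    }
    return true;
}

// Small assert-based self-check; call it from a test program's main().
void testLooksLikeSHA1Hex() {
    // SHA-1 of empty input, 40 hex characters
    assert(looksLikeSHA1Hex("da39a3ee5e6b4b0d3255bfef95601890afd80709"));
    // too short
    assert(!looksLikeSHA1Hex("da39a3ee"));
    // right length, but 'g' is not a hex digit
    assert(!looksLikeSHA1Hex("ga39a3ee5e6b4b0d3255bfef95601890afd80709"));
}
```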

The Rules stated below determine how the variety of input sources and program options map to the metadata.

Assignment

Implement the rules below using a Test-Driven Development (TDD) approach. For each step, write a test in CodeAnalysisTest.cpp, implement the minimal code necessary to make it pass in CodeAnalysis.cpp, refactor the code so that it is clear and logical, and then commit with an appropriate commit message.

All commits must result in a program that compiles and passes all tests. This means you must successfully compile and pass the test program before committing. Any commit that does not build will result in a 0 for that step, and every commit will be checked.
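One test-then-implement step might look like the following minimal sketch, based on the rule that the attribute loc is added only when the option loc is non-negative. The helper locAttributeValue() and the test function name are hypothetical; the real project tests go in CodeAnalysisTest.cpp against whatever interface CodeAnalysis.cpp exposes. The sketch assumes C++17 for std::optional.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper, written only after the test below existed and failed.
// Returns the value for the attribute loc, or nothing when the option loc
// is negative (in which case no attribute loc is added).
std::optional<std::string> locAttributeValue(int optionLOC) {
    if (optionLOC < 0)
        return std::nullopt;
    return std::to_string(optionLOC);
}

// The test comes first: it states the expected behavior for the minimum
// LOC (0), a typical value, and a negative value.
void testLOCAttribute() {
    assert(locAttributeValue(0).has_value());
    assert(*locAttributeValue(0) == "0");
    assert(*locAttributeValue(42) == "42");
    assert(!locAttributeValue(-1).has_value());
}
```

Once the test passes, the remaining parts of the cycle are the refactor and the commit.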

Rules

The minimum LOC is 0. If the option loc is non-negative, add the attribute loc.
If you cannot figure out the attribute language based on any other rule, use the default language.
When the option hash is not empty, use it for the attribute hash.
When the source url is not empty, use it for the attribute url.
When used, the option url has priority over the source url for the attribute url.
The option language has priority over any other way of determining the attribute language.
The option filename has priority over any other way of determining the attribute filename.
For a non-archive source-code file, the file extension of the disk filename determines the language, e.g., for the disk filename main.cpp, the language is C++. Use the provided function, filenameToLanguage(), to convert from a filename to a language for the attribute language.
For a file in a source-code archive, the file extension of the entry filename determines the language, e.g., for the entry filename main.cpp, the language is C++. Use the provided function, filenameToLanguage(), to convert from a filename to a language for the attribute language.
For a source-code archive, use the entry filename instead of the disk filename for the attribute filename.
For a non-archive source-code file, use the disk filename for the attribute filename.
For standard input (i.e., std::cin) of non-archive source code, the disk filename is a single dash "-" and the entry filename is the literal string "data". In this case, you must use the option filename for the attribute filename.
For standard input (i.e., std::cin) of a source-code archive, the disk filename is a single dash "-". In this case, use the entry filename for the attribute filename.
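The priority order in the rules above can be sketched as three small resolution functions. Everything named here is hypothetical: the struct AnalysisInput, the resolve functions, and languageFromExtension(), which is only a stand-in for the provided filenameToLanguage() (whose actual signature and language table may differ). The sketch also assumes that an entry filename of "data" marks any non-archive source, which the rules state only for standard input.

```cpp
#include <cassert>
#include <string>

// Hypothetical bundle of the inputs named in the Rules.
struct AnalysisInput {
    std::string diskFilename;    // "-" for standard input
    std::string entryFilename;   // assumed "data" for a non-archive source
    std::string sourceURL;
    std::string optionFilename;
    std::string optionURL;
    std::string optionLanguage;
    std::string defaultLanguage;
};

// Stand-in for the provided filenameToLanguage(); it only knows ".cpp".
std::string languageFromExtension(const std::string& filename) {
    const std::string::size_type dot = filename.rfind('.');
    if (dot != std::string::npos && filename.substr(dot) == ".cpp")
        return "C++";
    return "";
}

// Assumption: a non-archive source always has the entry filename "data".
bool isArchive(const AnalysisInput& in) {
    return in.entryFilename != "data";
}

// Option filename first, then entry filename (archive) or disk filename.
std::string resolveFilename(const AnalysisInput& in) {
    if (!in.optionFilename.empty())
        return in.optionFilename;
    if (isArchive(in))
        return in.entryFilename;
    if (in.diskFilename != "-")
        return in.diskFilename;
    return "";  // non-archive standard input without the option filename
}

// Option url has priority over the source url.
std::string resolveURL(const AnalysisInput& in) {
    return !in.optionURL.empty() ? in.optionURL : in.sourceURL;
}

// Option language first, then the file extension, then the default language.
std::string resolveLanguage(const AnalysisInput& in) {
    if (!in.optionLanguage.empty())
        return in.optionLanguage;
    const std::string fromName =
        languageFromExtension(isArchive(in) ? in.entryFilename : in.diskFilename);
    return !fromName.empty() ? fromName : in.defaultLanguage;
}

// Assert-based checks; field order matches the struct declaration.
void testResolution() {
    const AnalysisInput file{"main.cpp", "data", "", "", "", "", "Java"};
    assert(resolveFilename(file) == "main.cpp");
    assert(resolveLanguage(file) == "C++");

    const AnalysisInput archive{"project.zip", "util.cpp", "", "", "", "", ""};
    assert(resolveFilename(archive) == "util.cpp");
    assert(resolveLanguage(archive) == "C++");

    const AnalysisInput stdinFile{"-", "data", "", "name.cpp", "", "", "Java"};
    assert(resolveFilename(stdinFile) == "name.cpp");
    assert(resolveLanguage(stdinFile) == "Java");
}
```

Keeping each attribute in its own function means each rule can be tested and committed on its own, one step at a time.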

Error Handling

Write every error message to standard error (i.e., std::cerr) on its own output line, and have the function return an empty string.
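A minimal sketch of this convention follows. The helper reportError() and the message text are hypothetical; the actual error conditions and wording are not shown here. The test captures std::cerr by temporarily swapping its stream buffer, which is one way to check the output without touching the terminal.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes the message on its own line of standard error
// and returns the empty string that signals failure to the caller.
std::string reportError(const std::string& message) {
    std::cerr << message << '\n';
    return "";
}

// Captures std::cerr, calls the helper, then restores the original buffer.
void testReportError() {
    std::ostringstream captured;
    std::streambuf* original = std::cerr.rdbuf(captured.rdbuf());
    const std::string result = reportError("example error message");
    std::cerr.rdbuf(original);

    assert(result.empty());
    assert(captured.str() == "example error message\n");
}
```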

Workflow

Refactoring Guidelines

All implementation should result in code that is as clean and clear as possible. For this case:

Evaluation Environment

The project is built and graded in a Docker container running Ubuntu 22.04 using the GCC compiler. This is the default for GitHub Codespaces and WSL.

On macOS, clang is the default compiler and the one you will want to use in most cases. This should not cause a problem with this program. However, before you commit, I recommend ensuring that your code builds and passes the tests with both GCC and clang.

The preset macos-gcc (in the file CMakePresets.json) provides settings to compile with GCC:

Instead of flipping back and forth in the same build directory, use two build directories, one for clang and one for GCC. Be sure to build and test in both build directories before you commit.

CPSC 480-010 Software Engineering (SE) Fall 2023