Project 1: srcFacts

Note: For the demo example, the number of return statements is 14516, and the number of literal strings is 21840. The total number of comments is 18938, with the majority, 18389, as block comments and only 9 as line comments. The demo is almost all C code.

Note: Repository with small, focused srcMLExamples

One approach to understanding the source code of a system is to count things, e.g., the number of statements, functions, classes, expressions, LOC (Lines Of Code), etc. The program srcFacts produces a report of these counts.

It is very difficult for a program to parse C++ source code directly, i.e., to identify the syntactic parts of the code. So the srcFacts program does not parse the C++ source code itself. Instead, it relies on another tool, srcML, to convert the C++ source code into an XML format. The srcFacts program reads this XML, parses it while collecting counts as it goes, and produces the report.
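To make the idea of counting from XML concrete, here is a minimal sketch and not the srcFacts implementation. The function name countStartTags is hypothetical, and the scan is deliberately naive: it assumes well-formed input and does not handle XML comments, CDATA sections, or namespaces specially.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Hypothetical helper: count how many times each element name
// appears as a start tag in an XML string.
std::map<std::string, int> countStartTags(std::string_view xml) {
    std::map<std::string, int> counts;
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string_view::npos) {
        ++pos;  // move past '<'
        // skip end tags, XML declarations, and comment/CDATA openers
        if (pos >= xml.size() || xml[pos] == '/' || xml[pos] == '?' || xml[pos] == '!')
            continue;
        // the element name ends at whitespace, '/', or '>'
        const std::size_t end = xml.find_first_of(" \t\n/>", pos);
        if (end == std::string_view::npos)
            break;  // truncated tag, stop scanning
        ++counts[std::string(xml.substr(pos, end - pos))];
        pos = end;
    }
    return counts;
}
```

For the input `<unit><return>return;</return><return>x</return></unit>`, the returned map holds a count of 1 for unit and 2 for return. A real parser works on a buffer that is refilled as the input is read, which this sketch leaves out.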

The srcFacts program is one large main() function that uses almost no design features. It does scale and perform well: it is quite fast and, for example, can produce a report on the entire Linux kernel in under 20 seconds. However, in its current form, it is not easy to debug the XML parsing, add additional program element counts, or even understand what is going on. It includes code for parsing all parts of XML, but it would be difficult to adapt the XML parsing code for another purpose, e.g., another report. The only way to reuse this code is by copy/paste. It has no modularity, extensibility, or reusability, and, in particular, no maintainability.

Your task is to take the srcFacts code, improve the overall design, and extract the XML parsing part of the code. When you are finished, a set of functions in the files xml_parser.hpp and xml_parser.cpp handles as much of the XML parsing as possible, while the main program, srcFacts.cpp, generates the report by calling these XML parser functions.

The steps (in order) with the associated (git) tag are:

Tag “v1a” Move the refillBuffer() function into separate files, refillBuffer.hpp and refillBuffer.cpp. Due Feb 5

Tag “v1b” Without changing the design, add the code necessary for the report to include the number of return statements and the number of literal strings. Due Feb 5

Tag “v1c” Extract a set of (free) functions to handle the low-level details of the XML parsing. Each commit can only include a single extracted function (i.e., extract one at a time). Due Feb 15

Tag “v1d” Add the necessary code to count the number of line comments. Due Feb 15
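The shape of the code after these steps can be sketched as follows. This is an illustration under stated assumptions, not the required solution: the names StartTag, isXMLStartTag, parseXMLStartTag, Counts, and updateCounts are all hypothetical, and the sketch assumes that srcML marks return statements as a return element, literal strings as a literal element with type="string", and line comments as a comment element with type="line". It also assumes double-quoted attribute values that contain no '>' character.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical result of parsing one start tag
struct StartTag {
    std::string_view name;        // element name, e.g., "literal"
    std::string_view attributes;  // raw text between the name and '>'
};

// Extracted free function: does content begin with a start tag?
bool isXMLStartTag(std::string_view content) {
    return content.size() > 1 && content[0] == '<' &&
           content[1] != '/' && content[1] != '?' && content[1] != '!';
}

// Extracted free function: parse the start tag at the front of content
// and remove it from content. Precondition: isXMLStartTag(content)
StartTag parseXMLStartTag(std::string_view& content) {
    content.remove_prefix(1);  // skip '<'
    const std::size_t close = content.find('>');
    std::string_view tag = content.substr(0, close);
    content.remove_prefix(close == std::string_view::npos ? content.size() : close + 1);
    if (!tag.empty() && tag.back() == '/')
        tag.remove_suffix(1);  // empty element such as <name/>
    const std::size_t nameEnd = tag.find_first_of(" \t\n");
    StartTag result;
    result.name = tag.substr(0, nameEnd);
    if (nameEnd != std::string_view::npos)
        result.attributes = tag.substr(nameEnd);
    return result;
}

// Report-side counts that stay in the main program
struct Counts {
    int returns = 0;
    int literalStrings = 0;
    int lineComments = 0;
};

// Update the counts for one parsed start tag
void updateCounts(const StartTag& tag, Counts& counts) {
    constexpr auto npos = std::string_view::npos;
    if (tag.name == "return")
        ++counts.returns;
    else if (tag.name == "literal" && tag.attributes.find("type=\"string\"") != npos)
        ++counts.literalStrings;
    else if (tag.name == "comment" && tag.attributes.find("type=\"line\"") != npos)
        ++counts.lineComments;
}
```

The point of the split is that the two parsing functions know nothing about the report, so they could live in xml_parser.hpp and xml_parser.cpp, while the counting logic stays in srcFacts.cpp. For “v1b”, where the design is not yet changed, the same comparisons would sit inline in main() next to the existing counts.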

Your repository for this project is through GitHub Classroom. A link to create the repository is available on the Brightspace course page. Make sure to log in to the GitHub account that you are going to use before you click the link.