This article (originally featured in www.eedesignnewseurope.com) was inspired by a NASA competition announcement early in 2017 seeking help to update an old FORTRAN application. Here, we examine the best course of action for migrating legacy applications to a modern environment – bugs and all.
NASA’s FUN3D is written in old FORTRAN code, runs on its Pleiades supercomputer, and is used for computational fluid dynamics (CFD) analysis of its experimental aircraft designs. FORTRAN has been around since the 1950s and is widely considered to be the oldest high-level programming language; in its heyday, it was positioned as the best tool for scientific and engineering application development.
So, is the challenge because there is a lack of engineers at NASA with the appropriate FORTRAN skills? Given its prominence in the 1950s, many of the engineers and programmers who were involved in its development have probably retired. It might be a consideration.
More importantly, on closer inspection it appears that FUN3D is a mixed language environment with part of the application in FORTRAN and part in C++ and Ruby. C++ is seen as the natural contender to inherit FORTRAN’s position for scientific and engineering tasks.
In theory, another reason for moving a legacy application is hardware obsolescence, where the replacement architecture no longer supports the legacy application environment.
So, I’ve been giving some thought to how easy it is to tackle a project that requires the migration, and even modernisation, of this type of legacy application to a C++ environment, and how you would sustain functionality and improve software quality along the way. I’ve come up with a four-step plan that should remove a lot of the pitfalls encountered when undertaking this type of project.
Step 1 – Assess
As the first step of the project, the existing behaviour of the application needs to be baselined so that, in the final stage of the project, the new code can be shown to match exactly the behaviour exhibited in the original form. However, this prerequisite doesn’t inhibit the project team’s ability to reduce existing technical debt and improve the overall reliability, manageability and quality of the software.
Today we are all familiar with checking code coverage and static analysis statistics, but neither exists in the FORTRAN environment. This means investigating the existing testing infrastructure and data to document what tests are available and which pass and fail, and coupling them with their associated features and requirements.
As the new target environment is C++, this would mean loading the requirements into a software testing environment like VectorCAST, utilising its requirements gateway, so that once converted, the new code can reliably be tested against its pre-existing performance.
A key issue during this step is to watch out for comments in the existing code about errors that have been accommodated in the application logic over time rather than being fixed. The application then relies on the erroneous value as if it were correct output. In some situations, bugs have become features, with many areas of the application relying on the bug and the way it behaves. I will explain more about this later.
Step 2 – Convert
You might consider it lucky that converters for old high-level languages exist: f2c, f2cpp or fable for FORTRAN, and COB2CPP for COBOL, which was used in many legacy business and financial applications. However, there’s a lot more to the process than simply uploading the original source into the conversion utility and outputting new source code. This process will transfer any existing technical debt, and will probably increase it by adding the inherent technical debt of the converter as well. So how do you know the state of your project?
One of the key goals of the project is to sustain functionality while improving the quality of the software. My advice is to consider the project holistically, as no one testing tool will suffice if you want the final application to be of the highest quality. I’d want to have some analytical evaluation of the task at hand so that I can plan the best way to attack the project. OK, it’s C++, so there are three things that should be done straight away.
- First, run tests to prove the converted code is functioning in the exact same way that it did beforehand when the assessment was carried out.
- Second, run static analysis to check how well the converter performed and how well the new code base adheres to coding standards. Converters generally parse the source language and create standardised target-language code with a low complexity score. Therefore, any functions or features with a high complexity score would be a good place to start when we need to improve the quality of the resultant code base.
- Third, check to see if there are unit tests for every piece of the code, and if not, automatically generate them so that code coverage can be checked.
While automatically generating missing tests is a great time-saver, it needs to be viewed in context. Having test cases for every unit can provide 100% code coverage, but it will not prove code correctness. It will formalise what the code does, which is powerful as it allows any further changes to be validated to ensure they don’t break existing behaviour.
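To make the first check in the list above concrete, here is a minimal C++ sketch of comparing the converted code's output with a baseline recorded during the assessment step. The function name `matches_baseline` and its tolerance parameter are my own illustrative choices, not part of any tool mentioned here, and the numbers are made up. A tolerance of zero demands an exact match; a small non-zero tolerance may be needed if the new compiler evaluates floating-point expressions in a different order, and that is a decision to document rather than assume.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Compares output from the converted code with values recorded from the
// original application. rel_tol == 0.0 requires an exact match.
bool matches_baseline(const std::vector<double>& baseline,
                      const std::vector<double>& actual,
                      double rel_tol) {
    if (baseline.size() != actual.size()) {
        return false;  // a different number of results is always a mismatch
    }
    for (std::size_t i = 0; i < baseline.size(); ++i) {
        if (baseline[i] == actual[i]) {
            continue;  // identical values, including matching infinities
        }
        if (!std::isfinite(baseline[i]) || !std::isfinite(actual[i])) {
            return false;  // NaN, or an infinity that does not match
        }
        const double diff = std::fabs(baseline[i] - actual[i]);
        const double scale = std::max(std::fabs(baseline[i]),
                                      std::fabs(actual[i]));
        if (diff > rel_tol * scale) {
            return false;
        }
    }
    return true;
}

// Example check with made-up numbers standing in for recorded results.
void check_converted_output() {
    const std::vector<double> baseline = {0.25, 1.5, -3.0};
    const std::vector<double> converted = {0.25, 1.5, -3.0};
    assert(matches_baseline(baseline, converted, 0.0));
}
```

Calling `check_converted_output()` from a test driver runs the comparison; in a real project the two vectors would come from the recorded baseline files and from the converted application.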
At this stage I would be looking at using an environment like VectorCAST/C++, VectorCAST/Lint and VectorCAST/Analytics products to create a platform that looks precisely at the condition of the code base to determine the highest priority areas that need the most attention to get them up to an acceptable standard.
Step 3 – Quality!
Now the environment will be fully instrumented, and it would be easy to just start fixing all the static analysis errors and improving code coverage. However, I mentioned earlier the need to pay attention to any comments in the source code and logs concerning bugs that have become, let’s say, ‘features’. We now need to consider “bug for bug” compatibility. If we go ahead and fix the original bug, then many functions that rely on that anomaly will stop working and will need fixing too. Take the example where an existing function wrongly evaluates the sum of two registers, returning a result that is out by a factor of two. Sixty percent of the features call this function and rely on its incorrect evaluation. Fixing it would be easy, but it would then create a large volume of follow-on activity to correct the features.
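The factor-of-two example can be pinned down in code before anyone touches it. The sketch below is hypothetical: `legacy_sum`, `corrected_sum` and `average_of_pair` are names I have invented for illustration, and the caller's compensation is just one way such a dependency might look. The idea is to record the faulty behaviour in a test, introduce the fix alongside it, and migrate callers one at a time.

```cpp
#include <cassert>

// Converted as-is: should return a + b, but has always returned twice that.
int legacy_sum(int a, int b) {
    return 2 * (a + b);  // known defect, preserved deliberately for now
}

// The corrected routine lives next to the legacy one, not in place of it.
int corrected_sum(int a, int b) {
    return a + b;
}

// A caller that silently compensates for the defect by dividing by 4
// instead of 2. Swapping in corrected_sum here without also changing the
// divisor would halve every result.
int average_of_pair(int a, int b) {
    return legacy_sum(a, b) / 4;
}

// Characterisation tests: they document what the code does today,
// defect included, so that any change in behaviour is a conscious one.
void characterise_legacy_sum() {
    assert(legacy_sum(3, 5) == 16);      // wrong, but relied upon
    assert(average_of_pair(3, 5) == 4);  // correct only because of the defect
    assert(corrected_sum(3, 5) == 8);    // target behaviour after migration
}
```

With these tests in place, a well-meaning fix to `legacy_sum` fails immediately and visibly, instead of surfacing later as wrong results in the features that depend on it.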
It’s a huge issue, as it’s not known what impact a change in one line of code will have on the rest of the application. What’s needed here is an approach that keeps the developer informed about which test results will be affected by a change and will therefore need to be run again. Many developers call it “impact analysis”; I call it “change-based testing”. It’s not unusual for large applications to take one or two weeks to run a complete suite of tests. With change-based testing, small changes can be tested in minutes, as the environment computes the minimum set of test cases impacted by each code change and runs just those tests.
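The core of change-based testing can be sketched in a few lines. The following is a simplified illustration, not how VectorCAST or any other product implements it: I am assuming that coverage data from a previous full run is available as a map from each test name to the set of functions it executed. The type `CoverageMap`, the function `select_impacted_tests` and all test and function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Test name -> names of the functions that test executed in the last full run.
using CoverageMap = std::map<std::string, std::set<std::string>>;

// Returns the tests that executed at least one changed function.
// Results come back in alphabetical order because std::map is ordered.
std::vector<std::string> select_impacted_tests(
        const CoverageMap& coverage,
        const std::set<std::string>& changed_functions) {
    std::vector<std::string> impacted;
    for (const auto& entry : coverage) {
        for (const auto& fn : changed_functions) {
            if (entry.second.count(fn) != 0) {
                impacted.push_back(entry.first);
                break;  // one hit is enough to select this test
            }
        }
    }
    return impacted;
}

// Example with made-up coverage data.
void demo_selection() {
    CoverageMap coverage;
    coverage["test_flux"] = std::set<std::string>{"compute_flux", "legacy_sum"};
    coverage["test_mesh"] = std::set<std::string>{"load_mesh"};
    coverage["test_solver"] = std::set<std::string>{"legacy_sum", "solve"};

    const std::set<std::string> changed = {"legacy_sum"};
    const std::vector<std::string> impacted =
        select_impacted_tests(coverage, changed);

    assert(impacted.size() == 2);
    assert(impacted[0] == "test_flux");
    assert(impacted[1] == "test_solver");
}
```

In this example only two of the three tests need to be rerun after a change to `legacy_sum`. A real tool would work at a finer granularity than function names and would need to keep the coverage data up to date as the code changes.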
Using this approach, it’s now possible to work your way towards the project’s quality goals for code coverage or other measures. One key area to pay attention to, as I mentioned at the beginning, is that a lot of legacy applications operate in a mixed language environment. Therefore, there are boundary conditions and APIs to be considered, and it’s the early documentation work that will provide the key to ensuring the new C++ version of the application behaves identically to its predecessor.
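As an illustration of the kind of boundary detail worth documenting, the sketch below shows two conventions that commonly differ between FORTRAN and C++: FORTRAN arrays are stored column by column with indices that start at 1 by default, and arguments are normally passed by reference. The routine `row_sum` and the helper `fortran_index` are invented for this example; real symbol names at the boundary depend on the compiler or on an explicit `bind(C)` declaration on the FORTRAN side.

```cpp
#include <cassert>
#include <cstddef>

// Maps a 1-based FORTRAN index pair (i, j) of an array with `rows` rows
// to the 0-based offset in the underlying column-major storage.
std::size_t fortran_index(std::size_t i, std::size_t j, std::size_t rows) {
    return (i - 1) + (j - 1) * rows;
}

// A C++ replacement for a hypothetical FORTRAN routine. extern "C" gives it
// an unmangled name, and every argument arrives as a pointer because the
// FORTRAN callers pass by reference.
extern "C" double row_sum(const double* a, const int* rows,
                          const int* cols, const int* row) {
    double total = 0.0;
    for (int j = 1; j <= *cols; ++j) {
        total += a[fortran_index(static_cast<std::size_t>(*row),
                                 static_cast<std::size_t>(j),
                                 static_cast<std::size_t>(*rows))];
    }
    return total;
}

// Example: a 2 x 3 array laid out the way FORTRAN would store it.
void demo_row_sum() {
    // Columns are contiguous: (1,1) (2,1) (1,2) (2,2) (1,3) (2,3)
    const double a[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
    const int rows = 2;
    const int cols = 3;
    const int first = 1;
    const int second = 2;
    assert(row_sum(a, &rows, &cols, &first) == 9.0);    // 1 + 3 + 5
    assert(row_sum(a, &rows, &cols, &second) == 12.0);  // 2 + 4 + 6
}
```

Treating the same six numbers as a row-major C++ array would give row sums of 6 and 15 instead, which is exactly the sort of silent discrepancy the baseline tests are there to catch.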
We are now at the stage where we can consider whether to leave the code as it is or to refactor it, as we have an environment in which we can make changes and quickly run regression tests to ensure the full software functionality is retained. We can also evaluate performance to make sure it’s at least the same, as FORTRAN is very efficient at numerical tasks; hopefully there is an improvement.
Step 4 – Extend
As we have now replicated the old application on its new target platform, the next step is to consider other factors that impact the performance and usage of the application. The new application is probably heading for a cloud or connected environment, with all the network and security implications associated with that context.
Using a commercial software testing platform will provide the ability to establish whether there are security vulnerabilities present in the code. One option is static analysis for compliance with coding standards, which examines, for example, syntax, semantics, variable estimation, and control and data flow to identify issues in the code.
Another option is using dynamic testing to provide actionable intelligence based on the “Common Weakness Enumeration” (CWE), a highly regarded, community-developed formal list that serves as a common language for describing software security weaknesses in architecture and code, and as a standard lexicon for tools detecting such weaknesses. Dynamic testing can highlight vulnerabilities, in particular hard errors such as the use of null pointers, buffer over- or under-runs, and stack overflows.
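As a small illustration of the kind of hard error dynamic testing can expose, consider a loop whose bounds were carried over unchanged from 1-based FORTRAN indexing. This is a hypothetical sketch, not output from any converter named above: the function names are invented, and the bounds-checked `std::vector::at` stands in for the runtime instrumentation a testing tool would provide.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Loop bounds copied from a 1-based source loop ("do i = 1, n").
// On a 0-based std::vector this skips element 0 and, when n equals the
// vector's size, reads one element past the end.
double sum_first_n_converted(const std::vector<double>& v, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 1; i <= n; ++i) {
        total += v.at(i);  // at() throws std::out_of_range on an overrun
    }
    return total;
}

// The same routine with bounds adjusted for 0-based storage.
double sum_first_n(const std::vector<double>& v, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v.at(i);
    }
    return total;
}

// Reports whether running the converted routine triggers an overrun.
bool overruns(const std::vector<double>& v, std::size_t n) {
    try {
        sum_first_n_converted(v, n);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

void demo_overrun_detection() {
    const std::vector<double> v = {1.0, 2.0, 3.0};
    assert(overruns(v, 3));            // caught only when the test runs n == size
    assert(!overruns(v, 2));           // no overrun here, yet the sum is still wrong
    assert(sum_first_n(v, 3) == 6.0);  // corrected bounds
}
```

The overrun only shows up when a test actually exercises the full length of the array, which is one reason coverage goals and dynamic testing work best together.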
In conclusion, it may not be as straightforward as it first seems to undertake a project to migrate a legacy application because the deployment environment has changed so much. However, if you prioritise setting up the correct software quality testing environment by following the steps outlined above, it can be achieved. No one should believe old software is best left alone just because it was written over 50 years ago.