Understanding Legacy Code Challenges
Developers spend more time trying to figure out what legacy code does than rewriting it.
According to IEEE research tracking 78 professional developers across 3,148 working hours, developers spend 58% of their time on program comprehension rather than on writing code. Even skilled developers face more risk from the business rules they miss than from the rules they know to rewrite. Knight Capital lost $440 million in 45 minutes because developers didn’t realize a reused system flag would activate dormant testing code from 2003. But living with legacy code is also untenable. Springer’s research on code complexity shows that complex legacy code requires 250-500% more maintenance time than simple code of the same size.
Take the example of an old C# .NET Forms application that is failing in production and needs to be rewritten. There are no specs, and what documentation does exist isn’t verifiably accurate.
The application hides business logic in:
- Stored procedures containing core business behavior
- Over-engineered services woven into the code, obscuring the essential behavior
- Switches buried in property table values that business rules depend on
- Data models so abstract they don’t describe the actual domain
Business analysts and architects spend days to weeks understanding basic workflows, longer than it would take to rewrite them. Without this understanding, any legacy rewrite risks missing critical business rules. Either way, the project becomes too expensive and risky to fund.
We built a system that traces code dependencies from form classes through layers of the application to database interactions, documenting what happens when a user clicks submit.
The technical implementation:
- Roslyn AST traversal from Forms to database calls
- Regex patterns to identify stored procedure dependencies
- ANTLR parsing of Transact-SQL logic
- Property table inclusion to capture configuration-driven behavior
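To make the regex step concrete, here is a minimal sketch in Python of how stored procedure references might be pulled out of C# source text. It is an illustration under stated assumptions, not the system's actual patterns: the two patterns, the function name `find_stored_procedures`, and the procedure names in the usage note are all hypothetical, and a real codebase would need patterns tuned to its own data-access conventions.

```python
import re

# Hypothetical patterns, chosen for illustration only.
# 1) A bare procedure name assigned to CommandText:
#       cmd.CommandText = "dbo.usp_SaveOrder";
# 2) An inline EXEC / EXECUTE statement at the start of a string literal:
#       var sql = "EXEC [dbo].[usp_AuditOrder] @OrderId";
COMMAND_TEXT = re.compile(r'CommandText\s*=\s*"(?P<name>[\w.\[\]]+)"')
EXEC_CALL = re.compile(r'"\s*EXEC(?:UTE)?\s+(?P<name>[\w.\[\]]+)', re.IGNORECASE)


def find_stored_procedures(csharp_source: str) -> list[str]:
    """Return the distinct stored procedure names referenced in the source,
    in order of first discovery, with square brackets stripped."""
    names: list[str] = []
    for pattern in (COMMAND_TEXT, EXEC_CALL):
        for match in pattern.finditer(csharp_source):
            name = match.group("name").replace("[", "").replace("]", "")
            if name not in names:
                names.append(name)
    return names
```

Run against a snippet containing `cmd.CommandText = "dbo.usp_SaveOrder";` and `"EXEC [dbo].[usp_AuditOrder] @OrderId"`, the function reports both procedures. Procedure names built by string concatenation or read from configuration would be missed by patterns like these, which is one reason to include property table data in the trace.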
We trace from business-critical entry points outward, setting depth limits to control scope. We then feed this context to an LLM, formatted as code symbol source, stored procedure code, table schema, and PlantUML diagrams. The LLM generates a business requirements document that becomes the blueprint for the legacy rewrite.
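The depth-limited trace can be sketched as a breadth-first walk over a dependency graph. The Python below is a simplified illustration, not the system itself: it assumes the graph has already been extracted (for example by the Roslyn and SQL parsing steps) into a plain dictionary, and the names `trace_dependencies`, `build_context`, and every symbol in the usage note are hypothetical. The section layout produced by `build_context` is likewise an assumed format.

```python
from collections import deque


def trace_dependencies(graph: dict[str, list[str]], entry_point: str,
                       max_depth: int) -> dict[str, int]:
    """Walk outward from an entry point, breadth first, and return each
    reachable symbol with its distance from the entry point.
    Symbols at max_depth are recorded but not expanded further."""
    depths = {entry_point: 0}
    queue = deque([entry_point])
    while queue:
        symbol = queue.popleft()
        if depths[symbol] >= max_depth:
            continue  # depth limit reached: keep the symbol, skip its callees
        for callee in graph.get(symbol, []):
            if callee not in depths:  # first visit is the shortest path in BFS
                depths[callee] = depths[symbol] + 1
                queue.append(callee)
    return depths


def build_context(depths: dict[str, int], sources: dict[str, str]) -> str:
    """Assemble a text context for an LLM: one section per traced symbol,
    ordered by depth and then by name (an assumed layout)."""
    sections = []
    for symbol, depth in sorted(depths.items(), key=lambda kv: (kv[1], kv[0])):
        body = sources.get(symbol, "(source not available)")
        sections.append(f"## {symbol} (depth {depth})\n{body}")
    return "\n\n".join(sections)
```

For example, given a graph where `OrderForm.btnSubmit_Click` calls `OrderService.Save`, which calls `OrderRepository.Insert`, which in turn calls `dbo.usp_SaveOrder`, a trace with `max_depth=2` stops at the repository and leaves the stored procedure out of scope. Raising the limit to 3 pulls it in. Tuning that number per entry point is one way to keep the context within what an LLM can usefully read.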
Time to trace: hours. Manual investigation: days to weeks.
In an undocumented legacy application, the code is your business rules. That includes stored procedures and data configurations, not just application code. LLMs excel at finding patterns across these scattered artifacts and summarizing them in language specific enough for humans to verify.
What if we stopped pretending AI replaces experts and instead used it to eliminate drudgery, so that experts can focus on the activities that create value?