Using Compilers Over ChatGPT for Legacy Code

Why We Use A Compiler, Not ChatGPT, to Parse Legacy Code

A year ago, the hype was that chains of agents could arrive at the truth of any complex analysis as long as you threw enough agent personae and compute resources at the problem.

Those of us working with agents learned the value of context engineering: providing correct and useful information to derive better analysis. What constitutes “correct and useful” is domain-specific and requires combining new and traditional solutions. We also learned the value of incorporating traditional tools into agentic workflows.

My coworkers and I first started applying LLMs to rewrite legacy software systems 18 months ago. We used linters, grep, regex, and diff to methodically identify and repair code quality issues in a large legacy codebase. We incorporated LLMs to suggest rewrites that were not simple replacements using information from our traditional pipeline as context.

I applied the same approach to legacy .NET application rewrites: use LLMs for what they do well, and use existing tools to address solved problems. For example, tracing business logic through a legacy .NET application:

  • The Roslyn AST gives guaranteed symbol resolution, e.g. proving that MethodA in ClassA invokes MethodB in ClassB.
  • Regex patterns find stored procedure references deterministically, i.e. recognizing the call syntax and extracting the procedure names.
  • ANTLR parses SQL with formal grammar rules, so we identify and retrieve the schema without risk of error.
  • Graph traversal follows actual execution paths, which we represent as a sequence.
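To make the regex step concrete, here is a minimal Python sketch of deterministic stored-procedure extraction. The pattern, the `usp_` naming convention, and the sample C# snippet are all illustrative assumptions, not the actual pipeline; a real codebase would need patterns tuned to its own data-access idioms.

```python
import re

# Hypothetical pattern: matches two common ADO.NET call sites,
#   new SqlCommand("usp_GetOrders", conn)
#   cmd.CommandText = "usp_UpdateOrderStatus"
# The usp_ prefix is an assumed naming convention for illustration.
PROC_PATTERN = re.compile(
    r'(?:SqlCommand\(\s*|CommandText\s*=\s*)"(?P<proc>usp_\w+)"'
)

def find_stored_procedures(source: str) -> list[str]:
    """Deterministically extract stored-procedure names from C# source text."""
    return [m.group("proc") for m in PROC_PATTERN.finditer(source)]

csharp = '''
var cmd = new SqlCommand("usp_GetOrders", conn);
cmd.CommandText = "usp_UpdateOrderStatus";
'''
print(find_stored_procedures(csharp))
# → ['usp_GetOrders', 'usp_UpdateOrderStatus']
```

Unlike asking a model "which stored procedures does this file call?", this either matches or it doesn't; there is no plausible-but-wrong answer.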

I could fire up Claude Code on the entire codebase and ask "what does this do?" Or I could give the agent a structured, correct representation of a vertical slice and use LLMs for what they're actually good at: pattern recognition and summarization.
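The "structured representation of a vertical slice" can be as simple as a linearized walk of the call graph the deterministic tools produced. A minimal Python sketch, with an entirely hypothetical call graph standing in for what Roslyn symbol resolution would emit:

```python
# Hypothetical call graph produced by the deterministic pipeline:
# each method maps to the methods it invokes (symbol-resolved upstream).
CALL_GRAPH = {
    "OrderController.Submit": ["OrderService.Place"],
    "OrderService.Place": ["OrderRepository.Save", "EmailService.Notify"],
    "OrderRepository.Save": [],
    "EmailService.Notify": [],
}

def execution_sequence(entry: str, graph: dict[str, list[str]]) -> list[str]:
    """Depth-first walk from an entry point, yielding one linear call sequence
    suitable for pasting into an LLM prompt as grounded context."""
    seen: set[str] = set()
    order: list[str] = []

    def visit(node: str) -> None:
        if node in seen:
            return
        seen.add(node)
        order.append(node)
        for callee in graph.get(node, []):
            visit(callee)

    visit(entry)
    return order

print(execution_sequence("OrderController.Submit", CALL_GRAPH))
# → ['OrderController.Submit', 'OrderService.Place',
#    'OrderRepository.Save', 'EmailService.Notify']
```

The agent then summarizes a sequence it cannot get wrong, instead of guessing the connections itself.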

The difference: deterministic results in minutes, versus hours of token processing with hallucinated connections.

Use LLMs for what they do well. Use existing tools to address solved problems. Innovation comes from applying proven solutions to the solved parts of the problem and focusing your resources and creativity on what's left.

Originally published on LinkedIn on Sep 3, 2025. Enhanced for this site with expanded insights and additional resources.