Assessing Program Translation Capabilities of LLM Enabled Chatbots
Luke Marshall
Committee: Zena M. Ariola
Honors Bachelor's Thesis (June 2025)
Keywords: LLM, AI, Program Translation, Languages

This thesis examines the capabilities of Large Language Model (LLM)-enabled chatbot applications for translating programming languages. With the software industry facing challenges including an aging COBOL infrastructure and widespread memory safety vulnerabilities resulting from the use of archaic programming languages, effective code migration strategies have become essential. We evaluate nine commercial chatbot applications from leading AI companies (Anthropic, Google, MetaAI, OpenAI, and xAI) on their ability to translate a complex OCaml project to Python, TypeScript, Rust, and C++. Our methodology employs a two-dimensional framework that examines prompting strategies (direct versus assisted) and translation approaches (functionalist versus linguistic), yielding four distinct types of translation. Through analysis of 72 direct translations and seven assisted translations, we assess both the functionality and the paradigmatic consistency of the translated code. Results demonstrate that assisted translations achieve a 75\% success rate for functionalist approaches across target languages, while direct translations show highly variable performance (0-56\% success rates) with consistent failures in Rust and C++. Assisted linguistic translations successfully demonstrated paradigm reorganization, while direct linguistic translations largely mirrored their functionalist counterparts. Our findings indicate that while LLM-enabled chatbots are not yet viable for fully automated code translation, owing to inconsistent success rates and systematic errors, they are effective tools for assisting human developers, particularly on complex projects that require paradigm shifts or careful preservation of functionality.