Python is one of the most popular programming languages, yet it’s generally not the first choice when speed is required.
While it can be optimized for better performance, Python is prized for qualities other than speed, such as readability, a manageable learning curve, an expansive ecosystem, and utility in both academia and business.
MIT computer scientists and their colleagues, however, believe they’ve found a way to have it all – the approachability of a high-level language with the speed of a low-level language. They’ve developed a Python compiler called Codon that turns Python code into native machine code without a runtime performance hit.
“Typical speedups over Python are on the order of 10-100x or more, on a single thread,” the Codon repo declares. “Codon’s performance is typically on par with (and sometimes better than) that of C/C++.”
There’s a hitch, of course, other than its delayed-open-source license. Codon implements most but not all of the Python language. Some Python modules have not been incorporated into Codon. And it omits features such as dynamic type manipulation and runtime reflection that make code more difficult to analyze and optimize. Leaving those out lets it rely on a statically typed compiler engine that – in conjunction with other innovations such as a more optimizable and flexible intermediate representation (IR) [PDF] – generates faster code.
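To make that trade-off concrete, here is a minimal sketch in ordinary CPython of the kind of dynamic behavior the paragraph describes, next to a function whose types can be worked out before the program runs. The `Point` class and `total` function are hypothetical illustrations, and exactly which of these constructs Codon rejects is an assumption rather than something stated in the article.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Dynamic type manipulation: attach an attribute the class never declared.
setattr(p, "z", 3)

# Runtime reflection: the object's fields are discovered only while running.
fields = sorted(vars(p))

# A single name rebound to values of different types over its lifetime.
value = 42
value = "forty-two"


# By contrast, a function like this has types that can be pinned down
# ahead of time (assuming it is only called with a list of ints).
def total(values):
    acc = 0
    for v in values:
        acc += v
    return acc


print(fields, value, total([1, 2, 3]))
```

The first half forces an interpreter to keep type information around at run time; the second half is the style of code a static compiler can most readily turn into tight machine code.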
Codon was originally developed as a framework for creating high-performance domain-specific languages (DSLs) in Python. DSLs are languages focused on a specific purpose, as opposed to a general-purpose programming language like Python or C. Examples of DSLs include CSS, SQL, and the ancient runes make understands.
Derived from Seq, a DSL for bioinformatics and genetics, Codon has grown into a language compiler that’s largely compatible with Python 3. As described in a paper [PDF] provided to The Register in advance of its planned March 16 release, “Codon: A Compiler for High-Performance Pythonic Applications and DSLs,” the toolchain “enables the development of DSLs that share Python’s syntax and semantics together with added domain-specific features and IR optimizations.”
The authors of the paper – Ariya Shajii (Exaloop), Gabriel Ramirez (MIT CSAIL), Haris Smajlović (University of Victoria, Canada), Jessica Ray (MIT CSAIL), Bonnie Berger (MIT CSAIL), Saman Amarasinghe (MIT CSAIL), and Ibrahim Numanagić (University of Victoria) – say that because Codon can output native machine code without any Python runtime overhead, they’re able to achieve C-like performance with Python scripts.
“Unlike other performance-oriented Python implementations (such as PyPy or Numba), Codon is built from the ground up as a standalone system that compiles ahead-of-time to a static executable and is not tied to an existing Python runtime (e.g., CPython or RPython) for execution,” the paper says. “As a result, Codon can achieve better performance and overcome runtime-specific issues such as the global interpreter lock.”
The authors discuss various Codon-based, high-performance DSLs designed for bioinformatics, data compression, and parallel programming that take advantage of Codon’s compiler infrastructure. But Codon can also substantially accelerate standard Python programs. Those that depend on external libraries such as Django or DocUtils are the exception: they must go through a CPython bridge, which limits performance to that of CPython. For pure-Python code, the gains can be large. On the Codon forum, for instance, one developer reports that a simple Codon-compiled Fibonacci script ran more than 70x faster than the CPython version.
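For readers who want a feel for that kind of test, the following is a minimal sketch of a Fibonacci micro-benchmark. It is not the forum poster’s script; the `fib` and `main` names and the input size of 25 are assumptions chosen for illustration, and no timing result is claimed here. It runs as-is under CPython, and whether every line compiles unmodified under Codon is likewise an assumption.

```python
import time


def fib(n):
    # Naive recursion: deliberately heavy on function calls, which is
    # where interpreter overhead tends to dominate.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def main():
    start = time.time()
    result = fib(25)
    elapsed = time.time() - start
    print("fib(25) =", result)
    print("elapsed seconds:", elapsed)


if __name__ == "__main__":
    main()
```

Running the same file under both toolchains and comparing the elapsed times gives a rough, single-threaded comparison; a micro-benchmark like this says little about programs dominated by I/O or third-party libraries.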
MIT Professor and CSAIL Principal Investigator Saman Amarasinghe told the MIT News service in a release provided to The Register that Python is often used by domain experts who are not programming experts and haven’t optimized their applications for performance.
“Instead of needing to rewrite the program using a C-implemented library like numpy or totally rewrite in a language like C, Codon can use the same Python implementation and give the same performance you’ll get by rewriting in C,” explained Amarasinghe. “Thus, I believe Codon is the easiest path forward for successful Python applications that have hit a limit due to lack of performance.”
Codon, we’re told, is already being used commercially in fields from quantitative finance and bioinformatics to deep learning. And in the months ahead, expect Codon’s developers to implement some missing Python features. ®
PS: Yes, there are, of course, other Python compilers out there besides Codon, if you’d like to try them out.