MIT-Created Compiler Speeds up Python Code (2024)

Python is a popular, beginner-friendly language. It’s also an interpreted language, which makes it easy to use but slower than a compiled language such as C or C++. At large scale, that becomes a problem, as Ariya Shajii, an MIT CSAIL Ph.D. graduate, and his colleague Ibrahim Numanagić noticed when working in genomics, which involves very large data sequences.

They realized that previous efforts to create faster versions of Python were predicated on a top-down approach: start with the traditional implementation, then try to make it faster with just-in-time compilation, which compiles the code as the program runs, Shajii said.

“The clear advantage of that is you can get a lot of backwards compatibility, but you’re really limited in the types of things you can do,” Shajii told The New Stack. “For example, Python has this thing called a global interpreter lock, which basically prevents you from doing parallel or multithreaded applications. And that’s a big problem if you really want high performance.”

Instead, Shajii and Numanagić took a bottom-up approach, implementing everything from the ground up, independent of the standard Python implementation, he said. That led them to an unusual approach: compiling Python with Codon, a tool they created with an MIT team.

“It gives you a lot more flexibility to do interesting things and generate optimized code, and things like that,” Shajii said. “That’s why we’re able to get such a better performance than some of these other compilation approaches, which maybe get 2 to 4 times, for example, but with Codon it’s usually like 10 to 100 times.”

The MIT team tested Codon on approximately ten commonly used genomics applications, all written in Python and compiled using Codon. The team achieved five to ten times speed-ups over the original hand-optimized implementations.

Codon’s Origin Story

Originally, Shajii and Numanagić planned to build a domain-specific language for genomics, since that was their background. What they found, however, is that people didn’t want to learn a new and specialized language — they like Python.

“That’s why we just made everything as Pythonic as possible,” he said. “Then over time, we just closed the gaps farther and farther to the point where we had sort of a general Python, sort of Python replacement pretty much.”

The team then refactored their tool into the Codon compiler by moving all of its genomics-specific libraries, data structures, and methods of dealing with sequences into an extension. This approach allows Codon to support other domain-specific languages, which are programming languages with higher abstraction for a specific class of problems, all wrapped in a comfortable Python-like environment.

“The whole system is extensible with plugins, so you can write a plugin that has new libraries, new compiler optimizations; you can even add new keywords to the language if you want it to, or new syntax,” Shajii said. “But from the user standpoint, they’re still writing very high-level Pythonic code.”

One of the first puzzles the team had to solve was how to feed the compiler Python code. The compiler’s first step is to perform “type checking,” a process in which it works out the data type (string, integer, floating-point number, and so on) of each variable and function. In regular Python, that information is resolved as the program runs, which is one of the reasons Python is slow. Codon does this type checking before the program runs, which allows the compiler to translate the code to native machine code and avoid the overhead of handling data types at runtime.
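As a rough illustration of what "statically analyzable" means here, the sketch below is ordinary Python in which every name keeps a single type that can be worked out before the program runs. The function name `gc_content` and the annotations are illustrative assumptions, not taken from Codon or the article, and the article does not say whether Codon requires annotations like these.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        return 0.0
    gc = 0                      # always an int
    for base in seq:            # base is always a one-character str
        if base == "G" or base == "C":
            gc += 1
    return gc / len(seq)        # always a float


if __name__ == "__main__":
    print(gc_content("GATTACA"))
```

Because `seq`, `gc`, `base`, and the return value each have one fixed type, a compiler that checks types ahead of time has nothing left to decide at runtime for this function.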

They then focused on optimizations in the compiler.

“If you’re working with the genomics plugin, for example, that will do its own set of optimizations that are specific to that computing domain, which involves working with genomic sequences and other biological data, for example. The result? An executable file that runs at the speed of C or C++, or even faster once domain-specific optimizations are applied,” MIT stated.

Shajii and the team published a paper detailing how Codon works.

Compiling Python Caveats

There are a few caveats with compiling Python, however. Codon does not support dynamically changing data types at runtime, for instance.
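To make the caveat concrete, here is a small sketch of the kind of dynamic typing that is legal in regular Python but hard to analyze ahead of time, next to a version that keeps one type throughout. Both functions are hypothetical examples; the article does not list exactly which constructs Codon rejects, so treat this only as an illustration of the general idea.

```python
def describe_dynamic(flag: bool):
    """The type of `result` depends on a value known only at runtime."""
    result = 0
    if flag:
        result = "many"         # the same name now holds a str instead of an int
    return result


def describe_static(flag: bool) -> str:
    """Every path produces the same type, so it can be resolved before running."""
    if flag:
        return "many"
    return "0"


if __name__ == "__main__":
    print(type(describe_dynamic(True)), type(describe_dynamic(False)))
    print(type(describe_static(True)), type(describe_static(False)))
```

The first function returns an `int` on one path and a `str` on the other, so its result type cannot be fixed in advance; the second always returns a `str`.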

“We said, okay, we’re targeting scientific applications, and it’s rare to do stuff like that, so let’s just like shift our focus to statically analyzable things,” Shajii explained. “So some of those dynamic features we don’t support.”

Some of these omitted features are on Codon’s roadmap and some aren’t. For instance, not every standard library module is supported yet, but the MIT team is working on it.

“It’s a huge, huge library, but we’ve tried to implement the main ones that we typically see used […] in the kinds of applications that we’re targeting,” he said.

There are also data type differences. For example, integers in Codon are 64-bit, while in Python they’re “arbitrarily long,” he said.
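The difference shows up once a value no longer fits in 64 bits. The sketch below runs in regular Python and models a signed 64-bit integer with a hypothetical helper, `wrap_int64`. It assumes two's-complement wraparound on overflow, which is a modeling choice for illustration; the article does not say how Codon behaves when a value overflows.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def wrap_int64(value: int) -> int:
    """Map a Python int to the value a signed 64-bit two's-complement integer would hold."""
    return (value + 2**63) % 2**64 - 2**63


if __name__ == "__main__":
    big = INT64_MAX + 1          # Python simply grows the integer
    print(big)
    print(wrap_int64(big))       # modeled 64-bit result under the wraparound assumption
```

Code that relies on very large integers, such as big factorials or cryptographic arithmetic, is where this difference is most likely to matter.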

Also, while Codon is designed to help projects scale up, don’t expect a seamless experience yet.

“Larger code bases, you’ll probably end up having some [of the] incompatibilities that I mentioned. So, you know, oftentimes we give you error messages: that you need to go and change this, or [we] don’t support this yet,” he said.

There are other ways to use Codon in larger Python applications, he said, noting that there is a decorator that lets developers compile one particular function, say a bottleneck, while everything else stays in Python.
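A minimal sketch of that per-function pattern follows. It assumes the decorator is exposed as `codon.jit` by an optional `codon` Python package; the article does not name the decorator, so treat that name as an assumption. If the package is not available, the sketch falls back to a no-op decorator so the example still runs as plain Python. The function `count_primes` is a hypothetical bottleneck chosen for illustration.

```python
try:
    import codon                 # assumption: optional package providing the decorator
    jit = codon.jit
except (ImportError, AttributeError):
    def jit(func):
        """Fallback: leave the function as ordinary Python."""
        return func


@jit
def count_primes(limit: int) -> int:
    """Count primes below limit by trial division (the 'bottleneck')."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        for d in range(2, n):
            if d * d > n:
                break
            if n % d == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count


def report(limit: int) -> str:
    """Stays in regular Python and simply calls the decorated function."""
    return "primes below " + str(limit) + ": " + str(count_primes(limit))


if __name__ == "__main__":
    print(report(100))
```

The decorated function is kept self-contained, with simple typed arguments and no calls back into other Python code, which is the usual shape for a hot loop that gets compiled separately from the rest of an application.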

“That’s to address this problem of an all-or-nothing approach,” he said. “Often, if you have some Python application, what people would typically do is they would write the really performance-critical pieces of that in C; or Cython, for example, is another tool that’s used for that. So we’re releasing something pretty soon that lets you do that same thing in Codon, so you never have to leave the Python environment, which, again, is sort of the underlying theme of all this.”

Codon’s Coming Soon: WebAssembly and More

Codon was released in December and is in version 0.15. It’s available for free use in academic or personal applications.

The team wants to incorporate several dynamic features and expand its Python library coverage. One planned feature, however, may appeal to frontend and web developers: the team plans to support compiling to WebAssembly.

“We use LLVM as a backend. LLVM is a very common sort of compiler infrastructure/framework that a lot of compilers use, and LLVM has support for WebAssembly,” he said. “So one of the things that we plan to add support for is WebAssembly for Codon, so [that] you can take a Python program and compile it to WebAssembly.”

Loraine Lawson is a veteran technology reporter who has covered technology issues from data integration to security for 25 years. Before joining The New Stack, she served as the editor of the banking technology site Bank Automation News.