A single photograph offers glimpses into the creator's world: their interests and feelings about a subject or space. But what about the creators behind the technologies that help make those images possible?
MIT Department of Electrical Engineering and Computer Science Associate Professor Jonathan Ragan-Kelley is one such person, having designed everything from tools for visual effects in movies to the Halide programming language, widely used in industry for photo editing and processing. As a researcher at the MIT-IBM Watson AI Lab and the Computer Science and Artificial Intelligence Laboratory, Ragan-Kelley specializes in high-performance, domain-specific programming languages and machine learning that enable 2D and 3D graphics, visual effects, and computational photography.
“The biggest drive for much of our research is the development of new programming languages that make it easier to write programs that run really efficiently on the increasingly complex hardware found in your computer today,” says Ragan-Kelley. “If we want to continue increasing the computational power that we can actually exploit for real applications, from graphics and visual computing to AI, we need to change the way we program.”
Finding a middle ground
Over the past two decades, chip designers and programming engineers have witnessed a slowdown in Moore's Law and a marked shift from general-purpose computing on CPUs to more varied and specialized computing and processing units, such as GPUs and accelerators. This transition comes with a trade-off: it gives up the ability to run general-purpose code somewhat slowly on CPUs in exchange for faster, more efficient hardware that requires code to be heavily tailored and mapped to it with custom programs and compilers. Newer hardware with improved programming can better support applications such as high-bandwidth cellular radio interfaces, decoding highly compressed video for streaming, and graphics and video processing in power-constrained cellphone cameras.
“Our job is largely about unlocking the power of the best hardware we can build to deliver the highest possible computational performance and efficiency for these types of applications in a way that traditional programming languages don't.”
To achieve this, Ragan-Kelley divides his work into two directions. First, he sacrifices generality to capture the structure of specific, important computational problems, and exploits that structure for better computational efficiency. This can be seen in the Halide image processing language, which he co-developed and which has helped transform the image editing industry in programs like Photoshop. Because Halide is specially designed to quickly handle dense, regular arrays of numbers (tensors), it also works well for neural network computations. The second approach focuses on automation, specifically on how compilers map programs to hardware. One such project with the MIT-IBM Watson AI Lab leverages Exo, a language developed in Ragan-Kelley's group.
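Halide's central idea is that the *algorithm* (what each pixel should be) is written once, while the *schedule* (loop order, tiling, fusion) is chosen separately for the target hardware. The following is a minimal conceptual sketch of that separation in plain Python; it is illustrative only and does not use Halide's actual C++ API:

```python
# Conceptual sketch of Halide's algorithm/schedule separation (not Halide's
# real API): one pixel-level definition, two interchangeable loop schedules.

def blur_x(img, x, y):
    # Algorithm: horizontal 3-tap box blur, clamped at the image borders.
    w = len(img[0])
    return (img[y][max(x - 1, 0)] + img[y][x] + img[y][min(x + 1, w - 1)]) // 3

def run_row_major(img):
    # Schedule 1: plain row-major loops over the output.
    h, w = len(img), len(img[0])
    return [[blur_x(img, x, y) for x in range(w)] for y in range(h)]

def run_tiled(img, tile=2):
    # Schedule 2: iterate over tiles (better cache locality on real
    # hardware); the algorithm definition is reused unchanged.
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            for y in range(ty, min(ty + tile, h)):
                for x in range(tx, min(tx + tile, w)):
                    out[y][x] = blur_x(img, x, y)
    return out

img = [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
assert run_row_major(img) == run_tiled(img)  # same result, different schedule
```

In Halide proper, the schedule is expressed with primitives like `tile`, `vectorize`, and `parallel` on the same algorithm definition, so performance engineers can explore schedules without risking changes to what is computed.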
Over the years, researchers have worked doggedly to automate coding with compilers, which can be a black box; however, there is still a great need for explicit control and tuning by performance engineers. Ragan-Kelley and his group are developing methods that bridge both techniques, balancing trade-offs to achieve effective, resource-efficient scheduling. At the heart of many high-performance programs, such as video game engines or cellphone camera processing, are state-of-the-art systems that are largely hand-optimized by human experts in detailed, low-level languages such as C, C++, and assembly. Here, engineers make specific decisions about how the program will run on the hardware.
Ragan-Kelley notes that programmers can opt for “very fine-grained, very unproductive, very unsafe low-level code,” which could introduce bugs, or “safer, more productive, higher-level programming interfaces” that lack the ability to make fine adjustments to how the compiler runs the program and typically deliver lower performance. That is why his team is trying to find a middle ground. “We're trying to figure out how to provide control for the key things that human performance engineers want to be able to control,” says Ragan-Kelley, “so we're trying to build a new class of languages that we call user-schedulable languages that provide safer, higher-level handles to control what the compiler does or control how the program is optimized.”
Unlocking hardware: high-level and hands-off approaches
Ragan-Kelley and his research group are addressing this through two lines of work. One applies machine learning and modern AI techniques to automatically generate optimized schedules, an interface to the compiler, to achieve better compiler performance. Another uses “exocompilation,” which he is working on with the lab. He describes this approach as a way to turn the compiler inside out, with a compiler skeleton that has controls for human guidance and customization. The team can then plug in its own custom schedulers, which can help target specialized hardware such as machine learning accelerators from IBM Research. Applications for this work run the gamut: computer vision, object recognition, speech synthesis, image synthesis, speech recognition, text generation (large language models), and more.
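The exocompilation idea can be sketched abstractly: the system supplies safe, checked rewrite primitives, and a human performance engineer (rather than a black-box compiler heuristic) decides which ones to apply. The miniature below is a hypothetical illustration in Python; the names and structure are invented for exposition and are not Exo's actual API:

```python
# Hypothetical miniature of user-guided scheduling: a rewrite primitive that
# transforms a loop while mechanically checking it preserves the iteration
# space. Illustrative only; not Exo's real interface.

def split_loop(n, factor):
    # Rewrite a flat iteration space [0, n) into (outer, inner) pairs.
    pairs = [(i // factor, i % factor) for i in range(n)]
    # Safety check: every original index is still visited exactly once.
    assert sorted(o * factor + i for o, i in pairs) == list(range(n))
    return pairs

def run(schedule, body, n, factor=4):
    # The *user* picks the schedule; the computation (body) is unchanged.
    if schedule == "flat":
        return [body(i) for i in range(n)]
    elif schedule == "split":
        return [body(o * factor + i) for o, i in split_loop(n, factor)]
    raise ValueError(f"unknown schedule: {schedule}")

def square(i):
    return i * i

assert run("flat", square, 8) == run("split", square, 8)
```

The point of the checked rewrite is that control stays with the engineer while the system guarantees each transformation is semantics-preserving, which is the safety property that hand-written C or assembly scheduling lacks.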
A large project of his with the lab takes this a step further, approaching the work through a systems lens. In work led by his advisee and lab intern William Brandon, in collaboration with lab research scientist Rameswar Panda, Ragan-Kelley's team is rethinking large language models (LLMs), finding ways to slightly change the computation and the models' programming architecture so that transformer-based AI models can run more efficiently on AI hardware without sacrificing accuracy. Their work, Ragan-Kelley says, deviates from standard ways of thinking in significant ways, with potentially large payoffs for reducing costs, improving capabilities, and/or shrinking LLMs so that they require less memory and can run on smaller computers.
It's this more forward thinking about computing and hardware efficiency that Ragan-Kelley excels at and sees value in, especially over the long term. “I think there are areas (of research) that need to be pursued but are well established, obvious, or conventional enough that a lot of people are either already pursuing them or will pursue them,” he says. “We try to find ideas that have outsize leverage to practically impact the world, and at the same time are things that wouldn't necessarily happen otherwise, or whose potential I think the rest of the community doesn't see.”
The course he now teaches, 6.106 (Software Performance Engineering), is an example of this. About 15 years ago, the shift from single to multiple processors in a device led many academic programs to start teaching parallelism. But, as Ragan-Kelley explains, MIT realized the importance of students understanding not only parallelism but also memory optimization and the use of specialized hardware to achieve the best possible performance.
“By changing the way we program, we can unlock the computational potential of new machines and make it possible for people to continue to rapidly develop new applications and new ideas that can exploit that increasingly complicated and challenging hardware.”