Artificial intelligence models based on neural networks, used in applications such as medical image processing and speech recognition, perform operations on enormously complex data structures that require a huge amount of computation to process. This is one of the reasons deep-learning models consume so much energy.
To improve the efficiency of AI models, MIT researchers created an automated system that enables developers of deep-learning algorithms to simultaneously take advantage of two types of data redundancy. This reduces the amount of computation, bandwidth, and memory storage needed for machine-learning operations.
Existing techniques for optimizing algorithms can be cumbersome and typically allow developers to capitalize on only sparsity or symmetry, two different types of redundancy that exist in deep-learning data structures.
By enabling a developer to build an algorithm from scratch that takes advantage of both redundancies at once, the MIT researchers' approach boosted the speed of computations by nearly 30 times in some experiments.
Because the system uses a user-friendly programming language, it could be used to optimize machine-learning algorithms for a wide range of applications. The system could also help scientists who are not experts in deep learning but want to improve the efficiency of the AI algorithms they use to process data. In addition, the system could have applications in scientific computing.
“For a long time, capturing these data redundancies has required a lot of implementation effort. Instead, a scientist can tell our system what they would like to compute in a more abstract way, without telling the system exactly how to compute it,” says Willow Ahrens, an MIT postdoc and co-author of a paper on the system, which will be presented at the International Symposium on Code Generation and Optimization.
She is joined on the paper by lead author Radha Patel '23, SM '24, and senior author Saman Amarasinghe, a professor in the Department of Electrical Engineering and Computer Science (EECS) and a principal researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL).
Cutting out computation
In machine learning, data are often represented and manipulated as multidimensional arrays known as tensors. A tensor is like a matrix, which is a rectangular array of values arranged on two axes, rows and columns. But unlike a two-dimensional matrix, a tensor can have many dimensions, or axes, which makes tensors more difficult to manipulate.
Deep-learning models perform operations on tensors using repeated matrix multiplication and addition; this process is how neural networks learn complex patterns in data. The sheer volume of calculations that must be performed on these multidimensional data structures requires an enormous amount of computation and energy.
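As a rough illustration, a single neural-network layer boils down to exactly this multiply-and-add pattern. The following minimal NumPy sketch uses made-up names and sizes, not anything from the paper:

```python
import numpy as np

# Illustrative sketch: one neural-network layer is essentially a
# matrix multiplication plus an addition. Sizes and values are made up.
rng = np.random.default_rng(0)

W = rng.standard_normal((4, 8))   # weight matrix (a 2-D tensor)
x = rng.standard_normal(8)        # input vector (a 1-D tensor)
b = np.zeros(4)                   # bias vector

y = W @ x + b                     # one layer's worth of multiply-and-add
print(y.shape)                    # (4,)
```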
But because of the way data are organized in tensors, engineers can often boost the speed of a neural network by cutting out redundant computations.
For example, if a tensor represents user review data from an e-commerce site, since not every user reviewed every product, most values in that tensor are likely zero. This type of data redundancy is called sparsity. A model can save time and computation by storing and operating only on non-zero values.
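A minimal sketch of that idea, assuming a tiny made-up ratings matrix: store only the coordinates and values of the nonzero entries, so an operation touches two entries instead of twenty.

```python
import numpy as np

# Dense form: a 5-user by 4-product ratings matrix that is mostly zeros.
dense = np.zeros((5, 4))
dense[0, 2] = 5.0   # user 0 rated product 2
dense[3, 1] = 3.0   # user 3 rated product 1

# Sparse (coordinate) form: keep only the nonzero coordinates and values.
coords = [(0, 2), (3, 1)]
values = [5.0, 3.0]

# Summing all ratings touches 2 stored entries instead of 20 cells.
assert dense.sum() == sum(values)
```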
In addition, sometimes a tensor is symmetric, which means the top half and bottom half of the data structure are identical. In that case, the model only needs to operate on one half, reducing the amount of computation. This type of data redundancy is called symmetry.
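Continuing the sketch, here is a hand-written symmetric matrix-vector product that reads only the upper half of the matrix and reuses each entry for its mirror image; it is illustrative, not code from the system:

```python
import numpy as np

# Multiply a symmetric matrix by a vector while reading only the
# diagonal and upper triangle, roughly halving the entries touched.
def symmetric_matvec(A, x):
    n = len(x)
    y = np.zeros(n)
    for i in range(n):
        y[i] += A[i, i] * x[i]           # diagonal entry, counted once
        for j in range(i + 1, n):        # upper triangle only
            y[i] += A[i, j] * x[j]       # use the entry as A[i, j] ...
            y[j] += A[i, j] * x[i]       # ... and reuse it as A[j, i]
    return y

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])               # symmetric: A == A.T
x = np.array([1.0, 1.0])
assert np.allclose(symmetric_matvec(A, x), A @ x)
```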
“But when you try to capture both optimizations, the situation becomes quite complex,” says Ahrens.
To simplify the process, she and her collaborators built a new compiler, which is a computer program that translates complex code into a simpler language that a machine can process. Their compiler, called SySTeC, can optimize computations by automatically taking advantage of both sparsity and symmetry in tensors.
They began building SySTeC by identifying three key optimizations the system can perform using symmetry.
First, if the algorithm's output tensor is symmetric, then it only needs to compute one half of it. Second, if the input tensor is symmetric, then the algorithm only needs to read one half of it. Finally, if intermediate results of tensor operations are symmetric, the algorithm can skip redundant computations.
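To make the first of these concrete, here is a small NumPy sketch of an output-symmetry optimization: the product of a matrix with its own transpose is always symmetric, so only the upper triangle is computed and then mirrored. This is an illustrative hand-written analogue, not SySTeC's generated code:

```python
import numpy as np

# C = A @ A.T is always symmetric, so compute only entries with j >= i
# and mirror them, skipping roughly half of the dot products.
def gram_upper(A):
    m = A.shape[0]
    C = np.zeros((m, m))
    for i in range(m):
        for j in range(i, m):        # upper triangle (and diagonal) only
            C[i, j] = A[i] @ A[j]    # dot product of rows i and j
            C[j, i] = C[i, j]        # mirror instead of recomputing
    return C

A = np.arange(6.0).reshape(3, 2)
assert np.allclose(gram_upper(A), A @ A.T)
```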
Simultaneous optimizations
To use SySTeC, a developer inputs their program and the system automatically optimizes their code for all three types of symmetry. Then the second phase of SySTeC performs additional transformations to store only non-zero data values, optimizing the program for sparsity.
In the end, SySTeC generates ready-to-use code.
“In this way, we get the benefits of both optimizations. And the interesting thing about symmetry is, as your tensor has more dimensions, you can get even more savings in computation,” says Ahrens.
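A final sketch shows both redundancies working together: a symmetric, mostly-zero matrix stored as just the nonzero entries of its upper triangle, with each stored entry doing double duty. This is a hand-written analogue of the kind of code such a compiler could emit, not output from SySTeC itself:

```python
import numpy as np

# Symmetric sparse matrix-vector product. The matrix is stored as the
# nonzero entries (i, j, value) of its upper triangle only (j >= i),
# so storage and work both shrink: sparsity skips the zeros, and
# symmetry lets each off-diagonal entry stand in for its mirror.
def sym_sparse_matvec(entries, x):
    y = np.zeros(len(x))
    for i, j, v in entries:
        y[i] += v * x[j]            # use the entry as A[i, j]
        if i != j:
            y[j] += v * x[i]        # reuse it as A[j, i]
    return y

# Upper-triangle nonzeros of a 4x4 symmetric, mostly-zero matrix.
entries = [(0, 0, 2.0), (0, 3, 1.0), (2, 2, 5.0)]
x = np.ones(4)
print(sym_sparse_matvec(entries, x))   # [3. 0. 5. 1.]
```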
The researchers demonstrated speedups of nearly a factor of 30 with code automatically generated by SySTeC.
Because the system is automated, it could be especially useful in situations where a scientist wants to process data using an algorithm they are writing from scratch.
In the future, the researchers want to integrate SySTeC into existing sparse tensor compiler systems to create a seamless interface for users. In addition, they would like to use it to optimize code for more complicated programs.
This work is funded, in part, by Intel, the National Science Foundation, the Defense Advanced Research Projects Agency, and the Department of Energy.