Parallelization techniques have become ubiquitous for accelerating inference and training of deep neural networks. Despite this, several operations are still performed sequentially. For example, the forward and backward passes are executed layer by layer, and the output of diffusion models is produced by applying a sequence of denoising steps. This sequential approach results in a computational cost proportional to the number of steps involved, presenting a potential bottleneck as the number of steps increases. In this work, we present DeepPCR, a novel algorithm that parallelizes typically sequential operations to accelerate inference and training of neural networks. DeepPCR is based on interpreting a sequence of $L$ steps as the solution of a specific system of equations, which we recover using the Parallel Cyclic Reduction algorithm. This reduces the complexity of computing the sequential operations from $\mathcal{O}(L)$ to $\mathcal{O}(\log_2 L)$, thus yielding a speedup for large $L$. To verify the lower theoretical complexity of the algorithm and to identify speedup regimes, we test the effectiveness of DeepPCR in parallelizing the forward and backward pass in multilayer perceptrons, reaching speedups of up to $30\times$ for the forward and $200\times$ for the backward pass. Furthermore, we showcase the flexibility of DeepPCR by parallelizing the training of ResNets with up to 1024 layers and the generation in diffusion models, enabling up to $7\times$ faster training and $11\times$ faster generation, respectively, compared to the sequential approach.
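To make the $\mathcal{O}(\log_2 L)$ structure concrete, the following is a minimal sketch, not the authors' implementation, of how a cyclic-reduction-style elimination collapses a sequential recurrence. It assumes the simplest possible setting, a scalar affine recurrence $z_l = a_l z_{l-1} + b_l$, whereas DeepPCR targets general multivariate (and nonlinear) steps; the function names `sequential_solve` and `pcr_solve` are illustrative only.

```python
# Illustrative sketch: solving the recurrence z_l = a_l * z_{l-1} + b_l
# (a lower-bidiagonal linear system in z_1, ..., z_L) by repeatedly
# substituting each equation into the one `stride` positions ahead.
import numpy as np

def sequential_solve(a, b, z0):
    """Baseline: L dependent steps, O(L) sequential work."""
    z, out = z0, []
    for al, bl in zip(a, b):
        z = al * z + bl
        out.append(z)
    return np.array(out)

def pcr_solve(a, b, z0):
    """Cyclic-reduction-style elimination: after ~log2(L) sweeps every
    z_l is expressed directly in terms of z_0 as z_l = a_l * z_0 + b_l."""
    a, b = a.astype(float), b.astype(float)
    L, stride = len(a), 1
    while stride < L:
        # Substitute the equation `stride` positions earlier into each
        # equation; padding with the identity map (a=1, b=0) handles the
        # boundary. All L updates are independent -> one parallel sweep.
        a_prev = np.concatenate([np.ones(stride), a[:-stride]])
        b_prev = np.concatenate([np.zeros(stride), b[:-stride]])
        b = a * b_prev + b
        a = a * a_prev
        stride *= 2
    return a * z0 + b

rng = np.random.default_rng(0)
a, b = rng.normal(size=1024), rng.normal(size=1024)
assert np.allclose(sequential_solve(a, b, 1.0), pcr_solve(a, b, 1.0))
```

Each sweep doubles the substitution distance, so roughly $\log_2 L$ sweeps of fully independent (hence parallelizable) updates replace the $L$ dependent sequential steps.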