Towards Exact Gradient-based Training on Analog In-memory Computing
Zhaoxian Wu*
Tayfun Gokmen*
Malte J. Rasch
Tianyi Chen*
* External authors
NeurIPS 2024
Abstract
Analog in-memory accelerators present a promising solution for energy-efficient training and inference of large vision or language models. While inference on analog accelerators has been studied recently, the training perspective remains under-explored. Recent studies have shown that the vanilla analog stochastic gradient descent (Analog SGD) algorithm converges inexactly and thus performs poorly when applied to model training on non-ideal devices. To tackle this issue, various analog-friendly gradient-based algorithms have been proposed, such as Tiki-Taka and its variants. Even though Tiki-Taka exhibits superior empirical performance compared to Analog SGD, it is a heuristic algorithm that lacks theoretical underpinnings. This paper puts forth a theoretical foundation for gradient-based training on analog devices. We begin by characterizing the non-convergence issue of Analog SGD, which is caused by an asymptotic error arising from asymmetric updates and gradient noise. We then provide a convergence analysis of Tiki-Taka, showing that it converges exactly to a critical point and hence eliminates the asymptotic error. Simulations verify the correctness of the analyses.
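To make the contrast between the two algorithms concrete, the toy simulation below compares Analog SGD with a simplified Tiki-Taka scheme on a scalar quadratic objective. This is an illustrative sketch, not the paper's experimental setup: the soft-bounds asymmetric device model, all parameter values, and the transfer schedule are assumptions chosen for illustration. The key qualitative behavior it reproduces is that asymmetric updates combined with gradient noise bias Analog SGD away from the minimizer, while accumulating gradients on an auxiliary tile and periodically transferring them to the main weights shrinks that asymptotic error.

```python
import numpy as np

def analog_update(w, dw, tau=1.0):
    # Simplified asymmetric "soft bounds" device model (an assumption here):
    # the realized increment shrinks as the weight approaches the saturation
    # bound in the pulse direction, so +dw and -dw have unequal effect.
    return w + dw * (1.0 - np.sign(dw) * w / tau)

rng = np.random.default_rng(0)
w_star, lr, noise = 0.6, 0.1, 0.2     # minimizer of f(w) = 0.5*(w - w_star)^2
steps, avg_from = 6000, 4000          # average the tail to estimate the bias

# --- Plain Analog SGD: noisy gradients land on the asymmetric device directly.
w, w_trace = 0.0, []
for _ in range(steps):
    g = (w - w_star) + noise * rng.standard_normal()   # noisy gradient
    w = analog_update(w, -lr * g)
    w_trace.append(w)

# --- Simplified Tiki-Taka: gradients accumulate on an auxiliary tile P,
# which is periodically transferred onto the main tile W (both via the
# same asymmetric device model). The forward pass reads W.
W, P, W_trace = 0.0, 0.0, []
for t in range(steps):
    g = (W - w_star) + noise * rng.standard_normal()
    P = analog_update(P, -lr * g)
    if (t + 1) % 5 == 0:                               # transfer schedule
        W = analog_update(W, lr * P)
    W_trace.append(W)

sgd_err = abs(np.mean(w_trace[avg_from:]) - w_star)
tt_err = abs(np.mean(W_trace[avg_from:]) - w_star)
print(f"Analog SGD asymptotic error: {sgd_err:.4f}")
print(f"Tiki-Taka asymptotic error:  {tt_err:.4f}")
```

Running this sketch, the time-averaged tail error of Analog SGD stays bounded away from zero (the noise-times-asymmetry bias), while the Tiki-Taka variant lands substantially closer to the minimizer, matching the qualitative separation the analysis establishes.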