New Approach for Training Physical (as Opposed to Computer-Based) Artificial Neural Networks


Traditional AI systems relying on deep artificial neural networks that run inside computers require vast amounts of computational resources for their training, leading to concerns about their sustainability. One promising avenue to address this issue is the development of physical artificial neural networks: systems that mimic the structure of biological neural networks more closely than their digital counterparts, by letting information flow through physical processes rather than chaining numerical calculations across neurons simulated in a computer. For example, in a subclass of physical neural networks called "optical neural networks", light waves are emitted and combined to carry out various computations. But these physical systems face unique challenges, especially when it comes to training them. A recent study just published in Nature (Xue et al., Nature 632:280–286, 2024) presents a truly groundbreaking solution that leverages physics to tackle these challenges. This is a step towards a possible future where AI systems run on physical substrates, becoming much more manageable, scalable, and, above all, orders of magnitude cheaper to train.

"Physical" as opposed to conventional "digital" or "computer-based" artificial neural networks

Regular ("computer-based") artificial neural networks run on traditional digital computers that process millions of relatively simple operations per second and connect them through large numbers of artificial neurons connected in large networks. These networks are composed of layers of artificial neurons connected by weighted links, often also with biases acting on the neurons. All these weights and biases are nothing more than numbers stored in memory units; and training these networks involves adjusting these weights to minimize errors in the network's predictions. This process happens mostly through gradient descent algorithms, for example using methods such as backpropagation where errors are propagated backward through the network to update the weights effectively.

Physical neural networks, on the other hand, are built from materials and systems that inherently perform the same kinds of operations as digital neural networks, but in a physical medium. For example, and pertinent to the article discussed here, optical neural networks compute with light waves. Another example is nanoelectronic networks, which combine electrical currents at the core of their operation. Note that these are analog systems, unlike the digital ones that underlie regular neural networks running inside computers.

Physical neural networks offer a more energy-efficient alternative to digital networks because they can perform computations in parallel and with less energy. However, they also come with a major drawback: they can't naturally perform backpropagation, because their design only allows data to flow in the forward direction, that is, from inputs to outputs. This means that, typically, one of these networks would need to be trained as a computerized (simulated) version, and only then could it be used in prediction mode.

The challenge of training physical neural networks

The unidirectional nature of physical neural networks means that backpropagation, the basic algorithm used for training conventional digital neural networks, is simply unusable (because applying it requires running the neural network backwards). To address this major limitation, AI researchers have tried different approaches. One common solution is to create a mathematical model of the physical system, perform backpropagation on that model using a computer, and then transfer the obtained parameters to the physical system, which from that point on runs only in prediction mode.
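As a loose illustration of this "simulate, then transfer" workflow, the sketch below trains a differentiable software model of a device and then uploads the learned parameters to a stub that can only run forward passes. The device class, its API names, and the toy task are all hypothetical, used here purely for illustration.

```python
import numpy as np

class PhysicalDeviceStub:
    """Stand-in for real optical hardware: forward passes only (hypothetical API)."""
    def __init__(self):
        self.W = None
    def upload_weights(self, W):          # made-up method name, for illustration
        self.W = W.copy()
    def predict(self, x):                 # information flows only from inputs to outputs
        return np.tanh(x @ self.W)

# 1) Build a differentiable software model of the device and train it with backprop
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4, 1))                 # simulated device parameters
X = rng.normal(size=(64, 4))
y = np.tanh(X.sum(axis=1, keepdims=True))              # toy target

for _ in range(500):
    p = np.tanh(X @ W)                                  # simulated forward pass
    grad = X.T @ (2 * (p - y) / len(X) * (1 - p**2))    # backprop through the model
    W -= 0.5 * grad                                     # gradient-descent update

# 2) Transfer the learned parameters to the (stub) physical system, prediction only
device = PhysicalDeviceStub()
device.upload_weights(W)
print("hardware prediction error:", np.mean((device.predict(X) - y) ** 2))
```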

Another approach is to develop entirely new learning algorithms that don't require backpropagation at all. However, these methods often fall short when it comes to matching the accuracy of traditional neural networks, especially in complex tasks.

The new approach, explained below, breaks with this paradigm by exploiting key physical properties of light.

Fully Forward Mode Learning

The paper by Xue et al. from Tsinghua University in China introduces a novel approach specifically tailored to optical neural networks, a kind of physical neural network in which light waves transmit information and computations are carried out by mixing light beams. The new training method exploits a principle from electromagnetism known as "Lorentz reciprocity", which ensures that light can travel through an optical system in both directions with equal ease. This symmetry allowed the researchers to simulate the effects of backpropagation without actually reversing the direction of data flow: instead, they adjust the network's parameters using only forward propagation.

Yes, hardcore light physics put to work for computation, and here more specifically for AI!

This method, called Fully Forward Mode (FFM) learning, thus enables an optical neural network to be trained just as effectively as a regular digital network, but without the need for standard backpropagation at all:

Fully forward mode training for optical neural networks – Nature
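For readers who think in code, the sketch below is only a loose software analogy of learning from forward passes alone: it estimates each gradient component from extra forward evaluations (via simple finite differences) instead of ever running the model backwards. The actual FFM method relies on optical reciprocity and physical measurements, not on this numerical trick; everything in the snippet is illustrative.

```python
import numpy as np

def forward(W, x):
    """Forward-only 'device': we can evaluate it, but never run it backwards."""
    return np.tanh(x @ W)

def loss(W, X, y):
    return np.mean((forward(W, X) - y) ** 2)

rng = np.random.default_rng(2)
W = rng.normal(scale=0.1, size=(4, 1))
X = rng.normal(size=(64, 4))
y = np.tanh(X.sum(axis=1, keepdims=True))   # toy target

eps, lr = 1e-4, 0.5
for _ in range(300):
    base = loss(W, X, y)
    grad = np.zeros_like(W)
    # Estimate each gradient component from an extra *forward* evaluation
    for idx in np.ndindex(W.shape):
        W[idx] += eps
        grad[idx] = (loss(W, X, y) - base) / eps
        W[idx] -= eps
    W -= lr * grad

print("final training loss:", loss(W, X, y))
```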

As expected for a paper in Nature, besides a theoretical demonstration that the method works, the article also showcases its power in various experimental setups, including some integrated into silicon chips. With these setups, the authors showed that FFM-trained optical networks can handle a range of machine learning tasks, from generic classification problems to more specialized and quite complex ones.

Implications and future challenges

The success of FFM learning in optical neural networks opens up new possibilities for AI. By embracing the physical laws that govern, in this case, optics, these systems could evolve into AI models that are far more energy-efficient and far more scalable than regular digital networks. Higher speeds could revolutionize applications that require real-time processing, greater scalability could enable deeper and wider AI models, and the energy efficiency speaks for itself, especially in a world concerned about the ecological footprint of AI model training.

Not everything is rosy, though, and some significant challenges remain before physical neural networks can be fully integrated into actual technology and products. First, these physical systems will need to be embedded into regular computer systems, and it turns out that such hybrid systems combining optical and electronic components are still barely experimental, requiring substantial development to optimize the conversion between analog (optical) and digital (electronic) signals. Additionally, more research is needed to determine exactly how much more scalable and adaptable these physical systems are for practical applications.

Fascination never ends in the world of AI and computing

Still, after reading the paper deeply enough to write this outreach blog post and share this out-of-the-ordinary work with my readers, I remain astonished at what human creativity can produce. Every day we see new AI models, new mathematics, new hardware… and now even radically novel ways to enhance how AI systems work by building neural networks on physical supports.

In particular, I can't help but link this FFM method and the whole idea of physical artificial neural networks to this video by the YouTube channel Veritasium about a future where computers might be… analog!

References and other posts you may like

The paper presented here:

Fully forward mode training for optical neural networks – Nature

A comment on it at Nature:

Physics solves a training problem for artificial neural networks

After Beating Physics at Modeling Atoms and Molecules, Machine Learning Is Now Collaborating with…

A New Method to Detect "Confabulations" Hallucinated by Large Language Models

A family of specialized supercomputers that simulates molecular mechanics like no other

Google's AI Companies Strike Again: AlphaFold 3 Now Spans Even More of Structural Biology

New DeepMind Work Unveils Supreme Prompt Seeds for Language Models

www.lucianoabriata.com I write about everything that lies in my broad sphere of interests: nature, Science, technology, programming, etc. Subscribe to get my new stories by email. To consult about small jobs check my services page here. You can contact me here. You can tip me here.

Tags: Artificial Intelligence Deep Learning Machine Learning Science Technology
