"Sparks of Chemical Intuition"-and Gross Limitations!-in AlphaFold 3

Author:Murphy | View: 25082 | Time: 2025-03-22 21:26:04

Index

– Introduction
What AlphaFold 3 is and how it Stands out From Previous Versions
Beyond proteins
Community hacking effort
– Problems With Proteins, Including Some That AlphaFold 2 Didn't Suffer From
What AlphaFold 3 (Doesn't) Knows About Lipid Physical Chemistry
Proteins in membranes
Protein-nucleic acid complexes
Ions and metal sites in proteins
Metal ions in proteins
– Discussion and Conclusions
For Structural Biologists
Implications for AI and Data Science Enthusiasts

Introduction

Just 3 weeks ago, AlphaFold 3 came to life as the latest tool from DeepMind (and Isomorphic Labs, another company linked to Google) applying AI to the problem of understanding biology at the atomic level. As I covered extensively in a dedicated article, AlphaFold 3 made a big step forward in the world of protein structure prediction because, unlike its predecessors, it can model not just proteins and their complexes with each other but also their complexes with ions, small molecules, and nucleic acids:

Google's AI Companies Strike Again: AlphaFold 3 Now Spans Even More of Structural Biology

With tools of this kind (and AlphaFold 3 is neither the first nor the only one, as I reported here), biology is close to a new revolution. Yes, by "close" I mean that AlphaFold 3 might not be there yet, although it definitely marks a very promising path. If we could someday have a truly accurate and reliable version of AlphaFold 3, then the worlds of biology, biotechnology, and pharma would change forever. The same would hold if any of the alternatives to AlphaFold 3 cracked the problem:

AlphaFold and Other AI Tools for Molecular Structure Go Beyond Proteins

Now, you may think that the title of this article sounds a bit like clickbait. I stole the idea from Microsoft's preprint, whose title suggested that GPT-4 displayed "sparks of artificial general intelligence":

Provocatively, Microsoft Researchers Say They Found "Sparks of Artificial Intelligence" in GPT-4

And the point is exactly that: my title anticipates several interesting observations about DeepMind's latest model for biomolecular structure prediction, which seems to have "learned" some basic rules of chemistry. But, yep, as my title also conveys, AlphaFold 3 botches even some quite simple concepts, including some that AlphaFold 2 handled correctly! Read on to learn about all this, as distilled from large numbers of tests carried out by scientists using AlphaFold 3 at AlphaFoldServer.com, currently the only way to use the program since no code or executables have been released (a serious issue in itself that you can read about here).

What AlphaFold 3 is and how it Stands out From Previous Versions

Inside cells, proteins perform various functions, including catalyzing reactions, replicating and transcribing DNA, responding to stimuli, providing the cell with a structure, and a myriad more. All of these functions depend on the proteins' three-dimensional structures, and on the 3D structures they form with other components. Likewise, in biotechnology and the clinic proteins are essential as enzymes, adjuvants, cages, and more. Knowing proteins deeply enough to manipulate their functions, whether to target disease and develop new products or simply for the fun of understanding life, requires that we know their 3D structures at the atomic level.

Traditionally, determining the structures of proteins was very laborious, which is why AlphaFold 2 revolutionized the field by allowing quite accurate prediction of protein structures together with reliable confidence scores. Although this did not really replace experiments, it did greatly reduce the need for experimental structure determination, and it even aided in carrying out such experiments. You can learn more about AlphaFold 2 in some of my previous articles:

Guide to my blog articles on AlphaFold

Beyond proteins

Proteins often function in complexes, interacting with other proteins, nucleic acids (DNA/RNA), ions, and various small molecules that act as substrates, products, regulators, signalers, etc. Understanding these complexes is crucial for applied and fundamental biology because, for example, drug discovery (that is, the process of finding or designing new drugs) is based on exploiting the interactions that take place between different molecules.

Unlike AlphaFold 2, which handled individual proteins and complexes made up only of proteins, AlphaFold 3 models biomolecular interactions within a unified framework that includes not only proteins but also nucleic acids, ions, and small molecules, allowing scientists to model more complete assemblies that better resemble how biomolecules work together. Besides understanding multiple kinds of molecules beyond proteins, AlphaFold 3 incorporates several architectural changes that result in improved accuracy, higher speed, and, supposedly, fewer errors, problems and hallucinations. AlphaFold 3 highlights Google's deep interest in AI for biological research, which promises breakthroughs in disease understanding and treatment by, in particular, providing radically new ways to find and design clinical compounds such as antibiotics, cancer effectors, etc.

Now, a big limitation to all these extremely important applications promised by AlphaFold 3 is that we don't know very well how well it works. Even less so considering the many problems associated with the peer review of the paper presenting AlphaFold 3 (see here by the end of this article). Getting familiar with the model, particularly with its capabilities and limitations, has turned out much harder than with AlphaFold 2 because AlphaFold 3 is not as open: scientists can only use it through a server with quite strict limits on the number of runs and on which parameters can be tweaked.

A big limitation to all the applications promised by AlphaFold 3 is that we don't know very well how well it works. [That's why the ongoing community effort to test it is so important.]

Community hacking effort

Yet, within what's possible, the community is trying AlphaFold 3 a lot (through AlphaFoldServer.com, which is the only possible way to use this model) in very imaginative ways, posting results to X's / Twitter's AlphaFold 3 community. I have been following these posts quite closely, and came up with a long list of interesting reports, some positive, some negative. Here I will show you some recent successes, many of them displaying "sparks of chemical intuition", together with some important failures. Together, these observations underscore that AlphaFold 3 might be powerful as a tool to generate hypotheses and accelerate work, but is still (very!) far from being a reliable replacement for experiments.

Before we delve into the richer part of this post, if you want to know more about the "battle" between academics and Google for an open-source version of AlphaFold 3 that can enable deeper benchmarking than that presented in the paper, check out this article. As it notes, a fully open-source version of AlphaFold 3 would allow researchers to better understand how the model works and what its limitations and capabilities are, and maybe eventually to correct some problems and biases and perhaps even expand its capabilities. As I describe here, many scientists are already trying to do this with the AlphaFold 3 server, hacking it within what's possible to run tests of all kinds.

First, Problems With Proteins, Including Some That AlphaFold 2 Didn't Suffer From

DeepMind sells AlphaFold 3 mainly as a tool that understands structural biology beyond proteins, but its paper also reports that it improves over AlphaFold 2 at protein modeling, especially for protein-protein complexes.

However, serious problems have been reported in protein-only modeling on at least two main fronts. First, just 1 day after the model was released, Lucas Farnung (Assistant Professor at Harvard) reported that AlphaFold 3 folds a lot of unstructured regions as helices. Many other users reported the same problem later on. Fortunately, though, and as Farnung himself acknowledges, these helices are largely flagged by AlphaFold 3 with low confidence (low pLDDT), so the chance of drawing incorrect conclusions from them should be small.

Interestingly, the problem of overproducing helical structures seems to affect other protein AI models that use backbone diffusion such as Chroma and RoseTTAFold-AA. Note that AlphaFold 2 doesn't use any diffusion elements, and it doesn't suffer from this problem.
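Since these spurious helices are reported to come with low pLDDT, a simple filter on the per-residue scores can help spot them before interpreting a model. The snippet below is only a minimal sketch: the function name, the cutoff of 70 (a commonly used threshold for "confident" pLDDT values, taken here as an assumption) and the example scores are all made up for illustration, and extracting the scores from the server's output files is left out.

```python
from typing import List, Tuple

# Assumed cutoff: residues with pLDDT below 70 are treated as low confidence.
LOW_CONFIDENCE_CUTOFF = 70.0


def low_confidence_segments(
    plddt: List[float],
    cutoff: float = LOW_CONFIDENCE_CUTOFF,
    min_length: int = 1,
) -> List[Tuple[int, int]]:
    """Return (start, end) residue index ranges (0-based, inclusive) whose
    pLDDT stays below `cutoff` for at least `min_length` consecutive residues."""
    segments = []
    start = None  # index where the current low-confidence run began
    for i, score in enumerate(plddt):
        if score < cutoff:
            if start is None:
                start = i
        elif start is not None:
            # The run just ended at residue i - 1; keep it if long enough.
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(plddt) - start >= min_length:
        segments.append((start, len(plddt) - 1))
    return segments


# Made-up per-residue scores, for illustration only.
example_plddt = [92.1, 88.4, 45.0, 38.2, 41.7, 90.3, 65.0, 95.2]
flagged = low_confidence_segments(example_plddt)
print(flagged)
```

Any helix that falls inside one of the returned ranges deserves skepticism; raising `min_length` ignores isolated low-scoring residues.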

The second is a rather "pathological" behavior that already appeared in AlphaFold 2, was then corrected in AlphaFold-Multimer, and has now come back with AlphaFold 3: sometimes (apparently for rather large proteins, although this isn't very clear yet) multiple proteins that are supposed to form a complex are folded into the same space, thus effectively clashing with each other. Edward Marcotte, Professor at the University of Texas at Austin, shows a quite clear example on X/Twitter:

As he notes, AlphaFold 3 is pretty confident in this prediction even though the structure is physically impossible. He further explains that "it's especially hard to tell when AF3 is using previously known structures as templates or not [because] the functionality of the server is limited and it's not possible to opt out of using templates." This complicates running tests that could reveal why this happens, as one could do with AlphaFold 2 because it was open source.
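Because the confidence scores don't flag this kind of overlap, it can be worth checking a predicted complex for clashes directly. The following is a rough sketch under stated assumptions: the 2.0 Å cutoff, the function name and the toy coordinates are hypothetical choices for illustration, and parsing the coordinates out of the model file is not shown.

```python
from math import dist
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]

# Assumed cutoff in angstroms: atoms from different chains closer than this
# are counted as clashing. The exact value is a judgment call.
CLASH_CUTOFF = 2.0


def count_interchain_clashes(
    chain_a: Sequence[Coord],
    chain_b: Sequence[Coord],
    cutoff: float = CLASH_CUTOFF,
) -> int:
    """Count atom pairs (one atom from each chain) closer than `cutoff`.

    Brute-force comparison of all pairs, fine for a quick check but
    quadratic in the number of atoms.
    """
    return sum(1 for a in chain_a for b in chain_b if dist(a, b) < cutoff)


# Toy coordinates, for illustration only.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
overlapping_chain = [(0.5, 0.0, 0.0), (4.0, 0.3, 0.0), (20.0, 0.0, 0.0)]
separate_chain = [(0.0, 10.0, 0.0), (3.8, 10.0, 0.0)]

clashes_overlapping = count_interchain_clashes(chain_a, overlapping_chain)
clashes_separate = count_interchain_clashes(chain_a, separate_chain)
print(clashes_overlapping, clashes_separate)
```

A well-packed interface should give few or no such pairs, while chains folded into the same space would give many.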

Finally in this section, I bring you Jan Kosinski's insights as he tried the AlphaFold 3 webserver with the task of modeling a large protein complex that was released by the PDB likely after the date cutoff used by DeepMind to train the model. Summarizing (check his post for details), he found that although AlphaFold 3 built some good local arrangements, the model is overall too far from the known (experimentally determined) structure. On the good side, though, he found that AlphaFold 3's quality scores are robust, confirming that only local arrangements and some interchain contacts are reliable but not many other structural features.

In summary, this first section teaches us that AlphaFold 3 hasn't yet mastered even pure-protein modeling, and that on some issues it is actually at a disadvantage compared with AlphaFold 2. Fortunately, though, all reports consistently show that the quality estimates are reliable.

What AlphaFold 3 (Doesn't) Knows About Lipid Physical Chemistry

The limited set of molecules available for modeling at the AlphaFoldServer includes a few classes of lipids. To a first approximation you can think of lipids as oil, wanting to separate from water by clustering together into various structures such as micelles, bilayers, etc. Some proteins, the so-called "membrane proteins" for example, can in turn insert themselves into such structures.

Lipids are much less represented in the training dataset than proteins, and they largely show up bound to proteins rather than forming lipid-rich structures. Hence, it is very important to know what AlphaFold 3 predicts when multiple lipid molecules are put together. Tests by researchers have uncovered some intriguing results.

Francisco Enguita, an Assistant Professor and Molecular Artist at the University of Lisbon, used 250 lipids as input and observed that AlphaFold 3 could generate micelle-like structures with surprisingly (and unrealistically!) rigid lipid conformations. This suggests that while AlphaFold 3 can predict some aspects of lipid assembly, it may oversimplify the fluid and dynamic nature of lipid molecules in biological membranes.

Additionally, Enguita noted a particular trend: as the number of lipid molecules increased, the model favored micelle formation but often neglected the crucial protein-lipid interactions. This is a significant limitation since many biological membranes depend on the intricate interplay between lipids and proteins to function properly. He also mentioned the existence of a limit in the number of lipids per entry, capped at 50 ligand molecules, though this can be circumvented by adding more lipids across multiple entries.
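The workaround for the per-entry cap is simple arithmetic, sketched below. The function name is hypothetical and the snippet only computes how many copies go in each entry; it does not reproduce the server's actual input format, nor does it account for any overall size limit the server may impose.

```python
from typing import List

# Cap on ligand copies per entry, as reported in the text above.
MAX_COPIES_PER_ENTRY = 50


def split_into_entries(total_copies: int, cap: int = MAX_COPIES_PER_ENTRY) -> List[int]:
    """Split `total_copies` of a lipid into per-entry counts no larger than `cap`."""
    if total_copies < 0 or cap <= 0:
        raise ValueError("total_copies must be >= 0 and cap must be > 0")
    full_entries, remainder = divmod(total_copies, cap)
    counts = [cap] * full_entries
    if remainder:
        counts.append(remainder)
    return counts


print(split_into_entries(250))  # [50, 50, 50, 50, 50]
print(split_into_entries(120))  # [50, 50, 20]
```

So the 250 lipids mentioned earlier would need five entries of 50 copies each.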

Karel Krápník Berka, a theoretical physical chemist and bioinformatician, further explored the limits of AlphaFold 3 with free fatty acids. He tested the maximum numbers of three types of fatty acids: oleic acid (OLA), palmitic acid (PLM), and myristic acid (MYR). His findings were quite revealing: while OLA and MYR tended to form bilayer-like structures, PLM was more prone to creating micelle-like formations. This variability underscores AlphaFold 3's inconsistent handling of different lipid types and its partial understanding of lipid physical chemistry.

In summary, the experiences shared by these scientists highlight both strengths and critical limitations of AlphaFold 3 in lipid modeling. On the positive side, the model does capture the "oil-like" behavior of lipids, as it always tries to hide their hydrophobic ("water-fearing") tails while leaving their polar ("water-loving") heads exposed to the solvent. On the negative side, AlphaFold 3 models the lipids in an overly rigid fashion, not capturing their natural flexibility and hence failing to model the fluidity of biological membranes. There also seems to be some bias towards forming micelle-like structures as the number of lipids increases, potentially overlooking other important arrangements such as bilayers.

Unfortunately, the restriction on the number of lipids per entry poses a challenge for modeling large and more complex lipid assemblies.

It is important to keep in mind that lipids appear in the Protein Data Bank mainly as molecules bound in small numbers to proteins, and rarely as large lipid-only assemblies. Thus, asking AlphaFold 3 to properly build such assemblies was asking too much, and the tests were simply meant to probe where its limits lie.

Proteins in membranes

Some posts further explored a very interesting aspect: how well can AlphaFold 3 predict how a protein embeds in a membrane? See, many proteins display surface patches with oil-like properties that make them insert into membranes.

While there are dedicated methods and programs to compute if and how a protein will interact with a membrane, it is of course of interest to know whether AlphaFold 3 "knows" something about this or not. To put the model to the test, then, researchers tried running proteins together with many copies of lipid molecules… and they got some interesting results!

Here, Karel Krápník Berka fed AlphaFold 3 the sequence of a protein that has an alpha helix with affinity for lipids, plus several oleic acid molecules. He found that AlphaFold 3 indeed made the protein interact with the oleic acid molecules mainly through its specialized helix, which makes good sense:

Prof. Jan Kosinski reported similar results for another protein:

Dr. Mustafa Tekpinar further shows that AlphaFold 3 could properly assemble a membrane complex made up of 5 copies of a protein:

Prof. Francisco Enguita showed, like others above, that AlphaFold 3 could correctly model a protein inside a membrane from the protein's sequence and a list of lipids; however, he also reported that an increase in the number of lipids favours the formation of a micelle-like structure that pushes the protein out, something that one wouldn't in principle expect.

Protein-nucleic acid complexes

Nucleic acids, mainly DNA and RNA, are central to biology and are, again, the subject of interactions with proteins. Understanding protein-nucleic acid interactions is important for fundamental biology as well as for modern biotechnology, especially regarding the CRISPR-Cas9 technology for gene editing.

Indeed, Matteo Ciciani, PhD student at the University of Trento in Italy, found that AlphaFold 3 could accurately predict the complex between a Cas9 protein, a specific RNA, and the target DNA, even in cases of intermediate similarity to proteins available in the Protein Data Bank at the time of AlphaFold 3's training:

Among other examples of special interest, Francisco Enguita reported that models of a special kind of RNA called microRNA together with proteins that process these nucleic acids could provide useful insights for his research:

Extremely interesting were also Jan Kosinski's tests showing that AlphaFold 3 could not only produce reasonable models of protein-DNA binding but also of mutants that compromised specific recognition of the DNA by the protein!

Here again, Jan Kosinski shows how AlphaFold 3 correctly places DNA repair enzymes (MutS in the example below), whose job is to identify base mismatches in DNA, right at the mismatch, where repair has to take place:

Other tests by the same author confirmed the expectation that, just like with proteins, AlphaFold 3 is not perfect at predicting the structures of complex RNA molecules:

Ions and metal sites in proteins

Ions are charged atoms or small groups of atoms, such as those released when salts dissolve in water. For example, when table salt (NaCl) is dissolved in water, you don't see it anymore because it has dissociated into individual Na+ and Cl- ions that form a homogeneous phase with the water. Normally, protein and nucleic acid structure prediction programs don't model ions, even though they are in many cases very important. For example, nucleic acids often bind magnesium, and many proteins contain metal ions such as calcium, which are important to stabilize the structure, or zinc, copper, iron and others, which are important for the proteins to carry out their functions.

First, a rather bad but insightful example. Wojciech Kopec, lecturer at Queen Mary University of London, observed that when you provide AlphaFold 3 with many ions, it starts putting them in locations known to be occupied by water molecules in X-ray structures.

As Frank Noé, from the Free University of Berlin and Microsoft's team exploring AI applications to chemistry and structural biology, comments, "It's a structure prediction method, not a physics emulator." He then clarifies: "It definitely learns some physics/chemistry, but it's not been trained to predict physical conditions and molecule setups that are very dissimilar from those of the PDB entries." Like others who responded, I think it's still worth trying these tests to see what happens and find out how much the model knows, even if this might be very little!

In a different kind of test, Prof. Timothy Duignan from the University of Queensland in Australia found that AlphaFold 3 can produce a roughly correct separation of certain ions in solution while failing for others, although those are cases where even more complex simulation methods don't excel either.

To close this section, it's important to highlight that AlphaFold 3 is neither intended nor expected to properly model systems of many ions, but just small numbers of ions bound to specific sites of proteins and nucleic acids.

Metal ions in proteins

Many proteins have highly specific binding sites for metal ions that can have a structural or functional role. Predicting such sites is of utmost relevance, and it involves finding which atoms of the protein will interact with the metal ion and how, in terms of geometry, bonding strength, etc.

Simon Duerr, PhD student at EPFL in Switzerland, compared AlphaFold 3, as a tool to jointly predict the structures of proteins and their bound metal ions, against other tools specifically designed for that purpose. He found AlphaFold 3 to be as good as the specialized tools, and even sensitive to mutations that alter metal ion binding. Moreover, when given more metal ions than those expected to bind tightly and natively to the protein, AlphaFold 3 fits the excess ions in sites that make sense regarding the underlying physical chemistry. Plus, residues binding the metal ions adopt conformations consistent with binding.

Duerr further notes, however, that AlphaFold 3 is still biased toward the well-ordered proteins that it has seen during training, and that it cannot predict conformational changes induced by metal binding. Likewise, it does not generalize to new metal binding motifs it hasn't seen during training.

Duerr just published a preprint describing his tests in detail here:

Predicting metal-protein interactions using cofolding methods: Status quo

Discussion and Conclusions

It is clear that despite its many advancements, AlphaFold 3 still faces several challenges; in particular, it doesn't seem to be far better at modeling protein-only systems than AlphaFold 2 – actually making mistakes that its older brother doesn't.

On the other hand, it is outstanding how AlphaFold 3 could learn some chemistry/physics that it was not expected to learn, some "sparks of chemical intuition" as I call them. These are evidenced in the model's ability to generate accurate local arrangements and rearrangements in protein-nucleic acid complexes, for example when a base mismatch is introduced in a pair of nucleic acid molecules; in its unexpected ability to model lipid-lipid and protein-lipid interactions when multiple lipids that can self-assemble are present; and even in its ability to predict some (even if very coarse) aspects of ion behavior.

As you have seen here in numerous examples, the community-driven efforts to test and push the boundaries of AlphaFold 3 are invaluable. These experiments, though constrained by the limited access to the model, provide critical insights into its capabilities and limitations.

A very interesting benchmark that was just published as a preprint put AlphaFold 3 to work on datasets focused on binding energy, revealing that it learns unique information and synergizes with force field, profile-based, and other deep learning methods to predict the effects of mutations on protein-protein interactions. The authors conclude that AlphaFold 3 captures more global effects of mutations by learning a smoother energy landscape but lacks the detailed atomic modeling that force field methods provide, and they propose that integrating these approaches could be a promising future direction:

AlphaFold3, a secret sauce for predicting mutational effects on protein-protein interactions

Related to this need for further testing of AlphaFold 3's power and limitations, the ongoing push for an open-source version of AlphaFold 3 is crucial, as it would allow researchers to better understand and refine the model, as they did with previous versions.

For Structural Biologists

Next, a message for end users who want to benefit from AlphaFold 3 in their investigations in experimental or computational structural biology. AlphaFold 3 is obviously a powerful tool that will likely advance biomolecular structure prediction significantly, perhaps not as much as AlphaFold 2 did (that one was truly revolutionary), but certainly unlocking several interesting routes. Yet it remains one more tool rather than the ultimate solution, requiring continued experimental validation and refinement. As always with tools of this kind, it is essential to maintain a balanced perspective, recognizing both its groundbreaking potential and its current limitations.

Implications for AI and Data Science Enthusiasts

There's also a message in all this for those interested in AI and data science but not in structural biology. Different from its predecessor, yet like it, AlphaFold 3 is a quite complex piece of AI that integrates a large number of very modern concepts, certainly at the forefront of the technology. Thus, everyone working in AI might find bits of valuable information by trying to understand how the model works, what it has learned (a special point here, as I intended to present!), what its power and limitations are, etc.

In particular, AlphaFold 3's ability to predict complex biomolecular interactions showcases the potential of deep learning to solve intricate problems, in this case applied to science but in principle in any other domain where data is abundant. Despite its impressive capabilities, however, AlphaFold 3's limitations highlight the importance of thinking further about how to improve AI models.

Another point to learn from is the ongoing push for an open-source version of AlphaFold 3, which emphasizes the role of community collaboration and transparency in advancing research, in this case AI research. And this is not limited to AI models for natural science; we have for example seen how Meta's LLAMA large language models and OpenAI's Whisper models are all open.

More globally, for AI enthusiasts, AlphaFold 3 is one more example of the truly transformative potential that AI can have on our world.

www.lucianoabriata.com I write about everything that lies in my broad sphere of interests: nature, science, technology, programming, etc.
