DeepMind, the London artificial intelligence laboratory that Google bought in 2014, has already created programs that consistently win at chess, shogi and Go, the most complex board games. But the company's ultimate goal is not games but solving pressing scientific problems. Its AlphaFold algorithm, presented at the beginning of December in Cancun, Mexico, has won a global competition in predicting the three-dimensional structure of proteins.
Proteins are the molecular machines of living beings. Each is a long chain of units called amino acids, like beads strung on a wire, which folds spontaneously into a complex and precise shape. The final structure of each protein determines its function. Antibodies, for example, are like hooks that attach to microbes. Hemoglobin has a pocket that traps oxygen molecules. Collagen is like a braided cable.
Predicting the structure of any protein from its amino acid sequence is considered one of the holy grails of biology. It is no minor task: there are 20 amino acids, molecules with slightly different chemical properties, which are linked by bonds of different lengths and angles. It would take longer than the age of the universe to fold a protein through all its possible configurations before hitting on the correct three-dimensional structure by chance.
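A rough calculation shows why trying every configuration is hopeless. The sketch below is only a back-of-the-envelope estimate: the chain length, the number of shapes each amino acid can adopt and the sampling speed are all illustrative assumptions, not figures from DeepMind or from the competition.

```python
# Back-of-the-envelope estimate of an exhaustive search over protein shapes.
# Every constant below is an illustrative assumption, not a measured value.

CHAIN_LENGTH = 100            # assumed number of amino acids in a small protein
STATES_PER_RESIDUE = 3        # assumed number of shapes each amino acid can adopt
SAMPLES_PER_SECOND = 1e13     # assumed (very generous) rate of trying configurations
UNIVERSE_AGE_YEARS = 1.38e10  # approximate age of the universe in years
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(chain_length, states_per_residue, samples_per_second):
    """Years needed to try every configuration once at a fixed sampling rate."""
    configurations = states_per_residue ** chain_length
    seconds = configurations / samples_per_second
    return seconds / SECONDS_PER_YEAR


years = exhaustive_search_years(CHAIN_LENGTH, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
ratio = years / UNIVERSE_AGE_YEARS

print(f"Search time: {years:.2e} years")
print(f"Multiples of the age of the universe: {ratio:.2e}")
```

Under these assumptions the search would take on the order of 10^27 years, vastly longer than the age of the universe. Changing the assumed constants shifts the figure, but not the conclusion.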
Despite the esoteric nature of this scientific field, it is difficult to exaggerate its importance. Certain diseases, such as Alzheimer’s, Parkinson’s, diabetes or cystic fibrosis, are due to the accumulation of misfolded proteins, something that could be avoided by knowing the relationship between their sequence and their structure. Almost all drugs act by binding to a specific region of a protein, a process that again depends on the precise structure of the target. In addition, with the ability to predict exactly how a chain of amino acids will fold, scientists could design artificial proteins, for example to degrade plastics or polluting compounds in the body or the environment.
In a statement, the DeepMind team has called this achievement its “first significant milestone” in the application of artificial intelligence to scientific progress. “The problem of protein folding is not solved yet,” warns Paul Bates, an expert in this field at the Francis Crick Institute in the United Kingdom, who attended the presentation of AlphaFold in Cancun. The DeepMind program gets it right more often and more accurately than the others, but it does not solve all the structures. This is because artificial intelligence learns from a database of known proteins, and therefore struggles when it encounters completely new structures.
The contest that AlphaFold has won, called Critical Assessment of Structure Prediction (CASP), is held every two years. In it, each team receives new amino acid sequences at intervals of several days. These correspond to proteins that have been well studied in the laboratory but whose structures have not been made public. Contestants must get as close as possible, with their prediction models, to the actual shape of the molecule.
The Google team, which was entering the contest for the first time, came first among 98 contestants, predicting most accurately the structure of 25 of the 43 proteins, according to The Guardian. For each amino acid sequence there is usually one correct fold, which corresponds to the configuration of greatest biochemical stability. In a laboratory, the actual shape of biomolecules can be observed using techniques such as nuclear magnetic resonance or X-ray crystallography, a method similar to the one that allowed Rosalind Franklin to see for the first time the structure of the DNA double helix.
Artificial intelligence is a major advance over these complex and costly techniques, although it is not yet able to replace them completely. DeepMind trained a neural network linking the shape and amino acid sequence of thousands of known proteins. Armed with that knowledge, the AlphaFold program predicts the distance and angle between each pair of amino acids in the chain, and then makes small adjustments to the entire structure to find the most stable configuration.
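The idea of making small adjustments until a structure agrees with predicted distances can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not DeepMind's method: the target distances are taken from a made-up six-point shape rather than from a neural network, angles are ignored, and a simple squared-error score stands in for biochemical stability. The function names `distance_loss` and `refine` are hypothetical.

```python
import math
import random


def distance_loss(coords, target):
    """Sum of squared differences between current and target pairwise distances."""
    n = len(coords)
    return sum(
        (math.dist(coords[i], coords[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def refine(coords, target, step=0.01, iterations=2000):
    """Plain gradient descent: nudge every point slightly to reduce the loss."""
    coords = [list(p) for p in coords]
    n, dims = len(coords), len(coords[0])
    for _ in range(iterations):
        grads = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                d = math.dist(coords[i], coords[j])
                if d == 0.0:
                    continue  # direction undefined for coincident points
                # Gradient of (d - t)^2 with respect to point i is
                # 2 * (d - t) * (p_i - p_j) / d, and the opposite for point j.
                scale = 2.0 * (d - target[i][j]) / d
                for k in range(dims):
                    g = scale * (coords[i][k] - coords[j][k])
                    grads[i][k] += g
                    grads[j][k] -= g
        for i in range(n):
            for k in range(dims):
                coords[i][k] -= step * grads[i][k]
    return coords


# Made-up "true" shape, used only to generate the target distances.
true_shape = [(math.cos(i), math.sin(i), 0.5 * i) for i in range(6)]
target = [[math.dist(p, q) for q in true_shape] for p in true_shape]

# Start from a perturbed copy of the shape (fixed seed for repeatability).
rng = random.Random(0)
start = [[x + rng.uniform(-0.5, 0.5) for x in p] for p in true_shape]

initial_loss = distance_loss(start, target)
refined = refine(start, target)
final_loss = distance_loss(refined, target)

print(f"Loss before refinement: {initial_loss:.4f}")
print(f"Loss after refinement:  {final_loss:.6f}")
```

The refined points match the target distances more closely than the starting ones, although distances alone fix a shape only up to rotation, shift and mirror image. A real system has to cope with far longer chains, noisy predictions and physical constraints that this toy ignores.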
The most immediate medical applications of this technology will be seen in the design of drugs, including anti-cancer drugs, says Bates. “We still have not got models with enough precision for this,” he adds. In the somewhat more distant future, it could become possible to modify proteins associated with degenerative diseases such as Alzheimer’s. “You can start thinking about those more difficult problems. This gives a starting point.”
The DeepMind team concludes that “there is a lot of work to be done before we can have a quantifiable impact in the treatment of diseases, environmental management and other applications,” but adds that “the potential is enormous.” According to Bates, the ideal algorithm would test every link of the protein chain without the need for an external reference, but that requires a deep knowledge of physics that is not yet attainable. Despite the fact that the DeepMind team is new to a well-worn discipline, “it has done better than anybody,” says the British scientist.