Tech Tomorrow Podcast
Transcript: Could AI and data science help us find a cure for Alzheimer’s?

DAVID ELLIMAN
Hello and welcome to Tech Tomorrow. I'm David Elliman, Chief of Software Engineering at Zühlke. Each episode we tackle a big question to help you make sense of the fast-changing world of emerging tech. Today I'm joined by Alejo Nevado-Holgado, Associate Professor of Psychiatry at the University of Oxford.
Alejo heads up a multidisciplinary lab that uses AI and machine learning to better understand Alzheimer's and other neurodegenerative diseases. He's here to help me answer the question: ‘Can AI and data science help us find a cure for Alzheimer's?’
PROFESSOR ALEJO NEVADO-HOLGADO
We are interested in using this new technology, applying it to all the aspects of Alzheimer’s disease where it could be helpful. Not all aspects of Alzheimer’s disease, because there are many. I would say more on the aspects of finding new medications and targets that could bring new therapies to patients.
DAVID ELLIMAN
So, you are tackling multiple goals in your lab, from identifying disease mechanisms to discovering new drug targets. Is there an overarching mission that ties all these efforts together?
PROFESSOR ALEJO NEVADO-HOLGADO
The overarching mission is more a methodological one. So, the mission is, okay, we are experts on artificial intelligence and also on the laboratory side, on iPSC cells, on pluripotent stem cells.
So, where are the places in Alzheimer's disease where we can help the most with these two things? That's where we start our planning of what to do next, so to speak.
DAVID ELLIMAN
Induced pluripotent stem cells are made by reprogramming ordinary adult cells, which can be taken from almost anywhere in the body, back to an embryonic-like state in which they can renew themselves. They’ve changed how we research regenerative medicine, and show great promise for treating degenerative brain conditions.
PROFESSOR ALEJO NEVADO-HOLGADO
There are mainly two reasons why Alzheimer's disease is so difficult. The first one is common to all brain disorders: it is in the brain, and the brain is both very well protected morphologically by the bones, so you cannot get a biopsy very easily, and very well protected chemically or microscopically, so to speak, by the blood-brain barrier.
The blood-brain barrier is a quite sophisticated structure around all blood vessels, and it makes it very difficult both for medications to go into the brain and for biosignals that would allow us to know what's going on to get out of the brain.
And that makes fixing the disease very difficult. The other problem, which is a bit more particular to Alzheimer’s and other neurodegenerations, is that once the disease becomes evident, once symptoms manifest, the disease has been developing already for a very long time. So, the first molecular triggers of the disease start like 10 years before the symptoms appear.
By the time you examine your patients to see what is wrong with them, it is already very difficult to find the first events that started the disease, because they happened 10 years earlier in the lives of the people that you know have Alzheimer’s.
DAVID ELLIMAN
There seems to be some debate with some of the findings and some of the drugs to treat some of the amyloid plaques that build up in the brain.
Some people seem to think that's the thing you address, because that's the cause, and other people seem to think it's the symptom. It seems to be incredibly complicated.
PROFESSOR ALEJO NEVADO-HOLGADO
Yes, exactly. The thing is that there are two very clear anomalies that you see in the brain of Alzheimer's diseased people. If you look under the microscope, one is what is called amyloid plaques, and the other one is tau fibrils.
In summary, they are two different types of proteins, amyloid beta and tau, that aggregate into molecular structures that are very difficult to dissolve and that are hydrophobic, and they form clumps.
So, when you look under the microscope, you see that all over the brain, and especially in the areas where the brain starts to show symptoms of degeneration. So, it looks like a very good candidate, but all that is telling you is a correlation: that people that have the disease show these clumps of proteins in the brain and people that haven't got the disease don't show them. But correlation doesn't mean causation. So, in principle, you cannot say whether those clumps that you see in the brain are causing the disease, or whether something else is causing the disease and, further down the line, as a consequence of the disease, you see those clumps. So, in principle people didn't know. That's why people are a bit sceptical about the first medications that are starting to show a little bit of positive effect, which act by tackling the amyloid aggregates.
DAVID ELLIMAN
You mentioned that by the time people begin to show symptoms and get tested, neurodegeneration has often been going on for years. Since early prevention is important, what are you exploring in that area?
PROFESSOR ALEJO NEVADO-HOLGADO
So, another very active area of research that we are also working on is trying to detect from blood samples whether people are developing the disease or not, and trying to detect it like 10 years earlier than what we can detect it now.
Because theoretically, and studies in the last decade have shown this, despite the blood-brain barrier you can actually detect, from the concentration of particular proteins in blood, whether a person is developing those clumps of amyloid beta and tau in the brain. That opens the door to finding people that are at the very, very early stages of developing the disease, so that you can investigate better what is happening then.
And later and more importantly, once we have a very good cure, as we partly might have now with lecanemab and so on, once we have a good cure, being able to give that medication or therapy to the patients early before the brain starts to deteriorate.
DAVID ELLIMAN
One of the things that I think a lot of people are fascinated with is how people who use AI, statistical models, et cetera, when you're working almost in the digital world, how that complements the traditional neuroscience practitioner process.
So how do AI and the traditional models of work in drug discovery, how do they work together now?
PROFESSOR ALEJO NEVADO-HOLGADO
In the area of genomics, the human genome has millions of different tiny mutations. So, every person might have a different letter in each of those 1 million different positions in the genome.
And there are many other, more complex mutations. Because these mutations and their associations with each other are so complex, in the past people focused only on maybe the 10% of most common mutations, the tiny mutations or locations in the genome that change in the majority of people.
And they would study them one SNP at a time, so to speak. So that will allow you only to find, imagine that you are analysing only 10,000 possible SNPs, that will allow you to find only among 10,000 possible associations between your DNA and the disease.
And if you find any one of those, and usually per disease you only find 20 or 30, that might give you a lot of information about the biology of the disease, what pathways are getting us to produce the disease. But now with AI and more sophisticated methods, you can go way beyond the most common SNPs, the most common mutations.
Rather than working only on the 10,000, you might be able to work on the 1 million mutations and you might be able to work not only on each individual mutation, whether each individual mutation might be associated with the disease, but whether pairs or trios or bigger groups of mutations might be associated with the disease.
So that opens the number of possible associations that you can find. It increases it by several orders of magnitude potentially, and that's one of the places, for instance, where people are quite excited to see if AI models can help.
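To give a rough sense of the jump Alejo describes, the short Python sketch below counts how many candidate associations exist when SNPs are tested one at a time versus in pairs or trios. It uses the round figures from the conversation (10,000 and 1 million SNPs); the function and variable names are illustrative assumptions, and this is only a counting exercise, not the lab's actual analysis pipeline.

```python
from math import comb, log10


def association_counts(n_snps: int, max_group_size: int) -> dict[int, int]:
    """Number of distinct SNP groups of each size from 1 up to max_group_size."""
    return {k: comb(n_snps, k) for k in range(1, max_group_size + 1)}


# Classic approach: about 10,000 common SNPs, tested one at a time.
classic = association_counts(10_000, 1)

# Expanded approach: about 1 million SNPs, tested singly, in pairs and in trios.
expanded = association_counts(1_000_000, 3)

print(f"classic, single SNPs: {classic[1]:,} candidate associations")
for size, count in expanded.items():
    extra_orders = log10(count / classic[1])
    print(f"expanded, groups of {size}: {count:,} "
          f"(about {extra_orders:.0f} orders of magnitude more)")
```

Even pairs alone give several hundred billion candidate associations, which is why exhaustive one-at-a-time testing does not scale and more sophisticated methods become attractive.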
DAVID ELLIMAN
Alejo uses many types of data in his work, including data from the UK Biobank and research from a specialist wet lab. So why is it important or helpful to look at different types of data sets to get an accurate picture? Because complex problems don't live in one dimension. Multimodal data lets you see the same problem from different angles, and that's where the real signal emerges in the intersections.
It's like the old parable of the blind men and the elephant. Each dataset gives you part of the shape, but you need all of them to see the whole animal. However, in the world of neurodegenerative brain disorders, rich data can be hard to come by. So Alejo has to work in a much leaner way.
PROFESSOR ALEJO NEVADO-HOLGADO
Rather than thinking of what will be the ideal data that we would like to have with which to investigate the disease, we rather ask the opposite question, like what is the data that there is available around? Because there is so, so little data, it is so difficult to access the brain that there aren't too many options.
And then once we have in front of us on the table the different types of data that are available, we think, okay, how should we use this data to try to understand the disease? The situation has become better because now there are big resources like UK Biobank, which have data from half a million people in the UK and they have all sorts of data like protein concentration in blood, brain imaging, medical history, many different variables.
And then there are other types of data, but this type of data, usually the richer it is, the more difficult it is to access it because it contains delicate information that should remain private in safe servers and so on. The other type of data that is safer to move around is also becoming more readily available.
Like, for instance, results of drug screens where they test hundreds of thousands of drugs to see what they do in pluripotent stem cell models of the disease and so on. So those are the two main types of data sets that we are combining just now.
DAVID ELLIMAN
So you have to decide which data to use, normalise it, and then choose the right machine learning models.
I think some people think there's a magical digital twin that can simulate everything, but in reality it's much more complicated than that, I assume.
PROFESSOR ALEJO NEVADO-HOLGADO
Yeah, exactly. So, once you have the data, before you start, you are thinking about how you could use this data to do something useful for Alzheimer's patients, because in particular we are specialists in AI.
The next thing we think about is, okay, where AI is going to bring an advantage in comparison to other more traditional statistical methods that many other labs are using. And usually the answer is that AI will bring an advantage in data sets where the signal-to-noise is good. The data is not very noisy, but the signal is very complex, so to speak.
So for instance, an ideal data set is human text. When we write, we could say that the only noise that there is in the text could be some spelling mistakes. But apart from that, the signal, the information encoded in our text, is very rich, but it is encoded in a very complex way, so that's why traditional techniques were terrible at understanding human text and at using it.
The reason current large language models like ChatGPT or Claude are suddenly so much better than prior techniques is that this particular signal, human text, has those properties very saliently: a very strong signal and low noise. In the case of Alzheimer's, one place where we are investigating the data, and where we believe that might be the case as well, is genetic data, because current techniques can measure the genetic sequence very, very accurately, and we believe the information is encoded in a very deep way, in a very complex way.
Another place where the same might be happening is on the structure of molecules of drugs. Because all the information that, so to speak, the universe has when you take a drug and you add it to an in vitro culture of cells that are modelling Alzheimer's disease, for instance, all the information that the universe has to tell you whether those cells are going to heal or not is just the structure of the molecule that you are adding there.
So the hope is that with sufficiently sophisticated analysis methods like AI, we could decode that relationship of what a molecule needs to have to heal a particular in vitro cell system.
DAVID ELLIMAN
I think it's fascinating. I was reading one of your papers describing your pathway-informed neural networks. I come from a computing background, not a science or medical background.
But it looked to me like a fantastic way of optimising, of improving, as you say, the signal-to-noise ratio, so you find the pathway where you get signal coming through. It also struck me that one of the big challenges with machine learning is explainability and repeatability of decision-making: oftentimes we can't repeat or prove why something was decided or why something happened. And this pathway dependency might give you not just the answers that you seek, but also a lineage straight away, like a provenance of the decision-making.
PROFESSOR ALEJO NEVADO-HOLGADO
Yeah, so that's trying to explain, understand why neural networks are taking this or that decision.
It is just now one of the limitations of AI, so to speak. So AI seems to work very well in these situations where there is a strong signal and low noise. But once it is working very well, you don't know very well why it is working so well. We have a little bit the same problem that in neuroscience, we know that the human brain can do all these very amazing things, but we don't know very well why.
And the frustrating thing is that in neuroscience the problem is that you cannot access the brain. You can only maybe measure a little bit with patch clamp, the activity of a few neurons in a human. But the frustrating thing is that in neural networks, we can measure every individual neuron of the artificial neural network.
And still we don't know what they are doing. We have a couple of projects where we are trying to progress a bit on understanding why neural networks applied in this case to genomics and to molecules are making this or that decision. But we are just now starting in this research.
DAVID ELLIMAN
The challenge is that we cannot access the brain to validate the results, and we see similar things in the world of software engineering, the kind of black box within a black box concept. And it's actually more common than people think, and legacy systems are a perfect analogy.
We regularly work with mainframe applications where the original developers left decades ago.
There's no documentation. You can't open the box without risking a production outage. You have to infer what the system does from the inputs and the outputs. Much like neuroscientists infer brain function from behaviour and scans.
We build models of what we think is happening inside. We test our hypotheses and iterate, but we're never entirely certain until we find our way in. So leaders have to make decisions about the conclusions AI came to and whether to trust them or not. The honest answer is the same as the way they make their strategic decisions more generally: with structured uncertainty.
You don't need to understand every neuron in the network, but you do need confidence boundaries. What I advise leaders is look at the AI's track record on similar decisions, stress test the output against the edge cases, and always have a human checkpoint before something becomes irreversible. Think of it less like trusting a black box and more like trusting a highly competent colleague.
You don't audit every thought in their head, but you do ask them to show their working on things that matter.
I think one of the surprises, reading about your work, was the potential for other drugs to have an effect on Alzheimer's: that metformin, and non-steroidal anti-inflammatories like aspirin or ibuprofen, drugs that people could take every day, might possibly have a protective effect.
I was thinking about it from a sort of a data set point of view, that if you've got a data set of hundreds of thousands of people that you could suddenly create demographics of people and their experiences, if you had historical studies of how people and their symptom development and the knowledge of the other drugs they were taking, you could suddenly start finding some of these things that are present in the data.
PROFESSOR ALEJO NEVADO-HOLGADO
Yes. If we have that type of very dense data, which probably in a couple of decades we will have, then, yeah, the task wouldn't be so difficult. A limitation is that currently the data we have, even when it is way bigger and way better than a decade ago, it is still very patchy. So for instance, in UK Biobank and different biobanks, usually you only have a couple of samples per person or maybe one. So for each person, you know whether they have Alzheimer’s or not, and whether maybe in the past they were taking metformin or not. So with only that information, you have to put everything into a statistical model or AI to see if there is any correlation between taking metformin and having, or not having, Alzheimer’s.
Because of how statistics behave, you can do that only with the medications that are taken very frequently. So metformin and non-steroidal anti-inflammatory drugs are taken a lot by many people. And then you have enough so-called statistical power to see if there is a correlation between them, namely that people that take metformin, after decades of taking metformin, have lower chances of having Alzheimer’s. If instead we had a lot of data points for each person, because people maybe are logging their medical history every couple of weeks on their phone, as might be the case in the future, then we would be able to investigate drugs that are way less frequent, because once you have such longitudinal data the statistical power becomes way better. So you would be able to do the same studies that we do now for metformin with hundreds and hundreds of drugs that maybe only 100 people in the country take.
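The point about statistical power can be made concrete with a small Python sketch. It uses a textbook normal approximation for comparing two proportions, and every number in it is a hypothetical assumption chosen for illustration (a 5% Alzheimer's rate among non-takers, 4% among takers, a cohort of 500,000 people as in UK Biobank). It covers only the single-snapshot case Alejo describes, not the gain from longitudinal data.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_takers: float, p_others: float,
                 n_takers: int, n_others: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two proportions.

    Normal approximation with unpooled variance; the negligible chance of
    rejecting in the wrong direction is ignored.
    """
    std_error = sqrt(p_takers * (1 - p_takers) / n_takers
                     + p_others * (1 - p_others) / n_others)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_takers - p_others) / std_error - z_crit)


COHORT = 500_000                    # roughly the size of UK Biobank
P_OTHERS, P_TAKERS = 0.05, 0.04     # hypothetical disease rates, illustration only

# Hypothetical: a widely taken drug (50,000 takers) versus a rare one (100 takers).
power_common = approx_power(P_TAKERS, P_OTHERS, 50_000, COHORT - 50_000)
power_rare = approx_power(P_TAKERS, P_OTHERS, 100, COHORT - 100)

print(f"power with 50,000 takers: {power_common:.3f}")
print(f"power with    100 takers: {power_rare:.3f}")
```

Under these assumed rates, the same protective effect is detected almost certainly for the common drug and is nearly invisible for the rare one, which is the limitation described above.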
DAVID ELLIMAN
So given everything that we've discussed, do you think AI and data science can help us find a cure for Alzheimer's?
PROFESSOR ALEJO NEVADO-HOLGADO
So I think that AI is a new tool. It is not going to replace everything, so to speak, but it is a new tool that might be very useful for research on Alzheimer's disease. It is a tool that can be potentially applied to everything, but that probably in the end, it'll be only a proportion of places where it'll be applicable.
It might be tremendously useful in some of those places where it is applicable. It won't be that useful in some others. For instance, an example of somewhere where it might be useful is, currently trying to make simulations, what is called all-atom simulations, of what drugs and molecules do in the human cell.
So what you do to simulate on the computer what a drug might do in the cell is to simulate the position of every single atom of the molecule and every single molecule of water that is swimming around the drug. And that might be like 100 million atoms that you have to simulate. And currently with the biggest supercomputers that academics have access to, maybe you might be able to simulate in one day 300 nanoseconds of that molecule swimming around through the cells and doing, I don't know, seeing if it interacts with amyloid beta or not.
But the problem is that 300 nanoseconds is like 10 orders of magnitude shorter than the timescale on which amyloid beta aggregates and forms those clumps in the cell that you want to get rid of with the drug. So these traditional methods are quite hopeless for finding out whether that molecule is going to do what you want. With neural networks, one of the very interesting things that people are trying to do is to speed up those simulations, such that artificial neural networks can run them way, way faster and, rather than 300 nanoseconds per day, can simulate maybe 300 seconds or more.
That will get you much closer to the timescale on which amyloid beta aggregates, and therefore you will be able to see whether the drugs that you are simulating in the computer are actually stopping the aggregation of amyloid beta or not.
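The arithmetic behind that comparison can be sketched in a few lines of Python. The figures are the round numbers from the conversation (300 nanoseconds per day today, a hoped-for 300 seconds per day, and a target roughly 10 orders of magnitude longer than 300 nanoseconds); the variable names are illustrative, and the target timescale is an assumption derived from the "10 orders of magnitude" remark rather than a measured value.

```python
from math import log10

NANOSECOND = 1e-9  # seconds

classical_per_day = 300 * NANOSECOND   # simulated seconds per day of computing, today
neural_per_day = 300.0                 # hoped-for simulated seconds per day

# Assumed target: about 10 orders of magnitude longer than 300 ns (around 50 minutes).
target_seconds = classical_per_day * 10**10

days_classical = target_seconds / classical_per_day
days_neural = target_seconds / neural_per_day
speedup_orders = log10(neural_per_day / classical_per_day)

print(f"target timescale: about {target_seconds:,.0f} simulated seconds")
print(f"classical simulation: about {days_classical:.0e} days of computing")
print(f"neural-network simulation: about {days_neural:.0f} days of computing")
print(f"speed-up: about {speedup_orders:.0f} orders of magnitude")
```

On these numbers, a speed-up of about nine orders of magnitude turns an impossible run of billions of days into one of around ten days.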
DAVID ELLIMAN
So can AI and data science help us find a cure for Alzheimer's? Yes, but it'll be a tool in a toolbox that we apply, that humans administer, and it'll change the way we do things. And we might treat data differently. We might get wider data sets, more data sets, different viewpoints. It just means that we'll start looking at things in a slightly different way.
But I don't think it fundamentally takes away or potentially automates anything for us. And I think it's true for all areas where AI is being used in research and science, that we're finding that it helps in certain places within a pre-existing process. We're finding ways that we can apply AI, and I think there are some expectations that we can step back and we can automate an entire process, and I don't think we're seeing evidence for that.
Thank you for listening to Tech Tomorrow brought to you by Zühlke. If you'd like to learn more about what we do, you can find links to our website and more resources in this episode's show notes. Until next time.