A new episode of our podcast, “Show Me the Science,” has been posted. At present, these podcast episodes are highlighting research and patient care on the Washington University Medical Campus as our scientists and clinicians confront the COVID-19 pandemic.
The crowdsourced supercomputing project Folding@home harnesses the combined processing power of millions of computers whose owners download software and run simulations to model how proteins move and fold. Now, in response to the COVID-19 pandemic, individuals, universities companies, even the Spanish soccer league La Liga, have joined forces to model how the coronavirus uses its spike protein to bind to human cells.
In this episode, we speak with Greg Bowman, PhD, an associate professor of biochemistry and molecular biophysics and the international leader of the Folding@home effort, about using computer processing power to run simulations that would take more than 100 years to complete on a standard computer. Bowman says that with thousands of new participants, the project now has more raw computing power than the world’s largest 500 supercomputers … combined.
The podcast “Show Me the Science” is produced by the Office of Medical Public Affairs at Washington University School of Medicine in St. Louis.
Jim Dryden (host): Hello, and welcome to Show Me the Science, a podcast about the research, teaching, and patient care, as well as the students, staff, and faculty at Washington University School of Medicine in St. Louis, Missouri, the Show Me State. My name is Jim Dryden and I’m your host this week. We’ve been focusing these podcasts on the COVID-19 pandemic and Washington University’s response. This week: Folding@home. That’s a crowdsourced supercomputing project that is using people’s home computers to simulate the movements, or folding, of proteins used by the coronavirus that causes COVID-19. When the pandemic began, about 30,000 computers around the world already were running Folding@home software to understand protein folding involving other diseases. But in the last few months, that number has grown to more than 4,000,000 and includes computers from major tech companies and even from La Liga, the top-flight Spanish professional soccer league, which also has donated computing resources to the cause.
Gregory Bowman, PhD: It’s a pretty ridiculously amazing resource. And just to put to this in reference, the world’s fastest public supercomputer is the Summit supercomputer. Even by very conservative estimates, we have broken to the exaFLOP level, which is at least five times faster than the Summit supercomputer. We’re very much focused on COVID-19 right now, so we’re bringing a pretty unprecedented amount of compute power to bear on a single set of interrelated problems.
Dryden: That’s Washington University’s Greg Bowman, who leads the international Folding@home effort and who now works with companies such as Microsoft, Amazon Web Services, Advanced Micro Devices, Cisco, Oracle, and CERN, the European Organization for Nuclear Research, which runs the world’s most powerful particle accelerator. Folding@home didn’t start with COVID, but Bowman says the number of collaborators has exploded this year. That’s a good thing because Folding@home requires a lot of computing power.
Bowman: Folding@home is a massive distributed computing project with the aim of understanding protein dynamics. So, how do these molecular machines function, and how can we use this structural insight to accelerate the development of drugs and other therapeutics? And the reason for creating Folding@home is that watching all the moving parts of these little molecular machines is extremely computationally expensive to simulate. So I tell people, “Our easy problems could easily take 100 years on a single reasonable desktop computer, and no one wants to wait that long for an answer.” So what we’ve done is devised ways of breaking these big calculations up into little pieces that can be performed independently of one another, and then getting as many people as possible, through the Internet, to volunteer their computing resources to run these little pieces of simulation that we have put together into a big map of all the structures of a protein at the end.
Dryden: What sorts of things were you working on before this pandemic began?
Bowman: We’re really looking at how do atoms push and pull on each other and move over time as a result? Some examples of things that were really exciting before the pandemic hit were a protein called APOE involved in Alzheimer’s disease, where changing one out of about 300 chemical units that make up this protein can modulate your risk of developing Alzheimer’s disease by as much as 15-fold. People really don’t understand why, and so our hope is that we can shed some light on why are these pathogenic variants different, and can we use that, ultimately, to help design drugs to reduce people’s risk of developing Alzheimer’s disease. And then we’ve been doing a lot with infectious diseases, also. So right before the pandemic hit, we put out our first preprint on a protein from Ebola virus. And this was a really interesting one for us, because if you look at the snapshot of what this protein looks like derived from experiments, there aren’t really any good sites to target with small-molecule drugs to shut its function off. And what we showed with our simulations of how the components of this protein move around is that there actually are pieces that move, and spread apart from each other, and create potential drug binding sites that people hadn’t guessed were there before to interfere with Ebola’s ability to infect our cells and evade our immune systems. It was extremely natural to say, “Oh, look what we’ve been doing with Ebola virus. Can we do the same sort of thing with this new virus that obviously is of great concern to all of us?”
Dryden: And I guess that sort of makes my next question redundant, which was, “What made you think this would be useful for COVID-19?” But could you expand on that just a little bit?
Bowman: There’s been a flood of experimentally derived structures of proteins from the virus and these are extremely valuable. And they’re also, in many respects, just the tip of the iceberg, right? Because they’re blind to all of these moving parts, and so they may show you some of the opportunities for understanding how the virus works and finding ways to combat it, but they’re not an exhaustive picture of all of those opportunities. And so what we’ve been doing is taking all of this structural information as it comes out and using it to start up our simulations and add to that rapidly growing picture of how the virus works. And so this has been really cool. So, like, just to give one example, one of my favorites, for lack of a better word right now, is the spike complex. So when you look at these common pictures of the virus, these are these red protrusions sticking off from the surface of the virus. Each of those is a set of three identical proteins kind of arranged around the edges of a cylinder. And this is really interesting, because one of the ways that the virus invades our immune system is that these spike complexes use a conformational masking strategy. So you can kind of think of them like molecular turtles. So they pull their heads in and close up on themselves to hide the sensitive parts from our immune systems. And then every once in a while, they open up and expose these key parts for binding our cells so that they have a chance of binding to nearby human cells, or the proteins on those human cells, and initiating infection. But most of the time, these things are closed up on themselves to evade our immune system. And so when you see these common structures of the virus, that’s the closed-up state. And so what we’re able to do with our simulations is see this opening up in full atomistic resolution, and that provides a wealth of structural information on what this rare open state looks like. It’s hard to get by any other means. And we’re hoping we and others can use this to inform the development of drugs or antibodies or other therapeutics that target the spike, which is one of the most prominent targets for vaccines, for example.
Dryden: Now, is this strictly about understanding the structure of these proteins, or can the Folding@home effort give you some insight into the function of those proteins too?
Bowman: Yeah. So they’re very much connected, right? So, like, with the spike, the opening up is an essential component of its function. And so we’ve been doing things like comparing the spike complexes from different coronaviruses to understand what is it that makes this one so different from the others and such a problem for us? And does that insight give us clues as to what parts of the spike to target? And it’s helping inform experiments in a number of groups with collaborators and some in our own lab.
Dryden: Your personal story connects to this effort in that you’re studying how these tiny proteins arrange themselves and seeing things that we can’t normally see. Can you share some of that with us about your own personal history and how it led you to this kind of work?
Bowman: Yeah. So I have an inherited form of macular degeneration, so at the age of eight I started losing my vision very rapidly. I have pretty severe – stable, fortunately, but pretty severe – vision loss, and this has impacted my life in many ways. I mean, I can’t drive. It’s hard for me to recognize faces. Yeah, so it’s a pretty constant hurdle, but one silver lining is that it definitely got me interested in the biomedical sciences at a very early age, and the idea that I could contribute to understanding what goes wrong in diseases like mine and how can we go about designing drugs or other therapeutics. So this has been a lifelong goal of mine. And early on I discovered that wet lab, or experimental biology, is challenging if you have a vision impairment. There’s lots of small things that you have to work with. But I fell in love with computers and found that to be a really interesting way of contributing to biomedical sciences, and now has brought me back into the experimental world, too. Often through the hands and eyes of other people, but I can help get the insight and design the experiments and analyze the data and make sense of what’s going on.
Dryden: This effort now has enlisted enough home computers that it’s estimated that Folding@home has more raw computing power than is stored now in the world’s 500 largest supercomputers combined.
Bowman: That’s right. It’s a pretty ridiculously amazing resource. And just to put this in reference, the world’s fastest public supercomputer is the Summit supercomputer, and that has a peak performance of 200 petaFLOPs. And even by very conservative estimates, we have broken to the exaFLOP level, which is at least five times faster than the Summit supercomputer. We’re very much focused on COVID-19 right now, so we’re bringing a pretty unprecedented amount of compute power to bear on a single set of interrelated problems.
Dryden: And you’ve got some pretty heavy hitters involved in the effort, too, not just folks at home who want to help, not that they’re not important, but the Spanish professional soccer league, Microsoft. How did those big players get involved in this?
Bowman: Yeah. So this has been really nice. I mean, I think one of the things that’s been interesting to watch is that, especially as we all were going into quarantine stay-at-home mode, it’s not really satisfying to just hide from this big, life-altering threat. And so everyone, from individuals to large organizations, has been trying to figure out, “How can I actually do something about this?”
Dryden: That’s been true for doctors, scientists, computer programmers, even soccer.
[Audio from Spanish football league game]
Bowman: So they have some serious compute resources that, as I understand it, go towards making sure that people don’t pirate footage of the soccer games. And with the games on hold, these machines were largely sitting idle. And they got wind of what we were doing with Folding@home and pointed these pretty substantial computing resources at us. So they’ve been running simulations for us for quite a while and helping to accelerate the pace of our data acquisition.
Dryden: But in general, it’s mostly folding “at home.”
Bowman: People, individuals running it on their personal computers and people who are embedded in different tech companies using their influence over data centers and large compute resources to point some of those at Folding@home, also, and contribute to our calculations. And after we started growing by like tenfold in just a matter of weeks, our server-side infrastructure started to show the strain that it was under. And a bunch of these companies and organizations then reached out to try to figure out how they could help us keep up with the growth and harness all this compute power that’s been made available to us. So we now have a lot of extra servers running in the cloud and helping us to productively engage with the large number of volunteers and corporate computers that are contributing to our calculations now.
Dryden: Now, if I or somebody else wanted to help out with something like this, does it mean that I have to sort of sacrifice my computer so that it can gather this data, or can this stuff run in the background, or do people have their computers run at night? How does that part of it work, normally?
Bowman: There’s a lot of options there. So you can go to foldingathome.org and download our software and install it, and you can control how much of your computer we use and when. So two of the common ways for people to run, if it’s their personal computer and they also want to get other things done, our software will go on pause so that you’re unimpeded. Or, if you have a powerful enough machine – a lot of people have like eight-core processors, and if you’re watching Disney+, you probably don’t need all eight processors. So you can set it so that six of them are contributing to Folding@home and the other two are keeping your browser running smoothly.
Dryden: We’ve heard a lot, from a public health perspective, about wearing masks. And I wear a mask and it may not protect me, but it’s my way of contributing to protect you if I’m contagious. This seems like a similar kind of effort, that Folding@home has this goal of defeating the virus by getting all of us together to cooperate in the same way that we might cooperate by wearing masks, or staying 10 feet away from people, or whatever it is.
Bowman: I agree. I think it’s a really cool community-building effort, and it’s been really encouraging to see people coming together over this to work towards a common goal, and everyone going out of their way to take initiative to figure out a bunch of things that have looked like small things have turned into quite big things, where a software engineer from a big company will reach out to us and next thing we know, we’re in touch with their managers and their managers and have a whole data center helping out with Folding@home. Or people going and recruiting their communities in other aspects of their life to run Folding@home, and suddenly we see a big jump in participation. So that’s been a really encouraging silver lining against the backdrop of all of the challenges that we’re all facing right now.
Dryden: Have you been able to send any information to people who are working on therapies or developing vaccines or whatever? Or, because this is publicly available as soon as you have it, do you just assume that they’re checking out your resource that you’re providing?
Bowman: We’re working on sharing the data with anyone who wants it. It turns out the data sets are quite large, so it’s not a trivial task to just post it on our website and let anyone download it. We’ve got some of it out there through a group called OSF that wants to facilitate the sharing of scientific data, and we’re working with some other partners to get more of it up there. But we’re already sharing it with a number of efforts that are directly working on developing new therapies. And so this is a collaboration between a number of different computational and experimental groups that are all working together to rapidly iterate between predictions and designs and testing of potential new compounds. So we’ve got tens of thousands of simulations of this protein with potential inhibitors going on Folding@home to try to help figure out which ones are best to spend finite time and experimental resources on testing.
Dryden: Bowman and his colleagues around the world are hoping to use their massive amounts of computing power to zero in on the vulnerabilities of the SARS-CoV-2 virus. Once that job is done, they’ll get back to other things like Alzheimer’s disease, antibiotic resistance and other problems.
Show Me the Science is a production of the Office of Medical Public Affairs at Washington University School of Medicine in St. Louis. The goal of this project is to keep you informed and maybe teach you some things that will give you hope. Thanks for tuning in. I’m Jim Dryden. Stay safe.