Texas A&M University

Protein Origami

By Gene Charleton

Proteins are some of the most complicated and important molecules in the human body. Computer scientists are adding to our understanding of how proteins do what they do.

Dr. Nancy Amato
Nancy Amato is co-director of the Parasol Laboratory. Researchers in the Parasol Lab are using programming techniques originally developed for robots to analyze the best way for the parts of complex protein molecules to fold together and function properly.

Computer programming techniques that enable robots to find their way around obstacles are helping researchers understand one of the most complex and important problems in biomedical science — how protein molecules fold.

Proteins are crucial to human health. They do most of the important things that allow us, and other living things, to stay alive. The hemoglobin that carries oxygen in our blood is a protein. So are hormones such as insulin. The antibodies that fight infection are proteins. Tendons and ligaments are mostly protein, and protein forms the flexible framework of bone.

A team of researchers led by Professor Nancy Amato, co-director of the Department of Computer Science’s Parasol Laboratory, is applying these programming techniques, known as motion planning, to protein folding. Motion planning means computing a feasible, or possibly the most efficient, way for something to move from one configuration to another. It could mean finding a safe way for a robot to move through a roomful of obstacles, the most efficient way to fold cardboard into a box, or something even more complex, such as understanding how stringlike protein molecules fold into the complex shapes they take to do their jobs.
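To make the robot-in-a-room idea concrete, here is a minimal sketch in Python. It assumes the room can be drawn as a small grid in which 1 marks an obstacle and 0 marks free floor, and it uses a plain breadth-first search to find a shortest safe route. The grid, the `ROOM` layout and the `plan_path` name are illustrative inventions, not the Parasol Lab's software.

```python
from collections import deque

def plan_path(grid, start, goal):
    """Return a shortest obstacle-free path of (row, col) cells, or None."""
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}          # remembers how each cell was reached
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk backwards from the goal to rebuild the route.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None                        # the goal is walled off

# Hypothetical room: 1 = obstacle, 0 = free floor.
ROOM = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
]

path = plan_path(ROOM, start=(0, 0), goal=(3, 0))
print(path)
```

In this toy room the only route snakes along the top row, down the right-hand side and back along the third row. A protein has far more moving parts than a robot on a floor, which is why planners for folding typically sample possible shapes rather than enumerate every one.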


“Our motion-planning technique for simulating protein folding is orders of magnitude faster than existing methods — we solve problems in hours on a desktop PC that take traditional methods months of supercomputer time. Essentially, we save time by computing approximate solutions that capture the important features of the precise solution,” says Amato.

Putting proteins into motion

When motion planning is applied to protein folding, it means working out how individual parts of the complex protein molecule move into what biologists call the molecule’s native state, the shape in which the parts of the molecule nest together most “comfortably” in terms of the energy in the molecule. The protein’s ability to function properly depends on this shape being right.

Figuring out how this works is a knotty problem. Proteins are some of the largest and most complicated molecules there are. Even simple proteins often are made up of more than 100 smaller molecules called amino acids strung together like beads. Big proteins have thousands of amino acids. In your body, this string of molecules is folded into complex subshapes known as alpha helices and beta strands, hairpins and sheets, that are connected by loops. All of these pieces have to fit together perfectly for the protein to have the right overall shape and stability for it to do what it’s supposed to do in our bodies.

When this intricate folding goes awry, bad things happen. A hemoglobin protein that’s folded wrong results in the potentially fatal disease sickle cell anemia. A misfolded bone protein gives us brittle bone disease. Misfolded proteins in the brain can mean Alzheimer’s or mad cow disease.

The protein solution

Understanding how proteins fold can help researchers understand why the proteins sometimes fold incorrectly. This understanding will help medical scientists develop treatments for diseases caused by misfolded proteins and work out ways to prevent them.

So far, the researchers have applied their technique to moderate-sized protein molecules consisting of between 50 and 200 amino acids.

They use desktop PCs to compute mathematical descriptions of the “folding landscape” that determines the process that protein molecules go through as they fold. Intuitively, the landscape encodes paths that protein molecule “robots” might follow to find their way around high-energy obstacles to “comfortable” positions.
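The landscape idea can be sketched in a few lines of Python. The article does not name the specific algorithm, so this example assumes a sampling-based roadmap: random "shapes" are scattered over a toy two-dimensional landscape, nearby shapes are linked, and each link costs more if it climbs uphill in energy. The `energy` function, the `NATIVE` point, the sampling radius and every other number here are hypothetical stand-ins for a real protein model.

```python
import heapq
import math
import random

NATIVE = (0.8, 0.8)  # hypothetical lowest-energy ("comfortable") shape

def energy(p):
    """Toy energy: a bowl centered on NATIVE plus one high-energy bump."""
    bowl = (p[0] - NATIVE[0]) ** 2 + (p[1] - NATIVE[1]) ** 2
    bump = 3.0 * math.exp(-((p[0] - 0.5) ** 2 + (p[1] - 0.5) ** 2) / 0.01)
    return bowl + bump

def build_roadmap(start, n_samples=300, radius=0.2, seed=0):
    """Sample random shapes and link every pair closer than `radius`."""
    rng = random.Random(seed)
    nodes = [start, NATIVE] + [(rng.random(), rng.random())
                               for _ in range(n_samples)]
    edges = {i: [] for i in range(len(nodes))}
    for i, a in enumerate(nodes):
        for j, b in enumerate(nodes):
            d = math.dist(a, b)
            if i != j and d <= radius:
                # Moving uphill in energy is penalized; downhill is free.
                # Only the two endpoints are checked, a simplification.
                climb = max(0.0, energy(b) - energy(a))
                edges[i].append((j, d + climb))
    return nodes, edges

def lowest_cost_path(edges, source=0, target=1):
    """Dijkstra's algorithm over the roadmap; returns node indices or None."""
    best = {source: 0.0}
    came_from = {source: None}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        if cost > best[node]:
            continue  # stale heap entry
        for nxt, weight in edges[node]:
            new_cost = cost + weight
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                came_from[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    return None

nodes, edges = build_roadmap(start=(0.1, 0.1))
pathway = lowest_cost_path(edges)
print([tuple(round(x, 2) for x in nodes[i]) for i in pathway])
```

The printed list is one candidate "folding pathway" from the starting shape to the native one. Because the roadmap is built from a limited number of samples, the answer is approximate, which is the trade the researchers describe: less precision in exchange for far less computing time.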

To study more and larger proteins, they are using the STAPL parallel C++ library, also developed in the Parasol Laboratory, and the world’s fastest supercomputer, IBM’s Blue Gene.

“So far we have shown that our simulations agree with known experimental results,” Amato says. “The real significance of our method, though, lies in its potential to discover facts that have not yet been established experimentally and to test proposed therapies to alter undesirable folding behaviors.”