By treating DNA as a language, Brian Hie’s “ChatGPT for genomes” could pick up patterns that humans can’t see, accelerating biological design.

The post “The Poetry Fan Who Taught an LLM to Read and Write DNA” first appeared on Quanta Magazine.

Summary

  • Computer scientist Brian Hie has led a team that trained a large language model, called Evo, on DNA sequences from 2.7 million microorganisms to allow it to “read” DNA and create novel DNA sequences for functional biological machines.
  • Evo can predict evolutionary likelihoods for DNA mutations and has created wholly new DNA sequences for biological molecules that function as well as or better than those designed by humans.
  • The team believes AI-driven biological design could be used to create better tools for medicine and the environment.
  • Hie compares DNA to human language and suggests that an algorithm fluent in DNA can interpret what humans cannot.
  • However, Evo makes mistakes and is currently trained only on the genomes of prokaryotes, not those of the more complex eukaryotes, which include animals, plants and fungi.

By Ingrid Wickelgren
