Summary

  • A significant amount of time, money and energy is required to train large language models (LLMs) and run queries on them, prompting some researchers to explore smaller alternatives
  • Recently, IBM, Google, Microsoft and OpenAI have developed small language models with around 10 billion parameters, far fewer than most LLMs have
  • While not as versatile as their larger counterparts, the more focused small language models are good at carrying out specific tasks, such as summarising conversations or answering queries from patients in a medical setting
  • Researchers are developing ways to train the small models on high-quality data sets created by larger models, a method known as knowledge distillation (see the sketch after this list)
  • It is hoped these more efficient, smaller models will allow researchers to experiment with novel ideas more quickly and easily, while saving time and money.
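The article itself contains no code, but the knowledge distillation idea it mentions can be illustrated with a minimal PyTorch sketch of the classic soft-label approach, in which a small "student" model learns to imitate the output distribution of a larger "teacher". The linear layers, dimensions and temperature below are illustrative assumptions, not details from the original piece.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the models: in practice the teacher would be a large
# pretrained language model and the student a much smaller one.
teacher = nn.Linear(16, 4)   # hypothetical "large" teacher
student = nn.Linear(16, 4)   # hypothetical "small" student

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0            # softens the teacher's output distribution

x = torch.randn(8, 16)       # a toy batch of inputs

with torch.no_grad():
    teacher_logits = teacher(x)   # teacher predictions; teacher is not updated

student_logits = student(x)

# The student minimises the KL divergence between its softened output
# distribution and the teacher's, so it learns from the teacher's full
# probability distribution rather than from hard labels alone.
optimizer.zero_grad()
loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature ** 2

loss.backward()
optimizer.step()
```

In the variant the article describes, the large model is instead used to generate a high-quality synthetic data set for the small model to train on, but the underlying idea is the same: the small model learns from the richer signal a large model provides.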

By Stephen Ornes

Original Article