Summary

  • Chinese start-up DeepSeek has drawn attention for its AI chatbot R1, which rivals the performance of leading AI companies’ models at a lower cost.
  • DeepSeek has been accused of using a technique called ‘distillation’ (also referred to as ‘knowledge distillation’) to learn from OpenAI’s proprietary o1 model without permission.
  • Distillation is a widely used tool in AI that shrinks a model with little loss of accuracy, and it has been part of computer science research for a decade.
  • The idea began in 2015 with a paper by three Google researchers, including AI pioneer Geoffrey Hinton, who dubbed the hidden information the technique transfers “dark knowledge”.
  • Distillation involves a large teacher model transferring what it knows to a smaller student model, using soft targets that communicate probabilities rather than firm answers (a minimal code sketch follows this list).
  • The student can then learn more efficiently, grasping the categories of information it is meant to sort.
  • Distillation is now offered as a service by companies including Google, OpenAI and Amazon.
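
To make the soft-target idea concrete, here is a minimal sketch of a distillation training step in Python, assuming PyTorch. The toy teacher and student networks, the temperature value, and the random batch are illustrative stand-ins, not any company’s actual setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical toy models for illustration only; in practice the teacher
# is a large pre-trained network and the student is a much smaller one.
teacher = nn.Linear(10, 5)  # stands in for the large teacher model
student = nn.Linear(10, 5)  # stands in for the smaller student model

T = 2.0  # temperature: softens the teacher's output into "soft targets"

x = torch.randn(32, 10)          # a batch of inputs (random, for the sketch)
with torch.no_grad():
    teacher_logits = teacher(x)  # the teacher is frozen during distillation
student_logits = student(x)

# Soft targets: full probability distributions over the categories,
# not one-hot "firm answers".
soft_targets = F.softmax(teacher_logits / T, dim=-1)
student_log_probs = F.log_softmax(student_logits / T, dim=-1)

# KL divergence pulls the student's distribution toward the teacher's.
loss = F.kl_div(student_log_probs, soft_targets,
                reduction="batchmean") * (T * T)
loss.backward()  # in training, an optimizer step would follow
```

Dividing the logits by a temperature above 1 flattens the distribution, so the small probabilities the teacher assigns to wrong categories (the “dark knowledge”) still carry visible signal to the student; scaling the loss by T² keeps gradient magnitudes comparable across temperatures.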

By Amos Zeeberg
