Summary

  • Cohere For AI, the non-profit research division of Canadian AI startup Cohere, has launched Aya Vision, its first vision model.
  • The open-weight multimodal AI model integrates language and vision capabilities and supports 23 languages, making it appealing to a global audience.
  • Aya Vision is designed to enhance AI’s ability to interpret images, generate text, and translate visual content into natural language in multiple languages.
  • The model is available on Cohere’s website and AI code communities Hugging Face and Kaggle under a Creative Commons license, allowing researchers and developers to use, modify and share the model for non-commercial purposes.
  • Aya Vision is also available through WhatsApp, where users can interact with the model directly.
  • The model’s efficiency and performance relative to its size are standout features, outperforming larger multimodal models on several key benchmarks.
  • Cohere For AI credits this to innovations such as synthetic data generation for training, multilingual data scaling, and model merging techniques.
  • Although Cohere ostensibly caters to enterprises, Aya Vision’s restrictive non-commercial licensing terms limit its business use.

By Carl Franzen