Summary

  • French AI firm Mistral has released its first embedding model, Codestral Embed, which it claims outperforms existing models on benchmarks including SWE-Bench.
  • The model specialises in code and is available to developers for $0.15 per million tokens. It transforms code and data into numerical representations for retrieval-augmented generation (RAG) use cases.
  • It can be used for RAG, semantic code search, similarity search and code analytics, according to the company.
  • “Codestral Embed can output embeddings with different dimensions and precisions,” Mistral said in a blog post.
  • “The dimensions of our embeddings are ordered by relevance. For any integer target dimension n, you can choose to keep the first n dimensions for a smooth trade-off between quality and cost.”
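
The "keep the first n dimensions" trade-off Mistral describes can be illustrated with a minimal sketch. It makes no call to Mistral's API: the vectors are made-up toy values standing in for real embeddings, and the helper names (`truncate_embedding`, `cosine`, `rank`) are hypothetical. Re-normalising after truncation is an assumption of this sketch, not something the blog post states.

```python
import math

def truncate_embedding(vec, n):
    """Keep the first n dimensions, then rescale to unit length.

    The rescaling step is an assumption of this sketch, not a documented
    requirement of Codestral Embed.
    """
    if not 1 <= n <= len(vec):
        raise ValueError("n must be between 1 and len(vec)")
    head = vec[:n]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, snippets, n):
    """Rank snippets by similarity to the query using only the first n dimensions."""
    q = truncate_embedding(query, n)
    scored = [(name, cosine(q, truncate_embedding(vec, n)))
              for name, vec in snippets.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy 6-dimensional vectors (made-up values, not real model output).
query = [0.9, 0.3, 0.1, 0.05, 0.02, 0.01]
snippets = {
    "parse_json": [0.8, 0.4, 0.1, 0.0, 0.05, 0.02],
    "open_socket": [0.1, 0.2, 0.9, 0.3, 0.1, 0.0],
}

if __name__ == "__main__":
    for n in (6, 3):
        print(n, rank(query, snippets, n))
```

Storing only the first n values cuts index size and comparison cost roughly in proportion to n. How much retrieval quality is lost at a given n depends on the model and would need to be measured on real data.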

By Emilia David
