
What is the Controversy Surrounding Sarvam AI’s Latest Launch?
The Indian startup, Sarvam AI, recently launched its latest product, Sarvam-M, a 24B open-weights Large Language Model (LLM) built on top of Mistral Small. The launch was met with a muted response, with fewer than 720 downloads within the first three days. However, the real controversy surrounding the launch lies in the criticisms it has received from industry experts.
Menlo Ventures’ Deedy Das was one of the first to speak out against Sarvam-M. In a tweet, he stated that the model performs “marginally better in Indic languages” and that it “doesn’t solve hard problems.” Das’s comments sparked a heated debate in the AI community, with many questioning the value of Sarvam-M.
However, not everyone was critical of Sarvam-M. Zoho’s Sridhar Vembu came to the defense of the model, saying that “no product was ever an instant hit.” Vembu’s comments suggest that the startup should not be judged solely on its initial performance.
So, what is the controversy surrounding Sarvam AI’s latest launch? Let’s dive deeper into the issue.
Sarvam-M: A 24B Open-Weights LLM
Sarvam-M is a 24B open-weights LLM built on top of Mistral Small. The model is designed to be a general-purpose language understanding and generation model, capable of performing a wide range of tasks, from text classification and sentiment analysis to language translation and text generation.
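The source does not describe Sarvam-M’s actual interface, but the way a general-purpose instruction-tuned LLM covers many tasks is usually through prompting rather than separate task-specific heads. A minimal sketch, assuming hypothetical prompt templates (none of these names or templates come from Sarvam AI):

```python
# Hypothetical illustration: a single general-purpose LLM handles
# classification, translation, and generation via prompt templates alone.
TASK_TEMPLATES = {
    "sentiment": "Classify the sentiment of this text as positive or negative:\n{text}",
    "translate": "Translate the following text into {target_lang}:\n{text}",
    "generate": "Continue the following passage:\n{text}",
}

def build_prompt(task: str, text: str, **kwargs) -> str:
    """Format a task-specific prompt to send to a general-purpose LLM."""
    return TASK_TEMPLATES[task].format(text=text, **kwargs)

prompt = build_prompt("translate", "Good morning", target_lang="Hindi")
print(prompt)
```

The point of the sketch is the design choice: one set of weights, many tasks, with the task selected by the text of the prompt rather than by the model architecture.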
The model’s architecture is based on the transformer, which has become the dominant neural network architecture for language modeling. Unlike the recurrent neural networks (RNNs) that preceded it, a transformer processes all positions of a sequence in parallel through self-attention, which makes it particularly well-suited for natural language processing because it can capture long-range dependencies in language.
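The self-attention mechanism described above can be sketched in a few lines of NumPy. This is a simplified single-head version with no learned projections (a real transformer applies learned query/key/value matrices first), intended only to show how every token attends to every other token:

```python
import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """Single-head self-attention over a sequence of embeddings.

    Each position attends to every other position, which is how
    transformers capture long-range dependencies without recurrence.
    """
    d = x.shape[-1]
    # Simplification: use the input directly as queries, keys, and values.
    scores = x @ x.T / np.sqrt(d)                    # (seq, seq) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ x                               # weighted mix of all positions

seq = np.random.default_rng(0).normal(size=(5, 8))  # 5 tokens, embedding dim 8
out = self_attention(seq)
print(out.shape)  # (5, 8): one updated vector per token
```

Because the score matrix covers every pair of positions at once, a dependency between the first and last token is handled in a single step, rather than being propagated through a chain of recurrent states.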
Sarvam-M’s performance on Indic languages
Das’s criticism of Sarvam-M’s performance on Indic languages is a significant point of contention. According to Das, the model performs “marginally better in Indic languages,” which suggests that its gains over existing models are only incremental, even on the languages it is explicitly designed to serve.
This criticism is particularly relevant in the context of India, where many languages are written in non-Latin scripts. For example, Hindi is written in the Devanagari script, while Bengali and Telugu each have scripts of their own. These scripts differ significantly from the Latin script used in English, and because they tend to be under-represented in training corpora and tokenizer vocabularies, models need targeted tokenization and training to handle them well.
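One concrete way to see why non-Latin scripts stress language models: many subword tokenizers fall back to byte-level pieces for scripts that are poorly covered by their vocabulary, and Devanagari text costs far more UTF-8 bytes per word than Latin text. A small, self-contained illustration:

```python
# Compare UTF-8 byte counts: a byte-level fallback needs roughly one
# token per byte for scripts absent from the tokenizer's vocabulary.
english = "language"   # 8 Latin characters
hindi = "भाषा"          # "language" in Hindi (Devanagari script)

print(len(english), len(english.encode("utf-8")))  # 8 characters, 8 bytes
print(len(hindi), len(hindi.encode("utf-8")))      # 4 characters, 12 bytes
```

A word that is shorter on screen can therefore consume several times as many tokens, which hurts both the effective context length and the quality of learned representations for Indic-language text.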
Does Sarvam-M solve hard problems?
Das’s criticism of Sarvam-M goes beyond its performance on Indic languages. He also states that the model “doesn’t solve hard problems.” This criticism suggests that Sarvam-M may not be capable of tackling complex tasks that require a deep understanding of language.
This criticism is particularly relevant in the context of AI, where models are often judged on their ability to solve complex tasks. For example, a language model that can carry out multi-step mathematical reasoning or produce working code is generally considered more capable than one limited to simple classification tasks.
Zoho’s Sridhar Vembu defends Sarvam-M
Vembu’s defense of Sarvam-M is a significant counterpoint to Das’s criticism. He suggests that “no product was ever an instant hit,” implying that a slow start does not mean the model lacks value, and that Sarvam-M should be given time to mature before being written off.
Vembu’s comments are significant because they frame Sarvam-M’s value in terms of its potential rather than its launch metrics. This is a common pattern in AI development, where an initial release establishes a baseline that later iterations improve upon.
Conclusion
The controversy surrounding Sarvam AI’s latest launch is a significant point of contention in the AI community. Das’s criticism of Sarvam-M’s performance on Indic languages and its inability to solve hard problems has sparked a heated debate, with many questioning the value of the model.
However, Vembu’s defense of Sarvam-M suggests that the model may still be a valuable contribution to the field, even if it is not as effective as other models. The controversy surrounding Sarvam-M highlights the challenges of developing AI models that are capable of processing languages that are written in non-Latin scripts.
As the AI community continues to evolve, it is likely that we will see more models like Sarvam-M, which are designed to be general-purpose language understanding and generation models. The success of these models will depend on their ability to process languages that are written in non-Latin scripts, and to solve complex tasks that require a deep understanding of language.