Microsoft unveils Phi-3 family of small language models

By Elizabeth Montalbano

Microsoft has introduced a new family of small language models (SLMs) as part of its plan to make lightweight yet high-performing generative artificial intelligence technology available across more platforms, including mobile devices.

The company unveiled the Phi-3 platform in three models: the 3.8-billion-parameter Phi-3 Mini, the 7-billion-parameter Phi-3 Small, and the 14-billion-parameter Phi-3 Medium. The models comprise the next iteration of Microsoft’s SLM product line that began with the release of Phi-1 and then Phi-2 in rapid succession last December.

Microsoft’s Phi-3 builds on Phi-2, a 2.7-billion-parameter model that outperformed large language models (LLMs) up to 25 times larger, Microsoft said at the time. Parameters are the numerical weights a model learns during training; roughly speaking, more parameters mean more capacity to capture complex patterns in language. For example, OpenAI’s large language model GPT-4 is reported to have upwards of 1.7 trillion parameters. Microsoft is a major investor in and partner of OpenAI, and uses OpenAI’s models as the basis for its Copilot generative AI assistant.

Generative AI goes mobile

Phi-3 Mini is available now, with the other models to follow. Phi-3 Mini can be quantized to 4 bits so that it occupies only about 1.8GB of memory, which makes it suitable for deployment on mobile devices, Microsoft researchers revealed in a technical report about Phi-3 published online.
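The reported footprint follows from simple arithmetic: 3.8 billion weights at 4 bits each. A quick back-of-envelope check (figures here are illustrative assumptions, and real quantized checkpoints carry some extra overhead for scales and metadata):

```python
# Back-of-envelope memory footprint for a 4-bit quantized model.
# Parameter count and bit width are taken from the article; the
# calculation ignores quantization metadata overhead.
params = 3.8e9          # Phi-3 Mini parameter count
bits_per_weight = 4     # 4-bit quantization

bytes_total = params * bits_per_weight / 8
print(f"{bytes_total / 2**30:.2f} GiB")  # ~1.77 GiB, in line with the reported ~1.8GB
```

The same model at 16-bit precision would need roughly four times as much memory, which is why quantization is what makes on-device deployment feasible.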

In fact, Microsoft researchers have already tested the quantized Phi-3 Mini by deploying it on an iPhone 14 with an A16 Bionic chip, where it ran natively. Even at this small size, the model achieved overall performance, as measured by both academic benchmarks and internal testing, that rivals models such as Mixtral 8x7B and GPT-3.5, the researchers said.

Phi-3 was trained on a mix of “heavily filtered” web data from various open internet sources, as well as synthetic LLM-generated data. Microsoft performed pre-training in two phases, the first of which consisted mostly of web sources aimed at teaching the model general knowledge and language understanding. The second phase merged even more heavily filtered web data with some synthetic data to teach the model logical reasoning and various niche skills, the researchers said.

Trading ‘bigger is better’ for ‘less is more’

The hundreds of billions or even trillions of parameters that LLMs use to produce results come with a cost, and that cost is computing power. Chip makers scrambling to supply processors for generative AI already anticipate a struggle to keep up with the rapid evolution of LLMs.

Phi-3, then, is a manifestation of a continuing trend in AI development away from the “bigger is better” mentality and toward more specialization in the smaller data sets on which SLMs are trained. These models provide a less expensive and less compute-intensive option that can still deliver high performance and reasoning capabilities on par with, or even better than, LLMs, Microsoft said.

“Small language models are designed to perform well for simpler tasks, are more accessible and easier to use for organizations with limited resources, and they can be more easily fine-tuned to meet specific needs,” noted Ritu Jyoti, group vice president of worldwide artificial intelligence and automation research at IDC. “In other words, they are way more cost-effective than LLMs.”

Many financial institutions, e-commerce companies, and nonprofits are already embracing smaller models for the personalization they can provide, such as being trained specifically on a single customer’s data, noted Narayana Pappu, CEO of Zendata, a provider of data security and privacy compliance solutions.

These models also can provide more security for the organizations using them, as specialized SLMs can be trained without giving up a company’s sensitive data.

Other benefits of SLMs for business users include a lower probability of hallucinations, or delivering erroneous information, and lower data and pre-processing requirements, making them easier to integrate into legacy enterprise workflows, Pappu added.

The emergence of SLMs does not mean LLMs will go the way of the dinosaur, however. It just means more choice for customers “to decide on what is the best model for their scenario,” Jyoti said.

“Some customers may only need small models, some will need big models, and many are going to want to combine both in a variety of ways,” she added.

Not a perfect science—yet

While SLMs have certain advantages, they also have their drawbacks, Microsoft acknowledged in its technical report. The researchers noted that Phi-3, like most language models, still faces “challenges around factual inaccuracies (or hallucinations), reproduction or amplification of biases, inappropriate content generation, and safety issues.”

And despite its high performance, Phi-3 Mini has limitations due to its smaller size. “While Phi-3 Mini achieves a similar level of language understanding and reasoning ability as much larger models, it is still fundamentally limited by its size for certain tasks,” the report states.

For example, Phi-3 Mini doesn’t have the capacity to store large amounts of “factual knowledge.” This limitation can be mitigated by pairing the model with a search engine, the researchers noted. Another capacity-related weakness is that the researchers mostly restricted the training data to English, though they expect future iterations to include more multilingual data.
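Pairing a model with a search engine typically means retrieving relevant passages and prepending them to the prompt, so the small model reasons over external facts rather than relying on what it memorized. A minimal sketch of that pattern, where `search` and `generate` are hypothetical stand-ins rather than real Phi-3 or search-engine APIs:

```python
# Sketch of retrieval-augmented prompting for a small model.
# `search` and `generate` are placeholder functions for illustration,
# not actual APIs from the Phi-3 report.

def search(query: str) -> list[str]:
    # Placeholder: a real implementation would query a search engine
    # and return the top-ranked passages.
    return ["Phi-3 Mini is a 3.8-billion-parameter language model."]

def generate(prompt: str) -> str:
    # Placeholder: a real implementation would call the language model.
    return f"Model answer grounded in: {prompt}"

def answer_with_retrieval(question: str) -> str:
    # Fetch external facts the small model cannot store itself,
    # then include them in the prompt as grounding context.
    context = "\n".join(search(question))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)

print(answer_with_retrieval("How many parameters does Phi-3 Mini have?"))
```

The design point is that the retrieval step, not the model’s weights, supplies the factual knowledge, which is exactly the trade a capacity-limited SLM benefits from.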

Still, Microsoft’s researchers noted that they carefully curated the training data and conducted testing to ensure that they “significantly” mitigated these issues “across all dimensions,” while adding that “there is significant work ahead to fully address these challenges.”

© InfoWorld