Microsoft Launches Phi-3 Mini Language Model
Small enough to be deployed on a phone
Microsoft has released a miniaturized artificial intelligence model called Phi-3 Mini, the first in a family of lightweight models designed to bring AI capabilities to everyday devices.
Phi-3 Mini has just 3.8 billion parameters, making it significantly smaller than many of the models currently available.
"We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g. phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone," Microsoft researchers said in a research paper.
Small language models (SLMs) like Phi-3 Mini are designed to handle simpler tasks, making them ideal for businesses that may not require the full power of more complex AI models. They may be practical for tasks like data analysis, document summarization and basic chatbot interactions.
"Phi-3 is not slightly cheaper, it's dramatically cheaper, we're talking about a 10x cost difference compared to the other models out there with similar capabilities," said Sebastien Bubeck, Microsoft's vice president of GenAI research.
Eric Boyd, corporate vice president of Microsoft Azure AI Platform, told The Verge that Phi-3 Mini matches the capabilities of larger language models (LLMs) like GPT-3.5 in a more compact design.
According to Boyd, developers trained Phi-3 using a "curriculum," drawing inspiration from how children learn through bedtime stories and books with simpler vocabulary.
Because the supply of real children's books is relatively limited, the researchers gave an LLM a list of more than 3,000 words and had it generate its own 'children's books', which were then used to teach Phi. Boyd said Phi-3 builds on the knowledge of its predecessors: Phi-1 focused on coding, Phi-2 began to develop reasoning skills, and Phi-3 is competent in both.
The research team demonstrated Phi-3 Mini's capabilities by generating creative content like poems and suggesting activities in specific locations, all while running offline on an iPhone 14.
The benefits of smaller models like Phi-3 Mini extend beyond convenience. They require significantly less processing power to run, making them a cost-effective option.
But Phi-3 Mini comes with some limitations. It shares the challenges common to LLMs, such as generating inappropriate content, amplifying bias and hallucinating.
The research team acknowledges this, stating in their paper that "significant work remains to fully address these issues."
Phi-3 Mini will be available across multiple platforms, including Microsoft's own Azure cloud service. It can also be accessed through Hugging Face, a popular machine learning platform, and Ollama, a framework designed for running AI models directly on local machines.
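For readers who want to try the model on their own machine, the Ollama route mentioned above works from the command line. A minimal sketch, assuming a local Ollama installation; the model tag `phi3` pointed to the Mini variant at launch, but the exact identifier may change, so check the Ollama model library:

```shell
# Download the Phi-3 Mini weights from the Ollama model library
# (assumption: "phi3" is the tag for the Mini variant)
ollama pull phi3

# Ask the model a one-off question, e.g. the kind of
# summarization task the article describes as a good fit for SLMs
ollama run phi3 "Summarize the benefits of small language models in two sentences."
```

Because Ollama runs the model entirely on the local machine, this works offline once the weights are downloaded, which is the deployment scenario Microsoft demonstrated on the iPhone 14.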
Looking ahead, Microsoft plans to release two larger Phi-3 models: Phi-3 Small (7B parameters) and Phi-3 Medium (14B parameters), offering a spectrum of capabilities for different needs.
Last week, Meta released Llama 3, the latest iteration of its popular large language model (LLM) series, claiming significant performance improvements over its predecessors.
Llama 3 comes in two variants: Llama 3 8B with 8 billion parameters and Llama 3 70B with 70 billion parameters.
Meta said Llama 3 8B outperforms other open models like Mistral 7B and Google's Gemma 7B on several benchmarks, demonstrating its abilities in areas like knowledge acquisition, reasoning and code generation.
This article originally appeared on our sister site Computing.