Introducing Phi-4-reasoning, adding reasoning models to the Phi family. The model is trained with both supervised finetuning (using a carefully curated dataset of reasoning demonstrations) and reinforcement learning.

📌 Competitive results on reasoning benchmarks against much larger top-tier models, up to DeepSeek R1
📌 Strong performance on new tests released after data collection (AIME 2025, HMMT 2025)
📌 Reasoning transfers/generalizes well to new domains even with SFT alone (e.g. k-SAT, maze solving, calendar planning)
📌 Retains and often significantly improves general-purpose capabilities (e.g. instruction following)

Models available on Azure Foundry: https://lnkd.in/gRertUdw
And HuggingFace: https://lnkd.in/g_8ncNxq and https://lnkd.in/gSnXfrvx

In addition to the models, we are also very excited to share a detailed technical report with insights on model training and evaluation: https://lnkd.in/geuBAiir

There is still a lot to improve, especially around context length, coding, and tool use. We hope you find the models useful! A big thanks to the amazing team and to all our partners.
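For readers who want to try the model, here is a minimal sketch of loading it with the Hugging Face `transformers` library. The model id `microsoft/Phi-4-reasoning` and the chat-template usage are assumptions based on how instruction-tuned models are typically published on the hub; verify both against the linked HuggingFace pages before running.

```python
def build_messages(question: str) -> list[dict]:
    """Build a chat-style message list for a single reasoning query."""
    return [{"role": "user", "content": question}]


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Download the model (large!) and generate a reasoning trace.

    Imports are done lazily so the rest of the module works without
    `transformers` installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/Phi-4-reasoning"  # assumed id; check the hub page
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Render the chat messages with the model's own template and generate.
    inputs = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_answer("If 3x + 5 = 20, what is x?"))
```

This is a sketch, not the official quick-start; the report and model cards linked above are the authoritative usage references.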
Applied Science Manager @ Allstate | On a Continuous Journey to Achieve my Fitness Goals
This is great work, Ahmed Awadallah & team. But it would be nice to see how it performs on unseen and future data (outside the benchmark datasets), and also how it performs in a domain-specific environment. I am assuming that no fundamental changes were made to the model architecture here, and that it still reasons with the usual transformer architecture, predicting the next token autoregressively.