Technology · AI · Machine Learning · Research

MIT & ETH Zurich Develop Self-Distillation Fine-Tuning (SDFT) to Enable LLMs to Learn New Skills Without Catastrophic Forgetting, Revolutionizing Enterprise AI Adaptation

Researchers from MIT, the Improbable AI Lab, and ETH Zurich have introduced a fine-tuning method called Self-Distillation Fine-Tuning (SDFT) that allows large language models (LLMs) to acquire new skills and knowledge without losing previously learned capabilities. The work addresses "catastrophic forgetting," the failure mode enterprises often hit when fine-tuning LLMs for new tasks, which typically forces them to maintain a separate model for each skill. SDFT leverages LLMs' inherent in-context learning abilities, enabling them to learn from demonstrations and from their own experiments. The researchers' experiments show SDFT outperforming traditional supervised fine-tuning (SFT) and sidestepping several limitations of reinforcement learning. For businesses, this means a single AI model can accumulate diverse skills over time without degrading on older tasks, paving the way for adaptive AI agents in dynamic environments while reducing expensive retraining and preserving general reasoning.

VentureBeat

Enterprises frequently face a significant hurdle when fine-tuning large language models (LLMs) for new applications: the risk of models forgetting previously acquired knowledge and skills. This often compels companies to manage and maintain distinct models for every specific capability. However, a collaborative research effort involving MIT, the Improbable AI Lab, and ETH Zurich has yielded a groundbreaking solution. They have developed a new technique, termed self-distillation fine-tuning (SDFT), which empowers LLMs to learn novel skills and integrate new knowledge without compromising their existing proficiencies.

SDFT operates by capitalizing on the inherent in-context learning capabilities present in modern LLMs. This allows the models to learn directly from provided demonstrations and through their own experimental interactions. The researchers' experiments have consistently shown that SDFT outperforms traditional supervised fine-tuning (SFT) methods. Furthermore, it effectively addresses several limitations commonly associated with reinforcement learning algorithms.
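To make the distillation idea concrete, here is a minimal numerical sketch, not the paper's actual code: the "teacher" stands in for the model's own output distribution when a demonstration is present in context, and the "student" for the same model answering without the demonstration. All names, the vocabulary size, and the learning rate are illustrative assumptions; fine-tuning is reduced to nudging the student's logits toward the teacher's distribution by gradient descent on a KL divergence.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over a logit vector
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl(p, q):
    # KL(p || q) for two discrete distributions
    return float(np.sum(p * (np.log(p) - np.log(q))))

rng = np.random.default_rng(0)
vocab = 8  # toy vocabulary size (assumption)

# Student: the model's next-token logits WITHOUT the demonstration in context.
student_logits = rng.normal(size=vocab)
# Teacher: the same model's logits WITH the demonstration in context
# (here just perturbed logits, standing in for in-context conditioning).
teacher = softmax(student_logits + rng.normal(size=vocab))

lr, losses = 0.5, []
for _ in range(200):
    student = softmax(student_logits)
    losses.append(kl(teacher, student))
    # Gradient of KL(teacher || student) w.r.t. student logits is (student - teacher),
    # so each step pulls the context-free model toward its own in-context behavior.
    student_logits -= lr * (student - teacher)

print(losses[0], losses[-1])
```

The divergence shrinks steadily, which is the distillation signal: the model "teaches itself" the behavior it already exhibits in context, so the update stays close to its own distribution instead of chasing an external static dataset.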

From an enterprise perspective, this method offers substantial advantages. It enables a single LLM to progressively accumulate multiple skills over time without experiencing performance regression on tasks it learned earlier. This capability is crucial for developing AI agents that can truly adapt to the ever-changing demands of business environments. Such agents could acquire new proprietary knowledge and skills as needed, eliminating the necessity for costly retraining cycles and ensuring the preservation of their fundamental general reasoning abilities.

The core challenge SDFT aims to solve is "continual learning." Once an LLM is initially trained and deployed, its parameters typically remain static. It does not inherently update itself to gain new skills, internalize fresh knowledge, or improve through experience. To achieve truly adaptive AI, akin to how humans continuously learn throughout their careers, the industry must overcome this static nature. The most effective learning paradigm for models is "on-policy learning." This approach involves the model learning from data it generates itself, allowing it to identify and correct its own errors and refine its reasoning processes. This contrasts sharply with learning solely by mimicking static datasets. Without on-policy learning, LLMs are susceptible to "catastrophic forgetting," a phenomenon where the acquisition of new information or skills inadvertently leads to the loss of previously learned knowledge and abilities.
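Catastrophic forgetting itself is easy to reproduce in miniature. The sketch below uses a single linear least-squares "model" and two synthetic tasks (all of it an illustrative assumption, not the researchers' setup): after naive sequential fine-tuning on task B alone, the parameters that solved task A are overwritten, and task-A error regresses sharply.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_task(w_true, n=200):
    # synthetic regression task: y = X @ w_true + small noise
    X = rng.normal(size=(n, 4))
    y = X @ w_true + 0.01 * rng.normal(size=n)
    return X, y

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def train(w, X, y, steps=500, lr=0.05):
    # plain gradient descent on mean squared error
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

XA, yA = make_task(rng.normal(size=4))  # task A
XB, yB = make_task(rng.normal(size=4))  # task B

w = train(np.zeros(4), XA, yA)   # learn task A
errA_before = mse(w, XA, yA)
w = train(w, XB, yB)             # then fine-tune ONLY on task B
errA_after = mse(w, XA, yA)      # task-A error after the update

print(errA_before, errA_after)
```

Because nothing anchors the parameters to task A during the second phase, errA_after ends up far above errA_before. Keeping updates close to the model's own current behavior, which is what on-policy, self-distilled training does, is one way to restrain this drift.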

Related News

Technology

Microsoft's HVE Core: Streamlined Hyper-Velocity Engineering Components for Project Acceleration and Copilot Integration

Microsoft has released 'hve-core,' a collection of refined hyper-velocity engineering components designed to accelerate project initiation and enhance existing projects. These components, which include instructions, prompts, agents, and skills, are specifically developed to help projects fully leverage the capabilities of various Copilots. The initiative aims to provide essential building blocks for developers looking to optimize their workflows and integrate advanced AI assistance into their development processes.

Technology

MiroFish: A Concise, Universal Swarm Intelligence Engine for Prediction, Trending on GitHub

MiroFish, developed by 666ghj, is introduced as a concise and universal swarm intelligence engine designed for predicting a wide range of phenomena. The project, trending on GitHub since March 9, 2026, aims to leverage collective intelligence to offer predictive capabilities across various domains. Its core functionality focuses on providing a streamlined and adaptable solution for 'predicting all things,' highlighting its broad applicability in the realm of intelligent systems.

Technology

Alibaba's Page Agent: A JavaScript GUI Proxy for Natural Language Web Interface Control

Alibaba has released 'Page Agent,' a JavaScript-based GUI proxy designed to enable natural language control over web page interfaces. The tool, currently trending on GitHub, aims to simplify web interaction by letting users drive graphical user interfaces within web pages through natural language commands. The project was published on March 9, 2026.