MIT researchers have developed a new technique that allows large language models (LLMs) to learn new skills without losing existing knowledge, according to VentureBeat. The technique, called self-distillation fine-tuning (SDFT), lets models learn directly from demonstrations and their own experiments. It addresses a key challenge in fine-tuning LLMs: adding new skills can inadvertently erase previously learned information, forcing companies to maintain a separate model for each skill.
SDFT, developed by researchers at MIT's Improbable AI Lab and ETH Zurich, consistently outperformed traditional supervised fine-tuning (SFT) in experiments, VentureBeat reported. The method leverages the in-context learning abilities of modern LLMs.
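The article does not include an implementation, but the broad idea it describes, turning what a model can already do in context into a weight update, can be sketched. The following is a minimal illustration, not the researchers' actual algorithm: it assumes a frozen copy of the model, conditioned on a demonstration, supplies distillation targets for a student that sees no demonstration. The model name, prompts, and hyperparameters are placeholders.

```python
# Minimal self-distillation sketch (assumed mechanics, not the paper's code):
# a frozen teacher copy of the model sees a demonstration in context; the
# student, without that context, is trained to match the teacher's
# next-token distributions on the response tokens.
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in; SDFT targets modern instruction-tuned LLMs
tok = AutoTokenizer.from_pretrained(model_name)
student = AutoModelForCausalLM.from_pretrained(model_name)
teacher = copy.deepcopy(student).eval()  # frozen snapshot provides targets
for p in teacher.parameters():
    p.requires_grad_(False)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

demo = "Q: Convert 3 km to miles.\nA: 3 km is about 1.86 miles.\n"
prompt = "Q: Convert 5 km to miles.\nA:"
response = " 5 km is about 3.11 miles."

# Teacher sees the demonstration in context; the student does not.
t_ids = tok(demo + prompt + response, return_tensors="pt").input_ids
s_ids = tok(prompt + response, return_tensors="pt").input_ids
# Simplification: assumes the response tokenizes identically in both passes.
n_resp = tok(response, return_tensors="pt").input_ids.shape[1]

with torch.no_grad():
    t_logits = teacher(t_ids).logits[:, -n_resp - 1:-1, :]
s_logits = student(s_ids).logits[:, -n_resp - 1:-1, :]

# Distill: push the context-free student toward the in-context teacher.
loss = F.kl_div(
    F.log_softmax(s_logits, dim=-1),
    F.log_softmax(t_logits, dim=-1),
    log_target=True,
    reduction="batchmean",
)
loss.backward()
optimizer.step()
```

Because the targets here come from the model's own in-context predictions rather than external labels, the update stays close to the model's existing distribution, which is one plausible intuition for why self-distillation approaches forget less than standard SFT.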
Meanwhile, Chinese AI companies have been making significant strides, releasing models that match the performance of Western counterparts at lower cost, according to MIT Technology Review. Moonshot AI recently released its open-weight model, Kimi K2.5, which nearly matched Anthropic's Claude Opus on some benchmarks at a fraction of the price, and Alibaba's Qwen family of models has surpassed Meta's Llama models in downloads on Hugging Face.
The rapid advances in AI are also raising concerns about misuse. As MIT Technology Review reports, cybersecurity researchers are already seeing AI used to make online crime easier. One example involved a sophisticated ransomware strain, reportedly built with AI assistance, that encrypted files on victims' systems and rendered them unusable until a ransom was paid.
The intersection of public and private markets is also being reshaped by AI and other forces. Paul Wick, chief investment officer at Seligman, noted a "psychological shift" in the market, with increased fear among investors, according to Fortune. The funding mechanism behind the software leveraged-buyout (LBO) complex has been disrupted, and IPO markets have been weak.
In other news, the discovery of Stela C, an Olmec stone monument from Tres Zapotes, provided crucial insights into the history of the Olmec civilization, according to Hacker News. The stone, found by the Stirlings, helped establish that the Olmecs were far older than previously believed. The date carved on it, 7.16.6.16.18 in the Mesoamerican Long Count calendar, corresponds to September 3, 32 BC.
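The Long Count is a positional day count (base 20, with one base-18 place), so that conversion can be checked with a few lines of arithmetic. The sketch below assumes the commonly used GMT (Goodman–Martínez–Thompson) correlation constant, 584,283, and a standard Julian Day Number conversion; neither detail appears in the article itself.

```python
# Convert the Stela C Long Count date (7.16.6.16.18) to a Julian-calendar
# date, assuming the GMT correlation (JDN 584,283 = Long Count 0.0.0.0.0).

def long_count_to_days(baktun, katun, tun, uinal, kin):
    """Days elapsed since the Long Count epoch (0.0.0.0.0)."""
    return baktun * 144_000 + katun * 7_200 + tun * 360 + uinal * 20 + kin

def jdn_to_julian(jdn):
    """Julian Day Number -> (year, month, day) in the Julian calendar,
    via the standard integer algorithm. Uses astronomical year numbering,
    so 32 BC appears as -31 (there is no year 0)."""
    e = jdn + 1402
    cycle = (e - 1) // 1461               # completed 4-year Julian cycles
    rem = e - 1461 * cycle                # day within the current cycle
    yr = (rem - 1) // 365 - rem // 1461   # year within the cycle (0-3)
    i = rem - 365 * yr + 30
    q = (80 * i) // 2447
    day = i - (2447 * q) // 80
    month = q + 2 - 12 * (q // 11)
    year = 4 * cycle + yr + (q // 11) - 4716
    return year, month, day

GMT_CORRELATION = 584_283  # Thompson correlation constant

days = long_count_to_days(7, 16, 6, 16, 18)        # 1,125,698 days
print(jdn_to_julian(GMT_CORRELATION + days))       # (-31, 9, 3) -> Sept 3, 32 BC
```

Alternative correlation constants shift the result by a couple of days, which is why published conversions of Stela C sometimes give slightly different September 32 BC dates.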