The world of artificial intelligence (AI) is evolving at a breakneck pace, and at the forefront of this revolution is Stability AI, an organization dedicated to making foundational AI technology accessible to all. Stability AI recently announced the release of StableLM, a new open-source language model available in two Alpha versions, with 3 billion and 7 billion parameters. The company also plans to release larger models ranging from 15 billion to 65 billion parameters.

StableLM builds on the success of Stability AI's previous release, Stable Diffusion, a groundbreaking image model that provides a transparent, open, and scalable alternative to proprietary AI systems. The new language model is designed to generate text and code, powering a wide range of downstream applications. Despite their small size (3 billion and 7 billion parameters), the StableLM models deliver high performance with appropriate training, a testament to their efficiency.

A key aspect of StableLM's development is its open-source foundation. Stability AI has collaborated with EleutherAI, a nonprofit research hub, on earlier language models such as GPT-J, GPT-NeoX, and the Pythia suite, all trained on The Pile, an open-source dataset. The ongoing success of open-source language models such as Cerebras-GPT and Dolly-2 can be traced back to these collaborative efforts.

StableLM, however, is trained on a new experimental dataset that is three times larger than The Pile, containing a staggering 1.5 trillion tokens of content. The company will release more details about this dataset in the near future. The vastness and richness of this dataset contribute to StableLM's impressive performance in conversational and coding tasks, even when compared to larger models like GPT-3, which boasts 175 billion parameters.

In addition to the base models, Stability AI is also releasing a series of research models that are instruction fine-tuned. These models utilize a combination of five recent open-source datasets for conversational agents: Alpaca, GPT4All, Dolly, ShareGPT, and HH. The fine-tuned models are intended for research purposes only and are released under a noncommercial CC BY-NC-SA 4.0 license, aligning with Stanford’s Alpaca license.
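For context on how these instruction-tuned checkpoints are used: per the Stability-AI/StableLM repository, prompts for the tuned models are framed with special `<|SYSTEM|>`, `<|USER|>`, and `<|ASSISTANT|>` turn tokens. A minimal sketch of building such a prompt follows (the system text here is illustrative, not the official wording):

```python
# Sketch: building a prompt for the instruction-tuned StableLM Alpha models.
# The <|SYSTEM|>/<|USER|>/<|ASSISTANT|> turn tokens follow the format shown
# in the Stability-AI/StableLM repository; the system text is illustrative.

SYSTEM_PROMPT = (
    "<|SYSTEM|># StableLM Tuned (Alpha version)\n"
    "- StableLM is a helpful and harmless open-source AI language model.\n"
    "- StableLM will refuse to participate in anything that could harm a human.\n"
)

def build_prompt(user_query: str) -> str:
    """Wrap a user query in the tuned-model turn format."""
    return f"{SYSTEM_PROMPT}<|USER|>{user_query}<|ASSISTANT|>"

prompt = build_prompt("Write a haiku about open-source AI.")
print(prompt)
```

The model's completion is then everything it generates after the final `<|ASSISTANT|>` token.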

Developers can freely inspect, use, and adapt the StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA 4.0 license.
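As a sketch of what that use might look like in practice, the checkpoints are published on the Hugging Face Hub, and loading one with the `transformers` library would look roughly like the following (the `stabilityai/stablelm-base-alpha-7b` repo ID is an assumption based on the release naming):

```python
# Sketch: generating text with a StableLM base model via Hugging Face
# transformers. Assumes the hub ID "stabilityai/stablelm-base-alpha-7b"
# (per the Alpha release naming) and that `pip install transformers torch`
# has been run; the weights are several GB and download on first use.

MODEL_ID = "stabilityai/stablelm-base-alpha-7b"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    # Imports are local so the sketch can be read without transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example (downloads the checkpoint on first run):
# print(generate("Open-source language models matter because"))
```

Because these are base models rather than chat models, prompts work best as text to be continued rather than as instructions.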

In its announcement post, "Stability AI Launches the First of its StableLM Suite of Language Models," the company highlights how small, efficient models can generate high-performing text and code locally on personal devices, driving innovation and opening up new economic opportunities while supporting transparency and accessibility.

Stability AI has promised to publish a full technical report in the near future. The company will also kick off a crowd-sourced RLHF program and work with community efforts such as Open Assistant to create an open-source dataset for AI assistants.

The code and model checkpoints are available in the Stability-AI/StableLM repository on GitHub.

As AI continues to advance, open-source initiatives like Stability AI's StableLM are poised to play a crucial role in shaping the future of the field. By making cutting-edge AI technology accessible to all, Stability AI is empowering developers, researchers, and organizations to harness the power of AI to drive innovation and progress.

We research, curate, and publish daily updates from the field of AI. A paid subscription gives you access to paid articles, a platform to build your own generative AI tools, invitations to closed events, and open-source tools.
Consider becoming a paying subscriber to get the latest!