Recently, Databricks, the data lakehouse and AI company, announced the launch of Dolly, a cheap-to-build large language model (LLM) that exhibits a surprising degree of the instruction-following capabilities of the renowned ChatGPT. Using Databricks, any business can take an off-the-shelf open source LLM and give it ChatGPT-like instruction-following ability by training it for 30 minutes on a single machine, using high-quality training data.
Why are companies turning to open models?
There are reasons why a company would prefer to build its own model rather than sending data to a centralized LLM provider that serves a proprietary model behind an API. For many, the problems and datasets most likely to benefit from AI represent their most sensitive and proprietary intellectual property, and handing it over to a third party may be unpalatable.
Companies may also have different tradeoffs in terms of model quality, cost, and desired behavior. Databricks believes that ML users are best served in the long term by directly controlling and owning their models. The release of Dolly is the first in a series of announcements Databricks is making that focus on helping organizations harness the power of large language models.
What does Dolly offer?
Dolly works by taking an existing open source 6 billion parameter model from EleutherAI and modifying it ever so slightly, using data from Alpaca, to elicit instruction-following capabilities such as brainstorming and text generation that are not present in the original model.
The model underlying Dolly has only 6 billion parameters, compared to 175 billion in GPT-3, and is two years old, making it particularly surprising that it works so well. This suggests that much of the qualitative gain in state-of-the-art models like ChatGPT may owe to focused corpora of instruction-following training data, rather than to larger or better-tuned base models.
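To make the role of that instruction data concrete, here is a minimal sketch of how Alpaca-style records (an instruction, an optional input, and a response) could be rendered into single training strings for supervised fine-tuning. The templates mirror the published Alpaca prompt format; the helper name `format_record` and the example record are illustrative, not part of the Dolly release.

```python
# Hypothetical sketch: flattening Alpaca-style instruction records into
# training prompts. Templates follow the published Alpaca prompt format;
# the example record below is made up for illustration.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{response}"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def format_record(record: dict) -> str:
    """Render one instruction-following record as a single training string."""
    if record.get("input"):
        return PROMPT_WITH_INPUT.format(**record)
    return PROMPT_NO_INPUT.format(instruction=record["instruction"],
                                  response=record["response"])

example = {"instruction": "Suggest three names for a pet llama.",
           "input": "",
           "response": "Dolly, Fuzz Aldrin, Llamar."}
print(format_record(example))
```

Strings like these, tokenized and fed through an ordinary causal language modeling objective, are what teach a base model to follow the `### Instruction:` / `### Response:` pattern.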
Databricks evaluated Dolly on the instruction-following capabilities described in the InstructGPT paper that ChatGPT is based on, and found that it exhibits many of them, including brainstorming, text generation, and open Q&A.
“We are choosing to call the model Dolly — after Dolly the sheep, the first cloned mammal — because it’s an open-source clone of an Alpaca, the animal, inspired by a LLaMA,” said Ali Ghodsi, Co-founder and Chief Executive Officer at Databricks, explaining the origin of the name.
“We’re in the earliest days of the democratization of AI for the enterprise, and much work remains to be done, but we believe the technology underlying Dolly represents an exciting new opportunity for companies that want to cheaply build their own instruction-following models.”
Disclaimer: Generative AI is an emerging technology, and research into how to address bias, offensive and toxic responses, and hallucinations in LLMs is still in its early stages. Dolly may sometimes exhibit this behavior. Make no mistake, Databricks is committed to continuing to advance the quality and safety of Dolly, and hopes that open-sourcing the technology will accelerate the necessary improvements.