The Biggest Deal in AI
It's not ChatGPT - it's "open source" LLaMA. In this short post, we'll take a quick look at the mind-boggling pace of developments.
Wherever you go, the talk of the town is ChatGPT. However, the biggest deal in AI is not ChatGPT, but the open-source model LLaMA. To understand why, let's revisit what happened in the past half year when the world changed overnight.
For most of 2022, people were quite relaxed about AI. Progress in machine learning was fast, for sure, but there was no sign on the horizon that our view of AI was about to change overnight.
On November 30, 2022, OpenAI released ChatGPT to the public. Within days, the world realized what had just happened. The large language model (LLM) underlying ChatGPT, a model called GPT-3.5, was responding to users' queries at a human level that many thought was years, if not decades away. Because it could converse on any topic, it raised the specter of artificial general intelligence, or AGI - the holy grail of AI.
In the days that followed, ChatGPT sent shockwaves through the world - first to the tech industry ("Google is done"), then to society ("AGI is around the corner - are we doomed?"). Over the weeks that followed, key weaknesses of the model became apparent. In particular, while ChatGPT was well-trained to behave nicely, it still generated a lot of nonsense and could be easily fooled. This "hallucination" problem was considered a major issue. Fine, people said, the conversational power here is impressive, but it's not that good. For example, ask it for scientific papers on a topic, and it invents plausible-sounding references that don't even exist!
On March 14, 2023, OpenAI released GPT-4, a substantial improvement over GPT-3.5. Whereas GPT-3.5 impressively passed some exams but failed at others, GPT-4 passed them all with flying colors. The hallucination problem wasn't entirely gone, but it was substantially less frequent. GPT-4 generated useful scientific references, along with links and short summaries. It excelled at almost every challenge that GPT-3.5 had failed at. Once again, the world was stunned: that much progress in that little time?
ChatGPT became the fastest-growing consumer product in history, reaching over 100 million users in just two months. Along with the implications that AI smarter than humans was perhaps just around the corner, ChatGPT quickly became the dominant topic in every discussion. Wherever you went, the topic was ChatGPT.
The big deal
ChatGPT was a big deal, no doubt. However, something else happened that was a much bigger deal. Competitors like Google and Meta tried to demonstrate that they could do the same thing. First, Google released its competitor model, Bard. In February 2023, Meta launched LLaMA, a family of relatively small language models (7 to 65 billion parameters), and open-sourced its code. The code itself, however, is not the interesting part. What makes an LLM tick are its weights, i.e., the values of its (generally billions of) parameters.
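To put "billions of parameters" in perspective, here is a rough back-of-the-envelope sketch. It assumes LLaMA-7B-like shapes and a simplified decoder-only transformer (biases, normalization weights, and other small terms are omitted), so the count is approximate rather than exact:

```python
def param_count(vocab: int, d_model: int, n_layers: int, d_ff: int) -> int:
    """Approximate weight count for a LLaMA-style decoder-only transformer."""
    emb = 2 * vocab * d_model     # token embedding + output projection
    attn = 4 * d_model * d_model  # Q, K, V, and output projection matrices
    ffn = 3 * d_model * d_ff      # SwiGLU feed-forward uses three matrices
    return emb + n_layers * (attn + ffn)

# LLaMA-7B-like configuration: 32k vocabulary, width 4096,
# 32 layers, feed-forward width 11008
print(f"{param_count(32_000, 4_096, 32, 11_008):,}")  # roughly 6.7 billion
```

The matrix shapes dominate the total, which is why such a simplified count lands close to the advertised "7B": the weights, not the few hundred lines of model code, are where all the training investment lives.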
On March 3, 2023, the LLaMA weights were leaked to the public, kicking off an incredibly rapid innovation cycle. By March 12, the model was running on a Raspberry Pi, and on March 13, Stanford released Alpaca, which added instruction fine-tuning to LLaMA. By March 19, Vicuna, a model with just 13 billion parameters, reached roughly 90% of the quality of ChatGPT and Bard (as judged by GPT-4), trained on open data for about $300. On March 28, open-source GPT-3 clones were trained, and LLaMA-Adapter introduced multimodal training (e.g., including images) with fine-tuning that takes just one hour. By April 3 - one month after the weight leak - Berkeley's Koala model was almost indistinguishable from ChatGPT.
All of this happened within a single month. Today, you can tune a very powerful LLM at extremely low cost and run it on consumer hardware. Granted, nobody seems to have quite matched the consistency of GPT-4 yet, and who knows what OpenAI is cooking in its kitchen. But the pace of development is just mind-boggling, even if you've gotten used to the field's dynamics over the past six months. Breakthroughs now seem to arrive weekly, if not daily.
What enabled these developments was the "open-sourcing" of LLaMA. That's why it's the biggest deal in AI: when we look back on this crazy period, it deserves to be remembered as the LLaMA moment. Ironically, nobody is likely to call it that, because LLaMA merely kickstarted the process and won't be needed anymore in the future.
(In a fascinating development, the CEOs of OpenAI, Alphabet/Google, and Microsoft were invited to the White House to discuss AI - but not Meta's. While Meta presumably did not intend for the LLaMA weights to leak, the company is nevertheless highly competitive in this space and clearly runs the dominant communication platforms. Being concerned about what AI does to misinformation while not talking to Meta strikes me as a rather shortsighted move.)