MEDITRON-70B: a new truly open AI model for medicine
My colleagues at EPFL just released an open medical LLM that shows significant performance gains over state-of-the-art baselines on several medical benchmarks.
Medical large language models (LLMs) are a game changer for health care. Imagine a medical AI that can understand and use the world's combined medical knowledge, helping to answer medical questions quickly and cheaply.
Today, access to medical knowledge remains challenging. Although a vast amount of medical information exists, finding and understanding it is difficult. And of course, nobody has time to read it all, a problem that will only be exacerbated as the rate of new medical knowledge generation increases exponentially.
Enter medical LLMs. The MEDITRON team at EPFL used a fully open source pipeline to pre-train a model on 48.1B tokens from a number of datasets, including:
Clinical Guidelines: a new dataset of 46K clinical practice guidelines from various healthcare-related sources
Paper Abstracts: openly available abstracts from 16.1M closed-access PubMed and PubMed Central papers
Medical Papers: full-text articles extracted from 5M publicly available PubMed and PubMed Central papers
This model not only achieves a 6% absolute performance gain over the best public baseline but also shows strong performance against closed-source LLMs. Remarkably, MEDITRON-70B outperforms GPT-3.5 and Med-PaLM (with 540B parameters), and is approaching the performance of GPT-4 and Med-PaLM-2, the two leading large, closed commercial models.
This is a great first step towards open medical AI. Making these models (at both 7B and 70B scale), the tools required for curating the training corpus, and the distributed training library available as an open resource not only ensures access for real-world evaluation but also enables further fine-tuning and the development of instruction-based models, among other efforts.
Safe and trustworthy AI
EPFL has published a very nice news article on the launch of MEDITRON, which also touches on next steps. Of course, models like these need to be rigorously evaluated in properly designed studies before they can be used safely and effectively in practical health care. This matters, because safety is a key goal of any responsible AI development.
Beyond safety, there's the issue of trust. Having the entire pipeline developed openly is a critical step, and this is what makes MEDITRON such an impressive initiative. Anyone can examine the model, the data, and the whole training process to spot anything amiss, and then work on improving it.
I’d love to see much more activity like this. As I wrote last week, I believe that public universities like EPFL have a major role to play in the development of truly open AI, especially as society cannot rely on private actors to do so (for understandable reasons).
CODA
This is a newsletter with two subscription types. I highly recommend switching to the paid version. While all content will remain free, I will donate all financial support, starting in 2024, to the EPFL AI Center.
To stay in touch, here are other ways to find me:
Writing: I write another Substack on digital developments in health, called Digital Epidemiology.