Merge Large Language Models with mergekit

Author: Murphy | View: 23054 | Time: 2025-03-22 23:27:14

Model merging is a technique that combines two or more LLMs into a single model. It's a relatively new and experimental method for creating new models cheaply (no GPU required). Model merging works surprisingly well and has produced many state-of-the-art models on the Open LLM Leaderboard.

In this tutorial, we will implement it using the mergekit library. More specifically, we will review four merge methods and provide example configurations. Then, we will use mergekit to create our own model, Marcoro14-7B-slerp, which became the best-performing model on the Open LLM Leaderboard (02/01/24).
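To make the idea of merging concrete before getting to mergekit itself, here is a minimal sketch of spherical linear interpolation (SLERP), the method the "slerp" suffix in the model name refers to. It runs on plain NumPy arrays standing in for model weights. This is not mergekit's implementation: the function names `slerp` and `merge_state_dicts`, the `eps` threshold, and the toy weight dictionaries are all assumptions made for illustration.

```python
import numpy as np


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two weight tensors of the same shape.

    t=0 returns v0, t=1 returns v1. Illustrative sketch only, not mergekit's code.
    """
    a = np.asarray(v0, dtype=np.float64).ravel()
    b = np.asarray(v1, dtype=np.float64).ravel()

    # Angle between the two flattened weight vectors.
    a_unit = a / (np.linalg.norm(a) + eps)
    b_unit = b / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    theta = np.arccos(dot)
    sin_theta = np.sin(theta)

    if sin_theta < eps:
        # Vectors are (almost) collinear: fall back to linear interpolation.
        out = (1.0 - t) * a + t * b
    else:
        out = (np.sin((1.0 - t) * theta) * a + np.sin(t * theta) * b) / sin_theta

    return out.reshape(np.shape(v0))


def merge_state_dicts(sd_a, sd_b, t=0.5):
    """Merge two hypothetical {name: array} weight dicts tensor by tensor."""
    return {name: slerp(t, sd_a[name], sd_b[name]) for name in sd_a}


# Toy "models": two tiny weight dictionaries with matching names and shapes.
model_a = {"layer.weight": np.array([[1.0, 0.0], [0.0, 1.0]])}
model_b = {"layer.weight": np.array([[0.0, 1.0], [1.0, 0.0]])}
merged = merge_state_dicts(model_a, model_b, t=0.5)
```

Everything here is elementwise arithmetic on arrays that fit in memory, which is why a merge of this kind can run on CPU. A real merge operates on full checkpoints and, as discussed later, can use different interpolation factors for different layers.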

The code is available on GitHub and Google Colab. I recommend using my automated notebook to easily run mergekit.
