LM_Cocktail
Make fine-tuning of language models akin to crafting a nuanced cocktail. For more details, please refer to our paper: LM-Cocktail.
Introduction
The core of LM-Cocktail Tuning is to merge multiple models so that the merged model inherits the strengths of each. The following are some application scenarios (a minimal sketch of the underlying parameter averaging follows the list):
1. Address the Problem of Catastrophic Forgetting
Fine-tuning a base language model can severely degrade the model's general capabilities beyond the targeted domain. By mixing the fine-tuned model with the base model (using the function mix_models), LM-Cocktail can significantly enhance performance on the downstream task while maintaining performance on other, unrelated tasks.
2. Improve the Performance on a New Task without Fine-tuning
LM-Cocktail can improve accuracy on a new task without requiring any fine-tuning. Given a few example data points (e.g., five examples), the function mix_models_with_data can automatically generate a task-specific model by merging existing language models (from the open-source community, or pre-existing for other tasks).
3. Approximate Multitask Learning or Model Ensembling
By amalgamating multiple expert models, mix_models can also approximate multitask learning. You can also boost performance on a downstream task by combining both functions: use five examples to merge the other models for the target task (mix_models_with_data), and then merge the result with the model fine-tuned on the target task (mix_models).
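Under the hood, merging is essentially a weighted average of the models' parameters. Below is a minimal sketch of that idea only, not the library's actual implementation (mix_models additionally handles model loading details and saving the merged checkpoint); average_state_dicts is a hypothetical helper name:

from transformers import AutoModelForCausalLM

def average_state_dicts(model_paths, weights):
    # Weighted average of parameter tensors; assumes all models share one architecture.
    assert abs(sum(weights) - 1.0) < 1e-6, "weights should sum to 1"
    merged = None
    for path, w in zip(model_paths, weights):
        state = AutoModelForCausalLM.from_pretrained(path).state_dict()
        if merged is None:
            merged = {k: w * v.float() for k, v in state.items()}
        else:
            for k, v in state.items():
                merged[k] += w * v.float()
    return merged  # load into a model with model.load_state_dict(merged)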
Usage
We recommend installing the latest version from source:
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding/LM_Cocktail
pip install -e .
Or install via pip:
pip install -U LM_Cocktail
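Either way, a quick smoke test is that the two entry points used throughout this page import cleanly:

# Smoke test: these are the two functions used in all examples below.
from LM_Cocktail import mix_models, mix_models_with_data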
1. Mix models
1.1. Mix fine-tuned model and base model
Mix the fine-tuned model and the base model to avoid Catastrophic Forgetting after fine-tuning:
from LM_Cocktail import mix_models, mix_models_with_data

# Mix LLMs and save the result to output_path: ./mixed_model_1
model = mix_models(
    model_names_or_paths=["meta-llama/Llama-2-7b-chat-hf", "Shitao/llama2-ag-news"],
    model_type='decoder',
    weights=[0.7, 0.3],
    output_path='./mixed_model_1')
# You can select the weights to trade off between generality and expertise.

# Mix embedding models
model = mix_models(
    model_names_or_paths=["BAAI/bge-base-en-v1.5", "Shitao/bge-hotpotqa"],
    model_type='encoder',
    weights=[0.5, 0.5],
    output_path=None)
# Mix reranker models
model = mix_models(
    model_names_or_paths=["BAAI/bge-reranker-base", "BAAI/bge-reranker-base"],
    model_type='reranker',
    weights=[0.5, 0.5],
    output_path="./mixed_reranker")
1.2. Mix multiple models
from LM_Cocktail import mix_models, mix_models_with_data

model = mix_models(
    model_names_or_paths=["meta-llama/Llama-2-7b-chat-hf", "Shitao/llama2-ag-news", "Shitao/llama2-nq", "Shitao/llama2-mnli"],
    model_type='decoder',
    weights=[0.4, 0.2, 0.3, 0.1])
# The sum of the weights should equal 1.
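If your raw importance scores do not already sum to 1, normalize them first; a trivial helper (not part of the library):

def normalize_weights(raw):
    # Scale the scores so they sum to 1, as mix_models expects.
    total = sum(raw)
    return [r / total for r in raw]

print(normalize_weights([4, 2, 3, 1]))  # [0.4, 0.2, 0.3, 0.1]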
2. Mix models with weights computed from a few examples
LM-Cocktail can merge multiple models based on a few example data points. This can be used to produce a model for a new task without any training, or to boost performance on a downstream task by leveraging multiple models.
- For LLMs
The format of example_data for LLMs is a list, where each item is a dict like:
{"input": str, "output": str}
LM-Cocktail will compute the loss on the output.
You can use the example data to merge models:
from LM_Cocktail import mix_models, mix_models_with_data

example_data = [
    {"input": "Question: when was the last time anyone was on the moon? Answer:\n", "output": "14 December 1972 UTC"},
    {"input": "Review: \"it 's a charming and often affecting journey . \" Is this movie review sentence negative or positive?\n", "output": "Positive"}
]

model = mix_models_with_data(
    model_names_or_paths=["meta-llama/Llama-2-7b-chat-hf", "Shitao/llama2-ag-news", "Shitao/llama2-nq"],
    model_type='decoder',
    example_ata=example_data,
    temperature=5.0)
# You can set the temperature argument to adjust the distribution of mixing weights.
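As we read the paper, each candidate model's mixing weight is a softmax over its negated loss on the example data, scaled by the temperature: a lower loss yields a larger weight, and a higher temperature flattens the distribution. A minimal sketch of that rule, with made-up loss values:

import math

def losses_to_weights(losses, temperature=5.0):
    # Softmax over negative losses: models that fit the examples better get larger weights.
    scores = [math.exp(-l / temperature) for l in losses]
    total = sum(scores)
    return [s / total for s in scores]

# The model with the lowest loss (1.4) receives the largest mixing weight.
print(losses_to_weights([2.1, 1.4, 3.0]))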
- For Embedder
The format of example_data for embedding models is a list, where each item is a dict like:
{"query": str, "pos": List[str], "neg": List[str]}
where pos is a list of positive texts and neg is a list of negative texts. LM-Cocktail will compute the contrastive loss.
You can use the example data to merge models:
from LM_Cocktail import mix_models, mix_models_with_data

example_data = [
    {"query": "How does one become an actor in the Telugu Film Industry?", "pos": [" How do I become an actor in Telugu film industry?"], "neg": [" What is the story of Moses and Ramesses?", " Does caste system affect economic growth of India?"]},
    {"query": "Why do some computer programmers develop amazing software or new concepts, while some are stuck with basic programming work?", "pos": [" Why do some computer programmers develops amazing softwares or new concepts, while some are stuck with basics programming works?"], "neg": [" When visiting a friend, do you ever think about what would happen if you did something wildly inappropriate like punch them or destroy their furniture?", " What is the difference between a compliment and flirting?"]}
]

model = mix_models_with_data(
    model_names_or_paths=["BAAI/bge-base-en-v1.5", "Shitao/bge-hotpotqa", "Shitao/bge-quora"],
    model_type='encoder',
    example_ata=example_data,
    temperature=5.0,
    max_input_length=512,
    neg_number=2)
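For intuition, the contrastive loss for one example is an InfoNCE-style objective: the query embedding should score higher against its positives than against its negatives. A minimal sketch with one positive per query; the 0.05 temperature here is our arbitrary choice and is unrelated to the mixing temperature above:

import torch
import torch.nn.functional as F

def info_nce_loss(query_emb, pos_emb, neg_embs, temperature=0.05):
    # query_emb, pos_emb: [d]; neg_embs: [N, d]; all assumed L2-normalized.
    scores = torch.cat([(query_emb @ pos_emb).unsqueeze(0), neg_embs @ query_emb]) / temperature
    return F.cross_entropy(scores.unsqueeze(0), torch.tensor([0]))  # the positive sits at index 0

# Toy usage with random vectors standing in for encoder outputs.
q = F.normalize(torch.randn(768), dim=0)
pos = F.normalize(torch.randn(768), dim=0)
negs = F.normalize(torch.randn(2, 768), dim=1)
print(info_nce_loss(q, pos, negs))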
Performance
For detailed results, please refer to our paper: LM-Cocktail.
- LM-Cocktail for Catastrophic Forgetting
Model | Target Task | Others (29 tasks) |
---|---|---|
Llama | 40.8 | 46.8 |
Fine-tuned | 94.4 | 38.6 |
LM-Cocktail (2 models) [1] | 94.5 | 47.7 |
LM-Cocktail (10 models) [2] | 94.4 | 48.3 |
[1]: merges 2 models: the fine-tuned model and the base model
[2]: merges 10 models: the fine-tuned model, the base model, and 8 models fine-tuned on other tasks
Model | Target Task | Other Tasks (14 tasks) |
---|---|---|
BGE | 71.8 | 49.8 |
Fine-tuned | 76.0 | 48.5 |
LM-Cocktail (2 models) | 74.8 | 50.0 |
LM-Cocktail (10 models) | 74.7 | 50.6 |
- LM-Cocktail for new tasks
Model | MMLU (57 tasks) |
---|---|
Llama | 45.9 |
Llama-5shot | 46.7 |
LM-Cocktail (10 models) | 48.0 |
Model | Retrieval (12 tasks) |
---|---|
BGE | 47.3 |
LM-Cocktail (10 models) | 48.8 |
Evaluation
1. Reproduce the results for LLMs
- Models: we fine-tuned meta-llama/Llama-2-7b-chat-hf on 9 tasks; you can find the fine-tuned models at this link. Note that most of the fine-tuned models perform poorly on other, unrelated tasks.
- Example data for the datasets from FLAN: ./llm_examples.json
- MMLU dataset: https://huggingface.co/datasets/cais/mmlu (use the examples in the dev set for in-context learning)
You can use these models and our code to produce a new model and evaluate its performance using the llm-embedder script as follows:
# for 30 tasks from FLAN
torchrun --nproc_per_node 8 -m evaluation.eval_icl \
--retrieval_method no \
--few_shot 0 \
--data_root /data/llm-embedder \
--model_name_or_path ./mixed_model_1
# for MMLU datasets
torchrun --nproc_per_node 8 -m evaluation.eval_mmlu \
--retrieval_method no \
--few_shot 0 \
--data_root /data/llm-embedder \
--model_name_or_path ./mixed_model_2
2. Reproduce the results for the embedding model
- Models: we fine-tuned bge-base-en-v1.5 on 9 tasks; you can find the fine-tuned models at this link.
- Example data: ./embedder_examples.json
Use the MTEB script to evaluate the mixed embedding model:
python eval_MTEB.py --model_name_or_path mixed_model --task_type Retrieval
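If you prefer calling MTEB directly instead of the repository script, here is a sketch under the assumption that the merged encoder loads as a sentence-transformers model (as BGE checkpoints do); the task name is an arbitrary example:

from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Load the merged encoder (assumed sentence-transformers compatible).
model = SentenceTransformer("mixed_model")

# Evaluate on a single retrieval task as an illustration.
evaluation = MTEB(tasks=["NFCorpus"])
evaluation.run(model, output_folder="results/mixed_model")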
Acknowledgement
Llama is fine-tuned using the FastChat scripts. Fine-tuning datasets are from sentence-transformers/embedding-training-data and intfloat/llm-retriever-tasks.
Citation
If you find this repository useful, please consider giving it a star :star: and a citation:
@misc{cocktail,
      title={LM-Cocktail: Resilient Tuning of Language Models via Model Merging},
      author={Shitao Xiao and Zheng Liu and Peitian Zhang and Xingrun Xing},
      year={2023},
      eprint={2311.13534},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}