LLM-Blender Tutorial

2024-09-18 09:28:57 Author: 彭桢灵Jeremy

[ACL2023] We introduce LLM-Blender, an innovative ensembling framework that attains consistently superior performance by leveraging the diverse strengths of multiple open-source LLMs. LLM-Blender cuts the weaknesses through ranking and integrates the strengths through fusing generation to enhance the capability of LLMs.

Project address: https://gitcode.com/gh_mirrors/ll/LLM-Blender

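1. Project Introduction

LLM-Blender is an ensembling framework that aims for consistently strong performance by combining the diverse strengths of multiple open-source large language models (LLMs). The project was developed by Dongfu Jiang, Xiang Ren, and Bill Yuchen Lin and was published at ACL 2023.

The core idea of LLM-Blender is to improve LLM output quality with two modules:

PairRanker: uses a specialized pairwise comparison method to distinguish subtle differences between candidate outputs.
GenFuser: merges the top-ranked candidates selected by PairRanker to generate an improved output.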
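
To make the ranking step concrete, here is a minimal sketch of how pairwise comparisons can be aggregated into a ranking by counting wins. It is not LLM-Blender's implementation: `rank_by_pairwise_wins` and `toy_prefers` are hypothetical names, and the length-based comparator merely stands in for a learned model such as PairRanker.

```python
from itertools import combinations
from typing import Callable, List


def rank_by_pairwise_wins(
    candidates: List[str], prefers: Callable[[str, str], bool]
) -> List[int]:
    """Return candidate indices sorted from most to fewest pairwise wins."""
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        if prefers(candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    # sorted() is stable, so tied candidates keep their original order.
    return sorted(range(len(candidates)), key=lambda k: -wins[k])


def toy_prefers(a: str, b: str) -> bool:
    # Hypothetical stand-in for a learned comparator: prefer the longer answer.
    return len(a) >= len(b)


candidates = ["ok", "a detailed answer", "short one"]
order = rank_by_pairwise_wins(candidates, toy_prefers)
# The top-k candidates are what a fusion step would receive as input.
top_k = [candidates[i] for i in order[:2]]
print(order, top_k)
```

Comparing every pair costs a number of comparisons quadratic in the number of candidates, which is one reason a fusion step typically only looks at the few top-ranked outputs.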

2. Quick Start

Installation

First, install LLM-Blender with pip:

pip install llm-blender

Alternatively, clone the repository from GitHub and install it in editable mode:

git clone https://github.com/yuchenlin/LLM-Blender.git
cd LLM-Blender
pip install -e .

Usage Example

The following simple example shows how to load PairRanker and run a pairwise comparison:

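import llm_blender

# Initialize the Blender
blender = llm_blender.Blender()

# Load the PairRanker model
blender.loadranker("llm-blender/PairRM")

# Define the inputs and the two candidate outputs for each input
inputs = ["hello, how are you?", "I love you"]
candidates_A = ["hi", "I hate you"]
candidates_B = ["f**k off", "I love you too"]

# Run the pairwise comparison: one result per input,
# indicating whether candidates_A[i] is judged better than candidates_B[i]
comparison_results = blender.compare(inputs, candidates_A, candidates_B)

print(comparison_results)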
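
One common way to use such per-example results is to summarize them as a win rate of one set of candidates over the other. The sketch below assumes the comparison yields one boolean per input, where True means candidate A was preferred; `win_rate` is a hypothetical helper and the values are made up for illustration, not real model output.

```python
from typing import Sequence


def win_rate(a_is_better: Sequence[bool]) -> float:
    """Fraction of comparisons in which candidate A was preferred."""
    if len(a_is_better) == 0:
        raise ValueError("need at least one comparison")
    return sum(bool(x) for x in a_is_better) / len(a_is_better)


# Made-up comparison results, not output from PairRM.
example_results = [True, False, True, True]
rate = win_rate(example_results)
print(f"A preferred in {rate:.0%} of comparisons")
```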

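3. Use Cases and Best Practices

Case 1: Best-of-N Sampling

Best-of-N sampling improves LLM response quality by sampling several responses for the same input and re-ranking them. The following example applies it to the Zephyr-7b model: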

import llm_blender
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta", device_map="auto")

# Initialize the Blender and load PairRM as the ranker
blender = llm_blender.Blender()
blender.loadranker("llm-blender/PairRM")

# Define the inputs
inputs = ["can you tell me a joke about OpenAI?"]

# Run best-of-N sampling: sample n=10 candidates per input and keep the best-ranked one
outputs = blender.best_of_n_generate(model, tokenizer, inputs, n=10)

print(outputs)
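
The idea behind best-of-N can be sketched without any model at all. The code below is an illustration only, not how `best_of_n_generate` is implemented: `best_of_n`, `toy_sample`, and `toy_prefers` are hypothetical names, the sampler is deterministic, and the comparator simply prefers longer text. It selects the winner with a single linear pass of pairwise comparisons, which is one simple option among several.

```python
from typing import Callable


def best_of_n(
    prompt: str,
    sample: Callable[[str, int], str],
    prefers: Callable[[str, str, str], bool],
    n: int = 4,
) -> str:
    """Sample n candidates and keep the one that survives pairwise comparison."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample(prompt, i) for i in range(n)]
    best = candidates[0]
    for challenger in candidates[1:]:
        # Replace the current best only if the comparator prefers the challenger.
        if prefers(prompt, challenger, best):
            best = challenger
    return best


def toy_sample(prompt: str, seed: int) -> str:
    # Hypothetical stand-in for sampling from an LLM.
    return f"{prompt} -> draft " + "!" * seed


def toy_prefers(prompt: str, a: str, b: str) -> bool:
    # Hypothetical stand-in for a pairwise reward model: prefer longer text.
    return len(a) > len(b)


result = best_of_n("joke", toy_sample, toy_prefers, n=3)
print(result)
```

A linear pass needs only n - 1 comparisons, at the cost of depending on the comparator being reasonably consistent across pairs.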

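Case 2: Direct Preference Optimization (DPO)

PairRM's pairwise comparisons naturally support DPO (Direct Preference Optimization), a method that optimizes a model directly from pairwise preference signals. The following example loads PairRM through Hugging Face and computes the comparison logits that can supply such a preference signal: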

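import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

from llm_blender.pair_ranker.pairrm import DebertaV2PairRM
from transformers import AutoTokenizer

# Load the PairRM model and its tokenizer
pairrm = DebertaV2PairRM.from_pretrained("llm-blender/PairRM-hf", device_map="cuda:0").eval()
tokenizer = AutoTokenizer.from_pretrained("llm-blender/PairRM-hf")

# Define the inputs and candidate outputs
inputs = ["hello", "I love you"]
candidates_A = ["hi", "I hate you"]
candidates_B = ["f**k off", "I love you too"]

# Hugging Face tokenizers have no encode_pair method, so the pair encoding is built by hand.
# The prefix tokens and length limits below are assumed to match PairRM's input format.
def tokenize_pair(sources, cands1, cands2, source_max_length=1224, candidate_max_length=412):
    max_length = source_max_length + 2 * candidate_max_length
    ids = []
    for src, c1, c2 in zip(sources, cands1, cands2):
        src_ids = tokenizer.encode("<|source|>" + src, max_length=source_max_length, truncation=True)
        cand_len = (max_length - len(src_ids)) // 2
        c1_ids = tokenizer.encode("<|candidate1|>" + c1, max_length=cand_len, truncation=True)
        c2_ids = tokenizer.encode("<|candidate2|>" + c2, max_length=cand_len, truncation=True)
        ids.append(src_ids + c1_ids + c2_ids)
    return tokenizer.pad({"input_ids": ids}, return_tensors="pt", padding="max_length", max_length=max_length)

# Run the pairwise comparison on the same device as the model
encodings = tokenize_pair(inputs, candidates_A, candidates_B)
encodings = {k: v.to(pairrm.device) for k, v in encodings.items()}
outputs = pairrm(**encodings)

print(outputs.logits)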
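
To connect these logits to DPO, each comparison can be turned into a preference record. The sketch below assumes that a positive logit means candidate A is preferred over candidate B; `build_preference_pairs` is a hypothetical helper, the `prompt`/`chosen`/`rejected` field names follow a convention common in DPO tooling rather than anything required by LLM-Blender, and the logit values are made up.

```python
from typing import Dict, List, Sequence


def build_preference_pairs(
    prompts: Sequence[str],
    cands_a: Sequence[str],
    cands_b: Sequence[str],
    logits: Sequence[float],
) -> List[Dict[str, str]]:
    """Turn pairwise comparison logits into chosen/rejected records."""
    if not (len(prompts) == len(cands_a) == len(cands_b) == len(logits)):
        raise ValueError("all inputs must have the same length")
    records = []
    for prompt, a, b, logit in zip(prompts, cands_a, cands_b, logits):
        # Assumption: logit > 0 means candidate A beats candidate B.
        chosen, rejected = (a, b) if logit > 0 else (b, a)
        records.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return records


# Made-up logits for illustration, not real PairRM output.
example_logits = [1.7, -2.3]
pairs = build_preference_pairs(
    ["hello", "I love you"],
    ["hi", "I hate you"],
    ["go away", "I love you too"],
    example_logits,
)
print(pairs)
```

Records of this shape can then be passed to a DPO training pipeline of your choice.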