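Fine-Tuning Llama 2 on a Custom Dataset

Can you make an LLM perform better on your specific task? Yes, you can! In this tutorial, you will learn how to fine-tune Llama 2 on a custom dataset using the QLoRA technique. The training data is a dataset of conversations between customers and support agents on Twitter. The goal is to have the model summarize each conversation, and to compare the generated summaries with the reference summaries in the dataset.

First, we'll install the required libraries. Then we'll pick a dataset and look at a few examples. Next, we'll load and configure the Llama 2 7b base model and fine-tune it. Finally, we'll compare the fine-tuned model with the base Llama 2 model.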

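Why fine-tune an LLM?

Prompting a large language model (LLM) is a quick way to get started and tap into the power of generative AI. Relying on prompts in the long run, however, can cause several problems:

High cost: complex prompts, especially those carrying a lot of context, can add up to a large number of tokens and drive up usage costs.
High latency: long prompts, particularly when chained, can introduce significant delays that hurt the user experience.
Hallucinations: prompt-based approaches may struggle to give concise, truthful answers when the context is insufficient.
Mediocre results: as base models keep improving, the edge that a well-crafted prompt alone provides is shrinking. Outstanding results usually require a model fine-tuned on task-specific data.

If you run into these problems, fine-tuning may be a solution. Other techniques such as vector search, caching, and prompt chaining can address some of them, but fine-tuning is often the most effective and most general option.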

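Benefits of fine-tuning:

Better performance: fine-tuning tailors the model to your needs, which improves results on your task.
Lower cost and latency: a fine-tuned model needs fewer tokens to produce a response, which reduces both cost and latency.
Stronger privacy: fine-tuning on your own data and running your own deployment adds an extra layer of privacy protection.

Fine-tuning also comes with challenges:

Time and resources: fine-tuning is time-consuming and resource-hungry (powerful GPUs in particular), and it involves training, optimization, and evaluation.
Expertise: getting the best results requires know-how in data processing, training, and inference techniques.
Limited contextual knowledge: a fine-tuned model may excel at its specific task yet lack the versatility of closed-source models such as GPT-4.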

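When to fine-tune an LLM?

Consider fine-tuning when prompting no longer meets your needs and you have the resources to do it. It's that simple!

By resources I mean:

Compute (GPUs)
Time and expertise (make sure you understand the whole fine-tuning process)
High-quality data and labels (if you are doing summarization, text extraction, or another task that needs labels)

A base (non-instruction-tuned) LLM can be trained in a supervised way. The process resembles training a traditional deep learning model: you prepare the data, pick a model to fine-tune, and evaluate the results. The main difference is that both the inputs and the outputs are text.

You can also fine-tune an instruction-tuned model, but that requires an additional instruction dataset. The process is similar to fine-tuning a base model, except that you need to use the model's expected prompt format.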

Setup

We'll use some common libraries such as PyTorch and HuggingFace Transformers. On top of those, we need a few extra libraries to fine-tune Llama 2:

!pip install -Uqqq pip --progress-bar off
!pip install -qqq torch==2.0.1 --progress-bar off
!pip install -qqq transformers==4.32.1 --progress-bar off
!pip install -qqq datasets==2.14.4 --progress-bar off
!pip install -qqq peft==0.5.0 --progress-bar off
!pip install -qqq bitsandbytes==0.41.1 --progress-bar off
!pip install -qqq trl==0.7.1 --progress-bar off

The bitsandbytes library lets us load the model in 4-bit precision. The peft library provides the tooling for LoRA. The trl library provides a trainer class (SFTTrainer) that we'll use to fine-tune the model.
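Because these libraries change quickly, it can help to confirm that the pinned versions are the ones actually installed before going further. The following is a minimal sketch using only the standard library; the helper name `find_version_mismatches` is an assumption of this example and not part of any of the packages above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the install commands in this tutorial.
PINNED_VERSIONS = {
    "torch": "2.0.1",
    "transformers": "4.32.1",
    "datasets": "2.14.4",
    "peft": "0.5.0",
    "bitsandbytes": "0.41.1",
    "trl": "0.7.1",
}


def find_version_mismatches(pinned):
    """Return {package: (expected, installed)} for every package that differs."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            # Drop local build tags such as "+cu118" before comparing.
            installed = version(name).split("+")[0]
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in find_version_mismatches(PINNED_VERSIONS).items():
    print(f"{name}: expected {expected}, found {installed}")
```

An empty result means the environment matches the pins; anything printed is worth resolving first, since the training code in this tutorial was written against these versions.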

Next, let's add the required imports:

import json
import re
from pprint import pprint

import pandas as pd
import torch
from datasets import Dataset, load_dataset
from huggingface_hub import notebook_login
from peft import LoraConfig, PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
MODEL_NAME = "meta-llama/Llama-2-7b-hf"

The model we'll use is the 7b version of Llama 2 from Meta AI. It is the base model (not instruction-tuned), since we don't intend to use it in a chat setting.

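Data preprocessing

The dataset consists of conversations between customers and support agents on Twitter. It is provided by Salesforce and is publicly available on the HuggingFace Datasets Hub. It contains 1099 conversations in total: 879 for training, 110 for validation, and 110 for testing. Let's load it:

dataset = load_dataset("Salesforce/dialogstudio", "TweetSumm")
dataset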

DatasetDict({
    train: Dataset({
        features: ['original dialog id', 'new dialog id', 'dialog index',
        'original dialog info', 'log', 'prompt'],
        num_rows: 879
    })
    validation: Dataset({
        features: ['original dialog id', 'new dialog id', 'dialog index',
        'original dialog info', 'log', 'prompt'],
        num_rows: 110
    })
    test: Dataset({
        features: ['original dialog id', 'new dialog id', 'dialog index',
        'original dialog info', 'log', 'prompt'],
        num_rows: 110
    })
})

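A preview of the data is available on the HuggingFace Datasets Hub. We are mainly interested in two fields:

original dialog info: contains the summaries of the conversation.
log: contains the conversation itself.

Next, we'll write a function that extracts this information from a data point.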

def generate_text(data_point):
    summaries = json.loads(data_point["original dialog info"])["summaries"][
        "abstractive_summaries"
    ]
    summary = summaries[0]
    summary = " ".join(summary)

    conversation_text = create_conversation_text(data_point)
    return {
        "conversation": conversation_text,
        "summary": summary,
        "text": generate_training_prompt(conversation_text, summary),
    }

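The summary is extracted from the nested structure of the data point: the first abstractive summary is taken and its sentences are joined with spaces. Here is a sample summary:

Customer enquired about his Iphone and Apple watch which is not showing his any steps/activity and health activities. Agent is asking to move to DM and look into it.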
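To make the nesting concrete, here is a small sketch that runs the same extraction on a hand-made record. The record is hypothetical: it only mimics the layout that generate_text relies on (a JSON string whose abstractive summaries are lists of sentences), and real TweetSumm records contain more fields. The helper name `first_abstractive_summary` is also an assumption of this example.

```python
import json

# Hypothetical record: note that "original dialog info" is a JSON *string*.
toy_point = {
    "original dialog info": json.dumps(
        {
            "summaries": {
                "abstractive_summaries": [
                    ["Customer reports a sync problem.", "Agent asks to move to DM."],
                    ["Customer cannot sync devices.", "Agent requests a DM."],
                ]
            }
        }
    )
}


def first_abstractive_summary(data_point):
    """Parse the JSON string and join the sentences of the first summary."""
    info = json.loads(data_point["original dialog info"])
    sentences = info["summaries"]["abstractive_summaries"][0]
    return " ".join(sentences)


summary = first_abstractive_summary(toy_point)
print(summary)
```

Only the first of the available abstractive summaries is used as the training target; the others are ignored.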

Let's look at the create_conversation_text function:

def create_conversation_text(data_point):
    text = ""
    for item in data_point["log"]:
        user = clean_text(item["user utterance"])
        text += f"user: {user.strip()}\n"

        agent = clean_text(item["system response"])
        text += f"agent: {agent.strip()}\n"

    return text


def clean_text(text):
    text = re.sub(r"http\S+", "", text)
    text = re.sub(r"@[^\s]+", "", text)
    text = re.sub(r"\s+", " ", text)
    return re.sub(r"\^[^ ]+", "", text)

This function puts together the conversation text from the log field of the data point, one line per turn. It also cleans the text by removing URLs, mentions, extra whitespace, and tokens that start with ^. Here is a sample conversation:

user: So neither my iPhone nor my Apple Watch are recording my steps/activity, and Health doesn't recognise either source anymore for some reason. Any ideas? please read the above.
agent: Let's investigate this together. To start, can you tell us the software versions your iPhone and Apple Watch are running currently?
user: My iPhone is on 11.1.2, and my watch is on 4.1.
agent: Thank you. Have you tried restarting both devices since this started happening?
user: I've restarted both, also un-paired then re-paired the watch.
agent: Got it. When did you first notice that the two devices were not talking to each other. Do the two devices communicate through other apps such as Messages?
user: Yes, everything seems fine, it's just Health and activity.
agent: Let's move to DM and look into this a bit more. When reaching out in DM, let us know when this first started happening please. For example, did it start after an update or after installing a certain app?
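To see what each regular expression in clean_text contributes, the sketch below applies the same four substitutions to a made-up tweet. The tweet text and the reading of `^JK` as an agent's sign-off initials are assumptions for illustration, not taken from the dataset.

```python
import re


def clean_text(text):
    text = re.sub(r"http\S+", "", text)   # drop URLs
    text = re.sub(r"@[^\s]+", "", text)   # drop @mentions
    text = re.sub(r"\s+", " ", text)      # collapse runs of whitespace
    return re.sub(r"\^[^ ]+", "", text)   # drop tokens that start with "^"


# Hypothetical raw tweet.
raw = "@AppleSupport my   watch stopped syncing https://t.co/abc123 ^JK"
cleaned = clean_text(raw).strip()
print(cleaned)  # my watch stopped syncing

# Whitespace is collapsed *before* "^" tokens are removed, so a "^" token
# in the middle of a sentence leaves two spaces behind.
print(repr(clean_text("thanks ^JK for waiting")))  # 'thanks  for waiting'
```

The second call shows a small side effect of the ordering. If that matters for your data, running the whitespace substitution last is one possible adjustment.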

The last piece is the function that generates the prompt (the text we'll use during training):

DEFAULT_SYSTEM_PROMPT = """
Below is a conversation between a human and an AI agent. Write a summary of the conversation.
""".strip()


def generate_training_prompt(
    conversation: str, summary: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT
) -> str:
    return f"""### Instruction: {system_prompt}

### Input:
{conversation.strip()}

### Response:
{summary}
""".strip()

We'll use an Alpaca-style prompt format. Here is the prompt for our example:

### Instruction: Below is a conversation between a human and an AI agent. Write a summary of the conversation.

### Input:
user: So neither my iPhone nor my Apple Watch are recording my steps/activity, and Health doesn't recognise either source anymore for some reason. Any ideas? please read the above.
agent: Let's investigate this together. To start, can you tell us the software versions your iPhone and Apple Watch are running currently?
user: My iPhone is on 11.1.2, and my watch is on 4.1.
agent: Thank you. Have you tried restarting both devices since this started happening?
user: I've restarted both, also un-paired then re-paired the watch.
agent: Got it. When did you first notice that the two devices were not talking to each other. Do the two devices communicate through other apps such as Messages?
user: Yes, everything seems fine, it's just Health and activity.
agent: Let's move to DM and look into this a bit more. When reaching out in DM, let us know when this first started happening please. For example, did it start after an update or after installing a certain app?

### Response:
Customer enquired about his Iphone and Apple watch which is not showing his any steps/activity and health activities. Agent is asking to move to DM and look into it.

We can now use the helper functions to process the whole dataset:

def process_dataset(data: Dataset):
    return (
        data.shuffle(seed=42)
        .map(generate_text)
        .remove_columns(
            [
                "original dialog id",
                "new dialog id",
                "dialog index",
                "original dialog info",
                "log",
                "prompt",
            ]
        )
    )

Here we use the datasets library to shuffle the data and apply generate_text to every data point. The function also removes the columns we no longer need. Let's apply it to the training and validation splits:

dataset["train"] = process_dataset(dataset["train"])
dataset["validation"] = process_dataset(dataset["validation"])
dataset

DatasetDict({
    train: Dataset({
        features: ['conversation', 'summary', 'text'],
        num_rows: 879
    })
    validation: Dataset({
        features: ['conversation', 'summary', 'text'],
        num_rows: 110
    })
    test: Dataset({
        features: ['original dialog id', 'new dialog id', 'dialog index', 'original dialog info', 'log', 'prompt'],
        num_rows: 110
    })
})

We'll process the test split later.

Model

We'll use the base 7b version of Llama 2 and load it in 4-bit precision with the bitsandbytes library. First, we need to log in to the HuggingFace Hub (access to the model may require approval):

notebook_login()

Next, we'll write a helper function that loads the model and the tokenizer:

def create_model_and_tokenizer():
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        use_safetensors=True,
        quantization_config=bnb_config,
        trust_remote_code=True,
        device_map="auto",
    )

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "right"

    return model, tokenizer

For the 4-bit quantization we use the 4-bit NormalFloat data type (nf4). We also set use_safetensors so that the weights are loaded from the safetensors format. Next, we'll download the model and the tokenizer:

model, tokenizer = create_model_and_tokenizer()
model.config.use_cache = False
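A quick back-of-the-envelope calculation shows why 4-bit loading matters on a single GPU. This is only a rough sketch: it assumes a nominal 7 billion parameters and counts weight storage alone, ignoring activations, optimizer state, quantization constants, and any layers that are kept in higher precision.

```python
# Nominal parameter count for a "7b" model (assumption; the exact count differs).
N_PARAMS = 7e9


def weight_memory_gb(n_params, bits_per_param):
    """Memory needed to store the weights alone, in gigabytes (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


fp16_gb = weight_memory_gb(N_PARAMS, 16)
nf4_gb = weight_memory_gb(N_PARAMS, 4)

print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{nf4_gb:.1f} GB")
```

Under these assumptions, quantizing to 4 bits cuts weight storage to roughly a quarter of the fp16 footprint. Actual GPU usage during training will be higher than either figure.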

The transformers library integrates well with several quantization libraries. We can inspect the quantization configuration of the loaded model:

model.config.quantization_config.to_dict()

{
    'quant_method': <QuantizationMethod.BITS_AND_BYTES: 'bitsandbytes'>,
    'load_in_8bit': False,
    'load_in_4bit': True,
    'llm_int8_threshold': 6.0,
    'llm_int8_skip_modules': None,
    'llm_int8_enable_fp32_cpu_offload': False,
    'llm_int8_has_fp16_weight': False,
    'bnb_4bit_quant_type': 'nf4',
    'bnb_4bit_use_double_quant': False,
    'bnb_4bit_compute_dtype': 'float16'
}

The last component is the QLoRA configuration:

lora_r = 16
lora_alpha = 64
lora_dropout = 0.1
lora_target_modules = [
    "q_proj",
    "up_proj",
    "o_proj",
    "k_proj",
    "down_proj",
    "gate_proj",
    "v_proj",
]


peft_config = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    target_modules=lora_target_modules,
    bias="none",
    task_type="CAUSAL_LM",
)

We set the rank of the update matrices (r = 16) and the dropout (lora_dropout = 0.1). The low-rank update is scaled by the LoRA alpha parameter (lora_alpha = 64).
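To get a feel for what these settings imply, the sketch below counts the parameters that LoRA adds for the seven target modules and computes the standard LoRA scaling factor alpha / r. The layer shapes (hidden size 4096, MLP size 11008, 32 decoder layers) are assumptions about Llama 2 7b that are not stated in this article; check them against the model config before relying on the numbers.

```python
# Assumed Llama 2 7b shapes (verify against the model's config).
HIDDEN = 4096
INTERMEDIATE = 11008
N_LAYERS = 32

LORA_R = 16
LORA_ALPHA = 64

# (in_features, out_features) of each targeted projection in one decoder layer.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r):
    # LoRA trains two small matrices per module: A (r x in) and B (out x r).
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, LORA_R) for i, o in MODULE_SHAPES.values())
total_trainable = per_layer * N_LAYERS
scaling = LORA_ALPHA / LORA_R  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {total_trainable:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumptions the adapters hold about 40 million trainable parameters, a small fraction of the frozen 7b base model, and the update is multiplied by 4.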

Training

We'll use Tensorboard to monitor the training process. Let's start it:

OUTPUT_DIR = "experiments"

%load_ext tensorboard
%tensorboard --logdir experiments/runs

Next, we'll set the training arguments:

training_arguments = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    logging_steps=1,
    learning_rate=1e-4,
    fp16=True,
    max_grad_norm=0.3,
    num_train_epochs=2,
    evaluation_strategy="steps",
    eval_steps=0.2,
    warmup_ratio=0.05,
    save_strategy="epoch",
    group_by_length=True,
    output_dir=OUTPUT_DIR,
    report_to="tensorboard",
    save_safetensors=True,
    lr_scheduler_type="cosine",
    seed=42,
)

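Most of the settings are self-explanatory. A few are worth pointing out:

The paged_adamw_32bit optimizer, a memory-efficient version of AdamW.
A cosine learning rate scheduler, which adjusts the learning rate over the course of training.
The group_by_length option, which batches together samples of roughly the same length. This reduces padding and can help with training stability.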
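The batch size, gradient accumulation, and eval_steps=0.2 settings together determine how many optimizer steps the run takes and when evaluation happens. The arithmetic below is a sketch that approximates this, assuming a single GPU; the exact rounding inside the Trainer may differ in edge cases.

```python
import math

train_rows = 879                 # size of the training split
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_train_epochs = 2
eval_steps_ratio = 0.2           # a value below 1 is a fraction of total steps
n_devices = 1                    # assumption: one GPU

# Samples consumed per optimizer step.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * n_devices

steps_per_epoch = math.ceil(train_rows / effective_batch)
total_steps = steps_per_epoch * num_train_epochs
eval_every = math.ceil(total_steps * eval_steps_ratio)
eval_points = list(range(eval_every, total_steps + 1, eval_every))

print(f"effective batch size: {effective_batch}")
print(f"total optimizer steps: {total_steps}")
print(f"evaluation at steps: {eval_points}")
```

With these numbers the run takes 110 optimizer steps and evaluates every 22 steps, five times in total.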

The trainer class we'll use comes from the trl library and is a wrapper around the Trainer class from transformers. In addition to the standard training options, we pass peft_config and dataset_text_field. The dataset_text_field option names the dataset field that holds the training prompt.

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=4096,
    tokenizer=tokenizer,
    args=training_arguments,
)

Let's start the training:

trainer.train()

Step	Training Loss	Validation Loss
22	1.906400	1.921726
44	1.823500	1.881039
66	1.677000	1.861916
88	1.774600	1.853609
110	1.646800	1.852111
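To put the logged numbers in perspective, the sketch below converts the validation losses from the table into perplexities and computes the relative drop over the run. It assumes the reported loss is the mean token-level cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the article.

```python
import math

# (step, training loss, validation loss), copied from the table above.
LOG = [
    (22, 1.906400, 1.921726),
    (44, 1.823500, 1.881039),
    (66, 1.677000, 1.861916),
    (88, 1.774600, 1.853609),
    (110, 1.646800, 1.852111),
]

val_losses = [val for _, _, val in LOG]
relative_drop = (val_losses[0] - val_losses[-1]) / val_losses[0]

for step, _, val_loss in LOG:
    # Perplexity is exp(loss) when the loss is a mean cross-entropy in nats.
    print(f"step {step:>3}: val loss {val_loss:.4f}, perplexity {math.exp(val_loss):.2f}")

print(f"relative drop in validation loss: {relative_drop:.1%}")
```

The validation loss falls at every evaluation point, but the total drop is only a few percent and the curve flattens toward the end, which suggests that additional epochs with these settings would bring limited further gains.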

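The same metrics can be followed in Tensorboard. The training loss and the validation loss both decrease over the run, with the validation loss leveling off near 1.85. Now let's save the model:

trainer.save_model()

This saves only the QLoRA adapter weights and the model configuration. You still need to load the original model and tokenizer.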

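Merging the QLoRA adapter with Llama 2 (optional)

You can merge the QLoRA adapter into the original model to get a single model that can be used for inference. Here is how:

from peft import AutoPeftModelForCausalLM

# Load the base model together with the saved adapter weights.
trained_model = AutoPeftModelForCausalLM.from_pretrained(
    OUTPUT_DIR,
    low_cpu_mem_usage=True,
)

# Merge the adapter of the loaded model (trained_model, not model) into the base weights.
merged_model = trained_model.merge_and_unload()
merged_model.save_pretrained("merged_model", safe_serialization=True)
tokenizer.save_pretrained("merged_model")

You can now load the model and the tokenizer from the merged_model directory.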

Evaluation

Let's look at some predictions on the test set. For these, we'll use the generate_prompt function, which builds the same prompt as in training but leaves the response empty:

def generate_prompt(
    conversation: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT
) -> str:
    return f"""### Instruction: {system_prompt}

### Input:
{conversation.strip()}

### Response:
""".strip()

Let's build the examples (summary, conversation, and prompt):

examples = []
for data_point in dataset["test"].select(range(5)):
    summaries = json.loads(data_point["original dialog info"])["summaries"][
        "abstractive_summaries"
    ]
    summary = summaries[0]
    summary = " ".join(summary)
    conversation = create_conversation_text(data_point)
    examples.append(
        {
            "summary": summary,
            "conversation": conversation,
            "prompt": generate_prompt(conversation),
        }
    )
test_df = pd.DataFrame(examples)
test_df

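	summary	conversation	prompt
0	Customer is complaining that the watchlist is …	user: My watchlist is not updating with new ep…	### Instruction: Below is a conversation betwe…
1	Customer is asking about the ACC to link to th…	user: hi , my Acc was linked to an old number.…	### Instruction: Below is a conversation betwe…
2	Customer is complaining about the new updates …	user: the new update ios11 sucks. I can't even…	### Instruction: Below is a conversation betwe…
3	Customer is complaining about the parcel servi…	user: (profanity directed at the parcel service)…	### Instruction: Below is a conversation betwe…
4	Customer says that he is stuck at Staines.	user: Stuck at Staines waiting for…	### Instruction: Below is a conversation betwe…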

Finally, let's add a helper function that generates a summary for a given prompt:

def summarize(model, text: str):
    inputs = tokenizer(text, return_tensors="pt").to(DEVICE)
    inputs_length = len(inputs["input_ids"][0])
    with torch.inference_mode():
        outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.0001)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs_length:], skip_special_tokens=True)

Let's load the base model and the fine-tuned model:

model, tokenizer = create_model_and_tokenizer()
trained_model = PeftModel.from_pretrained(model, OUTPUT_DIR)

Let's look at the first example from the test set:

example = test_df.iloc[0]
print(example.conversation)

user: My watchlist is not updating with new episodes (past couple days). Any idea why?
agent: Apologies for the trouble, Norlene! We're looking into this. In the meantime, try navigating to the season / episode manually.
user: Tried logging out/back in, that didn't help
agent: Sorry! 😔 We assure you that our team is working hard to investigate, and we hope to have a fix ready soon!
user: Thank you! Some shows updated overnight, but others did not...
agent: We definitely understand, Norlene. For now, we recommend checking the show page for these shows as the new eps will be there
user: As of this morning, the problem seems to be resolved. Watchlist updated overnight with all new episodes. Thank you for your attention to this matter! I love Hulu 💚
agent: Awesome! That's what we love to hear. If you happen to need anything else, we'll be here to support! 💚

Here is the summary from the dataset:

print(example.summary)

Original summary

Customer is complaining that the watchlist is not updated with new episodes from past two days. Agent informed that the team is working hard to investigate to show new episodes on page.

We can get a summary from the base Llama 2 model:

summary = summarize(model, example.prompt)
pprint(summary)

Base model summary

('\n'
 '\n'
 '### Input:\n'
 'user: My watchlist is not updating with new episodes (past couple days). Any '
 'idea why?\n'
 "agent: Apologies for the trouble, Norlene! We're looking into this. In the "
 'meantime, try navigating to the season / episode manually.\n'
 "user: Tried logging out/back in, that didn't help\n"
 'agent: Sorry! 😔 We assure you that our team is working hard to investigate, '
 'and we hope to have a fix ready soon!\n'
 'user: Thank you! Some shows updated overnight, but others did not...\n'
 'agent: We definitely understand, Norlene. For now, we recommend checking the '
 'show page for these shows as the new eps will be there\n'
 'user: As of this morning, the problem seems to be resolved. Watchlist '
 'updated overnight with all new episodes. Thank you for your attention to '
 'this matter! I love Hulu 💚\n'
 "agent: Awesome! That's what we love to hear. If you happen to need anything "
 "else, we'll be here to support! 💚\n"
 '\n'
 '### Output:\n'
 '\n'
 '### Input:\n'
 'user: My watchlist')

This result is not good: the base model mostly repeats the conversation. Let's see what the fine-tuned model outputs:

summary = summarize(trained_model, example.prompt)
pprint(summary)

Fine-tuned model summary

('\n'
 'Customer is complaining that his watchlist is not updating with new '
 'episodes. Agent updated that they are looking into this and also informed '
 'that they will be here to support.\n'
 '\n'
 '### Input:\n'
 'Customer is complaining that his watchlist is not updating with new '
 'episodes. Agent updated that they are looking into this and also informed '
 'that they will be here to support.\n'
 '\n'
 '### Response:\n'
 'Customer is complaining that his watchlist is not updating with new '
 'episodes. Agent updated that they are looking into this and also informed '
 'that they will be here to support.\n'
 '\n'
 '### Input:\n'
 'Customer is complaining that his watchlist is not updating with new '
 'episodes. Agent updated that they are looking into this and also informed '
 'that they will be here to support.\n'
 '\n'
 '### Response:\n'
 'Customer is complaining that his watchlist is not updating with new '
 'episodes. Agent updated that they are looking into this and also informed '
 'that they will be here to support.\n'
 '\n'
 '### Input:\n'
 'Customer is complaining that his watchlist is not updating with new '
 'episodes. Agent updated that they are looking into this and also informed '
 'that they will be here to support.\n'
 '\n'
 '### Response:\n'
 'Customer is complaining that his watchlist is')

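This is an improvement, but the model keeps generating after the summary until it hits the token limit. Let's keep only the first line of the output:

pprint(summary.strip().split("\n")[0])

Fine-tuned model summary (cleaned)

Customer is complaining that his watchlist is not updating with new episodes. Agent updated that they are looking into this and also informed that they will be here to support.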
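Taking the first line works when the summary fits on one line. A slightly more defensive variant is to cut the output at the first `###` section marker, since that is where the model starts repeating the prompt template. The helper below is a sketch of that idea; the name `extract_summary` is an assumption of this example and not part of the original code.

```python
def extract_summary(generated: str) -> str:
    """Keep only the text before the first '###' section marker."""
    head = generated.split("###", 1)[0]
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join(head.split())


# Shortened version of the fine-tuned model output shown above.
raw_output = (
    "\n"
    "Customer is complaining that his watchlist is not updating with new "
    "episodes. Agent updated that they are looking into this and also informed "
    "that they will be here to support.\n"
    "\n"
    "### Input:\n"
    "Customer is complaining that his watchlist is"
)

print(extract_summary(raw_output))

# An output that starts with a section marker yields an empty summary.
print(repr(extract_summary("\n\n### Input:\nuser: My watchlist")))
```

An empty result is a useful signal in itself: it flags cases, like the base model output above, where no summary was produced before the template was repeated.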

This looks much better and is a good summary. Let's try the next example:

example = test_df.iloc[1]
print(example.conversation)

user: hi , my Acc was linked to an old number. Now I'm asked to verify my Acc , where a code / call wil be sent to my old number. Any way that I can link my Acc to my current number? Pls help
agent: Hi there, we are here to help. We will have a specialist contact you about changing your phone number. Thank you.
user: Thanks. Hope to get in touch soon
agent: That is no problem. Please let us know if you have any further questions in the meantime.
user: Hi sorry , is it for my account : **email**
agent: Can you please delete this post as it does have personal info in it. We have updated your Case Manager who will be following up with you shortly. Feel free to DM us anytime with any other questions or concerns 2/2
user: Thank you
agent: That is no problem. Please do not hesitate to contact us with any further questions. Thank you.

Original summary

Customer is asking about the ACC to link to the current number. Agent says that they have updated their case manager.

The original summary is very concise. Let's see what the base model produces:

Base model summary

('\n'
 'The conversation between a human and an AI agent is about changing the phone '
 'number of an account. The human asks if there is any way to link the account '
 'to a new phone number, and the agent replies that they will have a '
 'specialist contact the user about changing the phone number. The human '
 'thanks the agent and hopes to get in touch soon. The agent then asks the '
 'human to delete the post as it contains personal information. The human '
 'replies that they will delete the post. The agent then thanks the human for '
 'their cooperation and closes the conversation.\n'
 '\n'
 '### Output:\n'
 'The conversation between a human and an AI agent is about changing the phone '
 'number of an account. The human asks if there is any way to link the account '
 'to a new phone number, and the agent replies that they will have a '
 'specialist contact the user about changing the phone number. The human '
 'thanks the agent and hopes to get in touch soon. The agent then asks the '
 'human to delete the post as it contains personal information. The human '
 'replies that they will delete the post. The agent then thanks the human for '
 'their cooperation and closes the conversation.\n'
 '\n'
 '### Output:\n'
 'The conversation between a human and an AI agent is')

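This is much better than the base model's output for the first example, but it is still long. Let's see what the fine-tuned model produces:

Fine-tuned model summary (cleaned)

Customer is asking to link his account to his current number. Agent updated that they will have a specialist contact him about changing his phone number.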

Short and to the point. One last example:

example = test_df.iloc[2]
print(example.conversation)

user: the new update ios11 sucks. I can't even use some apps on my phone.
agent: We want your iPhone to work properly, and we are here for you. Which apps are giving you trouble, and which iPhone?
user: 6s. Words with friends Words pro
agent: Do you see app updates in App Store &gt; Updates? Also, are you using iOS 11.0.3?
user: I am using 11.0.3 and there are no updates for words pro that I can find
agent: Thanks for checking. Next, what happens in that app that makes it unusable?
user: It's says it's not compatible.
agent: Thanks for confirming this. Send us a DM and we'll work from there:

Original summary

Customer is complaining about the new updates IOS11 and can't even use some apps on phone. Agent asks to send a DM and work from there URL.

Again, let's look at the base model summary:

Base model summary

('\n'
 '\n'
 '### Input:\n'
 "user: the new update ios11 sucks. I can't even use some apps on my phone.\n"
 'agent: We want your iPhone to work properly, and we are here for you. Which '
 'apps are giving you trouble, and which iPhone?\n'
 'user: 6s. Words with friends Words pro\n'
 'agent: Do you see app updates in App Store &gt; Updates? Also, are you using '
 'iOS 11.0.3?\n'
 'user: I am using 11.0.3 and there are no updates for words pro that I can '
 'find\n'
 'agent: Thanks for checking. Next, what happens in that app that makes it '
 'unusable?\n'
 "user: It's says it's not compatible.\n"
 "agent: Thanks for confirming this. Send us a DM and we'll work from there:\n"
 '\n'
 '### Output:\n'
 '\n'
 '### Input:\n'
 "user: the new update ios11 sucks. I can't even use some apps on my phone.\n"
 'agent: We want your iPhone to work properly, and we are here for you. Which '
 'apps are giving you trouble, and which iPhone?\n'
 'user: 6s. W')

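It is basically a copy of the conversation. Let's see what the fine-tuned model gives us:

Fine-tuned model summary (cleaned)

Customer is complaining about the new update ios11 sucks. Agent updated to send a DM and they will work from there.

I really like this summary and find it better than the original one. It is short and captures the main idea of the conversation: the customer's complaint that the iOS 11 update is bad.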
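The comparisons above are made by reading the summaries side by side. If you want a rough number to go with that impression, one simple option is word overlap between the generated summary and the reference. The sketch below computes a unigram-overlap F1 score using only the standard library. It is a simplified stand-in in the spirit of ROUGE-1, not the metric used in this article, and the function names are assumptions of this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_f1(reference, candidate):
    """Harmonic mean of unigram precision and recall against the reference."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    # Multiset intersection: each word counts at most as often as in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Reference and fine-tuned summaries from the second example above.
reference = (
    "Customer is asking about the ACC to link to the current number. "
    "Agent says that they have updated their case manager."
)
fine_tuned = (
    "Customer is asking to link his account to his current number. Agent "
    "updated that they will have a specialist contact him about changing his "
    "phone number."
)

score = unigram_f1(reference, fine_tuned)
print(f"unigram F1: {score:.2f}")
```

Word overlap rewards shared vocabulary rather than meaning, so treat the score as a coarse signal. Averaging it over the whole test split, for both the base and the fine-tuned model, would give a fairer picture than a handful of hand-picked examples.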

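Conclusion

Fine-tuning Llama 2 gave us a working way to generate conversation summaries. On the test examples shown above, the fine-tuned model produces summaries that are shorter, tighter, and more to the point than those of the base model. I'd say the fine-tuning was successful for our specific use case.

Original article: https://www.mlexpert.io/blog/fine-tuning-llama-2-on-custom-dataset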
