How to Use Few-Shot Examples

Posted by Brian on Friday, August 1, 2025

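Preface

To keep the examples easy to run and the learning curve low, the embedding model used here is bge-m3:latest, served locally with Ollama. If your machine cannot run local models, you will need a provider that offers an embedding model API instead, for example through OpenAIEmbeddings.

Building a few-shot prompt takes the following steps: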

Create the prompt template
from langchain_core.prompts import PromptTemplate

example_prompt = PromptTemplate.from_template("question: {question} \n {answer}")

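The template uses two placeholders, question and answer, which are filled in from each example.

Create the examples

Each example should be a dictionary whose keys include the placeholders defined above.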

examples = [
    {
        "question": "Who lived longer, Muhammad Ali or Alan Turing?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
""",
    },
    {
        "question": "When was the founder of craigslist born?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
""",
    },
    {
        "question": "Who was the maternal grandfather of George Washington?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
""",
    },
    {
        "question": "Are both the directors of Jaws and Casino Royale from the same country?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
""",
    },
]
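
Because each example must provide every placeholder in the template, it can help to check the keys before building the prompt. The following is a minimal sketch using only the Python standard library; the helper names placeholder_names and missing_keys are hypothetical and are not part of LangChain.

```python
from string import Formatter

TEMPLATE = "question: {question} \n {answer}"


def placeholder_names(template):
    """Return the set of {placeholder} names used in a format string."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_keys(example, template):
    """Return the placeholders that the example dict does not provide."""
    return placeholder_names(template) - set(example)


good = {"question": "Q?", "answer": "A."}
bad = {"question": "Q?"}  # no "answer" key

print(missing_keys(good, TEMPLATE))  # set()
print(missing_keys(bad, TEMPLATE))   # {'answer'}
```

An empty set means the example can be formatted by the template without a missing-key error.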

Now let's use one of the examples to check that the prompt template works.

print(example_prompt.invoke(examples[0]).to_string())
question: Who lived longer, Muhammad Ali or Alan Turing? 
 
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali

The result is as expected: the placeholders defined earlier were replaced successfully.

Use FewShotPromptTemplate
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

# Print the first example on its own again as a sanity check.
print(example_prompt.invoke(examples[0]).to_string())

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    suffix="Question: {input}",
    input_variables=["input"],
)

res = prompt.invoke({"input": "Who was the father of mary ball washington?"})

print(res.to_string())

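FewShotPromptTemplate parameters:

  • examples: the examples to include in the prompt
  • example_prompt: the prompt template used to format each example
  • suffix: the text placed after the examples; here it holds the question we ask
  • input_variables: the names of the variables supplied by the user

As the output below shows, the final prompt lists our examples first and appends the question at the end. (The first block is the output of the standalone print call; the few-shot prompt itself starts with the second block.) Giving the model examples like these can guide it toward better answers and can also help reduce hallucinations.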

question: Who lived longer, Muhammad Ali or Alan Turing? 
 
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali

question: Who lived longer, Muhammad Ali or Alan Turing? 
 
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali


question: When was the founder of craigslist born? 
 
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952


question: Who was the maternal grandfather of George Washington? 
 
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball


question: Are both the directors of Jaws and Casino Royale from the same country? 
 
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No


Question: Who was the father of mary ball washington?
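
To make the assembly step concrete, here is a simplified sketch in plain Python of what the output above reflects: each example is formatted with the example template, the results are joined with a blank line, and the formatted suffix goes last. The function build_few_shot_prompt and the demo data are hypothetical; this approximates the behavior and is not LangChain's actual implementation.

```python
def build_few_shot_prompt(examples, example_template, suffix,
                          separator="\n\n", **inputs):
    """Format every example, then append the formatted suffix."""
    formatted = [example_template.format(**ex) for ex in examples]
    pieces = formatted + [suffix.format(**inputs)]
    return separator.join(pieces)


demo_examples = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is 3 + 3?", "answer": "6"},
]

prompt_text = build_few_shot_prompt(
    demo_examples,
    example_template="question: {question} \n {answer}",
    suffix="Question: {input}",
    input="What is 5 + 5?",
)
print(prompt_text)
```

The examples always come first and the user's question always comes last, which is the ordering seen in the real output.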
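Example selector

The prompt from the previous step can already be sent to the model. However, if you have too many examples (which may exceed the context limit), or you want to pick the relevant examples automatically, you need an example selector. The one used here is LangChain's SemanticSimilarityExampleSelector, which selects a small number of examples that are similar to the input and so acts as a filter. Under the hood, it uses an embedding model to compute the similarity between the input and the few-shot examples, and a vector store to run the nearest-neighbor search. So make sure you have an embedding model available to call.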
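
The idea of embedding plus nearest-neighbor search can be sketched without any model. The example below is an illustration only: toy_embed is a hypothetical stand-in that counts words and is not a real embedding model such as bge-m3, and the sketch embeds only the question field, which is a simplification.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: lowercase word counts (NOT a real embedding model)."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(query, examples, k=1):
    """Return the k examples whose question is most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(
        examples,
        key=lambda ex: cosine(q, toy_embed(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


toy_examples = [
    {"question": "Who lived longer, Muhammad Ali or Alan Turing?"},
    {"question": "When was the founder of craigslist born?"},
    {"question": "Who was the maternal grandfather of George Washington?"},
    {"question": "Are both the directors of Jaws and Casino Royale from the same country?"},
]

best = select_examples("Who was the father of Mary Ball Washington?", toy_examples, k=1)
print(best[0]["question"])
```

A real embedding model captures meaning rather than word overlap, but the selection logic has the same shape: embed the input, score each example, keep the top k.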

ollama pull bge-m3
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(
    model="bge-m3:latest",
)

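example_selector = SemanticSimilarityExampleSelector.from_examples(
    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings for measuring semantic similarity.
    embeddings,
    # The VectorStore class used to store the embeddings and run the similarity search.
    Chroma,
    # The number of examples to select.
    k=1,
)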
# Select the most similar example to the input.
question = "Who was the father of Mary Ball Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: {question}")
for example in selected_examples:
    print("\n")
    for k, v in example.items():
        print(f"{k}: {v}")
Examples most similar to the input: Who was the father of Mary Ball Washington?


answer: 
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball

question: Who was the maternal grandfather of George Washington?

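The output shows that when we ask "Who was the father of Mary Ball Washington?", the selector returns the example whose question is "Who was the maternal grandfather of George Washington?". Now let's update the FewShotPromptTemplate to use the selector.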

prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    suffix="Question: {input}",
    input_variables=["input"],
)

res = prompt.invoke({"input": question})
print(res.to_string())

Here the examples parameter is replaced with example_selector, so the selector decides which example is used to build the final prompt.

question: Who was the maternal grandfather of George Washington? 
 
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball


Question: Who was the father of Mary Ball Washington?
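
The hand-off between the prompt and the selector can be sketched as follows: the prompt asks the selector for examples at call time, passing the user's input, and formats only what comes back. WordOverlapSelector and render_prompt are hypothetical names for illustration, and word overlap is a stand-in for the semantic similarity that the real selector computes.

```python
class WordOverlapSelector:
    """Hypothetical selector: ranks examples by words shared with the input."""

    def __init__(self, examples, k=1):
        self.examples = examples
        self.k = k

    def select_examples(self, input_variables):
        # Join all input values into one query string (a simplification).
        query = set(" ".join(input_variables.values()).lower().split())

        def overlap(example):
            return len(query & set(example["question"].lower().split()))

        return sorted(self.examples, key=overlap, reverse=True)[: self.k]


def render_prompt(selector, example_template, suffix, **inputs):
    """Ask the selector for examples at call time, then assemble the prompt."""
    chosen = selector.select_examples(inputs)
    parts = [example_template.format(**ex) for ex in chosen]
    parts.append(suffix.format(**inputs))
    return "\n\n".join(parts)


selector = WordOverlapSelector(
    [
        {"question": "capital of France", "answer": "Paris"},
        {"question": "largest planet", "answer": "Jupiter"},
    ],
    k=1,
)
text = render_prompt(
    selector,
    "question: {question} \n {answer}",
    "Question: {input}",
    input="What is the capital of Italy?",
)
print(text)
```

Because selection happens per call, different inputs can produce prompts with different examples while the template stays the same.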

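That completes the walkthrough: defining a prompt template, creating examples, and using a selector to pick the examples that match the input.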