Preface
To make the examples easy to run and keep the barrier to entry low, I will use bge-m3:latest as the embedding model, served by a locally deployed Ollama instance. If your machine cannot run local models, you can instead connect to a vendor that provides an embedding API, such as OpenAIEmbeddings.
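If you go with a hosted API instead, a minimal sketch might look like the following. This is not part of the original steps: it assumes the langchain-openai package is installed and that OPENAI_API_KEY is set in your environment, and text-embedding-3-small is just one possible model name.
from langchain_openai import OpenAIEmbeddings

# Hosted alternative to the local bge-m3 model; requires OPENAI_API_KEY in the environment.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Quick sanity check: embed one sentence and look at the vector dimension.
print(len(embeddings.embed_query("hello world")))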
Building a few-shot prompt involves the following steps:
Create the prompt template
from langchain_core.prompts import PromptTemplate
example_prompt = PromptTemplate.from_template("question: {question} \n {answer}")
Here we use two placeholders, question and answer, which will be filled in by each example.
Create the examples
Each example should be a dictionary whose keys match the placeholders defined above.
examples = [
    {
        "question": "Who lived longer, Muhammad Ali or Alan Turing?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
""",
    },
    {
        "question": "When was the founder of craigslist born?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
""",
    },
    {
        "question": "Who was the maternal grandfather of George Washington?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
""",
    },
    {
        "question": "Are both the directors of Jaws and Casino Royale from the same country?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
""",
    },
]
Now let's use one of the examples to check that the prompt template works as expected.
print(example_prompt.invoke(examples[0]).to_string())
question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
The result is what we expected: the placeholders defined earlier were filled in correctly.
Use FewShotPromptTemplate
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    suffix="Question: {input}",
    input_variables=["input"],
)

res = prompt.invoke({"input": "Who was the father of mary ball washington?"})
print(res.to_string())
FewShotPromptTemplate parameters:
- examples: the list of examples to include
- example_prompt: the prompt template used to format each example
- suffix: the text appended after the examples; this is where the user's question is inserted at the end
- input_variables: the names of the variables supplied by the user's input
As the output below shows, the examples come first and the question we asked is appended at the very end. Providing the model with examples like these guides it toward better answers and also helps reduce hallucinations.
question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
question: When was the founder of craigslist born?
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
question: Are both the directors of Jaws and Casino Royale from the same country?
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
Question: Who was the father of mary ball washington?
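The prompt built above can now be sent to a model. Below is a minimal sketch, not part of the original steps, assuming Ollama is running locally and a chat model has already been pulled; qwen2.5 is just a placeholder for whatever chat model you have available.
from langchain_ollama import ChatOllama

# Placeholder chat model; swap in any model you have pulled with `ollama pull`.
llm = ChatOllama(model="qwen2.5")

# Pipe the few-shot prompt into the model and ask the same question.
chain = prompt | llm
answer = chain.invoke({"input": "Who was the father of mary ball washington?"})
print(answer.content)
With the four examples prepended, the model is nudged to follow the same follow-up-question format before giving its final answer.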
Example selectors
The result from the previous step could already be sent to a model. However, if you have too many examples (so many that they might exceed the context window), or you want to intelligently pick only the examples relevant to the input, you need an example selector. In LangChain this is SemanticSimilarityExampleSelector, a class that selects a small number of examples related to the input based on semantic similarity, effectively acting as a filter. Under the hood it uses an embedding model to compute the similarity between the input and the few-shot examples, and a vector store to perform nearest-neighbor search, so you need an embedding model available to call.
ollama pull bge-m3
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(
    model="bge-m3:latest",
)

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # The examples to choose from.
    examples,
    # The embedding class used to produce embeddings for measuring semantic similarity.
    embeddings,
    # The VectorStore class used to store the embeddings and run the similarity search.
    Chroma,
    # The number of examples to select.
    k=1,
)
# Select the most similar example to the input.
question = "Who was the father of Mary Ball Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: {question}")
for example in selected_examples:
    print("\n")
    for k, v in example.items():
        print(f"{k}: {v}")
Examples most similar to the input: Who was the father of Mary Ball Washington?
answer:
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
question: Who was the maternal grandfather of George Washington?
As the output shows, when we ask "Who was the father of Mary Ball Washington?", the selector returns the example whose question is "Who was the maternal grandfather of George Washington?", the most semantically similar one. Now let's rework the FewShotPromptTemplate accordingly.
prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    suffix="Question: {input}",
    input_variables=["input"],
)
res = prompt.invoke({"input": question})
print(res.to_string())
Here we replaced the examples parameter with example_selector, letting the selector decide which examples are used to build the final prompt.
question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
Question: Who was the father of Mary Ball Washington?
At this point we have covered how to define a prompt template, provide examples, and use a selector to pick the examples that best match the input.
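As a recap, here is a compact sketch that ties everything together: the selector-based few-shot prompt piped into a local chat model. It reuses the examples list defined earlier, and qwen2.5 is again only a placeholder model name.
from langchain_chroma import Chroma
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate
from langchain_ollama import ChatOllama, OllamaEmbeddings

# Format for a single example.
example_prompt = PromptTemplate.from_template("question: {question} \n {answer}")

# Pick the single most similar example for each input (reuses the `examples` list above).
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,
    OllamaEmbeddings(model="bge-m3:latest"),
    Chroma,
    k=1,
)

prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    suffix="Question: {input}",
    input_variables=["input"],
)

# qwen2.5 is only a placeholder; use any chat model you have pulled in Ollama.
llm = ChatOllama(model="qwen2.5")
chain = prompt | llm

print(chain.invoke({"input": "Who was the father of Mary Ball Washington?"}).content)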