Introduction to guidance
This notebook is a terse tutorial walkthrough of the syntax of guidance.
Models
At the core of any guidance program are the immutable model objects. You can create an initial model object using any of the constructors under guidance.models:
[1]:
from guidance import models
# For LlamaCpp, you need to provide the path on disk to a .gguf model
# A sample model can be downloaded from
# https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/blob/main/mistral-7b-instruct-v0.2.Q8_0.gguf
mistral = models.LlamaCpp("/home/scottlundberg_google_com/models/mistral-7b-instruct-v0.2.Q8_0.gguf", n_gpu_layers=-1, n_ctx=4096)
#llama2 = models.Transformers("meta-llama/Llama-2-7b-hf")
#gpt3 = models.OpenAI("text-davinci-003")
#palm2 = models.VertexAI("text-bison@001")
Simple generation
Once you have an initial model object you can append text to it with the addition operator. This creates a new model object that has the same context (prompt) as the original model, but with the text appended at the end (just as if you had concatenated two strings).
[2]:
lm = mistral + "Who won the last Kentucky derby and by how much?"
Who won the last Kentucky derby and by how much?
Once you have added some text to the model you can ask it to generate unconstrained text using the gen guidance function. Guidance functions are executable components that can be appended to a model; when you append a guidance function to a model, the model extends its state by executing the function.
[3]:
from guidance import gen
lm + gen(max_tokens=10)
[3]:
Who won the last Kentucky derby and by how much? The last Kentucky Derby was held on
Note that while the lm and mistral objects are semantically separate, for performance purposes they share the same model weights and KV cache, so incrementally creating new lm objects is very cheap and reuses all the computation from prior objects.
We can add the text and the gen function in one statement to follow the traditional prompt-then-generate pattern:
[4]:
mistral + '''\
Q: Who won the last Kentucky derby and by how much?
A:''' + gen(stop="Q:")
[4]:
Q: Who won the last Kentucky derby and by how much? A: The last Kentucky Derby was held on May 1, 2021, and the winner was Medina Spirit, ridden by jockey John Velazquez. Medina Spirit won by 0.5 lengths over Mandaloun. However, it's important to note that Medina Spirit failed a drug test and the results of the race are under investigation. Therefore, the official winner may change.
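The stop="Q:" argument above ends generation as soon as the model emits the stop string, which keeps the model from continuing into a new "Q:" turn. A minimal sketch of that truncation semantics (an analogy, not guidance's internals, which stop at the token level during decoding):

```python
# Sketch of stop-string semantics: cut the generated text at the first
# occurrence of the stop string, keeping only the text before it.
def apply_stop(text: str, stop: str) -> str:
    i = text.find(stop)
    return text if i == -1 else text[:i]

print(apply_stop("The winner was Medina Spirit. Q: next question", "Q:"))
# only the answer portion survives
```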
Simple templates
You can define a template in guidance
(v0.1+) using f-strings. You can interpolate both standard variables and also guidance functions. Note that in Python 3.12 you can put anything into f-string slots, but in python 3.11 and below there are a few disallowed characters (like backslash).
[5]:
query = "Who won the last Kentucky derby and by how much?"
mistral + f'''\
Q: {query}
A: {gen(stop="Q:")}'''
[5]:
Q: Who won the last Kentucky derby and by how much? A: The last Kentucky Derby was held on May 1, 2021, and the winner was Medina Spirit, ridden by jockey John Velazquez. Medina Spirit won by 0.5 lengths over Mandaloun. However, it's important to note that Medina Spirit failed a drug test and the results of the race are under investigation. Therefore, the official winner may change.
Capturing variables
Often when you are building a guidance program you will want to capture specific portions of the output generated by the model. You can do this by giving a name to the element you wish to capture.
[6]:
query = "Who won the last Kentucky derby and by how much?"
lm = mistral + f'''\
Q: {query}
A: {gen(name="answer", stop="Q:")}'''
Q: Who won the last Kentucky derby and by how much? A: The last Kentucky Derby was held on May 1, 2021, and the winner was Medina Spirit, ridden by jockey John Velazquez. Medina Spirit won by 0.5 lengths over Mandaloun. However, it's important to note that Medina Spirit failed a drug test and the results of the race are under investigation. Therefore, the official winner may change.
Then we can access the variable by indexing into the final model object.
[7]:
lm["answer"]
[7]:
"The last Kentucky Derby was held on May 1, 2021, and the winner was Medina Spirit, ridden by jockey John Velazquez. Medina Spirit won by 0.5 lengths over Mandaloun. However, it's important to note that Medina Spirit failed a drug test and the results of the race are under investigation. Therefore, the official winner may change."
Function encapsulation
When you have a set of model operations you want to group together, you can place them into a custom guidance function. To do this you define a decorated Python function that takes a model as the first positional argument and returns a new updated model. You can add this guidance function to a model to execute it, just like with built-in guidance functions such as gen.
[8]:
import guidance

@guidance
def qa_bot(lm, query):
    lm += f'''\
    Q: {query}
    A: {gen(name="answer", stop="Q:")}'''
    return lm

query = "Who won the last Kentucky derby and by how much?"
mistral + qa_bot(query) # note we don't pass the `lm` arg here (that will get passed during execution when it gets added to the model)
[8]:
Q: Who won the last Kentucky derby and by how much? A: The last Kentucky Derby was held on May 1, 2021, and the winner was Medina Spirit, ridden by jockey John Velazquez. Medina Spirit won by 0.5 lengths over Mandaloun. However, it's important to note that Medina Spirit failed a drug test and the results of the race are under investigation. Therefore, the official winner may change.
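The reason we can call qa_bot(query) without an lm argument is a deferred-call pattern: calling the decorated function returns a stub, and the model is supplied later, at the moment the stub is added. Here is a toy sketch of that pattern (not guidance's actual decorator, which also handles grammars and streaming):

```python
# Toy sketch of the deferred-call pattern: the decorated function, when
# called WITHOUT a model, returns a stub that waits for the model to arrive.
def toy_guidance(fn):
    def wrapper(**kwargs):
        # lm is intentionally missing here; it is bound at "addition" time
        return lambda lm: fn(lm, **kwargs)
    return wrapper

@toy_guidance
def qa(lm, query):
    return lm + f"Q: {query}\nA:"

stub = qa(query="hi")      # no lm passed; nothing executes yet
result = stub("PROMPT\n")  # the "model" supplies itself when added
```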
Note that one atypical feature of guidance functions is that multi-line string literals defined inside a guidance function respect the Python indentation structure. This means the whitespace before "Q:" and "A:" in the prompt above is stripped (but if they were indented 6 spaces instead of 4, only the first 4 spaces would be stripped, since that is the current Python indentation level). This lets us define multi-line templates inside guidance functions while retaining indentation readability (if you ever want to disable this behavior you can use @guidance(dedent=False)).
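This dedenting is similar in spirit to the standard library's textwrap.dedent (an analogy for the behavior, not guidance's exact implementation, which uses the function's indentation level rather than the literal's common prefix):

```python
import textwrap

# The template as it appears inside an indented function body:
template = '''\
    Q: {query}
    A:'''

# textwrap.dedent strips the common leading whitespace, which is what you
# want the model to actually see as the prompt.
prompt = textwrap.dedent(template)
print(prompt)
```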
Selecting among alternatives
Guidance has lots of ways to constrain model generation, but the most basic building block is the select function, which forces the model to choose among a set of options (either strings or full grammars).
[9]:
from guidance import select
mistral + f'''\
Q: {query}
Now I will choose to either SEARCH the web or RESPOND.
Choice: {select(["SEARCH", "RESPOND"], name="choice")}'''
[9]:
Q: Who won the last Kentucky derby and by how much?
Now I will choose to either SEARCH the web or RESPOND.
Choice: SEARCH
Note that since guidance is smart about when tokens are forced by the program (and so do not need to be predicted by the model), only one token was generated in the program above (the beginning of "SEARCH", which is highlighted in green).
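The forcing logic can be sketched in pure Python: once the generated prefix matches exactly one of the remaining options, the rest of that option is determined and no further model calls are needed for it. (This is an illustration of the idea; guidance does this at the token level against a grammar.)

```python
# Sketch: given the options and the text generated so far, return the
# remainder that is forced (needs no model prediction), or None if the
# choice is still ambiguous.
def remaining_after(options, generated):
    matches = [o for o in options if o.startswith(generated)]
    if len(matches) == 1:
        return matches[0][len(generated):]  # forced completion
    return None  # more than one option still possible

# After the model emits just "SE", "SEARCH" is the only consistent option,
# so "ARCH" is forced:
print(remaining_after(["SEARCH", "RESPOND"], "SE"))
```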
Interleaved generation and control
Because guidance is pure Python code you can interleave (constrained) generation commands with traditional python control statements. In the example below we first ask the model to decide if it should search the web or respond directly, then act accordingly.
[10]:
@guidance
def qa_bot(lm, query):
    lm += f'''\
    Q: {query}
    Now I will choose to either SEARCH the web or RESPOND.
    Choice: {select(["SEARCH", "RESPOND"], name="choice")}
    '''
    if lm["choice"] == "SEARCH":
        lm += "A: I don't know, Google it!"
    else:
        lm += f'A: {gen(stop="Q:", name="answer")}'
    return lm

mistral + qa_bot(query)
[10]:
Q: Who won the last Kentucky derby and by how much?
Now I will choose to either SEARCH the web or RESPOND.
Choice: SEARCH
A: I don't know, Google it!
Generating lists
Whenever you want to generate a list of items you can use the list_append parameter, which causes the captured value to be appended to a list instead of overwriting previous values.
[11]:
lm = mistral + f'''\
Q: {query}
Now I will choose to either SEARCH the web or RESPOND.
Choice: {select(["SEARCH", "RESPOND"], name="choice")}
'''
if lm["choice"] == "SEARCH":
    lm += "Here are 3 search queries:\n"
    for i in range(3):
        lm += f'''{i+1}. "{gen(stop='"', name="queries", temperature=1.0, list_append=True)}"\n'''
Q: Who won the last Kentucky derby and by how much? Now I will choose to either SEARCH the web or RESPOND. Choice: SEARCH Here are 3 search queries: 1. "last Kentucky derby winner" 2. "latest Kentucky derby results" 3. "Kentucky derby winner 2021"
[12]:
lm["queries"]
[12]:
['last Kentucky derby winner',
'latest Kentucky derby results',
'Kentucky derby winner 2021']
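The difference between a plain named capture and list_append can be sketched as a small dictionary protocol: without list_append a later capture overwrites the earlier one, while with it the values accumulate. (A toy illustration of the semantics, not guidance's capture machinery.)

```python
# Sketch of capture semantics: overwrite by default, accumulate with
# list_append=True.
captures = {}

def capture(name, value, list_append=False):
    if list_append:
        captures.setdefault(name, []).append(value)
    else:
        captures[name] = value  # overwrites any previous value

capture("queries", "first query", list_append=True)
capture("queries", "second query", list_append=True)
capture("choice", "SEARCH")
capture("choice", "RESPOND")  # overwrites "SEARCH"
```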
Chat
You can control chat models using special with context blocks that wrap whatever is inside them in the special formats needed for the chat model you are using. This allows you to express chat programs without tying yourself to a single model backend.
[13]:
# to use role based chat tags you need a chat model, here we use gpt-3.5-turbo but you can use 'gpt-4' as well
gpt35 = models.OpenAI("gpt-3.5-turbo")
[14]:
from guidance import system, user, assistant
with system():
    lm = gpt35 + "You are a helpful assistant."

with user():
    lm += "What is the meaning of life?"

with assistant():
    lm += gen("response")
system: You are a helpful assistant.
user: What is the meaning of life?
assistant: The meaning of life is a philosophical question that has been debated for centuries. Different people and cultures have different beliefs and interpretations. Some believe that the meaning of life is to seek happiness and fulfillment, while others find meaning in religious or spiritual beliefs. Ultimately, the meaning of life may be subjective and can vary from person to person. It is up to each individual to explore and find their own sense of purpose and meaning in life.
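Under the hood, each backend turns the role blocks into its own chat format (special tags, message dicts, etc.). A toy sketch of that rendering step, using a made-up tag format purely for illustration (real backends each have their own templates):

```python
# Toy sketch: render (role, content) pairs into a hypothetical tag format.
# The <|role|>...<|end|> tags are invented for illustration; each real chat
# backend substitutes its own template here.
def format_chat(messages):
    return "".join(f"<|{role}|>{content}<|end|>" for role, content in messages)

rendered = format_chat([
    ("system", "You are a helpful assistant."),
    ("user", "What is the meaning of life?"),
])
print(rendered)
```

Because the role blocks are abstract, the same guidance program runs against any backend that knows how to do this rendering.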
Multistep
[15]:
# you can create and guide multi-turn conversations by using a series of role tags
@guidance
def experts(lm, query):
    with system():
        lm += "You are a helpful assistant."

    with user():
        lm += f"""\
        I want a response to the following question:
        {query}
        Who are 3 world-class experts (past or present) who would be great at answering this?
        Please don't answer the question or comment on it yet."""

    with assistant():
        lm += gen(name='experts', max_tokens=300)

    with user():
        lm += f"""\
        Great, now please answer the question as if these experts had collaborated in writing a joint anonymous answer.
        In other words, their identity is not revealed, nor is the fact that there is a panel of experts answering the question.
        If the experts would disagree, just present their different positions as alternatives in the answer itself (e.g. 'some might argue... others might argue...').
        Please start your answer with ANSWER:"""

    with assistant():
        lm += gen(name='answer', max_tokens=500)

    return lm

gpt35 + experts(query='What is the meaning of life?')
[15]:
system: You are a helpful assistant.
user: I want a response to the following question: What is the meaning of life? Who are 3 world-class experts (past or present) who would be great at answering this? Please don't answer the question or comment on it yet.
assistant: Sure, here are three world-class experts who have explored the question of the meaning of life: 1. Viktor Frankl: Viktor Frankl was an Austrian psychiatrist and Holocaust survivor. He is best known for his book "Man's Search for Meaning," in which he explores the importance of finding meaning in life, even in the face of extreme suffering. 2. Albert Camus: Albert Camus was a French philosopher and writer. He delved into existentialism and absurdism, questioning the meaning of life in a world that appears to lack inherent purpose. His works, such as "The Myth of Sisyphus," offer thought-provoking insights on the subject. 3. Dalai Lama: The Dalai Lama is the spiritual leader of Tibetan Buddhism. His teachings emphasize compassion, mindfulness, and the pursuit of happiness. He often discusses the importance of finding purpose and meaning in life through inner peace and altruistic actions. These experts have provided valuable perspectives on the meaning of life, each from their unique backgrounds and philosophies.
user: Great, now please answer the question as if these experts had collaborated in writing a joint anonymous answer. In other words, their identity is not revealed, nor is the fact that there is a panel of experts answering the question. If the experts would disagree, just present their different positions as alternatives in the answer itself (e.g. 'some might argue... others might argue...'). Please start your answer with ANSWER:
assistant: ANSWER: The meaning of life is a profound and complex question that has captivated thinkers throughout history. While there is no definitive answer, a collaboration of world-class experts offers various perspectives to ponder. Some might argue that the meaning of life lies in finding purpose and significance in our experiences. Viktor Frankl, drawing from his experiences in the Holocaust, suggests that meaning can be discovered through our ability to choose our attitudes and actions, even in the face of adversity. He emphasizes the importance of finding meaning in our relationships, work, and contributions to society. On the other hand, Albert Camus presents a different viewpoint. He explores the concept of absurdism, suggesting that life is inherently devoid of meaning. According to Camus, the universe is indifferent, and our existence is marked by a fundamental tension between our longing for meaning and the inherent meaninglessness of the world. In this perspective, individuals are challenged to create their own meaning and embrace the absurdity of existence. Another perspective comes from the Dalai Lama, who emphasizes the cultivation of compassion, mindfulness, and inner peace. He suggests that the meaning of life can be found in our ability to alleviate suffering, both within ourselves and others. By nurturing positive qualities and engaging in altruistic actions, we can discover a deeper sense of purpose and fulfillment. While these experts offer distinct viewpoints, their collective wisdom suggests that the meaning of life is a deeply personal and subjective journey. It may involve finding purpose in our relationships, embracing the absurdity of existence, or cultivating compassion and mindfulness. Ultimately, the search for meaning is an ongoing exploration that varies from person to person, influenced by individual experiences, beliefs, and values.
Streaming
Often you want to get the results of a generation as they happen so you can update an interface. You can do this programmatically using the .stream() method of model objects. This creates a ModelStream object that accumulates updates. These updates don't get executed until you iterate over the ModelStream object; as you iterate, you get a series of partially completed model objects as the guidance program executes.
[16]:
for part in mistral.stream() + qa_bot(query):
    part # do something with the partially executed lm
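The lazy-iteration pattern behind ModelStream can be sketched with a plain Python generator: building the object does no work, and each iteration step yields a partially completed state. (A toy analogy, not guidance's streaming machinery.)

```python
# Toy sketch of the ModelStream pattern: a generator that yields the
# accumulated state after each chunk; nothing runs until you iterate.
def toy_stream(chunks):
    text = ""
    for chunk in chunks:
        text += chunk
        yield text  # a "partially completed" state

gen_obj = toy_stream(["The ", "answer ", "is 42."])  # no work done yet
parts = list(gen_obj)  # execution happens here, during iteration
for part in parts:
    print(part)  # e.g. update a UI with each partial state
```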
Have an idea for more helpful examples? Pull requests that add to this documentation notebook are encouraged!