gen API Examples

This notebook gives examples of how to use the gen command.

[3]:
from guidance import models, gen

gpt2 = models.Transformers("gpt2", device=0)
gpt3 = models.OpenAI("text-davinci-003")
gpt4 = models.OpenAI("gpt-4")

Basic usage

Below is a program that includes a basic generation call using gen. Two arguments are passed to gen: one positional argument and one keyword argument. The positional argument is the name of the program variable in which to store the generation. The keyword argument stop is a string that tells gen when to stop generating (in this case, when the model generates a period).

[5]:
out = gpt2 + "This is a sentence about " + gen("completion", stop=".")
This is a sentence about the way that the world works
[7]:
# can access the generated text as a variable in the updated model state object
out["completion"]
[7]:
'the way that the world works'

name positional argument

The name argument is a string giving the key under which the generated text is stored on the model state object (see above for an example).

[14]:
out = gpt2 + f"""\
This is a sentence about {gen("sentence1", stop=".")}.
This is another sentence with {gen("sentence2", stop=".")}."""
This is a sentence about the way that the world works.
This is another sentence with a different meaning.
[16]:
out["sentence1"], out["sentence2"]
[16]:
('the way that the world works', 'a different meaning')

stop keyword argument

The stop argument can be either a string or a list of strings. A single string tells gen to stop generating when that string appears. With a list of strings, generation stops as soon as any one of them appears.

[17]:
gpt2 + "This is a sentence about " + gen('text', stop=[" the", " of", " a"])
[17]:
This is a sentence about the way that
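
The stopping rule described above can be sketched in plain Python. This is only an illustration of the semantics applied to an already generated string, not guidance's implementation (which stops during decoding); the helper name truncate_at_stop is hypothetical.

```python
def truncate_at_stop(text, stop):
    """Return the part of text before the earliest stop string (illustrative only)."""
    # Accept a single string or a list of strings, mirroring the stop argument.
    stops = [stop] if isinstance(stop, str) else list(stop)
    # Collect the first position of each stop string that actually occurs.
    hits = [i for i in (text.find(s) for s in stops) if i != -1]
    # Cut at the earliest hit; keep everything if no stop string appears.
    return text[:min(hits)] if hits else text


raw = "the way that the world works. It's not"
print(truncate_at_stop(raw, "."))
print(truncate_at_stop(raw, [" the", " of", " a"]))
```

With a list, the earliest occurrence of any entry wins, which is why the second call cuts the text before the second "the".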

stop_regex keyword argument

The stop_regex argument works just like the stop argument, but takes regular expressions (a single pattern or a list of patterns) instead of raw strings. This lets you stop on a pattern, such as a word followed by any non-letter character, rather than on a fixed string.

[18]:
gpt2 + "This is a sentence about " + gen('text', stop_regex=[" of[^a-z]", " the[^a-z]"])
[18]:
This is a sentence about the way that
[21]:
out = gpt3 + f"""\
Please solve the following word problem and call a calcuator with CALC(EQUATION) = ANSWER whenever you need to compute equations. For example: CALC((4+3) * 2) = 14.
Problem: Joe has ten apples and needs run 5 tests on each apple, if each test takes 7 minutes how long will this take Joe?
Reason step by step: """ + gen('text', stop_regex=r"CALC\(.*\) =", max_tokens=100, save_stop_text=True)
Please solve the following word problem and call a calcuator with CALC(EQUATION) = ANSWER whenever you need to compute equations. For example: CALC((4+3) * 2) = 14.
Problem: Joe has ten apples and needs run 5 tests on each apple, if each test takes 7 minutes how long will this take Joe?
Reason step by step: 

Joe has 10 apples and needs to run 5 tests on each apple. 

This means Joe needs to run 50 tests in total. 

Each test takes 7 minutes, so the total time it will take Joe is 50 tests multiplied by 7 minutes per test. 

[22]:
# here we can see the stop text that was saved
out["text_stop_text"]
[22]:
'CALC(50 * 7) ='
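
As a rough sketch of what stopping on a regular expression means, the following plain-Python helper splits a string at the first match and returns both the kept text and the matched stop text. It uses only the standard re module; the helper name split_at_stop_regex and the sample string are assumptions made for illustration and say nothing about how guidance implements this internally.

```python
import re


def split_at_stop_regex(text, stop_regex):
    """Return (kept_text, stop_text) for the earliest match of any pattern."""
    # Accept a single pattern or a list of patterns, mirroring stop_regex.
    patterns = [stop_regex] if isinstance(stop_regex, str) else list(stop_regex)
    matches = [m for m in (re.search(p, text) for p in patterns) if m]
    if not matches:
        return text, None  # nothing matched, so nothing is cut
    first = min(matches, key=lambda m: m.start())
    return text[:first.start()], first.group(0)


raw = "The total time is 50 tests multiplied by 7 minutes per test. CALC(50 * 7) = 350"
kept, stop_text = split_at_stop_regex(raw, r"CALC\(.*\) =")
print(repr(stop_text))
```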

save_stop_text keyword argument

The save_stop_text argument causes the gen command to save the text that caused it to stop generating. This is useful when you stop on a list of strings or a regular expression and want to know exactly which string ended the generation. If set to True, the stop text is saved in a variable named variable_name + "_stop_text". If set to a string, the stop text is saved in the variable with that name.

[24]:
# stop on any three letter word, and then print the word that was stopped on
out = gpt3 + "This is a sentence about " + gen('text', stop_regex=" [a-z][a-z][a-z][^a-z]", save_stop_text=True, temperature=1.0)
out["text_stop_text"]
This is a sentence about 
the importance of professional development.

Professional development is an important part of
[24]:
' any '
[25]:
# stop on any three letter word, but save the stop text under the custom variable name "stop"
out = gpt3 + "This is a sentence about " + gen('text', stop_regex=" [a-z][a-z][a-z][^a-z]", save_stop_text="stop", temperature=1.0)
out["stop"]
This is a sentence about 
entrepreneurship.

Entrepreneurship is a dynamic process of creating something
[25]:
' new '
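
The naming rule for the saved stop text can be written out as a small function. This is a sketch of the rule as described in this section only; stop_text_key is a hypothetical helper, not part of guidance.

```python
def stop_text_key(name, save_stop_text):
    """Return the variable name that would hold the stop text, or None if it is not saved."""
    if save_stop_text is True:
        # True: derive the name from the generation's own variable name.
        return name + "_stop_text"
    if isinstance(save_stop_text, str):
        # A string: use it directly as the variable name.
        return save_stop_text
    # Anything else (for example False): the stop text is not saved.
    return None


print(stop_text_key("text", True))
print(stop_text_key("text", "stop"))
```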

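max_tokens keyword argument

The max_tokens argument sets an upper limit on the number of tokens that gen will generate. Generation ends when this limit is reached, even if no stop condition has been met.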

[26]:
gpt2 + "This is a sentence about " + gen('text', max_tokens=10)
[26]:
This is a sentence about the way that the world works. It's not

n keyword argument

The n argument controls how many generations to perform in a batch. If n > 1 then only the first completion is used for future context, and the rest are just stored in the variable.

NOTE! This is still a TODO for version ``v0.1+``. Use a ``for`` loop for now.

[30]:
# use a for loop for now
lm = gpt2
lm.echo = False # don't draw to notebook
lm += "This is a fun sentence about "
texts = []
for _ in range(5):
    out = lm + gen('text', max_tokens=5, temperature=1.0)
    texts.append(out["text"])
texts
[30]:
['the act of munch',
 'the contradiction between defend one',
 'genre writing, but here',
 'a giant frog.\n',
 "how there's a loss"]

temperature keyword argument

The temperature argument controls the sampling temperature and is passed directly to the LLM. By default temperature is set to 0 and the LLM does greedy decoding (it always picks the most likely token). This allows the LM calls to be cached and reused. If the temperature is set to a value greater than 0, the LLM samples, and repeated calls in the same LM session (program execution) produce new generations (though re-runs of the same program may use caches for each of those calls in the future).

[31]:
# with a zero temperature, the generated text will be the same each time
lm = gpt2
lm.echo = False # don't draw to notebook
lm += "This is a fun sentence about "
texts = []
for _ in range(5):
    out = lm + gen('text', max_tokens=5)
    texts.append(out["text"])
texts
[31]:
['how to make a good',
 'how to make a good',
 'how to make a good',
 'how to make a good',
 'how to make a good']
[33]:
# with a non-zero temperature, each call samples a different completion
lm = gpt2
lm.echo = False # don't draw to notebook
lm += "This is a fun sentence about "
texts = []
for _ in range(5):
    out = lm + gen('text', max_tokens=5, temperature=1.0)
    texts.append(out["text"])
texts
[33]:
['dragons. How often entrants',
 "friendliness I don't",
 'finding out about its competitors',
 'a stray photograph sprawled',
 'lots of suggestions we might']
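
To make the difference between the two runs above concrete, here is a minimal sketch of standard temperature sampling over a toy list of logits, using only the standard library. It shows the general technique, not the code path of any particular model backend; sample_token and the logits are made up for illustration.

```python
import math
import random


def sample_token(logits, temperature, rng=random):
    """Pick an index from logits: argmax at temperature 0, softmax sampling otherwise."""
    if temperature == 0:
        # Greedy choice: always the highest-scoring token, so repeats are identical.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Divide by the temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5]
greedy = [sample_token(logits, 0) for _ in range(5)]
rng = random.Random(0)  # seeded only to make the sketch repeatable
sampled = [sample_token(logits, 1.0, rng) for _ in range(200)]
print(greedy)
print(sorted(set(sampled)))
```

At temperature 0 every call returns the same index, while at temperature 1.0 the lower-scoring tokens are also chosen some of the time.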

top_p keyword argument

The top_p argument controls the proportion of the probability mass used for sampling: only the most likely tokens whose cumulative probability reaches top_p are considered. By default it is 1.0, so we sample from the whole distribution. Note that setting top_p only matters if you have a non-zero temperature value.

[34]:
# NOT YET SUPPORTED in v0.1!
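
Although top_p is not yet supported here, the idea behind it (often called nucleus sampling) can be sketched in a few lines of plain Python. The function top_p_filter and the toy probabilities are assumptions for illustration only.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, renormalized."""
    # Visit token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    return {i: probs[i] / total for i in kept}


probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(probs, 0.7))   # only the two most likely tokens survive
print(len(top_p_filter(probs, 1.0)))
```

Sampling would then draw from the returned distribution instead of the full one.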

regex keyword argument

The regex argument is a regular expression that is used to constrain the text generated by the LM. When a pattern is given, only tokens that represent valid extensions of the pattern will be generated. This can be useful for enforcing formats (like only numbers). Just remember that the model does not plan in advance for this constraint (yet), so you need to specify a format the model is already familiar with.

[35]:
gpt2 + "This is a sentence about " + gen('text', regex="[0-9 ]+")
[35]:
This is a sentence about 2 
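
The idea of allowing only tokens that are valid extensions of the pattern can be sketched with the standard re module. This simplified version checks whether the text so far plus a candidate token still fully matches the pattern, which is only correct for patterns such as "[0-9 ]+" where every non-empty prefix of a match is itself a match. General patterns need partial matching, which re does not provide. The function allowed_tokens and the toy vocabulary are hypothetical.

```python
import re


def allowed_tokens(prefix, vocab, pattern):
    """Return the tokens that keep prefix + token a full match of pattern."""
    return [tok for tok in vocab if re.fullmatch(pattern, prefix + tok)]


vocab = ["2", " ", "42", " the", "a", "7 "]
# Tokens containing anything other than digits or spaces are filtered out.
print(allowed_tokens("", vocab, "[0-9 ]+"))
print(allowed_tokens("2", vocab, "[0-9 ]+"))
```

The model's probabilities over the remaining tokens still decide which one is generated, which is why the format should be one the model is already likely to produce.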

list_append keyword argument

When the list_append argument is True, the results of the generation will be appended to the list given by the name argument. If no list exists with that name, a new list will be created. This can be a useful alternative to using the geneach command in some circumstances.

[40]:
out = gpt3 + f"""\
Write three story title options about the arctic circle:
OUTLINE
1. "{gen('story', max_tokens=20, list_append=True, stop='"')}"
2. "{gen('story', max_tokens=20, list_append=True, stop='"')}"
3. "{gen('story', max_tokens=20, list_append=True, stop='"')}"
"""
out["story"]
Write three story title options about the arctic circle:
OUTLINE
1. "The Frozen North: A Journey Through the Arctic Circle"
2. "Exploring the Arctic: A Tale of Adventure and Discovery"
3. "The Icy Depths of the Arctic: A Voyage of Discovery"
[40]:
['The Icy Depths of the Arctic: A Voyage of Discovery',
 'Exploring the Arctic: A Tale of Adventure and Discovery',
 'The Frozen North: A Journey Through the Arctic Circle']

Have an idea for more helpful examples? Pull requests that add to this documentation notebook are encouraged!