Chat dialog

Guidance supports chat-based models using role tags. These are then converted to the appropriate format for the model (either a JSON API format or special tokens).

[ ]:
import logging
[ ]:
call_delay_secs = 0
requested_log_level = logging.WARNING
[ ]:
logging.basicConfig(level=requested_log_level)

Preliminaries concluded, we can now create our model:

[ ]:
from guidance import models, gen

chat_enabled_model = models.Transformers("microsoft/Phi-4-mini-instruct")

Multi-step chat with hidden blocks

We are now going to set up a multistage chat, where we have the chat bot help the use achieve some goal. The user will only have to specify the goal, and then we will create a chain-of-thought conversation with the bot which will:

  1. Ask the bot for a number of suggestions.

  2. List the pros and cons of each.

  3. Pick the best suggestion.

  4. Product a detailed action plan.

Our goal is to only show the final result to the user.

Now, let us define our generation function:

[ ]:
import re
import time

import guidance
from guidance import gen, select, system, user, assistant

@guidance
def plan_for_goal(lm, goal: str):

    # This is a helper function which we will use below
    def parse_best(prosandcons, options):
        best = re.search(r'Best=(\d+)', prosandcons)
        if not best:
            best =  re.search(r'Best.*?(\d+)', 'Best= option is 3')
        if best:
            best = int(best.group(1))
        else:
            best = 0
        return options[best]

    # Some general instruction to the model
    with system():
        lm += "You are a helpful assistant."

    # Simulate a simple request from the user
    # Note that we switch to using 'lm2' here, because these are intermediate steps (so we don't want to overwrite the current lm object)
    with user():
        lm2 = lm + f"""\
        I want to {goal}
        Can you please generate one option for how to accomplish this?
        Please make the option very short, at most one line."""

    # Generate several options. Note that this means several sequential generation requests
    n_options = 5
    with assistant():
        options = []
        for i in range(n_options):
            options.append((lm2 + gen(name='option', temperature=1.0, max_tokens=50))["option"])

    # Have the user request pros and cons
    with user():
        lm2 += f"""\
        I want to {goal}
        Can you please comment on the pros and cons of each of the following options, and then pick the best option?
        ---
        """
        for i, opt in enumerate(options):
            lm2 += f"Option {i}: {opt}\n"
        lm2 += f"""\
        ---
        Please discuss each option very briefly (one line for pros, one for cons), and end by saying Best=X, where X is the number of the best option."""

    # Get the pros and cons from the model
    with assistant():
        lm2 += gen(name='prosandcons', temperature=0.0, max_tokens=600, stop="Best=") + "Best=" + gen("best", regex="[0-9]+")
        time.sleep(call_delay_secs)

    # The user now extracts the one selected as the best, and asks for a full plan
    # We switch back to 'lm' because this is the final result we want
    with user():
        lm += f"""\
        I want to {goal}
        Here is my plan: {options[int(lm2["best"])]}
        Please elaborate on this plan, and tell me how to best accomplish it."""

    # The plan is generated
    with assistant():
        lm += gen(name='plan', max_tokens=500)
        time.sleep(call_delay_secs)

    return lm

Create a plan for the user. Note how the portions which were sent to lm2 in the function above are not shown in the final result:

[ ]:
results = chat_enabled_model + plan_for_goal(goal="read more books")

We can access the final plan itself:

[ ]:
print(results['plan'])

Asking help from experts

Now, let us ask our chat model to pick some experts in a particular field, and impersonate them to give advice:

[ ]:
@guidance
def run_expert_advice(lm, query: str):
    # Some general instruction to the model
    with system():
        lm += "You are a helpful assistant."

    with user():
        lm += f"""I want a response to the following question:
{query}
Who are 3 world-class experts (past or present) who would be great at answering this?
Please don't answer the question or comment on it yet.
"""

    with assistant():
        lm += gen(name='experts', temperature=0, max_tokens=300)
        time.sleep(call_delay_secs)

    with user():
        lm += """Great, now please answer the question as if these experts had collaborated in writing a joint anonymous answer.
In other words, their identity is not revealed, nor is the fact that there is a panel of experts answering the question.
If the experts would disagree, just present their different positions as alternatives in the answer itself (e.g. 'some might argue... others might argue...').
Please start your answer with ANSWER:
"""

    with assistant():
        lm += gen(name='answer', temperature=0, max_tokens=500)
        time.sleep(call_delay_secs)

    return lm
[ ]:
mean_life = chat_enabled_model + run_expert_advice("What is the meaning of life?")
[ ]:
more_productive = chat_enabled_model + run_expert_advice('How can I be more productive?')

Agents

We are now going to define a ‘conversation agent.’ This maintains a memory of a conversation, and can generate an appropriate reply, based on the persona it has been given.

[ ]:
class ConversationAgent:
    def __init__(self, chat_model, name: str, instructions: str, context_turns: int = 2):
        self._chat_model = chat_model
        self._name = name
        self._instructions = instructions
        self._my_turns = []
        self._interlocutor_turns = []
        self._went_first = False
        self._context_turns = context_turns

    @property
    def name(self) -> str:
        return self._name

    def reply(self, interlocutor_reply = None) -> str:
        if interlocutor_reply is None:
            self._my_turns = []
            self._interlocutor_turns = []
            self._went_first = True
        else:
            self._interlocutor_turns.append(interlocutor_reply)

        # Get trimmed history
        my_hist = self._my_turns[(1-self._context_turns):]
        interlocutor_hist = self._interlocutor_turns[-self._context_turns:]

        # Set up the system prompt
        curr_model = self._chat_model
        with system():
            curr_model += f"Your name is {self.name}. {self._instructions}"
            if len(interlocutor_hist) == 0:
                curr_model += "Introduce yourself and start the conversation"
            elif len(interlocutor_hist) == 1:
                curr_model += "Introduce yourself before continuing the conversation"

        # Replay the last few turns
        for i in range(len(my_hist)):
            with user():
                curr_model += interlocutor_hist[i]
            with assistant():
                curr_model += my_hist[i]

        if len(interlocutor_hist) > 0:
            with user():
                curr_model += interlocutor_hist[-1]

        with assistant():
            curr_model += gen(name='response', max_tokens=100)
        time.sleep(call_delay_secs)

        self._my_turns.append(curr_model['response'])
        return curr_model['response']

We can have two of these agents converse with each other with a conversation simulator:

[ ]:
def conversation_simulator(
    bot0: ConversationAgent,
    bot1: ConversationAgent,
    total_turns: int = 5 ):
    conversation_turns = []
    last_reply = None
    for _ in range(total_turns):
        last_reply = bot0.reply(last_reply)
        conversation_turns.append(dict(name=bot0.name, text=last_reply))
        time.sleep(call_delay_secs)
        last_reply = bot1.reply(last_reply)
        conversation_turns.append(dict(name=bot1.name, text=last_reply))
    return conversation_turns

Now, let’s try generating a conversation:

[ ]:
bot_instructions = """You are taking part in a discussion about bodyline bowling.
Only generate text as yourself and do not prefix your reply with your name.
Keep your answers to a couple of short sentences."""

bradman_bot = ConversationAgent(chat_enabled_model, "Donald Bradman", bot_instructions, context_turns=5)
jardine_bot = ConversationAgent(chat_enabled_model, "Douglas Jardine", bot_instructions, context_turns=5)

conversation_turns = conversation_simulator(bradman_bot, jardine_bot, total_turns=3)
[ ]:
for turn in conversation_turns:
    print(f"{turn['name']}: {turn['text']}\n")

Have an idea for more helpful examples? Pull requests that add to this documentation notebook are encouraged!