Render custom tool docs a bit better (huggingface#23269)
* Try on a couple of blocks to see

* Build the doc please

* Build the doc please

* Build the doc please

* add more

* Finish with all

* Style
sgugger authored May 10, 2023
1 parent 42017d8 commit eb5b5ce
Showing 1 changed file with 27 additions and 24 deletions.
51 changes: 27 additions & 24 deletions docs/source/en/custom_tools.mdx
@@ -54,7 +54,7 @@ The prompt is structured broadly into four parts.

To better understand each part, let's look at a shortened version of how the `run` prompt can look like:

-````
+````text
I will ask you to perform a task, your job is to come up with a series of simple commands in Python that will perform the task.
[...]
You can print intermediate results if it makes sense to do so.
@@ -101,7 +101,7 @@ The second part (the bullet points below *"Tools"*) is dynamically added upon ca
exactly as many bullet points as there are tools in `agent.toolbox` and each bullet point consists of the name
and description of the tool:

-```
+```text
- <tool.name>: <tool.description>
```

@@ -115,7 +115,7 @@ print(f"- {document_qa.name}: {document_qa.description}")
```

which gives:
-```
+```text
- document_qa: This is a tool that answers a question about a document (pdf). It takes an input named `document` which should be the document containing the information, as well as a `question` that is the question about the document. It returns a text that contains the answer to the question.
```
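
The tool listing described in this hunk can be sketched without loading any model. The snippet below is only an illustration: `ToolStub` and `render_tool_bullets` are hypothetical names that are not part of `transformers`, and the two descriptions are shortened placeholders. It mirrors the `- <tool.name>: <tool.description>` format shown above, with one bullet per entry in the toolbox.

```python
from dataclasses import dataclass


@dataclass
class ToolStub:
    """Hypothetical stand-in for a tool; only `name` and `description` are used."""

    name: str
    description: str


def render_tool_bullets(toolbox: dict) -> str:
    """Build one `- <name>: <description>` bullet per tool in the toolbox."""
    return "\n".join(
        f"- {tool.name}: {tool.description}" for tool in toolbox.values()
    )


# Placeholder toolbox; a real agent's toolbox holds full tool objects.
toolbox = {
    "document_qa": ToolStub("document_qa", "Answers a question about a document."),
    "image_generator": ToolStub("image_generator", "Creates an image from a text prompt."),
}
print(render_tool_bullets(toolbox))
```

Because the bullets are derived purely from each tool's name and description, editing either attribute changes what the agent reads in its prompt.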

@@ -143,7 +143,7 @@ executable code in practice.

Let's have a look at one example:

-````
+````text
Task: "Identify the oldest person in the `document` and create an image showcasing the result as a banner."
I will use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.
@@ -166,7 +166,7 @@ The prompt examples are curated by the Transformers team and rigorously evaluate
to ensure that the agent's prompt is as good as possible to solve real use cases of the agent.

The final part of the prompt corresponds to:
-```
+```text
Task: "Draw me a picture of rivers and lakes"
I will use the following
@@ -187,7 +187,7 @@ exactly in the same way it was previously done in the examples.
Without going into too much detail, the chat template has the same prompt structure with the
examples having a slightly different style, *e.g.*:

-````
+````text
[...]
=====
@@ -225,8 +225,8 @@ to past exchanges as is done *e.g.* above by the user's input of "I tried **this
previously generated code of the agent.

Upon running `.chat`, the user's input or *task* is cast into an unfinished example of the form:
-```
-Human: <user-input>\n\nAssistent:
+```text
+Human: <user-input>\n\nAssistant:
```
which the agent completes. Contrary to the `run` command, the `chat` command then appends the completed example
to the prompt, thus giving the agent more context for the next `chat` turn.
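
The difference between `run` and `chat` described here can be sketched as a small loop that keeps a growing prompt. This is a rough illustration, not the library's implementation: `chat_turn`, the `complete` callable, `fake_llm`, and the blank-line separator appended after each reply are all assumptions made for the example.

```python
def chat_turn(history: str, user_input: str, complete) -> tuple:
    """Run one chat turn and return (assistant_reply, updated_history)."""
    # Cast the task into an unfinished example of the form shown above.
    unfinished = f"Human: {user_input}\n\nAssistant:"
    prompt = history + unfinished
    reply = complete(prompt)
    # Unlike `run`, the completed example is kept for the next turn.
    # The trailing blank line is an assumed separator between exchanges.
    return reply, prompt + reply + "\n\n"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the model call; always returns the same text."""
    return " I will use `image_generator`."


history = ""
reply, history = chat_turn(history, "Draw a tree", fake_llm)
reply, history = chat_turn(history, "Make it bigger", fake_llm)
print(history)
```

After the second turn, `history` contains both exchanges, which is the extra context the next `chat` call would see.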
@@ -254,7 +254,7 @@ agent.run("Show me a tree", return_code=True)

gives:

-```
+```text
==Explanation from the agent==
I will use the following tool: `image_segmenter` to create a segmentation mask for the image.
@@ -269,7 +269,8 @@ are present in the tool's name and description. Let's have a look.
```py
agent.toolbox["image_generator"].description
```
-```
+
+```text
'This is a tool that creates an image according to a prompt, which is a text description. It takes an input named `prompt` which contains the image description and outputs an image.
```

@@ -280,7 +281,7 @@ agent.run("Create an image of a tree", return_code=True)
```

gives:
-```
+```text
==Explanation from the agent==
I will use the following tool `image_generator` to generate an image of a tree.
@@ -307,7 +308,7 @@ used a lot for image generation tasks, *e.g.*
agent.run("Make an image of a house and a car", return_code=True)
```
returns
-```
+```text
==Explanation from the agent==
I will use the following tools `image_generator` to generate an image of a house and `image_transformer` to transform the image of a car into the image of a house.
@@ -322,9 +323,11 @@ to understand the difference between `image_generator` and `image_transformer` a

We can help the agent here by changing the tool name and description of `image_transformer`. Let's instead call it `modifier`
to disassociate it a bit from "image" and "prompt":
-```
+```py
agent.toolbox["modifier"] = agent.toolbox.pop("image_transformer")
-agent.toolbox["modifier"].description = agent.toolbox["modifier"].description.replace("transforms an image according to a prompt", "modifies an image")
+agent.toolbox["modifier"].description = agent.toolbox["modifier"].description.replace(
+    "transforms an image according to a prompt", "modifies an image"
+)
```

Now "modify" is a strong cue to use the new image processor which should help with the above prompt. Let's run it again.
@@ -334,7 +337,7 @@ agent.run("Make an image of a house and a car", return_code=True)
```

Now we're getting:
-```
+```text
==Explanation from the agent==
I will use the following tools: `image_generator` to generate an image of a house, then `image_generator` to generate an image of a car.
@@ -350,7 +353,7 @@ which is definitely closer to what we had in mind! However, we want to have both
agent.run("Create image: 'A house and car'", return_code=True)
```

-```
+```text
==Explanation from the agent==
I will use the following tool: `image_generator` to generate an image.
@@ -389,7 +392,7 @@ of the tools, it has available to it as well as correctly insert the user's prom
</Tip>

Similarly, one can overwrite the `chat` prompt template. Note that the `chat` mode always uses the following format for the exchanges:
-```
+```text
Human: <<task>>
Assistant:
@@ -441,7 +444,7 @@ print(f"Name: '{controlnet_transformer.name}'")
```

gives
-```
+```text
Description: 'This is a tool that transforms an image with ControlNet according to a prompt.
It takes two inputs: `image`, which should be the image to transform, and `prompt`, which should be the prompt to use to change it. It returns the modified image.'
Name: 'image_transformer'
@@ -457,7 +460,7 @@ agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder",

This command should give you the following info:

-```
+```text
image_transformer has been replaced by <transformers_modules.diffusers.controlnet-canny-tool.bd76182c7777eba9612fc03c0
8718a60c0aa6312.image_transformation.ControlNetTransformationTool object at 0x7f1d3bfa3a00> as provided in `additional_tools`
```
@@ -480,7 +483,7 @@ You can always have a look at the toolbox that is currently available to the age
print("\n".join([f"- {a}" for a in agent.toolbox.keys()]))
```

-```
+```text
- document_qa
- image_captioner
- image_qa
@@ -518,7 +521,7 @@ Let's transform the image into a beautiful winter landscape:
image = agent.run("Transform the image: 'A frozen lake and snowy forest'", image=image)
```
-```
+```text
==Explanation from the agent==
I will use the following tool: `image_transformer` to transform the image.
@@ -536,7 +539,7 @@ By default the image processing tool returns an image of size 512x512 pixels. Le
image = agent.run("Upscale the image", image)
```

-```
+```text
==Explanation from the agent==
I will use the following tool: `image_upscaler` to upscale the image.

@@ -657,7 +660,7 @@ agent.run(
)
```
which outputs the following:
-```
+```text
==Code generated by the agent==
model = model_download_counter(task="text-to-video")
print(f"The model with the most downloads is {model}.")
@@ -738,7 +741,7 @@ agent.run("Generate an image of the `prompt` after improving it.", prompt="A rab
```

The model adequately leverages the tool:
-```
+```text
==Explanation from the agent==
I will use the following tools: `StableDiffusionPromptGenerator` to improve the prompt, then `image_generator` to generate an image according to the improved prompt.