Tools
Tools help your agent perform actions or fetch information from the outside world.
Your starter project already has a few tools in src/example_tools that you can edit and use.
You can also import more open source tools from the Steamship SDK:
- Audio Transcription:
  - Assembly AI - Turns audio into text
  - Whisper - Turns audio into text
  - RSS Download - Returns audio URLs from an RSS feed
- Classification:
  - Sentiment Analysis - Reports the sentiment of a piece of text
  - Zero Shot Classification - Classifies a piece of text
- Image Generation:
  - DALL-E - Generates images with DALL-E
  - Stable Diffusion - Generates images with Stable Diffusion
  - Google Image Search - Performs a Google Image Search and returns the results
- Speech Generation:
  - Eleven Labs - Turns text into the spoken word
- Search:
  - Google Search - Finds answers to questions on the web
- Question Answering:
  - Vector Search QA - Finds answers to questions in the Steamship Vector Database
  - Prompt Database QA - Finds answers to questions from a pre-loaded prompt database
- Text Generation:
  - Image Prompt Generation - Rewrites a topic into a Stable Diffusion image prompt
  - Personality Tool - Rewords a response according to a particular personality
  - Text Summarization - Summarizes text
  - Text Rewriter - Utility tool for building tools that use prompts to operate
  - Translation - Translates text using an LLM
- Conversation Starters:
  - Knock Knock Joke Starter - Initiates a knock knock joke. The world's most useful tool.
Tools are run in the Cloud
Every Steamship tool is managed for you and run in the cloud. That way you, and your agent, can use them like Python functions without worrying about:
- Authentication (for tools that require it, like DALL-E)
- Logging so you can track what each agent is doing
- Metering so you can track how much each agent is using
- Load balancing & rate limiting so you don't have to worry about overloading API endpoints
- Retry on error so you don't have to add your own retry logic
- Async execution so you can kick off large jobs without worrying about HTTP timeouts
Building your own tool
A tool is just a basic Python class with a few fields:
from typing import Any, List, Union
from steamship import Block, Task
from steamship.agents.schema import AgentContext, Tool

class TextRewritingTool(Tool):
    name: str = "TextRewritingTool"
    human_description: str = "Rewrites a piece of text using the provided prompt."
    agent_description: str = "Used to rewrite a piece of text given a prompt. Takes text as input, and provides text as output."

    def run(self, tool_input: List[Block], context: AgentContext) -> Union[List[Block], Task[Any]]:
        # Tool logic goes here; return a list of Blocks.
        pass
For now, tools should return lists of Blocks. Async support is coming soon.
- The name field is the token that the ReAct LLM uses to request the tool.
- The human_description field is used for logging output for human consumption.
- The agent_description field tells the LLM how the tool should be used.
- The run method returns Block objects instead of str (as with some other frameworks) in order to support multi-modal output.
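To make the role of each field concrete, here is a minimal sketch of a tool whose run method does some work, and of an agent looking the tool up by its name. The Block, AgentContext, and Tool classes below are simplified stand-ins written for this example so that it runs without the SDK installed; they are assumptions that mirror the shape described above, not the SDK's own classes. ShoutTool is a hypothetical tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Block:
    """Stand-in for the SDK's Block: one unit of (here, text-only) content."""
    text: Optional[str] = None


class AgentContext:
    """Stand-in for the SDK's AgentContext; unused in this sketch."""


class Tool:
    """Stand-in base class carrying the three descriptive fields."""
    name: str = ""
    human_description: str = ""
    agent_description: str = ""

    def run(self, tool_input: List[Block], context: AgentContext) -> List[Block]:
        raise NotImplementedError


class ShoutTool(Tool):
    """Hypothetical tool that upper-cases every text block it receives."""
    name: str = "ShoutTool"
    human_description: str = "Upper-cases a piece of text."
    agent_description: str = (
        "Used to make text louder. Takes text as input, and provides text as output."
    )

    def run(self, tool_input: List[Block], context: AgentContext) -> List[Block]:
        # Return Blocks rather than a bare str so other media types could be added later.
        return [Block(text=(b.text or "").upper()) for b in tool_input]


# The agent finds a tool by its name, the token the LLM emits to request it.
tools: Dict[str, Tool] = {t.name: t for t in [ShoutTool()]}
output = tools["ShoutTool"].run([Block(text="hello")], AgentContext())
print(output[0].text)  # prints "HELLO"
```

In a real project you would subclass the SDK's Tool instead of the stand-in, and the agent framework would handle the lookup by name for you.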
The AgentContext object contains a copy of the Steamship Client, which is pre-authenticated into the workspace in which your agent is running.
That means the tools you write can use it for:
- Vector Search
- Key-Value storage
- File storage
- And arbitrary model invocation on data you have
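The pattern of reaching workspace services through the context can be sketched as follows. This is an illustration only: StubClient is a hypothetical in-memory stand-in for the pre-authenticated client, its kv dictionary is not the SDK's key-value API, and RememberTool is an invented example. The point it shows is that a tool takes no credentials of its own and reads everything it needs from the context passed to run.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Block:
    """Stand-in for the SDK's Block, text-only for this sketch."""
    text: Optional[str] = None


@dataclass
class StubClient:
    """Hypothetical stand-in for the workspace client: just an in-memory dict."""
    kv: Dict[str, str] = field(default_factory=dict)


@dataclass
class AgentContext:
    """Stand-in context that carries the client, as described above."""
    client: StubClient


class RememberTool:
    """Hypothetical tool that stores the latest input under a fixed key."""
    name: str = "RememberTool"

    def run(self, tool_input: List[Block], context: AgentContext) -> List[Block]:
        # Storage is reached through the context, not through tool-level credentials.
        previous = context.client.kv.get("last_note")
        current = " ".join(b.text or "" for b in tool_input)
        context.client.kv["last_note"] = current
        reply = f"Saved. Previous note: {previous}" if previous else "Saved."
        return [Block(text=reply)]


context = AgentContext(client=StubClient())
tool = RememberTool()
first = tool.run([Block(text="buy milk")], context)
second = tool.run([Block(text="call Sam")], context)
print(first[0].text)   # prints "Saved."
print(second[0].text)  # prints "Saved. Previous note: buy milk"
```

Because state lives behind the context rather than on the tool instance, the same tool class can serve different agents, each against its own workspace.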