Tools

Tools help your agent perform actions or fetch information from the outside world.

Your starter project already has a few tools in src/example_tools that you can edit and use.

You can also import more open source tools from the Steamship SDK:

Audio Transcription:
- Assembly AI - Turns audio into text
- Whisper - Turns audio into text
- RSS Download - Returns Audio URLs from an RSS feed
Classification:
- Sentiment Analysis - Can report on the sentiment of a piece of text
- Zero Shot Classification - Can classify a piece of text
Image Generation:
- DALL-E - Generate images with DALL-E
- Stable Diffusion - Generate images with Stable Diffusion
- Google Image Search - Perform a Google Image Search and return the results
Speech Generation:
- Eleven Labs - Turn text into the spoken word
Search:
- Google Search - Find answers to questions on the web
Question Answering:
- Vector Search QA - Find answers to questions in the Steamship Vector Database
- Prompt Database QA - Find answers to questions from a pre-loaded prompt database
Text Generation:
- Image Prompt Generation - Rewrite a topic into a Stable Diffusion image prompt
- Personality Tool - Reword a response according to a particular personality
- Text Summarization - Summarize text
- Text Rewriter - Utility tool for building tools that use prompts to operate
- Translation - Translate text using an LLM
Conversation Starters:
- Knock Knock Joke Starter - Initiate a knock knock joke. The world's most useful tool.

Tools are run in the Cloud

Every Steamship tool is managed for you and run in the cloud. That way, you and your agent can call tools like Python functions without worrying about:

- Authentication (for tools that require it, like DALL-E)
- Logging, so you can track what each agent is doing
- Metering, so you can track how much each agent is using
- Load balancing and rate limiting, so you don't have to worry about overloading API endpoints
- Retry on error, so you don't have to add your own retry logic
- Async execution, so you can kick off large jobs without worrying about HTTP timeouts

Building your own tool

A tool is just a basic Python class with a few fields:

from typing import Any, List, Union

from steamship import Block, Task
from steamship.agents.schema import AgentContext, Tool

class TextRewritingTool(Tool):
    name: str = "TextRewritingTool"
    human_description: str = "Rewrites a piece of text using the provided prompt."
    agent_description: str = "Used to rewrite a piece of text given a prompt. Takes text as input, and provides text as output."

    def run(self, tool_input: List[Block], context: AgentContext) -> Union[List[Block], Task[Any]]:
        # Your tool's logic goes here.
        pass

For now, tools should return lists of Blocks. Async support is coming soon.

- The name field is the token that the ReAct LLM uses to request the tool.
- The human_description is used in log output meant for human readers.
- The agent_description tells the LLM how the tool should be used.
- The run method returns Block objects rather than str (as in some other frameworks) in order to support multi-modal output.
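
To make the fields concrete, here is a minimal sketch of a tool that upper-cases its input. So that the snippet runs without the SDK installed, Block, AgentContext, and Tool below are simplified stand-ins written for this example, not the Steamship classes; only the field names and the shape of run follow the class above. ShoutTool is a hypothetical name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    """Stand-in for a Steamship Block; this sketch only models text."""
    text: Optional[str] = None


class AgentContext:
    """Stand-in for the real AgentContext; unused in this sketch."""


class Tool:
    """Stand-in base class carrying the three descriptive fields."""
    name: str = ""
    human_description: str = ""
    agent_description: str = ""

    def run(self, tool_input: List[Block], context: AgentContext) -> List[Block]:
        raise NotImplementedError


class ShoutTool(Tool):
    name = "ShoutTool"
    human_description = "Upper-cases a piece of text."
    agent_description = (
        "Used to upper-case text. Takes text as input, and provides text as output."
    )

    def run(self, tool_input: List[Block], context: AgentContext) -> List[Block]:
        # Emit one output Block for each input Block that carries text.
        return [Block(text=b.text.upper()) for b in tool_input if b.text is not None]


output = ShoutTool().run([Block(text="hello")], AgentContext())
print(output[0].text)  # HELLO
```

Returning a list of Block objects, even for a single string, keeps the call site the same if the tool later produces images or audio.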

The AgentContext object contains a copy of the Steamship Client which is pre-authenticated into the workspace in which your agent is running.

That means the tools you write can use it for:

- Vector Search
- Key-Value storage
- File storage
- Arbitrary model invocation on data you have
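
As a rough illustration of a tool reaching workspace storage through its context, the sketch below saves a value during run and reports back. Everything storage-related here is an assumption made for the example: FakeClient is an in-memory dictionary standing in for the pre-authenticated client, FakeContext stands in for AgentContext, and the client attribute name and RememberNameTool are hypothetical. The real client's key-value API is not shown on this page.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Block:
    """Stand-in for a Steamship Block; this sketch only models text."""
    text: Optional[str] = None


@dataclass
class FakeClient:
    """Hypothetical stand-in for the client: an in-memory key-value store."""
    store: Dict[str, str] = field(default_factory=dict)


@dataclass
class FakeContext:
    """Hypothetical stand-in for AgentContext, exposing only a client."""
    client: FakeClient = field(default_factory=FakeClient)


class RememberNameTool:
    name = "RememberNameTool"
    human_description = "Stores the user's name for later use."
    agent_description = (
        "Used to remember the user's name. Takes the name as text input, "
        "and provides a confirmation as text output."
    )

    def run(self, tool_input: List[Block], context: FakeContext) -> List[Block]:
        # Read the name from the first input Block and persist it via the context.
        user_name = tool_input[0].text
        context.client.store["user_name"] = user_name
        return [Block(text=f"I will remember that your name is {user_name}.")]


context = FakeContext()
reply = RememberNameTool().run([Block(text="Ada")], context)
print(reply[0].text)  # I will remember that your name is Ada.
```

Because the tool receives its client from the context rather than constructing one, it needs no credentials of its own.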
