Developing Blockifiers
To develop a blockifier, first follow the instructions in Developing Plugins to create a new plugin project. This will result in a full, working plugin scaffold that you could deploy and use immediately.
Then, read below details about how to modify that scaffold for your own needs.
The Blockifier Contract
Blockifiers are responsible for transforming raw data into Steamship Block Format. Using our SDK, that means implementing the following method:
class MyBlockifier(Blockifier):
def run(
self, request: PluginRequest[RawDataPluginInput]
) -> Union[
Response,
Response[BlockAndTagPluginOutput]
]:
pass
How to Structure Blocks and Tags
The biggest design question you will face when implementing a blockifier is how to structure your blocks and tags.
At the platform level, we leave this open-ended on purpose, but we do encourage a few conventions of common convergence.
See the Data Model section for a discussion of how to think effectively about blocks and tags.
Synchronous Example: A Pseudo-Markdown Blockifier
A trivial implementation of this contract would be a pseudo-Markdown blockifier.
Let’s say this blockifier assumes the input data is UTF-8, assumes that empty new lines represent paragraph breaks. You could implement such a blockifier with this following code:
class PretendMarkdownBlockifier(Blockifier):
def run(self, request: PluginRequest[RawDataPluginInput]) -> Union[PluginRequest[BlockAndTagPluginOutput], BlockAndTagPluginOutput]:
# Grab the raw bytes.
text = request.data.data
# Decode it as UTF-8
if isinstance(text, bytes):
text = text.decode("utf-8")
# Split it into paragraphs based on a double newline
paragraphs = data.split("\n\n")
# Create a block for each paragraph and add a tag marking it as a paragraph
blocks = [
Block.CreateRequest(text=paragraph, tags=[
Tag.CreateRequest(kind="my-plugin", name="paragraph")
]) for paragraph in paragraphs
]
# Return a BlockAndTagPluginOutput object
return BlockAndTagPluginOutput(file=File.CreateRequest(blocks=blocks))
From the standpoint of the Steamship Engine, this PretendMarkdownBlockifier
now provides a way to
transform any bytes claiming to be of this pseudo-markdown type into Steamship Block Format.
Asynchronous Blockifiers
Some blockifiers will need to call third-party APIs that are asynchronous. Image-to-text (OCR) and speech-to-text (S2T) are two common examples. When this occurs, you should make your blockifier asynchronous as well.
See the Developing Asynchronous Plugins section for details.