Subtitle
The Subtitle processor parses subtitles in the WebVTT and SubRip formats and transforms them into plain text or structured data. This lets you process subtitle documents with other middlewares that require plain text, for example using the Translate middleware to translate subtitles into multiple languages.
It is also a good choice when you need to convert subtitles from different formats into a common JSON description that exposes the attributes of each text block within the subtitles.
💬 Parsing Subtitles
To use this middleware, you import it in your CDK stack and instantiate it as part of a pipeline.
Output Formats
You can select the output formats that the subtitle processor produces for each subtitle document using the .withOutputFormats method.
💁 If you select more than one output format, the subtitle processor will emit one document per output format. You can select between text and json.
📄 Output
The Subtitle processor can extract subtitles as plain text or as structured JSON data. Each output format is described below.
Plain Text
The plain text format outputs the subtitles as a sequence of text blocks separated by the \r\n\r\n characters (two consecutive CRLF line breaks). You can safely isolate each text block by splitting the text on that sequence.
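As a minimal sketch of the splitting approach described above, the snippet below isolates text blocks from a plain text output. The helper name splitTextBlocks and the sample string are hypothetical and only serve as an illustration; they are not part of the middleware.

```typescript
// Separator described above: two consecutive CRLF line breaks.
const BLOCK_SEPARATOR = '\r\n\r\n';

/**
 * Splits a plain text subtitle output into its text blocks.
 * Hypothetical helper, not part of Project Lakechain.
 */
function splitTextBlocks(plainText: string): string[] {
  return plainText
    .split(BLOCK_SEPARATOR)
    // Remove stray whitespace around each block.
    .map((block) => block.trim())
    // Drop empty entries, e.g. caused by a trailing separator.
    .filter((block) => block.length > 0);
}

// Made-up sample content, for illustration only.
const sample =
  'Hello, world.\r\n\r\nThis is the second block.\r\n\r\nAnd the third.';
const blocks = splitTextBlocks(sample);

console.log(blocks);
```

Line breaks inside a single text block are left untouched, since only the full \r\n\r\n sequence is treated as a boundary.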
JSON
The JSON format outputs each text block from the subtitles using a common JSON description, regardless of the input subtitle format.
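The sketch below shows one way a consumer might read such a JSON document and rebuild a transcript from it. The exact schema is not given here, so the SubtitleBlock interface and its field names (startTime, endTime, text) are assumptions made purely for illustration; check an actual output document before relying on them.

```typescript
// HYPOTHETICAL shape: these field names are assumptions,
// not the documented schema of the middleware output.
interface SubtitleBlock {
  startTime: string;
  endTime: string;
  text: string;
}

// Runtime check that a parsed value matches the assumed shape.
function isSubtitleBlock(value: unknown): value is SubtitleBlock {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  return (
    typeof v['startTime'] === 'string' &&
    typeof v['endTime'] === 'string' &&
    typeof v['text'] === 'string'
  );
}

// Parses a JSON string and keeps only entries matching the assumed shape.
function parseBlocks(json: string): SubtitleBlock[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error('Expected a JSON array of text blocks');
  }
  return parsed.filter(isSubtitleBlock);
}

// Made-up sample document, for illustration only.
const sampleJson = JSON.stringify([
  { startTime: '00:00:01.000', endTime: '00:00:03.000', text: 'Hello, world.' },
  { startTime: '00:00:04.000', endTime: '00:00:06.500', text: 'Second block.' },
]);

// Rebuild a transcript from the text of each block.
const transcript = parseBlocks(sampleJson)
  .map((block) => block.text)
  .join('\r\n\r\n');

console.log(transcript);
```

Validating each entry at runtime keeps the consumer robust if the real schema differs from the assumed one: unknown entries are skipped rather than causing type errors later.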
🏗️ Architecture
This middleware runs on an AWS Lambda compute using the ARM64 architecture, and uses the node-webvtt and srt-parser-2 libraries to parse WebVTT and SubRip subtitles, respectively.
🏷️ Properties
Supported Inputs
Mime Type | Description |
---|---|
text/vtt | WebVTT subtitles. |
text/srt | SubRip subtitles. |
Supported Outputs
Mime Type | Description |
---|---|
text/plain | Plain text documents. |
application/json | JSON documents. |
Supported Compute Types
Type | Description |
---|---|
CPU | This middleware only supports CPU compute. |
📖 Examples
Building a Video Subtitle Service - An example showcasing how to build a video subtitle service using Project Lakechain.