Documentation Translation
This project uses AWS Bedrock’s Haiku 3.5 model to automatically translate documentation from English into multiple languages. The translation system is designed to be efficient, accurate, and easy to use.
Supported Languages
The following languages are currently supported:
- Japanese (jp)
- French (fr)
- Spanish (es)
- German (de)
- Chinese (zh)
- Korean (ko)
How It Works
The translation system works by:
- Splitting documents by h2 headers - This allows for more efficient processing and better context for the translation model.
- Preserving markdown formatting - All markdown syntax, code blocks, and HTML tags are preserved during translation.
- Special handling for frontmatter - YAML frontmatter is translated while preserving its structure.
- Incremental translation - Only changed files are translated by default, saving time and resources.
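The first step above, splitting by h2 headers, can be sketched as follows. This is a minimal illustration, not the code in scripts/translate.ts: the Section type and the splitByH2 function are hypothetical names, and the sketch assumes that a line starting with "## " outside a code fence marks a new section.

```typescript
// Hypothetical helper for illustration; not taken from scripts/translate.ts.
const FENCE = "`".repeat(3); // a markdown code fence marker

interface Section {
  heading: string | null; // null for content that precedes the first h2
  body: string;
}

function splitByH2(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: null, body: "" };
  let inFence = false;

  for (const line of markdown.split("\n")) {
    const trimmed = line.trimStart();
    // Track fence state so a "## " line inside a code block is not a heading.
    if (trimmed.startsWith(FENCE) || trimmed.startsWith("~~~")) {
      inFence = !inFence;
    }
    if (!inFence && line.startsWith("## ")) {
      sections.push(current);
      current = { heading: line.slice(3).trim(), body: "" };
      continue;
    }
    current.body += line + "\n";
  }
  sections.push(current);

  // Drop an empty preamble so callers only receive sections with content.
  return sections.filter((s) => s.heading !== null || s.body.trim() !== "");
}
```

Each returned section could then be sent to the model as its own request, which keeps individual requests small while the heading still supplies context.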
Running Translations Locally
To translate documentation locally, use the scripts/translate.ts script:
```bash
# Translate only changed files to Japanese (default)
./scripts/translate.ts

# Translate all files
./scripts/translate.ts --all

# Translate to specific languages
./scripts/translate.ts --languages jp,fr,es

# Show what would be translated without actually translating
./scripts/translate.ts --dry-run

# Show verbose output
./scripts/translate.ts --verbose
```
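To make the defaults explicit, here is one way the flags above could map to options. This is a sketch under assumptions: TranslateOptions and parseTranslateArgs are hypothetical names, and the real script may parse its arguments differently. Only the flag names and the Japanese default come from the commands above.

```typescript
// Hypothetical option parsing for illustration; not taken from scripts/translate.ts.
interface TranslateOptions {
  all: boolean;        // translate every file instead of only changed ones
  languages: string[]; // target language codes
  dryRun: boolean;     // report what would be translated without translating
  verbose: boolean;    // print detailed logs
}

function parseTranslateArgs(argv: string[]): TranslateOptions {
  // Defaults mirror the documented behaviour: changed files only, Japanese.
  const options: TranslateOptions = {
    all: false,
    languages: ["jp"],
    dryRun: false,
    verbose: false,
  };

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    switch (arg) {
      case "--all":
        options.all = true;
        break;
      case "--dry-run":
        options.dryRun = true;
        break;
      case "--verbose":
        options.verbose = true;
        break;
      case "--languages": {
        const value = argv[i + 1];
        if (value === undefined || value.startsWith("--")) {
          throw new Error("--languages requires a comma-separated list");
        }
        options.languages = value
          .split(",")
          .map((code) => code.trim())
          .filter((code) => code !== "");
        i++; // skip the value that was just consumed
        break;
      }
      default:
        throw new Error(`Unknown flag: ${arg}`);
    }
  }
  return options;
}
```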
GitHub Workflow
A GitHub workflow automatically translates documentation when a pull request changes English documentation files. The workflow:
- Detects changes to English documentation files
- Translates the changed files using AWS Bedrock
- Commits the translations back to the source branch
- Updates the PR with translation status
Manual Workflow Trigger
You can also manually trigger the translation workflow from the GitHub Actions tab. This is useful for:
- Running a full translation of all documentation
- Translating to specific languages
- Updating translations after making changes to the translation script
AWS Configuration
Translation runs on AWS Bedrock’s Haiku 3.5 model. To use it, you need:
- AWS credentials - For local development, configure your AWS credentials using the AWS CLI or environment variables.
- IAM Role - For GitHub Actions, configure an IAM role with OIDC authentication and the necessary permissions for AWS Bedrock.
Required Permissions
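The IAM role or user needs permission to invoke the translation model. The policy below grants bedrock:InvokeModel on the model ID anthropic.claude-3-haiku-20240307-v1:0; if scripts/translate.ts is configured with a different model ID, the Resource ARN must name that model instead.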
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
      ]
    }
  ]
}
```
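For orientation, the request that this permission authorizes might be shaped as in the sketch below. It only builds the input object and makes no network call. The function name, the prompt wording, and the max_tokens value are assumptions for illustration; the model ID is the one in the policy above, and the body follows the Anthropic Messages format that Bedrock accepts for Claude models.

```typescript
// Hypothetical request builder for illustration; not taken from scripts/translate.ts.
interface InvokeRequest {
  modelId: string;
  contentType: string;
  accept: string;
  body: string; // JSON-encoded Anthropic Messages payload
}

function buildTranslationRequest(
  markdown: string,
  targetLanguage: string,
  modelId = "anthropic.claude-3-haiku-20240307-v1:0", // model ID from the IAM policy
): InvokeRequest {
  const payload = {
    anthropic_version: "bedrock-2023-05-31",
    max_tokens: 4096, // assumed output budget per section
    // Assumed prompt wording; the real script's prompt may differ.
    system:
      `Translate the user's markdown from English to ${targetLanguage}. ` +
      "Preserve markdown syntax, code blocks, and HTML tags. " +
      "Keep technical terms in English.",
    messages: [
      { role: "user", content: [{ type: "text", text: markdown }] },
    ],
  };

  return {
    modelId,
    contentType: "application/json",
    accept: "application/json",
    body: JSON.stringify(payload),
  };
}
```

An object of this shape is what the AWS SDK's InvokeModelCommand (from @aws-sdk/client-bedrock-runtime) takes as input; the call fails with an access error if the role lacks bedrock:InvokeModel on the model's ARN.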
Translation Quality
Translation quality is generally high, but there are a few things to keep in mind:
- Technical terms - The system is configured to preserve technical terms in English.
- Code blocks - Code blocks are not translated, as they should remain in their original form.
- Context awareness - The translation model understands the context of the documentation, which helps with technical translations.
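One common way to guarantee that code blocks come back unchanged is to swap them for placeholders before translation and restore them afterwards. The sketch below shows that technique; it is an assumption about how such protection could work, not a description of scripts/translate.ts, and the function names and placeholder format are hypothetical.

```typescript
// Hypothetical placeholder technique for illustration; not taken from scripts/translate.ts.
const FENCE = "`".repeat(3); // a markdown code fence marker

interface ProtectedText {
  text: string;     // markdown with code blocks replaced by placeholders
  blocks: string[]; // the original code blocks, indexed by placeholder number
}

function protectCodeBlocks(markdown: string): ProtectedText {
  const blocks: string[] = [];
  // Non-greedy match from an opening fence to the next closing fence.
  const pattern = new RegExp(`${FENCE}[\\s\\S]*?${FENCE}`, "g");
  const text = markdown.replace(pattern, (block: string) => {
    blocks.push(block);
    return `@@CODE_BLOCK_${blocks.length - 1}@@`;
  });
  return { text, blocks };
}

function restoreCodeBlocks(translated: string, blocks: string[]): string {
  // Unknown placeholder numbers are left as-is rather than dropped.
  return translated.replace(
    /@@CODE_BLOCK_(\d+)@@/g,
    (match: string, index: string) => blocks[Number(index)] ?? match,
  );
}
```

Only the text field would be sent to the model, so the code never passes through translation at all.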
Customizing the Translation
You can customize the translation process by modifying the scripts/translate.ts file. Some possible customizations include:
- Adding support for more languages
- Changing the translation model
- Adjusting the prompts used for translation
- Modifying how documents are split and processed
Troubleshooting
If you encounter issues with the translation process:
- Check AWS credentials - Ensure your AWS credentials are properly configured.
- Check AWS region - Make sure you’re using a region where AWS Bedrock is available.
- Run with verbose output - Use the --verbose flag to see more detailed logs.
- Check for rate limiting - AWS Bedrock has rate limits that may affect large translation jobs.
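If rate limiting is the cause, retrying with exponential backoff is a common remedy. The sketch below is a generic illustration, not part of scripts/translate.ts: withRetry and isThrottlingError are hypothetical names, the attempt count and delays are arbitrary, and the check assumes the error's name is ThrottlingException, which is the name Bedrock uses for throttled requests.

```typescript
// Hypothetical retry helper for illustration; not taken from scripts/translate.ts.
function isThrottlingError(error: unknown): boolean {
  return error instanceof Error && error.name === "ThrottlingException";
}

async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,    // assumed limit
  baseDelayMs = 500,  // assumed initial delay; doubles after each failure
  isRetryable: (error: unknown) => boolean = isThrottlingError,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on the last attempt or on errors that retrying cannot fix.
      if (attempt >= maxAttempts || !isRetryable(error)) {
        throw error;
      }
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A translation call would be wrapped as `withRetry(() => translateSection(section))`, where translateSection stands in for whatever function performs the Bedrock request.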