Built-in Converters
ASH includes two built-in converters that preprocess files so that security scanners can read them. The converters handle archive extraction and notebook-to-source conversion automatically.
For detailed visual diagrams of the built-in converter architecture and workflows, see Built-in Converter Diagrams.
Converter Overview
Converter | Purpose | Input Formats | Output |
---|---|---|---|
Archive Converter | Extract compressed archives | .zip, .tar, .tar.gz, .tgz | Extracted files of known scannable extensions |
Jupyter Converter | Process Jupyter notebooks | .ipynb | Python source code |
Converter Details
Archive Converter
Purpose: Automatically extracts compressed archives to enable scanning of contained files.
Supported Formats:
- ZIP files (.zip)
- TAR archives (.tar, .tar.gz, .tgz)
Configuration:
converters:
  archive:
    enabled: true
    options:
      max_extraction_depth: 3
      max_file_size: "100MB"
      preserve_permissions: true
      extract_nested: true
Key Features:
- Recursive extraction of nested archives
- Size and depth limits for security
- Permission preservation
- Automatic cleanup after scanning

Use Cases:
- Scanning packaged applications
- Analyzing deployment artifacts
- Processing downloaded dependencies
- Auditing compressed source code
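To illustrate what depth-limited recursive extraction means, here is a minimal sketch using only the Python standard library. It is not ASH's implementation: the function names `is_archive` and `extract_archive`, the `.extracted` directory suffix, and the choice to count the top-level archive as depth 1 are all assumptions made for illustration. Size limits are left out to keep the example short.

```python
import tarfile
import zipfile
from pathlib import Path

# Suffixes treated as nested archives (mirrors the supported formats above).
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tgz")


def is_archive(path: Path) -> bool:
    """Return True if the file name looks like a supported archive."""
    return path.name.lower().endswith(ARCHIVE_SUFFIXES)


def extract_archive(archive: Path, dest: Path, max_depth: int = 3, _depth: int = 1) -> list[Path]:
    """Extract `archive` into `dest` and return the resulting non-archive files.

    Nested archives are expanded until `max_depth` is reached (assumption:
    the top-level archive counts as depth 1). Archives found at the limit
    are returned as plain files instead of being expanded.
    """
    dest.mkdir(parents=True, exist_ok=True)
    if zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    elif tarfile.is_tarfile(archive):
        with tarfile.open(archive) as tf:
            # Use the safer "data" extraction filter where this Python supports it.
            if hasattr(tarfile, "data_filter"):
                tf.extractall(dest, filter="data")
            else:
                tf.extractall(dest)
    else:
        raise ValueError(f"unsupported archive: {archive}")

    files: list[Path] = []
    # Materialize the listing first so nested output is not visited twice.
    for path in sorted(p for p in dest.rglob("*") if p.is_file()):
        if is_archive(path) and _depth < max_depth:
            nested_dest = path.with_name(path.name + ".extracted")
            files.extend(extract_archive(path, nested_dest, max_depth, _depth + 1))
        else:
            files.append(path)
    return files
```

Calling `extract_archive(Path("project.zip"), Path("out"), max_depth=2)` would expand `project.zip` and any archives directly inside it, but leave archives nested one level deeper untouched.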
Jupyter Converter
Purpose: Extracts Python code from Jupyter notebooks for security analysis.
Configuration:
converters:
  jupyter:
    enabled: true
    options:
      extract_code_cells: true
      extract_markdown_cells: false
      preserve_cell_numbers: true
      output_format: "python"
Key Features:
- Code cell extraction
- Cell number preservation for accurate line mapping
- Markdown cell processing (optional)
- Python syntax validation

Use Cases:
- Data science project security
- ML model code analysis
- Educational content scanning
- Research code auditing
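As a rough picture of what notebook conversion involves, the sketch below reads the notebook JSON (nbformat v4 layout, where each entry in `cells` has a `cell_type` and a `source`) and emits Python text with one `# %%` marker per cell. The function name `notebook_to_python`, the `include_markdown` parameter, and the `[cell N]` marker format are hypothetical; ASH's converter may produce different output.

```python
import json


def notebook_to_python(notebook_json: str, include_markdown: bool = False) -> str:
    """Convert notebook JSON text into Python source with per-cell markers."""
    notebook = json.loads(notebook_json)
    chunks = []
    for index, cell in enumerate(notebook.get("cells", []), start=1):
        source = cell.get("source", "")
        # nbformat allows `source` to be a string or a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        cell_type = cell.get("cell_type")
        if cell_type == "code":
            # Keep the original cell number so findings can be traced back.
            chunks.append(f"# %% [cell {index}]\n{source.rstrip()}\n")
        elif cell_type == "markdown" and include_markdown:
            # Markdown is emitted as comments so the output stays valid Python.
            commented = "\n".join(f"# {line}".rstrip() for line in source.splitlines())
            chunks.append(f"# %% [markdown] [cell {index}]\n{commented}\n")
    return "\n".join(chunks)
```

Numbering cells by their position in the notebook, rather than by position among code cells only, keeps the markers stable whether or not markdown cells are included.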
Configuration Examples
Basic Configuration
Advanced Configuration
converters:
  archive:
    enabled: true
    options:
      max_extraction_depth: 2
      max_file_size: "50MB"
      allowed_extensions: [".zip", ".tar.gz", ".7z"]
      exclude_patterns: ["*.exe", "*.dll"]
  jupyter:
    enabled: true
    options:
      extract_code_cells: true
      extract_markdown_cells: true
      cell_separator: "# %%"
      validate_syntax: true
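Glob-style values such as those in `exclude_patterns` can be understood through Python's `fnmatch` module. The sketch below assumes that patterns are matched case-insensitively against the file name only, not the full path; that behavior, and the helper name `should_extract`, are assumptions rather than documented ASH semantics.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def should_extract(member_name: str, exclude_patterns: list[str]) -> bool:
    """Return False if the member's file name matches any exclude pattern."""
    # Archive member names use forward slashes, so parse them as POSIX paths.
    filename = PurePosixPath(member_name).name.lower()
    return not any(fnmatchcase(filename, pattern.lower()) for pattern in exclude_patterns)
```

With the patterns from the configuration above, `should_extract("bin/setup.exe", ["*.exe", "*.dll"])` is `False` while `should_extract("src/app.py", ["*.exe", "*.dll"])` is `True`.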
Best Practices
Archive Security
converters:
  archive:
    options:
      max_extraction_depth: 3     # Prevent zip bombs
      max_file_size: "100MB"      # Limit resource usage
      scan_extracted_only: true   # Don't scan original archives
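The idea behind a per-file size limit can be sketched by checking the uncompressed sizes a ZIP archive declares before extracting anything. This is an illustration, not ASH's code: the helpers `parse_size` and `oversized_members` are hypothetical, and interpreting "MB" as 1024 * 1024 bytes is an assumption. Declared sizes can also be forged, so a robust implementation would additionally count bytes while extracting.

```python
import zipfile

# Assumption: binary units. Longer suffixes come first so "MB" is not read as "B".
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "B": 1}


def parse_size(text: str) -> int:
    """Convert a size string such as "100MB" into a number of bytes."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)


def oversized_members(archive, max_file_size: str) -> list[str]:
    """List ZIP members whose declared uncompressed size exceeds the limit."""
    limit = parse_size(max_file_size)
    with zipfile.ZipFile(archive) as zf:
        return [info.filename for info in zf.infolist() if info.file_size > limit]
```

A converter could skip the members returned by `oversized_members`, or refuse the whole archive, depending on how strict the policy should be.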
Jupyter Processing
converters:
  jupyter:
    options:
      preserve_cell_numbers: true   # Accurate line mapping
      validate_syntax: true         # Skip malformed cells
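Skipping malformed cells while keeping the original cell numbers can be sketched with `ast.parse` from the standard library. The function name `valid_cells` is hypothetical, and this only approximates what a `validate_syntax` option might do. Note that cells containing IPython magics such as `%matplotlib inline` are not valid Python and would be dropped by this check.

```python
import ast


def valid_cells(cell_sources: list[str]) -> list[tuple[int, str]]:
    """Return (cell_number, source) pairs for cells that parse as Python."""
    kept = []
    for number, source in enumerate(cell_sources, start=1):
        try:
            ast.parse(source)
        except (SyntaxError, ValueError):
            # Malformed cell: skip it, but keep numbering based on the original order.
            continue
        kept.append((number, source))
    return kept
```

Because numbering comes from the original cell order, a finding in the third cell is still reported against cell 3 even when the second cell was skipped.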
Integration with Scanners
Converters automatically prepare files for scanner consumption:
# Archives are extracted, then contents scanned
ash project.zip --scanners bandit,semgrep
# Jupyter notebooks converted to Python, then scanned
ash analysis.ipynb --scanners bandit,detect-secrets
Troubleshooting
Archive Issues
Extraction failures:
Large archives:
Jupyter Issues
Malformed notebooks:
Next Steps
- Scanner Configuration: Configure security scanners
- File Processing: Advanced file handling