Learning Module (mlsimkit.learn)

Before reading this, first understand Code Structure and Concepts. You are encouraged to complete the tutorials and read the user guides to understand the use case applications before modifying the code.

The learn module is the core component of the MLSimKit package, offering functionality for various machine learning tasks related to physics-based simulations. It follows a modular design, with a centralized common submodule containing shared utilities and helper functions used across the toolkit, and dedicated submodules for different use cases.

The common submodule contains shared components like the command-line interface (CLI) utilities, configuration management, logging utilities, mesh data processing, and a shared training loop. The training.py file within this submodule provides a generic training function that can be utilized by different use cases.

The networks submodule contains the implementations of various neural network architectures used in the toolkit, such as convolutional autoencoders and MeshGraphNets (a graph neural network architecture). These network architectures are designed to handle different types of data and serve different use cases.

Use Cases

There are dedicated submodules for each specific use case, such as Key Performance Indicator (KPI) prediction, surface variable prediction, and slice prediction (kpi, surface, and slices, respectively). These submodules encapsulate the functionality related to their respective use cases, including data loading and preprocessing, training, inference, and visualization (where applicable).

The learn module follows a consistent structure by convention across the different use case submodules (e.g., kpi, surface, slices). Each submodule typically contains the following components:

  • preprocessing.py: This file contains utilities for preprocessing the data specific to the use case, such as processing mesh files, converting data formats, and generating manifests.

  • data.py: This file defines custom dataset interfaces for loading and preprocessing the data from manifests, tailored to the specific data types and requirements of the use case.

  • schema/: This directory contains Pydantic schema definitions for various settings and configurations related to the use case, such as preprocessing settings, training settings, and inference settings.

  • training.py: This file implements the training functionality for the use case, including the run_train function that leverages the shared training loop from the common submodule.

  • inference.py: This file provides utilities for performing inference and generating predictions using the trained models specific to the use case.

  • cli.py: This file serves as the command-line interface (CLI) entry point for the use case, allowing users to interact with the functionality through subcommands (e.g., mlsimkit-learn kpi preprocess, mlsimkit-learn surface train).

Within each of these components, the code is structured to encapsulate the logic and utilities specific to the use case, while leveraging shared utilities and abstractions from the common submodule for common tasks like configuration management, logging, and training.
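Taken together, a use case submodule conventionally follows a layout like the one below (illustrative only; the exact contents of each package vary by use case):

```
mlsimkit/learn/<use_case>/      # e.g. kpi, surface, slices
├── cli.py                      # CLI entry point and subcommands
├── preprocessing.py            # data preprocessing, manifest generation
├── data.py                     # custom dataset interfaces over manifests
├── schema/                     # Pydantic settings (preprocess, train, inference)
├── training.py                 # run_train, built on common.training
└── inference.py                # prediction utilities for trained models
```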

Training Flow

The training.py file within each use case submodule follows a similar flow for training the respective machine learning models. Here’s a general overview of the training process:

  1. Configuration Parsing: The run_train function, typically the entry point for training, parses the configuration settings from the command-line arguments or configuration files. These settings may include hyperparameters, data paths, and other training-related options.

  2. Data Loading: The function loads the training and validation data from the respective manifests using the dataset interfaces defined in data.py. This step typically involves creating instances of the custom dataset classes and passing the appropriate manifest paths.

  3. Model and Optimizer Initialization: Based on the configuration settings and the characteristics of the input data (e.g., node and edge input sizes for geometric data), the function initializes the appropriate model architecture. It also creates an instance of the ModelIO class from the networks submodule, which encapsulates the logic for creating, saving, and loading models. Additionally, an optimizer is initialized for the training process.

  4. Training Loop: The core training process is typically delegated to the shared train function from the common.training module. This function handles the iterative training loop, computing the loss, backpropagation, and model updates. It also manages checkpointing, validation, and early stopping based on the provided configurations.

  5. Model Saving: After the training process is complete, the run_train function saves the trained model using the ModelIO instance. This step typically involves saving the model state, optimizer state, and other relevant metadata to a file or directory specified in the configuration.

  6. Optional Steps: Depending on the use case and configuration settings, additional steps may be performed after training, such as:

    • Generating predictions on the training and validation datasets, and saving the results for comparison or visualization purposes.

    • Logging training metrics and artifacts using MLflow or other experiment tracking tools.

    • Updating the internal manifests with the paths to the trained model or other generated artifacts.

While the specific implementation details may vary across different use cases, the general flow of the training.py file follows this structure, leveraging the shared utilities from the common submodule and the custom components defined within the use case submodule.
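The six steps above can be sketched as a single function. This is a framework-free illustration of the flow, not the real MLSimKit implementation: all names (`Settings`, `load_datasets`, the single-parameter "model") are hypothetical stand-ins, whereas the actual `run_train` functions use PyTorch models, `ModelIO`, and the shared loop in `common.training`.

```python
from dataclasses import dataclass


@dataclass
class Settings:            # 1. parsed from CLI args / config files
    learning_rate: float = 0.1
    epochs: int = 20
    patience: int = 3      # early-stopping patience


def load_datasets():       # 2. load train/validation data from manifests
    train = [(x, 2.0 * x) for x in range(10)]
    valid = [(x, 2.0 * x) for x in range(10, 14)]
    return train, valid


def run_train(settings: Settings):
    train_data, valid_data = load_datasets()
    weight = 0.0                              # 3. "model" is one parameter
    best_loss, stale = float("inf"), 0
    for epoch in range(settings.epochs):      # 4. iterative training loop
        for x, y in train_data:               #    loss, gradient, update
            grad = 2.0 * (weight * x - y) * x
            weight -= settings.learning_rate * grad / (x * x + 1.0)
        val_loss = sum((weight * x - y) ** 2 for x, y in valid_data) / len(valid_data)
        if val_loss < best_loss - 1e-9:       #    validation + early stopping
            best_loss, stale = val_loss, 0
        else:
            stale += 1
            if stale >= settings.patience:
                break
    return {"weight": weight, "val_loss": best_loss}  # 5. "saved" model state


checkpoint = run_train(Settings())
```

Step 6 (predictions on the train/validation sets, experiment tracking, manifest updates) would follow the `run_train` call, driven by the same settings object.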

Programmatically Training and Predicting

The learn module can be imported and used in your Python scripts or notebooks. For example, to perform KPI prediction, you can import the necessary components from the kpi submodule:

# Note: the import paths for the settings classes and Accelerator below are
# assumptions; check the kpi.schema package and your accelerator library.
from mlsimkit.learn.kpi import preprocessing, training, inference
from mlsimkit.learn.kpi.schema import (  # assumed location of the settings classes
    PreprocessingSettings,
    TrainingSettings,
    InferenceSettings,
)
from accelerate import Accelerator  # assumed: Hugging Face Accelerate

# Preprocess data
settings = PreprocessingSettings(...)
working_manifest = preprocessing.run_preprocess(settings, project_root)

# Train the model
train_settings = TrainingSettings(...)
accelerator = Accelerator(...)
training.run_train(train_settings, accelerator)

# Perform inference
inference_settings = InferenceSettings(...)
inference.run_predict(inference_settings, compare_groundtruth=True)

Similarly, for other tasks like surface variable prediction or slice prediction, you can import the relevant components from the corresponding submodules.

For more detailed usage examples and configuration options, refer to the user guides and tutorials provided in the MLSimKit documentation.

mlsimkit.learn.common

The common module contains shared utilities and helper functions used across the learning module:

  • cli.py: CLI command entry point for mlsimkit-learn. Submodules add subcommands, e.g., mlsimkit-learn kpi ....

  • config.py: Configuration management and parsing utilities.

  • logging.py: Logging utilities and configuration.

  • mesh.py: Utilities for working with mesh data, including loading, converting, downsampling, and preprocessing mesh files.

  • schema: Pydantic schema definitions for various components, including optimizers, training settings, and project configuration.

  • tracking.py: Utilities for tracking and logging machine learning experiments using MLflow.

  • training.py: Utilities for training machine learning models, including a generic training loop that can be used across different use cases.

  • utils.py: General utility functions for tasks like calculating mean and standard deviation, obtaining optimizers and learning rate schedulers, and saving loss plots and prediction results.
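As an illustration of the kind of helper found in utils.py, a mean and standard deviation can be computed in a single pass with Welford's algorithm. This sketch is illustrative only and is not the actual utils.py implementation:

```python
import math


def running_mean_std(values):
    """One-pass mean and population standard deviation (Welford's algorithm)."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)   # accumulates the sum of squared deviations
    std = math.sqrt(m2 / count) if count else 0.0
    return mean, std


mean, std = running_mean_std([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
# mean == 5.0, std == 2.0 (population standard deviation)
```

A one-pass formulation like this avoids holding an entire dataset in memory, which matters when normalizing large mesh datasets.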

mlsimkit.learn.kpi

The kpi module contains components for Key Performance Indicator (KPI) prediction tasks:

  • cli.py: CLI command entry for KPI prediction, including options for preprocessing, training, and inference.

  • data.py: Data loading and preprocessing utilities for KPI prediction, including the KPIDataset class for loading and handling KPI data.

  • inference.py: Inference functionality for KPI prediction, including utilities for getting predictions and saving prediction results.

  • preprocessing.py: Preprocessing utilities for KPI prediction, including functions for processing mesh files and adding preprocessed data to the manifest.

  • schema: Pydantic schema definitions for KPI prediction tasks, including preprocessing, inference, and training settings.

  • training.py: Training functionality for KPI prediction models, including the run_train function for training KPI models using the shared training loop.

mlsimkit.learn.slices

The slices module contains components for slice prediction tasks:

  • cli.py: CLI command entry for slice prediction, including options for preprocessing, training, and inference.

  • data.py: Data loading and preprocessing utilities for slice prediction, including the SlicesDataset and GraphDataset classes for loading and handling slice data.

  • inference.py: Inference functionality for slice prediction, including utilities for running inference on autoencoders and the final prediction model.

  • preprocessing.py: Preprocessing utilities for slice prediction, including functions for loading and converting slice image data.

  • schema: Pydantic schema definitions for slice prediction tasks, including preprocessing, inference, and training settings.

  • training.py: Training functionality for slice prediction models, including the run_train_ae and run_train_mgn functions for training autoencoders and the final prediction model, respectively.

mlsimkit.learn.surface

The surface module contains components for surface variable prediction tasks:

  • cli.py: CLI command entry for surface variable prediction, including options for preprocessing, training, inference, and visualization.

  • data.py: Data loading and preprocessing utilities for surface variable prediction, including the SurfaceDataset class for loading and handling surface data.

  • inference.py: Inference functionality for surface variable prediction, including utilities for running inference and converting predictions to VTK/PyVista formats.

  • preprocessing.py: Preprocessing utilities for surface variable prediction, including functions for processing mesh files, mapping data to STL files, and handling surface variables.

  • schema: Pydantic schema definitions for surface variable prediction tasks, including preprocessing, inference, training, and visualization settings.

  • training.py: Training functionality for surface variable prediction models, including the run_train function for training surface prediction models using the shared training loop.

  • visualize.py: Visualization utilities for surface variable prediction results, including the Viewer class for rendering and visualizing predictions.

mlsimkit.learn.manifest

The manifest module provides utilities for working with data manifests:

  • manifest.py: This file contains the core functionality for creating, splitting, and processing manifests. It includes functions for:

    • Generating manifest entries from simulation “run” folders, extracting parameter values from data files and file paths from glob patterns.

    • Reading and writing manifest files in JSON lines format.

    • Creating and copying working manifests to avoid modifying the original user manifests.

    • Resolving file paths within manifests, handling relative and absolute paths.

    • Splitting manifests into train, validation, and test sets based on specified percentages and random seeds.

  • cli.py: This file provides the command-line interface (CLI) for working with manifests:

    • The create command generates a manifest file from a dataset, extracting parameter values from data files and file paths from glob patterns.

    • The split command splits an existing manifest file into train, validation, and test sets based on the provided split settings.

The manifest module plays a crucial role in managing and preprocessing data for the various machine learning tasks supported by MLSimKit. It ensures that the necessary data files and metadata are organized and accessible through the manifest files, which are then used by other components of the toolkit for tasks such as training and inference.
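Two of the operations described above, reading/writing JSON Lines files and a seeded train/validation/test split, can be sketched as follows. The helper names here are hypothetical; see mlsimkit.learn.manifest.manifest for the real implementations:

```python
import json
import random


def write_manifest(path, entries):
    """Write one JSON object per line (JSON Lines format)."""
    with open(path, "w") as f:
        for entry in entries:
            f.write(json.dumps(entry) + "\n")


def read_manifest(path):
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


def split_manifest(entries, train_pct=0.6, valid_pct=0.2, seed=0):
    """Shuffle reproducibly, then split; the remainder becomes the test set."""
    entries = list(entries)
    random.Random(seed).shuffle(entries)
    n_train = int(len(entries) * train_pct)
    n_valid = int(len(entries) * valid_pct)
    return (entries[:n_train],
            entries[n_train:n_train + n_valid],
            entries[n_train + n_valid:])


runs = [{"run": f"run{i:03d}", "geometry": f"run{i:03d}/mesh.stl"} for i in range(10)]
train, valid, test = split_manifest(runs, seed=42)
```

Fixing the seed makes the split reproducible across invocations, which is why the real split settings expose both the percentages and a random seed.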

mlsimkit.learn.networks

The networks module contains implementations of various neural network architectures used in the toolkit:

  • autoencoder.py: Implementation of convolutional autoencoders, including the ConvAutoencoder class and related utilities for training and inference.

  • mgn.py: Implementation of MeshGraphNets, a graph neural network architecture, including the MeshGraphNet class and related utilities for training and inference.

  • schema: Pydantic schema definitions for network architectures, including settings for convolutional autoencoders.
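MeshGraphNets repeatedly updates node features by aggregating messages from neighboring mesh nodes. The real mgn.py implementation is a PyTorch graph network with learned message and update functions; the framework-free sketch below only illustrates the data flow of a single message-passing step over a toy graph, using plain averaging in place of the learned MLPs:

```python
def message_passing_step(node_features, edges):
    """One aggregation step: each node averages its own feature with its
    neighbors'. Illustrative only; MeshGraphNets use learned MLPs here."""
    neighbors = {i: [] for i in range(len(node_features))}
    for src, dst in edges:           # treat mesh edges as bidirectional
        neighbors[src].append(dst)
        neighbors[dst].append(src)
    updated = []
    for i, feat in enumerate(node_features):
        msgs = [node_features[j] for j in neighbors[i]]
        updated.append((feat + sum(msgs)) / (1 + len(msgs)))
    return updated


# Toy scalar features on a 4-node path graph: 0 - 1 - 2 - 3
features = [0.0, 0.0, 0.0, 12.0]
edges = [(0, 1), (1, 2), (2, 3)]
step1 = message_passing_step(features, edges)
# → [0.0, 0.0, 4.0, 6.0]: node 3's value begins propagating toward node 0
```

Each additional step spreads information one hop further across the mesh, which is why the number of message-passing steps is a key hyperparameter for graph networks on large meshes.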