Learning Module (mlsimkit.learn)¶
Before reading this, first understand Code Structure and Concepts. You are encouraged to complete the tutorials and read the user guides to understand the use case applications before modifying the code.
The learn module is the core component of the MLSimKit package, offering functionality for various machine learning tasks related to physics-based simulations. It follows a modular design, with a centralized common submodule containing shared utilities and helper functions used across the toolkit, and dedicated submodules for different use cases.
The common submodule contains shared components such as command-line interface (CLI) utilities, configuration management, logging utilities, mesh data processing, and a shared training loop. The training.py file within this submodule provides a generic training function that can be utilized by different use cases.
The networks submodule contains the implementations of various neural network architectures used in the toolkit, such as convolutional autoencoders and MeshGraphNets (a graph neural network architecture). These network architectures are designed to handle different types of data and serve different use cases.
Use Cases¶
There are dedicated submodules for each specific use case, such as Key Performance Indicator (KPI) prediction, surface variable prediction, and slice prediction (kpi, surface, and slices, respectively). These submodules encapsulate the functionality related to their respective use cases, including data loading and preprocessing, training, inference, and visualization (where applicable).
The learn module follows a consistent structure by convention across the different use case submodules (e.g., kpi, surface, slices). Each submodule typically contains the following components:
- preprocessing.py: Utilities for preprocessing the data specific to the use case, such as processing mesh files, converting data formats, and generating manifests.
- data.py: Custom dataset interfaces for loading and preprocessing the data from manifests, tailored to the specific data types and requirements of the use case.
- schema/: Pydantic schema definitions for various settings and configurations related to the use case, such as preprocessing settings, training settings, and inference settings.
- training.py: The training functionality for the use case, including the run_train function that leverages the shared training loop from the common submodule.
- inference.py: Utilities for performing inference and generating predictions using the trained models specific to the use case.
- cli.py: The command-line interface (CLI) entry point for the use case, allowing users to interact with the functionality through subcommands (e.g., mlsimkit-learn kpi preprocess, mlsimkit-learn surface train).
Within each of these components, the code is structured to encapsulate the logic and utilities specific to the use case, while leveraging shared utilities and abstractions from the common submodule for common tasks like configuration management, logging, and training.
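Putting the components above together, a use case submodule has roughly the following layout (shown here for kpi as an illustrative sketch; the exact contents vary slightly between use cases):

```
mlsimkit/learn/kpi/
├── cli.py            # CLI entry point (mlsimkit-learn kpi ...)
├── preprocessing.py  # mesh processing, format conversion, manifest generation
├── data.py           # dataset interfaces for loading data from manifests
├── schema/           # Pydantic settings (preprocessing, training, inference)
├── training.py       # run_train, built on the shared common.training loop
└── inference.py      # prediction utilities for trained models
```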
Training Flow¶
The training.py file within each use case submodule follows a similar flow for training the respective machine learning models. Here’s a general overview of the training process:
1. Configuration Parsing: The run_train function, typically the entry point for training, parses the configuration settings from the command-line arguments or configuration files. These settings may include hyperparameters, data paths, and other training-related options.
2. Data Loading: The function loads the training and validation data from the respective manifests using the dataset interfaces defined in data.py. This step typically involves creating instances of the custom dataset classes and passing the appropriate manifest paths.
3. Model and Optimizer Initialization: Based on the configuration settings and the characteristics of the input data (e.g., node and edge input sizes for geometric data), the function initializes the appropriate model architecture. It also creates an instance of the ModelIO class from the networks submodule, which encapsulates the logic for creating, saving, and loading models. Additionally, an optimizer is initialized for the training process.
4. Training Loop: The core training process is typically delegated to the shared train function from the common.training module. This function handles the iterative training loop, computing the loss, backpropagation, and model updates. It also manages checkpointing, validation, and early stopping based on the provided configurations.
5. Model Saving: After the training process is complete, the run_train function saves the trained model using the ModelIO instance. This step typically involves saving the model state, optimizer state, and other relevant metadata to a file or directory specified in the configuration.
6. Optional Steps: Depending on the use case and configuration settings, additional steps may be performed after training, such as:
   - Generating predictions on the training and validation datasets, and saving the results for comparison or visualization purposes.
   - Logging training metrics and artifacts using MLflow or other experiment tracking tools.
   - Updating the internal manifests with the paths to the trained model or other generated artifacts.
While the specific implementation details may vary across different use cases, the general flow of the training.py file follows this structure, leveraging the shared utilities from the common submodule and the custom components defined within the use case submodule.
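Schematically, the steps above give run_train the following shape. This is illustrative pseudocode only; the actual function signatures and helper names (load_datasets, make_optimizer) differ per use case and MLSimKit version:

```
def run_train(settings, accelerator):
    # 1. Configuration parsing: settings arrive as validated Pydantic models

    # 2. Data loading: build datasets from the train/validation manifests (data.py)
    train_data, valid_data = load_datasets(settings)

    # 3. Model and optimizer initialization
    model_io = ModelIO(...)                       # from the networks submodule
    model = model_io.create(settings, train_data) # sized from the input data
    optimizer = make_optimizer(settings, model)

    # 4. Training loop: delegate to the shared loop in common.training,
    #    which handles loss, backprop, checkpointing, and early stopping
    common.training.train(model, optimizer, train_data, valid_data, ...)

    # 5. Model saving: persist model/optimizer state and metadata
    model_io.save(model, optimizer)
```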
Programmatically Training and Predicting¶
The learn module can be imported and used in your Python scripts or notebooks. For example, to perform KPI prediction, you can import the necessary components from the kpi submodule:
from mlsimkit.learn.kpi import preprocessing, training, inference
# Note: PreprocessingSettings, TrainingSettings, and InferenceSettings are the
# use case's Pydantic schema classes, and Accelerator is the accelerator object
# expected by the shared training loop; import them from the appropriate
# modules for your MLSimKit version. project_root is your project directory.
# Preprocess data
settings = PreprocessingSettings(...)
working_manifest = preprocessing.run_preprocess(settings, project_root)
# Train the model
train_settings = TrainingSettings(...)
accelerator = Accelerator(...)
training.run_train(train_settings, accelerator)
# Perform inference
inference_settings = InferenceSettings(...)
inference.run_predict(inference_settings, compare_groundtruth=True)
Similarly, for other tasks like surface variable prediction or slice prediction, you can import the relevant components from the corresponding submodules.
For more detailed usage examples and configuration options, refer to the user guides and tutorials provided in the MLSimKit documentation.
mlsimkit.learn.common¶
The common module contains shared utilities and helper functions used across the learning module:
- cli.py: CLI command entry for mlsimkit-learn. Submodules add sub-commands, e.g., mlsimkit-learn kpi ....
- config.py: Configuration management and parsing utilities.
- logging.py: Logging utilities and configuration.
- mesh.py: Utilities for working with mesh data, including loading, converting, downsampling, and preprocessing mesh files.
- schema: Pydantic schema definitions for various components, including optimizers, training settings, and project configuration.
- tracking.py: Utilities for tracking and logging machine learning experiments using MLflow.
- training.py: Utilities for training machine learning models, including a generic training loop that can be used across different use cases.
- utils.py: General utility functions for tasks like calculating mean and standard deviation, obtaining optimizers and learning rate schedulers, and saving loss plots and prediction results.
mlsimkit.learn.kpi¶
The kpi module contains components for Key Performance Indicator (KPI) prediction tasks:
- cli.py: CLI command entry for KPI prediction, including options for preprocessing, training, and inference.
- data.py: Data loading and preprocessing utilities for KPI prediction, including the KPIDataset class for loading and handling KPI data.
- inference.py: Inference functionality for KPI prediction, including utilities for getting predictions and saving prediction results.
- preprocessing.py: Preprocessing utilities for KPI prediction, including functions for processing mesh files and adding preprocessed data to the manifest.
- schema: Pydantic schema definitions for KPI prediction tasks, including preprocessing, inference, and training settings.
- training.py: Training functionality for KPI prediction models, including the run_train function for training KPI models using the shared training loop.
mlsimkit.learn.slices¶
The slices module contains components for slice prediction tasks:
- cli.py: CLI command entry for slice prediction, including options for preprocessing, training, and inference.
- data.py: Data loading and preprocessing utilities for slice prediction, including the SlicesDataset and GraphDataset classes for loading and handling slice data.
- inference.py: Inference functionality for slice prediction, including utilities for running inference on autoencoders and the final prediction model.
- preprocessing.py: Preprocessing utilities for slice prediction, including functions for loading and converting slice image data.
- schema: Pydantic schema definitions for slice prediction tasks, including preprocessing, inference, and training settings.
- training.py: Training functionality for slice prediction models, including the run_train_ae and run_train_mgn functions for training autoencoders and the final prediction model, respectively.
mlsimkit.learn.surface¶
The surface module contains components for surface variable prediction tasks:
- cli.py: CLI command entry for surface variable prediction, including options for preprocessing, training, inference, and visualization.
- data.py: Data loading and preprocessing utilities for surface variable prediction, including the SurfaceDataset class for loading and handling surface data.
- inference.py: Inference functionality for surface variable prediction, including utilities for running inference and converting predictions to VTK/PyVista formats.
- preprocessing.py: Preprocessing utilities for surface variable prediction, including functions for processing mesh files, mapping data to STL files, and handling surface variables.
- schema: Pydantic schema definitions for surface variable prediction tasks, including preprocessing, inference, training, and visualization settings.
- training.py: Training functionality for surface variable prediction models, including the run_train function for training surface prediction models using the shared training loop.
- visualize.py: Visualization utilities for surface variable prediction results, including the Viewer class for rendering and visualizing predictions.
mlsimkit.learn.manifest¶
The manifest module provides utilities for working with data manifests:
- manifest.py: The core functionality for creating, splitting, and processing manifests. It includes functions for:
  - Generating manifest entries from simulation “run” folders, extracting parameter values from data files and file paths from glob patterns.
  - Reading and writing manifest files in JSON Lines format.
  - Creating and copying working manifests to avoid modifying the original user manifests.
  - Resolving file paths within manifests, handling relative and absolute paths.
  - Splitting manifests into train, validation, and test sets based on specified percentages and random seeds.
- cli.py: The command-line interface (CLI) for working with manifests:
  - The create command generates a manifest file from a dataset, extracting parameter values from data files and file paths from glob patterns.
  - The split command splits an existing manifest file into train, validation, and test sets based on the provided split settings.
The manifest module plays a crucial role in managing and preprocessing data for the various machine learning tasks supported by MLSimKit. It ensures that the necessary data files and metadata are organized and accessible through the manifest files, which are then used by other components of the toolkit for tasks such as training and inference.
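As a rough illustration of the JSON Lines format and the train/validation/test splitting described above, the following standalone sketch mimics the behavior using only the standard library. The field names and file layout here are hypothetical examples, not MLSimKit's actual manifest schema:

```python
import json
import random

def write_manifest(path, entries):
    # JSON Lines: one JSON object per line
    with open(path, "w") as f:
        for entry in entries:
            f.write(json.dumps(entry) + "\n")

def read_manifest(path):
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

def split_manifest(entries, train_pct=0.6, valid_pct=0.2, seed=0):
    # Shuffle with a fixed seed so the split is reproducible
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_pct)
    n_valid = int(len(shuffled) * valid_pct)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

# Hypothetical entries: a geometry file plus a KPI value per simulation run
entries = [{"run": i, "geometry": f"run{i}/mesh.stl", "kpi": i * 0.1}
           for i in range(10)]
train, valid, test = split_manifest(entries)
print(len(train), len(valid), len(test))  # 6 2 2
```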
mlsimkit.learn.networks¶
The networks module contains implementations of various neural network architectures used in the toolkit:
- autoencoder.py: Implementation of convolutional autoencoders, including the ConvAutoencoder class and related utilities for training and inference.
- mgn.py: Implementation of MeshGraphNets, a graph neural network architecture, including the MeshGraphNet class and related utilities for training and inference.
- schema: Pydantic schema definitions for network architectures, including settings for convolutional autoencoders.