Tutorial: Create an MLCube

Interested in getting started with MLCube®? Follow the instructions in this tutorial.

Step 1: Setup

Get MLCube, the MLCube examples and the MLCube templates, and create a Python environment.

# You can clone the mlcube examples and templates from GitHub
git clone https://github.com/mlcommons/mlcube_examples

# Create a python environment
virtualenv -p python3 ./env && source ./env/bin/activate

# Install mlcube, mlcube-docker and cookiecutter 
pip install mlcube mlcube-docker cookiecutter
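
Before moving on, it can help to confirm that the three packages actually landed in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical, and the package names are taken from the `pip install` line above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("mlcube", "mlcube-docker", "cookiecutter")):
    """Return {package name: installed version, or None if it is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so a missing install is easy to spot.
report = installed_versions()
for name, ver in report.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Run it with the interpreter from `./env`; a `NOT INSTALLED` line most likely means the virtual environment was not activated when `pip install` ran.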

Step 2: Configure MLCube using the mlcube_cookiecutter

Let's use the matmul example that we downloaded in the previous step to illustrate how to make an MLCube. Matmul is a simple matrix multiplication example written in Python with TensorFlow. When you create an MLCube for your own model, you will use your own code, data and Dockerfile.

# make the mlcube examples root directory your current directory
cd mlcube_examples

# rename matmul reference implementation from matmul to matmul_reference
mv ./matmul ./matmul_reference

# create an mlcube directory using the mlcube template (note: do not use quotes
# in your input to cookiecutter):
#    mlcube_name = matmul
#    mlcube_description = Matrix multiplication example
#    author_name = MLPerf Best Practices Working Group
#    author_email =  first.second@corp.com
#    author_org = corp
cookiecutter https://github.com/mlcommons/mlcube_cookiecutter.git

# copy matmul.py, Dockerfile and requirements.txt to the
# new matmul directory created by cookiecutter
cp matmul_reference/matmul.py matmul/
cp matmul_reference/Dockerfile matmul/
cp matmul_reference/requirements.txt matmul/

# copy input file for matmul to workspace directory
cp -R  matmul_reference/workspace matmul
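
When you wrap your own code instead of matmul, the script you copy in needs a command-line entry point that the runner can call. The sketch below is not the matmul.py from the examples repository. It is a dependency-free stand-in that assumes the runner passes the task name as the first argument, followed by one `--name=path` option per parameter declared in mlcube.yaml. The flat `m`/`k`/`n` parameters format and the helper names are hypothetical.

```python
import argparse
import random
import tempfile
from pathlib import Path


def read_shapes(path):
    """Parse flat 'key: value' lines into a dict of ints (hypothetical format)."""
    shapes = {}
    for line in Path(path).read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, value = line.split(":", 1)
        shapes[key.strip()] = int(value)
    return shapes


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def run_task(parameters_file, output_file, seed=0):
    """Build random m x k and k x n matrices, multiply them, write the result."""
    shapes = read_shapes(parameters_file)
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(shapes["k"])] for _ in range(shapes["m"])]
    b = [[rng.random() for _ in range(shapes["n"])] for _ in range(shapes["k"])]
    c = matmul(a, b)
    text = "\n".join(" ".join(f"{v:.4f}" for v in row) for row in c)
    Path(output_file).write_text(text + "\n")
    return c


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("task", choices=["matmul"])  # task name comes first
    parser.add_argument("--parameters_file", required=True)
    parser.add_argument("--output_file", required=True)
    args = parser.parse_args(argv)
    run_task(args.parameters_file, args.output_file)


# Demo: emulate the assumed runner call inside a throwaway workspace.
with tempfile.TemporaryDirectory() as workspace:
    params = Path(workspace) / "shapes.yaml"
    params.write_text("m: 2\nk: 3\nn: 4\n")
    out = Path(workspace) / "matmul_output.txt"
    main(["matmul", f"--parameters_file={params}", f"--output_file={out}"])
    result_lines = out.read_text().splitlines()
print(len(result_lines), "rows written")
```

Because the option names mirror the parameter names in the `tasks` section of mlcube.yaml, keeping the two in sync is the main thing to watch when adapting a script like this.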

Edit the template files

Start by looking at the mlcube.yaml file that has been generated by cookiecutter.

cd ./matmul

Cookiecutter has filled in the values you entered at its prompts (name, description and author details) in the mlcube.yaml file shown here:

 
# This YAML file marks a directory to be an MLCube directory. When running MLCubes with runners, 
# MLCube path is specified using `--mlcube` runner command line argument. The most important 
# parameters that are defined here are (1) name, (2) author and (3) list of MLCube tasks.

# MLCube name (string). Replace it with your MLCube name (e.g. "MNIST").
name: matmul

# MLCube description (string). Replace it with your MLCube name
# (e.g. "MLCommons MNIST MLCube example").
description: Matrix multiplication example

# List of authors. Cookiecutter sets the first author.
authors:
  - name: MLPerf Best Practices Working Group
    email: first.second@corp.com
    org: corp

# This dictionary can contain information about SW/HW requirements
platform: {}
#  accelerator_count: 0
#  accelerator_maker: NVIDIA
#  accelerator_model: A100-80GB
#  host_memory_gb: 40
#  need_internet_access: True
#  host_disk_space_gb: 100

# This cookiecutter creates a docker-based MLCube.
docker:
  image: mlcommons/matmul:0.0.1

# List of MLCube tasks supported by this MLCube.
tasks:
  matmul:
    parameters:
      inputs: {parameters_file: parameters_file.yaml}
      outputs: {output_file: {type: file, default: output.txt}}

The input file shapes.yaml, which we copied into the MLCube workspace earlier, contains the input parameters that set the matrix dimensions. Because we use that file instead, remove the automatically generated parameters file.

rm workspace/parameters_file.yaml

Now edit the mlcube.yaml file. Three lines need to change: the docker image tag, the parameters_file input and the output_file default. The edited file is shown here:


# This YAML file marks a directory to be an MLCube directory. When running MLCubes with runners, 
# MLCube path is specified using `--mlcube` runner command line argument. The most important 
# parameters that are defined here are (1) name, (2) author and (3) list of MLCube tasks.

# MLCube name (string). Replace it with your MLCube name (e.g. "MNIST").
name: matmul

# MLCube description (string). Replace it with your MLCube name 
# (e.g. "MLCommons MNIST MLCube example").
description: Matrix multiplication example

# List of authors. Cookiecutter sets the first author.
authors:
  - name: MLPerf Best Practices Working Group
    email: first.second@corp.com
    org: corp

# This dictionary can contain information about SW/HW requirements
platform: {}
#  accelerator_count: 0
#  accelerator_maker: NVIDIA
#  accelerator_model: A100-80GB
#  host_memory_gb: 40
#  need_internet_access: True
#  host_disk_space_gb: 100

# This cookiecutter creates a docker-based MLCube.
docker:
  image: mlcommons/matmul:v1.0

# List of MLCube tasks supported by this MLCube.
tasks:
  matmul:
    parameters:
      inputs: {parameters_file: shapes.yaml}
      outputs: {output_file: {type: file, default: matmul_output.txt}}
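
To see how the `tasks` section relates to files on disk, the sketch below resolves each parameter to a path under the workspace directory. It is an illustration, not MLCube's own resolution logic. It assumes that relative values are interpreted against `./workspace`, which matches where shapes.yaml was copied and where the output file is read later in this tutorial. The `tasks` dict mirrors the YAML above, and the `resolve` helper is hypothetical.

```python
from pathlib import Path

# The `tasks` section of the edited mlcube.yaml, written out as a Python dict.
tasks = {
    "matmul": {
        "parameters": {
            "inputs": {"parameters_file": "shapes.yaml"},
            "outputs": {"output_file": {"type": "file", "default": "matmul_output.txt"}},
        }
    }
}


def resolve(task_name, workspace="workspace"):
    """Map each parameter name of a task to a path under the workspace."""
    resolved = {}
    params = tasks[task_name]["parameters"]
    for group in ("inputs", "outputs"):
        for name, spec in params.get(group, {}).items():
            # A parameter is either a plain string or a {type, default} mapping.
            relative = spec["default"] if isinstance(spec, dict) else spec
            resolved[name] = Path(workspace) / relative
    return resolved


paths = resolve("matmul")
for name, path in paths.items():
    print(f"{name}: {path.as_posix()}")
```

The two forms shown in the YAML (a bare file name for the input, a `type`/`default` mapping for the output) both reduce to a single relative path here, which is why changing `default` changes the name of the file that appears in the workspace.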

Step 3: Build the Docker container image

mlcube configure --mlcube=. --platform=docker -Prunner.build_strategy=auto

Step 4: Test your MLCube

# Run `matmul` task
mlcube run --mlcube=. --platform=docker --task=matmul

# Show the content of the workspace directory 
ls ./workspace

# Show the content of the file generated by `matmul` task
cat ./workspace/matmul_output.txt
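
If you want to script the check above, for example in a CI job, a small helper can report expected output files that are missing or empty. This is a sketch using only the standard library; the helper name `missing_outputs` is hypothetical, and the demo runs against a temporary directory rather than a real MLCube workspace.

```python
import tempfile
from pathlib import Path


def missing_outputs(workspace, expected=("matmul_output.txt",)):
    """Return the expected files that are absent or empty in the workspace."""
    root = Path(workspace)
    return [
        name
        for name in expected
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


# Demo on a throwaway directory: the file is missing first, then present.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_outputs(tmp)
    (Path(tmp) / "matmul_output.txt").write_text("placeholder\n")
    after = missing_outputs(tmp)
print(before, after)
```

To use it after `mlcube run`, call `missing_outputs("./workspace")` from the MLCube root directory; an empty list means the task wrote its output file.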