Part 1: Execute a simple pipeline

The main goal of the Craft AI platform is to let you deploy your machine learning pipelines easily.

In this part, we will use the platform to build a simple “hello world” application: we will show you how to execute a basic Python script that prints “Hello world” along with the number of days until 2025.

You will learn how to:

  • Package your application code into a step on the platform
  • Embed it in a pipeline
  • Execute it on the platform
  • Check the logs of the executions on the web interface

step1_00

Create a step with the SDK

The first thing to do to build an application on the Craft AI platform is to create a step.

A Step is the equivalent of a Python function in the Craft AI platform. Like a regular function, a step is defined by the inputs it ingests, the code it runs, and the outputs it returns. For this “hello world” use case, we are focusing on the code part so we will ignore inputs and outputs for now.

A step is created from any function located in the source code of your repository, using the create_step() method of the sdk object.

It is very important to understand that the platform can only create steps from the code present in your GitHub / GitLab repository, in the branch specified during setup. If you have uncommitted changes, they won’t be taken into account at step creation.

We've added this code to your repository in the get_started folder:

import datetime

def helloWorld() -> None:

    now = datetime.datetime.now()
    difference = datetime.datetime(2025, 1, 1) - now

    print(f'Hello world ! Number of days to 2025 : {difference.days}')
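As a quick local sanity check (no platform needed), you can reproduce the step's date arithmetic with a fixed "now" so the result is deterministic:

```python
import datetime

# Fixed "now" instead of datetime.datetime.now(), so the output is stable.
now = datetime.datetime(2024, 12, 25)
difference = datetime.datetime(2025, 1, 1) - now

# `difference` is a datetime.timedelta; `.days` gives the whole-day count.
print(f'Hello world ! Number of days to 2025 : {difference.days}')
```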

Our helloWorld function is located in src/part-1-helloWorld.py, so we can create our first step on the platform as follows:

sdk.create_step(
    step_name='part-1-hello-world',
    function_path='src/part-1-helloWorld.py',
    function_name='helloWorld'
)

Its main arguments are:

  • The step_name is the name of the step that will be created. This is the identifier you will use later to refer to this step.
  • The function_path argument is the path of the Python module containing the function that you want to execute for this step. This path must be relative to the root of the git repository.
  • The function_name argument is the name of the function that you want to execute for this step.

The above code should give you the following output:

>>> Please wait while step is being created. This may take a while...
>>> Steps creation succeeded
>>> {'name': 'part-1-hello-world'}

You can view the list of steps that you created in the platform with the list_steps() function of the SDK.

step_list = sdk.list_steps()
print(step_list)
>>> [{'name': 'part-1-hello-world',
>>>   'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>   'updated_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>   'status': 'Ready',
>>>   'repository_branch': 'main',
>>>   'commit_id': 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
>>>   'repository_url': 'git@github.com:xxxxx/xxxxx.git'}]

You can see your step, with its creation status at Ready.
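Since list_steps() returns a plain list of dicts, standard Python filtering applies. Here is a small sketch over a mocked return value (the second step name is hypothetical, just for illustration):

```python
# Mocked list_steps() return value (shape taken from the output above).
step_list = [
    {'name': 'part-1-hello-world', 'status': 'Ready', 'repository_branch': 'main'},
    {'name': 'part-2-iris', 'status': 'Pending', 'repository_branch': 'main'},
]

# Keep only the steps whose creation has finished.
ready_steps = [s['name'] for s in step_list if s['status'] == 'Ready']
print(ready_steps)  # ['part-1-hello-world']
```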

You can also get the information of a specific step with the get_step() function of the SDK.

step_info = sdk.get_step('part-1-hello-world')
print(step_info)
>>> {
>>>   'parameters': {
>>>     'step_name': 'part-1-hello-world',
>>>     'function_path': 'src/part-1-helloWorld.py',
>>>     'function_name': 'helloWorld',
>>>     'description': None,
>>>     'container_config': {
>>>       'repository_branch': 'main',
>>>       'repository_url': 'git@github.com:xxxxx/xxxxx.git',
>>>       'repository_deploy_key': 'xxx ... xxx',
>>>       'dockerfile_path': None
>>>     },
>>>     'inputs': [],
>>>     'outputs': []
>>>   },
>>>   'creation_info': {
>>>     'commit_id': 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
>>>     'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>     'updated_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>     'status': 'Ready'
>>>   }
>>> }
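The return value of get_step() is a nested dict, so its fields can be read with ordinary key access. A sketch over a mocked (abridged) return value:

```python
# Mocked get_step() return value (shape taken from the output above, abridged).
step_info = {
    'parameters': {
        'step_name': 'part-1-hello-world',
        'container_config': {'repository_branch': 'main'},
        'inputs': [],
        'outputs': [],
    },
    'creation_info': {'status': 'Ready'},
}

# Nested dict access: check that the step is Ready before using it in a pipeline.
status = step_info['creation_info']['status']
branch = step_info['parameters']['container_config']['repository_branch']
print(status, branch)  # Ready main
```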

Success

🎉 Your step has been created. You can now create your pipeline (and, after that, execute it on the platform).

Create a pipeline with the SDK

step1_2

The step part-1-hello-world containing our helloWorld code is now created on the platform and ready to be used in a pipeline, which we will then execute.

A pipeline is a machine learning workflow, consisting of one or more steps, that can easily be deployed on the Craft AI platform. Steps are arranged in a directed acyclic graph (DAG): the output of one step can be specified as the input of another.

In the future, it will be possible to assemble multiple steps into a complex machine learning pipeline. For now, the platform only allows single-step pipelines.
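To illustrate the DAG idea with hypothetical step names (not part of this tutorial's code), Python's standard library can compute a valid execution order for such a graph:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical multi-step pipeline: 'preprocess' feeds 'train', which feeds
# 'evaluate'. Each key maps a step to the set of steps it depends on.
dag = {'train': {'preprocess'}, 'evaluate': {'train'}}

# static_order() yields steps so that every dependency comes first.
order = list(TopologicalSorter(dag).static_order())
print(order)  # ['preprocess', 'train', 'evaluate']
```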

To create a pipeline consisting of the previous step, you must use the create_pipeline() function of the SDK.

sdk.create_pipeline(
    pipeline_name='part-1-hello-world',
    step_name='part-1-hello-world',
)

This function has two arguments:

  • The pipeline_name is the name of the pipeline to create. As with step_name, you will later refer to the pipeline using this name.
  • The step_name is the name of the step used in the pipeline.

After executing this function, you should see the following output:

>>> Pipeline creation succeeded
>>> {'pipeline_name': 'part-1-hello-world',
>>> 'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>> 'steps': ['part-1-hello-world'],
>>> 'open_inputs': [],
>>> 'open_outputs': []}

Success

🎉 Now that our pipeline is created around our step, we want to execute it. To do this, we will run the pipeline with the SDK function run_pipeline(), which executes the code contained in the step.

Execute your pipeline (run)

You can execute a pipeline on the platform directly with the run_pipeline() function.

This function has two arguments:

  • The name of the existing pipeline to execute (pipeline_name)
  • Optional (only if your pipeline has inputs): a dict of inputs to pass to the pipeline, with input names as dict keys and the corresponding values as dict values.

sdk.run_pipeline(pipeline_name='part-1-hello-world')
>>> The pipeline execution may take a while, you can check its status and get information on the Executions page of the front-end.
>>> Its execution ID is 'part-1-hello-world-xxxxx'.
>>> Pipeline execution results retrieval succeeded
>>> Pipeline execution startup succeeded

Success

🎉 You have now created a step for the helloWorld function, included it in a pipeline, and executed it on the platform! Our hello world application is built and ready to be executed again!

Get information about an execution

We have now executed the pipeline. The function's return value shows that the pipeline executed successfully, but it does not include the execution logs (run_pipeline() can also return outputs, but we did not declare any here).

The SDK also lets you retrieve information about past executions; let’s see how it works.

Once your pipeline has been executed, you can list its executions with the sdk.list_pipeline_executions() command.

sdk.list_pipeline_executions(
    pipeline_name='part-1-hello-world'
)
>>> [{'execution_id': 'part-1-hello-world-XXXX',
>>> 'status': 'Succeeded',
>>> 'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>> 'end_date': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>> 'created_by': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx',
>>> 'pipeline_name': 'part-1-hello-world',
>>> 'deployment_id': 'xxxxxxxx-xxxx-xxxx-xxx-xxxxxxxx',
>>> 'steps':
>>>     [{'name': 'part-1-hello-world',
>>>       'status': 'Succeeded',
>>>       'end_date': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>       'start_date': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>       'commit_id': 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
>>>       'repository_url': 'https://github.com/orga-name/repo-name',
>>>       'repository_branch': 'main',
>>>       'requirements_path': 'requirements.txt'}]}]
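Because the return value is a plain list of dicts, you can pick the most recent execution without relying on list order. A sketch over a mocked return value (the IDs and dates are placeholders):

```python
# Mocked list_pipeline_executions() return value (shape from the output above).
executions = [
    {'execution_id': 'part-1-hello-world-aaaa', 'created_at': '2024-05-01T10:00:00.000Z'},
    {'execution_id': 'part-1-hello-world-bbbb', 'created_at': '2024-05-02T09:30:00.000Z'},
]

# ISO 8601 timestamps sort lexicographically, so max() on 'created_at'
# reliably picks the most recent execution.
latest = max(executions, key=lambda e: e['created_at'])
print(latest['execution_id'])  # part-1-hello-world-bbbb
```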

Furthermore, you can get the logs of a specific execution with the sdk.get_pipeline_execution_logs() command. You will have to fill in the execution ID, which can be found with the previous command. The logs are returned line by line as JSON, but we can display them cleanly with the print() call below. This also lets us see any error messages raised by the step code during execution.

pipeline_executions = sdk.list_pipeline_executions(
    pipeline_name='part-1-hello-world'
)

logs = sdk.get_pipeline_execution_logs(
    pipeline_name='part-1-hello-world',
    execution_id=pipeline_executions[-1]['execution_id']  # [-1] to get the last execution
)

print('\n'.join(log['message'] for log in logs))
>>> Please wait while logs are being downloaded. This may take a while…
>>> Hello world ! Number of days to 2025 : xxx
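Since each log entry is a dict, the same comprehension can also filter, for example to surface error-looking lines when an execution fails. A sketch over mocked log entries:

```python
# Mocked log entries, shaped like the SDK's line-by-line log dicts.
logs = [
    {'message': 'Hello world ! Number of days to 2025 : 123'},
    {'message': 'Traceback (most recent call last): ...'},
]

# Keep only the messages that look like Python errors.
errors = [log['message'] for log in logs if log['message'].startswith('Traceback')]
print(len(errors))  # 1
```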

To browse the list of executions, along with their information and associated logs, more easily, you can use the web interface as follows:

  1. Connect to https://mlops-platform.craft.ai

  2. Click on your project:

    step1_3

  3. Click on the Execution page and on “Select an execution”: this displays the list of environments:

    step1_4

  4. Select your environment to get the list of runs and deployments:

    step1_6

  5. Finally, click on a run name to get its executions:

    step1_7

  6. You have the “General” tab to get general information about your execution and the “Logs” tab where you can see and download the execution logs:

    step1_8

Success

🎉 You can now get your execution's logs.

What we have learned

In this part, we learned how to build, deploy, and use a simple application on the Craft AI platform with the following workflow:

step1_9

These three main steps form the fundamental workflow for working with the platform, and we will see them again and again throughout this tutorial.

Now that we know how to run our code on the platform, it is time to create more complex steps to have a real ML use case.

Next step: Part 2: Execute a simple ML model