3.3 Part 3: Deploy with input and output
3.3.1 Introduction
In Part 2, we built and deployed our first ML pipeline to train a model and compute predictions on a test subset of the iris dataset.
What if we want to use our pipeline to perform predictions on new data? Currently, we cannot pass new data to our model. ⇒ We need to add an Input to our pipeline.
Moreover, in Part 2, our deployment only prints the predictions in the logs. What if we want to provide these predictions to an end user? ⇒ We need to add an Output to our pipeline.
This part will show you how to do this with the Craft AI platform:
We will first update the code of the TrainPredictIris() function so that it can receive data and return predictions. Then, we will see how to create a step, a pipeline and an endpoint that can handle input data and return the corresponding predictions as an output.
By the end of this part, we will have built an application that allows any user to get iris species predictions on new data with a simple endpoint call.
Prerequisites
Python 3.8 or higher installed on your computer.
Have done the previous parts of this tutorial (Part 0: Setup, Part 1: Deploy a simple pipeline and Part 2: Deploy with configuration step).
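Throughout this part, we use the sdk object created in Part 0. As a reminder, a minimal setup sketch could look like the one below (the token and environment URL come from your own Part 0 setup; the exact parameter names may differ depending on your SDK version):

import os
from craft_ai_sdk import CraftAiSdk

# Credentials created in Part 0, assumed here to be stored in environment variables
sdk = CraftAiSdk(
    sdk_token=os.environ["CRAFT_AI_SDK_TOKEN"],
    environment_url=os.environ["CRAFT_AI_ENVIRONMENT_URL"],
)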
3.3.2 Updating the application

First, we have to update our code to compute predictions on any (correctly prepared) data given as input, instead of computing predictions on a test set (the code changes compared with Part 2 are explained below the code).
Hence, our file src/part-3-irisModelIO.py is as follows:
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier


def TrainPredictIris(input_data: dict):
    # Load the iris dataset and build the training set
    iris_X, iris_y = datasets.load_iris(return_X_y=True, as_frame=True)
    np.random.seed(0)
    indices = np.random.permutation(len(iris_X))
    iris_X_train = iris_X.loc[indices[0:90], :]
    iris_y_train = iris_y.loc[indices[0:90]]
    # iris_X_test = iris_X.loc[indices[90:], :]
    # iris_y_test = iris_y.loc[indices[90:]]

    # Train the classifier
    knn = KNeighborsClassifier()
    knn.fit(iris_X_train, iris_y_train)

    # Compute predictions on the data received as input
    input_dataframe = pd.DataFrame.from_dict(input_data, orient="index")
    result = knn.predict(input_dataframe)
    print(result)

    # Convert the numpy ndarray to a list so that the output is JSON-serializable
    final_result = result.tolist()
    return {"predictions": final_result}
Let’s explain the changes compared to the code of Part 2:
We add the argument input_data. Here, we choose it to be a dictionary like the one below:

{
    1: {
        'sepal length (cm)': 6.7,
        'sepal width (cm)': 3.3,
        'petal length (cm)': 5.7,
        'petal width (cm)': 2.1
    },
    2: {
        'sepal length (cm)': 4.5,
        'sepal width (cm)': 2.3,
        'petal length (cm)': 1.3,
        'petal width (cm)': 0.3
    },
}

It contains the data on which we want to compute predictions.
We remove the test set from our code (we will still be able to pass test data through the input_data argument if we want to) and just keep the train set.
At the end, we convert our input_data dictionary into a Pandas dataframe, and we compute predictions with our trained model.
As you can see, the function now returns a Python dict with one field called “predictions” that contains the prediction values. The platform only accepts step functions with a single return value of type ``dict``. Each item of this dict is an output of the step, and the key associated with each item is the name of this output on the platform.
Moreover, you can see that we converted our result from a numpy ndarray to a list. That is because the values of the inputs and outputs are restricted to native Python types such as int, float, bool, string, list and dict with elements of those types; more precisely, anything that is JSON-serializable. Later, the platform might handle more complex input and output types such as numpy arrays or even pandas dataframes.
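Before committing, you can sanity-check the updated function locally. For example, in a Python session where TrainPredictIris is defined (the exact predicted classes depend on the model):

# Quick local check: call the function with a small input dictionary
sample_input = {
    1: {
        'sepal length (cm)': 6.7,
        'sepal width (cm)': 3.3,
        'petal length (cm)': 5.7,
        'petal width (cm)': 2.1,
    },
}
output = TrainPredictIris(sample_input)
print(output)  # a dict such as {'predictions': [...]} containing native Python ints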
Warning
Since we updated our code, we must add and commit our changes with Git and push them to GitHub so that the platform can take them into account!
3.3.3 Step creation with Input and Output
Now, let’s create our step on the platform. Here, since we have an input and an output, our step is the combination of three elements: an input, an output and the Python function above. We will first declare the input and the output. Then, we will use the function sdk.create_step() as in Part 2 to create the whole step.

3.3.3.1 Declare Input and Output of our new step
To manage inputs and outputs of a step, the platform requires you to declare them using the ``Input`` and ``Output`` classes from the SDK.
For our Iris application, the inputs and outputs declaration would look like this:
from craft_ai_sdk.io import Input, Output

# Create input
prediction_input = Input(
    name="input_data",
    data_type="json"
)

# Create output
prediction_output = Output(
    name="predictions",
    data_type="array"
)
Both objects have two main attributes:
The name of the Input or Output.
For the input, it corresponds to the name of an argument of your step’s function. In our case name="input_data", as in the first line of the function: def TrainPredictIris(input_data: dict):
For the output, it must be a key in the dictionary returned by your step’s function. In our case, name="predictions", as in the last line of the function: return {"predictions": final_result}
The data_type describing the type of data it can accept. It can be one of: string, number, boolean, json, array (a hypothetical example of another data type is sketched just after this list).
For the input, we want a dictionary as we specified above, which corresponds to data_type="json".
For the output, we return a list, which corresponds to data_type="array".
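For illustration only, here is how a hypothetical additional numeric input (for instance a hyperparameter) could be declared; it is not used in this tutorial:

# Hypothetical example (not part of this tutorial): a numeric input,
# which would correspond to an extra argument of the step's function
n_neighbors_input = Input(
    name="n_neighbors",
    data_type="number"
)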
Now, we have everything we need to create, as before, the step and the pipeline corresponding to our new TrainPredictIris() function.
3.3.3.2 Create step
Now, as in Part 2, it is time to create our step on the platform using the sdk.create_step() function, but this time we specify our input and output:
sdk.create_step(
    step_name="part-3-irisio-step",
    function_path="src/part-3-irisModelIO.py",
    function_name="TrainPredictIris",
    description="This function creates a classifier model for iris and makes predictions on input data",
    inputs=[prediction_input],
    outputs=[prediction_output],
    container_config={
        "included_folders": ["src"],
        "requirements_path": "requirements.txt",
    },
)
This is exactly like in Part 2, except for two parameters:
inputs containing the list of Input objects we declared above (here, prediction_input).
outputs containing the list of Output objects we declared above (here, prediction_output).
When step creation is finished, you obtain an output describing your step (including its inputs and outputs) as below:
>> Step "part-3-irisio-step" created
Inputs:
- input_data (json)
Outputs:
- predictions (array)
>> Steps creation succeeded
>> {'name': 'part-3-irisio-step',
    'inputs': [{'name': 'input_data', 'data_type': 'json'}],
    'outputs': [{'name': 'predictions', 'data_type': 'array'}]}
Now that our step is created on the platform, we can embed it in a pipeline and deploy it.
3.3.4 Create and deploy your pipeline
3.3.4.1 Create pipeline
Let’s create our pipeline with sdk.create_pipeline() as in Part 2:
sdk.create_pipeline(
    pipeline_name="part-3-irisio-pipeline",
    step_name="part-3-irisio-step",
)
You quickly obtain this output, which describes the pipeline, its step and its inputs and outputs:
>> Pipeline creation succeeded
>> {'pipeline_name': 'part-3-irisio-pipeline',
    'created_at': '2023-02-02T17:12:33.032Z',
    'steps': ['part-3-irisio-step'],
    'open_inputs': [{'input_name': 'input_data',
                     'step_name': 'part-3-irisio-step',
                     'data_type': 'json'}],
    'open_outputs': [{'output_name': 'predictions',
                      'step_name': 'part-3-irisio-step',
                      'data_type': 'array'}]}
🎉 You’ve created your first step & pipeline with an input and an output!
Let’s deploy this pipeline.
3.3.4.2 Create endpoint
To do this, we need to create an endpoint, similarly to what we did in Part 2 with sdk.create_deployment().
The big difference here is that the pipeline triggered by the endpoint now expects data as input and returns data as output. Up until now, the endpoint was only a way for an external user of our app to trigger the execution of the associated pipeline; now, the user will also use it to send input data to the pipeline and to retrieve the results.
By default, the endpoint expects all the inputs of the pipeline to be transmitted through the endpoint call (we will see some more advanced options in the next part). However, you have to explicitly specify the outputs you want the endpoint to return, to avoid data leakage to the end user.
You might also want to deliver the outputs under a different name than the one you specified in the output of your step. In our case, we want to return our only output, predictions (the predictions of our iris model), to the user of the app, and serve it to the client under the name iris_species, which is more understandable for the end user.

On the Craft AI platform, this is done by declaring OutputDestination objects like so:
from craft_ai_sdk.io import OutputDestination

output_mapping = OutputDestination(
    step_output_name='predictions',
    endpoint_output_name='iris_species'
)
Now, we can create our endpoint as follows. Note that sdk.create_deployment() needs a new argument outputs_mapping, which is a list of the OutputDestination objects we need (there should be one mapping for each step output we want to expose):
endpoint = sdk.create_deployment(
    pipeline_name="part-3-irisio-pipeline",
    deployment_name="part-3-irisio-endpoint",
    execution_rule="endpoint",
    outputs_mapping=[output_mapping]
)
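The returned endpoint object is a dictionary; we will use two of its fields in the next section to call the endpoint. You can print them for reference:

# Fields of the returned deployment used below to call the endpoint
print(endpoint["name"])
print(endpoint["endpoint_token"])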
🎉 Bingo! You have created an endpoint with one input and one output. Let’s check that it actually accepts input data and returns predictions by calling it.
3.3.5 Call the endpoint with new input data
3.3.5.1 Prepare input data
Now, our endpoint needs data as input (formatted as we said above ⬆️). Let’s prepare it, simply by choosing some rows of the iris dataset that we did not use when training our model:
# Prepare input data for which we want predictions
import numpy as np
import pandas as pd
from sklearn import datasets

np.random.seed(0)
indices = np.random.permutation(150)
iris_X, iris_y = datasets.load_iris(return_X_y=True, as_frame=True)
iris_X_test = iris_X.loc[indices[90:120], :]

# Convert our test dataframe into a dictionary as required
test_dict_data = iris_X_test.to_dict(orient="index")
Let’s check the data we created:
print(test_dict_data)
We get the following output:
>> {124: {'sepal length (cm)': 6.7,
          'sepal width (cm)': 3.3,
          'petal length (cm)': 5.7,
          'petal width (cm)': 2.1},
    41: {'sepal length (cm)': 4.5,
    ...
Finally, we need to encapsulate this dictionary in another dictionary whose key is "input_data" (the name of the input of our step, i.e. the name of the argument of our step’s function):

test_data = {
    "input_data": test_dict_data
}
In particular, when your step has several inputs, this dictionary should have as many keys as the step function has arguments.
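For illustration, if our step had a second, hypothetical numeric input named n_neighbors, the payload would need one key per declared input:

# Hypothetical payload for a step with two inputs (not used in this tutorial)
test_data_two_inputs = {
    "input_data": test_dict_data,
    "n_neighbors": 5,
}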
3.3.5.2 Call our endpoint
Finally, we can test our endpoint with the data we’ve just prepared by calling it almost as in Part 2, passing our dictionary test_data in the json argument of requests.post():
import requests
endpoint_URL = sdk.base_environment_url + "/endpoints/" + endpoint["name"]
headers = {"Authorization": "EndpointToken " + endpoint["endpoint_token"]}
request = requests.post(endpoint_URL, headers=headers, json=test_data)
Let’s check the HTTP status code of our request:
request.status_code
>> 200
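If you get a status code other than 200, the response body usually contains details about the error; you can inspect it like this:

# Inspect the response body if the call did not succeed
if request.status_code != 200:
    print(request.status_code, request.text)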
Finally, our output can be obtained like this:
request.json()['outputs']
This gives the output we want (with the predictions!):
>> {'iris_species': [0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2]}
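The predictions are returned as class indices (0, 1 or 2). If you prefer human-readable species names, a small optional post-processing step on the client side could look like this (it relies on scikit-learn’s target_names and is not part of the deployment):

from sklearn import datasets

# Map class indices to species names (setosa, versicolor, virginica)
target_names = datasets.load_iris().target_names
predictions = request.json()["outputs"]["iris_species"]
print([target_names[i] for i in predictions])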
Moreover, you can check the logs of this execution as follows:
pipeline_executions = sdk.list_pipeline_executions(pipeline_name="part-3-irisio-pipeline")
logs = sdk.get_pipeline_execution_logs(
    pipeline_name="part-3-irisio-pipeline",
    execution_id=pipeline_executions[-1]["execution_id"],
)
print("\n".join(log["message"] for log in logs))
You can also find these logs on the UI, by clicking on the Executions tab of your environment, selecting your pipeline and choosing the last execution.
🎉 Congratulations! You have deployed an endpoint to which we can pass new data and get predictions.
Next step: Part 4: Deploy with the Data Store