Create a Workflow
So far we have defined the configuration for our project and set up
some experiments to run. We know which models we want to train, but we haven’t
defined how to train and evaluate them. This is where the Workflow class
comes in. In Brisk, a Workflow is the sequence of steps we want to run for
each experiment.
In workflows/workflow.py you will see a class called MyWorkflow that inherits
from the Workflow class. You will also see a workflow method that takes no
arguments besides self. This is where you define the steps you want to take for each experiment.
Brisk will pass the correct model, data splits and evaluation methods to the workflow
for each experiment.
There are a few instance variables that are available to the workflow:
self.model refers to the model we are training.
self.X_train and self.y_train provide the training data.
self.X_test and self.y_test provide the testing data.
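To make the idea of injected instance variables concrete, here is a minimal sketch in plain Python. It does not use Brisk: SketchWorkflow, MeanRegressor and run_experiment are hypothetical stand-ins, and the real Workflow class does considerably more. The sketch only illustrates a framework setting the attributes listed above and then calling workflow() with no extra arguments.

```python
# Hypothetical stand-ins for illustration only; this is not Brisk's implementation.

class MeanRegressor:
    """Toy model that always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class SketchWorkflow:
    """Stand-in base class: stores what the framework hands to each experiment."""

    def __init__(self, model, X_train, y_train, X_test, y_test):
        self.model = model
        self.X_train, self.y_train = X_train, y_train
        self.X_test, self.y_test = X_test, y_test

    def workflow(self):
        raise NotImplementedError("Subclasses define the experiment steps.")


class MySketchWorkflow(SketchWorkflow):
    def workflow(self):
        # No extra arguments: everything comes from the instance variables.
        self.model.fit(self.X_train, self.y_train)
        predictions = self.model.predict(self.X_test)
        errors = [abs(p - t) for p, t in zip(predictions, self.y_test)]
        return sum(errors) / len(errors)


def run_experiment(workflow_cls, model, splits):
    """Hypothetical runner: builds the workflow for one experiment and runs it."""
    return workflow_cls(model, *splits).workflow()


# X_train, y_train, X_test, y_test
splits = ([[1], [2], [3]], [1.0, 2.0, 3.0], [[4], [5]], [2.0, 4.0])
score = run_experiment(MySketchWorkflow, MeanRegressor(), splits)
```

Because the workflow body never names a particular model, the same class can be reused unchanged for every experiment the runner builds.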
We will create a simple workflow as follows:
class MyWorkflow(Workflow):
    def workflow(self):
        # Fit the model to the training data
        self.model.fit(self.X_train, self.y_train)

        # Evaluate the model on the testing data
        self.evaluate_model(
            self.model, self.X_test, self.y_test,
            ["mean_absolute_error"], "pre_tuning_score"
        )

        # Tune the model hyperparameters
        tuned_model = self.hyperparameter_tuning(
            self.model, "random", self.X_train, self.y_train,
            "MAE", kf=5, num_rep=2, n_jobs=-1
        )

        # Evaluate the tuned model on the testing data
        self.evaluate_model(
            tuned_model, self.X_test, self.y_test,
            ["MAE"], "post_tuning_score"
        )

        # Plot the learning curve
        self.plot_learning_curve(
            tuned_model, self.X_train, self.y_train, metric="MAE"
        )

        # Save the tuned model
        self.save_model(tuned_model, "tuned_model")
We can refer to the mean absolute error metric from metrics.py by either its name
("mean_absolute_error") or its abbreviation ("MAE"). Remember that the workflow method
is called for every experiment, so avoid hardcoding algorithm names in variable names or
filenames; a generic name such as "tuned_model" stays accurate whichever algorithm is
being run. See the EvaluationManager section for more information on the evaluation methods available.
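The name-or-abbreviation lookup described above can be pictured as a small registry in which both identifiers point to the same function. The following is a rough sketch under that assumption; METRICS and get_metric are hypothetical names, and Brisk's own metric handling may work differently.

```python
# Hypothetical registry for illustration; Brisk's own metric handling may differ.

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Map both the full name and the abbreviation to the same function.
METRICS = {
    "mean_absolute_error": mean_absolute_error,
    "MAE": mean_absolute_error,
}


def get_metric(identifier):
    """Return the metric function registered under a name or abbreviation."""
    try:
        return METRICS[identifier]
    except KeyError:
        raise ValueError(f"Unknown metric: {identifier!r}") from None


y_true, y_pred = [3.0, 5.0], [2.0, 7.0]
by_name = get_metric("mean_absolute_error")(y_true, y_pred)
by_abbr = get_metric("MAE")(y_true, y_pred)
```

Either identifier resolves to the same function, which is why the example workflow can mix "mean_absolute_error" and "MAE" without changing the result.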
As a final note, workflows are given their own workflows directory, which
allows you to have multiple workflows in the same project. In the
next section you will see how to select a specific workflow to use. Each .py file
can contain only one Workflow subclass, which avoids using the wrong workflow at
runtime.
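To see why one subclass per file removes ambiguity, consider how a loader might pick the class to run. The sketch below is an assumption about how such a rule could be enforced, not Brisk's actual loading code; the Workflow class here is a stand-in and find_workflow_class is a hypothetical helper.

```python
# Hypothetical loader check for illustration; not Brisk's actual code.
import inspect
import types


class Workflow:
    """Stand-in for the real base class."""


def find_workflow_class(module):
    """Return the single Workflow subclass defined in `module`, or raise."""
    candidates = [
        obj for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, Workflow)
        and obj is not Workflow
        and obj.__module__ == module.__name__  # ignore imported classes
    ]
    if len(candidates) != 1:
        names = [cls.__name__ for cls in candidates]
        raise ValueError(
            f"Expected exactly one Workflow subclass in {module.__name__}, "
            f"found {len(candidates)}: {names}"
        )
    return candidates[0]


# Build a throwaway module in memory to stand in for a workflow file.
demo = types.ModuleType("demo_workflow")
demo.MyWorkflow = type("MyWorkflow", (Workflow,), {"__module__": demo.__name__})

selected = find_workflow_class(demo)
```

With exactly one subclass in the file the choice is unambiguous; with two, a loader like this one has no principled way to choose and raises an error instead of guessing.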