Overview

Main Features

Automated Lineage management.

Knitfab automatically executes all Tasks while recording their Lineage.

Lineage is a complete record (history) of every data update, model change, and so on performed on Knitfab: a chain of records describing how and when each output of a Task (described later), such as a machine learning model or a performance metrics log, was generated.

For any given output, Knitfab exhaustively tracks and manages lineage across multiple tasks by automatically recording which program produced it and what input that program was given.

Tag-based declarative workflow

Knitfab associates the data it handles with tasks using Tags: metadata indicating the nature of the data.

By using Tags to express "which data can be input to which task," Knitfab exhaustively identifies the tasks that can be executed and runs them automatically. Tags can also be preset on an output, so that other tasks consuming that output are executed automatically in a chain.

Container-based task execution

Knitfab isolates and executes all tasks in independent containers, so there is no need to worry about conflicts with other tasks.

In addition, Knitfab can execute any task packaged as a container image (Docker image), so there are no restrictions on the programming language or framework used for implementation.

Technically, Knitfab is built on Kubernetes.

Concepts

Tasks

Knitfab considers a task to be something that "takes some input and outputs something". For example, a task in the machine learning field could be a training script for a model or a script for evaluating a model, but any program can be a task.

The rules for tasks in Knitfab are as follows:

- A task reads its input from directories and writes its output to directories.
- A task is packaged as a container image.

As long as these conditions are met, tasks can be implemented in any language or framework, as the sketch below illustrates.
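To make this concrete, here is a minimal sketch of what such a task could look like in Python. The mount paths /in/dataset and /out/model are hypothetical; in practice, the input and output directories are whatever the Plan definition (described below) declares.

    # task.py -- a minimal, hypothetical Knitfab task.
    # It reads every file under an input directory and writes a result
    # to an output directory. The paths are illustrative: in a real Plan
    # they are whatever mount points the Plan definition declares.
    from pathlib import Path

    IN_DIR = Path("/in/dataset")    # input Data is mounted here (assumed path)
    OUT_DIR = Path("/out/model")    # whatever is written here becomes output Data

    def main() -> None:
        # "Train" by counting lines in each input file -- a stand-in
        # for any real training or evaluation logic.
        total = sum(
            len(p.read_text().splitlines())
            for p in IN_DIR.rglob("*") if p.is_file()
        )
        OUT_DIR.mkdir(parents=True, exist_ok=True)
        (OUT_DIR / "model.txt").write_text(f"lines seen: {total}\n")

    if __name__ == "__main__":
        main()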

Data & Tags

The directory used by a task for input and output is called "Data" in Knitfab. A Task in Knitfab is thus something that "receives Data as input and outputs Data". There are no restrictions on the contents of Data: directories containing arbitrary files can be retained as Data. Once Data is recorded in Knitfab, it is immutable.

Therefore, it can never happen that a Lineage becomes unreproducible because the Data used in a previous experiment has been changed. When Data is "updated", it is registered as new Data with the same Tags.

In addition to its content, Data can have key:value metadata called "Tags". Tags can be set freely by the user, as many as needed, and indicate the type and nature of the data. For example:

- project: my-project
- type: dataset
- mode: train
- format: csv

The name of the Data (name: ...), a description (description: ...), and so on can also be set as Tags.

Of course, other Tags can also be set freely. Tags can be added or deleted at any time after the Data has been created.
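One way to picture this: a piece of Data pairs immutable content with a mutable set of Tags. The following sketch models that rule; the record shape and field names are illustrative, not Knitfab's internal representation.

    # A conceptual model of Knitfab Data: the content is frozen at
    # registration time, while Tags can be added or removed later.
    # This is an illustration, not Knitfab's actual data model.
    from dataclasses import dataclass, field

    @dataclass
    class Data:
        data_id: str                    # unique id; never reused
        content_hash: str               # stands in for the immutable content
        tags: set[str] = field(default_factory=set)  # "key:value" strings

    # "Updating" a dataset means registering NEW Data with the same Tags:
    v1 = Data("data-0001", "sha256:aaa...", {"project:my-project", "type:dataset"})
    v2 = Data("data-0002", "sha256:bbb...", set(v1.tags))  # new id, same Tags

    # Tags may change freely after creation; the content never does.
    v1.tags.add("description:initial import")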

Plan & Tags

A Plan is what gives the definition of a task. Namely, a Plan specifies:

- the container image to execute,
- the inputs the task receives (specified by Tags), and
- the outputs the task produces (also specified by Tags).

Together, these define what task should be performed and how.

In the Plan definition, you do not directly specify the Data to be assigned to an input; instead, you specify "what Tags Data must have to be assigned to that input". Data that can be assigned to an input is Data that carries all of the Tags specified for that input.

For example, an input with the Tags "project: my-project, mode: train, type: dataset, format: csv" will only be assigned Data that is (as indicated by its Tags) a CSV training dataset for my-project. Extra Tags are allowed, but Data missing any of these Tags will not be assigned to the input.
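In other words, the assignment rule is plain set inclusion: Data is assignable to an input exactly when the input's Tags are a subset of the Data's Tags. A minimal sketch of the rule:

    # The Tag-matching rule: an input accepts Data whose Tags include
    # every Tag the input requires. Extra Tags on the Data are fine.
    def assignable(input_tags: set[str], data_tags: set[str]) -> bool:
        return input_tags <= data_tags  # subset test

    required = {"project:my-project", "mode:train", "type:dataset", "format:csv"}

    ok      = {"project:my-project", "mode:train", "type:dataset",
               "format:csv", "description:2024 crawl"}           # extra Tag: fine
    missing = {"project:my-project", "type:dataset", "format:csv"}  # no mode:train

    assert assignable(required, ok)
    assert not assignable(required, missing)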

You can also set Tags on an output. Knitfab will then automatically set those Tags on the Data written to that output.

For example, if the output of a Plan running a training script has the Tags "project: my-project, type: model, version: 1.x", you know that the Data written to that output is a my-project version 1 series model.

One special type of Plan output is the log. This is the content of a task's standard output and standard error, automatically collected and retained as Data. Since a log is a type of output, it can of course be Tagged in the same way as other outputs.
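Putting the pieces together, a Plan carries an image, Tag-guarded inputs, Tagged outputs, and (optionally) Tags for the log. In practice a Plan is written as a definition file for knit; the structure below is only an illustration of the information a Plan carries, and the field names are assumptions rather than Knitfab's exact schema.

    # An illustrative picture of what a Plan definition carries.
    # Field names here are assumptions for illustration; consult the
    # Knitfab documentation for the actual Plan definition format.
    plan = {
        "image": "registry.example.com/my-project/train:1.0",
        "inputs": [
            {"path": "/in/dataset",
             "tags": ["project:my-project", "mode:train",
                      "type:dataset", "format:csv"]},
            {"path": "/in/params",
             "tags": ["project:my-project", "type:hyperparameter"]},
        ],
        "outputs": [
            {"path": "/out/model",
             "tags": ["project:my-project", "type:model", "version:1.x"]},
        ],
        # The log is a special output: stdout/stderr collected as Data.
        "log": {"tags": ["project:my-project", "type:log"]},
    }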

Run & Lineages

Knitfab automatically executes tasks based on the Data and Plans it holds. Each such individual execution is called a Run.

When Knitfab finds a Plan for which Data can be assigned to every input, it examines the assignable Data combinations and generates a Run for each "combination of inputs and Data that has not yet been run". (Note: Knitfab's Data is immutable, so any "update" to Data is given a different Data id and is therefore recognized as Data that has "not yet been run".)

For example, suppose you have a Plan that takes a "training dataset" and a "hyperparameter" as its two inputs. When a new hyperparameter is registered as Data, Knitfab generates and executes a Run for each combination of that new hyperparameter with each existing training dataset.
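A sketch of that scheduling rule, with illustrative types rather than Knitfab internals: collect the assignable Data for each input, enumerate every combination, and keep only the combinations that have not been run before.

    # A sketch of how Runs could be derived from a Plan and the Data
    # pool: one Run per not-yet-executed combination of assignable Data.
    # Types and names are illustrative, not Knitfab internals.
    from itertools import product

    def new_runs(plan_inputs: list[set[str]],        # required Tags per input
                 pool: list[tuple[str, set[str]]],   # (data_id, tags) pairs
                 already_run: set[tuple[str, ...]],  # combos executed before
                 ) -> list[tuple[str, ...]]:
        candidates = [
            [data_id for data_id, tags in pool if required <= tags]
            for required in plan_inputs
        ]
        # Every input must have at least one candidate for a Run to exist.
        combos = product(*candidates) if all(candidates) else []
        return [c for c in combos if c not in already_run]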

A Run is also, from a different perspective, a record of "what output Data was generated by what program, given what input". The chain of Runs is the Lineage in Knitfab.

CLI tool: knit

All general operations on Knitfab are performed via the CLI command knit.

It offers a full range of functions: registering and retrieving Data, checking the status of Runs, reading logs, defining Plans, and more.

Resolving MLOps Issues in Knitfab

Preventing omissions in experiments and experiment management

With Knitfab, there are no omissions in experimentation and experiment management.

Knitfab looks for "executable but not executed" Runs based on Tags and automatically executes them. It also records "what the input/output of the program was" when it does so.

For example, if a Plan specifying the training data and training script is registered in Knitfab in advance, all you have to do is keep registering hyperparameters, and an experiment (= a Run) will be performed for each one, with its Lineage recorded.
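Reusing the new_runs sketch from the Run & Lineages section above (still with illustrative data), registering one extra hyperparameter is enough to yield one new Run per existing training dataset:

    # Continuing the earlier new_runs sketch: registering one extra
    # hyperparameter yields one new Run per existing training dataset.
    inputs = [{"type:dataset", "mode:train"}, {"type:hyperparameter"}]
    pool = [
        ("data-a", {"type:dataset", "mode:train"}),
        ("data-b", {"type:dataset", "mode:train"}),
        ("hp-1",   {"type:hyperparameter"}),
    ]
    done = set(new_runs(inputs, pool, set()))       # first sweep: 2 Runs

    pool.append(("hp-2", {"type:hyperparameter"}))  # register new Data
    print(new_runs(inputs, pool, done))
    # -> [('data-a', 'hp-2'), ('data-b', 'hp-2')]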

Automating routine tasks

Tasks such as the release evaluation of a new model, e.g. "evaluating the old and new models on a given dataset and comparing their performance," can be performed automatically simply by Tagging the newly created model and registering it in Knitfab.

Since Knitfab associates Data with Plan inputs by Tags, all you have to do is register the new model with Tags matching the input of the Plan that contains the evaluation script. In addition, the experimental conditions other than the model are determined by the Plan, so comparisons are always made under identical conditions.

The Data output from a Run is also Tagged automatically, so if there is a Plan to which that Data can be assigned, a Run is generated in a chain reaction. This can be used to define and build even complex workflows freely.

For example, continuous learning can be constructed as a workflow in which "a dataset with a given Tag is registered periodically, and the steps from training to evaluation are executed in sequence".

Ensure traceability

Any Data, no matter how it was generated within Knitfab, can be traced back through its Lineage to see how it was generated. Even if the Data was produced at the end of a complex workflow, its Lineage always leads back to the beginning: the Data manually registered in Knitfab by the user.
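As a sketch of what that guarantee amounts to, a Lineage can be walked backwards as a graph traversal: from any Data, follow the Run that produced it to that Run's inputs, and repeat until only user-registered Data remains. The structures below are illustrative, not Knitfab's API.

    # Walking a Lineage backwards: from a piece of Data, through the
    # Run that produced it, to that Run's inputs, and so on until we
    # reach Data the user registered by hand. Illustrative structures.
    def trace_back(data_id: str,
                   produced_by: dict[str, str],      # data_id -> run_id
                   run_inputs: dict[str, list[str]], # run_id -> input data_ids
                   ) -> set[str]:
        roots: set[str] = set()
        stack = [data_id]
        while stack:
            d = stack.pop()
            run = produced_by.get(d)
            if run is None:
                roots.add(d)          # manually registered Data: a root
            else:
                stack.extend(run_inputs[run])
        return roots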

Therefore, with Knitfab, data traceability is always ensured.

Distribute computing resources

Computing resources are distributed automatically as Knitfab executes the Runs.

The Plan definition can also include requirements for GPUs, etc., so that appropriate computing resources can be allocated to each Plan, depending on the settings made by the Knitfab administrator.
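For illustration, such requirements can be pictured as part of the Plan definition. The resources field and its keys below are assumptions modeled on Kubernetes resource requests, not a confirmed Knitfab schema; consult the administrator documentation for the actual format.

    # An illustrative Plan fragment with resource requirements. The
    # "resources" field and its keys are assumptions modeled on
    # Kubernetes resource requests; check the Knitfab docs for the
    # actual Plan definition format.
    plan_fragment = {
        "image": "registry.example.com/my-project/train:1.0",
        "resources": {
            "cpu": "2",             # two CPU cores
            "memory": "8Gi",        # eight gibibytes of RAM
            "nvidia.com/gpu": "1",  # one GPU, Kubernetes-style key
        },
    }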

Next step