We gradually update and improve this section with the help of our users.
Feel free to extend it via GitHub pull requests.
Share and reuse CK components similar to PyPI
The Collective Knowledge framework (CK) was introduced in 2015
to provide a common format for research artifacts and enable portable workflows.
The idea behind CK is to convert ad-hoc research projects into a file-based database
of reusable components (code, data, models, pre-/post-processing scripts, experimental results,
R&D automation actions, best research practices for reproducing results,
and live papers) with unified Python APIs, CLI-based actions, JSON meta
information and JSON input/output.
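To make the idea of a file-based database with JSON meta information and JSON input/output more concrete, here is a minimal sketch. It is not the CK implementation or its API: the directory layout, the `meta.json` file name, and the `add_component` and `access` helpers are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def add_component(repo: Path, kind: str, name: str, meta: dict) -> Path:
    """Store a component as a directory holding a JSON meta description.

    Hypothetical layout: <repo>/<kind>/<name>/meta.json
    """
    entry = repo / kind / name
    entry.mkdir(parents=True, exist_ok=True)
    (entry / "meta.json").write_text(json.dumps(meta, indent=2))
    return entry


def access(repo: Path, request: dict) -> dict:
    """Dispatch a dict request to an action and return a dict reply.

    Every reply carries a 'return' code (0 means success), so callers can
    chain actions without parsing free-form output.
    """
    action = request.get("action")
    if action == "load":
        meta_file = repo / request["kind"] / request["name"] / "meta.json"
        if not meta_file.is_file():
            return {"return": 1, "error": "component not found"}
        return {"return": 0, "meta": json.loads(meta_file.read_text())}
    if action == "list":
        kind_dir = repo / request["kind"]
        names = sorted(p.name for p in kind_dir.iterdir()) if kind_dir.is_dir() else []
        return {"return": 0, "names": names}
    return {"return": 1, "error": f"unknown action: {action}"}


def demo() -> dict:
    """Create a throwaway repository, add one component, and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        add_component(repo, "dataset", "images-small", {"tags": ["dataset", "images"]})
        return access(repo, {"action": "load", "kind": "dataset", "name": "images-small"})
```

Because both the request and the reply are plain dictionaries, the same action can be driven from Python, from a command line wrapper, or from a web service without changing the component itself.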
CK also features plugins to automatically detect required software, models and datasets
on a user machine and install (or cross-compile) the missing ones while supporting
different operating systems (Linux, Windows, macOS, Android)
and hardware (Nvidia, Arm, Intel, AMD …).
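The detection step described above can be sketched roughly as follows. This is only an illustration of the general idea using the Python standard library, not the logic of the CK plugins: the `detect_software` helper and the shape of its reply are assumptions, and it only checks the `PATH` rather than installing or cross-compiling anything.

```python
import platform
import shutil


def detect_software(tools: list) -> dict:
    """Split the requested tools into those found on PATH and those missing.

    Hypothetical helper: a real plugin would also check versions and could
    trigger installation of the missing entries.
    """
    found = {}
    missing = []
    for tool in tools:
        path = shutil.which(tool)  # returns None when the tool is not on PATH
        if path is None:
            missing.append(tool)
        else:
            found[tool] = path
    return {"return": 0, "os": platform.system(), "found": found, "missing": missing}
```

A workflow could call `detect_software(["gcc", "python3"])` before running and use the `missing` list to decide which packages still need to be installed.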
The unified CK API helps researchers connect their artifacts into
automated workflows instead of ad-hoc scripts, while making them
portable through the automatic software detection plugins and
meta-packages.
While using CK to help researchers share their artifacts during reproducibility initiatives at ML and systems conferences
(see 15+ artifacts shared by researchers in the CK format),
and to help companies automate ML benchmarking and move ML models to production, we noticed two major limitations:
The distributed nature of the CK technology, the lack of a centralized
place to keep all CK components and the lack of convenient GUIs make
it very challenging to keep track of all contributions from the community,
add new components, assemble workflows, automatically test them across
diverse platforms, and connect them with legacy systems.
The concept of backward compatibility of CK APIs and the lack
of versioning similar to Java make it very challenging to keep workflows stable and
bug-free in real life: a bug in a CK component from one GitHub
project can easily break dependent ML workflows in another GitHub project.
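The second limitation can be illustrated with a small sketch of version pinning, where a workflow only accepts component releases from one major version so that an incompatible release elsewhere cannot silently break it. The scheme below is a generic assumption for illustration (dotted numeric versions, hypothetical `parse_version` and `resolve` helpers), not a description of how CK or cknow.io version components.

```python
from typing import Optional


def parse_version(text: str) -> tuple:
    """Turn '1.4.2' into (1, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resolve(available: list, major: int) -> Optional[str]:
    """Return the newest available version whose major number matches.

    Returns None when no compatible version exists, so the caller can fail
    early instead of running against an untested component.
    """
    compatible = [v for v in available if parse_version(v)[0] == major]
    return max(compatible, key=parse_version) if compatible else None
```

Comparing parsed tuples rather than raw strings matters here: as strings, `"1.10.1"` would sort before `"1.2.0"`.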
These issues motivated us to develop the cknow.io portal
as an open web platform
to aggregate, version and test all CK components and portable CK workflows
necessary to benchmark deep tech systems in a reproducible and collaborative way,
and to enable portable MLOps with the automated deployment of ML models
in production across diverse systems, from IoT to data centers, in the most efficient way (MLSysOps).
You need to install cBench
and then follow this guide
to learn how to download or upload your CK components.