The AI life cycle spans many stages, from data collection, data analysis, feature engineering and algorithm selection to model building, tuning, testing, deployment, management, monitoring and feedback loops for continuous improvement. Conventional software development tools and processes are fairly standardized under DevOps. However, given the breadth and depth of AI disciplines, frameworks and languages, specialized tools are needed to manage every stage of AI project development. This tooling landscape is also fairly fragmented, with offerings ranging from large companies to small startups.
Based on our interactions with clients, we are starting to see enterprises adopt end-to-end AI life cycle development tools such as H2O.ai, Kubeflow and MLflow; however, there is a long way to go, as standardization of these tools and pipelines is still a work in progress.
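As a concrete illustration, the sketch below shows what experiment tracking with one such tool, MLflow, typically looks like. The dataset, model and parameter values are placeholders chosen for brevity, not a recommendation for any particular setup.

```python
# Minimal sketch of experiment tracking with MLflow.
# The dataset, model, and hyperparameter values are placeholders.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 4}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    # Log hyperparameters, a metric, and the model artifact so this run
    # can be reproduced and compared against other experiments.
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, artifact_path="model")
```

Logging parameters, metrics and artifacts in one place is what lets these tools tie the life cycle stages together, from training runs through deployment and monitoring.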
Creating an AI model from scratch requires substantial effort and investment: collecting datasets, labeling data, choosing algorithms, defining the network architecture, setting hyperparameters and so on. Beyond that, the choice of language, frameworks and libraries, along with client preferences, differs from one problem space to another.
Given these challenges, it is important to reuse the effort invested across the community by sharing models and by ensuring model compatibility and portability across environments.
This calls for at least two mechanisms. The first is a place where models can be shared for global use; that context can be truly global, across enterprises or within a single enterprise. Most models shared today address basic tasks, such as object detection and activity recognition in computer vision, but there is a growing need to share models and datasets tailored to domain problems in health care, finance, insurance, energy and other industries. The second mechanism ensures that a model built with particular versions of Python, TensorFlow and CUDA on specific types of GPUs or CPUs remains compatible and portable to and from PyTorch or other frameworks, libraries and environments.
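The sketch below illustrates this second mechanism under simple assumptions: a small placeholder PyTorch model is exported to the framework-neutral ONNX format so that it can be consumed by other runtimes and environments. The network and file name are illustrative only.

```python
# Illustrative sketch: exporting a PyTorch model to ONNX for portability.
# The network below is a placeholder; any trained torch.nn.Module works.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
model.eval()

dummy_input = torch.randn(1, 4)  # an example input traces the graph's shapes
torch.onnx.export(
    model,
    dummy_input,
    "classifier.onnx",
    input_names=["features"],
    output_names=["logits"],
    dynamic_axes={"features": {0: "batch"}},  # allow variable batch size
)
```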
The Open Neural Network Exchange (ONNX) is one such leading open-source model exchange format, and the ONNX Model Zoo hosts a variety of pretrained models in this format. Similarly, TensorFlow Hub and the TensorFlow Model Zoo provide various models and datasets created by the TensorFlow community.
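Continuing the example above, a model in ONNX format, whether downloaded from the ONNX Model Zoo or exported as shown earlier, can be run with ONNX Runtime independently of the framework that produced it. The file name and input shape below are carried over from the earlier sketch and are assumptions, not fixed conventions.

```python
# Illustrative sketch: running a shared ONNX model with ONNX Runtime,
# independent of the framework that originally built it.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("classifier.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

batch = np.random.randn(2, 4).astype(np.float32)  # placeholder input batch
outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)  # logits for each example in the batch
```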