Explore practices (design patterns, version control, and monitoring systems) to sustainably mitigate the cost of rapid delivery, with implementation codes.
As the machine learning (ML) community advances over the years, the resources available to develop ML projects are abundant. For example, we can rely on the generic Python package. learning-scikit, which is based on NumPy, SciPy, and matplotlib, for data preprocessing and basic predictive tasks. Or we can take advantage of the open source collection of pre-trained models of Hugging Face to analyze various types of data sets. These allow today’s data scientists to quickly and effortlessly tackle standard machine learning tasks while achieving moderately good model performance.
However, the abundance of ML tools often leads business stakeholders and even professionals to underestimate the effort required to build enterprise-level ML systems. Especially when faced with tight project deadlines, teams can rush to deploy systems into production without giving enough technical considerations. Consequently, the ML system often does not address business needs in a technically sustainable and maintainable way.
As the system evolves and is deployed over time, technical debt accumulates: the longer the implicit cost remains unaddressed, the more expensive it will be to rectify.
There are multiple sources of technical debt in the ML system. Some are included below.
#1 Inflexible code design to address unforeseen requirements
To validate whether ML can address current business challenges, many ML projects start with a proof of concept (PoC). We initially created a Jupyter Notebook or Google Colab environment to explore data, then developed several ad hoc functions and created the illusion that the project was nearing completion for stakeholders. These systems built directly from PoC may end up consisting primarily of glue code — supporting code that connects specific incompatible components but does not itself have data analysis functionality. They can be spaghetti-shaped, difficult to maintain, and prone to…