Why shouldn’t the focus of a project be on the use of complex techniques? In my opinion, there are three main reasons, which I will explain here.
Reason 1. Business doesn’t care
The first and most important reason is that the business does not care! Your stakeholders are not interested in the technical details of your model. Whether you’ve used boosted trees or a neural network makes no difference to them. What they want to know is how your model helps them achieve their business goals. If the model needs to be retrained frequently, for example, you can justify choosing a simple model like logistic regression over a neural network because it’s much faster to train.
Often the primary goal of a machine learning model is not to achieve 100% accuracy. Instead, the model is there to support a business process. Spending too much time optimizing the model delays the delivery of a working product. It’s better to create an MVP, make sure it meets the business requirements, and put it into production. It is critical to consider not only performance, but also interpretability, computation speed, development costs, robustness, and training time. These factors can be just as relevant to the business as performance.
Besides you, the people who do care about complex models and state-of-the-art methods are usually data science researchers or colleagues. If you work too closely with them instead of with the business, you may get to the point where you think modeling is the main goal. To overcome this, try to work more closely with business people. Demonstrate your product after each new feature is implemented and ask the business whether your assumptions are correct. Decisions that seem small to you can be really important to them.
Reason 2. A complex model adds less value than a working MVP
The more time you spend on the model, the less time you have for good engineering principles like modular coding, testing, architecture, logging, and monitoring. Setting these things up in a good way at the beginning saves a lot of time later. You can easily add new features to a strong code base. This is more valuable than having a complex model in a Jupyter Notebook that works a little better but doesn’t run in production. Another benefit of a simple model is interpretability, which can help convince stakeholders because they can see that the predictions make sense.
Especially in the beginning, focus on building a working product with strong code and a well-designed CI/CD pipeline. This makes it easier to improve the solution later. And if the business does not feel the need to improve the current solution, you can move on to another project without having wasted time creating a ‘perfect’ model.
Related to this is the Pareto principle, also known as the 80/20 rule. It states that roughly 80% of the results can be achieved with 20% of the effort. A complex model that performs slightly better than a simple one is usually not part of that first 80% of the results. It is the hard-to-reach last 20% that takes 80% of the effort, and it is difficult and time consuming. Before you start, convince yourself that it’s worth it.
Reason 3. Complex projects require more maintenance
The more complex the project, the more resources and time are required to maintain it. This means you’ll spend more time fixing bugs, optimizing the model, and keeping data up to date, and less time adding new features or improving the product. A simple project, on the other hand, requires less maintenance, which leaves you more time to iterate on the MVP and add new features.
An important thought to keep in mind is that the best solution is often the simplest solution that fits the requirements. This can help you determine if that state-of-the-art deep learning model is really worth the extra work that goes into it! If there are two models that work equally well, and one is simple and the other complex, choose the simple one.
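As a rough illustration of that last rule, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Candidate` class, the `pick_simplest_adequate` function, the complexity ranks, and the scores are all hypothetical, and the scores are made-up numbers rather than measured results. It simply picks the simplest model whose validation score is within a chosen tolerance of the best one.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    score: float      # validation metric, higher is better (illustrative values)
    complexity: int   # hand-assigned rank, 1 = simplest (an assumption)


def pick_simplest_adequate(candidates, tolerance=0.01):
    """Return the simplest candidate scoring within `tolerance` of the best."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best_score = max(c.score for c in candidates)
    # Keep every model that is "equally good" for practical purposes.
    adequate = [c for c in candidates if best_score - c.score <= tolerance]
    # Among those, prefer lower complexity; break ties by higher score.
    return min(adequate, key=lambda c: (c.complexity, -c.score))


# Made-up scores, only to show the mechanics.
candidates = [
    Candidate("logistic regression", score=0.870, complexity=1),
    Candidate("boosted trees", score=0.875, complexity=2),
    Candidate("neural network", score=0.878, complexity=3),
]

chosen = pick_simplest_adequate(candidates, tolerance=0.01)
print(chosen.name)
```

The tolerance is the part worth discussing with the business: it expresses how much performance you are willing to give up in exchange for a model that is cheaper to build, explain, and maintain.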
An example from my own work: I tried to solve a programming problem with reinforcement learning. It was quite complex and we were moving slowly. The business got a little upset and disappointed that we couldn’t show good results. When we changed our solution method to (good old) mathematical optimization, progress was much faster! It was less interesting, but we gained the trust of the business and were able to easily implement new features and constraints.