How to solve complex optimization problems with constraints that have discrete variables
Designing and implementing algorithms for complex problems is difficult. Fun, but difficult. What if I told you that you can solve certain optimization problems using only their mathematical specification? Join me on a journey into the wonderful world of mixed integer linear programming, which has its applications in nurses list, kidney exchange programs, production scheduling, robotic cell energy optimization, automated Sudoku solving, and many, many more! A common property of these problems is that they have a discrete solution space. Usually, a solution space is also bounded by constraints such that only a subset of values represents valid solutions. Therefore, widely known AI optimization algorithms (eg gradient descent) are not suitable for such problems.
In this series of posts, we’ll cover both the practical modeling of discrete optimization problems in Python and the theoretical machinery behind the scenes. The series is intended for professional programmers without a deep background in computer science and for machine learners for whom mixed integer linear programming can be an alternative tool to their familiar data-driven methods. A PhD in rocket science is not required to understand the content, although basic knowledge of mathematics (matrix algebra) and Python programming are assumed.
In this first introductory post, we’ll run into a classic optimization problem that we’ll solve using Mixed Integer Linear Programming in Python.
Imagine that you are the manager of a company. The company has accumulated a considerable budget and would like to invest it in various assets to maximize future profits. There are different types of assets, such as machinery or vehicles, each of which has a fixed cost and a estimated future profit if the good is purchased. The question you can ask yourself is: what assets should you buy today to maximize estimated future profitso you don’t go over budget (assuming you can’t buy a piece of an asset)?
As a smart programmer, you discover a direct method and penny pincher algorithm: rank assets in descending order of their estimated future profit per unit cost (i.e., divide each asset’s profit by its fixed cost) and buy assets along this ordered list until you run out of budget . Easy, although there is no guarantee that the assets selected in this way will be the ones that maximize total future profit. How much are you losing? Naturally, you may be wondering what the optimum selection of assets that you can buy so that no other group produces a greater profit.
So you decide to implement a more sophisticated heuristic algorithm, for example, based on genetic algorithms. After days of writing and debugging the code, you come up with an algorithm that has decent performance and the solutions obtained will have higher profits than those built from the greedy algorithm. But what if the problem requirements change, for example some assets cannot be purchased together? You would have to re-code and debug. And the cycle repeats…
Mixed integer linear programming addresses this problem. Instead of programming an algorithm, it describes its problem in a compatible mathematical language. Once the problem is formalized mathematically, it is passed to a ready-to-use Mixed Integer Linear Programming solver library to get the solution. And since these solvers are written by very smart people with a strong mathematical background, you may well get a better solution than you would from your hand-made algorithm. And if the problem requirements change, simply modify the problem description and then run the same solver again. Sounds too good to be true, right? So let’s break it down.
Mixed Integer Linear Programming (MILP) is called linear for a reason. And it is that: the mathematical description of a problem is nothing more than a bunch of linear inequalities and linear expressions. For example, the linear inequality
with variables x₁, x₂ and fixed parameters a₁, a₂, b₁ they are one of those beasts that appear in MILP formulations. Secondly,
is not a valid formulation within MILP, since there is a quadratic term x₁x₂.
So we know that MILP consists of linear inequalities that somehow encode the problem at hand: we call these inequalities restrictions. Given the fixed parameters, the MILP solver will attempt to plug values into variables so that the constraints are satisfied. The values of the variables that satisfy the constraints are collectively called a workable solution. However, MILP goes far beyond finding any solution that satisfies the constraints. We can look for a feasible solution that optimizes An objectivewhich is a linear function of the variables: optimization is finding the best feasible solution in terms of the objective value.
Variables in MILP can be rational, integer, or binary (which are in fact integers but restricted to 0 or 1); because of that there are mixed in the name. If all the variables in a formulation are rational, we get Linear Programming, which has good computational properties (but more on that in another post).
To put this into perspective for our budget problem, let’s formalize it. First, think about the fixed parameters. Suppose there is 𝑛≥1 different assets that can be chosen. each asset i∈{0,1,2,…,𝑛-1} have fixed cost 𝑐ᵢ and estimated future utility pᵢ. Finally, we are also given the budget 𝐵 that can be spent on the assets.
Now think about the solution. An active 𝑖 may or may not be purchased and it should be the job of the MILP solver to decide whether the asset 𝑖 is bought Then we will have a binary variable 𝑥ᵢ for each asset 𝑖which will be 1 if active 𝑖 is purchased and 0 if it is not. We determine the interpretation of the values of the variables in the model and this interpretation dictates how the rest of the problem formalization should be.
What are the constraints in our problem? We have only one and that is that the total cost of the assets purchased in the solution cannot exceed the budget. We can write this using the inequality
The interpretation of this restriction is the following: if I activate 𝑖 is not bought, then its cost 𝑐ᵢ will not be included in the total cost (the left-hand side of the inequality), since 𝑐ᵢ𝑥ᵢ=𝑐ᵢ0=0. On the other hand, if the asset 𝑖 is purchased, then its cost is included in the total cost.
Finally, we specify our objective, which is the maximization of the estimated total profit obtained from the purchased assets.
And there you have it, your first MILP formulation!
Once you have your MILP formulation, you can pass it to an existing MILP solver for a solution. Today, there are some commercial and even open source MILP solvers out there and I will cover their differences in a future post. For now I will use the python package Python-MIPthat groups the CBC Open Source Resolver. You can install it from PyPi with the following command (there are pre-built packages for Windows, MacOS and Linux)
pip3 install mip --user
Note to Mac M1 users: you may need to do some additional steps.
After that, fire up your favorite code editor and start typing!
MILP solution
Solver status: OptimizationStatus.OPTIMAL
Bought assets: [3, 4]
Total cost of the bought assets: 10
Estimated future profit: 180
Here we create a problem instance with 5 assets to choose from. After solving the MILP model, we got an optimal solution with an estimated future payoff of 180. By comparison, the greedy algorithm would give us a payoff of less than 160.
Greedy algorithm solution
Bought assets: [0]
Total cost of the bought assets: 8
Estimated future profit: 160
It may seem like 180–160=20 is a small profit difference, but consider that the profits are in millions of dollars, so it starts to get interesting, doesn’t it? Also notice that the asset 𝑖=0 with the highest benefit-to-cost ratio was not selected in the MILP solution (there is a better mix of assets leading to higher benefit).
By the way, how would we incorporate the additional requirement that two assets 𝑖,𝑗 can’t they be bought together? By including an additional constraint
m += x[i]+x[j] <= 1
in the model. You can check that this constraint is violated only in the case where both 𝑖,𝑗 they are bought together, because then the left hand side is 2, which is obviously bigger than 1. And that’s it, just one line of code and was able to accommodate an additional requirement in the problem. Phew.
In this first introductory post, we briefly talked about what Mixed Integer Linear Programming (MILP) is and why it is useful. It allows us to solve optimization problems without having to write algorithms. Instead, we write the problem description in a mathematical formulation, which is then solved with one of the many available MILP solvers. Compared to Machine Learning methods, MILP solving is not data-driven, as it takes advantage of the theoretical foundations of combinatorial optimization. We explored a simple optimization problem of buying assets and modeled it in Python (by the way: the problem is actually called backpack problem!).
In the next post, I’d like to define more formally what MILP is, what its solution space looks like, and what the computational complexity is (MILP is a powerful tool, but it has its limits).
All images, unless otherwise stated, are by the author.