[ad_1]
This article provides a step-by-step example of how the Hungarian algorithm solves the optimal assignment problem on a graph
The reason I am writing this article is because I struggled for a few days to understand how the Hungarian Algorithm works on a graph. The matrix version is much easier to understand, but it does not provide the required insight. All the excellent information I found on the web fell short of providing me the clarity needed to intuitively understand why the algorithm is doing what it does.
It was also not straightforward for me to translate the algorithm descriptions into a working example. While the various LLM tools we have today helped in rephrasing the description of the algorithm in various ways, they all failed miserably when I asked them to generate a working step by step example. So I persevered to generate an example of the Hungarian algorithm working its magic on a graph. I present this step by step example here and the intuition I gained from this exercise, in the hope that it helps others trying to learn this wonderful algorithm to solve the problem of optimum assignment.
The optimal assignment problem is to find a one-to-one match from a set of nodes to another set of nodes, where the edge between every pair of nodes has a cost associated with it, and the generated matching must ensure the minimum total cost.
This problem is pervasive. It could be assigning a set of people to a set of jobs, with each one demanding a specific price for a specific job, and the problem is then assigning people to the jobs such that the total cost is minimized. It could be assigning a set of available taxis to the set of people looking for taxis, with each taxi requiring a specific time to reach a specific person. The Hungarian algorithm is used to solve this problem every time we book a Uber or Ola.
The assignment problem is best represented as a bipartite graph, which is a graph with two distinct set of nodes, and the edges never connect nodes from the same set. We take the taxi matching example, and the bipartite graph here shows the possible connections between the nodes. Edge weights indicate the time (in minutes) it takes for the specific taxi to reach a specific person. If all the nodes from one set are connected to all the nodes from the other set, the graph is called complete.
In this example, we need to map 4 persons to 4 taxis. We can brute force the solution:
- For the first person, there are four possible taxi assignments, and we can pick any one of them
- For the second person, there are three taxis left, and we pick one of the three
- For the third person, we pick one of the two taxis left
- For the last person, we pick the remaining taxi
So the total number of possible assignments are 4 x 3 x 2 x 1, which is 4 factorial. We then calculate the total cost for all these 4! assignments and pick that assignment that has the lowest cost. For assignment problems of smaller size, brute force may still be feasible. But as n, the number of people/taxis grow, the n! becomes quite large.
The other approach is a greedy algorithm. You pick one person, assign the taxi with minimum cost to that person, then pick the next person, assign the one of the remaining taxis with minimum cost, and so on. In this example, the minimum cost assignment is not possible for the last person because the taxi that can get to him at the shortest time has been assigned to another person already. So the algorithm picks the next available taxi with minimum cost. The last person therefore suffers for the greed of everyone else. The solution to the greedy algorithm is can be seen in the graph here. There is no guarantee that this greedy approach will result in an optimum assignment, though in this example, it does result in achieving the minimum cost of 36.
The Hungarian algorithm provides an efficient way of finding the optimum solution. We will start with the matrix algorithm first. We can represent the graph as an adjacency matrix where the weight of each edge is one entry of the matrix. The graph and its adjacency matrix can be seen here.
Here are the steps of the Hungarian algorithm operating on the adjacency matrix:
- For each row in the adjacency matrix, find and subtract the minimum entry from all the entries of that row
- For each column in the adjacency matrix, find and subtract the minimum entry from all entries of that column
- Cover all zeros in the matrix with minimum number of lines
a. Count the number of zeros in each row and in each column
b. Draw lines on the row/column that has most zeros first, then second most and so on
c. If number of lines required cover all zeros is less than the number or row/column of the matrix, continue to step 4, else move to step 5 - Find the smallest entry that is not covered by a line and subtract that entry from all uncovered entries while adding it to all entries covered twice (horizontal and vertical line), go to step 3
- Optimum assignment can be generated by starting with the row/column that has only one zero
The first two steps are easy. The third step requires crossing out rows or columns in the order of the number of zeros they have, highest to lowest. If the number of rows and columns crossed out does not equal the number of nodes to be matched, we need to create additional zeros — this is done in the fourth step. Iterate step 3 and step 4 until enough rows and columns are crossed out. We then have enough zeros in the matrix for optimum matching.
The steps of the Hungarian algorithm for this taxi matching example are shown in this animated GIF.
For those to whom the GIF is hard to follow, here is the image that shows the steps.
The algorithm terminates once the number of rows/columns that have been crossed out equals the number of nodes to be matched. The final step is then to find the assignment, and this is easily done by first assigning the edge corresponding to one of the zero entry, which in this example, could be (P1,T2) by picking the first zero in the first row. We cannot therefore assign T2 to anyone else, so the second zero in the T2 column can be removed. The only remaining zero in the P4 row says it must be assigned to T1, so the next assignment is (P4,T1). The second zero in T1 column can now be removed, and the P2 row then only has one zero. The third assignment then is (P2,T3). The final assignment is therefore (P3,T4). The reader can compute the total cost by adding up all the edge weights corresponding to these assignments, and the result will be 36.
This assignment process is much more intuitive if we look at the GIF animation, where we have a subgraph that connects all the nodes, and we can create an alternating path where the edges alternate between matched (green) and unmatched (red).
[ad_2]
Source link