Wednesday, June 19, 2013

How to fail in TopCoder Marathon round 3

You are given N circles. Just because the problem setter is evil, the distribution for N is not uniform , the numbers are from 50 to 500, and have a bias towards the bottom. Each circle has a random center inside the [(0,0), (1,1)] rectangle, a random radius (rather small) and a random mass m[i]. They are of course going to overlap initially. Move the circles to any other positions in such a way that they no longer overlap and the total work you spend is as little as possible. The total work is equal to the sum of m[i]*(distance between original and final position) for each i.

So the objective is to waste 2 weeks of your life trying to come up with a good solution to this problem that runs in 10 seconds. If you acquire a good rank (probably at least 44-th) you win a t-shirt. The kind of t-shirt you couldn't win in the algorithm track of the tournament. If you do very good, and that means top 4, you win a free trip. I have no idea what is necessary to ever be in the top 4 of a real Marathon match. I suspect witchcraft is somehow related to it, though.

First week

After a very poor performance in the round 2. I decided to take round 3 very seriously and start working on it since the first day of the match. It was a great plan, and I think I could have gotten a t-shirt if I went through it. However, the week this match started was quite a difficult week. I was busy with an editorial, and then I became busy having to work in problem setting Block3Checkers - That problem that was used in the algorithm round. I actually barely noticed the marathon started the Satuday after the problem was finished. But then I still had to work on the *new* editorial...

I could barely start to work in this problem on last week's Tuesday. Almost all of the first week was not productive.

First version

I dedicate the first version of every marathon to the most trivial approach I can think of. The main purpose of the first version is to test the test environment. I have to tweak plenty of things. I have to adapt the visualizer provided with the problem to my ends. Adapt the testing to the thing that runs 500 tests for me. Make the thing that grades and ranks solutions, etc.

In this first version, I pick an arbitrary circle as the first circle. Then for each of the other circles, I try to put them as close as possible to the first circle, moving in the same angle as the one between the first circle and the new one. Using a binary search for the distance.

In the image, small red points are the original locations of each circle. Darker circles have larger m[i].

Even this solution was non-trivial, large Ns were timing out. It turns out that large cases don't have enough time to try all circles as the best initial circle. So I had to start with the larger circles and break when the time runs out.

More so, there was a bug with this solution, so I had to submit it twice. The outcome was position 169, very low. I knew I needed to point towards top 40 if I wanted a t-shirt.

Third version

So I already knew this problem was going to be complex. Too much geometry and too many factors to consider. One of them was , how exactly to decide where to place each circle? So I thought of something. I created a grid of sqrt(N) x sqrt(N) circle centers that were so separated from each other to ensure that no circle overlaps if put in those centers. Then I ran a bipartite matching between each circle and the new centers. Picking the matching with the best total work.

After the positions are decided, of course there will be tons of empty space between the circles. Do not despair, for I am adding a function that will try the circles in order, and try to move each circle as close as possible towards its center (in that direction). Using binary search.

Bipartite matching is too slow for large cases. So large cases still use the old approach.

Didn't really improve my score that much. Note how the circles in the two images are not in the same scale. The green square is the square from corner (0,0) to (1,1), all centers are picked inside it.

Fourth version

It was the beginning of a new day, and had a silly idea to try to improve the score. The original grid of empty spaces where to place the circles looks like this:

As you can see, that's not the most efficient use of all the space. So how about adding new, smaller spaces between the gaps?

It did improve my results. One of the reasons is that the radii are not uniformly distributed either, so there is a good quantity of small radii as well as very large. This took me to rank 146, still too low.

Fifth version

I needed a good idea, and I needed it quick. However, all I could think of was in lower level optimizations, I made a new min-cost max flow solver that was a bit faster. And indeed it was, it allowed me to use the matching strategy in larger N, but not much else. There was a mild improvement.

Sixth version

Bipartite matching is not the only kind of matching. We could always use a greedy matching. I decided to use greedy matching for larger values of N. Finding the closest position for the large values of m[i] first.

I also noticed that the whole binary search I was using to tighten the circles was slow and not very precise. Then it struck me: I can use equations! So you have a circle at some position (x1,y1) away from the original position (x,y), you want to find the closest position to (x,y) in the same angle that does not overlap with other circles. This means that you want a position that is tangent to some other circle. For each other circle, it is possible to find the positions (in the picked angle, if they exist) that would make the circle tangent to them. Then you need to pick one of these tangets that doesn't overlap with other circles....

This improved things up, but the equation is the important part, it opened some doors...

Seventh version

Another day passed. Kept thinking that the matching idea is probably not the way to go. It is very slow, and at the end, it creates gaps that you then waste tons of time "fixing". A better approach would need to be fast (so I could have some sort of iterative improvement to a solution

What I did like was the equation stuff from the previous solution. Then I somehow thought of something. An approach sort of similar to the very first one, but smarter: First decide an order in which to add the circles. Let us go with heaviest first. For each circle, try to place it as close as possible to its original location without overlaps. To reuse the equation, in each step we pick an angle (from the center), one of 360 ones. Using the angle, it is possible to find the best position for the circle with that angle.

This approach was slow at first. For each of N circles, we needed to repeat the same operation 360 times. The operation itself was slow because it needed to calculate positions using all of the previous circles. For each position it found, it needed to use all the previous circles again to avoid overlap... So slow, that I needed to make it use 180 angles for large values of N, making it less accurate.

This was still the most significant improvement I made. First time in the first page of standings. Top 100. But not good enough. The good thing about this approach is that it has plenty of room for improvement. How to pick the order? Picking the order is very friendly with iterative randomized searches, which makes the question of how to make the process - given an order - faster. How to pick the angle? , for example. How not to do that much work using all of the circles and instead only use the relevant ones? etc.

Eight version

I took a break for three days, because I was very late in writing the editorial for the TCO Algorithm Round 3A. My chronic headache (Which has just entered its 9-th consecutive week! Congratulations!) was also acting up. So I couldn't do much for three days. But I was tweaking the solution and doing X and Y things trying to improve. In the last day, I put things up together for a new submission. Surprising enough, all the small things I did at random times during the three days could really improve my score.

I tried to do Hill climbing for picking the order. In each iteration, pick two random indexes of the order array, swap them and calculate the minimum work when using that order, if the work is an improvement, replace your current order array with the new one, so that in the next iteration, that is your base. There are two problems with Hill Climbing. One is that my approach was too slow (barely executing once in large cases) the other is that Hill Climbing is not very good. It is prone to local minimums and getting stuck. To deal with the speed issue a bit , I changed the number of angles to 180 (small Ns) and 90 (large Ns), so that even large cases would have 4 or 5 iterations to try a modification of the order.

The initial order could also use some fixes. m[i] is a very good marker, but there are other things. For example, you could argue that it is more convenient to add circles that are close to the center (0.5, 0.5) first. So that there are less overlaps. But the weight is still important, so you need a sort of heuristic to combine both ideas. I settled with: sqrt(m[i])*m[i] * (1.5 - (distance to 0.5,0.5) ). Why? You ask. I have no idea, but it seemed to do better in local tests than other things I thought of...

I was able to optimize the cost of the function that adds the circles. Because it tries angles in all directions, then all already-place circles are relevant in that step. But only some few circles are relevant to some angles. In an initial step, I process all of the already-existing circles and find the lower and upper bound for an angle (starting at the original position of the new circle) that would have a ray that intersects with the circle. This is easy done by noticing that there is a straight triangle in there, with the upper and lower angles and that we know two of the distances, so it takes as much as calling asin.

This was good enough to reach top 50, back on June the 18-th... (Everybody keeps improving and your rank keeps getting worse if you don't improve).

Ninth version

A new day. Forgot to mention that the reason I focused so much in this approach is that I could show to myself that the order in which you add the circles alone can greatly improve my score. I tried with N=8 and all permutations. It was clear that if I had a good way to pick the order in which to add the circles, I could beat the first place even. But the problem is how to find this order.

A good idea is to make sure the initial order is as good as possible. The better the initial order, the less iterations you need before finding a GREAT order. In theory at least.

So I changed the order. To the previous formula I multiplied: m[i] / (sum of all m[i]s that overlap with the circle). The idea is that if a circle displaces plenty of heavy circles, it is probably worse as an initial pick that a light circle that does not displace heavy circles. This is all out of my mind and completely made up, but these crazy formulas worked in tests. This is the reason I have 500 tests, and a tester system that runs one in each cores... Before finding a good formula I try tons and tons of bad ones. However, once you reach to the hill climbing phase of a problem, testing becomes very tedious. Change a small bit of code, wait 15 minutes for a result. It is frustrating to wait. Specially near the end of the match...

Tenth version!

So I worked on ways to increase the number of iterations. The best I could find was a logic that made me stop needing to check, for each of the possible positions, if they overlap or not. Back to the tangents approach from 0.6, the two tangents delimit an interval of distances that we are unable to use. So for each relevant circle we have one of these intervals. We can use a line-sweep algorith. Just consider each start of an interval as an open bracket event, and each end of an interval as an end bracket event. After sorting the start/end points, it is easy to find the smallest distance that is not inside any bracket...

So I was finally having plenty of iterations. More than a thousand for small cases, and tens of iterations for the large cases. Remember I said Hill Climbing is not that great because of local minimums? It is the most noticeable when you have plenty of iterations. So I tried to implement a simulated annealing. And I failed. I thought I wasn't failing, because I was getting some good results after adding the annealing. But I was a victim of confirmation bias. A couple of versions later, I noticed that the only reason my annealing was working at all was that it wasn't working at all. Whenever I fixed the formulas to make the annealing work, and keep a less than-optimal solution in memory, results became worse. All that the annealing was doing was add instability to my test results, which made them look better, or worse after small changes, but it wasn't really so. I really need to learn SA....

11-th version

It was yesterday's night. My sleep time was approaching and I had nothing. I reached that moment in Marathons in which you cannot improve anything and can only manage to improvise and do small tweaks. I had just discovered that the SA was a total failure and ended up disabling it completely and decided to stick to Hill Climbing instead.

I needed a good idea. 11:00 PM was approaching, but then I had a good idea!. The angle searching! I can improve the precision of the angle searching. First try angles in increments of 12 (0, 12, 24, ...) then pick the better angle. The trick is that then you go towards the 22 angles next to 12, so in a way you do fulfill precision similar to picking the angles in increments of 1 (In reality this misses a few cases in which the real minimum was well hidden between two multiples of 12 that weren't better than a local minimum). The benefit is that this is faster (needs to check 52 angles in total), but it is more accurate than the previous approaches. So this improves a solution in two ways. Accuracy, and more iterations = better chance to find a good order.

That was in theory. In practice I was not having success making it work. The second phase was giving way wrong results (tons of circles overlapping together). Something was definitely not good with this. Was the theory wrong? Or is there a small bug?

I looked for the bug, but I couldn't find it. I knew that today, I was not going to be able to spend much time in the problem during the morning. It is also not good to do major changes during the last hours because you might end up making a strong mistake that you can't find and fix in time. So that night was probably my last chance to try a significant improvement...

11:00 PM passed, but I still didn't want to give up. Surely one day of not sleeping at 11:00 PM, won't delay my headache recovery much longer than it already is! (And after all, the whole idea to keep a strict sleep schedule seems based on a small study and doesn't seem to have that much success (Or does it? I am noticing some improvement, even though my days with headache have definitely not ended)... Nevertheless, I was still looking for the bug and it was already 12:00 AM...

I kept looking, decided to basically compare the two phases variable by variable until I find a discrepancy. Indeed, some variables weren't working well. I slowly got to the rood of it... It was near 1:00 AM, when I finally saw it, when converting the angle to radians I had int a = angle * PI / 180. I wonder if my readers have noticed the blatant mistake in this.... when I fixed it, and ran tests, it was beautiful. Finally a significant change! It wasn't so significant, per se.. , but enough to climb to 86XXXX score and back to the top 50. It was a symbolic victory, I knew that today, every coder was going to improve and my top 50 was going to be short-lived...

12 and 13-th versions

More tweaks there and there. I tried to think of good ideas. I had little time in the morning, but eventually it occurred me to try a dirty trick. I now use the second iteration to try a new formula for the initial ordering. I was trying this new formula (giving a small advantage to things with small radii) but it was having mixed results- Better results in some cases, worse in others. Why not combine both? I planned this as my last submission, and then went to college. Results were not that bad.

I came back from college earlier than I planned, there were 15 minutes before 12:00 PM, I decided to try another dirty trick! Third iteration tried the simple ordering by m[i]. It seemed to be working fine in local tests, but I didn't have time to finish testing all 500s, so I decided to risk it and submit...

Surprise! Turns out that my 12-th submission was done at 10:15 AM, so I actually had no way to make a new submission before 12:00 PM. That's a bummer. ... wait.. Surprise! The match ends at 13:00 not at 12:00. (what?) so I have a whole hour to try more things. I waited for the local tests for the new dirty trick to end. A small improvement with less variance!. So I decided that I could always do with a new dirty trick. Went to lunch while testing it. Just blatant m[i]/r[i].

I came back around 12:50, the new dirty trick did great in the local tests. When combined, the two tricks would make me improve around 10^4 in points, not bad. So I decided to submit it. ... Surprise! I can't submit because I made a whole submission at 12:35... What? I really didn't remember making that submission. I think I intended it to be an example submission to make sure I don't have any new time outs, but I accidentally made it a full submission. So, only one dirty trick could be used :(.

Preliminary results

Position 57-th, I have to improve 13 positions (not impossible, but unlikely) in the final tests if I want a t-shirt. Ouch.

Did I mention that because I went to sleep at 1:00 AM last night, my headache has gotten very bad today?

But it was a great and fun match. I think I really need to learn Simulated Annealing though.


Shih-Shun Mak said...

Hi, i subscribed your feeds for a long time. Can you tell me what software did you use to draw these beautiful graph?

vexorian said...

The cases that show the movement of the blue circles are just screenshots of the visualizer that TopCoder provided for this problem.

The other things were made with Inkscape and that's what I usually use.