Thursday, January 17, 2013

In case you are wondering why that editorial was taking so long

Since some TopCoder matches ago, I was given the responsibility of being the editorial writer for all Topcoder matches. Too bad that my last two experiences did not go very well in regards to the time these editorials are taking.

In SRM 565, the main cause of issues was the division 1 Hard. I think it was one of the hardest problems of 2012, if not the most confusing one. SRM 565 finished it on a Thursday, and I intended to get the editorial ready the next Friday. Because, mind you, the time of the year was a very complicated one and I had to do a ton of shopping in Saturday and stuff...

The plan failed. I could not really do much about understanding the hard problem until the Sunday. The Sunday was a day in which I was forced beyond my will to spend a good chunk away of a computer with extended family. Meh. But the good thing is that the empty time was a good chance to get to think about the problem. Points, trees, paths, distances. I got something that *looked* like an idea. I could only begin to test it until that Sunday night though. If you head to the editorial the part that I got correct that day was the section [Variation with a fixed center].

When I implemented, I noticed some issues with the other variation, I tried hard to fix them. I actually got it correct. But I was failing one of the big, random test cases. I spent the next two days working on it. Those two days happened to be Christmas eve and Christmas :).

It turns out that I underestimated the "2 roots" version greatly. The recursive part from the editorial was not in the first draft of the solution. I finally found the challenge case that would defeat my first wrong solution during Christmas, it was ok.

SRM 566 now turns out to have a similar fate. But at least I have finally solved the div1 hard.

Div1 hard: The one with penguins and polygons

There are numPosts (at most 222) posts scattered around the border of a circle at equidistant angles. There are also penguins (at most 50) of possibly different colors at some fixed positions. The objective is to create closed polygons (using the posts as vertices) in such a way that a) All polygons contain at least one penguin. b) No two segments cross. c) All penguins of each color are inside one of the polygon (A polygon may contain various colors). The objective is to return the total number of ways to create these polygons modulo 100007. Two ways to create polygons are different if a segment is present in one and not in the other.

The solution is not that hard conceptually. It is typical dynamic programming. But it is hard to think of it, understand it and also cover all possible corner cases. When you solve a counting problem using dynamic programming and there is a circle involved, you need to be careful not to count the same thing twice...

I will not explain it fully right now because that bellongs to the editorial. But the solution requires two main ideas:

  • There are O(numPosts^2) possible segments. If a segment separates two penguins of the same color (one penguin is at one side to the segment and the other at the other side) then this segment is certainly invalid. In fact, it does not matter what you do with the polygons, as long as you only use segments that are not invalid, the polygons will always contain all the penguins of one color. Once you identify the invalid segments, the problem becomes about counting the ways to create polygons such that each polygon uses only valid segments and contains at least one penguin.
  • To solve the new problem, think about how to create a function f(l,r) that counts the total number of ways to have polygons using only the posts with indexes between l and r, inclusive. An easy way to avoid issues is to, pick the index of the smallest vertex that will be used and the index of the vertex it will be connected to. This already creates a sub problem of the form (nr + 1, r) (more details in the editorial). But then you have to solve g(l,r): The total number of ways to create a polygon that starts with l and finishes with r. While you create this polygon, you create new subproblems f(l,r).

I was barely able to come up with this solution yesterday, and I had to spend a lot of time debugging. Until I ran out of ideas about how to fix the solution. I figured many corner cases in the process, but I reached a moment in which the only example case I was failing was a big one that I could not debug.

I spent all this time, since last afternoon to now, trying to come up with the reason. One last logic flaw in my polygon counting. One last small bug, maybe? The more time I spent, the more convinced I was that the logic was correct. So it must be a small issue. I could not help but notice that I was only failing large cases, and could not find a small case that was wrong.

- Maybe it is overflow?

-"NAH" I foolishly replied myself. When I first read the statement, I noticed that the modulo value is 100007 instead of 1000000007. Such a small value gave me the idea that the admins decided to use a small value because that would allow you to use only integer operations (for product) and since the processors in TopCoder are 32 bits, this would speed up things. Maybe they intend the solution to be a bit tight.

I spent more hours looking for issues. Until just now, I finally notice: 100006 * 100006 can still overflow!. All this time I skipped the overflow possibility because of a lame assumption. Right now I have no idea why they picked a smaller value than usual for the modulo. If it was 10007, then indeed we could have avoided 64 bits operations. But the 100007 is quite a mystery.

It is a bit discouraging that after all this time I still manage to get problems wrong because of overflow. Such is life, I guess.

Div1 hards cause too many editorial delays

Looking back, most of the time I spend making an editorial is spent trying to solve the div1 hard. Mind you, I am only a yellow, soon to be blue, coder and some of these problems are not even solved by guys like Petr. If the best coders cannot solve a problem, then it is likely I will need quite some time to do it.

We are working in new ways to deliver editorials that might allow the editorial for the easier problems to appear in less than a day (at best) so that at least those problems get delivered faster. Let us see how well this goes in the next SRM.

No comments :