Thursday, December 20, 2012

SRM 565: Huge little differences

Ouch, now with 16XX. My new strategy of not opening div1 easy until 10 minutes before the end of the match is really killing my rating.

Div1 1000: The one with trees

This one seems very tough. I am very worried because I will have to write the editorial and explain this problem. Seems I will be working hard these two days :(

Div1 500: The one with the division game

Two players play a game with a list initially consisting of all numbers in an interval [A, B]. A turn consists of picking one of the numbers in the list, finding a divisor different from 1, and dividing the number by that divisor. The player who has no valid move loses the game.

You are given L and R; count the number of intervals [A,B] with L <= A <= B <= R in which the first player wins (both players play optimally).

All right, I just recently solved a problem that used nimbers (Grundy numbers); it was actually the first time I learned about them, and I had to write the editorial. So they were fresh in my mind today. The game is a two-player, symmetric, turn-based game, so each state of the game has a nimber.

First let us find the nimber of a single game consisting of a number x. I had no ideas, so I just computed it for the first 40 numbers looking for a pattern: define a function nimber(x) that calculates the nimber. Each divisor of x gives us a number y with a nimber of its own. The minimum non-negative integer that is not in the set of nimber(y) values is the result (the minimum excludant, or mex).

In fact, after doing all of that and printing the results from 1 to 40, it was clear that nimber(x) = number of prime factors of x, counted with multiplicity (0 for x=1, 1 for prime x). Now it is clear why this happens. Imagine a number as a set of stones, each labeled with one of its prime factors. So 12 is two stones labeled 2 and one stone labeled 3. Dividing by a divisor of the number is the same as taking some of the stones. For example, 12/4 = 3 is the same as taking the two stones labeled 2. What matters all the time is the number of stones, not their labels. This becomes exactly a nim game in which the number of stones in the pile equals the number of prime factors.
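That brute-force pattern search can be reproduced in a few lines. The following is a sketch of my own (the names nimberBrute and omega are hypothetical, not from the problem): it computes the Grundy number directly from the mex definition, so it can be compared against the count of prime factors.

```cpp
#include <map>
#include <vector>
using namespace std;

// Grundy number of a single number x in this game: a move divides x by any
// divisor d != 1, so the positions reachable from x are the values x/d.
int nimberBrute(int x) {
    static map<int, int> memo;
    map<int, int>::iterator it = memo.find(x);
    if (it != memo.end()) return it->second;
    vector<bool> seen(64, false);
    for (int d = 2; d <= x; d++) {
        if (x % d == 0) {
            seen[nimberBrute(x / d)] = true; // nimber of the reachable position
        }
    }
    int mex = 0;                             // minimum excludant
    while (seen[mex]) mex++;
    return memo[x] = mex;
}

// Number of prime factors of x, counted with multiplicity.
int omega(int x) {
    int count = 0;
    for (int p = 2; p * p <= x; p++) {
        while (x % p == 0) { x /= p; count++; }
    }
    if (x > 1) count++;
    return count;
}
```

Looping x from 1 to 40 and checking nimberBrute(x) == omega(x) confirms the pattern described above.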

Let us say that we have found the number of prime factors, and hence the nimbers, for all n numbers between L and R, inclusive. The problem becomes: given n numbers, where n is up to 1000001, count the number of intervals of consecutive numbers in that list such that the xor of the numbers in the interval is not 0. (Each number is a nim game, and a combination of nim games has a nimber equal to the xor of the individual games' nimbers.)

This new sub-problem can be solved with dynamic programming. Define f(x,i) as the number of intervals that start at position i and have a total xor equal to x. (The final result is then the sum of all f(x,i) such that x is not 0.) After placing number i in the interval, you have two choices: finish the interval with this number, in which case x = number[i]; or continue the interval (and include number[i+1]), in which case there are f(x ^ number[i], i+1) intervals that work like that.

The slight inconvenience with that dynamic programming approach is that an array [32][1000001] will not fit in memory (the maximum xor is 31, because the maximum number of prime factors allowed by the constraints is 29). But fear not, we can handle this complication by using an array [32][2], because each i only needs to remember the results for i+1.
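As an aside, this counting sub-problem can also be solved without the dp, via prefix xors: an interval [a,b] has xor 0 exactly when the prefix xor before a equals the prefix xor after b. This was not the approach used in the match; the following is a sketch with hypothetical names.

```cpp
#include <map>
#include <vector>
using namespace std;

// Count intervals of consecutive elements whose xor is NOT zero:
// all intervals minus the zero-xor ones, where each zero-xor interval
// corresponds to a pair of equal prefix xors.
long long countNonZeroXorIntervals(const vector<int>& v) {
    long long n = v.size();
    long long total = n * (n + 1) / 2;   // number of intervals overall
    map<int, long long> freq;
    freq[0] = 1;                         // the empty prefix
    int prefix = 0;
    long long zeroXor = 0;
    for (int x : v) {
        prefix ^= x;
        zeroXor += freq[prefix];         // each earlier equal prefix closes a zero-xor interval
        freq[prefix]++;
    }
    return total - zeroXor;
}
```

For {1, 2, 3} the intervals with nonzero xor are {1}, {2}, {3}, {1,2} and {2,3}, and only {1,2,3} xors to zero.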

THE REAL problem (at least for me) was to actually calculate all the nimbers from L to R. This means computing the number of prime factors for up to 1000001 numbers. In the worst case, each of those numbers is greater than 999999999. Straightforward approaches are too slow, and I could not think of anything during the match.

In my frustration, I decided to just submit a solution with the dynamic programming and all the logic already in place, but using a slow approach for the nimber calculation. I knew it would fail. But at least it shows I solved the rest ^^.

At the end of the match, I read Petr's code and the trick to calculate the nimbers became clear. It is similar to a Sieve of Eratosthenes, but caring only about the numbers from L to R. Given a prime p (up to sqrt(R)), we just iterate through all multiples of p between L and R, dividing out p and increasing the prime factor counts. This saves a lot of time.

typedef long long int64;
#define long int64

int current[1000001];
int nimber [1000001];

long dp[32][2];

struct TheDivisionGame
{
    long countWinningIntervals(int L, int R)
    {
        int n = R - L + 1;
        // Calculate nimber[i] for each number between L and R
        // (the number of prime factors), sieve-style.
        for (int i = L; i <= R; i++) {
            nimber[i - L] = 0;
            current[i - L] = i;
        }
        for (int p = 2; p <= R / p; p++) {
            int s = L;
            if (L % p != 0) {
                s += p - (L % p);   // first multiple of p that is >= L
            }
            while (s <= R) {
                while (current[s - L] % p == 0) {
                    current[s - L] /= p;
                    nimber[s - L]++;
                }
                s += p;
            }
        }
        // Some numbers might still have current[i] > 1: the remaining
        // cofactor is a single prime greater than sqrt(R).
        for (int i = L; i <= R; i++) {
            if (current[i - L] != 1) {
                nimber[i - L]++;
            }
        }

        // Now the dynamic programming part:
        long s = 0;
        // dp[x][i & 1] : number of intervals that start at position i
        // such that the total xor of the interval is x.
        for (int i = n - 1; i >= 0; i--) {
            for (int x = 0; x < 32; x++) {
                dp[x][i & 1] = 0;
                // Use nimber[i] and continue with nimber[i+1]: we need the
                // number of intervals starting at i+1 that give x ^ nimber[i].
                if (i + 1 < n) {
                    dp[x][i & 1] += dp[x ^ nimber[i]][(i & 1) ^ 1];
                }
                // The interval is just {nimber[i]}: then x must equal nimber[i].
                if (x == nimber[i]) {
                    dp[x][i & 1]++;
                }
                // Count the intervals with x != 0
                if (x != 0) {
                    s += dp[x][i & 1];
                }
            }
        }
        return s;
    }
};
#undef long

Div1 250: The one with monsters

I exceeded my allotted time on div1 500 and actually had 8 minutes left to solve this problem. I think I should have been able to solve it, though. At first it seems like an innocent dynamic programming problem. Then you notice that the scariness of the monsters can be very high.

The trick is to notice that prices are 1 or 2, so the total amount of money is small. You can reverse the problem to calculating f(i, p): the maximum scariness you can buy using the first i monsters while spending at most p. This allows a recursive solution. Say you want to solve f(i,p); you have to decide whether or not to bribe monster i-1. If you bribe it, your available money becomes p minus its price, and the scariness is scariness[i-1] + f(i-1, p - price[i-1]). If you decide not to bribe the monster, then the maximum scariness f(i-1,p) you can find must be at least scariness[i-1].
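A possible sketch of that reversed dp, under the assumption that to pass a monster you either bribe it (paying its price, gaining its scariness) or your accumulated scariness must already be at least its scariness. The function name minMoney and the exact limits are my own; the real statement may differ in details.

```cpp
#include <vector>
#include <algorithm>
using namespace std;
typedef long long ll;
const ll NEG = -1e18;   // marker for "impossible"

// f[i][p]: maximum total scariness after passing the first i monsters
// while spending at most p coins (NEG if impossible). Returns the
// smallest budget with which all monsters can be passed.
ll minMoney(const vector<ll>& dread, const vector<int>& price) {
    int n = dread.size();
    int maxP = 0;
    for (int c : price) maxP += c;   // prices are small, so maxP is small
    vector<vector<ll>> f(n + 1, vector<ll>(maxP + 1, NEG));
    for (int p = 0; p <= maxP; p++) f[0][p] = 0;
    for (int i = 1; i <= n; i++) {
        for (int p = 0; p <= maxP; p++) {
            // bribe monster i-1: pay its price, gain its scariness
            if (p >= price[i-1] && f[i-1][p - price[i-1]] != NEG)
                f[i][p] = max(f[i][p], dread[i-1] + f[i-1][p - price[i-1]]);
            // pass without bribing: need enough scariness already
            if (f[i-1][p] != NEG && f[i-1][p] >= dread[i-1])
                f[i][p] = max(f[i][p], f[i-1][p]);
        }
    }
    for (int p = 0; p <= maxP; p++)
        if (f[n][p] != NEG) return p;   // smallest feasible budget
    return -1;                          // unreachable: bribing everyone always works
}
```

For dread {8, 5, 10} with prices {1, 1, 2}, bribing the first two monsters (cost 2) gives scariness 13, enough for the third.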

See? It is easy! But it took me far more than 8 minutes to think of this reversal. Ouch. The result is that I am back to a 16XX rating. That is a record low, I think. Probably the lowest rating I have reached in years :(.

But this was deserved. I should have already known how to do the fast prime factorization. That would have given me a score on div1 500 and more time for div1 easy, which I should have been able to solve in less than 8 minutes anyway.


Wednesday, December 12, 2012

I wrote SRM 564 (Except div1 hard)

The other day I was writing the editorial for SRM 562. I noticed some trends in the topcoder problem setter list that made me suspect that there is a scarcity of problems being submitted. I really didn't have a problem set prepared, but I gave it a shot and asked the admins if they needed problems. They did!

But then I had to think of a problem set. My first reaction was to try to finally use some of the problems I never got to use. You may suspect why: all the previous times I got an SRM assigned, I was close to using those problems, but something better appeared...

Until now! Unfortunately I just could not think of or find a problem that was good enough for division 1 medium or hard difficulties...

The Knights ones

This Knight circuits problem is very old. If my memory serves, it is probably one of the first problems I came up with, contemporary to my second problem set.

When I thought of it, it seemed so edgy and good. But I could not use it until now. Nowadays, the whole "knight circuit = connected component" thing sounds so obvious.

There were two variations; the "harder" one was the one used in division 1. The easier one actually gave you a board with some blocked cells, so it was solvable with a simple BFS. It was supposed to be used in division 2 as a medium.

I also proposed many other lame problems that had not been used yet.

The change in plans

The division 2 version of the knights problem was going to be used. But then ivan_metelsky, the tester, opined that it was very boring, even for that slot. He proposed that instead we could use a variation of the division 1 version, but with variable allowed moves for the knight. This also fixed the issue that the division 2 hard I proposed was very standard. The new issue was to find a new division 2 medium. Ivan proposed we go back to the original, harder version of the palindromes problem (the division 2 easy). So then all I needed to do was find a new division 2 easy.

So, I suddenly had an idea for division 2 easy: balls of three colors getting alternated; find the color of the k-th ball. A simple simulation thing that at least was more interesting than the recent div 2 easies... I proposed it and it was accepted.

The most amazing thing happened after that. I went out to buy groceries, and on the way I suddenly started to think about how that alternating colors thing resembled some of my favorite problems that I wrote: some ad hoc thing to do using a program as a starting point... Then I remembered a mantra: if you want to turn a simulation problem into a harder problem, turn it into a counting problem. I started to wonder how to solve a variation about counting the number of ways to have a certain ball color in a certain position. This variation was hard. So hard that it started to look good for division 1 medium. And it was!

Finally, I noticed that turning the division 2 easy about palindromes into a division 2 medium was going to take time to develop. Also the alternating balls problem was interesting to solve in O(k) time. So I asked to make that problem the division 2 medium instead of the previous plan.

What is going to happen?

I am writing this before the match. I think the main complaint about the division 1 match will be that it is too mathematical. I do not know yet about the division 1 hard problem, but I hope it is hard and programmatic to compensate.

People will probably think division 1 easy was too easy. But division 1 medium will probably have a healthy success rate (not too low or too high). When I solved it, it took me 4 hours, which is usual for me and division 1 medium. Since I thought division 1 easy was too easy, I left very weak example cases. I wonder how many people are going to fail the 3x3 case. I hate fake difficulty as much as (if not more than) anyone else, but for this problem I think it was needed.

Division 2 will be very challenging. I actually needed help to solve the division 2 hard, whereas division 2 hards are usually very simple and quick for me to solve.

Saturday, December 08, 2012

SRM 563: The trend has been stopped, for now

So, to recap, I made the decision that from now on I will never spend more than 10 minutes on a TopCoder division 1 easy problem. This new strategy of opening the hard and medium problems first and not opening the easy until there are 10 minutes left is, of course, suicidal. It does not help that my rating was already trending downwards before I made this decision.

Div1 hard

This problem seemed complicated. In fact, for a second I wondered if the match was going to be canceled because the problem is actually impossible.

I have not shared this before, but it has been decided that starting with SRM 564 I will always write the editorial, unless it is an exceptional case. This means I now receive the editorial assignment and an explanation of the hard problem just half an hour after results are announced. In theory, this should make editorials arrive faster... in theory. What is important to know is that I have now read the short explanation for this problem, and it turns out it was not difficult at all...

Div1 Medium - The one with magic cards

5 minutes after opening the hard problem, I gave up and opened the medium problem. My first reaction was that it seemed approachable. And it was.

You have a list of cards, from left to right. Each card has a level and a damage value. You can use a card with level L and damage D if and only if there are at least L-1 cards to its right. If you use the card, you deal D damage, and the card and the L-1 cards to its right are removed from the game. Another allowed step is to move the right-most card to the left-most position of the list. What is the maximum total amount of damage you can deal?

The first step is to alter the problem a bit. You actually have a circle of cards, placed in clockwise order. At every point, you can take any card of level L, remove the L-1 cards to its right along with the card itself, and deal D damage. The two formulations are equivalent.

To solve this variation, consider yet another variation, in which you have the cards in a line. Once again, you can pick any card and remove it and the L-1 cards next to it. But this time it is not a circle (for now).

To solve this easier variation of the problem we use dynamic programming. The issue is that you can remove cards in any order, and the order can alter the result dramatically. Thus the main thing to work out is how to decide the removals following a fixed order...

In fact, imagine that we decide to use the i-th card. This means that the level[i]-1 cards next to it have to be removed. But then, after this decision, we may decide to also use card i+1; we have to use card i+1 before using card i. When we decided to use card i, we decided that we owe level[i]-1 cards. After deciding to use card i+1 before card i, we increase the amount of cards owed by level[i+1]-1, and then we can decide what to do with card i+2, and so on, always keeping track of the amount of owed cards.

A different case is when we decide not to use card i+1; then this card is removed as one of the owed cards. The number of owed cards decreases by 1.

That logic allows a dynamic programming solution: f(p, owed) finds the maximum damage you can deal using the cards with index greater than or equal to p when you currently owe owed cards. Note that in the base case, p = n, owed must be 0.

The next issue is what to do in the circular case. There are two clever workarounds. The first one is to try all n rotations of the vector of cards and solve each with the previous method.

The second method is to guess an amount initOwed of cards that we will still owe after processing the last card in the input. Then, at the start, when p=0, we already owe those cards (they wrap around to the beginning). The base case changes: the amount of owed cards at the end must equal the amount that was planned in advance.
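The second method could look like the following sketch (my own reconstruction from the description above, not my actual submission; maxDamageCircular is a hypothetical name). The debt that remains after the last card wraps around and consumes the first cards, so we start with the same debt we require at the end.

```cpp
#include <vector>
#include <algorithm>
using namespace std;

const int NEG_INF = -1000000000;   // marker for "impossible"

// Circular variant via the "initOwed" trick: guess how many cards are still
// owed after the last position; that debt wraps around, so the dp also starts
// already owing the same amount.
int maxDamageCircular(const vector<int>& lvl, const vector<int>& dmg) {
    int n = lvl.size();
    int best = 0;                  // using no cards at all is always allowed
    for (int initOwed = 0; initOwed <= n; initOwed++) {
        // dp[p][owed], filled right to left
        vector<vector<int>> dp(n + 1, vector<int>(n + 1, NEG_INF));
        dp[n][initOwed] = 0;       // the final debt must match the guess
        for (int p = n - 1; p >= 0; p--) {
            for (int owed = 0; owed <= n; owed++) {
                int res = NEG_INF;
                // use card p: it adds lvl[p]-1 to the debt
                if (owed + lvl[p] - 1 <= n && dp[p+1][owed + lvl[p] - 1] != NEG_INF)
                    res = max(res, dmg[p] + dp[p+1][owed + lvl[p] - 1]);
                // do not use card p: it pays off one owed card, if any
                int nxt = max(owed - 1, 0);
                if (dp[p+1][nxt] != NEG_INF)
                    res = max(res, dp[p+1][nxt]);
                dp[p][owed] = res;
            }
        }
        best = max(best, dp[0][initOwed]);   // start already owing initOwed
    }
    return best;
}
```

On a one-card circle with level 1 and damage 5 this gives 5, and on small cases it agrees with the rotation method.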

It is funny, but I used both approaches at the same time. Of course, only one is necessary.

Here is the code anyway:

Div1 easy

It was worth 300 points. I opened it with about 9:50 left, and I knew it was going to be difficult. In the first 5 minutes I implemented a solution that turned out to be wrong. I tried to use the remaining time to come up with a real solution, or a cheap trick that would allow the wrong solution to work, with no success.

Challenge phase/etc

Not much happened afterwards. I eventually passed system tests in div1 500. At least the trend towards the bottom of the ratings has stopped, for now.

#include <vector>
#include <cstring>
#include <algorithm>
using namespace std;

const int MAX = 50;          // assumed maximum number of cards (added so the snippet compiles)
const int INF = 1000000000;  // "impossible" marker (added so the snippet compiles)

int n;
vector<int> lvl, dmg;

int dp[MAX+1][MAX+1];

int rec(int p, int owed)
{
    int & res = dp[p][owed];
    if (res == -1) {
        if (p == n) {
            // base case: no cards may still be owed at the end
            if (owed == 0) {
                res = 0;
            } else {
                res = -INF;
            }
        } else {
            res = -INF;
            // use this card? It adds lvl[p] - 1 to the debt.
            if (owed + lvl[p] - 1 <= n) {
                res = std::max(res, dmg[p] + rec(p+1, owed + lvl[p] - 1) );
            }
            // do not use this card; it pays off one owed card, if any
            res = std::max(res, rec(p+1, std::max(owed - 1, 0) ) );
        }
    }
    return res;
}

int maxDamage(vector <int> level, vector <int> damage)
{
    n = level.size();
    lvl.resize(n);
    dmg.resize(n);
    int res = 0;
    // try every rotation of the circle and solve the linear version
    for (int start = 0; start < n; start++) {
        for (int i = 0; i < n; i++) {
            int j;
            if (i < start) {
                j = (n - start) + i;
            } else {
                j = i - start;
            }
            lvl[i] = level[j];
            dmg[i] = damage[j];
        }

        memset(dp, -1, sizeof(dp));
        res = std::max(res, rec(0, 0) );
    }
    return res;
}

Tuesday, December 04, 2012

Topcoder SRM 562 editorial preview (div1 1000)

I do not seem to be able to finish this editorial. I am already one day past the deadline (the first problems were difficult and the explanations turned out to be image-intensive) and I still have to finish div1 250 and div2 500 (they will need more images, and that takes time). So here is a draft of the div1 1000 editorial; at least that should help anyone who needs it. Note that this is still a draft, and thus I guarantee you that it has far more mistakes than usual (and that is saying a lot).

----

Let us quickly explain the problem statement: we are given a tree. Count the number of ways to assign labels 1,2,...,n to the nodes of the tree such that the sets of vertices {1,2,...,k}, {2,3,...,k+1}, ..., {n-k+1,...,n} are each connected. This editorial will call these sets 'target' sets.

Small k versus large k

A good starting point is to notice that the problem can change dramatically depending on how large the value of k is. The main difference between small and large values of k is that when k is large enough, there will exist a group of labels that are common to every target set. For example, with n=10 and k=7:

1 2 3 4 5 6 7
  2 3 4 5 6 7 8
    3 4 5 6 7 8 9
      4 5 6 7 8 9 10

The labels {4,5,6,7} exist in each of the target sets. Which does not happen with small values of k, for example n=6 and k=3:

1 2 3
  2 3 4
    3 4 5
      4 5 6

There is no label that is present in all of the sets. The requirement for us to consider a value of k large is that the set of the k smallest values intersects the set of the k largest values; in other words: 2*k > n. This small difference makes one case much harder than the other. Let us begin with the small case, which is the simpler one.

2*k <= n

In order to understand this case, let us think about what is needed to transition from the first target set to the next one while keeping them connected. Imagine that we already found the connected component of the first target set, the nodes with the k smallest labels. Since we are dealing with a tree and its subtrees, these labels will form a tree. Like in the following examples for k=3:

The tree can so far have any shape, as long as it has k connected elements. It is interesting to see what happens when we transition to the next target set, from {1,2,...,k} to {2,...,k+1}. It means that the new tree will have {1} removed and {k+1} added, but it still must be connected. This has a consequence: in two of the examples above, there are only two super-trees of 4 elements that allow the condition. The third example is simply not correct; if we add a {4} and remove {1}, the tree will not remain connected.

Try making more transitions. Also consider that at the opposite end of the tree, among the largest labels, the structure works the same when we move away from the set of the k largest labels, removing {n} and adding {n-k}... In the end you will find that the tree must follow a specific structure. For example, when k=4 and n=10:

  • In the red square, we have a subtree formed by the k smallest labels. They must follow a proper hierarchy structure: in every branch, the smallest label must be a leaf, and the parent of each label must always be greater, up until k (4 in this case), which must be the root of the subtree. The reason for the hierarchy is that when transitioning from each target set that contains two of the smallest labels, the smallest label has to be removed, so this label cannot act as a bridge between any two larger labels.
  • Inside the green square, a chain of all the nodes from k to n-k+1. This must be a chain because of how transitions work: when transitioning from {3,4,5,6} to {4,5,6,7}, 3 must be connected to 4; when transitioning from {4,5,6,7} to {5,6,7,8}, 4 must be connected to 5, and so on...
  • Inside the blue square, also a subtree, this time with the k largest labels. This subtree must also have a proper hierarchy; this time each parent must be smaller than each of its children.

The solution for this small case is now accessible. Given the tree, we can pick the starting and ending nodes of the chain. The chain must contain n - 2*(k-1) elements, so the distance between the two picked points must be n - 2*(k-1) - 1. These end points get labels k and n-k+1, respectively. Then we also need to verify that the subtrees rooted at each of the end points contain exactly k elements. If everything checks out, the result is equal to the number of ways to assign labels to the first sub-tree multiplied by the number of ways to assign labels to the second sub-tree. This number of ways can be calculated using a dynamic programming approach.

How to assign labels to a sub-tree following a hierarchy

Let us quickly sum up the dynamic programming approach. We are given a subtree rooted at a given vertex. We also need to know its parent, so that we know which direction not to go when finding the children (for different end points, the parents of the subtree roots change).

The base case is when the subtree contains only the root. Then we can assign only one label to the root (either the lowest or the highest, depending on which of the subtrees we are counting; for the sub-problem it does not matter).

The remaining case is when the root has children. We can assume that we already know the number of ways to assign labels to each child subtree following the hierarchy. For example, when there are two child subtrees, one with n1 nodes in total and the other with n2, we can assume that the labels currently assigned to each subtree are {1,2,...,n1} and {1,2,...,n2}. We want to assign labels {1,2,...,n1+n2} to the two subtrees at the same time. This amounts to picking, out of n1+n2 labels, the n1 labels that go to the first subtree. After picking which labels go to each subtree, we replace the original labels {1,2,...,n1} and {1,2,...,n2} in the same relative order, so the hierarchies are not lost and there is only one possible order for the new labels. Then we put a label on the root; it must be larger than all the other labels, so there is only one way to assign it. In total there are C(n1+n2,n1) * ways(first subtree) * ways(second subtree) ways to assign labels to the larger subtree while still following the hierarchy. This approach can be adapted to more than 2 children: first combine the first two subtrees into one hierarchy, then combine this new hierarchy with the third subtree, and so on.
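The merging of child hierarchies can be illustrated with a short sketch (of my own making; countLabelings and the child-list representation are hypothetical, not the actual editorial code). Here the hierarchy puts the largest label at the root of each subtree; the other direction is symmetric.

```cpp
#include <vector>
#include <utility>
using namespace std;
typedef long long ll;

// Binomial coefficients from Pascal's triangle, grown lazily (small inputs only).
ll binom(int n, int k) {
    static vector<vector<ll>> C(1, vector<ll>(1, 1));
    for (int i = (int)C.size(); i <= n; i++) {
        C.push_back(vector<ll>(i + 1, 1));
        for (int j = 1; j < i; j++) C[i][j] = C[i-1][j-1] + C[i-1][j];
    }
    return C[n][k];
}

// Returns (subtree size, number of ways to assign labels {1..size} to the
// subtree of v such that every parent's label is larger than its children's).
pair<int, ll> countLabelings(int v, const vector<vector<int>>& children) {
    int size = 0;
    ll ways = 1;
    for (int c : children[v]) {
        pair<int, ll> sub = countLabelings(c, children);
        // interleave the child's labels with the labels used so far:
        // C(size + childSize, childSize) choices, order inside each part fixed
        ways *= binom(size + sub.first, sub.first) * sub.second;
        size += sub.first;
    }
    // the root must take the largest label: exactly one choice
    return make_pair(size + 1, ways);
}
```

For a root with three leaf children this gives 3! = 6, which matches the known formula n! divided by the product of all subtree sizes.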

2*k > n

The harder case is harder because of the labels that are common to each of the target sets. Each of the target sets must be connected, and it is possible to show that this common intersection must also be connected. This intersection holds the labels that are neither the largest nor the smallest. We will call this group of labels the 'middle'.

Once again we have to see what happens when we transition from a target subset to the next one. This time considering that the middle labels are always connected and remain so. Let us go back to the case with n=10, k=7.

  • The first target set is {1,2,3,4,5,6,7}. {4,5,6,7} must form a connected component. Thus we only need to add {1}, {2}, {3} to whatever structure the connected component has. We can add {1,2,3} as a single subtree. Or in separate partitions. We can connect these subtrees to any of {4,5,6,7} and the structure {1,2,3,4,5,6,7} will be connected.
  • But then we transition from {1,2,3,4,5,6,7} to {2,3,4,5,6,7,8}; note that we deleted the smallest label and added a new one. Once again we have a process in which the smallest label is removed when performing transitions. In short, reusing the knowledge from the previous analysis, all the subtrees of {1,2,3} that we attach to {4,5,6,7} must follow a hierarchy.
  • The same happens for {8,9,10}: all of their subtrees must follow the hierarchy.
  • {1,2,3,4,5,6,7} must be connected without the help of any of the vertices in {8,9,10}, this means that the subtrees of {1,2,3} will not have vertices from {8,9,10}. Similarly, the subtrees of {8,9,10} cannot have vertices from {1,2,3}.
  • Note that {4,5,6,7} can have any shape as long as it is a connected tree. It is not necessarily a chain. And the various subtrees do not necessarily have to connect to 4 or 7.

The following is a possible labelling for a large graph (n=28, k=19):

The white nodes would get middle labels. Let us call the remaining labels small or big depending on whether they are smaller or larger than the middle labels. The red nodes get small labels and the blue nodes get big labels. Thus any label assignment in which the white nodes get middle labels, the red ones get small labels (in a correct hierarchy) and the blue ones get big labels (also in a correct hierarchy) will work. What is important is that the middle labels can make any shape as long as they are connected, and once you assign a big label to a node, all of its children must have big labels too.

One thing is to know how the graph should look after assigning the labels, another is to actually count the ways to do it.

How to count them

It is a complicated process with many decisions. Which group of nodes will receive middle labels? Then which of the children will be small and which big? Then we also need to assign the labels to the big and small subtrees such that they keep a certain hierarchy.

The first simplification comes from taking each middle node individually. For each vertex, count the number of valid labellings in which that vertex receives a middle label. If we sum these results together, the total will contain each valid labelling repeated once per middle label. We can just divide the sum by the number of middle labels and we get the overall result.

When calculating the number of ways to assign the labels starting at a middle vertex, we can ignore the actual labels we place on middle vertices and later multiply by the factorial of the number of middle labels.

What we have left is the following recursive logic: let f(p, x, a, b) be the number of ways to assign a small labels and b big labels to the sub-tree rooted at x (where the parent of the root is p):

  • In a base case, we have a = (total number of nodes in the sub-tree) and b = 0. This means that the node and all its children must be given a small label. We also know the number of ways to label this following a hierarchy, it is just the same sub-problem we solved for the easier case.
  • Likewise, when a = 0 and b = (total number of nodes) then we have to make the complete sub-tree have a big label.
  • Else there are plenty of options. We know for sure that the root has to belong to the middle. Some of the children may also belong to the middle; the rest should be big or small. In order to explain the logic, say the root has two children y and z, and we already know the results f(x,y, ?,?) and f(x,z, ?,?) for all combinations of small and big counts. Then the decision involves choosing how many of the big (b) and small (a) labels to assign to each subtree and combining their hierarchies afterwards (using the same logic as above, e.g. C(total number of big labels, big labels used in the first child)). We can extend this logic to work with more than two children.

Solution

It has been a long journey, but we have finally reached the solution:

http://pastebin.com/4NmjuNnu (sorry, blogger is once again failing to allow me to paste my code.).

Saturday, December 01, 2012

Topcoder SRM 560-562: You people have stood in my way long enough. I'm going to clown college!

SRM 560

So, I did not like SRM 560, or more specifically, I did not like what I did in it (so much so that I didn't even want to write the editorial). It is actually very tiresome to spend a little extra time debugging a solution for an easy problem and then get penalized for solving it. Most people in that match solved only that problem, so speed was the only differentiator there. And that blows.

I think I made exactly the same point in the past, but I am tired of this lame situation; I don't want my matches to depend only on the easy problem anymore. Therefore, from now on (or until I get bored of the strategy) I will open only div1 hard and div1 medium until 10 minutes before the end of the match; then I will open the div1 easy. If I solve div1 easy in under 10 minutes, that is a good score. If not, then so be it: 0.00 score. But knowing topcoder, if I take more than 10 minutes on div1 easy, it is a bad result anyway.

SRM 561

So, time to apply that strategy. That match was not great for such a suicidal strategy...

I opened div 1 1000: oh, so something with trees? I thought it was going to be some sort of dp, but a complicated one. The idea that rng_58 eventually shared with me is far more elegant and mathy. It was a good thing I decided not to give my (possibly wrong) initial dp idea a try.

I immediately opened div1 550, and I thought I could solve it. I had most of the correct ideas (see it as a tree; if you remove a node, a forest remains). But my knowledge of nimbers was... null. Somehow I never learned that stuff. So I tried to use a game theory dynamic programming. Unfortunately, without knowledge of nimbers, you do not know how to combine games into a single game in which you can play any of the sub-games at each turn... For some reason I assumed that you could finish one of the games and then go on. This was a wrong assumption. I coded something that was of course wrong, although I initially thought it was wrong because of code mistakes and not because of the idea...

Until there were less than 10 minutes left... I opened the 250-point problem to find that the problem statement was huge. Once I finally understood what to do, I knew I was not going to finish in time. It required quite a lot of code and I had only 4 minutes left... I still tried, and failed.

I wrote the editorial. I wonder how noticeable it is that I had no idea what a nimber was until I had to write it...

SRM 562 - HAH! canceled

This felt the same. The div1 500 was quite complex to come up with a solution for, and even if I did, I would not be able to code it. I had plenty of ideas for how to solve it in O(N^3), but they were all wrong... The closest I got was to fix a black point, then sort all the other points clockwise. Then, whenever you pick two white points, you have to make sure the angle of the third black point is between them. If, in addition, the distance to the black point is not too close, that is a valid quad. I tried as hard as possible to finish coding this two-pointers idea, but I also needed a tree... Eventually I figured it was not viable anyway, because the "too close" distance depends on the two white points... so using two pointers is not that friendly.

I also opened 1000, it seemed, once again like a complicated dp on a tree. I wonder what it really is.

Less than 10 minutes left, and I open the 250. It seems time flies while reading a problem, because I only had 6 minutes left after reading it... I knew the simulation would become repetitive in at most 50 steps. But I had some difficulties coding the idea in a way that took less than 6 minutes; in fact, I needed far more time... Once I finished coding it, the challenge phase was supposed to start... yet I noticed it was merely the end of the coding phase. The timer reached 0:00, but nothing happened.

They canceled it! HAH!

That strategy...

I knew the strategy would be very bad for my rating. If yesterday's match had been rated, I would probably be in the 1600s area.

But I think it is necessary. I think that when I started in programming competitions I was fast enough to code the division 1 problems of all these matches in less than 10 minutes. I am not sure what made me far slower... I suspect it may have something to do with the whole 4 minutes that magically disappeared after reading the statement in SRM 562... Either way, the pressure of having only 10 minutes to solve a problem will hopefully make me faster eventually. And the exposure to harder problems before that might allow me to go back to solving div1 medium or hard at least once every other match. I am giving up rating for long term improvement, I hope... Plus, seriously, if I am gonna fail a match, I'd rather it not be because I scored 160 on an easy problem that everyone solved.

Tuesday, October 30, 2012

SRM 559: Just wow

This one was tough. I know ad hoc 250s are great, and I love them. I also liked this problem a lot. But it felt like way too much for 250.

The good thing is that I managed to mostly avoid an ad hoc solution. Instead, mine relies on inclusion-exclusion magic.

Div1 "Easy": The one with knights

You have a (possibly large) chess board. Your knight can move from cell (x,y) to cells (x+a,y+b), (x+a,y-b), (x-a,y+b), (x-a,y-b), (x+b,y+a), (x+b,y-a), (x-b,y+a) and (x-b,y-a). If the destination cell does not exist, the move is invalid. Return the total number of cells from which there are exactly k valid moves. The values are from 1 to 10^9. a and b are never equal. Also, 2*max(a,b) is always less than min(number of columns, number of rows).

That last constraint was something very suspicious. I guessed that this constraint reduces the number of corner cases. When the people behind these contests reduce the number of corner cases, it can only mean that the problem is very complicated...

And it was. At first I thought to just solve the problem directly. There are 9 special cases. Some are easy: it is impossible to have cells with exactly k = 0, 1, 5 or 7 valid moves (in reality, k=0 and k=1 would be possible if not for the suspicious constraint). It is not completely obvious, but k=3 is possible. I thought that this was going to be a corner case, but nope.

Eventually, I returned to my senses and figured that finding a formula for each of the 4 remaining cases would have been crazy and bug-prone. There had to be a better way. Good thing I could think of one. But the time I wasted trying to think of the formulas and then debugging the solution (which turned out to be very long) gave me a very low score and no time to solve the medium problem.

Here is the solution anyway: let us define a move as (dx,dy), so that you move from cell (x,y) to (x+dx, y+dy). It is simple to count the cells in which this move is valid: |dx| rows and |dy| columns are excluded (the ones from which the move would fall off the board). So just multiply the numbers of remaining rows and columns to find the result.

Let us count the number of cells from which two specific moves (dx1, dy1) and (dx2, dy2) are both valid (and possibly other moves too). This is a bit harder, but indeed the cells from which these moves (though not necessarily only these moves) are valid form a rectangle, and you can find it.

For any subset of the 8 available moves, the cells from which all of the moves in the subset are valid also form a rectangle, and thus we can calculate this value for each subset of moves. Let us use bit masks to represent these sets, and store the results in an array valid[mask] that, for each mask (subset of moves), holds the total number of cells in which all of the moves in the bit mask are valid.

Now, if we wanted to find the total number of cells from which exactly k moves are valid, we would just need to find all subsets that have exactly k moves and sum their valid[mask]. Right?... NO! You have not been paying attention: valid[mask] may include cells from which other moves are also valid.

Thus what we want to somehow create is another array called exactly[mask] that gives us the number of cells from which the only moves allowed are those in the bit mask.

The key

The key is to observe that valid[255] = exactly[255]. In other words, since there are only 8 moves, all the cells counted in valid[255] are also cells from which the only valid moves are those 8 (note that 255 is the subset that contains all 8 moves).

And that is cool, because we can now calculate exactly[mask] for all masks that have exactly 7 bits. Yeah! Imagine a subset of moves that lacks exactly one move: in that case, valid[mask] counts all the cells that allow those 7 moves. This includes both the cells that allow all 8 moves AND the cells that allow only those 7. So we can find exactly[mask] with a simple subtraction.

In fact, for every bit mask mask, exactly[mask] is equal to valid[mask] minus sum(exactly[mask2]) over every mask2 of which mask is a proper subset. You can build the whole exactly[] array this way, processing masks with more bits first. Then sum exactly[mask] over all masks that have exactly k elements (1 bits).

Solution

#include <algorithm> // std::max

typedef long long int64;
#define long int64

struct HyperKnight
{
    long countCells(int a, int b, int numRows, int numColumns, int k)
    {
        // The 8 moves:
        long dx[8] = {a, a, -a, -a, b, b, -b, -b};
        long dy[8] = {b, -b, b, -b, a, -a, a, -a};

        long exactly[256];
        long result = 0;

        // We visit the masks from highest to smallest; this way we
        // can guarantee that all subsets with more elements than the
        // current one have already been solved.
        for (int mask = 255; mask >= 0; mask--) {
            long valid = 0;
            // valid : How many cells allow the moves in mask?
            // (and possibly other moves)

            // In the explanation valid is an array, but we do not really
            // need to remember all its values, just valid[mask].

            int n = 0;
            // Find the rectangle of such cells:
            long left = 0, right = 0, up = 0, down = 0;
            for (int i = 0; i < 8; i++) {
                // For each move that belongs to the mask subset:
                if (mask & (1 << i)) {
                    n++;
                    // update the rectangle's dimensions
                    if (dx[i] < 0) {
                        left = std::max(left, -dx[i]);
                    } else {
                        right = std::max(right, dx[i]);
                    }
                    if (dy[i] < 0) {
                        up = std::max(up, -dy[i]);
                    } else {
                        down = std::max(down, dy[i]);
                    }
                }
            }
            // Area of the rectangle:
            valid = (numRows - left - right) * (numColumns - up - down);
            // If we made sure to set valid = 0 when the moves are too large,
            // this would make the solution work even without the special
            // constraint.

            exactly[mask] = valid;

            // For each subset with more elements than this one
            // (mask must be a proper subset of mask2):
            for (int mask2 = mask + 1; mask2 < 256; mask2++) {
                if ((mask & mask2) == mask) {
                    // remove the cells that allow more moves than mask.
                    exactly[mask] -= exactly[mask2];
                }
            }
            // n is the number of moves in the mask.
            // exactly[mask] now contains the number of cells from which the
            // ONLY valid moves are those in the mask:
            if (n == k) {
                result += exactly[mask];
            }
        }
        return result;
    }
};
#undef long

Div1 medium

By the time I finished my 250, I did not really have much time left to code a solution. But I think it is easy to just fix the root and the depth of the tree, then count the number of ways to reach that root and depth following the conditions. The root shall have one or two children: one child should have depth exactly (depth-1) and be full; the other can have depth (depth) and not be full (but aligned to the right). These are all similar subproblems. The implementation is probably messy.

Opinions / etc?

I liked the match, I just wish I solved 250 faster. What about you?

Friday, October 19, 2012

SRM 558: Tough

This match seemed a bit harder than usual. The 275 points problem was definitely a more complicated dynamic programming than is usual for the easy slot. It was good though.

Div1 easy: For 275 the one about stamps

There are cells indexed from 0 to n-1. Some (or all) cells have a requirement to be colored red, green or blue. To color the cells, first pick a stamp length L, with a cost of stampCost*L. Then, each push of the stamp paints a contiguous group of L cells with a single color and costs pushCost. You cannot use two or more different colors to paint the same cell. What is the minimum cost? The maximum value of n is 50. The cost values are at most 100000.

Things to consider: first of all, there is always a result. You can always pick L=1, and then the cost is stampCost + n*pushCost. This also allows us to confirm that the result always fits in a 32-bit integer.

Another thing: the maximum useful value of L is 50. With at most 50 possible values of L to try, we can just iterate through each of them, check whether that length is usable, and calculate the minimum cost if that stamp length is used. Return the minimum cost you find.

Now let us solve the sub-problem where L is fixed. First add L*stampCost. Then let us find the minimum number of pushes. Imagine that we have already decided how to color each of the cells that do not have a color requirement. What is the cost of making that pattern? It all depends on the groups of contiguous cells of the same color. For example, to paint RRGGBBBBBRGBRRR, consider each maximal group of contiguous cells of one color: RR, GG, BBBBB, etc. What is the minimum number of pushes for each of these groups? Note that when L is greater than the length of a group, painting it is not possible at all. Now imagine L=2 and RRRR: of course, we can just use two pushes. That is len / L. But what if the group's length is not a multiple of L? For RRRRR and L=2, you need three stamps, coloring one of the cells twice. You can verify that the result when len is not a multiple of L is always len/L rounded up.

But how can we choose the colors for the cells that can have any color? There might be a way that does not involve dynamic programming, but I can only prove the dynamic programming approach correct. Have a function f(p, lastColor, lastLength) that finds the minimum cost of painting the cells from index p onwards, given that the last color we painted was (lastColor) and the current length of the run of that color is lastLength. At each step, you pick one of the three colors, updating lastColor and lastLength as necessary.

Implementing it was not easy, as it is not your easiest dp to code. It was also 7:00 AM, and I am not that good at 7:00 AM. So I took quite a long time debugging my solution.

I was lucky, though. During the challenge phase I expected there to be failed solutions. I opened the solution of the blue coder with the highest score, and it seemed strange. I think the approach itself might be correct, but it was not doing memoization or anything else to avoid recursing too much: a timeout case. There were large example cases, but I think some aspects of that solution cut the execution time on them. A string of n '*' characters, however, was too much. I was very lucky to find a solution that timed out.

Code:

#include <algorithm> // std::min
#include <cstring>   // memset
#include <limits>
#include <string>
using namespace std;

const int INF = numeric_limits<int>::max();
struct Stamp
{
    string desiredColor;
    int pushCost, L;

    int dp[51][4][51];

    // When iterating through colors we go from 0 to 2, but then we have to
    // do some input translation...
    int toColorId(char ch)
    {
        switch (ch) {
            case 'R': return 0;
            case 'G': return 1;
            case 'B': return 2;
        }
        return 3; // '*' : no requirement
    }
    // What is the minimum cost to paint a group of `length` contiguous cells?
    int getCost(int length)
    {
        if ((length < L) && (length > 0)) {
            return INF; // the stamp does not fit in the group
        } else {
            // ceil(length / L) pushes:
            return pushCost * ((length / L) + (length % L != 0));
        }
    }
    int rec(int p, int last, int lastLength)
    {
        int &res = dp[p][last][lastLength];
        if (res == -1) {
            res = INF;

            if (p == (int)desiredColor.size()) {
                // base case: all cells were decided, pay for the last group
                res = getCost(lastLength);
            } else {
                // pick one color that matches desiredColor[p];
                // remember the best result we find.
                for (int c = 0; c < 3; c++) {
                    int r = toColorId(desiredColor[p]);
                    if ((r == 3) || (r == c)) {
                        if (c == last) {
                            // continue this group.
                            res = std::min(res, rec(p + 1, c, lastLength + 1));
                        } else {
                            // close the previous group, start a new one.
                            int x = rec(p + 1, c, 1);
                            int y = getCost(lastLength);
                            // careful with overflows.
                            if ((x < INF) && (y < INF) && (x < INF - y)) {
                                res = std::min(res, x + y);
                            }
                        }
                    }
                }
            }
        }
        return res;
    }

    int getMinimumCost(string desiredColor, int stampCost, int pushCost)
    {
        // Save some variables:
        this->desiredColor = desiredColor;
        this->pushCost = pushCost;
        int n = desiredColor.size();
        // Try each possible value of L. Remember the best result:
        int minCost = INF;
        for (int L = 1; L <= n; L++) {
            // reset the dp.
            memset(dp, -1, sizeof(dp));
            this->L = L;
            // call the dp.
            int x = rec(0, 3, 0);
            int y = stampCost * L;
            if (x < INF) {
                minCost = std::min(minCost, x + y);
            }
        }
        return minCost;
    }
};

Funny thing: while adding comments to that code, I figured out that there is a dp that is a bit easier. You just need a recurrence f(t), the minimum cost to color the first t cells. Pick the last x cells as a group: if they are all '*' or all the same color, add getCost(x) and recurse to f(t - x).

Div1 550: The one with triangles

A complicated one. O(n^3) should be fine. I think the key is to first pick the pair (P,Q); from there you can calculate (using slopes) the allowed red points. Then you have to count the number of ways to pick red points satisfying a large inequality. I think it is possible. But when I noticed I had nothing and there were only 4 minutes left, I preferred to double-check the 275 and try to find mistakes in it.