Saturday, July 09, 2011

TopCoder open 2011 Round 3 Part 2: RowOfColors

(Link to part 1: The 250-pointer)

1000 : RowOfColors
I was lucky to be able to get a problem accepted as a 1000 pointer. 200 bucks :). Anyway, this problem initially started as an idea for (w,h,k <= 50). Then there is as solution that is O(w*w*k*k*h), if I remember correctly. The O(w*h*k) uses the same idea, but needs a small trick.

The first row will always begin with a given color. Note that this color may appear later in the row. Imagine we decided the color of this first cell and the other cells on the top row that will have that color:

C...C.........CC.C..

All the other rows will somehow cooperate with us to try as hard as possible for that color placement to be possible. Note that the C colors will HAVE to be connected, so there is going to be some sort of bridge connecting them. Since the other cells in the top row won't be of that color, the bridge has to go through the other rows:


C...C.........CC.C..
C...C...CCCCCCCC.C..
CC.CCCCCC......CCC..
.CCC................


But the rows need not to only contribute with us to allow the distribution of C colors, but also the distribution of all the other colors. This means that we should be careful not to use space that would be useful for any other bridge between the other colors. The required trick is to notice that the following is the best way to place the color in the other rows:

C...C.........CC.C..
C...C.........CC.C..
C...C.........CC.C..
CCCCCCCCCCCCCCCCCC..

Why? There is a bridge, but we are opening a lot of space for potential bridges between the first and second C and the second and third C. This actually solves the problem. Note that there are some empty spaces:

.111.222222222..3:44
.111.222222222..3.44
.111.222222222..3.44
..................44

These rectangles (1, 2, 3 and 4). Are now places in which we can place new colors. But note that, for example, the first cell in the top row of rectangle 2 will be able to get repeated, and that such color will not be able to jump to rectangle 3 or 1, because we cannot build a bridge. So, inside rectangle 2 we may have something like this:

Y222Y22YY
Y222Y22YY
YYYYYYYYY

Y is a another color but it follows the same rules. So, we can solve this recursively. At every recursion step, we pick the positions of the cells that will share the same color as the first cell. We then assign colors out of the K available for each rectangle and so and so. But that solution is very slow. We can optimize it by using two recursions and it will become the afore mentioned O(w*w*k*k*h) solution. That is still too slow.

The solution is to keep the recursion knowledge but to solve it with another method. First, acknowledge that the problem has become similar to parenthesis expressions. Each shape of colors starting from the first cell is a very recursive structure.

C...CY...Y..YYCC.C..
C...CY...Y..YYCC.C..
C...CYYYYYYYYYCC.C..
CCCCCCCCCCCCCCCCCC..

Imagine we decide the color to use in the first cell. In our previous work we picked in which other cells to place the same color. Not anymore, we will instead only decide if we will use the color again later or not:

C................... C?..................
C................... C?..................
C................... C?..................
C................... CC..................

In the second case, we would then add C to a list of colors that we need to use letter. Else we would continue to the next cell. There are k choices for color C.

Now, imagine that we are in another cell, and there is already a list of colors that need to be used again. We can decide to place a new color OR we can decide to place one of those colors that need to be used again. However, the order of the colors in that list matters. If we first added Green and then Red to this list, then we cannot place green again without first removing Red. In fact, this list should be treated as a stack, and since there is no choice on the color but to pick the first one on top of the stack, then we only need to remember the size of the stack.

If we decide to place the color from the top of the stack, we actually have two decisions, either we will still foresee to use that color again, or we will decide to stop using that color. In the first case we should keep that color in the top of the stack. Else we can take it out of the stack (decrease the size of the stack).

Even when there are colors in the stack, we may still want to place a whole new color. Due to the recursiveness, adding a new color works the same as when the stack is empty. There are k choices and we should decide whether to leave the color in the stack or not.

Note that the stack should never grow larger than h. In fact, we should never place a new color when the stack size is h, because there would be no space for the new color.

That is enough to find a recurrence that solves the problem:

// x: horizontal position in the row
// s: size of the stack
// k: unused colors left
int F(int x, int s, int k) {

if (x == w) {
// the row is full. Base case:

if ( (s==0) && (k==0) ) {
// if the stack had elements, then we wouldn't have fulfilled the
// plan to place those elements for the last time in the row.
//
// if k is not 0, then we didn't place all the colors that we had
// to place.
//
// Else there is exactly one way to finish this base case.
return 1;
}

result = 0;

if ( s < h ) {
// place a new color (plan to use it again)
result += k * F(x+1, s+1, k-1)

// place a new color (won't use it again)
result += k * F(x+1, s, k-1)
}

if ( s > 0 ) {
// place a color we planned to use again.
result += F(x+1, s, k) // will use it again

result += F(x+1, s-1, k) // won't use it again
}


return result;
}
}



That recurrence will obviously be too slow. If we do try to memoize it, there won't be enough space in memory for all w*h*k states. The alternative is to notice that we always use (x+1) in every called sub-instance of the recurrence. So we can use dynamic programming using a trick in which we keep only two tables. One table dp[s][k] keeps the results [x][s][k] for the current value of x and the other table tdp[s][k] keeps the results [x+1][s][k]. So, when calculating dp[][] we will use the results from tdp. Then we can move dp to the space used by tdp. In total we will need only O(h*k) memory which will fit the game.

It may happen that even after this optimization, your O(w*h*k) implementation does not pass. There are many ways to avoid the time out. For example, note that whenever we multiply by k inside the recurrence, we will first do it by (K) then by (K-1) then (K-2), ... etc. Where K is the original K from the statement and not the one used in the recurrence. We can just forget about multiplying by k and instead just multiply the final result of the dp by K! (factorial) in the end. This allows us to implement the logic using only 32 bits integers in the operations without worrying about overflow. Due to architecture, 32 bit operations are faster.

But a less lower level solution is to note that certain states with some values of k and s depending of x are not going to be reachable. For example, if x is too large then there will not be a good chance to decrease k and s until x becomes w. Combining both optimizations, the following Java implementation requires only 0.5 seconds in the worst case (in TopCoder's servers):

final int MOD = 1000000007;

public int countWays(int w, int h, int K) {
int[][] dp = new int[h+2][K+1];
int[][] tdp = new int[h+2][K+1];

for (int p=w; p>=0; p--) {
for (int stack=0; stack<=h; stack++) {
for (int k=Math.min(K,w-p); k>=0; k--) {
int res = 0;
if (p == w) {
//reached the end.
// but the stack should be empty.
// and all colors should be used.
res = ( ( (stack==0) && (k==0) ) ? 1 : 0);
} else if ( stack > h) {
// too many colors in the stack, which means there is no
// vertical space for them
res = 0;
} else {
if ( (k > 0) && (stack+1<=h) ) {
// we can try a new color.
// plan to use this color again
res += tdp[stack+1][k-1];
while (res >= MOD) res-=MOD;

// don't use it again
res += tdp[stack][k-1];
while (res >= MOD) res-=MOD;
}

if ( (k <= w-p+1) && (stack>0) ) {
// we could also just use the top color.
// will use it again.
res += tdp[stack][k];
while (res >= MOD) res-=MOD;

// won't use it again
res += tdp[stack-1][k];
while (res >= MOD) res-=MOD;
}
}

dp[stack][k] = res;
}
}
// move tdp to dp and viceversa
int[][] aux = tdp; tdp = dp; dp = aux;
}
// multiply by K!
long p = tdp[0][K];
for (int k=2; k<=K; k++) {
p = (p * k) % MOD;
}

return (int)p;
}

No comments :