Tuesday, October 04, 2011

SRM 520: Bad day

Yeah, 0 points. And what's worse is that it is not because of not being able to solve the problems. I just made mistakes and couldn't debug them in time.

I started with the 500 and intended to switch to the 250 15 minutes before the end of the coding phase if necessary. The time I used to allocate to 250 "just in case" used to be 30. Today's match kind of shows why it was a good idea to use 30. Or does it? In my case, solving the 250 after 30 would have meant 150-ish of score and that is almost as bad as 0 score-wise for my rating. Yet, I think that it was very possible I could have solved the 500 with those extra 15 minutes. Also, this is probably going to make me forcefully get faster on solving the 250. I am actually considering to start switching when there are 10 minutes left. That will probably make me go through more bad matches like today's. But now that my rating is going to be kicked way fa from its close-to-red glory, I no longer have to care about that, and I can begin experimenting.

SRMIntermissionPhase
This problem was not so hard to solve algorithmically, but I found it very complicated in implementation.

The main idea is this: Imagine you had an array ways[i][j] that would tell you the total number of ways to have coder #i score j as total score. (Using combinations of its solved problems). Then the problem becomes a rather simple dynamic programming idea. The recurrence is a follows:

f( i, j) = f(i, j-1) + ways[i][j] * f(i-1, j-1).

Where f( coder_number, coder_score ) is the total number of ways the scoreboard can have a score less than or equal to coder_score for coder # coder_number.

Then we just need to calculate ways[i][j]. It is another simple recurrence. This time for each coder, you decide the number of ways to score X using the K last problems (where K is 0, 1, 2, or 3). Then you can get the value for K=0 easily (It is 1 iif X=0). The remaining values can be found with the recurrence:

G(X, K) = sum( G(X-r, K-1) ) , where r is a possible amount to score in problem (K-1) .

The recurrences are dead simple. But they take plenty of memory and are slow. The first work around is that:

G(X, K) = sum( G(X-r, K-1) )

Can be calculated if you somehow have the accumulated values for G(X,K-1). That took me 30 minutes to implement correctly beyond segmentation faults, and perhaps that is where my mistake is.

Then, you have to worry about memory. The idea is that in G's recurrence, you only need to know the results for K-1. Also, you only need to know the ways[] table for one coder at once. Finally, F(i,j), also depends only on the last value of i.

But it was very hard to implement for me. I passed all example cases except 1). That includes the last case which was the most complicated. I am right now not sure what happens with 1) , but it is probably something small and evil.

I spent a lot of time trying to debug it. Eventually, I reached the 15 minutes mark and switched to the 250.

250 SRMCodingPhase
I decided to use dynamic programming to solve this problem. There are certainly other ways, I just thought dp would be simplest. In part, when I found my mistake , it was because implementing my dp way too mechanically.

First of all, the order in which you solve the problems does not matter, provided that their sum does not exceed 75 minutes. So you just have to decide which problems to solve. Then you also need to decide what to do with all your luck points.

My dp recurrence F( l, t, k) : The best score using k problems, l lucky points and with t total minutes available.

When k is 0, the result is 0, we just don't have any problem we can solve.

If k is not 0, we have to pick an amount x of luck points to reduce from skills[k-1]. Then we subtract from l and also make sure t does not go negative (else we wouldn't have time left to solve this problem). The recurrence step is then:

F(l, t, k) = max( F(l - x , t - nt, k - 1) ) + points[k-1] - (2 , 4, or 8) * nt.

Where nt is the amount of time needed to solve the (k-1)-th problem using x skill points.

int countScore(vector <int> points, vector <int> skills, int luck)
{
int dp[luck+1][76][4];
memset(dp,0, sizeof(dp));
for (int i=1; i<=3; i++) {
for (int l=0; l<=luck; l++) {
for (int t=0; t<=75; t++) {
// do not solve:
dp[l][t][i] = dp[l][t][i-1];

// solve using x luck points
for (int x=0; x<=l && x < skills[i-1]; x++) {
int nt = skills[i-1] - x;
if (nt <= t) {
dp[l][t][i] = max(dp[l][t][i], dp[l-x][t - nt][i-1] + points[i-1] - (1<<i)*nt );
}
}
}
}
}

return dp[luck][75][3];
}


I unfortunately, couldn't find a mistake I had in my code in time. If you are curious, the mistake was that I was doing: dp[..][..][..] += other stuff. Instead of : dp[..][..][..] = max(dp[..], otherstuff).

Update
I finally found the error in my code for div1 500.

Glance at this bit:

if (j > 0) {
assert(ways[j][1] >= 0);
assert(dp[j-1][ni] >= 0);
dp[j][ri] += (ways[j][1] * dp[j-1][ni]) % MOD;
assert(dp[j][ri] >= 0);
} else if (i == n-1) {
dp[j][ri] ++;
}


The mistake is very simple. I did that if, so that I would handle the case in which the new j becomes negative. If it is the last coder, then this is always possible. However, I forgot about the number of ways to have j as result for the coder (what I do in the first branch of the if then else. Thus the fix is just to:

if (j > 0) {
assert(ways[j][1] >= 0);
assert(dp[j-1][ni] >= 0);
dp[j][ri] += (ways[j][1] * dp[j-1][ni]) % MOD;
assert(dp[j][ri] >= 0);
} else if (i == n-1) {
dp[j][ri] += (ways[j][1]);
}


And yep, it passes system tests.

If it wasn't for me taking at least 15 fruitless minutes to find this mistake, I would have had more time (and less nervousness) to solve the 250 and possibly find the other silly mistake. This mistake was really the difference between position 50 and 0 points for me.

It took me hours to finally find this mistake. At the end, this is what worked: I noticed that in example one, all coders only have one problem, so the number of ways for each coder to have a score was not the problem, this helped me focus on the bottom part of the code.

I have taken a decision for future SRMs. Although using variable names like dp is very cool, I think it would have really helped me to use more a descriptive variable name like maxScore[][][] . Then it would have been very hard for me to unconsciously add the results up instead of max.

1 comment :

Anonymous said...

Extremely bad SRM for me too :( !!!!