Sunday, November 20, 2011

Codechef November cook-off

The other day I concluded that I need more practice. And the other other day, I concluded that the only sort of practice that is worth to do is to participate in more contests.

There is only one problem with that theory, and it is schedules. The other day I missed a codeforces contest because they moved it from one day I could easily attend to another in which it was impossible. Then they started scheduling things at 2:00 AM. On Thursday, I had to skip a class to attend a topcoder match, which was probably going to be the only one I could attend this month. (Wrote the first one, the other might conflict with a final).

In that regards, codechef contests have to have the worst possible schedule for me. Sunday afternoon is just very hard to accommodate. This time, I figured that I could participate in this one if I start it 45 minutes after the beginning of the match.

Subanagrams


(link to problem)
The good thing about starting a contest 45 minutes late is that it is very obvious to see which problems are the easiest by just looking at the stats. I opened this one, and indeed it was easy.

The trick is just to notice that "anagram" means that the order does not matter. When we combine the anagram requirement and the subsequence requirement we get that we need a string that has any letters from the original string in whatever order. The largest string that is an anagram of a subsequence of each string is just... the intersection. That's right, we need a list of letters in common between all the strings. Note however, that there are reptitions, so don't just assume you are dealing with sets. That way, the intersection between "aaa" and "america" is "aa".

The easiest way to do this is to keep the count of each letter in each string. The final string would have the minimum count of each letter. Note that the lex first requirement just means that we should sort the letters in the intersection before printing.

I did something lame when solving this problem. I was already 45 minutes late. So, how do I try to implement the problem? Of course, with the STL. The STL is a good idea if you actually know how to do multiset intersection using the STL. But if you don't know, you'll probably spend a quarter of hour googling and researching. At least the code is kind of cool...

using namespace std; 
#define for_each(s,q) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++)
//=========================================================
// program:
//
// N and the words as parsed by some I/O thingy
int N;
string words[100];

string solve()
{
multiset<char> mp( words[0].begin(), words[0].end() ) ;
for (int i=1; i<N; i++) {
// behold! The overcomplicated, unnecessary STL abuse:
multiset<char> mp2;
multiset<char> tem( words[i].begin(), words[i].end() ) ;
set_intersection( mp.begin(), mp.end(),
tem.begin(), tem.end(),
std::inserter(mp2, mp2.begin()) );
mp = mp2;
}
// the intersection among all the multisets from each string is the result!
string s = "";
for_each(mp, q) { //note that iteration of a multiset is in ascending order...
s += *q;
}
if (s=="") {
s = "no such string";
}
return s;
}



Moving Coins


(Link to problem)
The second easiest problem was, according to the statistics, moving coins.

This time I could see the greedy aspect quite quickly. Yes, I did say greedy.

First of all, let us focus on the last coin (the rightmost one). If this coin is next to an empty space, we will eventually have to move it. We may as well do it immediately. I can tell you that performing this move, will not lower your chances, you will not be preventing any further move when doing this. However, if you do move another coin, you may be lowering your chances to take this very coin to the left side faster.

Now, imagine if instead of a single coin, we had multiple coins in the right-most part of the input. We now need to put some thought to it. In fact, we need a move that moves a coin in as many positions as needed. The largest possible jump. This means that we will be moving the K-th coin in the rightmost group and we will move it to the empty cell next to the group. In some cases, K may be quite large to do it, in that case pick the last coin in the group.

Why do the largest jump? I think the real question is: Why not? As a semi proof convince yourself that doing this will not lower your chances to use the optimal number of moves. If you do not use the largest jump, then there is at least one extra coin to the right of the coin you moved that will stay. This is a coin you will have to move eventually, and this coin will need one step to move. From here, it can be seen that it is just not convenient to do any other move.

I first thought to myself "Ok, this solution can be implemented in quadratic time easily, but it is also possible to do it in linear time.". I then read the constraints and naively thought that 10 cases of 1000*1000 operations each should really work in under a second (the time limit). It turns out I was wrong.

So, we need to implement the strategy in linear time. I think it is easier to explain it with code. In the original, quadratic solution, before each move you need two integers j and t. t is the position of the last coin plus 1, and j the rightmost empty cell.

The linear trick is to update j and t in O(1) after each move, so you don't need a O(n) algorithm to update them. The "//update j" part of the code may look linear, and it is in fact linear, but its complexity is independent of the parent loop. In other words, that loop is going to visit each j at most once per instance of the function, not its parent loop.

using namespace std; 

//=========================================================
// program:
//
int N,K;
char coinpos[1000];

int solve(int t)
{
//Find j (right-most empty cell)
int j = t-1;
while( (j>=0) && coinpos[j] ) {
j--;
}
if (j < 0) {
return 0;
}

int res = 0;
while(j >= 0) {
assert( !coinpos[j] );
assert( coinpos[t-1] );
assert( coinpos[j+1] );
//empty space at j, sequence of t - j - 1 coins after it.
// move cell #K starting at j+1... (move it to position j).
res++;
int tomove = j+K;
if (tomove >= t) {
tomove = t - 1;
}
coinpos[tomove] = false;
coinpos[j] = true;
if (tomove == t-1) {
t--;
//update j
while( (j>=0) && coinpos[j] ) {
j--;
}

} else {
// the tomove cell is now empty.
j = tomove;
}
}
return res;

}


Colorful Chain


(Link to problem)
The remaining problems seemed to be much harder because of the low number of submissions. Colorful Chain had the most submissions so it was in theory the easiest. I dealt with this problem for all the time I had left after submitting MVCOIN. Reading the titles of the other problems, I think it was the right choice.

I actually solved this during the match. I just figured out the accumulated sums part too late and the match ended 10 seconds before I could finish the last change. Whatever.

Note that the constraints are such that we need something faster than quadratic complexity. First of all, notice that if we were interested in having each M consecutive links have all the colors. The problem would be equal to the factorial of M. This is simple to see when we have four colors, I will represent colors by uppercase letters.

Imagine we decided the first M colors:
ABCD.

Then the fifth color would forcefully have to be A. Because we want each 4 consecutive links to have all the colors available. Then we would be forced to make the next color B, then C, etc. Thus the first pattern of M numbers will have to be cyclic. There are only factorial(M) ways to decide the first M numbers.

When we want the groups of M+1 numbers to have the M available colors, it is a little different, but I thought that we should stay focusing on patterns and seeing how they force new patterns.

Solution
Imagine we have already decided the first (M+1) numbers. There will forcefully be exactly one repetition. Let us go back to M=4 (ABCD). Imagine we picked:

ABCAD....

Since the sequence BCAD contains all the four colors. We can choose any color we wanted for the next color. Now, let us ignore the initial letter. We will only have BCAD..... , decide the following color and all the remaining ones. We have to generalize this into a function F(N). F(N) returns the total number of ways to color the chain if the first M links are a permutation of all the M colors. We will solve F() later.

In the example, ABCAD, it was A which was repeated. What happens with another letter? For example, ABBCD. Now, BBCD dooes not contain A, and thus we are forced to pick A, which means ABBCDA.. . Finally, since BCDA are a permutation of the M letters, the result is F(N-2) (Already decided the first AB) .

Let us repeat B but in another position, like ABCDB, strangely, the same thing will happen: ABCDB... -> ABCDBA... -> CDBA... (F(N-2)).

We need to differentite things up, B is the second color we picked and when the second color picked is the one that is repeated, the result is F(N - 2). If we go back to the first example, when the first color is the one that is repeated, the result is F(N - 1). Now, try repeating the fourth-color, is the result F(N - 4)? ABCDD... -> ABCDDA... -> ABCDDAB... -> ABCDDABC... ( DABC is a permutation, and we have selected 4 letters before them, the result is F(N - 4).

Great, isn't it? If in the first M+1 colors, the repeated color is the i-th one, then there are F(i) ways to complete the rest. Now to calculate the total result, we have to sum up all the results, first find the total number of ways if the first color is repeated, then the second color etc. We need to find the number of ways to color the first M-1 links by repeating the first, second , third, etc, color. That is a combinatorics sub.problem of its own. To save you time, the solution to that problem is: M! * (M - i + 1). Where M! is the factorial of M. (Quick logic to find that formula: Imagine i = 4, we have M*(M-1) * (M-2) * (M-3) choices for the first four colors. Then we need to include the fourth color and the remaining colors in the other links, that is (M-i+1)! ).

What we have left is to calculate F(N). It is not very difficult, but we need the same logic. We assume the first M colors are a permutation of all the available colors. Let us say ABCD...

For the next color, we can choose each of the M colors without breaking the rule. What happens if we pick A? (first color of the permutation)

ABCD-A-....

Note that BCDA is yet another permutation, thus this is the same as starting the same problem all over again, but with smaller N. F(N-1), actually.

If we pick the second color, we have: ABCDB

ABCDB... -> ABCDBA... -> ABCDBA ...

We once again forcefully reach a permutation CDBA, this time we have F(N - 2).

It is the same thing for each of the M possibilities, if we choose the color at position i, the result is F(N - i). F(N) is then equal to the sum F(N - 1) + F(N - 2) + ... F(N - M). Since we have to solve it in linear time, you will need to calculate that sum in O(1) time. It is easy if we remember acum(N), the sum of all F from F(0) to F(N-1). Then the wanted sum is acum(N) - acum(N - M).

long long fact[100001]; 

const int MOD = 1000000007;
int solve(int N, int M)
{
long long F[N+1];
long long acum[N+1];
for (int i=0; i<=M; i++) {
F[i] = 1;
}
F[M+1] = M;
acum[0] = 0;
for (int i=1; i<=M+2; i++) {
acum[i] = (acum[i-1] + F[i-1]) % MOD;
}

//F[X] gives the number of ways to solve AFTER the first M colors have
//been picked.

for (int m=M+2; m<=N; m++) {
//F[m] = sum (F[m - 1], ... F[m - M] );
F[m] = (acum[m] - acum[m - M] + MOD) % MOD;
acum[m+1] = (F[m] + acum[m]) % MOD;
}

long long res = 0;
// res = sum( M! * (M - i +1) * F(N - i) )
for (int i=1; i<=M; i++) {
res += (F[N-i] * ( (fact[M] * (M - i + 1) )%MOD ) ) % MOD;
}

return res % MOD;
}

void init()
{
// precalculate the factorial
fact[0] = 1;
for (int i=1; i<=100000; i++) {
fact[i] = (fact[i-1]*i) % MOD;
}
}

No comments :