Tuesday, August 30, 2011

SRM 516 - Part 2. Div2 1000 and Div1 500

(Link to part 1)

Div2 1000 - NetworkXMessageRecovery
(Problem statement)

Another oddly interesting problem for its slot. The key idea is to pay attention to the constraints of the problem. The statement says the original message did not have repeated characters and there is a guarantee that at least one of such original messages will exist. The subsequences will then also not have repeated characters.

The message should contain the characters in the input and only the characters in the input. This is because we want to minimize its length. There is no reason to add characters not in the input, it would only make the message longer. We also cannot forget any of the characters since we want them to be subsequences.

Then we already know the shortest length - The total number of different characters in the input. What is left is to find the lexicographically first message of that length. Since the actual characters are known, what we want is to pick the appropriate order.

Inspect the examples: {"acd", "bce"} , since we want acd to be a subsequence of the message, then a must always be to the left of c and c to the left of d in the message. a must come before c and d. b must come before c. Now, it could happen that two characters never appear in the same fragment, and thus there is no direct relation ship that forces one to come before another. But it may happen that indirect relations do force something like that, for example: {"ad", "dc"} c must come after d, and d must come after a, ergo c must come after a. In the case there is no relation direct or not between two characters, then we can place any of them in the message, but we want the lexicographically-first message, so the smaller character of those two should come first.

What I have described - Pick the order of a series of elements, when some elements must forcefully come after other elements. Is actually a Topological sort. The graph is directed, and there is precedence between two characters if one comes before another in one of the fragments. There will never be cycles for example, {ac, ca}, because an original message with no repeated characters will always exist. So we can just do a topological sort. There are at most 52 different characters, so we can use the simple topsort algorithm. With the exception that when there are multiple characters with no requirements we need to pick the lexicographically one.

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++)
struct NetworkXMessageRecovery
{
string recover(vector <string> messages)
{
set<char> reqs[128];
set<char> seen;

// for each character in the input, add it to set
// and keep a list (set) of the characters that
// are required to appear before it.
for (int i=0; i<messages.size(); i++) {
string x = messages[i];
for (int j=0; j<x.size(); j++) {
seen.insert(x[j]);
for (int k=j+1; k<x.size(); k++) {
reqs[x[k]].insert( x[j] );
}
}
}
// pull a top sort.
string topsort = "";
while ( seen.size() > 0 ) {
//for_each macro on a seen visits the characters
// in ascending order.

for_each(x, seen) {
if ( reqs[*x].size() == 0 ) {
// once we find one without requirements, pick it.
topsort += *x;
seen.erase(x);
// erase the picked character from the requirements
// of everyone else.
for_each(y, seen) {
//STL FEVEER!
if ( reqs[*y].count(*x) ) {
reqs[*y].erase( reqs[*y].find(*x) );
}
}
break;
}
}
}
return topsort;
}
};


Div1 500 - NetworkXMessageRecovery
(Problem statement)

I eventually fixed my brain and finally understood how what I coded worked (once the silly mistakes were fixed). So, here we go.

First, we need to decide a permutation of columns. This permutation decides the order which we will use to compare the characters in each row.

The "permutation of rows" actually are rankings that you decide to think for each column and decides what the value of each number in that column is. For example, let us say we make a row permutation that begins with {1,2,..} it means that number 1 will come before number 2 when comparing the characters that belong to that column. If we instead used {2,1,...} then 2 would precede 1.

I was writing an explanation on how to solve this problem, but then I noticed that it was getting over verbose and hard to understand. I decided to scrap that explanation and give it a thought on how to explain it before rewriting it.

Edit: And the editorial is up. As you can notice, it was still very hard for me to explain the 500.

SRM 516 - So, I forgot I have a blog...

So, there we go, SRM 516 I was feeling under normal levels of energy today. Hence why I admitted in the chat that I was kind of foreseeing dropping below 2100. If I were to listen my mother I would know that when you feel down you shouldn't do something I would have skipped the match. On the other hand, that's what wusses do. You wouldn't have good SRMs if it wasn't for the bad ones. And so it begins...

Div1 500 - RowsOrdering
(problem statement)

Meh. You know, it was very difficult for me to understand this problem. I had a lot of difficulty parsing the problem so that the examples made sense. Brain was just not working correctly.

15 minutes after the beginning, still have no idea what the problem is about. I know you have to pick some permutations, but I cannot parse how exactly the permutations affect the problem. I open division summary, tons of solutions for 250. I open it sometime after, and 10 guys already have solutions for two problems. It seems that 500 is 'easy'. But I still have no idea what to do. Time passes and more and more people solve it. I know I have to do it.

I eventually, just deconstructed the examples, and took a conclusion about how it works. It turns into a greedy algorithm, a very easy greedy algorithm. I code it, but it is very late, only few minutes left, so if I want to finish the 250 I have to rush. And I have to finish the 250 because everyone seems to have solved two problems... I pass examples and submit. Rush to 250...

This is what I submitted: http://community.topcoder.com/stat?c=problem_solution&rm=309666&rd=14541&pm=11114&cr=22652965 . Bonus points if you find both of the stupid mistakes I made.

Div1 250 - NetworkXOneTimePad
(problem statement)

Out of the scores I know that this should be a very quick problem to solve. I also know that I have to get a good score because my score for the medium is low and I am not sure what I did there. So I open the problem.

Initially confused about int result and no indication of what to do if the result is large. - Of course! That must mean that the result will always fit 32 bits int. But why. I was initially thinking of some approach in which you pick each bit and multiply by 2 if it can be different. After examining the examples, that does not really work.

I eventually notice the obvious thing. Given two fixed plaintext and cipheredtext, the key can be found by a simple xor , (reversing the process). Because of this, and because each ciphered texts must forcefully come from one of the original plaintexts, then there is a potential key for every pair of plain and ciphered texts. We can just pick all those pairs, get the potential key and then try it out.

Some difficulties when implementing: After you have a potential key, you have to convert each plain text to a new ciphered texts. The input ciphered texts must be a subset of the ones you found. There is also the chance you find the same potential key multiple times, so you have to avoid repetition.

I named a function xor and I was getting very strange mistakes. It seems someone in charge of c++ thought it would be hilarious to do a define that replaces the word xor with ^ (xor operator). Fortunately, by then my brain was starting to work, else I don't think I would have managed to understand what was going on. I replaced the xor function with bor.

The result is some ugly, very ugly code: http://community.topcoder.com/stat?c=problem_solution&rd=14541&rm=309666&cr=22652965&pm=10846.

In fact, it over does things. First of all, the constraints forbid repetitions in the plaintext array. And since all of the cipheredtexts must come from then, we can just get the key for each pair ( plaintext[i], cipheredtext[0] ) (Fixing ciphered text 0). This guarantees that potential keys are never tried twice. Also, instead of coding the plaintexts, we can decode each cipheredtext , then check if the decoded result exists in the input.

The following is the refactored code, it is much better:

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++)
struct NetworkXOneTimePad
{
// xor between two strings...
string sxor(string a, string b) {
string s= "";
int n = a.size();
for (int i=0; i<n; i++) {
s += ( (a[i]==b[i]) ? '0' : '1' );
}
return s;
}
int crack(vector <string> plaintexts, vector <string> ciphertexts)
{
int cnt = 0;
// plaintexts as a set, so that we can find decoded ciphertexts
// on them.
set<string> plaintextset( plaintexts.begin(), plaintexts.end() );
for_each(p , plaintexts) {
// Find the key.
string key = sxor(*p, ciphertexts[0]);

// Each of the ciphertexts must become one of the plaintexts
// when decoded.
bool good = true;
for_each(q , ciphertexts) {
good &= ( plaintextset.count(sxor(*q, key)) );
}
if (good ){
cnt ++;
}
}
return cnt;
}
};


I was barely able to submit 250, so I have two very slow submissions, the worse was yet to come.

Aftermath
Of course, 500 got challenged. If it wasn't for the dumb mistakes, I could have gotten top 100 even with the slow submissions. It seems many people managed to make mistakes in both problems. Somehow I am still above 2100, 2 points above... Is the worst of this streak done yet?

Personal opinion warning: I did not really like the 500. For an easier than usual problem, the speed becomes very important and when most of the problem solving time comes from decoding the statement, that's not good.

Div2 250 - RowsOrdering, explained
(problem statement)
Now this was more interesting than your average div2 250. The trick is to note that the special property - That each sequence of even length has the same number of x and o - forces a specific pattern.

First, let us thing of the (second) simplest contiguous sequence of even length. There are two possibilities, xo and ox.

Now try a sequence of length four. First of all, it forcefully needs 2 x and 2 o. But something like xxoo is invalid, because the previous rule still forces all of the length-2 subsequences. So, in fact, we need xoxo or oxox. Note that in both of those cases, there is a length-2 sub-sequence in the middle (ox and xo) , the rule still applies.

In fact, the message will forcefully have to be in the pattern xoxox.... or oxoxoxo... . Because the rule for length-2 subsequences will always have to apply, and once it applies, it works for all even length subsequences.

The constraints guarantee exactly one result. So we should just check the two alternating sequences of the same size as the input. xox... and oxo... and pick the one that fits the pattern.

string reconstruct(string message)
{
//s variable decides whether to begin with o or x
for (int s=0; s<2; s++) {
string fixed = message;

// pick the first character
char ch = (s? 'o' : 'x' );

bool bad = false;
// build the alternating message, check against
// the input, if at one point one of the
// known characters in the message contradicts
// the one we are building, then we can't use that
// case.
for (int i=0; i<message.size(); i++) {
if ( message[i] == '?' ) {
fixed[i] = ch;
} else if ( message[i] != ch ) {
bad = true;
}

ch = ( (ch=='x') ? 'o' : 'x');
}
if (! bad ) {
return fixed;
}
}
//constraints guarantee this won't happen.
return "!";
}


Part two is coming soon...

Tuesday, August 09, 2011

SRM 514

Oh boy, two huge mistakes related to the poorly thought % operator in C++.

Div1 500 MagicalGirlLevelTwoDivOne

I did not let it trick me and I actually found the key to solve the problem quite quickly.

Imagine we have n+1 numbers in a column:

[][][][][][]...[]

The two consecutive groups of n numbers will overlap in exactly (n-1) numbers. Let us say that the sum of all the numbers in this overlap area is already known. There are two cases:

* The overlap sum is even: Then both of the remaining numbers must be odd so that the two n-sized areas are odd.
* The overlap sum is odd: Then both of the remaining numbers must be even so that the two n-sized areas are odd.

Now, this is the nice part, this means that for every column the parity of F[x][y] must be equal to the parity of F[x+n][y]. The same is true for rows, the parity of F[x][y] must be equal to the parity of F[x][y+n]. And this applies everywhere.

In fact, this means that we only need to decide the parity for the cells in the first n x m rectangle. This is an easier problem because n and m are at most 10. For each cell F[x][y] where (x < n) and (y < m) there are different numbers of ways to have that cell even or odd, it depends of all the cells F[x2][y2] such that (x2 % n = x) and (y2 % m = y). All of those cells must be even or odd numbers depending of the chosen parity for F[x][y] and if they are . there are 5 or 4 different digits for each of those cells. The number of ways to have a parity in cell F[x][y] can be calculated easily with a single n^4 loop.

Finally, we need to pick the parities. Imagine a certain row of m=6 elements was 010101 , now the first group of m elements will have an odd total parity and that's good. The second group of m elements will be: 10101 . 0 , because we have already determined that the parities are cyclic. The parity will be the same as the first group. So we should just need to worry that in our m x n rectangle , the rows and columns all make odd parities. The final result can be found with a dp. Just remember the current parities of each column and the parity of the current row and select 0 or 1 for the parity of each cell.

I took more time than necessary because at first I thought I could use a slower but simpler to implement dp, that just takes the 2^m parities of the columns and a current row number. (Needs to try 2^m row combinations for each row, in total 2^(2*m)*n). It turned out to be too slow. So I moved to the other approach.

const int MOD = 1000000007;

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++)
struct MagicalGirlLevelTwoDivOne
{
long long fact[10][10][2];
long long dp[11][11][2][1024];
vector<int> vmasks;
int theCount(vector <string> palette, int n, int m)
{
int h = palette.size();
int w = palette[0].size();

// fact[y][x][k] is the number of ways to have
// parity k (0 : even , 1 : odd) in cell [y][x]
// Where y < n and x < m.
for (int i=0; i<n; i++) {
for (int j=0; j<m; j++) {
for (int k=0; k<2; k++) {
long long & p = fact[i][j][k];
p = 1;
for (int ii=i; ii<h; ii+=n) {
for (int jj=j; jj<w; jj+=m) {
if ( palette[ii][jj] == '.' ) {
p = (p * (4+k) ) % MOD;
} else if (palette[ii][jj]%2 != k) {
p = 0;
}
}
}
}
}
}

// dp[y][x][r][mask] is the total number of ways to assign the parities
// if :
// * We have already assigned the parities of the cells previous to (y,x)
// * The current parity of the row is r (0 or 1).
// * The current parity of the j-th column is the j-th bit of mask.
//
memset(dp,0,sizeof(dp));
dp[n][0][0][ (1<<m)-1 ] = 1;
for (int y=n-1; y>=0; y--) {
for (int x=m; x>=0; x--) {
for (int r=0; r<2; r++) {
for (int mask=0; mask<(1<<m); mask++) {
if (x==m) {
if ( r == 1) {
dp[y][m][r][mask] = dp[y+1][0][0][mask];
} else {
dp[y][m][r][mask] = 0;
}
} else {
dp[y][x][r][mask] = 0;
for (int k=0; k<2; k++) {
int nmask = (mask ^ (k<<x));
int nr = (r^k);
dp[y][x][r][mask] += (fact[y][x][k] * dp[y][x+1][nr][nmask]) % MOD;
}
dp[y][x][r][mask] %= MOD;
}
}
}
}
}
return dp[0][0][0][0];
}
};




Div 250 MagicalGirlLevelOneDivOne
Things are doing great, it seems that when I finished 500 and made sure it was correct I was first at my room and I had a very good position. I opened 250 and then...

It seemed a little abstract. At the end after tons of thinking I decided to do what I should have done from the beginning, I coded a program that simulated a BFS on a 100x100 board to find any pattern related to n-knight moves. The results were nice.

For a 2-knight, it turns out that all of the cells were reachable. I tried many other even numbers and it was always possible.

More cooler When I tried 3-knight, it was different, the reachable cells formed a checkered board pattern:


x.x.x.x.x.x
.x.x.x.x.x.
x.x.x.x.x.x
.x.x.x.x.x.
x.x.*.x.x.x
.x.x.x.x.x.
x.x.x.x.x.x
.x.x.x.x.x.


The asterix is (0,0) and every x is a reachable cell. This turned out to seem true for all odd number.

So, if there is an even number in the allowed n-knight moves, the solution is always possible. Then we have to worry only about odd-moves, but note that since the checkered board pattern is always there, it does not matter what combination of odd knight moves you use, the reachable cells are always the same in the checkered board.

Leads me to a very simple solution:

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++)
struct MagicalGirlLevelOneDivOne
{
string isReachable(vector <int> jumpTypes, int x, int y)
{
for_each(n, jumpTypes) {
if (*n % 2 == 0) {
return "YES";
}
}
return ( (x&1) == (y&1) ) ? "YES": "NO";
}
};


Blunder
Did you notice the blunder? At first, I was a little paranoid that the solution cannot be so easy. I initially thought that when n is 1, the 1-knight moves are also always possible. But after trying the program, it seemed that it was still an odd number and followed the same rules.

The challenge phase came. I eventually saw that a solution using the same logic was challenged.

... It was some time after that I noticed the bug. Someone designing C didn't agree with making % compatible with the concept of true modulus. Negative numbers behave different than you would expect -1 % 2 is not 1 , it is -1. I knew that, but I just didn't think of that during the match. I figured out that my solution was wrong and it was eventually challenged.

Then I made another blunder, I tried to exploit this knowledge to score some challenges. But I misread code / didn't notice that in the case when the number is a multiple of the second operand of the % operation, it still returns 0 regardless of whether the first one is negative. I scored two wrong challenges that way.

The final outcome, I increased my rating to 2073. Which is barely one rating point above my previous maximum. It makes me wonder what would have happened if I didn't break rule #2: DO NOT CHALLENGE.

Here's a fixed version of the code for 250, note that it is still incredibly simple.

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++)
struct MagicalGirlLevelOneDivOne
{
string isReachable(vector <int> jumpTypes, int x, int y)
{
for_each(n, jumpTypes) {
if (*n % 2 == 0) {
return "YES";
}
}
return ( (x+y) % 2 == 0) ) ? "YES": "NO";
}
};