Tuesday, August 30, 2011

SRM 516 - Part 2. Div2 1000 and Div1 500

Div2 1000 - NetworkXMessageRecovery
(Problem statement)

Another oddly interesting problem for its slot. The key idea is to pay attention to the constraints of the problem. The statement says the original message did not have repeated characters and there is a guarantee that at least one of such original messages will exist. The subsequences will then also not have repeated characters.

The message should contain the characters in the input and only the characters in the input. This is because we want to minimize its length. There is no reason to add characters not in the input, it would only make the message longer. We also cannot forget any of the characters since we want them to be subsequences.

Then we already know the shortest length - The total number of different characters in the input. What is left is to find the lexicographically first message of that length. Since the actual characters are known, what we want is to pick the appropriate order.

Inspect the examples: {"acd", "bce"} , since we want acd to be a subsequence of the message, then a must always be to the left of c and c to the left of d in the message. a must come before c and d. b must come before c. Now, it could happen that two characters never appear in the same fragment, and thus there is no direct relation ship that forces one to come before another. But it may happen that indirect relations do force something like that, for example: {"ad", "dc"} c must come after d, and d must come after a, ergo c must come after a. In the case there is no relation direct or not between two characters, then we can place any of them in the message, but we want the lexicographically-first message, so the smaller character of those two should come first.

What I have described - Pick the order of a series of elements, when some elements must forcefully come after other elements. Is actually a Topological sort. The graph is directed, and there is precedence between two characters if one comes before another in one of the fragments. There will never be cycles for example, {ac, ca}, because an original message with no repeated characters will always exist. So we can just do a topological sort. There are at most 52 different characters, so we can use the simple topsort algorithm. With the exception that when there are multiple characters with no requirements we need to pick the lexicographically one.

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++)
struct NetworkXMessageRecovery
{
string recover(vector <string> messages)
{
set<char> reqs[128];
set<char> seen;

// for each character in the input, add it to set
// and keep a list (set) of the characters that
// are required to appear before it.
for (int i=0; i<messages.size(); i++) {
string x = messages[i];
for (int j=0; j<x.size(); j++) {
seen.insert(x[j]);
for (int k=j+1; k<x.size(); k++) {
reqs[x[k]].insert( x[j] );
}
}
}
// pull a top sort.
string topsort = "";
while ( seen.size() > 0 ) {
//for_each macro on a seen visits the characters
// in ascending order.

for_each(x, seen) {
if ( reqs[*x].size() == 0 ) {
// once we find one without requirements, pick it.
topsort += *x;
seen.erase(x);
// erase the picked character from the requirements
// of everyone else.
for_each(y, seen) {
//STL FEVEER!
if ( reqs[*y].count(*x) ) {
reqs[*y].erase( reqs[*y].find(*x) );
}
}
break;
}
}
}
}
};

Div1 500 - NetworkXMessageRecovery
(Problem statement)

I eventually fixed my brain and finally understood how what I coded worked (once the silly mistakes were fixed). So, here we go.

First, we need to decide a permutation of columns. This permutation decides the order which we will use to compare the characters in each row.

The "permutation of rows" actually are rankings that you decide to think for each column and decides what the value of each number in that column is. For example, let us say we make a row permutation that begins with {1,2,..} it means that number 1 will come before number 2 when comparing the characters that belong to that column. If we instead used {2,1,...} then 2 would precede 1.

I was writing an explanation on how to solve this problem, but then I noticed that it was getting over verbose and hard to understand. I decided to scrap that explanation and give it a thought on how to explain it before rewriting it.

Edit: And the editorial is up. As you can notice, it was still very hard for me to explain the 500.