Tuesday, November 29, 2011

SRM 525: And now, for something completely different

Believe it or not, I do have a life outside of programming contests. That life, however, involves barely studying Computer Science.

I have a sort-of final today, which overlaps with SRM 525. I always try to schedule my subjects to minimize the number of SRMs I lose. Unfortunately, it was not possible to avoid this overlap. When the overlap is forced, I prioritize the SRM unless the class is very urgent or important. But today that's not the case.

However, I am tired of this. Maybe I can't control the overlap, but I can still control my fate: I can still decide to participate in SRM 525. Here is how it goes: I have to arrive at the sort-of final at 08:30, and the SRM starts at 08:02. I do not have the ability of teleportation. So, what I will do is participate until exactly 08:15 AM. That gives me 13 minutes, which is statistically enough for me to solve the division 1 easy problem. Also statistically, that might be enough for me to gain rating points.

There are risks: I might not manage to solve the problem in time, or the medium problem might turn out to be more approachable, in which case the easy alone won't be enough. Let us see what happens.

Update after the SRM: The experiment went sort of OK.

Div1 300


You have a grid of at most 30x30 cells. There may be coins in each cell. You can move all the coins up, left, down or right in one step. If in any step a number of coins get out of the grid, they are lost forever. Return the minimum number of turns needed to end with exactly K coins. Or -1 if it is impossible.

First of all, note that at the end you will necessarily end with the coins inside one of the sub-rectangles of the grid. Thus, if no sub-rectangle contains exactly K coins, it is impossible; if one does, however, it is possible. There are about w*w*h*h sub-rectangles in total. If one of these rectangles has K coins, then we can try to find the required number of steps to remove the coins in the cells outside it.

Now, for each rectangle we have to count the number of coins. We can actually do it in O(w*h) time. Many people did not notice that the constraints are small enough to allow a total time complexity of O(w*w*w*h*h*h), so they instead counted the number of coins in a rectangle in O(1) time (by first building a matrix of accumulated sums).
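For completeness, the accumulated-sums idea mentioned above can be sketched like this (a hypothetical helper, not from the contest; `CoinCounter` and the index convention are my own names):

```cpp
#include <bits/stdc++.h>
using namespace std;

// acc[i][j] holds the sum of coins[r][c] for all r < i, c < j, so the
// total inside any sub-rectangle takes only four lookups.
struct CoinCounter {
    vector<vector<long long>> acc;
    CoinCounter(const vector<vector<int>>& coins) {
        int h = coins.size(), w = coins[0].size();
        acc.assign(h + 1, vector<long long>(w + 1, 0));
        for (int i = 0; i < h; i++)
            for (int j = 0; j < w; j++)
                acc[i+1][j+1] = coins[i][j] + acc[i][j+1] + acc[i+1][j] - acc[i][j];
    }
    // coins inside rows [r1, r2] and columns [c1, c2], inclusive
    long long count(int r1, int c1, int r2, int c2) const {
        return acc[r2+1][c2+1] - acc[r1][c2+1] - acc[r2+1][c1] + acc[r1][c1];
    }
};
```

With the small constraints this is optional, but it drops the counting cost from O(w*h) per rectangle to O(1).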

Now that we know that a given sub-rectangle has K coins, we need to find the minimum number of steps to remove the coins in the other rectangles. This is similar to just removing rows or columns. First of all, I'd like to point out that we can treat rows and columns separately. Moving horizontally won't affect your requirements to move vertically.

So, we have to make a decision: we will move up X times and down Y times, and we want to minimize (X+Y). What is cool here is that every move up adds a requirement for a move down (there will be a whole new row that we have to remove), and vice versa. Thus if we first do the moves up required to clear the top rows, we will need a similar number of extra moves down to return the grid to normal before removing the bottom rows.

Long story short, if c is the number of top rows to remove and d is the number of bottom rows, then the number of moves will be at least (c+d); on top of that, we will need T extra moves, where T is either c or d. It is best to minimize T, and that is done by taking the minimum of c and d.

Now, the same happens with columns: if you have to remove a left columns and b right columns, the number of steps will be (a + b + min(a,b)).

The rectangle with K coins that needs the minimum number of steps is the overall answer.
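Putting the pieces together, the whole approach can be sketched like this (a hypothetical reconstruction, not the actual contest submission; `coins` and `minSteps` are names I made up, and the coin counting is done naively since the constraints allow it):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Brute force over all sub-rectangles of the h x w grid. For each rectangle
// with exactly K coins, the cost is (rows to clear) + (columns to clear),
// each computed with the (c + d + min(c, d)) formula derived above.
int minSteps(const vector<vector<int>>& coins, int K) {
    int h = coins.size(), w = coins[0].size();
    int best = -1;
    for (int r1 = 0; r1 < h; r1++)
     for (int r2 = r1; r2 < h; r2++)
      for (int c1 = 0; c1 < w; c1++)
       for (int c2 = c1; c2 < w; c2++) {
           long long sum = 0;
           for (int i = r1; i <= r2; i++)
               for (int j = c1; j <= c2; j++)
                   sum += coins[i][j];
           if (sum != K) continue;
           int up = r1, down = h - 1 - r2;    // top/bottom rows to remove
           int left = c1, right = w - 1 - c2; // left/right columns to remove
           int cost = up + down + min(up, down)
                    + left + right + min(left, right);
           if (best == -1 || cost < best) best = cost;
       }
    return best; // -1 when no sub-rectangle holds exactly K coins
}
```

Note that when the whole grid already holds exactly K coins, all four removal counts are zero and the answer is 0.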
....

I then went to the "exam". To my dismay, it started later than I thought, so I would have had more time to try the medium problem. During all the waiting time I had a lot of second thoughts about the approach, but I was able to think of many arguments showing that it is correct.

I was able to grab a computer during the challenge phase and tried to open some solutions. My room seemed very solid.

The final result: I passed system tests. The speed in 300 was indeed good enough to gain rating. But I am still in [less than 1900] land, which is not where I feel I should be.

Sunday, November 27, 2011

Seriously google, what's wrong with you?

Need a place to vent. Remember that post I made about not liking the google reader colors at all?

Well, to be honest, the colors thing is still awful. There is still no way to pick a different color theme, and the only color theme still seems to be greatly inspired by the Nazi flag. But the most depressing thing of all is that that is not the biggest problem. Once I tried to get past the color issues, I learned that the new google reader was also awful on the inside. This somethingawful post does a good job explaining it. There is also this post from a former google employee.


It is almost as if google hated google reader and wanted to kill it. I don't like that idea. I liked google reader, it was useful. I used it to read feeds, find "feeds I may like", and it also used to be the way I picked news items to post on the right side of this blog. For some reason, google does not want that. It is almost as if they assigned someone who has never used google reader to update it. Why?

I don't want to use google+ to be able to share feed items. I don't want to use google+. Its anti-anonymity features are too lame.

Do you use google reader? What do you think of the changes?

Also, in the same vein, a gmail theme update is coming soon. In fact, you can already see the "new look". And boy, does it seem to originate from the same anti-color people that made the google reader change. At least you can change the theme... Except that all the themes have little to no saturation or contrast. I mean, I tried all the themes and they all seem to use the dullest possible color for the text background. Except the "HD" themes, which either use black text on a dark background or have as dull a color scheme as the non-HD ones.

SWERC 2011, D, F, G, H and almost E

I had to wake up at 5:15 AM for this, and it was still 65 minutes late. I messed up this time because I didn't know the NWERC problem set was also going to become a contest, at spoj (which is more updated language-wise than uva) and 1 hour later (which would have been perfect for me).

In the end, I solved 4 problems, finishing in 9th position.

Problem D - Distributed ballot boxes
By the time I arrived there were already stats, so I picked the problem that was solved the most. It was D.

You have to assign at least one box to each city; if a city has only one box, all of its people will have to use that box. If the city has more than one box, then you can decide who in the city gets which box (and it is better to divide the quantities evenly, because you want to minimize the maximum number of people using the same box).

At first I thought the problem was just to accumulate all the populations and divide (mind you, I was still waking up). Then I figured you cannot do that: you cannot move citizens across cities.

I thought of various things and eventually settled for this. The trick is: what if the maximum for another city was already X? Then there is no point in spending our effort trying to get a value smaller than X for our city.

That idea can be turned into a binary search. Let us change the problem into: can the largest number of citizens that use a single box, over all the cities, be less than or equal to ha? If some value of ha is not possible, then smaller values won't be possible either. If a value is possible, then any larger value will be possible too.

So, to answer the above question, we just need to find the minimum number of boxes needed by each city so that no box holds more than ha citizens. If the sum of all the minimums exceeds b (the number of boxes we have to assign), then ha is not possible.

For the minimum number of boxes needed by a city, note that all of its boxes should hold about the same number of citizens, so the minimum is just a ceiling division: ceil(population / ha) boxes.


//=========================================================
// program:
//
// (initially, parse the input so that N, B and population[] hold it).
int N, B;
int population[500000];

// is it possible to assign the boxes to the cities such that no box holds more
// than mx citizens?
bool possible(int mx)
{
    int minb = 0;
    for (int i = 0; i < N; i++) {
        // least amount of boxes we need to split population[i] into
        // parts no larger than mx.
        int x = population[i]/mx + (population[i]%mx != 0);

        minb += x;
        if (minb > B) {
            return false;
        }
    }
    return true;
}

int solve()
{
    // Binary search for
    // the minimum x such that possible(x).
    int lo = 0;       // We know possible(0) is false.
    int hi = 5000000; // We know possible(hi) is true (constraints).
    while (hi > lo + 1) {
        int ha = (lo + hi) / 2;
        if (possible(ha)) {
            hi = ha;
        } else {
            lo = ha;
        }
    }
    return hi;
}



Problem H - Peer review
It seems that with ACM regionals there are always at least 3 or 4 problems that are way too straightforward and mostly annoying implementation. This is one of them.

Something I had issues with was that you can count at most one mistake per paper. So, what exactly do you do when a single scientist reviews the same paper twice? Somehow the 'per paper' part was confusing when scientists could be the ones with problems. In the end I concluded that in that case we should count the mistake once, because it is a single paper that is being reviewed twice by the same scientist.

A trick here is to convert the input. The problem is based on papers, yet the input is sorted by authors. It is easy to generate an array with the papers as indexes.

Then do exactly what it says on the tin. For each paper, verify that exactly K scientists review it, that nobody from the same institution as the paper's author reviews it, and that each of the scientists reviewing it is distinct.
int K, N; // (From the input)
string institution[1000]; // (From input, institution of the i-th author)
int review[1000][5]; // author x reviews paper #review[x][i] for (i < K)

int reviewed[1000][5]; // paper x is reviewed by scientist #reviewed[x][i]
int reviewedn[1000]; // number of reviewers (shall be K at the end).

int paperHasProblem[1000]; // Allows us to count at most one problem per paper.


int countProblems()
{
    fill(reviewedn, reviewedn + N, 0);
    fill(paperHasProblem, paperHasProblem + N, 0);

    // convert (from authors as indexes to papers as indexes).
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < K; j++) {
            int p = review[i][j];
            int &t = reviewedn[p];
            if (t == K) {
                paperHasProblem[p] = true;
            } else {
                reviewed[p][t++] = i;
            }
        }
    }
    // handle it
    for (int i = 0; i < N; i++) {
        // Exactly K reviewers, no less, no more
        if (reviewedn[i] < K) {
            paperHasProblem[i] = true;
        }
        if (paperHasProblem[i]) {
            continue;
        }
        for (int j = 0; j < K; j++) {
            // No collaborators
            if (institution[i] == institution[ reviewed[i][j] ]) {
                paperHasProblem[i] = 1;
                break;
            }
        }
        // no repeats
        for (int j = 0; j < K; j++) {
            if (paperHasProblem[i]) {
                break;
            }
            for (int s = j+1; s < K; s++) {
                if (reviewed[i][j] == reviewed[i][s]) {
                    paperHasProblem[i] = 1;
                    break;
                }
            }
        }
    }
    return accumulate(paperHasProblem, paperHasProblem + N, 0);
}



Problem F - Guess the numbers
It seems that in every ACM-related contest there is a problem that is about evaluating a parenthesis expression. And every time the complicated part is to evaluate them. But if you know the words recursion or stack it is not so hard...

So, really, there are n unknowns. And n is at most 5. You have n values available, and you have to assign them to the unknowns. You can just try all the Factorial(5) ways to do it. Then all that is needed is to evaluate the expression and see if it is equal to m.

At first I was not sure if I would have to worry about overflow. Then I noticed that 50*50*50*50*50 is the largest value you can get, which still fits in a 32-bit int.

Note that the statement never says that the unknowns will be in sequence. So we cannot be sure that for n=3 the unknowns will be a, b, c. Knowing ACM problems, it is likely there were cases with n=3 and, for example, p, z, x. I do not know this for sure because I didn't try submitting without handling this corner case. But I bet it's true.

int n, m; // (From the input)
int values[5]; // The available values, also from the input.
string expression; // The expression (guess where it comes from).

// Evaluate the parenthesis expression starting at a.
// b will be the first index greater than a that is NOT part of the expression.
int evaluate(int a, int &b)
{
    // The expression is an atom, just return the assigned value.
    if (expression[a] != '(') {
        b = a + 1;
        return values[ expression[a] - '0' ];
    }
    // expression[a] is a '(',
    // then it is followed by another complete parenthesis expression,
    int p = evaluate(a + 1, b);
    // then an operator,
    char op = expression[b];
    // and then another expression.
    int q = evaluate(b + 1, b);
    assert(expression[b] == ')');
    // The end of the main expression comes just after the end of the second
    // sub-expression.
    b++;
    int r = 0;
    switch (op) {
        case '+': r = p + q; break;
        case '-': r = p - q; break;
        case '*': r = p * q; break;
    }
    return r;
}

bool isPossible()
{
    int t = 0;
    // First of all, let us update the expression so that the unknowns
    // are represented as fixed digits instead of arbitrary letters.
    map<char, int> assign;
    for (int i = 0; i < (int)expression.size(); i++) {
        char ch = expression[i];
        if (ch >= 'a' && ch <= 'z') {
            if (assign.count(ch)) {
                expression[i] = (char)('0' + assign[ch]);
            } else {
                expression[i] = (char)('0' + t);
                assign[ch] = t++;
            }
        }
    }
    // Try all permutations of values; evaluate, and if the result is m, return true.
    sort(values, values + n);
    do {
        int b;
        if (evaluate(0, b) == m) {
            return true;
        }
    } while (next_permutation(values, values + n));
    return false;
}


Problem G - Non-negative partial sums
Finally, a more interesting problem. However, there was something I did not like about it, which I will explain below.

Ok... so the number of elements is 1000000. That's... high. You would certainly need an O(n) or O(n*log(n)) solution. And that does not sound easy, because naively you have to cycle the sequence n times and recalculate the sums each time!

Due to the trauma from SRM 524 (that very special medium problem), I tried to keep things like accumulated sums in mind. It turns out they are partially helpful here.

The trick is to try your own sequence. How about: 7,-3,4,-5. Let us try the sums: (7, 4, 8, 3). These are the partial sums for the first rotation (no rotation at all). As you can see, there are no negative values here. What is interesting is what happens after a single rotation: (7,-3,4,-5) becomes (-3,4,-5,7), but what happens to the accumulated sums? They become (-3, 1, -4, 3). We now have negatives, so this rotation does not count.

We need to know how to convert the accumulated sums of the first rotation into those of the second one. It is not actually very hard. From the first sums (7,4,8,3), we remove the first element (a[0] = 7). If we remove the first element, the sums become (4 - 7, 8 - 7, 3 - 7). However, we also need to append the first element at the last position. The new element will be (3 - 7 + 7), because the last accumulated sum is (3 - 7) and we add 7.

Note that we do not really need the whole sequence of accumulated sums; what we need is to know whether any is less than 0. If the minimum is not negative, then none of the sums is negative, so we only need the minimum. Each time we remove the front element a[i], all the remaining sums drop by a[i] units; then we remove a[i+1] and they drop again. Rather than updating every sum, we can just keep a variable offset such that the current value of each sum is its stored value plus offset.

Now, for each rotation, we just need: a) the minimum value among all the stored sums, and b) the offset variable. If (minimum + offset) is less than 0, then the rotation contains a negative sum.

We do need to update the minimum every time we remove and add elements, and this had better be fast. You can use a data structure like a tree to do this. Did you know that the STL's std::set is actually a very good tree structure (a red-black tree, to be honest)? It can find the minimum in O(log(n)) time (using the .begin() method). It can also remove elements in O(log(n)) time, but you need some way to find them. Here I propose a pair: store (accumulated_value[i], i) in the set, so you can erase the i-th sum by looking for (accumulated_value[i], i).

The first code I submitted looked like this:

int n;            // from the input
int a[1000000];   // the values from the input
int acu[1000000]; // acu[i] will hold sum(a[j], for j <= i).

int solve()
{
    // Build the first sequence of sums, add to our "tree"
    set< pair<int,int> > tree;
    int res = 0;
    acu[0] = a[0];
    tree.insert( make_pair(acu[0], 0) );
    int last = acu[0];
    for (int i = 1; i < n; i++) {
        acu[i] = acu[i-1] + a[i];
        tree.insert( make_pair(acu[i], i) );
        last = acu[i];
    }
    res = (tree.begin()->first >= 0); // if the minimum acu is negative, the first cycle is wrong.

    int offset = 0;
    // The remaining rotations
    for (int i = 1; i < n; i++) {
        // We remove a[i-1]
        tree.erase( tree.find( make_pair(acu[i-1], i-1) ) );
        offset -= a[i-1];

        // We add last + a[i-1]
        int id = i + n;
        tree.insert( make_pair(last + a[i-1], id) );
        // We update last:
        last += a[i-1];

        // verify the minimum
        if (tree.begin()->first + offset >= 0) {
            res++;
        }
    }
    return res;
}


Unfortunately, it timed out. Which is very lame, because it is clearly O(n*log(n)). It seems that, as usual, uva wants you to care about the constant factor, which is something I don't like much (because it teaches us the terrible lesson of not using libraries, when using libraries (a good practice) should be rewarded and not the other way around). I eventually came up with a solution that uses the same logic but no tree; instead we just sort the first sequence of sums. It has the same complexity but less overhead than std::set, and it passes in time.

int n;          // from the input
int a[1000000]; // the values from the input

pair<int,int> acumvec[1000000];

int solve()
{
    int res = 0;
    // Build the first sequence of sums, then sort the vector
    int acu = 0;
    int last = 0;
    for (int i = 0; i < n; i++) {
        acu += a[i];
        acumvec[i] = make_pair(acu, i);
        last = acu;
    }
    sort(acumvec, acumvec + n);
    res = (acumvec[0].first >= 0); // if the minimum acu is negative, the first cycle is wrong.

    int t = 0;
    int offset = 0;
    // remaining cycles
    int ultramin = 2000000000; // The idea is that we will never have to remove
                               // the indexes greater than (n-1), so we do not need
                               // them in a tree or anything.
                               // This also means that we do not really need to
                               // update the first vector, so the order will
                               // always be the same and thus we only need to sort once.
    for (int i = 1; i < n; i++) {
        // We 'remove' i-1
        while ((t < n) && (acumvec[t].second < i)) {
            t++; // Note that we will visit each t at most once IN TOTAL, so it is O(n)
                 // but outside the for (i = 1; i < n; i++) loop.
        }

        offset -= a[i-1];
        // We "add" last + a[i-1]
        ultramin = min(ultramin, last + a[i-1]);
        // We update last:
        last += a[i-1];

        // If (minimum + offset) < 0, there are negative sums.
        // We need to consider both minimums...
        if ((ultramin + offset >= 0) && ((t >= n) || (acumvec[t].first + offset >= 0))) {
            res++;
        }
    }
    return res;
}



Problem E - Game, set and match
I tried many other problems but it was already quite late and I didn't have promising ideas. Around 20 minutes before the end, I opened E. I think this problem is really not as hard as the statistics would tell you. But it does need you to implement 4 different subproblems and that takes quite some time.

My initial idea is dynamic programming with some handling of special cases. I only solved the game part so far. Here is an explanation about how to solve games.

Let us define f(a,b), the probability that the player will win a game if his score is a, and the other player's score is b. Note that the scores defined in the statement are 0,15,30,40, but we can consider them as (0,1,2,3). We will add a new score 4, that means a player won. The player that has score a has a p probability to win a point and we want to calculate his probability to win.

Let us find f(a,b) when both are 3. This is the most interesting case (equivalent to 40,40 in the statement), as the players are in a "deuce". We are interested in seeing the player win 2 consecutive points. But if the players score two alternate points, we are back to 40,40. That is important. Let us put things into an equation:

f(3,3) = p*p + p*(1-p)*f(3,3) + (1-p)*p*f(3,3) + (1-p)*(1-p)*0.

- If the player scores two consecutive points, that has probability p*p, and he wins. If the other player scores two consecutive points, that has probability (1-p)*(1-p) and the first player loses (thus the 0). Else, in the other two cases each of the players scores once in the two next turns. This takes us back to the starting point.

The equation is just that, an equation. Moving the f(3,3) terms to the left side gives f(3,3)*(1 - 2*p*(1-p)) = p*p, so f(3,3) = p*p / (1 - 2*p + 2*p*p).

Now, once you have f(3,3), you can solve the remaining cases of f(a,b) . When a is 4, the player won, that is probability 1 to win. When b is 4, the other player won, so the first player has 0 probability to win. Else there is p probability that the game will become f(a+1,b) and (1- p) probability that it will become f(a,b+1). We just need some good-old memoization to solve the f recurrence.

The other parts of the statement are solved using similar methods: use memoization, and solve equations to handle the cycles. But it must take a while to derive everything and code it.
// Does not solve the whole problem, only finds the probability to win a game.
bool rgame[5][5];
double game[5][5];

double gameProbability(double p, int a=0, int b=0)
{
    double &res = game[a][b];
    if (!rgame[a][b]) {
        rgame[a][b] = true;
        if (a == 4) {
            res = 1;
        } else if (b == 4) {
            res = 0;
        } else if ((a == 3) && (b == 3)) {
            // how to win a deuce:
            // f = p*p + (1-p)*p*f + p*(1-p)*f
            res = (p*p) / (1 - 2*p + 2*p*p);
        } else {
            res = p * gameProbability(p, a+1, b)
                + (1-p) * gameProbability(p, a, b+1);
        }
    }
    return res;
}

//=========================================================
// I/O:
//
int main()
{
    double p;
    while ((cin >> p) && (p > -0.1)) {
        memset(rgame, 0, sizeof(rgame));
        cout.precision(11);
        cout.setf(ios::fixed, ios::floatfield);
        cout << gameProbability(p) << endl;
    }
    return 0;
}