Sunday, February 26, 2012

ACM 2012 World finals warmp up I

This contest was fun. So much that I stayed after the 5 official hours (Although I had long breaks). I accomplished the sorry amount of 2 correct problems during the 5 hours and later did 2 more.

It feels like the contest system at UVA is not aging graciously. The absence of AJAX can really be felt at times. It is also frequent to see low participation in these contests even though the problems are usually good and there is always something solvable.

Problem G: February 29
This is the first problem I felt like solving. It is really mostly an implementation conondrum. I would suggest ignoring the amount of days of each month for the most part and just considering whether a given date is before, after or equal to February 29. Once you have that, also calculate then umber of leap years less than a given year. Combining these things with ad hoc magic, you get a quite simple solution:

map<string,int> monthid; 

int fix(int m, int d)
//before, after, or equal to february 29?
if (m < 1) {
return 0;
if (m == 1 && d < 29) {
return 0;
if (m == 1) {
return 1;
return 2;

// how many leap years in years x : (0 <= x < year) ?
int before(int year)
// x: number of multiples of 4
int x = (year-1) / 4;
// y : number of multiples of 100
int y = (year-1) / 100;
// z : number of multiples of 400
int z = (year-1) / 400;

return x - y + z;


int solve(string s1, int d1, int y1, string s2, int d2, int y2)
int m1 = monthid[s1];
int m2 = monthid[s2];

int x1 = fix(m1,d1);
int x2 = fix(m2,d2);

if (x1 > 1) {
y2 ++;
if (x2 < 1) {
y2 --;
if (y1 >= y2) {
return 0;
//cout<<"["<<x1<<" ; "<<x2<<endl;
//cout << "{"<<y2<<":"<<before(y2)<<" ;; "<<y1<<":"<<before(y1)<<endl;
return before(y2) - before(y1);


void init()
const char* s[12] = {"January", "February", "March", "April", "May",
"June", "July", "August", "September", "October", "November",
for (int i=0; i<12; i++) {
monthid[s[i]] = i;

Problem J: Forwarding emails
It sometimes feels like the easiest problem in ACM contests is J. Why? I don't know. But it looked like a lot of people were solving this problem, so I opened it.

Every martian has exactly one martian to forward the email. Note that there are always cycles in the input. The solution is to first find all the cycles, for each martian that belongs to a cycle, save the amount of martians that are in the cycle. Once you do that, you can do the following dp-friendly recursive relation on the remaining martians. The maximum number of martians you can meet by emailing martian x is:
- Cycle size[x] if he belongs to a cycle.
- Or 1 + solution( sends[x] ) in the other case. (sends[x] is the martian x sends an email to.

Problem C and D: I plan to explain these later in big posts because they are interesting.

An issue I tend to have in contests hosted at UVA is that sometimes I have no idea what the problem expects me to do. For problem C, I spent a lot of time wondering if the intended solution is O(n*n), O(n*n*log(n)) or o(n*n). The humongous number of submissions I made to this problem was actually to reverse engineer the sizes of test cases so I can estimate whether some solutions were faster or not than others. Part of my issues here is that I am apparently unable to use hash_set and unable to implement a fast enough hash table that is actually faster than the n*n*log(n) solution I came up with first.

One hour before the end of the (official) match, I had to go for lunch. I came back two hours after it ended, but continued trying to solve C. I finally did it with a very hackish solution that is sort of n*n*log(n) but does a very dirty trick, more details later.

Problem D is cool because it exploits two cliches (Interval trees and calculating sums of arithmetic series in O(1)) into a problem that is actually interesting.

Problem E
Opened this during the match as it seemed popular. I am still not sure about what is a valid path here. Tried asking for clarifications, but I would say that no one actually reads those emails. This was the last problem I tried seriously, but I started to try to solve it around this morning and ended up without time to finish debugging.

Saturday, February 25, 2012

Codeforces 109 1B / 2D - Colliders

Link to statement
I kind of solved this during the contest, I just made the absurdly wrong assumption that I only needed prime factors up to sqrt(100000), this is completely wrong and stupid. A prime factor can be up to (N/2). I also did something complicated to get the number of the conflicting collider (I knew that only a collider can use a prime factor, but I forgot about that when making this operation). I have no idea why this happened.

I find this problem to share a lot of traits to the topcoder problem from the round just after. The same logic to ensure that the numbers are co-prime is very useful in both problems.

n and m can be sort of large, up to 100000. It is best to make the algorithms better than O(n*m), just for example. The following solution will work in O(m * sqrt(n)) time, so that's quite fine.

First of all, co-prime numbers. Although gcd(a,b) = 1 is the definition used in the statement, it really pays to see the problem as (No two colliders can share a prime factor). You can be sure that the maximum number of distinct prime factors a number can have is O(log(n)) (It is actually better than that). For example, try 2*3*5*7*11*13*17 , that is already larger than 100000. Thus you can assume a number will never have more than 7 distinct prime factors.

So, the algorithm is rather simple, when considering that. Assume that you have a list of all the prime factors of each of the numbers from 1 to 100000. That is an array [6][100001]. Which is perfectly fine for the memory limit. Then, let's say that you have an array that for each prime factor, sets whether it is currently in use by a collider (A collider that is a multiple of that prime exists) or not. The solution becomes an easy simulation: When turning on a collider, set the array to true for all of its prime factors. When verifying if you can turn a collider on, see for each of its prime factors if the array is already true or not. When turning a collider off, set the boolean values of its prime factors to false (This works because only one collider at the same time can be a multiple of a prime factor).

We forgot about something, the conflict message requires you to say the number of a collider that conflicts with the request. That is easy, instead of just setting leaving true or false to the array that determines used prime factors. Leave the number of the collider that uses it or -1 if no collider uses it.

Prime factors quickly
The previous solution is correct and works in O(m * log(n) ) time but it assumes we already have a list of the prime factors for each number. We need to get this list quickly.

Looping through each of the n numbers, and factorizing them, may work in time. Maybe 100000 * sqrt(100000) is fine. maybe.... If you want something faster, here is a trick that uses a slight modification to the Sieve of Eratosthenes. In the normal sieve, you mark the numbers you find to be composite as such. However, think of it, whenever you mark a composite number as false for the first time, you are doing so because you have just found the first prime factor of the number. Thus, you can modify the Sieve of Eratosthenes slightly to make it return a list of the first prime factor for each number from 2 to n (For a prime number, its first and only prime factor is itself). You can then use this list to quickly factorize any number from 2 to n:
- Extract first prime factor : q
- Divide x by q until x is no longer a multiple of x.
- Use the new value of x to find the next prime factor, and repeat until x is 1.

Friday, February 24, 2012

SRM 534: Div1 500, EllysNumbers

I have finished coding an idea I got during the morning while wasting time on a typical sub-human education system queue. This problem is fun and cute and almost makes up for the terrible Fake difficulty from the 250.

You have a list of 500 special numbers, each from 1 to 109. And also a number n from 2 to 1018, inclusive. Count the total number of subsets of the list of special numbers such that a) All the numbers in the set are pair-wise coprime. And b) The product of all the numbers in the set is equal to n.

For simplicity, let us just get rid of any special number that is not a divisor of n. If it is not a divisor, we cannot use it in the product.

This is a solution that is conceptually simple and easy to prove that runs under the 2 seconds limit. But it requires quite an amount of implementation steps. Funnily enough, after coding the solution I learned that ACRush's top score solution is basically the same, but he can implement it in far less lines. Well, I got to accept that my programs are usually longer than average. It is because I am very strict to myself in following some coding style rules I imposed myself. Now that I notice, the reasons for each of these rules could make a nice discussion for another day.

Anyway, the key to this problem is how to ensure that all the numbers picked in your subset are pairwise coprime. This time it is best not to use the gcd(a,b, ... ) = 1 definition (not directly, at least). Instead, think about factorization.. A group of numbers are pair-wise coprime if no pair of those numbers shares a prime factor.

Why prime factors? Well, for starters numbers in general don't have a lot of prime factors. Note that we are not considering the exponents when making this statement. For example 72, has in our current focus, 2 prime factors: 2 and 3. In reality it is 2*2*2*3*3, but that does not matter yet. Why doesn't it matter? Ok, since the numbers in each subset have to be coprime, you cannot pick two numbers that share a prime factor. You cannot pick 8 and 24 simultaneously. So, for each prime factor of n, you need to pick one of the special numbers. There is an extra condition, this number must have the prime factor raised to the same exponent that n needs. For example, if n = 72 = 2*2*2*3*3, you cannot pick 4 or 36 or 6 as a number in the subset. Because 36 will bring only 2*2 to the matter, but you need 2*2*2 to make 72, but you cannot use any other number that has 2 as factor again.

Now, here is the key result to find an answer what is the maximum number of different prime factors a number less than or equal to 1018 can have? Since we want to maximize the number of factors, we should do something like 2*3*5*7*... (A product of the smallest primes). Find the product of the K smallest primes that is closest to 1018. You will notice that the maximum number of distinct prime factors a number in the input can have is 14.

A typical dynamic programming problem
14... That number is rather small. Now, imagine that we already know the distinct f prime factors of n. Let us index these prime factors and assign each of them a number from 0 to f-1. Now, for each special integer, find the indexes of the prime factors of n it contains. Note that the special integer must either contain the prime factor 0 times or exactly the same number of times as n. In other case, we cannot use the special integer and we must discard it. Since we are assuming that all special integers are divisors of n, there will never be more factors.

Once the special integers have been decomposed into the prime factors of n, there is a recurrence relation that can count the number of subsets: Let f( used[], s) the number of subsets using only the s first special integers if the current list of used prime factors is given by used[]. For each instance of the recurrence, there are two choices: Either use integer # (s-1) or do not use it. In order to use the integer, it must not use any prime factor already in the used[] array. Of course, a base case is when s=0, then we need to have used all factors of n. It becomes something like this:

f(used[], s) {
res = 0
if (s == 0) {
// base case
if (All prime factors of n are in used[]) {
res = 1
} else {
res = 0
} else {
if ( (factors in s) does not intersect used[] ) {
// update the used set
// procceed to use smaller sets
res += f( used[] union (factors in s), s - 1)
// do not use (no conditions)
res += f(used[], s - 1)
return res

Now, used[] can be represented as a bitmask . If you do that, let us also represent the factors allowed by each special number as a bitmask too. To check if they intersect , just check if their intersection (used_mask & (mask of special) ) is not 0. Now that the arguments are encoded as a bitmask and a number s. We can verify the recurrence relation is acyclic, and we can use dynamic programming. This yields an algorithm that requires at most (214 * 500) in time units. A small issue is memory, since we always use only the previous value of s when calling the recurrence, we can implement this algorithm in O(214).

Details, However
The dynamic programming algorithm is a rather standard set cover algorithm. We have skipped some work though. We need to factorize n. And n can be too large even for the O(sqrt(n)) algorithm to factorize.

The trick is that the problem requires all prime factors of n to be present in one way or another in the special integers. (Note that the special numbers are at most 109, unlike n) Thus we can just factorize each special integer. The total union of all the factors found can be a factorization of n, or be short of some prime factors. But if you couldn't find enough prime factors in the special integers, we may as well return 0. Because there is no way to ever find n as a product.

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++) 
struct EllysNumbers
long long getSubsets(long long n, vector <string> sspecial)
// a) Parse the input vector<string>, make sure to remove any number that
// is not a divisor of n, because we hate them.
vector<long long> special;
string s = accumulate(sspecial.begin(), sspecial.end(), string("") );
istringstream st(s);
long long x;
while (st >> x) {
if (n % x == 0) {
set<int> setPrimeFactors;
// b) Extract prime factors from all special numbers
for_each(q, special) {
int x = *q;
// Factorize x...
for (int y = 2; y <= x/y; y++) {
if (x % y == 0) {
do {
x /= y;
} while (x % y == 0);
if (x != 1) {
// c) Get the amounts of each factor needed to build n.
// also verify that all the factors of n exist in the input.
vector<int> primeFactors( setPrimeFactors.begin(), setPrimeFactors.end() );
int neededFactors[ primeFactors.size() ];
long long z = n;
for (int j=0; j<primeFactors.size(); j++) {
cout << "PF "<<primeFactors[j]<<endl;
int & c = neededFactors[j];
c = 0;
while( z % primeFactors[j] == 0) {
z /= primeFactors[j];
c ++;
if (z != 1) {
cout << "incomplete "<<z<<endl;
// not all prime factors of n are present...
return 0;

// ========================================================
// d )
// Get the factor mask for each of the special numbers.
// However, there is a chance a special number is "incomplete".
// discard those.

int factorMask[special.size() ];
for (int i=0; i<special.size(); i++) {
int & mask = factorMask[i];
int x = special[i];
mask = 0;
for (int j=0; j<primeFactors.size(); j++) {
int c = 0 ;
while (x % primeFactors[j] == 0) {
x /= primeFactors[j];
if (c!=0) {
if (c != neededFactors[j]) {
mask = -1;
} else {
mask |= (1 << j);
// ==========================================================
// e) The dynamic programming algorithm is rather standard
// now that we have all that info.
// However, we need it in O(2^f) memory.

int f = primeFactors.size();
int s = special.size();
long long dp[2][1<<f];

memset(dp, 0, sizeof(dp) );
// The base case, when all the prime factors are done and no special ints are left
dp[0][ (1<<f) - 1 ] = 1;

for (int t = 1; t <= s; t++) {
for (int mask=0; mask<(1<<f); mask++) {
long long & res = dp[t&1][mask];
res = 0;
// use the (t-1)-th integer
if ( (factorMask[t-1] != -1) && ( (mask & factorMask[t-1]) == 0) ) {
// no intersection, use it.
res += dp[~t&1][ mask|factorMask[t-1] ];
// Do not use the (t-1)-th integer
res += dp[~t&1][mask];

return dp[s&1][0];


Wow, that was long, I hope someone was able to understand it.

Thursday, February 23, 2012

SRM 534: Notes that contradict the problem statement

I had class till 9:00 (TC time), so I wouldn't be able to start the match till 9:35.

I opened the medium problem, I think I could have used those extra 30 minutes to solve it. I got to the point I can get a list of all the relevant divisors of n quite quickly. (Relevant divisors would be those that can be made after multiplying the special integers together). After that though there are some doubts. According to the internet, the bound on the number of divisors is math.exp( math.log(10**18) / math.log( math.log(10**18) ) ) = 68075. There are probably a lot of things to consider. And the relevant divisors might as well be much less. But the last test case already has 4000+.

The div1 250 was lame. And I mean it, lame. It is becoming quite common to do bitmask dp problems in div1 250 that can also be solved quickly by guessing a solution that is not easy to prove. I guess the point is to give time advantage to people who don't prove solutions before submitting. I decided to play safe and just do bitmask dp. (By the time I opened 250 I already knew there were a lot of solutions for 500, and that I didn't have time to think of something for it, I also knew that I needed a fast submission in order to save the match.) So I coded the bitmask dp. I am confident at bitmask dp, I got a 238 score, which would of have been really fine.

Unfortunately, the problem included a very stupid corner case. The problem statement says that checkers that reach the right most cell are removed. keyword is "reach". As I was coding my heavy dp solution and busy doing bit operations I coded explicitly the story in the problem statement. So I made checkers go away whenever they reach the last cell. A checker that is initially in the right most cell does not really "reach" it. It turns out a note saying the opposite was hidden in the notes section. But I don't usually expect notes to contradict the statement. It would have been much, much better problem design to make the constraints unable to have a checker initially in the right most cell. A good compromise would be to include "oo" in the examples. Or if Fake difficulty is really such a priority, at least make the statement say explicitly that checker on the rightmost cell are removed before any turn, instead of unnecessarily adding ambiguities by using the word 'reach'.

Update: Contradict is a heavy word. But notes should be there to clarify ambiguity. Not to add new things to the statement. The idea of (You must remove the checker on top of the rightmost cell at the beginning of the game) is a complete new step to do. There's also the fact of how this corner case was not necessary at all - Instead of asking coders to always ignore part of the input, just make it so that part of the input is impossible to happen.

I figured this out just as intermission started, so I couldn't fix it. My only hope was challenges, and maybe other people made the same mistake in my room. Seems not. I initially thought someone in my room made the mistake, it turns out that I am just blind as usual. It is always better not to challenge.

Anyway, the -25 is likely to cause me to lose all rating gained in the last three months. The unsuccessful challenge is entirely my fault. But the 0.00 score in the 250, I will blame 80% on the urge to add useless corner cases to problems. 10% is my responsibility for not reading notes, I was actually in such a rush that I didn't notice the problem had notes, and 10% because I knew that the problem likely had an easy to do reduction that didn't need bitmasks, and it turns out it did. But I preferred to play safe and do bitmasks.

Explanation: Div1 250: The one with the checkers and the board

You have a row of cells. Some cells contain one checker. Two players play the following game: In each turn, you can move a checker one step to the right, if that cell is empty. OR, if there are two checkers in the two cells next to it, you can also make the checker jump three spaces to an empty cell. If at any point of the game (including the start) there is a checker in the last position of the board, it will get removed.

Yes, sure, you can do bitmask dp. But there is also an easy solution that involves this:

- For each checker that is not in the right-most cell, get the distance between this checker and that cell. If the sum of these distances is even, the first player loses. Else she wins.

The trick to this finding and proving this solution is to notice that moving a cell from its position i to position i+1, or to position i+3 (a jump) will do exactly the same thing to the parity of the sum of the distances - It will toggle the parity. Thus no matter what move you do in a turn, the parity of the sum of distances will always change.

The obvious losing position (An empty board) has a the sum of distances equal to 0, which is even. Consider all positions in which the sum of distances is 1. You can always make at least one move that reduces this sum to 0, and causes next player to be in a losing position. When the sum is 2, you can only reduce by 1, and take the next player to a winning position. When the sum is 3, you can only reduce by 1 and maybe by 3 by jumping something (This is only the case for higher sum values, but as you'll see this does not matter) - Whatever you do, the parity of the new sum will be even, and thus 3 is also a winning position.

From there, you can conclude that all cases in which the sum of distances is odd are winning positions. Else all cases in which the sum of the distances is even is a losing position.

Monday, February 20, 2012

Codeforces round #108

I participated unofficially because of lame holiday. Since it was unofficial I dind't rush too much in the first three problems. It was a mistake because problem D and E I think are solvable if I give them enough time. I am starting to write this 3 minutes before the end of the match, because although I am sure my code for D works algorithmically, I doubt I can finish it in three minutes.

Problem A (Marks) - Problem statement

This is the typical div2 easy. Not much to do here besides implementation. Just find, for each subject, the maximum mark, then set students that have this mark in that subject as successful. Return the total number of students you set as successful.

int n, m; 
char grades[100][100];
int countSuccessful()
bool success[n];
fill(success, success+n, false);
for (int i=0; i<m; i++) {
char b = '0';
//find maximum grade for this subject:
for (int j=0; j<n; j++) {
b = std::max(b, grades[j][i]);
//find succesful students for this subject.
for (int j=0; j<n; j++) {
if (grades[j][i] == b) {
success[j] = true;

//The total number of succesful students.
return count(success, success+n, true);

Problem B (Steps) - Problem statement
K and the coordinates can be quite large. So, the main trick to this problem is to do it in O(K) steps. In order to do this, we need to find the maximum steps of a given direction we can do in O(1) (Based on your current location and n and m).

You can actually treat dx and dy separately, and find the maximum for each coordinate. Then take the minimum of the maximums you found and that is the maximum number of steps overall. So, you have your current coordinate z, the direction dz, and t (the maximum value for the coordinate). Imagine for now that dz is positive. The problem becomes to find a maximum s such that: z + s * dz <= t. Solve this in-equation and you will find a formula like s <= (t - z) / dz . Since s has to be a integer, perform a integer division. When dz is negative, use the same process. In fact, what I did was just to convert the negative case to positive and use the same logic (you just need to reverse z, see code).

typedef long long int64; 
#define long int64

long n,m;
int k;
long x, y;
long dx[10000];
long dy[10000];

const long INF = 1000000000000ll;

// Returns the maximum number of steps possible if your current
// coordinate is z, the maximum value of the coordnate is t (the minimum is 1)
// and the direction is dz.
long maxSteps(long z, long dz, long t)
if (dz == 0) {
return INF;
} else if (dz < 0) {
return maxSteps(t - z + 1, -dz, t);
} else {
//max s: (z + s*dz <= t)
// ( s*dz <= t - z )
// ( s <= (t - z) / dz
return (t - z) / dz;
long numSteps()
long total = 0;
for (int i=0; i<k; i++) {
long xsteps = maxSteps(x, dx[i], n);
long ysteps = maxSteps(y, dy[i], m);

long s = std::min(xsteps, ysteps);
x += s * dx[i];
y += s * dy[i];
total += s;

return total;
inline void init(){}

Problem C (Pocket book) - Problem statement
This one is mostly to think things through. You can swap the prefixes of any size between any pair of names. In fact, this overall means that for each character position in the names, you can pick any letter in this position available in any of the names. This means that for each position, there are (number of different letters in the position) choices. And these choices are actually independent. So, just multiple the number of choices for each position and you find the result.

Do you want a informal demonstration? Let us say we have the following names:


According to our logic, the string 1BZ4 is possible. A strategy to reach this name, is to first pick the name that has 4 in the last position, and swap the entire strings:


Now we want a Z in the third location. Let us just pick the string that has Z in that position and swap only the first three characters:


From here, you can continue doing it. Pick "AB" and place them in the first name. Then pick "1".

typedef long long int64;
#define long int64

const int MOD = 1000000007 ;
int n, m;
string names[100];

int countNames()
long c = 1;
for (int i=0 ; i < m ; i++) {
//Count the number of different letters we can use in this position
//a set is great for this:
set<char> st;
for (int j = 0; j < n; j++) {
c = (c * st.size()) % MOD;
return (int)c;

Problem D (Frames) - Problem statement
Now this is an evil problem. Note that the coordinates mean that you need a O(n*m) solution. I could think of so many possible ways to fail it that it seemed risky. So I switched to E. Then I solved this problem whilst trying to find a solution for problem E.

Well, in fact, it is easy to find a solution for D. It is just difficult to find a solution that you can code without making your head explode and without much risk to make a wrong assumption and fail it. For example, you could always make code that detects each possible case. That will require you to do something short of 8 different solutions, each with something complicated.

My solution, which is still quite long to code involves noticing that there will be at most 7984 painted squares in a valid case. (Try it, in a 1000x1000 board paint as many squares by following the rules). Also, the top-left painted cell should forcefully be the top-left corner of at least one frame.

There are now only two possible cases: 1) The top-left painted cell is the corner of two rectangles (they could possibly overlap completely and thus it will look as one rectangle). or 2) This cell is the corner of only one of the rectangles (There is another top-left corner.)

A solution that needs less than 8000*8000 steps to detect case 1): After you find the top-left cell, try all 8000*8000 pairs of possible bottom-right corners you can find. You need a way to verify that a pair of top left and bottom right corners forms a valid rectangle. For this, it is best to use accumulated sums of the number of painted cells in the rectangle to calculate the number of cells inside a rectangle of squares in O(1). Subtract the middle rectangle from the smaller rectangle to find the number of cells in the border of the "frame". If this is equal to the perimiter, then this is a valid frame. If you find two corners that make a valid frame with the top left corner (Note that these two bottom-right corners might be equal) then you still have to verify that the cells that make these two frames are the only painted cells in the paper sheet. You will have to find a function that calculates the sum of the perimeters of two rectangles. That requires you to find the intersection of the perimeters of the rectangles, this is possible, but should take some case matching... (This is the reason I halted coding the solution).

Test case 2) Is similar. You first need to find the one unique bottom right cell that makes a valid frame with the top-left cell. After that, you can find the two corners of the other rectangle. This is once again at most 8000*8000 tries. You will need to reuse the code from case 1).

I am very sure there is an easier approach for problem D.

Problem E - Problem statement
This problem is similar to finding those trees that are like the minimum spanning trees but can use any intermediary points. This problem needs some sort of exponential solution. Which is likely confirmed by the small constraints of K. Also note that the size of the smallest dimension is at most 14 (Because n*m is at most 200). The solution likely abuses these constraints. Had to stop working in this problem when I thought I had a simple solution for D.

What do you think?
I liked the match until problem C. I think D is too convoluted. A lot of people preferred to skip to E, which seems nice.