Friday, February 24, 2012

SRM 534: Div1 500, EllysNumbers

I have finished coding an idea I got during the morning while wasting time on a typical sub-human education system queue. This problem is fun and cute and almost makes up for the terrible Fake difficulty from the 250.

EllysNumbers
You have a list of 500 special numbers, each from 1 to 109. And also a number n from 2 to 1018, inclusive. Count the total number of subsets of the list of special numbers such that a) All the numbers in the set are pair-wise coprime. And b) The product of all the numbers in the set is equal to n.

For simplicity, let us just get rid of any special number that is not a divisor of n. If it is not a divisor, we cannot use it in the product.

This is a solution that is conceptually simple and easy to prove that runs under the 2 seconds limit. But it requires quite an amount of implementation steps. Funnily enough, after coding the solution I learned that ACRush's top score solution is basically the same, but he can implement it in far less lines. Well, I got to accept that my programs are usually longer than average. It is because I am very strict to myself in following some coding style rules I imposed myself. Now that I notice, the reasons for each of these rules could make a nice discussion for another day.


Anyway, the key to this problem is how to ensure that all the numbers picked in your subset are pairwise coprime. This time it is best not to use the gcd(a,b, ... ) = 1 definition (not directly, at least). Instead, think about factorization.. A group of numbers are pair-wise coprime if no pair of those numbers shares a prime factor.

Why prime factors? Well, for starters numbers in general don't have a lot of prime factors. Note that we are not considering the exponents when making this statement. For example 72, has in our current focus, 2 prime factors: 2 and 3. In reality it is 2*2*2*3*3, but that does not matter yet. Why doesn't it matter? Ok, since the numbers in each subset have to be coprime, you cannot pick two numbers that share a prime factor. You cannot pick 8 and 24 simultaneously. So, for each prime factor of n, you need to pick one of the special numbers. There is an extra condition, this number must have the prime factor raised to the same exponent that n needs. For example, if n = 72 = 2*2*2*3*3, you cannot pick 4 or 36 or 6 as a number in the subset. Because 36 will bring only 2*2 to the matter, but you need 2*2*2 to make 72, but you cannot use any other number that has 2 as factor again.

Now, here is the key result to find an answer what is the maximum number of different prime factors a number less than or equal to 1018 can have? Since we want to maximize the number of factors, we should do something like 2*3*5*7*... (A product of the smallest primes). Find the product of the K smallest primes that is closest to 1018. You will notice that the maximum number of distinct prime factors a number in the input can have is 14.

A typical dynamic programming problem
14... That number is rather small. Now, imagine that we already know the distinct f prime factors of n. Let us index these prime factors and assign each of them a number from 0 to f-1. Now, for each special integer, find the indexes of the prime factors of n it contains. Note that the special integer must either contain the prime factor 0 times or exactly the same number of times as n. In other case, we cannot use the special integer and we must discard it. Since we are assuming that all special integers are divisors of n, there will never be more factors.

Once the special integers have been decomposed into the prime factors of n, there is a recurrence relation that can count the number of subsets: Let f( used[], s) the number of subsets using only the s first special integers if the current list of used prime factors is given by used[]. For each instance of the recurrence, there are two choices: Either use integer # (s-1) or do not use it. In order to use the integer, it must not use any prime factor already in the used[] array. Of course, a base case is when s=0, then we need to have used all factors of n. It becomes something like this:


f(used[], s) {
res = 0
if (s == 0) {
// base case
if (All prime factors of n are in used[]) {
res = 1
} else {
res = 0
}
} else {
if ( (factors in s) does not intersect used[] ) {
// update the used set
// procceed to use smaller sets
res += f( used[] union (factors in s), s - 1)
}
// do not use (no conditions)
res += f(used[], s - 1)
}
return res
}


Now, used[] can be represented as a bitmask . If you do that, let us also represent the factors allowed by each special number as a bitmask too. To check if they intersect , just check if their intersection (used_mask & (mask of special) ) is not 0. Now that the arguments are encoded as a bitmask and a number s. We can verify the recurrence relation is acyclic, and we can use dynamic programming. This yields an algorithm that requires at most (214 * 500) in time units. A small issue is memory, since we always use only the previous value of s when calling the recurrence, we can implement this algorithm in O(214).


Details, However
The dynamic programming algorithm is a rather standard set cover algorithm. We have skipped some work though. We need to factorize n. And n can be too large even for the O(sqrt(n)) algorithm to factorize.

The trick is that the problem requires all prime factors of n to be present in one way or another in the special integers. (Note that the special numbers are at most 109, unlike n) Thus we can just factorize each special integer. The total union of all the factors found can be a factorization of n, or be short of some prime factors. But if you couldn't find enough prime factors in the special integers, we may as well return 0. Because there is no way to ever find n as a product.

#define for_each(q, s) for(typeof(s.begin()) q=s.begin(); q!=s.end(); q++) 
struct EllysNumbers
{
long long getSubsets(long long n, vector <string> sspecial)
{
//===========================================
// a) Parse the input vector<string>, make sure to remove any number that
// is not a divisor of n, because we hate them.
vector<long long> special;
{
string s = accumulate(sspecial.begin(), sspecial.end(), string("") );
istringstream st(s);
long long x;
while (st >> x) {
if (n % x == 0) {
special.push_back(x);
}
}
}
set<int> setPrimeFactors;
//======================================================
// b) Extract prime factors from all special numbers
for_each(q, special) {
int x = *q;
// Factorize x...
for (int y = 2; y <= x/y; y++) {
if (x % y == 0) {
setPrimeFactors.insert(y);
do {
x /= y;
} while (x % y == 0);
}
}
if (x != 1) {
setPrimeFactors.insert(x);
}
}
//=============================================================
// c) Get the amounts of each factor needed to build n.
// also verify that all the factors of n exist in the input.
vector<int> primeFactors( setPrimeFactors.begin(), setPrimeFactors.end() );
int neededFactors[ primeFactors.size() ];
long long z = n;
for (int j=0; j<primeFactors.size(); j++) {
cout << "PF "<<primeFactors[j]<<endl;
int & c = neededFactors[j];
c = 0;
while( z % primeFactors[j] == 0) {
z /= primeFactors[j];
c ++;
}
}
if (z != 1) {
cout << "incomplete "<<z<<endl;
// not all prime factors of n are present...
return 0;
}

// ========================================================
// d )
// Get the factor mask for each of the special numbers.
// However, there is a chance a special number is "incomplete".
// discard those.
//

int factorMask[special.size() ];
for (int i=0; i<special.size(); i++) {
int & mask = factorMask[i];
int x = special[i];
mask = 0;
for (int j=0; j<primeFactors.size(); j++) {
int c = 0 ;
while (x % primeFactors[j] == 0) {
x /= primeFactors[j];
c++;
}
if (c!=0) {
if (c != neededFactors[j]) {
mask = -1;
} else {
mask |= (1 << j);
}
}
}
}
// ==========================================================
// e) The dynamic programming algorithm is rather standard
// now that we have all that info.
// However, we need it in O(2^f) memory.

int f = primeFactors.size();
int s = special.size();
long long dp[2][1<<f];

memset(dp, 0, sizeof(dp) );
// The base case, when all the prime factors are done and no special ints are left
dp[0][ (1<<f) - 1 ] = 1;

for (int t = 1; t <= s; t++) {
for (int mask=0; mask<(1<<f); mask++) {
long long & res = dp[t&1][mask];
res = 0;
// use the (t-1)-th integer
if ( (factorMask[t-1] != -1) && ( (mask & factorMask[t-1]) == 0) ) {
// no intersection, use it.
res += dp[~t&1][ mask|factorMask[t-1] ];
}
// Do not use the (t-1)-th integer
res += dp[~t&1][mask];
}
}

return dp[s&1][0];

}
};



Wow, that was long, I hope someone was able to understand it.

1 comment :

Test said...

Thanks