
Thursday, September 10, 2015

One handed programming and other things you didn't expect to read about in this blog

Hello all,

It's really been a while since I updated this blog. I've been having trouble balancing the work demands of writing editorials for TopCoder with everything else.

Tomorrow we have a new SRM, 667. It's going to be a very special SRM for me, although not for happy reasons. You see, last September 1st a car hit me and I got a shoulder fracture. It was described as a lucky outcome, and it seems conservative therapy, which consists of immobilization and waiting for the bone to heal on its own, is the best approach. As a result, I will use only one hand, my left hand (I am right-handed), in tomorrow's SRM. Let's see what happens.

Of course my real concern is how this will affect my performance writing the editorial. Lately, after finally dealing with the old delayed problems that had stacked up since last year, I've been really trying to get the editorials back to their former quick-publishing glory. It kind of worked with 666's editorial. But of course the issue at hand might slow me down. I am optimistic, however: I was able to solve and write the editorial for the TopCoder Open round 2D after the accident. I just needed longer breaks after each problem.

In happier news: I have recently started a Patreon. It focuses on my computer art and twitter anti-spam work. I wonder if I can expand it to explanations of algorithmic problems? Maybe.

Monday, April 27, 2015

SRM 657

I guess I just solved the div1 easy. I tried the div1 medium: modulos, modulos, modulos. But I did finish the div1 easy editorial; you can find it at http://apps.topcoder.com/wiki/display/tc/SRM+657. Or here, because really, why not:

The problem consists of five variables: E, EM, M, MH, H; they are the numbers of problems you can use in problem sets. Each problem set needs an Easy, a Medium and a Hard problem. E, M and H problems MUST be used as easy, medium or hard, respectively. There are EM problems that can be either easy or medium and MH problems that can be either medium or hard. What is the maximum number of problem sets you can make without repeating any problem? Oh, and each variable is an integer up to `10^18`.

Two decisions are involved in this problem:

  • Of the EM problems that can be easy or medium, how many will be assigned as medium? Let that number be `a`
  • Of the MH problems that can be medium or hard, how many will be assigned as medium? Let that number be `b`

When `a` and `b` are known, we can calculate the number of problems of each difficulty we have available:

  • `E + ("EM" - a) ` easy problems.
  • `M + a + b` medium problems.
  • `H + ("MH" - b) ` hard problems.

To have a problem set we need one of each kind of problems, so the maximum problem sets is `min(E + "EM" - a, M + a + b, H + "MH" - b)`.

Imagine we wanted to see if it is possible to have `x` problems in total:

  • If `E + "EM" < x`, no matter what value of `a` we pick, the number of easy problems would be too small.
  • If `E + "EM" >= x`, then there are enough easy problems unless `a` becomes too large. The maximum allowed value of `a` is `M_a = E + "EM" - x`. Also, `a` cannot be larger than `"EM"`.
  • Similarly, `H + "MH"` must be at least `x`, and `M_b = H + "MH" - x` is the maximum allowed value for `b`.
  • We just need `M + a + b` to be at least `x`. Imagine we take the maximum allowed values: if `M + M_a + M_b >= x`, then `x` is a valid number of problem sets.

Now note that if a value of `x` is allowed, then so are all smaller values, and if a value of `x` is not allowed, no value larger than `x` is allowed. So we can use binary search to find the maximum allowed value of `x`; this is the maximum possible number of problem sets.

    
bool allowedX(long E, long EM, long M, long MH, long H, long x)
{
    if (E + EM < x) {
        return false;
    }
    long aa = std::min(EM, E + EM - x); //maximum a
    if (H + MH < x) {
        return false;
    }
    long bb = std::min(MH, H + MH - x); //maximum b
    return (M + aa + bb >= x);
}

long maxSets(long E, long EM, long M, long MH, long H)
{
    // We have to maximize:
    // min(E + EM - a, M + a + b, H + MH - b)
    // where 0 <= a <= EM , 0 <= b <= MH
    long lo = 0; // we know 0 is allowed
    long hi = H + MH + 1; // we know this will never be allowed (not enough hard problems)
    while (lo + 1 < hi) {
        long x = (lo + hi) / 2;
        if (allowedX(E,EM,M,MH,H, x)) {
            lo = x;
        } else {
            hi = x;
        }
    }
    return lo;
}

That was the solution in the editorial, but I prefer my lambda solution:

long maxSets(long E, long EM, long M, long MH, long H)
{
    // We have to maximize:
    // min(E + EM - a, M + a + b, H + MH - b)
    // where 0 <= a <= EM , 0 <= b <= MH
    return BinarySearch<long>( 0LL, H + MH + 1,  [&](long x) {
            if (E + EM < x) {
                return false;
            }
            long aa = std::min(EM, E + EM - x); //maximum a
            if (H + MH < x) {
                return false;
            }
            long bb = std::min(MH, H + MH - x); //maximum b
            return (M + aa + bb >= x); 
    });
}

It uses this Binary Search library function because it is funny to use functions as arguments.

// Library code
// takes lo, hi, P() such that:
// P(lo) is true
// P(hi) is false
// If P(x) is true and (x > y) then P(y) is true
// If P(x) is false and (x < y) then P(y) is false
// returns the maximum x such that P(x) is true
template<class D> D BinarySearch(D lo, D hi, std::function<bool(D)> P) {
    while (lo + 1 < hi) {
        D ha = lo + (hi - lo) / 2;
        if (P(ha)) {
            lo = ha;
        } else {
            hi = ha;
        }
    }
    return lo;
}

Saturday, April 11, 2015

TCO 2015 Round 1A

I guess I advanced? Unless there is an implementation bug in my algorithms, which is quite possible. I know they are theoretically correct.

While solving the first two problems I kept having the feeling I was doing things in an overcomplicated way.

250 points: The one with digits

You are given `1 <= L < R <= 100000`; find a pair `L <= a < b <= R` such that `a` and `b` have as many digits in common as possible, and return that number of common digits (repeated digits count once).

So, for a 250 in an "easy" round, this is not like the div2 easies. You can't just try all pairs `(a,b)`. In fact, the solution I used is quite complex (not hard, complex) and sounds like a division 2 hard thing?

The numbers will have at most 5 digits, so each number `L <= i <= R` has at most `2^5 = 32` subsets of digits. The idea is to try them all and check if any previous number had that subset of digits. Hey, it works.

// extract digits from x, call f(d) for each digit d we find.
void forEachDigit(int x, std::function<void(int)> f)
{
    while (x > 0) {
        f(x % 10);
        x /= 10;
    }
}

int maxsim(int L, int R)
{
    vector<bool> exists(1024, false);
    int best = 0;
    for (int i = L; i <= R; i++) {
        // create a bit mask of the digits in i:
        int mask = 0;
        forEachDigit(i, [&](int d) {
                mask |= (1 << d);
        });
        // for each subset of i's digits:
        for (int smask = mask; smask != 0; smask = (smask - 1) & mask) {
            // did we see this subset before?
            if (exists[smask]) {
                // if so, consider for maximum
                best = std::max(best, __builtin_popcount(smask) );
            }
            exists[smask] = true;  //whoops
        }
    }
    return best;
}

I did a wrong thing in my first submission: instead of marking exists[smask] = true for each subset, I was marking exists[mask] = true for just the full set...

500: The one with simulation

There is a graph in which all vertices have exactly one out-edge. You initially place tokens, at most one per vertex. In each step of the game, each token moves to the vertex pointed at by its out-edge. If at any time there are two tokens in the same vertex, you lose. Count the number of ways to place tokens without losing. The steps are repeated `K <= 100000000` times.

Once two tokens end up in the same vertex, they will never separate again. So we can, for example, simulate with a token on each vertex. At the end of the simulation, some vertices will have multiple tokens. The trick is to notice that each of these tokens represents an initial vertex. If there are 4 tokens in a single vertex, it means there are 4 vertices that, together, may initially hold at most one token. So multiply, over all vertices at the end of the game, (number of tokens in the vertex + 1).

The problem is that `K` can be pretty large. I used something similar to matrix exponentiation to simulate. But I think it might be possible to prove that you don't need too many simulation steps?
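On that last musing: it seems plausible that merges can only happen during the first `N` steps, because after `N` steps every token has entered its cycle, and tokens on a cycle keep their relative positions forever. Under that conjecture (which I haven't proven rigorously), simulating `min(K, N)` steps directly would suffice. A sketch, not what I submitted, with the graph given as a 1-indexed out-edge list like in the problem:

```python
MOD = 1000000007

def wayscnt_simple(a, K):
    # tokens[v] = set of starting vertices whose token currently sits at v.
    # Conjecture: two tokens that ever meet must meet within the first
    # len(a) steps (once every token is on its cycle, relative positions
    # are fixed), so simulating min(K, len(a)) steps should be enough.
    n = len(a)
    tokens = [set([v]) for v in range(n)]
    for _ in range(min(K, n)):
        nxt = [set() for _ in range(n)]
        for v in range(n):
            nxt[a[v] - 1] |= tokens[v]  # a is 1-indexed like in the problem
        tokens = nxt
    res = 1
    for group in tokens:
        res = res * (len(group) + 1) % MOD
    return res
```

For instance, with `a = [2, 2]` (both vertices point at vertex 2) and large `K`, both tokens merge, so only 3 of the 4 placements are safe.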

const int MOD = 1000000007;


// merge two simulation vectors
vector<long> merge(const vector<long> &A, const vector<long> &B)
{
    int N = A.size();
    vector<long> R(N);
    
    for (int i = 0; i < N; i++) {
        R[i] = 0;
        for (int j = 0; j < N; j++) {
            if ( (1LL << j) & B[i] ) {
                R[i] |= A[j];
            }
        }
    }
    
    return R;
}

int wayscnt(vector<int> a, int K)
{
    int N = a.size();
    // The "identity" vector: each token starts at its own vertex
    vector<long> I(N);
    for (int i = 0; i < N; i++) {
        I[i] = (1LL << i);
    }
    // B represents what happens after a single step:
    vector<long> B(N);
    for (int i = 0; i < N; i++) {
        B[i] = 0;
        for (int j = 0; j < N; j++) {
            if (a[j] - 1 == i) {
                B[i] |= (1LL << j);
            }
        }
    }
    // P[0] is what happens after one step
    // P[1] is what happens after two steps
    // P[2] is what happens after four steps
    //...
    // P[i] is what happens after 2^i steps 
    const int MAX = 30;
    vector<vector<long>> P(MAX, vector<long>(N) );
    P[0] = B;
    for (int i = 1; i < MAX; i++) {
        P[i] = merge(P[i-1], P[i-1]);
    }
    // with that in mind, we can find what happens after any number of steps
    vector<long> x = I;
    for (int j = 0; j < MAX; j++) {
        if ( (1<<j) & K ) {
            x = merge(x, P[j]);
        }
    }
    
    long res = 1;
    for (long y: x) {
        res = (res * (__builtin_popcountll(y) + 1 ) ) % MOD;
    }
    
    return (int)(res % MOD);
}

1000: The one with bipartite matching

Find the minimum total weight of edges to remove from a bipartite graph so that there is no perfect bipartite matching. o_O.

According to this thesis: ItayGonshorovitzThesis.pdf, if we reduce the bipartite matching problem to flow and try to find the minimum cost to reduce the flow, that problem is NP-hard. Although with such a low constraint for `n`, maybe non-polynomial IS intended. I think the thesis has interesting reductions for the problem, so maybe it helps?

Saturday, December 27, 2014

SRM 643

div1 250: The one with factorization

Factorize `2 <= N <= 10^18`, but half of the vector containing the sorted prime factors is given to you in the input (the even-indexed ones). For example, 18 = 2*2*3; the prime factors vector is {2,2,3}, which means the input will be 18, {2,3}. Get it?

Factorization is supposed to be a hard problem. And for `N <= 10^18` it is, especially given the constraints: not so much the 2 seconds execution constraint, but the 256 MB memory constraint, the available time to code stuff and not being able to save useful stuff to the hard drive before running each test case... I mean, the *nix `factor` command can certainly factorize some large numbers, but you are supposed to do things the manual way here rather than reuse code...

So factorizing `N` is out of the question, maybe that's why it has to include some factors in the input. My first instinct was to think, well, just divide `N` by the known prime factors, the remaining value `n` shall be small enough to factorize, right? This is false. Just imagine `N` equal to 2 multiplied by a large prime number close to `10^18 / 2`. The large prime number would be a bit too large to do the `O(sqrt(N))` factorization by trial division algorithm we normally use.

My second idea was to do some ad hoc things. If only one prime factor is provided, the number of prime factors is either 1 or 2, both easy cases to solve (the result is either `{"primes"[0]}` or `{"primes"[0], N/"primes"[0]}`). But this doesn't help: cases in which the input has more than one factor can still leave a rather large `n` after dividing: `999999999999999896 = 2 * 2 * 2 * 124999999999999987`.

So the solution is based on the following logic: the very last prime in the factorization can be rather large, `O(N)` large, but if it really is that large, it will be the only such large prime in the factorization. If this prime is in an odd position, it is not given to us, and the value that remains after dividing `N` by the known factors will be large because it still includes this prime. So the idea (which may not be 100% correct) is to just do trial division up to `sqrt(2000000000)` and then, if something larger than 1 remains after all that dividing, what's left ought to be a prime number, right? I have no idea how good this is.

typedef long long int64;
#define long int64
struct TheKingsFactorization
{
    vector<long> getVector(long N, vector<long> primes)
    {
        long n = N;
        for (long p: primes) {
            n /= p;
        }
        //for (long p = 2; p * p <= 2000000000LL; p++) {
        for (long p = 2; p * p <= n; p++) {
            while (n % p == 0) {
                n /= p;
                primes.push_back(p);
            }
        }
        if (n > 1) {
            primes.push_back(n);
        }
        sort(primes.begin(), primes.end());
        
        return primes;
    }
};
#undef long

I suspect there will be many challenges in the challenge phase. I prepared some cases, though it's unlikely I'll get to use them, because Petr is in my room. I finished typing this as the coding phase ended. Let's see what happens.

Tuesday, December 09, 2014

SRM 640 : Do I still have write access here?

It's SRM 640; that means there have been 140 SRMs since SRM 500. Remember SRM 500? It was fun.

Div1 250: The one with Trees

You have a bunch of stars with different colors and ribbons to connect the stars, so that the ribbons and stars make a tree (connected, no cycles). Only some specific pairs of stars may be connected. Return the minimum number of ribbons connecting two stars of the same color that we must use.

So... isn't this the very definition of the minimum spanning tree problem? You need to find an undirected tree, with a minimum sum of used edge costs. Just make an edge cost 1 if the two stars it connects have the same color.

So just implement Prim's or Kruskal's Minimum Spanning Tree algorithm. I went for Kruskal because I am odd.

class ChristmasTreeDecoration:
 def solve(self, col, x, y):
    n = len(col)
    # I like python for things like this, create a list of edges (c,u,v) c is cost
    edges = [ ( int(col[i-1] == col[j-1]), i-1, j-1) for (i,j) in zip(x,y) ]
    
    # Precarious union find thingie 
    parent = range(n)
    def getParent(x):
        while parent[x] != x:
            (x, parent[x]) = (parent[x], parent[parent[x]]) 
        return x
    
    # Kruskal:
    total = 0;
    for (cost, u,v) in sorted(edges):    #sorting edges by cost
        if getParent(u) != getParent(v): #distinct components
            parent[getParent(u)] = v     #connect them
            total += cost                # add cost
    return total

After coding this and realizing it is super short and elegant, I stared at the screen for a while, skeptical that this was so easy. 250s lately are harder than this. I also doubted for a while whether my getParent(x) function was bugged. Normally I use a stack to make every visited x point directly to the super grand parent, but this path-halving version seems to work fine.
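For comparison, the stack-based full path compression I usually write looks something like this (a sketch for illustration, not the contest code; `getParentFull` is my name for it):

```python
def getParentFull(parent, x):
    # Walk up to the root, remembering the path, then point every
    # visited node directly at the root (full path compression).
    path = []
    while parent[x] != x:
        path.append(x)
        x = parent[x]
    for y in path:
        parent[y] = x
    return x

# A chain 0 -> 1 -> 2 (2 is the root):
parent = [1, 2, 2]
root = getParentFull(parent, 0)  # root is 2; parent becomes [2, 2, 2]
```

Both variants give near-constant amortized time in practice; path halving just does the flattening inside the lookup loop with less code.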

Div1 550: The one with bipartite graphs

You are given `n_1, n_2, a, d`, with `n_1, n_2 <= 1000000000`. Return the maximum number of edges in a graph with `n_1 + n_2` vertices, where each vertex has degree at least `d` and the maximum bipartite matching is `a`.

So yeah... this problem exists. I am sure there has to be some sort of solution.

Div1 1000:

Another problem that probably has a solution. My hunch is that this one is rather easy once you do plenty of math.

Challenge phase

Nothing to report.

Let's see if that 250 survives.

Saturday, August 30, 2014

SRM 631 : A new arena

Remember when I wrote blog posts periodically? Remember when I actually had energy to have an explanation ready just by the end of the match? Remember when editorials weren't super late? It was a nice time for sure.

Participated in TopCoder SRM 631. It was a special match because earlier this week the web version of the TopCoder Arena was released (in beta). There is actually a thing in which, if you use the web arena in SRM 631 or 632 and fill out a survey, you can maybe, if you are super lucky, maybe, win a ticket to the TCO. So I decided to use it in these matches. The new arena currently lacks many features, and I have the impression that there were a few wasted opportunities in how it was designed. I will focus this post on what I experienced with the new arena.

Registration

I was very busy in the morning finishing the first part of the SRM 630 editorial (log in required, which is terrible). I've had a very bad streak with my editorial work. At first I was actually solving the problems in SRM 629 quite fast, even figured out the div1 medium by myself and all, but the div2 hard got me stuck for so long that SRM 630 started. I made the foolish decision to finish 630 before going back to work on 629, thinking that at least then only one editorial would be late. That was a very bad decision, because it turns out that the div2 hard and the div1 medium were the same problem. One with a very easy solution that I couldn't understand for over a week, not helped by a paper that made me feel dizzy when reading it...

The short story is, I was busy and distracted, so when it came time to register for the match, I forgot to use the web arena (I was already using the Java one, because I needed to test the SRM 630 problems and the web arena doesn't have practice or plugins). Lucky, because it turns out that people who tried to register using the web arena had issues.

One of the use cases I had in mind for the web arena was the ability to register for a match from a mobile device. You cannot run the Java Arena on Android or iOS. Unfortunately, my attempts to load the arena in Android Firefox were futile, and I didn't have time to unfreeze the Google Chrome Android app so I could test it.

I really hope that whatever issue is making it so hard for the web app to load in mobile browsers (or maybe just mobile Firefox) eventually stops being a problem. I will make sure to report once I have more info.

The match

I opened the web arena and looked for a way to enter my room. In the competition room, you meet this:

It actually took me a while to find the problem links. They are on the right. The web arena actually shows you the problem names before you open the problems. This has interesting consequences. Remember that the submission score starts to drop the instant you open a problem, so in TopCoder matches we tend to solve one problem at a time, and usually it is best to start with the easiest (lowest score) problem first. Some problem names are very descriptive, however, so if a problem name hints that the solution will be too easy or too hard for you, the names are useful information. Or maybe misinformation, if the problem name is not that good. Also, in some cases the problem name can give you a hint about who the problem's writer is...

I don't really like that you need to click "Unopened" to open a problem. It is just not intuitive. I would have preferred an [open] link/button and some text below the problem name to indicate whether you already opened it. Optimal would be to show the problem score, decreasing in real time.

I open the div1 easy problem; remember that I am in low energy mode. It took me a while to read the statement. In this problem, you have to turn a board of black and white cells into a board in which, for each column, the maximum number of consecutive cells (in the column) of the same color is at most `N/2`. A step consists of picking a row and either painting all of its white cells black or all of its black cells white. Note that this is the same as painting the whole row white or black. Return the minimum number of steps. I was trying to solve this problem when I noticed something was missing: the constraints. It turns out the problem statement didn't display them... This is the kind of bug that we could have caught before the match if there was a practice feature. Just saying.

So I needed to open the problem in the Java arena to read the constraints. (If the maximum `N` is 8, the problem is straightforward; if the maximum `N` is 1000, it would be very difficult to make it run in time. Constraints are everything.) Logging in to the Java arena forces the web arena to close. Logging in to the web arena causes the Java arena to freeze. Both arenas need a constant connection, and I was hoping the web one wouldn't (maybe make it so that having a constant connection enables chat and real-time score updates, but is optional).

On the bright side, having to open the problem in the Java arena meant that the Greed plugin would generate test code for me.

How to solve div1 250

The constraints were: the matrix is of size `N times N`, `N` is even, `N <= 50`.

I think the key here is to understand how the columns work. There can be at most one group of more than `N/2` consecutive cells of the same color in a single column. If no such group exists, we can ignore the column. So we end up with a problem in which each column has a group of at least `N/2 + 1` cells that we need to cut. The cool property is that at least 2 rows are shared by ALL of these groups. Even in an extreme case, if there were groups in two columns, one topmost and the other bottommost, each with exactly `N/2 + 1` consecutive equal cells, the two groups would still share two rows. It is nice that there will always be two shared rows, because we can just paint one of them white and the other black and that is enough to "cut" all the columns. In short: we can always solve the problem in 2 steps. So we only need to worry about two special cases: a) If the board is already correct, the result is 0 steps. b) Maybe the board can be solved in a single step: pick one of the `O(n)` rows, paint it either completely black or completely white, and check if the board is correct. This leads to an `O(n^3)` algorithm.

class TaroJiroGrid:
 def getNumber(self, grid):
    n = len(grid)
    # result is at most 2
    res = 2
    
    def goodGrid(g):
        def goodColumn(i):
            x = 1
            for j in xrange(1, n):
                x = x + 1 if g[j][i] == g[j-1][i] else 1
                if x > n / 2:
                    return False
            return True
        
        return all(goodColumn(i) for i in xrange(n))
    
    if goodGrid(grid):
        return 0
    
    for i in xrange(n):
        for color in ('W', 'B'):
            ngrid = [ grid[j] if i != j else color * n for j in xrange(n) ]
            if goodGrid(ngrid):
                return 1
            
    return 2

So I still had to open the web arena, paste the code and submit... But first I wanted to test in the arena. I noticed something cool; unfortunately the screenshot wasn't saved for some reason. Each example had a checkbox next to it, and there was a [Test All] button, so you could test multiple examples at once, thus killing one of the main reasons most people use plugins... Unfortunately, when I tried to do this I got an error saying that EXACTLY ONE example must be tested. Apparently this is a bug.

The rest

I opened div1 500. Seemed difficult. Then I went to eat. Came back and tried to chat in the arena, but scrolling text in the chat box turned out to be very difficult. I hope this gets fixed.

I wish the new arena didn't forcefully require a constant connection. It is very annoying to have to log back in when your internet has a small hiccup... I also hope that whatever makes me unable to use it on mobile gets fixed (either the browser or the arena; I will research more later).

Random screenshot with some glitches:

Some of the glitches are my fault: I force a minimum font size in the web browser. Some seem to have been caused by slow internet or something.

Tuesday, July 22, 2014

SRM 628: sleeping during the srm

Okay, so here is the thing. This is a 7:00 AM match. That is not usually a problem, except I went to sleep at 4:00-something AM; at least I was able to register for the match early. So I set my alarm clock to 7:15 AM, thinking that losing those 15 minutes wouldn't really change my score too much unless the div1 medium was easy. I *really* woke up at 7:30. With 45 minutes left, the objective was to do at least the div1 easy...

Div1 Easy: The one with sub-routines

Given `2 <= n <= 10^18`, find the maximum `x` such that `x ^ d(x) = n`, where `d(x)` is the number of divisors of `x`. If no such `x` exists, return -1.

The key is to notice that `d(x)` cannot really get too large: any valid `d(x)` is `O(log(n))`. So we iterate through the candidate values of `d` for which `n` could be a `d`-th power, stopping once `d` becomes too large.

For a given `d`, let's find the `x` such that `x ^ d = n`; this is the `d`-th root of `n`. We could use the pow function here, but it uses floating point numbers and you would need to be careful with precision and stuff... So we can just binary search for the largest `x` such that `x ^ d <= n`. Incidentally, if this `x` is 1, then we know `d` has become too large and we can stop the search.

Once we have `d` and its valid value of `x` , all we need to do is verify that the number of divisors of `x` is equal to `d`. `d` will be at least 2, which means that `x` is `O(sqrt(n))`. We can count the divisors of `x` in `O(sqrt(x))` time. This yields an `O(n ^ (1/4))` algorithm. Nice?

The main method of my solution looks like this:

long findArgument(long n)
{
    long d = 2;
    long res = -1;
    while (true) {
        // find x : x ^ d = n
        
        long x = binarySearch(1, n+1, [&](long x) {
                return ipow(x,d) <= n;
        });
        if (ipow(x,d) == n) {
            // possible
            if (countDivisors(x) == d ) {
                res = std::max(res, x);
            }
        } else if (x == 1) {
            break;
        }
        
        d++;
    }
    
    return res;
}

Everything else is standard sub-routines...

long binarySearch(long lo, long hi, std::function<bool(long)> P)
{
    // P(lo) true
    // P(hi) false
    while (lo + 1 < hi) {
        long ha = lo + (hi - lo) / 2;
        if (P(ha)) {
            lo = ha;
        } else {
            hi = ha;
        }
    }
    return lo;
}

const long INF = 1000000000000000002LL;
long ipow(long x, long d)
{
    // returns x ^ d if x ^ d < INF, else INF 
    long r = 1;
    while (d > 0) {
        if (d % 2 == 1) {
            if (r > INF / x) {
                r = INF;
            } else {
                r = r * x;
            }
        }
        d /= 2;
        if (x > INF/x) {
            x = INF;
        } else {
            x = x * x;
        }
    }
    return r;
}

long countDivisors(long x)
{
    long p = 1;
    long c = 0;
    while ( p <= x / p) {
        if (x % p == 0) {
            c++;
            if (x / p != p) {
                c++;
            }
        }
        p += 1;
    }
    return c;
}

Div1 medium: The one with expressions

You are given a tree of binary expressions. Each internal node is either the ADD function or the MAX function. You also have a list of (positive) values you can assign to the leaves of the tree. There are at most 2000 leaves. Return the maximum possible value of the root expression.

Good luck with THAT. The solution is likely some sort of greedy, but I am too asleep to do this.


Afterwards

During the challenge phase I read codes for div1 500. The idea appears to be that you can find the maximum quantity of values that will be added together in the final expression. If you have this number `x`, then you just pick the largest `x` values and add them up. Of course they had to pick THIS match to have an easy div1 medium...
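My reconstruction of that idea as code (a sketch under my own assumptions: the tree representation here is made up, and I'm guessing that for a MAX node you count the child branch that allows more added values, which works because the values are positive):

```python
def max_expression(node, values):
    # node is either the string 'LEAF' or a tuple (op, left, right)
    # with op in ('ADD', 'MAX'). values is the list of positive values.
    def contrib(nd):
        # maximum number of values that can end up added together
        if nd == 'LEAF':
            return 1
        op, left, right = nd
        if op == 'ADD':
            return contrib(left) + contrib(right)
        # MAX: pick the branch that lets us add more values; the other
        # branch gets small leftover values so it can't spoil the plan.
        return max(contrib(left), contrib(right))

    x = contrib(node)
    # assign the x largest values to the contributing leaves
    return sum(sorted(values, reverse=True)[:x])

# ADD(leaf, MAX(leaf, ADD(leaf, leaf))): 1 + max(1, 2) = 3 values added
tree = ('ADD', 'LEAF', ('MAX', 'LEAF', ('ADD', 'LEAF', 'LEAF')))
```

Since all values are positive, a path selection with more contributing leaves can never be worse, which is why taking the larger branch at each MAX seems safe.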

Thursday, July 10, 2014

SRM 627: Did I really code THAT?

Is this just fantasy? I am not sure this actually happened: backtracking with binary indexed trees? What?

Div1 250: The one with letters

The game starts with a multiset of alphabetic letters. In every step, two distinct letters are picked and removed. Eventually, the multiset will be empty or will contain only copies of one letter; that letter is called the winning letter. Return all letters that can possibly be winning letters.

My idea, which I am not totally sure is correct, is to check, for each letter `c`, whether it can win. All moves should have the intention that the specified letter remains at the end. This means we should first try to clear as many letters different from `c` as possible. Eventually, only copies of a single letter distinct from `c` will remain, and we need to get rid of them by pairing them with `c` letters. If there are fewer of these leftover letters than there are `c` letters, then `c` can win.

The trick, then, is to always pick pairs of letters distinct from `c`, choosing the two letters that are currently most frequent.

class HappyLetterDiv1:
 def getHappyLetters(self, letters):
    res = ''
    # for each letter ch, can we do it?
    for ch in sorted(set(letters)):
        # get a dict of distinct letters and counts
        other = { x : letters.count(x) for x in letters if x != ch }
        while len(other) > 1:  # at least two different ones
            # sort by frequency
            least = sorted([ (other[x],x) for x in other.keys() ])
            # remove a letter of each of the two most frequent:
            a = least[-1][1]
            b = least[-2][1]
            def dec(y):
                other[y] -= 1
                if other[y] == 0:
                    other.pop(y)
            dec(a)
            dec(b)
        if (len(other) == 0) or (other[other.keys()[0]] < letters.count(ch)) :
            res += ch
    return res

Div1 500: The one with a graph and inversions

You have a graph of at most `1000` vertices. The graph is special in that it is a tree with one extra edge added to it. (The graph is connected, undirected, lacks loops and has exactly `|V|` edges.)

The value of a vertex `x` is `V[x]`.

For a simple path of vertices in the graph, we make a sequence `S` of the values of the vertices in visiting order. Now count the number of "inversions" of the sequence: the number of pairs `(i,j)` such that `(i < j)` and `(S[i] > S[j])`. Return the minimum number of inversions over all simple paths of at least `K` vertices.

.... So the trick is that the graph is a tree with exactly one extra edge, so there is only one simple cycle in it, so there are at most 2 simple paths connecting each pair of vertices. That means there are at most 2000000 paths, hence 2000000 sequences. With backtracking, we can visit them all. The problem is that you need to calculate the sequence's number of inversions as you build it...

Let's say you have already chosen some of the previous elements of a sequence and want to append `V[x]`; we can count the number of added inversions by counting the elements already in the sequence that are greater than `V[x]`. So we "just" need a data structure that adds elements, removes elements and counts the number of elements larger than an integer `x`, all in `O(log(N))`. And we need to update that structure (add, then remove) as part of the backtracking. Easy... So I implemented what is really a segment tree, called it fenwick for some reason, and it seems to work.

// for each pair of edges (x,y), there are at most 2 paths.
// thus there will be at most 2000000 paths.


int fenwick[2048]; //probably wrong name

void fenwick_add(int x, int a = 0, int b = 1024, int node = 0)
{
    if ( a <= x && x < b ) {
        fenwick[node]++;
        if (a < b - 1) {
            int c = (a + b) / 2;
            fenwick_add(x, a,c, node*2 + 1);
            fenwick_add(x, c,b, node*2 + 2);
        }
    }
}
void fenwick_remove(int x, int a = 0, int b = 1024, int node = 0)
{
    if ( a <= x && x < b ) {
        fenwick[node]--;
        if (a < b - 1) {
            int c = (a + b) / 2;
            fenwick_remove(x, a,c, node*2 + 1);
            fenwick_remove(x, c,b, node*2 + 2);
        }
    }
}
int fenwick_counthigher(int x, int a = 0, int b = 1024, int node = 0)
{
    if (a > x) {
        return fenwick[node];
    }
    if (b <= x + 1) { // the whole range [a,b) holds only values <= x
        return 0;
    }
    int c = (a + b) / 2;
    int r = fenwick_counthigher(x, a,c, node*2 + 1)
          + fenwick_counthigher(x, c,b, node*2 + 2);
    return r;
}


int getMinimumInversions(vector<int> A, vector<int> B, vector<int> V, int K)
{
    int N = V.size();
    vector<list<int>> g(N);
    for (int i = 0; i < A.size(); i++) {
        g[A[i]].push_back(B[i]);
        g[B[i]].push_back(A[i]);
    }

    const int INF = 20000000;
    int res = INF;
    memset(fenwick, 0, sizeof(fenwick));
    for (int s = 0; s < N; s++) {
        vector<bool> visited(N, false);
        int t = 0;
        int inv = 0;
        
        std::function< void(int) > backtrack = [&](int x) {
            visited[x] = true;
            t++;
            int oinv = inv;
            inv += fenwick_counthigher( V[x] );                
            fenwick_add( V[x] );
            if (t >= K) {
                res = std::min(res, inv);
            }
            for (int y : g[x]) {
                if (! visited[y] ) {
                    backtrack(y);
                }
            }
            fenwick_remove( V[x] );
            inv = oinv;
            t--;
            visited[x] = false;
        };
        backtrack(s);
    }
    
    
    
    return (res >= INF) ? -1: res;
}

Challenge phase

Got lucky: the first place in the room clearly did the 500 without the data structure part (they were using a std::multiset, but a multiset doesn't help with the counting-higher-elements part). So I just challenged with a large case.

Saturday, June 28, 2014

SRM 626: Not a math person

Another SRM.

Div1 250: The one with dice

Alice throws `a` dice with `b` faces each (numbered 1 to `b`, all equally likely). Bob throws `c` dice with `d` faces each (same). Each player's score is the sum of all their dice rolls. Return the expected value of Alice's score GIVEN THAT SHE WINS. If it is impossible for Alice to win, return -1. All `a,b,c,d` are at least 1 and at most 50.

Okay... For each player, can you calculate the probability of getting `X` as the total result? Well, yes. This is an `O(a^2 b^2)` dynamic programming thing. There are possibly other good methods too. The point is that we can precalculate two lists that give us the probability Alice gets `X` and the probability Bob gets `Y`.

Then for each pair `(X,Y)` it is easy to get the probability that Alice gets X AND Bob gets Y. So sum over the cases with `(X > Y)` to find the probability Alice wins. If that probability is 0, return -1.

Once again, for each pair `(X,Y)`, find its probability, if Alice wins in this case, then the probability that Alice gets that score, provided Alice wins the whole game is `P_A(X) * P_B(Y) / P_"Alice wins"`. The sum of these probabilities multiplied by `X` is the expected value.

My first instinct was to use Bayes theorem , maybe there is a way to simplify things with it?

Total complexity is `O(a^2 b^2 + c^2 d^2 + abcd)`, so around 3* 2500 * 2500 in the worst case :)


vector<double> findProbabilities(int a, int b)
{
    double dp[51][2501];
    
    memset(dp,0,sizeof(dp));
    dp[0][0] = 1.0;
    
    for (int n = 1; n <= a; n++) {
        
        for (int y = 1; y <= n*b; y++) {
            dp[n][y] = 0.0;
            for (int x = 1; x <= b && x <= y; x++) {
                dp[n][y] += dp[n-1][y - x] / b;
            }
        }
    }

    
    vector<double> res(a*b + 1, 0.0);
    for (int i = 0; i <= a*b; i++) {
        res[i] = dp[a][i];
    }
    return res;
}

double getExpectation(int a, int b, int c, int d)
{
    vector<double> Alice,Bob;
    
    Alice = findProbabilities(a,b);
    Bob   = findProbabilities(c,d);
    
    double pw = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                pw += Alice[A] * Bob[B];
            }
        }
    }
    
    if (pw <= 0.0) {
        return -1.0;
    }
    
    double ew = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                double p = Alice[A] * Bob[B]; //probability this happens
                ew += A * (p / pw);
            }
        }
    }
    return ew;
}

Div1 medium the one with negative edges

A directed graph, possibly with multiple edges, loops and cycles. Find the minimum cost to move from one vertex to another... EXCEPT that you have at most `X` opportunities to make the cost of an edge negative when you use it. `X` can be obscenely large.

Okay, no idea.

Div1 hard: The one with mirrors

There is a rectangle with mirror sides of some dimensions. Find each angle from which you can cast a ray of light from the bottom-left corner in such a way that the ray bounces exactly `b` times before finally reaching the top-right corner (without touching any corner earlier). `b` is at most 1000000000. For each of those angles, add up the total distance traveled by the ray before hitting the corner. Calculate all of that modulo 1000000007.

I thought I had a good chance to solve this problem. Because of the old codejam problem Hall of mirrors I knew a lot of properties about these rectangles of mirror sides. For example, we could imagine that a bounce is the same as moving around a grid made of rectangles of those mirrors. I would have more luck explaining this if I had time to inkscape....

The thing is that if you imagine a grid of mirrors and make the coordinates in each of these mirror corners `(x,y)` . Then moving from the origin to this coordinate is valid if and only if: `(x,y)` are coprime and also the coordinates should share a diagonal with `(1,1 + b)`. In addition, `b` must be even.

Unfortunately, because `b` can be very large, just knowing that isn't enough. It needs some mathematical trick to avoid doing an `O(b)` loop.

The following passes examples 0 to 3.

const int MOD = 1000000007;

long gcd(long a, long b)
{
    while (b != 0) {
        tie(a,b) = make_tuple(b, a%b);
    }
    return a;
}

int findSum(int sideA, int sideB, int bounces)
{
    if (bounces % 2 == 1) {
        return 0;
    }
    long y = 1 + bounces;
    long x = 1;
    
    // find solutions to this:
    // 1 <= y - p <= y
    // gcd(y - p, 1 + p) = 1
    
    long sx = 0, sy = 0;
    while (y >= 1) {
        if (gcd(x,y) == 1) {
            // keep the sums reduced so they do not overflow
            // when multiplied by sideA / sideB below:
            sx = (sx + (x*x) % MOD) % MOD;
            sy = (sy + (y*y) % MOD) % MOD;
        }
        y--;
        x++;
    }
    sx = ( ((sx * sideA) % MOD) * sideA) % MOD;
    sy = ( ((sy * sideB) % MOD) * sideB) % MOD;
    return (sx + sy) % MOD;
}

I submitted it just for fun, it will time out with slightly large `b`. No one dares to challenge it so far.

Thursday, June 19, 2014

SRM 625: I semi-wrote it

Two days ago I was busy trying to come up with a way to finish SRM 624's editorial. I get how to solve div1 hard at the top level, but I have no idea how to explain the initial step: splitting the tree into subtrees around centers. I am fairly sure it is probably a known algorithm, but I can't find info. I was stuck there when I was contacted by the admins about SRM 625. They needed problems urgently and I apparently had some problems that were created long ago and could work.

I wrote all problems except div1 hard (the important one). They really weren't intended to be part of the same match. And I think that they go more towards the standard side of things. Please note that I've gone through a huge writer's block and I can't seem to come up with interesting problems anymore. But oh well. I just hope there are no big issues of the sort that cause match cancellations.

Div2 250: The one with challenge phase

Unlike the other problems, I thought of this one specifically for this match. Ever since multiple-answer problems were added to topcoder, I wanted to make something that takes advantage of it.

The problem is easy: given `y`, find `(a,b,c)` such that `ab + c = y`. `(a,b,c)` must be integers between -1000 and 1000, different from 0 and 1. `y` is between 0 and 500. Other than that, any `(a,b,c)` is correct.

There is a brute force search here. But we can also get clever. Note that no matter which two values `(a,b)` you pick, there will always be a `c = y - ab`. So we could do `(a,b) = (1,1)` and then `c = y - 1`. This would be wrong because the values cannot be 0 or 1. But we can try to pick values `(a,b)` that don't have this problem. For example `(a,b) = (2,2)`, now `c = y - 4`. This is almost fine, but `y = 4` or `y = 5` would yield a wrong value of `c`. We could handle those cases separately OR we could try something different. We would like `y - ab` to be distinct from 0 and 1. How about `y - ab > 1`, i.e., `ab < y - 1`? We can guarantee this with `(a,b) = (2,-2)`. Then we have `c = y + 4`. So the result can always be `(2, -2, y + 4)`.
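The post has no code for this one, so here is a quick exhaustive check of that claim (plain Python; `solve` is just a name for this sketch, not the real method):

```python
# Check: for every valid y, (a, b, c) = (2, -2, y + 4) works,
# since 2*(-2) + (y + 4) = y, and y + 4 is between 4 and 504,
# so it is never 0 or 1.
def solve(y):
    return (2, -2, y + 4)

for y in range(501):
    a, b, c = solve(y)
    assert a * b + c == y
    assert all(-1000 <= v <= 1000 and v not in (0, 1) for v in (a, b, c))
print("all 501 cases OK")
```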

Div2 500: The one with sequences

This problem is very old. I think I thought of it for some TCO qualification round of a past TCO. I didn't even remember much about the problem. It was already developed but it was never used. So we mostly just enabled the problem for use.

If I remember correctly, this problem can be very tricky.

Div2 1000: Minimum Liars Again

A circle of friends. For each i, we have answers[i]:

  • If answers[i] is ?, we didn't ask friend i anything.
  • Else we asked the i-th friend if friend (i + 1) mod n is a liar. If friend i is an honest person, they would always tell the truth; else they would always lie. answers[i] is H if friend i said friend (i + 1) mod n is an honest person, else L.

Return the minimum number of liars in a consistent case. If no case can be consistent, return -1.

This is another very old problem. It was also already developed, but for `n <= 20`. We decided `n <= 50` is better for this problem slot.

The idea is simple: if there are no ? in the input, the whole thing is cyclic. You can try making the first person a liar or not. If the first person is a liar, and you know their answer, then you know if the second person is a liar, then the third person, and so on. Eventually, you will know if person `n - 1` is a liar or not, and (this is important) the conclusion we reach about person 0 must be consistent with the hypothesis we made. Return the minimum number of liars among the consistent cases.

If there is at least one ?, the case is not really cyclic. You can just split the friends into groups of consecutive non-? friends. These cases are linear and you can use the same logic (try whether the first person is a liar or not). But this time, no consistency check is needed.

Like this:

import itertools
class ConundrumReloaded:
 def minimumLiars(self, answers):
        
    def simulateLine(f, s):
        p = f
        liars = 0
        for i in range(1, len(s) + 1):
            h = (s[i - 1] == 'H') ^ (not p)
            liars += not h
            p = h
        #print (f,s), (liars,p)
        return (liars, p)
        
    n = len(answers)
    q = -1
    for i in range(n):
        if answers[i] == '?':
            q = i
    INF = 10**9
    res = INF
    if q == -1:
        for f in (0,1):
            (liars, p) = simulateLine(f, answers)
            if p == f:
                res = min(res, liars)
    else:
        res = 0
        curr = ''
        for i in range(n + 1):
            j = (i + q) % n
            if answers[j] == '?':
                if curr != '':
                    res += min( (not f) + simulateLine(f, curr)[0] for f in (0,1) )
                    curr = ''
            else:
                curr = curr + answers[j]
        
    return -1 if res >= INF else res

Div1 250: When you are desperate combine palindromes and probabilities

Given a string. We pick one of its anagrams with uniform probability. Return the probability that the picked anagram is a palindrome.

This problem is also very old. I think I tried to use it in many matches, but (thankfully) we always found something more interesting and less standard. Until now.

Simply return: (Number of palindrome anagrams) / (Number of anagrams).

The number of anagrams is calculated by dividing `n!` by `r!` for each letter in the word that is repeated `r` times.

So we just need to calculate the number of palindrome anagrams. First of all, if `n` is even, then all characters must appear an even number of times. If `n` is odd then only one character should appear an odd number of times. If `n` is odd, we already know what character is in the center and we can just remove it and treat it as if everything was even.

We have many letters that appear an even number of times. We know that we should put each of these letters in both sides of the palindrome. The order depends only on the first side. So actually, we should count the number of ways to order the `n/2` letters that remain after dividing each letter quantity by 2. This is the same as dividing `(n/2)!` by `(r/2)!` for each letter that is repeated `r` times.

As I type this, it appears that many people including div1 hard's writer have issues with precision in their solutions.
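The counting above never got code in this post, so here is a minimal sketch of it; it uses exact fractions to sidestep the precision issues just mentioned (the function name is mine, not the real method signature):

```python
from fractions import Fraction
from math import factorial
from collections import Counter

def palindrome_probability(word):
    # probability that a uniformly random anagram of `word` is a palindrome:
    # (number of palindrome anagrams) / (number of distinct anagrams)
    n = len(word)
    counts = Counter(word)
    odd = [ch for ch, r in counts.items() if r % 2 == 1]
    # a palindrome needs every count even (n even) or exactly one odd (n odd)
    if len(odd) > n % 2:
        return Fraction(0)
    # total distinct anagrams: n! / prod(r!)
    total = factorial(n)
    for r in counts.values():
        total //= factorial(r)
    # palindrome anagrams: order one half freely: (n//2)! / prod((r//2)!)
    half = factorial(n // 2)
    for r in counts.values():
        half //= factorial(r // 2)
    return Fraction(half, total)

print(palindrome_probability("abab"))  # 1/3
```

For example, "abab" has 6 distinct anagrams and 2 palindromic ones (abba, baab), hence 1/3.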

Div1 500: The one with blocks

The original version of this problem was meant for the last TCO rounds, I think last year. But it was too complicated. In the original version, there were multiple goals and exactly two start cells. I don't even remember how it was solved.

In the current version, there are multiple start cells but only one goal. The game is symmetric, so we can just begin the search at the goal; if it reaches a start cell, then there is a solution beginning at that start cell.

So we should just find the minimum number of holes that cuts every path between the goal cell and the start cells.

Now analyze the allowed moves; you will see that the set of reachable cells looks like this:

*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*

Where SZ are horizontal 2x1 pairs of cells that can be touched by a 2x1 face, the * are the cells that can be touched by 1x1 faces (including the goal), and AV are the vertical 2x1 pairs of cells.

We can just ignore any start cell that is not a *; no matter what we do, we can't reach them.

So now we just have a minimum cut problem. The trick is to treat the 2x1 pairs of cells as edges; the cost to remove an edge equals the number of normal cells in the pair. You can also remove * cells, which means adding an intermediary edge connecting the entry and exit point of each * cell. Okay, the minimum cut requires you to get clever and the implementation will be a headache, but the concept is easy!
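I don't have the contest code for this one, but the vertex-splitting trick it relies on (turning the cost of removing a cell into the capacity of an internal edge) can be sketched generically. This is a plain Edmonds-Karp max flow on a toy graph, not the actual solution; all the names here are made up:

```python
from collections import deque, defaultdict

def max_flow(cap, s, t):
    # Edmonds-Karp: repeatedly augment along BFS-shortest paths
    # in the residual network until t is unreachable from s.
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in list(cap[u]):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        # collect the edges on the path and the bottleneck capacity
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        add = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= add   # consume forward capacity
            cap[v][u] += add   # create residual (reverse) capacity
        flow += add

def vertex_capacity_graph(edges, vcap, s, t):
    # vertex splitting: each vertex x becomes x_in -> x_out with
    # capacity vcap[x]; the original (undirected) edges get "infinite"
    # capacity, so only vertex removals can be part of a finite cut.
    INF = 10**9
    cap = defaultdict(lambda: defaultdict(int))
    for x, c in vcap.items():
        cap[(x, 'in')][(x, 'out')] += c
    for u, v in edges:
        cap[(u, 'out')][(v, 'in')] += INF
        cap[(v, 'out')][(u, 'in')] += INF
    return cap, (s, 'out'), (t, 'in')

# toy instance: s reaches t through vertices a and b, each of removal
# cost 1, so disconnecting them requires removing both: min cut = 2
edges = [('s', 'a'), ('s', 'b'), ('a', 't'), ('b', 't')]
vcap = {'s': 10**9, 'a': 1, 'b': 1, 't': 10**9}
cap, S, T = vertex_capacity_graph(edges, vcap, 's', 't')
print(max_flow(cap, S, T))  # 2
```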

Thursday, June 12, 2014

*Taunts you*

In today's SRM 624 there was a problem that asked you to return something modulo 1000000009. Usually the modulo is 1000000007 or 1000000009, but people have the impression that it is 1000000007 most of the time.

Usually topcoder example tests in this kind of problem include a test case that makes use of the modulo. This is intentional, so that people can catch a mistake when writing this constant. Unfortunately, today's problem didn't have such a test case.

I didn't even notice. If you read my recap of SRM 624 you will see I even mistakenly used 1000000007 when typing the explanation. And yet I passed. Well, I have a secret weapon (not so secret).

This is my Greed c++ template:

#include <bits/stdc++.h>
// ${Contest.Name} - ${ClassName} (${Problem.Score} points) by vexorian
using namespace std;

struct ${ClassName}
{
${<if Problem.Description.Modulo}
    const int MOD = ${Problem.Description.Modulo};

${<end}
    ${Method.ReturnType} ${Method.Name}(${Method.Params})
    {
        return ${if Method.ReturnType.Array}{}${else}${Method.ReturnType;zeroval}${end};
    }
};

${CutBegin}
${<TestCode}
${CutEnd}

Take a look at the if Problem.Description.Modulo part. Yes, this part of my template actually generates the MOD constant for me :). One day I spent an afternoon or so making the Greed topcoder plugin able to parse the modulo constant and make it accessible to templates. This is why :).

SRM 624: Slooooooooow

I just did the most amazing c++ code, got to share.

Div1 250: The one with buildings

You are given a list of integers (building heights). The maximum number of integers possible is 4000 and the maximum integer possible is 4000. For each `1 <= i <= n`, find the minimum number of steps so that the final sequence has `i` equal integers. Each step increments a single integer by 1 (building a new floor in a building).

I had some difficulties getting to the right approach, at first trying things like sweeps and dynamic programming :/ . It turns out the trick is to solve: what is the minimum cost to have `y` buildings of height `x`? For each `x`, calculate those costs for all `y`, updating the values in a total cost function that considers all possible heights.

The trick is that there are only `O(N)` relevant values of `x` (those equal to the heights), and that it is easy to find, for each `y`, the minimum cost to have `y` buildings of height `x`: just pick the `y` tallest buildings of height less than or equal to `x` and increment them. This gives us an `O(N^3)` algorithm; however, for each `y`, you can reuse the result of `y-1` (you know the cost of fixing `y-1` buildings; just add a single building and we know the result for `y`).

This is an `O(N^2)` algorithm:

int minimum(vector<int> heights)
{
    int n = heights.size();
    // sort non-decreasingly by height:
    sort(heights.begin(), heights.end());
    const int INF = 4000 * 4000;
    // cost[i] will have the cost to have i equal buildings
    vector<int> cost(n + 1, INF);
    cost[1] = 0;
    // for each height:
    for (int i = 0; i < n; i++) {
        int c = 0;
        // for each number of buildings i - j + 1:
        for (int j = i - 1; j >= 0; j--) {
            // the minimum cost to have i - j + 1 buildings of this height 
            c += heights[i] - heights[j];
            // remember the best
            cost[i - j + 1] = std::min( cost[i - j + 1], c);
        }
    }
    
    
    // the problem actually asks for the XOR of all values, but I suspect
    // this is because of limitations in the return value size.
    int res = 0;
    for (int i = 2; i <= n; i++) {
        res ^= cost[i];
    }
     
    
    return res;
}

Div1 450: The one with implementation

Classical problem. You are given a graph (no loops, bidirectional, connected, weighted, some weights are zero). Return the total number of shortest paths between vertices `0` and `N-1` modulo 1000000009. Note that zero-weight edges might make the number of shortest paths infinite; return -1 in that case. At most 2000 vertices and at most 2000 edges.

Okay, the complicated part is when the number of paths is infinite. This can only happen if there is at least one shortest path that visits an edge with weight 0. So we need a way to know, for each edge, whether it can belong to a shortest path.

The way to check if an edge `(u,v)` can belong to an optimal path requires us to know: the weight `w` of the edge, the minimum distance `D` between 0 and `N-1`, the minimum distance `d` between 0 and `u`, and the minimum distance `r` between `N-1` and `v`. If `d + w + r = D`, then the edge can be used in a shortest path. If any 0-weight edge is one of those, return -1.

The rest is to count the number of paths. How about `f(x)`, which returns the number of shortest paths towards `N-1` starting at `x`? `f(N-1) = 1`: there is only one shortest path there. For the other vertices `x`, find every `y` such that edge `(x,y)` can belong to a shortest path. Then `f(x) = sum(f(y))`, easy, right? Just evaluate the vertices in non-decreasing order of distance to `N-1`; since none of the edges used will have weight 0, this ordering works.

I broke a record with this problem. I used 4 lambdas in total. And I even nested lambdas this time, which is so unusual in c++ :).

const int MOD = 1000000009;



int count(int N, vector<int> A, vector<int> B, vector<int> C)
{
    // make the graph easier to use:
    vector<vector<int>> g(N);
    vector<vector<int>> w(N);
    vector<int> degree(N);
    for (int i = 0; i < A.size(); i++) {
        degree[--A[i]]++;
        degree[--B[i]]++;
    }
    for (int i = 0; i < N; i++) {
        g[i].resize(degree[i]);
        w[i].resize(degree[i]);
        degree[i] = 0;
    }
    auto addEdge = [&](int u, int v, int weight) {
        g[u][degree[u]] = v;
        w[u][degree[u]++] = weight;
    };
    for (int i = 0; i < A.size(); i++) {
        addEdge( A[i], B[i], C[i] );
        addEdge( B[i], A[i], C[i] );
    }
    
    //Dijkstra time!
    auto dijkstra = [&](int s, vector<int> &dist) {
        set<pair<int,int>> Q;
        const int INF = 2000000000; 
        dist.resize(N, INF);
        for (int i = 0; i < N; i++) {
            Q.insert( {INF, i} );
        }
        auto push = [&](int x, int d) {
            if (d < dist[x]) {
                Q.erase(Q.find( {dist[x],x}));
                dist[x] = d;
                Q.insert( {dist[x], x} );
            }
        };
        push(s, 0);
        while (! Q.empty()) {
            auto it = Q.begin();
            int x = it->second;
            Q.erase(it);
            
            for (int i = 0; i < degree[x]; i++) {
                int v = g[x][i];
                int we = w[x][i];
                push(v, dist[x] + we);
            }
        }
    };
    vector<int> dist, rdist;
    dijkstra(0, dist);
    dijkstra(N-1, rdist);
    
   
    // if any of the shortest paths visits an edge of weight 0, the result
    // is -1. It is possible to have 0-weight edges in the input and still
    // a result different from -1.
    for (int i = 0; i < A.size(); i++) {
        if (C[i] == 0) {
            int u = A[i], v = B[i];
            for (int r = 0; r < 2; r++) {
                if (dist[u] + rdist[v] == dist[N-1]) {
                    //critical
                    return -1;
                }
                swap(u,v);
            }
        }
    }
    
    // And now the part in which we actually count the roads.
    vector<long> dp(N);
    vector<int> byRDist(N);
    for (int i = 0; i < N; i++) {
        byRDist[i] = i;
    }
    sort(byRDist.begin(), byRDist.end(), [&](int a, int b) {
        return rdist[a] < rdist[b];
    });
    for (int u: byRDist) {
        dp[u] = 0;
        if (u == N-1) {
            dp[u] = 1;
        }
        for (int i = 0; i < degree[u]; i++) {
            int v = g[u][i];
            int we = w[u][i];
            if ( (rdist[v] < rdist[u]) && (dist[u] + rdist[v] + we == dist[N-1]) ) {
                dp[u] += dp[v];
            }
        }
        dp[u] %= MOD;
    }
    return dp[0];
}

The rest

I was very slow; everybody else has a much better score. I hope many of them fail. I am confident in the correctness of my solutions and I don't suspect any likely implementation bug. Typing this as the challenge phase begins.

During the challenge phase: 8 minutes in, and I doubt I'll score a successful challenge; there are some tough challengers in my room and they are very quick to find errors. There seem to be many ways to get the 450 wrong.

Wednesday, June 04, 2014

SRM 623: Saved by the challenge phase

Div1 300: The evil one

This problem was evil. Let me explain: you have a board of maximum size 20x20; some cells contain apples, some pears, some are empty. A valid move is to move a fruit to an empty cell. You can do at most `K` moves. The objective is to maximize the size of a rectangle containing only apples. What is that maximum?

Say you already picked an arbitrary rectangle; there are `O(n^4)` of those. There are three things that are important about this rectangle: `a`, the number of apples inside; `p`, the number of pears inside; and `e`, the number of empty cells inside. You can also know the number of apples/pears/empty cells outside the rectangle. Finding these values is a complicated implementation part. Also, I think it is too slow for python, so I tried to do something in c++, using lambdas of course. It took eons. But when I finally did it, I had a new problem...

Knowing that information about the rectangle we need to find the minimum number of moves needed to have only apples in the rectangle.

At first I was trying something very wrong: move each pear to a (different) empty cell outside, then move all the apples inside. This took me ages to debug, and when I finally passed the example tests with it, I was about to click submit when I started to wonder "Is this really it?", and I noticed that noooo, the minimum number of moves is much trickier to find. Hence the examples were weak.

So what is the real optimal strategy? Well, the first insight is that we only need one empty cell in the board; we can keep reusing it to do all the moves. Then we have this: if there are any empty cells in the rectangle, it is easy to see that it is optimal to just move outside apples to cover them. The remaining case is when there are pears inside the rectangle, at least one empty cell outside, and enough apples outside. A single pear is handled like this: move the pear to the empty cell, move an apple to the position the pear was in; this leaves an empty cell in the apple's old position, and we can repeat.

So in fact, the minimum number of moves is equal to:

  • 0 if all the cells already contain apples.
  • # of empty cells inside the rectangle, if the rectangle contains no pears.
  • # of empty cells + 2 * the number of pears inside the rectangle, provided there is at least one empty cell somewhere in the board.

int getBoard(vector<string> board, int K)
{
    // stuff to count the number of apples / pears / empty inside a rectangle:
    int W = board.size(), H = board[0].size(); 
    map<char, vector<vector<int>>> total;
    string chars = ".AP";
    for ( char ch: chars ) {
        vector<vector<int>> t(W+1, vector<int>(H+1,0));
        for (int i = 1; i <= W; i++) {
            for (int j = 1; j <= H; j++) {
                int x = ( board[i-1][j-1] == ch );
                t[i][j] = x + t[i-1][j] + t[i][j-1] - t[i-1][j-1];
            }
        }
        total[ch] = t;
    }
    auto rectangle = [&](char ch, int x0, int x1, int y0, int y1) {
        auto &t = total[ch]; 
        return t[x1][y1] - t[x0][y1] - t[x1][y0] + t[x0][y0];
    };
    
    int A = rectangle('A', 0, W, 0, H);
    int E = rectangle('.', 0, W, 0, H);
    
    int res = 0;
    for (int x0 = 0; x0 < W; x0++) {
        for (int y0 = 0; y0 < H; y0++) {
            for (int x1 = x0 + 1; x1 <= W; x1++) {
                for (int y1 = y0 + 1; y1 <= H; y1++) {
                    // the actual logic:
                    // inside:
                    int e = rectangle('.',x0,x1,y0,y1);
                    int a = rectangle('A',x0,x1,y0,y1);
                    int p = rectangle('P',x0,x1,y0,y1);
                    
                    // out side:
                    int na = A - a;
                    int ne = E - e;
                    
                    if (e <= na) {
                        int x = e;
                        int mov = x;
                        na -= x;
                        a += x;
                        e = 0;
                        ne += x;
                        if (mov <= K) {
                            if (p == 0) {
                                res = std::max(res, a);
                            } else if ( (p <= na) && (ne >= 1) && (mov + 2*p <= K) ) {
                                res = std::max(res, a + p);
                            }
                        }
                    }
                }
            }
        }
    }
    return res;
}

Div1 450: The difficult one

All I can think of is that we can imagine a different problem: the points stay static and the player moves either in a vertical line upwards (increasing the `y` coordinate) or in a diagonal (incrementing `y` and incrementing or decrementing `x`).

Challenge phase

I took too long to solve div1 300, but I did learn that the example cases were weak. So I thought of a challenge case to make my initial solution fail (using only one empty square). Luckily my room wasn't very aggressive and I could find 2 solutions with that wrong idea and earned 100 points.

Wednesday, May 28, 2014

SRM 622: Anticlimactic

It was a long day, trying to finish the very late SRM 621 editorial. I eventually improvised something. I tried to learn as much as possible about suffix automata until I could at least explain the solution with more detail than the sketches.

So then the match started, I wasn't too thrilled because it is a 9:00 PM match and they tend to be... boring and have too few coders. But oh well...

Div1 250: The one with roads

You are given a weighted complete directed graph (a map of roads connecting each pair of cities) and an integer `T`. On a special holiday, for each ordered pair of distinct cities `(i,j)` a bus must travel from city `i` to city `j`, and the path used by the bus must be the shortest possible. If there are multiple shortest paths, the bus driver may pick any of them - we don't know which. A road between two cities is unsafe if it is possible that `T` or more buses will go through it. Find the total sum of the lengths of unsafe roads.

The thing here is how to know if a road might be used by a bus. There are `O(n^2)` buses and each might use each road. For a pair of cities `(a,b)`, road `(x -> y)` is used if and only if: mindist[a][x] + road[x][y] + mindist[y][b] = mindist[a][b], where mindist[p][q] is the minimum distance between cities p and q and road[p][q] is the length of the direct road connecting them. We can find mindist[p][q] in `O(n^3)` with Floyd-Warshall. Sure, we could do something even faster, but it is not needed because the main logic (picking `a,b,x,y`) is `O(n^4)`.

Code some cute python code, submit:

class BuildingRoutes:
 def build(self, dist, T):
    n = len(dist)
    allN = range(n)
    # turn string list dist into a matrix:
    dist = [ [ord(x)-ord('0') for x in row] for row in dist ]
    # number of times a road might be used
    roadcount = [ [0] * n for i in allN ]
    INF = 10**9
    # Floyd-Warshall
    mindist = [ [dist[i][j] for j in allN] for i in allN ]
    for k in xrange(n):
        for i in xrange(n):
            for j in xrange(n):
                mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j])
    # Find potential uses:
    for a in allN:
        for b in allN:
            if a != b:
                for x in allN:
                    for y in allN:
                        if x != y:
                            if mindist[a][x] + dist[x][y] + mindist[y][b] == mindist[a][b]:
                                roadcount[x][y] += 1
    res = 0
    for i in allN:
        for j in allN:
            if roadcount[i][j] >= T:
                res += dist[i][j]
    return res 

Unfortunately, I forgot to test the largest case before submitting the solution. When I did test it, it turned out to time out, and only by a few seconds. Very disappointing that it times out in python. I had to remake the code in c++, and lost plenty of points by spending double the time and recoding. I should know better; I need to keep in mind that `50^4` is too slow for python.

int build(vector<string> dist, int T)
{
    // O(n^4) too slow in python, had to migrate code to c++ . arg
    int n = dist.size();
    vector<vector<int>> roadcount(n, vector<int>(n,0));
    const int INF = 1000000000;
    vector<vector<int>> mindist(n, vector<int>(n,INF));
    // Floyd-warshall
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            mindist[i][j] = dist[i][j] - '0';
        }
    }
    for (int k = 0; k < n; k++) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j]);
            }
        }
    }
    // for each pair of cities (a,b), find if a road(x,y) could be used:
    for (int a = 0; a < n; a++) {
        for (int b = 0; b < n; b++) {
             for (int x = 0; x < n; x++) {
                 for (int y = 0; y < n; y++) {
                    if ( (x != y) && (a != b) ) {
                        if (mindist[a][x] + (dist[x][y]-'0') + mindist[y][b] == mindist[a][b]) {
                            roadcount[x][y]++;
                        }
                    }
                 }
             }
        }
    }
    int res = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (roadcount[i][j] >= T) {
                res += dist[i][j] - '0';
            }
        }
    }
    return res;
}

Div1 500: The one with tree dp (Tree dp again...)

You are given a weighted tree. We want to divide the tree into a minimum number of connected subtrees such that each subtree's diameter does not exceed `K`. (The diameter is the maximum distance between two vertices in a subtree.)

Okay, I didn't have time to solve it. I got distracted and was already quite late because of the 250. But here is what seems like a solution:

Imagine the root. Either THE root, or a node we are treating as root because we cut the edge above it. Then we have to decide if the children of the root will be used in the subtree or not.

In general, it is best to use a child instead of not using it. If we do not use it, we are forcibly creating a new tree, but if we do use it, we might be able to avoid that. The issue is how to balance out the lengths of the subtrees that we will keep connected to the root.

This is a neat trick: Let `M` be the maximum distance between the root and the nodes in one of the arms of the subtree. Then the maximum distance between the root and the nodes in the other arms cannot be greater than `K - M`. We can pick one of the children, decide it will be the one with distance `M`, and then the remaining children must have arms of distance of at most `K - M`. If the edge connecting the root to a child is larger than `K - M`, we cannot use this child and must create a new tree.

We need to solve a subproblem: make the decision for a subtree rooted at a node `X`, such that the distance between `X` and any of its descendants cannot exceed `M` and the diameter cannot exceed `K`. This is solved in a similar way, picking a child of `X` to have the maximum distance.

All in all, we have to solve the problem for `f(X, M)`. `M` will not exceed 500 (because `K` is at most 500), so we will be fine.
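To illustrate the arm-balancing observation on a tree that is already fixed (no cuts), here is a small sketch I'd use for testing; the function name and the adjacency representation are mine, and this is not the `f(X, M)` dp itself:

```python
def diameter_ok(adj, K):
    # adj[v] is a list of (child, edge_weight) pairs of a rooted tree;
    # node 0 is the root. Returns True iff the maximum distance between
    # any two nodes (the diameter) is at most K.
    ok = [True]
    def arm(x):
        # deepest distance from x down into each child's arm
        depths = sorted((w + arm(y) for (y, w) in adj[x]), reverse=True)
        if depths and depths[0] > K:
            ok[0] = False          # a single arm already exceeds K
        if len(depths) >= 2 and depths[0] + depths[1] > K:
            ok[0] = False          # two deepest arms meeting at x exceed K
        return depths[0] if depths else 0
    arm(0)
    return ok[0]
```

The check at each node is exactly the trick above: if the deepest arm has length `M`, every other arm must fit in `K - M`, which is equivalent to checking the two deepest arms.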

Challenge phase / etc

Challenge phase already started, I will go check what happens. Scheduling this post so that it is posted when the challenge phase ends.

Tuesday, May 20, 2014

SRM 621: Invisible slowness

So another SRM day. I've had a headache since yesterday, so I'm running at half steam...

Div1 275: The one with the geometry

So you have a circle (signal) centered at `(0,0)` with radius `S`, where `S` is a real number picked uniformly at random from 0.00 to `Z`. There are some other circles (cities), each with center `(x,y)` and radius `r`. Things are good if, for each city, the main circle (signal) either doesn't include any of the city or includes ALL of the points inside the city. Return the probability that this happens. The answer can have an absolute or relative error of up to 1e-6 (whichever is smaller).

So that's a nice thing about the allowed error. After topcoder added the ability to include a custom checker in problem sets, I thought it was only a matter of time until someone made a problem that allows less precision than the usual 1e-9. The first thing I did in this problem was take my special python tester template that allows you to test automatically even when there is a custom checker. I used CNP super powers to paste the usual python checker, replacing the 1e-9 with 1e-6... I really wanted accurate tester results.

The first thing is to notice what makes a radius `S` valid for a specific circle `(x,y,r)`. A nice trick in problems involving plenty of circles is to consider only the distances between their centers. Eventually you will notice that if the distance between `(0,0)` and `(x,y)` is `d`, then there are two ranges in which `S` is valid: `S <= d - r` or `S >= d + r`. Something that caused me to take a bit of time in this sub-problem was a special corner case: What if `(0,0)` is inside circle `(x,y,r)`? Then `d - r` is negative. It turns out that it doesn't matter: `S` cannot be negative, and thus `S <= d - r` or `S >= d + r` is still a good condition.

Once we know that condition, how do we deal with the probability? You should see those `d-r` and `d+r` values (for all cities) as the only points where the validity of `S` could change. So we can divide the `[0,Z]` interval using those points as delimiters. Now we can test each segment `[a,b]`: since `a` and `b` are the only points at which the result could change, all points strictly inside `(a,b)` have the same result, so `S` is valid for `(a+b)/2` if and only if it is valid for all points in `[a,b]`. For each valid interval `[a,b]`, add `(b - a) / Z` to the probability.

class RadioRange:
 def RadiusProbability(self, X, Y, R, Z):
    cities = zip(X,Y,R)
    
    def goodCity(x,y,r,s):     #city completely inside circle of radius s or not
        d = (x**2 + y**2) ** 0.5
        return ( (s <= d - r) or (s >= d + r) )
    
    def goodRadius(s):         # each circle must be completely included or not
        return all( goodCity(x,y,r,s) for (x,y,r) in cities )
    
    def getEvents():
        yield 0.0
        yield Z
        for (x,y,r) in cities:
            d = (x**2 + y**2) ** 0.5
            yield d + r
            yield d - r
    
    # remove invalid events (less than 0, more than Z), duplicates (using set)
    # and turn into a list (so we can use indexes) and sort
    # (possibly unnecessary to call sorted, the set likely already sorted)
    evs = sorted( list(set( x for x in getEvents() if (0 <= x <= Z) ) ) )
    good = 0.0
    for i in range(0, len(evs) - 1):
        p = (evs[i] + evs[i+1]) / 2
        if goodRadius(p):
            good += evs[i+1] - evs[i]
            
    return good / Z

I knew how to solve the problem from the beginning, except for dealing with the corner case that turned out not to exist anyway. I am not sure why it took me so long to solve it, but it did.

Div1 500: tree stuff

Seems hard, I was tired and with headache so I went to lie down while I supposedly thought of the problem, but couldn't think of much.

Challenge phase

During the challenge phase I discovered why people were faster than me in the 275: it turns out that `S` is invalid if and only if its circle intersects any circle (city) in the input. While most of the code remains the same, this makes writing the code a bit easier and also removes the need to check the validity of `(a+b)/2` at the end; instead you just do a line sweep. (Each `d-r` event increments the number of intersecting circles, each `d+r` decrements it; sum up the intervals in which the count is 0.)
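That sweep can be sketched like this (a hedged reconstruction of the faster approach; the function name is mine, not taken from any actual submission):

```python
def radius_probability(X, Y, R, Z):
    # Line-sweep version: the signal radius S is invalid exactly while the
    # count of "bands" [d - r, d + r] covering it is nonzero.
    events = []
    for (x, y, r) in zip(X, Y, R):
        d = (x * x + y * y) ** 0.5
        events.append((d - r, +1))   # entering a city's band
        events.append((d + r, -1))   # leaving it
    events.sort()
    good, count, prev = 0.0, 0, 0.0
    for (pos, delta) in events:
        p = min(max(pos, 0.0), Z)    # clamp the sweep into [0, Z]
        if count == 0:
            good += p - prev         # no band covers (prev, p): S valid there
        prev = p
        count += delta
    if count == 0:
        good += Z - prev             # tail after the last band
    return good / Z
```

Clamping the events into `[0, Z]` also takes care of the negative `d - r` corner case discussed above.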

Sunday, May 18, 2014

Auto-test Topcoder problems with multiple correct answers

Topcoder has been improving the decade-old contest system a bit. Things like allowing larger constraints, the possibility of a different time/memory limit per problem, etc. But the greatest change of them all, the one that actually brings problem variety rather than Fake Difficulty, is that some problems now allow multiple correct answers.

The first SRM to have these problems was SRM 617. The division 2 easy and division 1 medium allow multiple correct answers. They were interesting problems that would not have worked quite as well without this ability.

In the SRM 617 div2 easy, you have to return two composite numbers that add up to `n`. `n` is at least 20 and at most 1000. It turns out there is a very cool and easy solution: Return `(4, n-4)` if `n` is even, else `(9, n-9)` if `n` is odd. But if this problem had appeared in old topcoder, how would we have led coders to this solution? We would have had to specify tons of things. Like, the first integer must be the smallest possible composite that makes the result as intended. But then there would be odd cases in which `(4, n-4)` is an answer, and then the solution would be more complicated (not appropriate for div2 easy). Even if we did all that, the problem wouldn't be very nice, because the example cases' results would give the solution away too easily...

However, the bad thing about these new problems is that it is Topcoder, and many of us are used to Topcoder Arena plugins that generate automatic testers based on the example test cases. If we used our old arena plugins on these problems and the correct results of our solution didn't exactly match the expected outputs, the automated tester would tell us that the results are wrong, even when they are actually right. We needed some changes.

So after SRM 617 I made plans to work as hard as possible in making my Greed tester setup able to work around this a bit. This is the story.

Identifying those problems

The first thing to do is to identify the problems that allow multiple correct answers for a test case. This requires help from the Topcoder Arena developers. During SRM 617 I told the admins to remember that the arena plugin API needs a way to detect this. Then I waited, and waited. Then I figured that there is a thread I know that seems to be monitored closely by the Arena developers. Making a post in this thread seems to be the only way I have to contact them, and so I did. I really wish the plugin API were documented at all, and that each change were announced and explained and all new methods were known. But this is not the case. Even after receiving confirmation that this feature was in the API, I had to dig through the Jar looking for something that might be it.

problem.getComponent().isCustomChecker();
// where problem is com.topcoder.client.contestant.ProblemComponentModel

So now I just needed to add the feature to Greed so that any template was able to identify these problems and then we can do the magic.

I made a pull request that adds a Problem.HasCustomChecker field. You can use it in templates: ${if Problem.HasCustomChecker}...${end}. The pull request is still not merged, and thus you will have to do some magic to compile Greed using it. If you want...

Modifying the tester templates

We also need to modify the tester so that it can make use of this information. But how?

My idea of how to allow automated testers in these problems consists of two parts:

  • My tester already supports test results in which it is not known whether or not the returned answer is correct, and it can already report results as such (using a ? symbol). Testing is still useful because you can get the results for all examples with a single run, and you can also detect any example that times out or has a runtime error. So the first idea is to make the generated test code treat all test cases this way when the problem has this flexibility.
  • However, that doesn't fix the case where the coder also wants/needs to detect wrong answers. The only workaround is to allow the user to code a checker method for the returned output.

In many problems, it is easier to check whether an output is correct than to solve the problem. For example, in the problem I described above, it is easy to check that the returned value is a pair of positive integers that are both composite and add up to n. So we can make a tester.

In other problems, it may not be so easy to know if a result is correct without solving the problem. But if you already know one correct result for that input, then it is still easy to write a grader. For example, consider a problem that asks for the shortest sequence following some conditions. If you know that the provided example's correct result is a sequence of 3 elements, then any sequence with more than 3 elements is wrong, and the tester can report as much making use of this result.
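That known-answer bound is easy to sketch; here is a hypothetical grader for such a shortest-sequence problem (all names are mine, purely illustrative):

```python
def greed_check_shortest(returned, expected):
    # Hypothetical grader for a shortest-sequence problem.
    # expected, when not None, is a known correct (hence optimal) answer,
    # so anything longer is definitely wrong. Otherwise we can't judge
    # without validating the problem's conditions, so report "unknown".
    if expected is not None and len(returned) > len(expected):
        return -1   # longer than a known optimum: wrong answer
    return 0        # unknown
```

This only ever answers -1 or 0; to be able to answer 1 you'd also need to verify the problem's conditions, as in the composite-numbers checker below.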

My idea is to have a checker method in the problem class. It should take the input, the returned output and the example's expected output (if it exists). If the checker method returns 0, then it is not known if the provided answer is correct. If it returns -1, we know it is wrong, and if it returns 1, we know it is correct. Initially, this method returns 0 in all cases, but it can be modified by the user.

I implemented this in python, it looks like this:

class SilverbachConjecture:
 def solve(self, n):
    if n % 2 == 0:
        return (4,n - 4)
    else:
        return (9,n - 9)

# CUT begin
#-------------------------------------------------------------------------------
 def greedCheckResult(self, testInput, returnedAnswer, expectedAnswer):
    (n,) = testInput
    #You can implement a custom checker.
    # Return value: -1 is WA, 0 is unknown and 1 is AC.
    # expectedAnswer is not None, then expectedAnswer is a known correct answer.
    if len(returnedAnswer) != 2:
        return -1
    (a,b) = returnedAnswer
    if (a + b != n) or (a <= 0) or (b <= 0):
        return -1
    
    def isComposite(x):
        p = 2
        while p*p <= x:
            if x % p == 0:
                return True
            p += 1
        return False
        
    return 1 if isComposite(a) and isComposite(b) else -1

The trick is that the generic tester is a separate file, but I just updated it a bit to detect if a greedCheckResult method exists; if it does, it performs the required logic.

I am not sure yet how to do this in c++, but I should try something tomorrow.

Besides of modifying the generic tester, I also modified my python template. This is the part that makes use of the HasCustomChecker field:

# ${Contest.Name} - ${ClassName} (${Problem.Score} points) by vexorian :

${<if Problem.Description.Modulo}
MOD = ${Problem.Description.Modulo}

${<end}
class ${ClassName}:
 def ${Method.Name}(self, ${Method.Params}):
    return ${Method.ReturnType;zeroval}
${<if Problem.HasCustomChecker}


${CutBegin}
#-------------------------------------------------------------------------------
 def greedCheckResult(self, testInput, returnedAnswer, expectedAnswer):
    (${Method.Params},) = testInput
    #You can implement a custom checker.
    # Return value: -1 is WA, 0 is unknown and 1 is AC.
    # expectedAnswer is not None, then expectedAnswer is a known correct answer.
    return 0

${CutEnd}
${<end}

${CutBegin}
${<TestCode}
${CutEnd}

The relevant changes to the generic tester affect only the run_tests function:

def run_testcase( problemClass, methodName, testInExpected, caseNo, totalCase, testTimeOut, finalLines, compactMode ):
    (testIn, expected) = testInExpected # so that it compiles in python3
    if compactMode != ONLY_REPORT: 
        sys.stdout.write(TEST_COLOR + "Test " + str(caseNo) + COLOR_RESET + ": " + prettyStr(testIn)+"\n")
    startTime = time.time()
    instance = problemClass()
    exception = None
    try:
        result = getattr(instance, methodName)(*testIn);
    except Exception as e:
        ln = None
        for x in traceback.extract_tb(sys.exc_traceback):
            if x[0] == problemClass.__name__+'.py':
                ln = x[1] 
        if ln is None:
            exceptionShort = '%s, (unknown line)'%( type(e).__name__ )
        else:
            exceptionShort = '%s, line: %d'%( type(e).__name__, ln )
        exception = traceback.format_exc()
        
    elapsed = time.time() - startTime   # in sec
    if compactMode != ONLY_REPORT:
        sys.stdout.write("Time: %.02f seconds.\n" % elapsed)

    knownAnswer = (expected is not None)
    if exception is not None:
        if compactMode != ONLY_REPORT:
            sys.stdout.write("RUNTIME ERROR: \n")
            sys.stdout.write(exception)
        correct = False
    else:
        if hasattr(instance, 'greedCheckResult'):
            check = instance.greedCheckResult(testIn, result, expected)
            correct = (check >= 0)
            knownAnswer = (check != 0)
        else:
            correct = expected is None or tc_equal(expected, result)
        if compactMode != ONLY_REPORT:
            if not correct or not knownAnswer:
                sys.stdout.write("Desired answer:\n")
                sys.stdout.write("\t%s\n" % prettyStr(expected) )
            sys.stdout.write("Your answer:\n")
            sys.stdout.write("\t%s\n" % prettyStr(result) )
    
    res = '?'
    if exception is not None:
        res = 'E'
    elif not correct:
        res = 'X'
    elif elapsed > testTimeOut:
        res = 'T'
    elif knownAnswer:
        res = '+'
    
    # final lines:
    s = ( TEST_COLOR + "t" + prettyCase(caseNo, totalCase) + COLOR_RESET + ": " + colorTestResult(res) )
    s += (" (%.2fs)" % elapsed)
    if res in ('?', 'X'):
        s += (" (" + prettyStr( result)+ ")" )
    elif res == 'E':
        s += (" " + exceptionShort)
    finalLines.append(s)

    if compactMode != ONLY_REPORT:
        sys.stdout.write(" %s\n" % colorTestResult(res) )
        sys.stdout.write( BAR_COLOR + ('='*BAR_LENGTH) + COLOR_RESET + "\n")
    return res

I'll eventually make a more formal release of this update once the field is added to the official Greed and I figure out a clean way to do this in c++.

Sure, it is possible that the checker you implement has a bug and thus reports your results as correct when they are not; you submit and fail system tests. But it is better than nothing.

Victory (for now):

Sunday, May 11, 2014

The greatest topcoder t-shirt never made

The other day I received my SRM 600 t-shirt. If you were paying attention you'd remember I did terribly in SRM 600 / SRM 600.5, so you might be wondering what's up with that. Simple: it turns out all the problem setters / testers who worked on SRM 600 got their t-shirts too, and I was the editorial writer. So this is actually the first t-shirt I got for writing editorials rather than for competing.

In case you are wondering what the SRM 600 t-shirt looks like: on the front it says [SRM 600] in large letters; on the back, it has the new Topcoder logo. It is black.

But this blog post is not about that t-shirt. But about this old TC forums bit:

forums link
Members, admins, country wo/men:

We're trying something a little different for the TCCC T-shirt this year.

Thanks to the thread about best member quotes, http://forums.topcoder.com/?module=Thread&threadID=512175&start=0 we thought, "Wouldn't it be cool to see a member's quote on the back of the TCCC T-shirt?"

So, here's what we'd like you to do. Search your little hearts out for your favorite member quotes. Post each one individually, so the member community can use the feedback symbols (+/-) to show their like or dislike of each quote.

Then, on September 11, TopCoder will select the top five quotes. A quote will be chosen if it meets all of the following requirements: member popularity, originality and TopCoder/programming subject matter. On September 18 everyone (admins included) will vote from the chosen five for their absolute favorite.

There won't be any prize money given, but won't the winner feel warm and fuzzy knowing a little part of him or her is being worn by 2000 people? I know I sure would.

To get the ball rolling, I've posted a few of my favorites below.

C'mon, let's see some more now!

(This was eons ago, back when TCCC was a thing)

And the reply:

sql_lall wrote:

I have this quote on my wall at home :) I'm wondering though, how long it'll be until someone changes their quote just for this thread. i.e. ""I memoized inside a ternary search and all I got was this lousy t-shirt" ;)

[Edit: removed my member quote suggestion, as I'm guessing most people are voting for the quote mentioned above. But then again, I could be wrong :p]

So, this is awesome. What happened with this t-shirt idea? Back then, topcoder decided t-shirt contents using the Schulze method. (Topcoder actually deciding anything using a poll now sounds like too alien a concept.) And another idea won: doh

I mean really, An Edison quote? EDISON?? Nobody likes that guy anymore. Disappoint.

Saturday, May 10, 2014

SRM 620, python fails again

This was the last of the 10 SRMs I decided to use python exclusively on. Also definitely the one in which the poor topcoder python support did the most harm.

Div1 250, the one with gcd

Game: You start with a pair of positive integers `(x,y)`; in each move you can move to `(x,x+y)` or `(x+y,y)`. You are given two pairs `(a,b)` and `(c,d)`, where each integer is up to 1000000. You want to know if there is a pair `(x,y)` such that both `(a,b)` and `(c,d)` are reachable using it as a starting position. Return the largest possible sum `x+y` in that case; otherwise return -1.

The idea is to go backwards. From `(a,b)`, there are two candidate backward moves: `(a-b,b)` and `(a,b-a)`; however, since the integers must be positive, only one of them is valid: if `a > b`, pick `(a-b,b)`; if `a < b`, pick `(a,b-a)`; if they are equal, it is time to stop, because otherwise one value would go negative.

So it is easy to build the two sets of pairs `(x,y)` that can reach `(a,b)` and `(c,d)` respectively; we can then find whether the two sets intersect, and if they do, return the maximum sum. Easy:

Notice that when one of `a` or `b` is 1, and the other is 1000000, there are 1000000 elements in one list. So there can be tons of elements, and looking for the intersection is hard unless we get clever and use sets that answer membership queries in logarithmic time (or maybe a hash table, but hash tables really blow). In c++, that means std::set or std::map; in python:

class PairGame:
 def maxSum(self, a, b, c, d):
    # find all the pairs that can be reached going backwards from (a,b):
    def backwards(a,b):
        while a > 0 and b > 0:
            yield (a,b)
            if a >= b:
                a -= b
            else:
                b -= a
    # find the intersection between the two sets:
    inter = set( backwards(c,d) ) & set( backwards(a,b) )
    return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)

I submitted it and scored ~225 points. I was so proud of this code. I knew that people coding in c++ or one of the joke languages would be making totally awful things, like copying and pasting the code used to go backwards from `(a,b)` to do the same with `(c,d)` but with a slightly different search.

Unfortunately, it times out with `(a,b) = (1000000,1)` and `(c,d) = (1000000,1)`, or similar cases. I noticed that it barely runs in around 1 second when none of the integers is 2. So the only way to fix this is to handle the cases when any of them is 1 individually (without making huge loops). This complicates the code a ton, and I needed tons of resubmits before getting the following code, which might still be wrong:

class PairGame:
 def maxSum(self, a, b, c, d):
    # this wouldn't be necessary if python wasn't too slow for O(n) when n = 1000000
    if a == 1:
        # The path to (a,b) is (1,1), (1,2), ... (1,x), ... (1,b)
        #can we reach (1,x) going backwards from (c,d) ?
        #if so, what's maximum x that we can reach?
        while (c > 1) and (d > 0):
            if d > c:
                d = d % c
            else:
                c = c % d
                if c == 0 and d == 1:
                    c = 1
        if c == 1:
            return min(b,d) + 1
        else:
            return -1
    if b == 1:
        return self.maxSum(b,a,d,c)
    if c == 1 or d == 1:
        return self.maxSum(c,d,a,b)
        
    # Let's pretend the following is the only real part of the code, okay?
    def backwards(a,b):
        while a > 0 and b > 0:
            yield (a,b)
            if a >= b:
                a -= b
            else:
                b -= a
    inter = set( backwards(c,d) ) & set( backwards(a,b) )
    return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)

(In fact, at first it seemed like the main problem was memory, but even after fixing the memory use, it is still too slow.)

All those using c++, Java, or perhaps even C# wouldn't have this issue. It caused me to lose plenty of points in resubmits, and I might still fail. The problem is much harder when `n = 1000000` is too slow for your language. And the worst thing is that the difficulty wouldn't have changed much for c++/Java if the constraint had been 500000, but it would have saved python. Well, too bad.

Div1 500

I had little time. All I can say is that the problem is the same as picking a sequence of skills and sorting by that sequence of skills lexicographically. The rest is fuzzy.

Monday, May 05, 2014

SRM 619: Nesting min-cost matching inside dynamic programming

Div1 250: The one with stacks

You start with up to 50 piles of stones. Two players play a game alternating turns. In a given turn, a player must pick three piles: `(a,b,c)` such that pile `a` has at least 2 stones. Pile `a` is split into two non-empty piles and their amounts are added to pile `b` and pile `c`. If a player cannot find any valid turn, they lose. Find if the first player is guaranteed to win or to lose assuming the two players play optimally.

During the match, I did a dynamic programming based on the fact that the game state can be described by two variables `(p,q)` where `p` is the number of piles with more than 1 stone and `q` the number of piles with 1 stone. This leads to an `O(n^2)` dp.

However, as it turns out, the problem can be solved in `O(1)` time. Note that every valid move always reduces the number of piles by 1. It turns out that all states with an even number of piles are losing states, so if the number of piles is odd AND there is at least one valid move, the state is winning.

We need to prove that all even states are losing ones. This is obviously true for `n = 2` (a move needs three distinct piles). For `n >= 4`, note that the only losing states with `n-1` piles are those that consist entirely of 1s. However, it is impossible to reach such a state with any valid move, because after any valid move, at least two of the piles will be greater than 1.
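The parity claim is easy to sanity-check with a memoized brute force over small pile configurations (this is just a verification sketch I'd use for testing, not a contest solution):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def wins(piles):
    # piles: sorted tuple of pile sizes; True iff the player to move wins.
    for a in range(len(piles)):
        if piles[a] >= 2:                        # pile a needs >= 2 stones
            for b in range(len(piles)):
                for c in range(len(piles)):
                    if len({a, b, c}) == 3:      # three distinct piles
                        for s in range(1, piles[a]):  # split a into s / rest
                            nxt = list(piles)
                            nxt[b] += s
                            nxt[c] += piles[a] - s
                            del nxt[a]
                            if not wins(tuple(sorted(nxt))):
                                return True      # a move to a losing state
    return False
```

On small states this agrees with the claim: even pile counts lose, odd pile counts win exactly when some pile has at least 2 stones (and there are at least 3 piles).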

Div1 600: ouch

This problem's statement is too long to copy and re-write, so I will skip to the solution.

Imagine you wanted to calculate `f(x,i)`, the minimum cost to assign skills to the sub-tree rooted at `x` if you already know that one of the skills chosen for `x` is skill `i`.

We can easily find `f(y, j)` for any child `y` of `x` and skill `j`. Out of all the skills assigned to `x` and its children, it should be possible to make a permutation of distinct skills. Let's focus on this permutation. There are two possibilities:

  • Use `i` in the permutation, meaning that all children must use skills different to `i`.
  • Use a different skill in the permutation for `x`. All children and `x` need to pick a unique skill.

Using the values of `f(y,j)` and some ad hoc stuff, you can easily build a table of costs for each case, the cost of assigning each of the employees to each skill in the permutation. Then you have a minimum-cost bipartite matching. Too difficult to implement in the little time I had after figuring this out.
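The matching step at each node can be illustrated with a tiny brute-force assignment solver (my sketch, not the editorial's code; a real solution would need the Hungarian algorithm or min-cost flow to fit in the time limit):

```python
from itertools import permutations

def min_cost_matching(cost):
    # cost[i][j] = cost of giving employee i the j-th skill slot of the
    # permutation. Brute force over all assignments; only viable for tiny
    # tables, but enough to check a table built from the f(y, j) values.
    n = len(cost)
    return min(sum(cost[i][p[i]] for i in range(n))
               for p in permutations(range(n)))
```

Once the cost table for a node is built from the children's `f(y, j)` values, its minimum-cost perfect matching gives the best skill assignment for that node's permutation.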