Saturday, December 27, 2014

SRM 643

div1 250: The one with factorization

Factorize `2 <= N <= 10^18`, but half of the vector containing the sorted prime factors is given to you as input (the even-indexed ones). For example, 18 = 2*2*3, so the prime factors vector is {2,2,3}, which means the input will be 18 and {2,3}. Get it?

Factorization is supposed to be a hard problem. And for `N <= 10^18` it is, especially given the constraints: not so much the 2 seconds execution limit, but the 256 MB memory limit, the time available to code things and not being able to save useful stuff to the hard drive before running each test case... I mean, the *nix `factor` command can certainly factorize some large numbers, but you are supposed to do things the manual way here rather than reuse code.

So factorizing `N` is out of the question; maybe that's why the input has to include some of the factors. My first instinct was to think: well, just divide `N` by the known prime factors, the remaining value `n` shall be small enough to factorize, right? This is false. Just imagine `N` equal to 2 multiplied by a large prime number close to `10^18 / 2`. That large prime would be a bit too large for the `O(sqrt(N))` trial division algorithm we normally use.

Second idea was to do some ad hoc things. If there is only one prime factor provided, it means the total number of prime factors is either 1 or 2, and these are both easy cases to solve (the result is either `{primes[0]}` or `{primes[0], N/primes[0]}`). But this doesn't help much. Cases in which the input has more than one factor can still lead to a rather large `n` after dividing: `999999999999999896 = 2 * 2 * 2 * 124999999999999987`.

So the solution is based on the following logic: the very last prime in the factorization can be rather large, `O(N)` large, but if it really is that large, it will be the only such large prime in the factorization. If this prime is in an odd position, it is not given to us, and the value left after dividing `N` by the known factors will be large because it still contains this prime. So the idea (which may not be 100% correct) is to just do trial division up to `sqrt(2000000000)` or so and then, if there is still a number left after dividing that much, what's left ought to be a prime number, right? I have no idea how good this is.

typedef long long int64;
#define long int64
struct TheKingsFactorization
{
    vector<long> getVector(long N, vector<long> primes)
    {
        long n = N;
        // divide out the prime factors we were given
        for (long p: primes) {
            n /= p;
        }
        // trial division to recover the missing (odd-indexed) factors
        //for (long p = 2; p * p <= 2000000000LL; p++) {
        for (long p = 2; p * p <= n; p++) {
            while (n % p == 0) {
                n /= p;
                primes.push_back(p);
            }
        }
        // whatever remains should be a single (possibly large) prime
        if (n > 1) {
            primes.push_back(n);
        }
        sort(primes.begin(), primes.end());
        
        return primes;
    }
};
#undef long

I suspect there will be many challenges in the challenge phase. I prepared some cases; it's unlikely I'll get to use them, because Petr is in my room. Finished typing this as the coding phase ended. Let's see what happens.

Tuesday, December 09, 2014

SRM 640 : Do I still have write access here?

It's SRM 640, which means there have been 140 SRMs since SRM 500. Remember SRM 500? It was fun.

Div1 250: The one with Trees

You have a bunch of stars with different colors. Use ribbons to connect the stars so that the ribbons and stars make a tree (connected, no cycles). Only some specific pairs of stars may be connected. Return the minimum number of ribbons connecting two stars of equal color that we have to use.

So... isn't this the very definition of the minimum spanning tree problem? You need to find an undirected tree, with a minimum sum of used edge costs. Just make an edge cost 1 if the two stars it connects have the same color.

So just implement Prim's or Kruskal's Minimum Spanning Tree algo. I went for Kruskal because I am odd.

class ChristmasTreeDecoration:
 def solve(self, col, x, y):
    n = len(col)
    # I like python for things like this, create a list of edges (c,u,v) c is cost
    edges = [ ( int(col[i-1] == col[j-1]), i-1, j-1) for (i,j) in zip(x,y) ]
    
    # Precarious union find thingie 
    parent = range(n)
    def getParent(x):
        while parent[x] != x:
            (x, parent[x]) = (parent[x], parent[parent[x]]) 
        return x
    
    # Kruskal:
    total = 0;
    for (cost, u,v) in sorted(edges):    #sorting edges by cost
        if getParent(u) != getParent(v): #distinct components
            parent[getParent(u)] = v     #connect them
            total += cost                # add cost
    return total

After coding this and realizing it is super short and elegant, I stared at the screen for a while, skeptical that this was so easy. 250s lately are harder than this. I also doubted for a while whether my getParent(x) function is bugged or not. Normally I use a stack so I can move every node on the path to have parent[x] = the super grand parent. But this seems to work fine.
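For comparison, here is a quick sketch of the stack-based version I usually write (full path compression, as opposed to the path halving in the loop above; both keep the trees shallow enough). The function name is mine, not part of the solution:

def getParentStack(x, parent):
    # walk up to the root, remembering the path
    path = []
    while parent[x] != x:
        path.append(x)
        x = parent[x]
    # point every node on the path directly at the root
    for y in path:
        parent[y] = x
    return x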

Div1 550: The one with bipartite graphs

You are given `n_1,n_2,a,d`, with `n_1,n_2 <= 1000000000`. Return the maximum number of edges in a graph with `n_1+n_2` vertices where each vertex has degree at least `d` and the maximum bipartite matching is `a`.

So yeah... this problem exists. I am sure there has to be some sort of solution.

Div1 1000:

Another problem that probably has a solution. My hunch is that this one is rather easy once you do plenty of math.

Challenge phase

Nothing to report.

Let's see if that 250 survives.

Saturday, August 30, 2014

SRM 631 : A new arena

Remember when I wrote blog posts periodically? Remember when I actually had energy to have an explanation ready just by the end of the match? Remember when editorials weren't super late? It was a nice time for sure.

Participated in Topcoder SRM 631. It was a special match because earlier this week the web version of the Topcoder arena was released (in beta). There is actually a thing in which, if you use the web arena in SRM 631 or 632 and fill a survey, you can maybe, if you are super lucky, maybe, win a ticket to the TCO. So I decided to use it in these matches. The new arena currently lacks many features, and I have the impression that there were a few wasted opportunities in regards to how it was designed. I will focus this post on what I experienced with the new arena.

Registration

I was very busy in the morning finishing the first part of the SRM 630 editorial (log in required, which is terrible). I've had a very bad streak with my editorial work. First, I was actually solving the problems in SRM 629 quite fast, even figured out div1 medium by myself and all, but the div2 hard got me stuck for too long, so long that SRM 630 started. I made the foolish decision to finish 630 before getting back to work on 629, thinking that at least then only one editorial would be late. That was a very bad decision, because it turns out that div2 hard and div1 medium were both the same problem. One with a very easy solution that I couldn't understand for over a week, not helped by a paper that made me feel dizzy when reading it...

The short story is: I was busy and distracted, so when it came time to register for the match I forgot to use the web arena (I was already using the Java one, because I needed to test the SRM 630 problems and the web arena doesn't have practice or plugins). Lucky, because it turns out that people who tried to register using the web arena had issues.

One of the use cases I had in mind regarding the web arena was the ability to register for the match from a mobile device. You cannot run the Java arena on Android or iOS. Unfortunately, my attempts to load the arena in Android Firefox were futile and I didn't have time to unfreeze the Google Chrome Android app so I could test it.

I really hope that whatever issue is making it so hard for the web app to load in mobile browsers (or maybe just mobile Firefox) eventually stops being a problem. Will make sure to report once I have more info.

The match

I opened the web arena and looked for a way to enter my room. In the competition room, you meet this:

It actually took me a while to find the problem links. They are on the right. The web arena actually shows you the problem names before you open the problem. This has interesting consequences. Remember that the submission score starts to drop the instant you open a problem, so in Topcoder matches we tend to solve one problem at a time, and usually it is best to start with the easiest (lowest score) problem first. Some problem names are very descriptive, however, so if a problem name hints that the solution will be too easy or too hard for you, the names are useful information. Or maybe disinformation, if the problem name is not that good. Also, in some cases the problem name can give you a hint of who the writer of the problem is...

I don't really like that you need to click "Unopened" to open the problem. It is just not intuitive. I would have preferred an [open] link/button and some text below the problem name to indicate whether you already opened it. Optimal would be to show the problem score, decreasing in real time.

I open the div1 easy problem; remember that I am in low energy mode. It took me a while to read the statement. In this problem, you have to turn a board containing black and white cells into a board such that, for each column, the maximum number of consecutive cells (in the column) of the same color is at most `N/2`. A step consists of picking a row and either painting all its white cells black or all its black cells white. Note that this is the same as painting the whole row white or black. Return the minimum number of moves. I was trying to solve this problem when I noticed something was missing: the constraints. Turns out the problem statement didn't display them... This is the kind of bug that we could have caught before the match if there was a practice feature - just saying.

So I needed to open the problem in the Java arena to read the constraints (if the maximum `N` is 8 the problem is straightforward; if the maximum `N` is 1000 it would be very difficult to make it run in time; constraints are everything). Logging in to the Java arena forces the web arena to close. Logging in to the web arena causes the Java arena to freeze. Both arenas need a constant connection, and I was hoping the web one didn't (maybe make it so that having a constant connection enables chat and real time score updates but is optional).

On the bright side, having to open the problem in the Java arena meant that the Greed plugin would generate test code for me.

How to solve div1 250

The constraints were: the matrix is of size `N times N`, `N` is even, `N <= 50`.

I think the key here is to understand how the columns work. There can be at most one group of more than `N/2` consecutive cells of the same color in a single column. If no such group exists, we can ignore the column. So we end up with a problem in which each column has a group of at least `N/2 + 1` cells that we need to cut. The cool property here is that at least 2 rows will be shared by ALL of these groups. Even in an extreme case, if there were two groups in two columns, one group topmost and the other bottommost, and the number of consecutive equal colors was `N/2+1` in both columns, even then the two groups would share two rows. It is nice that there will always be two shared rows, because we can always just paint one of them white and the other black and it will be enough to "cut" all the columns. In short: we can always solve the problem in 2 steps. So we only need to worry about two special cases: a) If the board is already correct then the result is 0 steps. b) Maybe the board can be solved in a single step: pick one of the `O(n)` rows and paint it either completely black or completely white, then check if the board is correct. This leads to an `O(n^3)` algorithm.

class TaroJiroGrid:
 def getNumber(self, grid):
    n = len(grid)
    # result is at most 2
    res = 2
    
    def goodGrid(g):
        def goodColumn(i):
            x = 1
            for j in xrange(1, n):
                x = x + 1 if g[j][i] == g[j-1][i] else 1
                if x > n / 2:
                    return False
            return True
        
        return all(goodColumn(i) for i in xrange(n))
    
    if goodGrid(grid):
        return 0
    
    for i in xrange(n):
        for color in ('W', 'B'):
            ngrid = [ grid[j] if i != j else color * n for j in xrange(n) ]
            if goodGrid(ngrid):
                return 1
            
    return 2

So I still had to open the web arena, paste the code and submit... But first I wanted to test in the arena. I noticed something cool; unfortunately the screenshot wasn't saved for some reason. Each example had a checkbox next to it, and there was a [Test All] button, so you can test multiple examples at once. Thus killing one of the main reasons most people use plugins... Unfortunately, when I tried to do this I got an error saying that EXACTLY ONE example must be tested. Apparently this is a bug.

The rest

I opened div1 500. Seemed difficult. Then I went to eat. Came back and tried to chat in the arena, but scrolling text in the chat box turned out to be very difficult. I hope this gets fixed.

I wish the new arena didn't forcefully require a constant connection. It is very annoying to log back in when your internet has a small hiccup... I also hope that whatever makes me unable to use it on mobile gets fixed (either the browser or the arena; I will research more later).

Random screenshot with some glitches:

Some of the glitches are my fault: I force a minimum font size in the web browser. Some seem to have been caused by slow internet or something.

Tuesday, August 12, 2014

SRM 629: A mistake

I think reading the problem statement is half the battle. Or actually, understanding the problem statement is half the battle.

Div1 250: The one with a rectangular hole

You have a rectangular hole of some possibly large dimensions. Also some boards that you can use to cover the hole. The boards are rectangles themselves. They may be rotated and may overlap, but all of their corners must lie strictly outside the hole. Also, the sides of the boards must be parallel to the sides of the hole. I put emphasis on all because I spent most of the match with the ridiculous assumption that only one of the corners must be outside the hole. This assumption makes no sense to make because the statement is quite clear. I have no idea what happened, but this other variation of the problem seems very difficult, might even be impossible with the given constraints :/. Also note that the corners do not need to be placed at integral coordinates; the statement doesn't mention this explicitly but an example needs it. Of course, you return the minimum number of boards needed to cover the hole. There are up to 50 boards.

So the trick to this problem is to notice that, with the corner constraint, all boards must be placed in sequence either horizontally or vertically. Overlaps are not helpful. With this knowledge, we can first try horizontally and then vertically. It is as easy as swapping the dimensions of the hole, so we first try with W = holeW, H = holeH and then with H = holeW, W = holeH. The idea is that each board's height must be strictly larger than the hole's height (H). Then the board covers a part of the width. Just take the largest possible widths: if the top `t` widths can cover the whole width W, then `t` is a candidate result. Note that when choosing each board's width and height we need to pick a good rotation...

import itertools

INF = 10**9

def minimumNumberInf(holeH, holeW, boardH, boardW):
    res = INF
    for (W,H) in ( (holeH,holeW), (holeW,holeH) ):
        # all vertical
        def correctWidths():
            for (wr,hr) in zip(boardH, boardW):
                best = -1
                for  (w,h) in ( (wr,hr), (hr,wr) ):
                    if h > H:
                        best = max(best, w)
                yield best
    
        widths = sorted( [ w for w in correctWidths() if w != -1 ] )
        
        if sum(widths) >= W:
            r = next( t for t in itertools.count(1) if sum(widths[-t:]) >= W )
            res = min(res, r ) 
    return res


class RectangleCovering:
 def minimumNumber(self, holeH, holeW, boardH, boardW):
    r = minimumNumberInf(holeH, holeW, boardH, boardW)
    return -1 if r >= INF else r

Because of the interpretation mistake, I took way too long to solve this problem. I also started the match a bit late, so I didn't have time to read div1 500 either :/

Tuesday, July 22, 2014

SRM 628: sleeping during the srm

Okay, so here is the thing. This is a 7:00 AM match. That is not usually a problem, except I went to sleep at 4:00-something AM; at least I could register early for the match. So I set my alarm clock to 7:15 AM, thinking that losing those 15 minutes wouldn't really change my score too much unless div1 medium was easy. I *really* woke up at 7:30. With 45 minutes left, the objective was to do at least div1 easy...

Div1 Easy: The one with sub-routines

Given `2 <= n <= 10^18`, find the maximum `x` such that `x ^ d(x) = n`, where `d(x)` is the number of divisors of `x`. If no such `x` exists, return -1.

The key is to notice that `d(x)` cannot really get too large. Any valid `d(x)` is `O(log(n))`, so we iterate through the values of `d` and check whether `n` is a perfect `d`-th power, until we reach a `d` that is too large.

For a given `d`, let's find `x : x ^ d = n` , this is the `d`-th root of `n`. We could use the pow function here, but it uses floating point numbers and you would need to be careful with precision and stuff... So we can just use a binary search for the largest `x` such that `x ^ d <= n`. Incidentally, if this `x` is 1, then we know that `d` has become too large and we can stop the search.

Once we have `d` and its valid value of `x` , all we need to do is verify that the number of divisors of `x` is equal to `d`. `d` will be at least 2, which means that `x` is `O(sqrt(n))`. We can count the divisors of `x` in `O(sqrt(x))` time. This yields an `O(n ^ (1/4))` algorithm. Nice?

The main method of my solution looks like this:

long findArgument(long n)
{
    long d = 2;
    long res = -1;
    while (true) {
        // find x : x ^ d = n
        
        long x = binarySearch(1, n+1, [&](long x) {
                return ipow(x,d) <= n;
        });
        if (ipow(x,d) == n) {
            // possible
            if (countDivisors(x) == d ) {
                res = std::max(res, x);
            }
        } else if (x == 1) {
            break;
        }
        
        d++;
    }
    
    return res;
}

Everything else consists of standard sub-routines...

long binarySearch(long lo, long hi, std::function<bool(long)> P)
{
    // P(lo) true
    // P(hi) false
    while (lo + 1 < hi) {
        long ha = lo + (hi - lo) / 2;
        if (P(ha)) {
            lo = ha;
        } else {
            hi = ha;
        }
    }
    return lo;
}

const long INF = 1000000000000000002LL;
long ipow(long x, long d)
{
    // returns x ^ d if x ^ d < INF, else INF 
    long r = 1;
    while (d > 0) {
        if (d % 2 == 1) {
            if (r > INF / x) {
                r = INF;
            } else {
                r = r * x;
            }
        }
        d /= 2;
        if (x > INF/x) {
            x = INF;
        } else {
            x = x * x;
        }
    }
    return r;
}

long countDivisors(long x)
{
    long p = 1;
    long c = 0;
    while ( p <= x / p) {
        if (x % p == 0) {
            c++;
            if (x / p != p) {
                c++;
            }
        }
        p += 1;
    }
    return c;
}

Div1 medium: The one with expressions

You are given a tree of binary expressions. Each expression can be either the ADD function or the MAX function. You also have a list of (positive) values you can assign to the leaves of the tree. There are at most 2000 leaves. Return the maximum total value the expression can yield.

Good luck with THAT. Solution is likely some sort of greedy but I am too asleep to do this.


Afterwards

During the challenge phase I read code for div1 500. The idea appears to be that you can find out the maximum quantity of values that will be added up together in the final expression. If you have this number `x`, then you just have to pick the largest `x` integers from the values and add them together. Of course they had to pick THIS match to have an easy div1 medium...
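A rough sketch of that idea, under my own assumptions about the input format (here the tree is given as lists `kind` and `children` indexed by node, with node 0 as the root; the real problem's signature is different):

def maxExpressionValue(kind, children, values):
    # kind[v] is 'ADD', 'MAX' or 'LEAF'; children[v] lists the child indices of v
    def count(v):
        # how many leaf values end up added together under the best assignment
        if kind[v] == 'LEAF':
            return 1
        counts = [count(c) for c in children[v]]
        return sum(counts) if kind[v] == 'ADD' else max(counts)
    # (a very deep tree may need sys.setrecursionlimit)
    x = count(0)
    # put the x largest values in the places that get added together
    return sum(sorted(values)[-x:])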

Thursday, July 10, 2014

SRM 627: Did I really code THAT?

Is this just fantasy? I am not sure this actually happened: backtracking with binary indexed trees? What?

Div1 250: The one with letters

The game starts with a multiset of alphabetic letters. In every step, two distinct letters are taken and removed. Eventually, the multiset will be empty or will contain only copies of a single letter; those letters are called winning letters. Return all the letters that can possibly be winning letters.

My idea, which I am not totally sure is correct, is to check, for each letter `c`, whether it is possible for it to win. Then all the moves have to be made with the intention that the specified letter remains at the end. This means that we should first try to clear as many letters different from `c` as possible. Eventually, a list of equal letters that are distinct from `c` will remain, and we need to get rid of them by pairing them with the `c` letters. If these repeated letters are fewer than the `c` letters, then `c` can win.

The trick then is to always pick pairs of letters distinct from `c`, taking the two letters that are currently most frequent.

class HappyLetterDiv1:
 def getHappyLetters(self, letters):
    res = ''
    # for each letter ch, can we do it?
    for ch in sorted(set(letters)):
        # get a dict of distinct letters and counts
        other = { x : letters.count(x) for x in letters if x != ch }
        while len(other) > 1:  # at least two different ones
            # sort by frequency
            least = sorted([ (other[x],x) for x in other.keys() ])
            # remove a letter of each of the two most frequent:
            a = least[-1][1]
            b = least[-2][1]
            def dec(y):
                other[y] -= 1
                if other[y] == 0:
                    other.pop(y)
            dec(a)
            dec(b)
        if (len(other) == 0) or (other[other.keys()[0]] < letters.count(ch)) :
            res += ch
    return res

Div1 500: The one with a graph and inversions

You have a graph of at most `1000` vertices. The graph is special in that it is a tree with one extra edge added to it. (The graph is connected, undirected, has no loops and has exactly `|V|` edges.)

The value of a vertex `x` is `V[x]`.

For a simple path of vertices in the graph, we make a sequence `S` of the values of the vertices in visited order. Now count the number of "inversions" of the sequence: this is the number of pairs `(i,j)` such that `(i < j)` and `(S[i] > S[j])`. Return the minimum number of inversions over all such sequences of at least `K` elements.

.... So the trick is that the graph is a tree with exactly one extra edge. So there is only one simple cycle in it. So there are at most 2 simple paths connecting each pair of vertices. So there are at most 2000000 paths. 2000000 sequences. With backtracking, we can visit them all. The problem is that you need to calculate the sequence's number of inversions as you build the sequence...

Let's say you have already chosen some of the previous elements in a sequence and want to add `V[x]` to it. We can count the number of added inversions by counting the number of elements already in the sequence that are higher than `V[x]`. So we "just" need a data structure that adds elements, removes elements and counts the number of elements larger than an integer `x` in `O(log(N))`. And we need to update that structure (add, then remove) as part of the backtracking. Easy... So I implemented a binary indexed tree, called it fenwick for some reason, and it seems to work.

// for each pair of edges (x,y), there are at most 2 paths.
// thus there will be at most 2000000 paths.


int fenwick[2048]; //probably wrong name

void fenwick_add(int x, int a = 0, int b = 1024, int node = 0)
{
    if ( a <= x && x < b ) {
        fenwick[node]++;
        if (a < b - 1) {
            int c = (a + b) / 2;
            fenwick_add(x, a,c, node*2 + 1);
            fenwick_add(x, c,b, node*2 + 2);
        }
    }
}
void fenwick_remove(int x, int a = 0, int b = 1024, int node = 0)
{
    if ( a <= x && x < b ) {
        fenwick[node]--;
        if (a < b - 1) {
            int c = (a + b) / 2;
            fenwick_remove(x, a,c, node*2 + 1);
            fenwick_remove(x, c,b, node*2 + 2);
        }
    }
}
int fenwick_counthigher(int x, int a = 0, int b = 1024, int node = 0)
{
    if (a > x) {
        return fenwick[node];
    }
    if ( (a==x) || (b <= x) ) {
        return 0;
    }
    int c = (a + b) / 2;
    int r = fenwick_counthigher(x, a,c, node*2 + 1)
          + fenwick_counthigher(x, c,b, node*2 + 2);
    return r;
}


int getMinimumInversions(vector<int> A, vector<int> B, vector<int> V, int K)
{
    int N = V.size();
    vector<list<int>> g(N);
    for (int i = 0; i < A.size(); i++) {
        g[A[i]].push_back(B[i]);
        g[B[i]].push_back(A[i]);
    }

    const int INF = 20000000;
    int res = INF;
    memset(fenwick, 0, sizeof(fenwick));
    for (int s = 0; s < N; s++) {
        vector<bool> visited(N, false);
        int t = 0;
        int inv = 0;
        
        std::function< void(int) > backtrack = [&](int x) {
            visited[x] = true;
            t++;
            int oinv = inv;
            inv += fenwick_counthigher( V[x] );                
            fenwick_add( V[x] );
            if (t >= K) {
                res = std::min(res, inv);
            }
            for (int y : g[x]) {
                if (! visited[y] ) {
                    backtrack(y);
                }
            }
            fenwick_remove( V[x] );
            inv = oinv;
            t--;
            visited[x] = false;
        };
        backtrack(s);
    }
    
    
    
    return (res >= INF) ? -1: res;
}

Challenge phase

Got lucky: the first place in the room clearly did the 500 without the data structure part (they were using a std::multiset, but a multiset doesn't help with the counting-higher-elements part). So I just challenged with a large case.

Saturday, June 28, 2014

SRM 626: Not a math person

Another SRM.

Div1 250: The one with dice

Alice throws `a` dice with `b` faces each (numbered 1 to `b`, all equally likely). Bob throws `c` dice with `d` faces each (same). The score is the sum of all the dice rolls each player gets. Return the expected value of Alice's score IN CASE SHE WINS. If it is impossible for Alice to win, return -1. All of `a,b,c,d` are at least 1 and at most 50.

Okay... For each player, can you calculate the probability of getting `X` as the total result? Well, yes. This is an `O(a^2 b^2)` dynamic programming thing. Possibly other good methods too. The thing is that we can precalculate two lists that give us the probability Alice gets `X` and the probability Bob gets `Y`.

Then for each pair `(X,Y)` it is easy to get the probability that Alice gets X AND Bob gets Y. So add up the cases with `(X > Y)` to find the probability that Alice wins. If that probability is 0, return -1.

Once again, for each pair `(X,Y)`, find its probability; if Alice wins in this case, then the probability that Alice gets that score, given that Alice wins the whole game, is `P_A(X) * P_B(Y) / P(Alice wins)`. The sum of these probabilities multiplied by `X` is the expected value.

My first instinct was to use Bayes theorem , maybe there is a way to simplify things with it?

Total complexity is `O(a^2 b^2 + c^2 d^2 + abcd)`, so around 3* 2500 * 2500 in the worst case :)


vector<double> findProbabilities(int a, int b)
{
    double dp[51][2501];
    
    memset(dp,0,sizeof(dp));
    dp[0][0] = 1.0;
    
    for (int n = 1; n <= a; n++) {
        
        for (int y = 1; y <= n*b; y++) {
            dp[n][y] = 0.0;
            for (int x = 1; x <= b && x <= y; x++) {
                dp[n][y] += dp[n-1][y - x] / b;
            }
        }
    }

    
    vector<double> res(a*b + 1, 0.0);
    for (int i = 0; i <= a*b; i++) {
        res[i] = dp[a][i];
    }
    return res;
}

double getExpectation(int a, int b, int c, int d)
{
    vector<double> Alice,Bob;
    
    Alice = findProbabilities(a,b);
    Bob   = findProbabilities(c,d);
    
    double pw = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                pw += Alice[A] * Bob[B];
            }
        }
    }
    
    if (pw <= 0.0) {
        return -1.0;
    }
    
    double ew = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                double p = Alice[A] * Bob[B]; //probability this happens
                ew += A * (p / pw);
            }
        }
    }
    return ew;
}

Div1 medium the one with negative edges

A directed graph, possibly with multiple edges, loops and cycles. Find the minimum cost to move from one vertex to another... EXCEPT that you have at most `X` opportunities to make the cost of an edge negative when you use it. `X` can be obscenely large.

Okay, no idea.

Div1 hard: The one with mirrors

There is a rectangle with mirror sides of some dimensions. Find each angle from which you can cast a ray of light from the bottom-left corner in such a way that the ray bounces exactly `b` times before touching any corner and finally reaches the top-right corner. `b` is at most 1000000000. For each of those angles, add up the total distance traveled by the ray before hitting the corner. Calculate all of that modulo 1000000007.

I thought I had a good chance to solve this problem. Because of the old Codejam problem Hall of Mirrors, I knew a lot of properties of these rectangles with mirror sides. For example, we can imagine that a bounce is the same as moving through a grid made of reflected copies of the rectangle. I would have more luck explaining this if I had time to Inkscape something...

The thing is that if you imagine a grid of mirrors and give coordinates `(x,y)` to each of these mirror corners, then moving from the origin to a corner is valid if and only if `x` and `y` are coprime and the coordinates share a diagonal with `(1, 1 + b)`. In addition, `b` must be even.

Unfortunately, because `b` can be very large, just knowing that isn't enough. It needs some mathematical trick to avoid doing an `O(b)` loop.

The following passes examples 0 to 3.

const int MOD = 1000000007;

long gcd(long a, long b)
{
    while (b != 0) {
        tie(a,b) = make_tuple(b, a%b);
    }
    return a;
}

int findSum(int sideA, int sideB, int bounces)
{
    if (bounces % 2 == 1) {
        return 0;
    }
    long y = 1 + bounces;
    long x = 1;
    
    // find solutions to this:
    // 1 <= y - p <= y
    // gcd(y - p, 1 + p) = 1
    
    long sx = 0, sy = 0;
    while (y >= 1) {
        if (gcd(x,y) == 1) {
            sx += (x*x) % MOD;
            sy += (y*y) % MOD;
        }
        y--;
        x++;
    }
    sx = ( ((sx * sideA) % MOD) * sideA) % MOD;
    sy = ( ((sy * sideB) % MOD) * sideB) % MOD;
    return (sx + sy) % MOD;
}

I submitted it just for fun; it will time out with a slightly large `b`. No one has dared to challenge it so far.

Thursday, June 19, 2014

SRM 625: I semi-wrote it

Two days ago I was busy trying to come up with a way to finish SRM 624's editorial. I get how to solve div1 hard at the top level; I have no idea how to explain the initial step: splitting the tree into center subtrees. I am fairly sure it is probably a known algorithm, but I can't find info about it. I was stuck there when I was contacted by the admins about SRM 625. They needed problems urgently and I apparently had some problems that were created long ago and could work.

I wrote all the problems except div1 hard (the important one). They really weren't intended to be part of the same match, and I think they lean more towards the standard side of things. Please note that I've gone through a huge writer's block and I can't seem to come up with interesting problems anymore. But oh well. I just hope there are no big issues of the sort that cause match cancellations.

Div2 250: The one with challenge phase

Unlike the other problems, I thought of this one specifically for this match. Ever since multiple-answer problems were added to topcoder, I wanted to make something that takes advantage of it.

The problem is easy: given `y`, find `(a,b,c)` such that `ab + c = y`. `(a,b,c)` must be integers between -1000 and 1000, different from 1 and 0. `y` is between 0 and 500. Other than that, any `(a,b,c)` is correct.

There is a brute force search here. But we can also get clever. Note that no matter which two values `(a,b)` you pick, there will always be a `c = y - ab`. So we could do `(a,b) = (1,1)` and then `c = y - 1`. This would be wrong because the values cannot be 0 or 1. But we can try to pick values `(a,b)` that don't have this problem. For example `(a,b) = (2,2)`; now `c = y - 4`. This is almost fine, but `y = 4` or `y = 5` would yield a wrong value of `c`. We could handle those cases separately OR we could try something different. We would like `y - ab` to be different from 0 and 1. How about `y - ab > 1`, that is, `ab < y - 1`? We can guarantee this with `(a,b) = (2,-2)`. Then we have `c = y + 4`. So the result can always be `(2, -2, y + 4)`.
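Just to convince myself, a tiny sketch of that constant answer plus a brute force check over every allowed `y` (the function name is mine, not the problem's):

def findTriple(y):
    # ab + c = y with a = 2, b = -2, so c = y + 4; none of them is 0 or 1
    return (2, -2, y + 4)

# sanity check for every allowed y
for y in range(0, 501):
    (a, b, c) = findTriple(y)
    assert a * b + c == y
    assert all(v not in (0, 1) and -1000 <= v <= 1000 for v in (a, b, c))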

Div2 500: The one with sequences

This problem is very old. I think I thought of it for some TCO qualification round of a past TCO. I didn't even remember much about the problem. It was already developed but it was never used. So we mostly just enabled the problem for use.

If I remember correctly, this problem can be very tricky.

Div2 1000: Minimum Liars Again

A circle of friends. For each i, we have answers[i]:

  • If answers[i] is ?, we didn't ask friend i anything.
  • Else we asked the i-th friend if friend (i + 1) mod n is a liar. If friend i is an honest person, they always tell the truth; otherwise they always lie. answers[i] is H if friend i said friend (i + 1) mod n is an honest person, else L.

Return the minimum number of liars in a consistent case. If no case can be consistent, return -1.

This is another very old problem. It was also already developed, but for `n <= 20`. We decided `n <= 50` is better for this problem slot.

The idea is simple: if there are no ? in the input, the whole thing is cyclic. You can try making the first person a liar or not. If you know whether the first person is a liar, and you know their answer, then you know whether the second person is a liar, then the third person, and so on. Eventually, you will know whether person `n - 1` is a liar or not, and (this is important) the conclusion we reach about person 0 must be consistent with the hypothesis we made. Return the minimum number of liars among the consistent cases.

If there is at least one ?, the case is not really cyclic. You can just split the friends into groups of consecutive non-? friends. These cases are linear and you can use the same logic (try whether the first person is a liar or not), but this time no consistency check is needed.

Like this:

import itertools
class ConundrumReloaded:
 def minimumLiars(self, answers):
        
    def simulateLine(f, s):
        p = f
        liars = 0
        for i in range(1, len(s) + 1):
            h = (s[i - 1] == 'H') ^ (not p)
            liars += not h
            p = h
        #print (f,s), (liars,p)
        return (liars, p)
        
    n = len(answers)
    q = -1
    for i in range(n):
        if answers[i] == '?':
            q = i
    INF = 10**9
    res = INF
    if q == -1:
        for f in (0,1):
            (liars, p) = simulateLine(f, answers)
            if p == f:
                res = min(res, liars)
    else:
        res = 0
        curr = ''
        for i in range(n + 1):
            j = (i + q) % n
            if answers[j] == '?':
                if curr != '':
                    res += min( (not f) + simulateLine(f, curr)[0] for f in (0,1) )
                    curr = ''
            else:
                curr = curr + answers[j]
        
    return -1 if res >= INF else res

Div1 250: When you are desperate combine palindromes and probabilities

Given a string, we pick one of its anagrams uniformly at random. Return the probability that the picked anagram is a palindrome.

This problem is also very old. I think I tried to use it in many matches, but (thankfully) we always found something more interesting and less standard. Until now.

Simply return: (Number of palindrome anagrams) / (Number of anagrams).

The number of anagrams is calculated by dividing `n!` by `r!` for each letter in the word that is repeated `r` times.

So we just need to calculate the number of palindrome anagrams. First of all, if `n` is even, then all characters must appear an even number of times. If `n` is odd, then exactly one character should appear an odd number of times. If `n` is odd, we already know which character is in the center, and we can just remove one copy of it and treat everything as if it were even.

We have many letters that appear an even number of times. We know that we should put half of each letter's copies on each side of the palindrome. The order of one side determines the other. So actually, we should count the number of ways to order the `n/2` letters that remain after dividing each letter's quantity by 2. This is the same as dividing `(n/2)!` by `(r/2)!` for each letter that is repeated `r` times.
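There was no code for this one in the post, so here is a quick sketch of those two counts, keeping everything in exact integers and dividing only at the very end (the function name is mine):

from collections import Counter
from math import factorial

def palindromeProbability(word):
    n = len(word)
    counts = Counter(word)
    # a palindromic anagram exists only if at most one letter has an odd count
    # (and only when n is odd)
    if sum(r % 2 for r in counts.values()) > n % 2:
        return 0.0
    # total anagrams: n! / product(r!)
    total = factorial(n)
    for r in counts.values():
        total //= factorial(r)
    # palindromic anagrams: (n/2)! / product((r/2)!)
    half = factorial(n // 2)
    for r in counts.values():
        half //= factorial(r // 2)
    return float(half) / float(total)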

As I type this, it appears that many people including div1 hard's writer have issues with precision in their solutions.

Div1 500: The one with blocks

The original version of this problem was meant for the last TCO rounds, I think last year, but it was too complicated. In the original version there were multiple goals and exactly two start cells. I don't even remember how it was solved.

In the current version, there are multiple start cells but only one goal. The game is symmetric, so we can just begin at the goal: if a path reaches a start cell, then there is a solution beginning at that start cell.

So we should just find the minimum number of holes to cut the path between the goal cell and any start cell.

Now analyze the allowed moves, you will see that the set of reachable cells is like this:

*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*

Here SZ are the horizontal 2x1 pairs of cells that can be touched by a 2x1 face, the * are the cells that can be touched by a 1x1 face (including the goal), and AV are the vertical 2x1 pairs of cells.

We can just ignore any start cell that is not a *; no matter what we do, we can't reach it.
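Reading the diagram as a period-3 pattern with the goal sitting at a `*`, here is a quick sketch of how I'd classify a cell from its offsets to the goal (the offsets and the function are mine, just to make the pattern concrete):

def cellKind(dr, dc):
    # dr, dc: row/column offsets from the goal, which is a '*' cell
    (r, c) = (dr % 3, dc % 3)
    if r == 0:
        # '*' takes a 1x1 face; 'S'/'Z' are the halves of a horizontal 2x1 face
        return '*' if c == 0 else 'SZ'
    if c == 0:
        # 'A'/'V' are the halves of a vertical 2x1 face
        return 'AV'
    return None  # never touched by the block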

So now we just have a minimum cut problem. The trick is to take the 2x1 pairs of cells as edges; then the cost to remove an edge equals the number of normal cells in the pair. You can also remove * cells, which means using an intermediary edge connecting an entry and an exit point for each * cell. Okay. The minimum cut requires you to get clever and the implementation will be a headache, but the concept is easy!

Thursday, June 12, 2014

Twitter

Around 2010 I started using Twitter. That old account has turned into something. I participate on Twitter a lot and I am very opinionated, so it has taken a direction that may not be liked by everyone who just wants to talk about Topcoder stuff.

I really started to feel the need for a new, more professional / respectable Twitter account, so I started a new one that will be focused on programming and mostly contests (you know, my job). My old account will probably just retweet those things. But if you want to follow me on Twitter and only get programming (contest) opinions, updates and links to new posts on this blog, you should follow the @fakevexorian account.

*Taunts you*

In today's SRM 624 there was a problem that asked you to return something modulo 1000000009. Usually the modulo is 1000000007 or 1000000009 , but people have the impression it is 1000000007 most of the time.

Usually topcoder example tests in this kind of problem include a test case that makes use of the modulo. This is intentional so that people can catch a mistake in writing this constant. Unfortunately, today's problem didn't have such test case.

I didn't even notice. If you read my recap of SRM 624 you will see I even mistakenly used 1000000007 when typing the explanation. And yet I passed. Well, I have a secret weapon (not so secret).

This is my Greed c++ template:

#include <bits/stdc++.h>
// ${Contest.Name} - ${ClassName} (${Problem.Score} points) by vexorian
using namespace std;

struct ${ClassName}
{
${<if Problem.Description.Modulo}
    const int MOD = ${Problem.Description.Modulo};

${<end}
    ${Method.ReturnType} ${Method.Name}(${Method.Params})
    {
        return ${if Method.ReturnType.Array}{}${else}${Method.ReturnType;zeroval}${end};
    }
};

${CutBegin}
${<TestCode}
${CutEnd}

Take a look at the if Problem.Description.Modulo part. Yes, this part of my template actually generates the MOD constant for me :). One day I spent an afternoon or so making the Greed Topcoder plugin able to parse the modulo constant and make it accessible to templates. This is why :).

SRM 624: Slooooooooow

I just did the most amazing c++ code, got to share.

Div1 250: The one with buildings

You are given a list of integers (building heights). The maximum number of integers possible is 4000 and the maximum integer possible is 4000. For each `0 <= i <= n`, find the minimum number of steps so that the final sequence has `i` equal integers. Every step is to increment a single integer by 1 (building a new floor in a building).

I had some difficulties getting to the right approach, at first trying to do things like sweeps and dynamic programming :/. Turns out the trick is to solve: what is the minimum cost to have `y` buildings of height `x`? For each `x`, calculate those costs for all `y`, updating the values in a total cost function that considers all possible heights.

The trick is that there are only `O(N)` relevant values of `x` (those equal to the existing heights), and that it is easy to find, for each `y`, the minimum cost to have `y` buildings of height `x`: just pick the `y` largest buildings of height less than or equal to `x` and increment them. This gives us an `O(N^3)` algorithm; however, for each `y`, you can reuse the result for `y-1` (you know the cost of fixing `y-1` buildings, so just add a single building and we know the result for `y`).

This is an `O(N^2)` algorithm:

int minimum(vector<int> heights)
{
    int n = heights.size();
    // sort non-decreasingly by height:
    sort(heights.begin(), heights.end());
    const int INF = 4000 * 4000;
    // cost[i] will have the cost to have i equal buildings
    vector<int> cost(n + 1, INF);
    cost[1] = 0;
    // for each height:
    for (int i = 0; i < n; i++) {
        int c = 0;
        // for each number of buildings i - j + 1:
        for (int j = i - 1; j >= 0; j--) {
            // the minimum cost to have i - j + 1 buildings of this height 
            c += heights[i] - heights[j];
            // remember the best
            cost[i - j + 1] = std::min( cost[i - j + 1], c);
        }
    }
    
    
    // the problem actually asks for the XOR of all values, but I suspect
    // this is because of limitations in the return value size.
    int res = 0;
    for (int i = 2; i <= n; i++) {
        res ^= cost[i];
    }
     
    
    return res;
}

Div1 450: The one with implementation

Classical problem. You are given a graph (no loops, bidirectional, connected, weighted, some weights are zero). Return the total number of shortest paths between vertices `0` and `N-1` modulo 1000000007. Note that zero-weight edges might make the number of shortest paths infinite, return -1 in that case. At most 2000 vertices and at most 2000 edges.

Okay, the complicated part is when the number of paths is infinite. This can only happen if there is at least one shortest path that visits an edge with weight 0. So we need a way to know, for each edge, whether it can belong to a shortest path.

The way to check whether an edge `(u,v)` can belong to an optimal path requires us to know: the weight `w` of the edge, the minimum distance `D` between 0 and `N-1`, the minimum distance `d` between 0 and `u`, and the minimum distance `r` between `N-1` and `v`. If `d + w + r = D`, then the edge can be used in a shortest path. If any 0-weight edge is one of those, return -1.

The rest is to count the number of paths. How about `f(x)`, the number of shortest paths towards `N-1` starting at `x`? `f(N-1) = 1`: there is only one shortest path there. For every other vertex `x`, find each `y` such that the edge `(x,y)` can belong to a shortest path. Then `f(x) = sum(f(y))`. Easy, right? Just evaluate the vertices in non-decreasing order of distance to `N-1`; since none of the edges used will have weight 0, this ordering works.

I broke a record with this problem. I used 4 lambdas in total. And I even nested lambdas this time, which is so unusual in c++ :).

const int MOD = 1000000009;



int count(int N, vector<int> A, vector<int> B, vector<int> C)
{
    // make the graph easier to use:
    vector<vector<int>> g(N);
    vector<vector<int>> w(N);
    vector<int> degree(N);
    for (int i = 0; i < A.size(); i++) {
        degree[--A[i]]++;
        degree[--B[i]]++;
    }
    for (int i = 0; i < N; i++) {
        g[i].resize(degree[i]);
        w[i].resize(degree[i]);
        degree[i] = 0;
    }
    auto addEdge = [&](int u, int v, int weight) {
        g[u][degree[u]] = v;
        w[u][degree[u]++] = weight;
    };
    for (int i = 0; i < A.size(); i++) {
        addEdge( A[i], B[i], C[i] );
        addEdge( B[i], A[i], C[i] );
    }
    
    //Dijkstra time!
    auto dijkstra = [&](int s, vector<int> &dist) {
        set<pair<int,int>> Q;
        const int INF = 2000000000; 
        dist.resize(N, INF);
        for (int i = 0; i < N; i++) {
            Q.insert( {INF, i} );
        }
        auto push = [&](int x, int d) {
            if (d < dist[x]) {
                Q.erase(Q.find( {dist[x],x}));
                dist[x] = d;
                Q.insert( {dist[x], x} );
            }
        };
        push(s, 0);
        while (! Q.empty()) {
            auto it = Q.begin();
            int x = it->second;
            Q.erase(it);
            
            for (int i = 0; i < degree[x]; i++) {
                int v = g[x][i];
                int we = w[x][i];
                push(v, dist[x] + we);
            }
        }
    };
    vector<int> dist, rdist;
    dijkstra(0, dist);
    dijkstra(N-1, rdist);
    
   
    // if any of the shortest paths visits an edge of weight 0, the result
    // is -1. It is possible to have 0s in the input and result diff to -1
    for (int i = 0; i < A.size(); i++) {
        if (C[i] == 0) {
            int u = A[i], v = B[i];
            for (int r = 0; r < 2; r++) {
                if (dist[u] + rdist[v] == dist[N-1]) {
                    //critical
                    return -1;
                }
                swap(u,v);
            }
        }
    }
    
    // And now the part in which we actually count the roads.
    vector<long> dp(N);
    vector<int> byRDist(N);
    for (int i = 0; i < N; i++) {
        byRDist[i] = i;
    }
    sort(byRDist.begin(), byRDist.end(), [&](int a, int b) {
        return rdist[a] < rdist[b];
    });
    for (int u: byRDist) {
        dp[u] = 0;
        if (u == N-1) {
            dp[u] = 1;
        }
        for (int i = 0; i < degree[u]; i++) {
            int v = g[u][i];
            int we = w[u][i];
            if ( (rdist[v] < rdist[u]) && (dist[u] + rdist[v] + we == dist[N-1]) ) {
                dp[u] += dp[v];
            }
        }
        dp[u] %= MOD;
    }
    return dp[0];
}

The rest

I was very slow; everybody else has a much better score. I hope many of them fail. I am confident in the correctness of my solutions and I don't suspect any likely implementation bug. Typing this as the challenge phase begins.

During challenge phase: 8 minutes of challenge phase, and I doubt I'll score a successful challenge, there are some tough challengers in my room and they are very quick to find errors. There seem to be many ways to get 450 wrong.

Wednesday, June 04, 2014

SRM 623: Saved by the challenge phase

Div1 300: The evil one

This problem was evil. Let me explain: you have a board of maximum size 20x20; some cells contain apples, some pears, some are empty. A valid move is to move a fruit to an empty cell. You can do at most `K` moves. The objective is to maximize the size of a rectangle containing only apples. What is the maximum?

Say you already picked an arbitrary rectangle; there are `O(n^4)` of those. There are three things that are important about this rectangle: `a`, the number of apples inside; `p`, the number of pears inside; and `e`, the number of empty cells inside. You can also know the number of apples/pears/empty cells outside the rectangle. Finding these values is a complicated implementation part. Also, I think it is too slow for python, so I tried to do something in c++, using lambdas of course. It took eons. But when I finally did it, I had a new problem...

Knowing that information about the rectangle we need to find the minimum number of moves needed to have only apples in the rectangle.

At first I was trying something very wrong: move each pear to a (different) empty cell outside, then move all the apples inside. This took me ages to debug, and when I finally passed the example tests with it, I was about to click submit when I started to wonder "Is this really it?", and I noticed that noooo, the minimum number of moves is much trickier to find. Hence the examples were weak.

So what is the real optimal strategy? Well, the first thing is that we only need one empty cell in the board. We can keep reusing it to do all the moves. Then we have this: if there are any empty cells in the rectangle, it is easy to see that it is optimal to just move outside apples in to cover them. The remaining case is when there are some pears inside the rectangle, at least one empty cell outside, and enough apples outside. A single pear is handled like this: move the pear to the empty cell, move an apple to the position the pear was in; this leaves an empty cell in the apple's old position, and we can repeat.

So in fact, the minimum number of moves is equal to:

  • 0 if all the cells already contain apples.
  • # of empty cells inside the rectangle, if the rectangle contains no pears.
  • # of empty cells + 2 * the number of pears inside the rectangle, provided that there is at least one empty cell in the board (and enough apples outside).

int getBoard(vector<string> board, int K)
{
    // stuff to count the number of apples / pears / empty inside a rectangle:
    int W = board.size(), H = board[0].size(); 
    map<char, vector<vector<int>>> total;
    string chars = ".AP";
    for ( char ch: chars ) {
        vector<vector<int>> t(W+1, vector<int>(H+1,0));
        for (int i = 1; i <= W; i++) {
            for (int j = 1; j <= H; j++) {
                int x = ( board[i-1][j-1] == ch );
                t[i][j] = x + t[i-1][j] + t[i][j-1] - t[i-1][j-1];
            }
        }
        total[ch] = t;
    }
    auto rectangle = [&](char ch, int x0, int x1, int y0, int y1) {
        auto &t = total[ch]; 
        return t[x1][y1] - t[x0][y1] - t[x1][y0] + t[x0][y0];
    };
    
    int A = rectangle('A', 0, W, 0, H);
    int E = rectangle('.', 0, W, 0, H);
    
    int res = 0;
    for (int x0 = 0; x0 < W; x0++) {
        for (int y0 = 0; y0 < H; y0++) {
            for (int x1 = x0 + 1; x1 <= W; x1++) {
                for (int y1 = y0 + 1; y1 <= H; y1++) {
                    // the actual logic:
                    // inside:
                    int e = rectangle('.',x0,x1,y0,y1);
                    int a = rectangle('A',x0,x1,y0,y1);
                    int p = rectangle('P',x0,x1,y0,y1);
                    
                    // out side:
                    int na = A - a;
                    int ne = E - e;
                    
                    if (e <= na) {
                        int x = e;
                        int mov = x;
                        na -= x;
                        a += x;
                        e = 0;
                        ne += x;
                        if (mov <= K) {
                            if (p == 0) {
                                res = std::max(res, a);
                            } else if ( (p <= na) && (ne >= 1) && (mov + 2*p <= K) ) {
                                res = std::max(res, a + p);
                            }
                        }
                    }
                }
            }
        }
    }
    return res;
}

Div1 450: The difficult one

All I can think of is that we can imagine a different problem: the points stay static and the player moves either in a vertical line upwards (increasing `y` coordinate) or in a diagonal (incrementing `y` and incrementing or decrementing `x`).

Challenge phase

I took too long to solve div1 300, but I did learn that the example cases were weak. So I thought of a challenge case to make my initial solution fail (a case with only one empty square). Luckily my room wasn't very aggressive, and I could find 2 solutions with that wrong idea and earned 100 points.

Wednesday, May 28, 2014

SRM 622: Anticlimactic

It was a long day, trying to finish the very late SRM 621 editorial. I eventually improvised something. I tried to learn as much as possible about suffix automata so that I could at least explain the solution in a bit more detail than the sketches.

So then the match started, I wasn't too thrilled because it is a 9:00 PM match and they tend to be... boring and have too few coders. But oh well...

Div1 250: The one with roads

You are given a weighted complete directed graph (a map of roads connecting each pair of cities) and an integer `T`. On a special holiday, for each ordered pair of distinct cities `(i,j)`, a bus must travel from city `i` to city `j`; the path used by the bus must be the shortest possible. If there are multiple shortest paths, the bus driver may pick any of them - we don't know which. A road between two cities is unsafe if it is possible that at least `T` buses will go through it. Find the total sum of lengths of the unsafe roads.

The thing here is how to know whether a road might be used by a bus. There are `O(n^2)` buses and each might use each road. For a pair of cities `(a,b)`, road `(x -> y)` might be used if and only if mindist[a][x] + road[x][y] + mindist[y][b] = mindist[a][b], where mindist[p][q] is the minimum distance between cities p and q and road[p][q] is the length of the direct road connecting them. We can find mindist[p][q] in `O(n^3)` with Floyd-Warshall; sure, we could do something even faster, but it is not needed because the main logic (picking `a,b,x,y`) is `O(n^4)`.

Code some cute python code, submit:

class BuildingRoutes:
 def build(self, dist, T):
    n = len(dist)
    allN = range(n)
    # turn string list dist into a matrix:
    dist = [ [ord(x)-ord('0') for x in row] for row in dist ]
    # number of times a road might be used
    roadcount = [ [0] * n for i in allN ]
    INF = 10**9
    # Floyd-Warshall
    mindist = [ [dist[i][j] for j in allN] for i in allN ]
    for k in xrange(n):
        for i in xrange(n):
            for j in xrange(n):
                mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j])
    # Find potential uses:
    for a in allN:
        for b in allN:
            if a != b:
                for x in allN:
                    for y in allN:
                        if x != y:
                            if mindist[a][x] + dist[x][y] + mindist[y][b] == mindist[a][b]:
                                roadcount[x][y] += 1
    res = 0
    for i in allN:
        for j in allN:
            if roadcount[i][j] >= T:
                res += dist[i][j]
    return res 

Unfortunately, I forgot to test the largest case before submitting the solution. I then tested the largest case and it turns out it times out. Very disappointing that it times out in python, and only by a few seconds. I had to remake the code in c++ and lost plenty of points because of spending double the time and recoding. I should know better; I need to keep in mind that `50^4` is too much for python.

int build(vector<string> dist, int T)
{
    // O(n^4) too slow in python, had to migrate code to c++ . arg
    int n = dist.size();
    vector<vector<int>> roadcount(n, vector<int>(n,0));
    const int INF = 1000000000;
    vector<vector<int>> mindist(n, vector<int>(n,INF));
    // Floyd-warshall
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            mindist[i][j] = dist[i][j] - '0';
        }
    }
    for (int k = 0; k < n; k++) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j]);
            }
        }
    }
    // for each pair of cities (a,b), find if a road(x,y) could be used:
    for (int a = 0; a < n; a++) {
        for (int b = 0; b < n; b++) {
             for (int x = 0; x < n; x++) {
                 for (int y = 0; y < n; y++) {
                    if ( (x != y) && (a != b) ) {
                        if (mindist[a][x] + (dist[x][y]-'0') + mindist[y][b] == mindist[a][b]) {
                            roadcount[x][y]++;
                        }
                    }
                 }
             }
        }
    }
    int res = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (roadcount[i][j] >= T) {
                res += dist[i][j] - '0';
            }
        }
    }
    return res;
}

Div1 500: The one with tree dp (tree dp again...)

You are given a weighted tree. We want to divide the tree into the minimum number of connected subtrees such that the diameter of each subtree does not exceed `K` (the diameter is the maximum distance between two vertices in a subtree).

Okay, I didn't have time to solve it. I got distracted and was already quite late because of the 250. But here is what seems like a solution:

Imagine the root. Either THE root, or a node we are treating as a root because we cut the edge above it. Then we have to decide whether each child of the root will be used in the subtree or not.

In general, it is best to use a child instead of not using it, because if we do not use it, we are forced to create a new tree, but if we do use it, we might be able to avoid that. The issue is how to balance out the lengths of the subtrees that we will keep connected to the root.

This is a neat trick: Let `M` be the maximum distance between the root and the nodes in one of the arms of the subtree. Then the maximum distance between the root and the nodes in the other arms cannot be greater than `K - M`. We can pick one of the children, decide it will be the one with distance `M`, and then the remaining children must have arms of distance of at most `K - M`. If the edge connecting the root to a child is larger than `K - M`, we cannot use this child and must create a new tree.

We need to solve a subproblem: make the decision for a subtree rooted at a node `X`, such that the distance between `X` and any of its descendants in the same subtree cannot exceed `M` and the diameter cannot exceed `K`. This is solved in a similar way, picking one child of `X` to have the maximum distance.

All in all, we have to solve the problem for `f(X, M)`; `M` will not exceed 500 (because `K` is at most 500), so we will be fine.

Challenge phase / etc

The challenge phase has already started, so I will go check what happens. Scheduling this post so that it is published when the challenge phase ends.

Tuesday, May 20, 2014

SRM 621: Invisible slowness

So, another SRM day. I've had a headache since yesterday, so I'm running at half steam...

Div1 275: The one with the geometry

So you have a circle (signal) centered at `(0,0)` with radius `S`, where `S` is a real number picked uniformly at random from 0.00 to `Z`. There are some other circles (cities), each with center `(x,y)` and radius `r`. Things are good if, for each city, the main circle (signal) either doesn't contain any point of the city or contains ALL of the points inside the city. Return the probability that this happens. The answer can have an absolute or relative error of up to 1e-6.

So that's a nice thing about the allowed error. After topcoder added the ability to include a custom checker in problem sets, I thought it was only a matter of time until someone made a problem that allows less precision than the usual 1e-9. The first thing I did in this problem was take my special python tester template that allows you to test automatically even when there is a custom checker. I used CNP super powers to paste the usual python checker, but replacing the 1e-9 with 1e-6... I really wanted accurate tester results.

The first thing is to notice what makes a radius `S` valid for a specific city `(x,y,r)`. A nice trick in problems involving plenty of circles is to consider only the distances between their centers. Eventually you will notice that if the distance between `(0,0)` and `(x,y)` is `d`, then there are two ranges in which `S` is valid: `S <= d - r` or `S >= d + r`. Something that caused me to take a bit of time in this sub-problem was a suspected corner case: What if `(0,0)` is inside circle `(x,y,r)`? Then `d - r` is negative. It turns out that it doesn't matter: `S` cannot be negative, and thus `S <= d - r` or `S >= d + r` is still a good condition.

Once we know that condition, how do we deal with the probability? You should see those `d-r` and `d+r` values (for all cities) as the only points where the validity of `S` could change. So we can divide the `[0,Z]` interval using those points as delimiters. Now we can test each segment `[a,b]`: since `a` and `b` are the only points at which the result could change, all points strictly inside `(a,b)` have the same result, so `S` is valid for `(a+b)/2` if and only if it is valid for every point in the interval. For each valid interval `[a,b]`, just add `(b - a) / Z` to the probability.

class RadioRange:
 def RadiusProbability(self, X, Y, R, Z):
    cities = zip(X,Y,R)
    
    def goodCity(x,y,r,s):     #city completely inside circle of radius s or not
        d = (x**2 + y**2) ** 0.5
        return ( (s <= d - r) or (s >= d + r) )
    
    def goodRadius(s):         # each circle must be completely included or not
        return all( goodCity(x,y,r,s) for (x,y,r) in cities )
    
    def getEvents():
        yield 0.0
        yield Z
        for (x,y,r) in cities:
            d = (x**2 + y**2) ** 0.5
            yield d + r
            yield d - r
    
    # remove invalid events (less than 0, more than Z), duplicates (using set)
    # and turn into a list (so we can use indexes) and sort
    # (possibly unnecessary to call sorted, the set likely already sorted)
    evs = sorted( list(set( x for x in getEvents() if (0 <= x <= Z) ) ) )
    good = 0.0
    for i in range(0, len(evs) - 1):
        p = (evs[i] + evs[i+1]) / 2
        if goodRadius(p):
            good += evs[i+1] - evs[i]
            
    return good / Z

I knew how to solve the problem from the beginning, except for dealing with the corner case that turned out not to exist anyway. I am not sure why it took me so long to solve it, but it did.

Div1 500: tree stuff

Seems hard. I was tired and had a headache, so I went to lie down while I supposedly thought about the problem, but I couldn't think of much.

Challenge phase

During the challenge phase I discovered why people were faster than me in the 275: it turns out that `S` is invalid if and only if the signal circle intersects any circle (city) in the input. While most of the code remains the same, this makes writing the code a bit easier and also removes the need to check the validity of `(a+b)/2` at the end; instead you just do a line sweep. (Each event `d-r` increments the number of intersecting circles, each `d+r` decrements it; sum up the intervals in which the count is 0.)
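Something like this minimal sketch captures that line sweep (this is not my submitted code; the class name is made up and it just assumes the same method signature as above):

class RadioRangeSweep:
 def RadiusProbability(self, X, Y, R, Z):
    # Each city blocks the values of S strictly between d - r and d + r.
    events = []
    for (x, y, r) in zip(X, Y, R):
        d = (x * x + y * y) ** 0.5
        lo = max(0.0, d - r)
        hi = min(float(Z), d + r)
        if lo < hi:                        # ignore blocked ranges that miss [0, Z]
            events.append( (lo, +1) )
            events.append( (hi, -1) )
    events.sort()
    good = 0.0
    last = 0.0
    count = 0                              # number of cities currently blocking S
    for (pos, delta) in events:
        if count == 0:
            good += pos - last             # nothing blocked in [last, pos]
        count += delta
        last = pos
    if count == 0:
        good += Z - last                   # tail of [0, Z] after the last event
    return good / Z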

Sunday, May 18, 2014

Auto-test Topcoder problems with multiple correct answers

Topcoder has been improving the decade-old contest system a bit. Things like allowing larger constraints, the possibility of a different time/memory limit per problem, etc. But the greatest change of them all, the one that actually brings problem variety rather than Fake Difficulty, is that some problems now allow multiple correct answers.

The first SRM to have these problems was SRM 617. The division 2 easy and division 1 medium allow multiple correct answers. They were interesting problems that would not have worked quite as well without this ability.

In the SRM 617 div2 easy, you have to return two composite numbers that add up to `n`, where `n` is at least 20 and at most 1000. It turns out there is a very cool and easy solution: return `(4, n-4)` if `n` is even, else `(9, n-9)`. But if this problem had appeared in old topcoder, how would we have led coders to this solution? We would have had to specify tons of things, like: the first integer must be the smallest composite that makes the result as intended. But then there would be odd cases in which `(4, n-4)` is an answer, and the solution would become more complicated (not appropriate for a div2 easy). Even if we did all that, the problem wouldn't be very nice, because the example cases' results would give the solution away too easily...

However, the bad thing about these new problems is that this is Topcoder, and many of us are used to Topcoder Arena plugins that generate automatic testers based on the example test cases. If we used our old arena plugins on these problems and the correct results of our solution didn't exactly match the expected outputs, the automated tester would tell us that the results are wrong, even when they are actually right. We need some changes.

So after SRM 617 I made plans to work as hard as possible in making my Greed tester setup able to work around this a bit. This is the story.

Identifying those problems

The first thing to do is to identify the problems that allow multiple correct answers for a test case. This requires help from the Topcoder Arena developers. During SRM 617 I told the admins to remember that the arena plugin API needs a way to detect this. Then I waited, and waited. Then I figured that there is a thread I know that seems to be monitored closely by the Arena developers. Making a post in that thread seems to be the only way I have to contact them, and so I did. I really wish the plugin API was documented at all and that each change was announced and explained and all new methods were known. But this is not the case. Even after receiving confirmation that this feature was in the API, I had to dig through the Jar looking for something that might be it.

problem.getComponent().isCustomChecker();
// where problem is com.topcoder.client.contestant.ProblemComponentModel

So now I just needed to add the feature to Greed so that any template was able to identify these problems and then we can do the magic.

I made a pull request; that PR adds a Problem.HasCustomChecker field. You can use it in templates: ${if ProblemHasCustomChecker}...${end} . The pull request is still not merged, and thus you will have to do some magic to compile Greed using it. If you want...

Modifying the tester templates

We also need to modify the tester so that it can make use of this information. But how?

My idea of how to allow automated testers in these problems consists of two parts:

  • My tester already supports test results in which it is not known whether or not the returned answer is correct, and it can already report results as such (using a ? symbol). Testing is still useful because you can get the results for all examples with a single run and you can also detect any example that times out or has a runtime error. So the first idea is to make the generated test code treat every test case this way when the problem has this flexibility.
  • However, that doesn't help when the coder also wants/needs to detect wrong answers. The only workaround is to allow the user to code a checker method for the returned output.

In many problems, it is easier to check whether an output is correct than to solve the problem. For example, in the problem I described above it is easy to check that the returned value is a pair of integers, that they are both positive, that they add up to n and that they are composite. So we can make a checker.

In other problems, it may not be so easy to know if a result is correct without solving the problem. But if you already know one correct result for that input, then it is still easy to write a grader. For example, take a problem that asks for the shortest sequence following some conditions. If you know that the provided example case result is a 3-element sequence, then any returned sequence with more than 3 elements is wrong and can be reported as such, making use of that known result.

My idea is to have a checker method in the problem class. It should take the input, the returned output and the example's expected output (if it exists). If the checker method returns 0, then it is not known whether the provided answer is correct. If it returns -1, we know it is wrong, and if it returns 1, we know it is correct. Initially, this method returns 0 in all cases, but it can be modified by the user.

I implemented this in python, it looks like this:

class SilverbachConjecture:
 def solve(self, n):
    if n % 2 == 0:
        return (4,n - 4)
    else:
        return (9,n - 9)

# CUT begin
#-------------------------------------------------------------------------------
 def greedCheckResult(self, testInput, returnedAnswer, expectedAnswer):
    (n,) = testInput
    #You can implement a custom checker.
    # Return value: -1 is WA, 0 is unknown and 1 is AC.
    # expectedAnswer is not None, then expectedAnswer is a known correct answer.
    if len(returnedAnswer) != 2:
        return -1
    (a,b) = returnedAnswer
    if (a + b != n) or (a <= 0) or (b <= 0):
        return -1
    
    def isComposite(x):
        p = 2
        while p*p <= x:
            if x % p == 0:
                return True
            p += 1
        return False
        
    return 1 if isComposite(a) and isComposite(b) else -1

The trick is that the generic tester is a separate file, but I updated it a bit to detect whether a greedCheckResult method exists; if it does, then it runs the required logic.

I am not sure yet how to do this in c++, but I should try something tomorrow.

Besides modifying the generic tester, I also modified my python template. This is the part that makes use of the HasCustomChecker field:

# ${Contest.Name} - ${ClassName} (${Problem.Score} points) by vexorian :

${<if Problem.Description.Modulo}
MOD = ${Problem.Description.Modulo}

${<end}
class ${ClassName}:
 def ${Method.Name}(self, ${Method.Params}):
    return ${Method.ReturnType;zeroval}
${<if Problem.HasCustomChecker}


${CutBegin}
#-------------------------------------------------------------------------------
 def greedCheckResult(self, testInput, returnedAnswer, expectedAnswer):
    (${Method.Params},) = testInput
    #You can implement a custom checker.
    # Return value: -1 is WA, 0 is unknown and 1 is AC.
    # expectedAnswer is not None, then expectedAnswer is a known correct answer.
    return 0

${CutEnd}
${<end}

${CutBegin}
${<TestCode}
${CutEnd}

The relevant changes to the generic tester affect only the run_testcase function:

def run_testcase( problemClass, methodName, testInExpected, caseNo, totalCase, testTimeOut, finalLines, compactMode ):
    (testIn, expected) = testInExpected # so that it compiles in python3
    if compactMode != ONLY_REPORT: 
        sys.stdout.write(TEST_COLOR + "Test " + str(caseNo) + COLOR_RESET + ": " + prettyStr(testIn)+"\n")
    startTime = time.time()
    instance = problemClass()
    exception = None
    try:
        result = getattr(instance, methodName)(*testIn);
    except Exception as e:
        ln = None
        for x in traceback.extract_tb(sys.exc_traceback):
            if x[0] == problemClass.__name__+'.py':
                ln = x[1] 
        if ln is None:
            exceptionShort = '%s, (unknown line)'%( type(e).__name__ )
        else:
            exceptionShort = '%s, line: %d'%( type(e).__name__, ln )
        exception = traceback.format_exc()
        
    elapsed = time.time() - startTime   # in sec
    if compactMode != ONLY_REPORT:
        sys.stdout.write("Time: %.02f seconds.\n" % elapsed)

    knownAnswer = (expected is not None)
    if exception is not None:
        if compactMode != ONLY_REPORT:
            sys.stdout.write("RUNTIME ERROR: \n")
            sys.stdout.write(exception)
        correct = False
    else:
        if hasattr(instance, 'greedCheckResult'):
            check = instance.greedCheckResult(testIn, result, expected)
            correct = (check >= 0)
            knownAnswer = (check != 0)
        else:
            correct = expected is None or tc_equal(expected, result)
        if compactMode != ONLY_REPORT:
            if not correct or not knownAnswer:
                sys.stdout.write("Desired answer:\n")
                sys.stdout.write("\t%s\n" % prettyStr(expected) )
            sys.stdout.write("Your answer:\n")
            sys.stdout.write("\t%s\n" % prettyStr(result) )
    
    res = '?'
    if exception is not None:
        res = 'E'
    elif not correct:
        res = 'X'
    elif elapsed > testTimeOut:
        res = 'T'
    elif knownAnswer:
        res = '+'
    
    # final lines:
    s = ( TEST_COLOR + "t" + prettyCase(caseNo, totalCase) + COLOR_RESET + ": " + colorTestResult(res) )
    s += (" (%.2fs)" % elapsed)
    if res in ('?', 'X'):
        s += (" (" + prettyStr( result)+ ")" )
    elif res == 'E':
        s += (" " + exceptionShort)
    finalLines.append(s)

    if compactMode != ONLY_REPORT:
        sys.stdout.write(" %s\n" % colorTestResult(res) )
        sys.stdout.write( BAR_COLOR + ('='*BAR_LENGTH) + COLOR_RESET + "\n")
    return res

I'll eventually make a more formal release of this update once the field is added to the official Greed and I figure out a clean way to do this in c++.

Sure, it is possible that the checker you implement has a bug and thus reports your results as correct when they are not, so you submit and fail system tests. But it is better than nothing.

Victory (for now):

Sunday, May 11, 2014

The greatest topcoder t-shirt never made

The other day I received my SRM 600 t-shirt. If you were paying attention you'd remember I did terribly in SRM 600 / SRM 600.5, so you may be wondering what's up with that. Simple: it turns out all the problem setters / testers who worked on SRM 600 got their t-shirts too. And I was the editorial writer, so this is actually the first t-shirt I got for writing editorials rather than for competing.

In case you are wondering what the SRM 600 t-shirt looks like: on the front it says [SRM 600] in large letters, and on the back it has the new Topcoder logo. It is black.

But this blog post is not about that t-shirt; it is about this old TC forums bit:

forums link
Members, admins, country wo/men:

We're trying something a little different for the TCCC T-shirt this year.

Thanks to the thread about best member quotes, http://forums.topcoder.com/?module=Thread&threadID=512175&start=0 we thought, "Wouldn't it be cool to see a member's quote on the back of the TCCC T-shirt?"

So, here's what we'd like you to do. Search your little hearts out for your favorite member quotes. Post each one individually, so the member community can use the feedback symbols (+/-) to show their like or dislike of each quote.

Then, on September 11, TopCoder will select the top five quotes. A quote will be chosen if it meets all of the following requirements: member popularity, originality and TopCoder/programming subject matter. On September 18 everyone (admins included) will vote from the chosen five for their absolute favorite.

There won't be any prize money given, but won't the winner feel warm and fuzzy knowing a little part of him or her is being worn by 2000 people? I know I sure would.

To get the ball rolling, I've posted a few of my favorites below.

C'mon, let's see some more now!

(This was eons ago, back when TCCC was a thing)

And the reply:

sql_lall wrote:

I have this quote on my wall at home :) I'm wondering though, how long it'll be until someone changes their quote just for this thread. i.e. ""I memoized inside a ternary search and all I got was this lousy t-shirt" ;)

[Edit: removed my member quote suggestion, as I'm guessing most people are voting for the quote mentioned above. But then again, I could be wrong :p]

So, this is awesome; what happened with this t-shirt idea? Back then, topcoder decided t-shirt contents using the Schulze method. (Topcoder actually deciding anything using a poll now sounds like too alien of a concept.) And another idea won: doh

I mean really, An Edison quote? EDISON?? Nobody likes that guy anymore. Disappoint.

Saturday, May 10, 2014

SRM 620, python fails again

This was the last of the 10 SRMs in which I decided to use python exclusively. It was also definitely the one in which the poor topcoder python support did the most harm.

Div1 250, the one with gcd

Game: You start with a pair of positive integers `(x,y)`; in each move you can move to `(x,x+y)` or `(x+y,y)`. You are given two pairs `(a,b)` and `(c,d)`, where each integer is up to 1000000. You want to know if there is a pair `(x,y)` such that both `(a,b)` and `(c,d)` are reachable using it as a starting position. Return the largest possible sum `x+y` in that case, else return -1.

The idea is to go backwards. From `(a,b)`, there are two candidate backward moves: `(a-b,b)` and `(a,b-a)`; however, since the integers must be positive, only one of them is valid: if `a>b`, pick `(a-b,b)`; if `a<b`, pick `(a,b-a)`; if they are equal, this is the time to stop, because otherwise one of them would stop being positive.

So it is easy to build the two sets of pairs `(x,y)`: those that can reach `(a,b)` and those that can reach `(c,d)`. We just need to find whether the two sets have an intersection; if they do, return the maximum sum. Easy:

Notice that when one of `a` or `b` is 1 and the other is 1000000, there are 1000000 elements in one list. So there can be tons of elements, and looking for the intersection is hard unless we are clever and use sets that can check membership in logarithmic time (or maybe a hash table, but hash tables really blow). In c++, that means std::set or std::map; in python:

class PairGame:
 def maxSum(self, a, b, c, d):
    # find all the pairs that can be reached going backwards from (a,b):
    def backwards(a,b):
        while a > 0 and b > 0:
            yield (a,b)
            if a >= b:
                a -= b
            else:
                b -= a
    # find the intersection between the two sets:
    inter = set( backwards(c,d) ) & set( backwards(a,b) )
    return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)

I submitted it and scored ~225 points. I was so proud of this code. I knew that people coding in c++ or one of the joke languages would be making totally awful things, like repeating the code used to go backwards from `(a,b)`, copying and pasting it to do the same with `(c,d)` but with the slightly different search.

Unfortunately, it times out with `(a,b) = (1000000,1)` and `(c,d) = (1000000,1)`, or similar inputs. I noticed that it barely runs in around 1 second when none of the integers is 2. So the only way to fix this is to handle the cases where any of them is 1 individually (without making huge loops). This complicates the code a ton, and I needed tons of resubmits before getting the following code, which might still be wrong:

class PairGame:
 def maxSum(self, a, b, c, d):
    # this wouldn't be necessary if python wasn't too slow for O(n) when n = 1000000
    if a == 1:
        # The path to (a,b) is (1,1), (1,2), ... (1,x), ... (1,b)
        #can we reach (1,x) going backwards from (c,d) ?
        #if so, what's maximum x that we can reach?
        while (c > 1) and (d > 0):
            if d > c:
                d = d % c
            else:
                c = c % d
                if c == 0 and d == 1:
                    c = 1
        if c == 1:
            return min(b,d) + 1
        else:
            return -1
    if b == 1:
        return self.maxSum(b,a,d,c)
    if c == 1 or d == 1:
        return self.maxSum(c,d,a,b)
        
    # Let's pretend the following is the only real part of the code, okay?
    def backwards(a,b):
        while a > 0 and b > 0:
            yield (a,b)
            if a >= b:
                a -= b
            else:
                b -= a
    inter = set( backwards(c,d) ) & set( backwards(a,b) )
    return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)

(In fact, at first it seemed like the main problem was memory, but even after fixing the memory usage, it is still too slow.)

All those using c++ or Java or perhaps even C# wouldn't have this issue. It caused me to lose plenty of points in resubmits and I might fail still. The problem is much harder when `n = 1000000` is too slow for your language. And the worst thing is that the difficulty wouldn't have changed much for c++/Java if the constraint had been 500000, but it would have saved python. Well, too bad.

Div1 500

I had little time. All I can say is that the problem is the same as if you just picked a sequence of skills and sorted by this sequence of skills lexicographically. The rest is fuzzy.

Monday, May 05, 2014

SRM 619: Nesting min-cost matching inside dynamic programming

Div1 250: The one with stacks

You start with up to 50 piles of stones. Two players play a game, alternating turns. In a given turn, a player must pick three piles `(a,b,c)` such that pile `a` has at least 2 stones. Pile `a` is split into two non-empty parts, which are added to pile `b` and pile `c` (so pile `a` disappears). If a player cannot find any valid turn, they lose. Find whether the first player is guaranteed to win or to lose, assuming the two players play optimally.

During the match, I did a dynamic programming solution based on the fact that the game state can be described by two variables `(p,q)`, where `p` is the number of piles with more than 1 stone and `q` is the number of piles with exactly 1 stone. This leads to an `O(n^2)` dp.
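For reference, here is a hedged reconstruction of that `(p,q)` dp (not my actual submission; the function name and the memo-dict style are my own). The observation behind the transitions is that pile `a` disappears and piles `b` and `c` always end up with at least 2 stones, so the only thing that matters is where `b` and `c` came from:

def firstPlayerWinsDP(piles):
    memo = {}
    def win(p, q):   # p = piles with >= 2 stones, q = piles with exactly 1 stone
        if (p, q) in memo:
            return memo[(p, q)]
        res = False
        # A valid move needs a pile `a` with >= 2 stones plus two other piles b, c.
        if p >= 3 and not win(p - 1, q):                 # b and c both had >= 2 stones
            res = True
        if p >= 2 and q >= 1 and not win(p, q - 1):      # one of b, c had exactly 1 stone
            res = True
        if p >= 1 and q >= 2 and not win(p + 1, q - 2):  # b and c both had exactly 1 stone
            res = True
        memo[(p, q)] = res
        return res
    p = sum(1 for x in piles if x >= 2)
    q = sum(1 for x in piles if x == 1)
    return win(p, q)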

However, as it turns out, the problem can be solved in `O(1)` time. Note that every valid move reduces the number of piles by 1. It turns out that all states with an even number of piles are losing states, so if the number of piles is odd AND there is at least one valid move, then the state is winning.

We need to prove that all even states are losing ones. This is obviously true for `n = 2`. For `n >= 4`, note that the only losing states with `n-1` piles are those that consist entirely of 1s. However, it is impossible to reach such a state with a valid move, because after any valid move at least two of the piles will be greater than 1.
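If that argument holds, the whole thing boils down to a constant-time check (again just a sketch with a made-up name; the real method returns whatever format the statement asks for):

def firstPlayerWinsFast(piles):
    n = len(piles)
    # A valid move exists iff there are at least 3 piles and some pile has >= 2 stones.
    canMove = (n >= 3) and any(x >= 2 for x in piles)
    return (n % 2 == 1) and canMove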

Div1 600: ouch

This problem's statement is too long to copy and rewrite, so I will skip straight to the solution.

Imagine you wanted to calculate `f(x,i)`, the minimum cost to assign skills to the sub-tree rooted at `x` if you already know that one of the skills chosen for `x` is skill `i`.

We can easily find `f(y, j)` for any child `y` of `x` and any skill `j`. Out of all the skills assigned to `x` and its children, it should be possible to make a permutation of skills. Let's focus on this permutation. There are two possibilities:

  • Use `i` in the permutation, meaning that all children must use skills different to `i`.
  • Use a different skill in the permutation for `x`. All children and `x` need to pick a unique skill.

Using the values of `f(y,j)` and some ad hoc work, you can easily build a table of costs for each case: the cost of assigning each of the employees to each skill in the permutation. Then you have a minimum-cost bipartite matching. It was too difficult to implement in the little time I had after figuring this out.

Thursday, April 24, 2014

SRM 618: When python failed

This is the eighth of ten matches in which I promised to use python exclusively. This is also the first match in which I feel like I can claim I lost plenty of points because of this restriction.

Div1 250: The one with heteronormativity

So I am supposed to be solving this problem, and all I can think of after reading it for the first time is that the problem's definition of family allows incest but is all about not allowing same-gender pairings :/ okay...

So a "family" is defined as a graph in which each node (person) has either zero parents or exactly two parents, and the two parents MUST be nodes with different genders. Also, the graph must be topsorted (parents come before their children). Given a graph of up to 100 nodes, return whether it could be a family or not. Basically, you have to assign genders to each node. It turns out that the input format guarantees that the graph is topsorted and that each node has 0 or 2 parents, so all we need to do is ask whether we can assign the genders correctly.

If two nodes appear as the parents of the same node, then their genders must be different. Imagine a different graph in which each such pair of nodes (the ones that must have different genders) is connected by a single edge. Then that graph must be bipartite. If the graph is bipartite, then the answer is that yep, the graph could be a "family"; else not. Checking whether a graph is bipartite is an easy application of depth-first search.

class Family:
 def isFamily(self, parent1, parent2):
    n = max(max(parent1),max(parent2)) + 1
    # Make a set of edges in the hopefully bipartite graph, they are undirected 
    E = set(zip(parent1, parent2)) | set( zip(parent2, parent1) )
    E.remove( (-1,-1) )
    
    # is the graph bipartite?
    color = [-1] * n
    result = [ "Possible" ]
    def dfs(x, c):
        if color[x] == -1:
            color[x] = c
            for (u,v) in E:
                if u == x:
                    dfs(v, 0 if c == 1 else 1)
        elif color[x] != c:
            result[0]  = "Impossible"
        
    for i in range(n):
        if color[i] == -1:
            dfs(i, 0)
        
    return result[0]

I took a bit of time mostly because of the above-mentioned distraction :P, but python helped me keep the code clean. I think it should pass unless 100x100 is suddenly too slow (I doubt it).

Div1 500: The one with strings.

You should count the number of maximal-length strings that use an alphabet of `n` letters and follow these properties:

  • No consecutive equal letters.
  • For each pair of (possibly equal) characters X and Y, string "XYXY" is NOT a subsequence of the string.

The first issue is how to know what is the maximum length possible. After plenty of analysis, I found that the maximum length possible is 2n-1. However, there are two methods to achieve it:

  • One way is to surround a string of length 2(n-1)-1 that uses (n-1) alphabet characters with two instances of one character different from those (n-1) ones. For example: A????????????A, where ????? contains no A.
  • The other way uses 3 instances of the extreme character: A??????A!!!!!!!A, where ?????? and !!!!!! cannot share characters either.

So the rest is to count the number of valid patterns. It is not necessary to worry about which character goes to which position of the pattern, because we can just multiply the result by `n!`.

The rest is an `O(n^2)` dynamic programming. Unfortunately, `n` is up to 5000 (I wish it were 1000), so no matter what I do in python, the thing will take at least 6 seconds. That's too much.

More info about the recurrence in the comments of the code:

MOD = 1000000007

def modFactorial(x):
    r = 1
    for i in xrange(1, x + 1):
        r = (r * i) % MOD
    return r
        

class LongWordsDiv1:
 def count(self, n):
    # The maximum number of letters is 2*n-1, although there are multiple ways 
    # to reach that maximum
    # Example, for n = 5
    # 012343210
    # 010234320
    # 010232420
    # 012103430
    # 012321040
    # 012131040
    # there should be 9, but can't think of the other 3
    
    dp = [0] * 5001
    dp[0] = 1
    dp[1] = 1
    for i in xrange(2,n+1):
        #1) 2n - 1 = 2 + (2(n-1) - 1)
        res = dp[i-1]
        #2) find a,b: 2n - 1 = 3 + 2a + 2b - 2
        #                 b  = (2n - 1 - 3 - 2a + 2) / 2
        #                 b  = (2n - 2a - 2) / 2
        #                 b  = n - a - 1
        a = 1
        while a < i:
            b = i - a - 1
            if b > 0:
                res += (dp[b] * dp[a]) % MOD
            else:
                break
            a += 1
        dp[i] = res

    return ( dp[n] * modFactorial(n) ) % MOD

It is possible I missed something in this analysis, though.

Monday, April 21, 2014

Topcoder SRM 617 recap: regular

This SRM took me by surprise. I think the programming contest calendar I use wasn't updated correctly, or I didn't read it correctly the first time, so I thought SRM 617 was on another date. On Sunday, I got the Topcoder email and surprise! It's a good thing that we no longer have that bug in which invitation emails don't get sent...

This is the 7th of 10 SRMs in which I am using python exclusively. So far I don't think it has caused any negative impact on my rating. I think that unless you are at such a high rating that you really need frequent div1 1000s, python's drawbacks are not a big deal and the pros tend to compensate for the cons :/

Div1 250: The one with cutting cake

You have the nonsensical notion of a one-dimensional cake of length `n`. Your friends are going to come visit; all you know is that the number of friends will be a proper divisor of `n`, but you are not sure which one. You want to make cuts to the cake in such a way that you can distribute the contiguous pieces evenly among your friends, no matter what the number of friends is. The pieces a friend receives may be of different lengths and the number of pieces each friend gets is not necessarily always the same; the only conditions are that the pieces are contiguous after cutting the cake and that the total length of cake each friend receives is equal. Return the minimum number of cuts needed so that this always works, regardless of the number of friends.

Well, the key is that the amount of cake should be the same for each friend, so for each `y` that is a proper divisor of `n`, with `x = n/y`, there MUST exist cuts at `x, 2x, 3x, ...`. The problem asks for the minimum, so we shouldn't make any cuts other than those `x, 2x, 3x, ...` for each `x`. The trick is that some values `kx_0` and `rx_1` may be the same for different divisors `y_0, y_1`. So we should just count all the numbers that are equal to some `k * (n//y)`, where `k` is an integer, `y` is a divisor of `n` and `k * (n//y) < n`.

So simple simulation will be fine. During the match, I had the feeling that with `n <= 100000` and the 2 second limit, this simulation would be a bit slow for python. I did things like getting the divisors in `O(sqrt(n))` time. And even after that, I made sure to test the worst input. The worst input is the number `n <= 100000` that has the most divisors; this is the largest highly composite number within the limit, 83160, in short. It turns out it wasn't necessary. Also, with some knowledge of iterators and sets, the code can be much simpler than the code I submitted during the match:

class MyLongCake:
 def cut(self, n):
    cuts = set()
    for d in properDivisors(n):              # for each proper divisor d:
        cuts |= set( xrange(n/d, n, n/d) )   # add n/d, 2n/d, 3n/d, etc to set
    return len(cuts) + 1                     # return number of cuts + 1

def properDivisors(n):
    return (i for i in xrange(1, n) if n % i == 0)

During the challenge phase, you'd figure out from reading other people's code that the integers that get cut are exactly those that are not coprime with `n`. So the result is actually `n - phi(n)`.
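If that observation is correct, the whole solution reduces to Euler's totient (a hedged sketch, helper names are mine):

def totient(n):
    result = n
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p      # discount multiples of the prime factor p
        p += 1
    if m > 1:
        result -= result // m          # one prime factor larger than sqrt(n) may remain
    return result

def cutWithTotient(n):
    # should match len(cuts) + 1 from MyLongCake above, if the observation holds
    return n - totient(n)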

Div1 800: strings

I opened this problem accidentally when I intended to open div1 500 :/ . It seemed like I had no idea how to solve it. Also, when a problem's score is much smaller than usual, it is possibly much harder than usual (sorry admins, everyone knows you are backwards when choosing problem scores), so I skipped to div1 500, a bit too late, perhaps.

Div1 500: The one with pie and dolphins

You have 50 programmers. For each `i < n`, where `n <= 1000`, you should choose between:

  • Giving a dolphin to programmer `C_0[i]` and a pie to `C_1[i]`, or
  • Giving a dolphin to `C_1[i]` and a pie to `C_0[i]`.

After you give out all these pies and dolphins, each programmer will calculate the absolute value of the difference between the number of pies and the number of dolphins they got. You want to minimize the total sum of all these absolute values.

Ooh boy. So I stared at the computer screen for more than half an hour. Everything I thought of was nonsense for a good while. Eventually, I decided that the problem was most likely greedy. So the new issue was how to think of a greedy solution. If you process the decisions in the order of `i`, it is difficult to predict what will happen later and how it should affect your decision. But there is no need to do that: you can really make each decision at whatever time you want. A decision will change the result by -2, -1, 0, 1 or 2 points. What we can try is to repeat this 1000 times: pick the decision (out of all those not picked yet) that changes the score for the best. This is easily implemented through simulation. Complexity is `O(T^2)`, where `T` is the number of choices.

I was typing this and had some bugs. It took longer than usual to debug because it is not easy to know whether an answer is correct or not. (This was the first Topcoder problem ever that allowed multiple correct answers to the same test case.) Eventually, I found out that in two lines I had i where I meant j. Too bad, the coding phase had already ended. I tried the solution in the practice room after fixing it, and it passes, although the execution time gets awfully close to 2 seconds.

class PieOrDolphin:
 def Distribute(self, choice1, choice2):
    choices = zip(choice1, choice2)
    n = 50  # values of choice1, choice2 go from 0 to 49, we can assume 50 people
    pie = [0] * n
    dolphin = [0] * n
    res = [ 3 ] * len(choices)
    for i in range(0, len(choices)):
        best = (3, (0,0), -1, 3)
        for (j, (a,b) ) in enumerate(choices):
            if res[j] == 3:
                # try (a,b) or (b,a) for choice:
                for (x,y,z) in ( (a,b,1), (b,a,2) ):
                    # calculate how the score would improve
                    p = abs(pie[x] - dolphin[x] + 1) -  abs(pie[x] - dolphin[x])
                    q = abs(pie[y] - dolphin[y] - 1) -  abs(pie[y] - dolphin[y])
                    best = min(best, (p+q, (x,y), j, z) )

        (x , (a,b), bi, t) = best
      # print (a,b),' would improve by ', x, ' index ',bi
        res[bi] = t
        pie[a] += 1
        dolphin[b] += 1
    return tuple(res)

So why is this correct? Well, no idea, but I thought it should be correct because, in theory, each decision is used in the best possible way, and each decision should reduce the effectiveness of at most one other decision. I have nothing formal to prove it.

I know that the real answer involves turning the thing into a graph and doing some proofs to show that it is easy. I think that this graph solution eventually reduces to this greedy, but no idea yet.

So?

I think it was a fine problem set, if a bit unusual (odd problem scores, flexible answers in div1 500).