Showing posts with label srm. Show all posts
Showing posts with label srm. Show all posts

Tuesday, December 09, 2014

SRM 640 : Do I still have write access here?

It's SRM 640, that means there have been 140 SRMs since SRM 500, remember SRM 500? It was fun.

Div1 250: The one with Trees

You have a bunch of stars with different colors. Use Ribbons to connect the stars, so that the ribbons and stars make a tree (connected, no cycles). Only some specific pairs of stars may be connected. Return the minimum number of ribbons connecting two stars with equal colors we should use.

So... isn't this the very definition of the minimum spanning tree problem? You need to find an undirected tree, with a minimum sum of used edge costs. Just make an edge cost 1 if the two stars it connects have the same color.

So just implement Prim's or Kruskal's Minimum Spaning Tree algo. I went for Kruskal because I am odd.

class ChristmasTreeDecoration:
 def solve(self, col, x, y):
    n = len(col)
    # I like python for things like this, create a list of edges (c,u,v) c is cost
    edges = [ ( int(col[i-1] == col[j-1]), i-1, j-1) for (i,j) in zip(x,y) ]
    
    # Precarious union find thingie 
    parent = range(n)
    def getParent(x):
        while parent[x] != x:
            (x, parent[x]) = (parent[x], parent[parent[x]]) 
        return x
    
    # Kruskal:
    total = 0;
    for (cost, u,v) in sorted(edges):    #sorting edges by cost
        if getParent(u) != getParent(v): #distinct components
            parent[getParent(u)] = v     #connect them
            total += cost                # add cost
    return total

After coding this and realizing it is super short and elegant I starred at the screen for a while. Skeptical that this was so easy. 250s lately are harder than this. I also doubted for a while if my getParent(x) function is bugged or not. Normally I use a stack to move all x to have parent[x] = the super grand parent. But this seems to work fine.

Div1 550: The one with bipartite graphs

You are given `n_1,n_2,a,d`, `n_1,n_2 <= 1000000000` return the maximum number of edges in a graph with `n_1+n_2` vertices, where each vertex has degree of at least `d` and the maximum bipartite matching is `a`.

So yeah... this problem exists. I am sure there has to be some sort of solution.

Div1 1000:

Another problem that probably has a solution. My hunch is that this one is rather easy once you do plenty of math.

Challenge phase

Nothing to report.

Let's see if that 250 survives.

Saturday, August 30, 2014

SRM 631 : A new arena

Remember when I wrote blog posts periodically? Remember when I actually had energy to have an explanation ready just by the end of the match? Remember when editorials weren't super late? It was a nice time for sure.

Participated in Topcoder SRM 631 . It was a special match because earlier this week the Web version of the Topcoder arena was released (in beta). There is actually a thing in which if you use the arena in SRM 631 or 632 and fill a survey you can maybe, if you are super lucky, maybe, win a ticket to the TCO. So I decided to use it in these matches. The new Arena currently lacks many features. And I have the impression that there were some few wasted opportunities in regards to how it was designed. I will focus this post on what I experienced about the new arena. If

Registration

I was very busy in the morning finishing the first part of SRM 630 editorial (log in required, which is terrible). I've had a very bad streak with my editorial work. First I was actually solving the problem in SRM 629 quite fast, even figured out div1 medium by myself and all, but the div2 hard got me stuck for too long, so long that SRM 630 started. I made the foolish decision to finish 630 before taking back to work in 629., thinking that at least then only one editorial will be late. That was a very bad decision because it turns out that div2 hard and div1 medium were both the same problem. One with a very easy solution that I couldn't understand for over a week, not helped by a paper that made me feel dizzy when reading it...

The short story is , I was busy and distracted so when it came time to register for the match , I forgot to use the web applet (I was already using the Java one, because I needed to test the SRM 630 problems and the web applet doesn't have practice or plugins). Lucky because it turns out that people who tried to register using the web Arena had issues.

One of the Use Cases I had in mind regarding the web arena was the ability to register to the match from a mobile device. You cannot run the Java Arena in Android or iOS. Unfortunately, my attempts to load the arena in Android Firefox were futile and I didn't have time to unfreeze the Google Chrome Android app so I could test it.

I really hope that whatever issue is making it so hard for the web app to load in mobile browers (or maybe just Mobile Firefox) is eventually not a problem. Will make sure to report once I have more info.

The match

I opened the web arena and looked for a way to enter my room. In the competition room, you meet this:

It actually took me a while to find the problem links. They are in the right. The web arena actually shows you the problem names before you open the problem. This has interesting consequences. Remember that the submission score starts to drop the instant you open a problem, so in topcoder matches we tend to try to solve one problem at a time and usually it is best to start with the easiest (lowest score) problem first. Some problem names are very descriptive , however, so maybe if a problem hints you up that the solution will be too easy or too hard for you the problem names are useful information. Or maybe disinformation if the problem name is not that good. Also, in some cases the problem name can give you a hint of who is the writer of the problem...

I don't really like that you need to click "Unopened" to open the problem. It is just not intuitive. I would have preffered an [open] link/button and some test below problem name to indicate if you already opened. Optimal would be to show the problem score, decreasing real time.

I open the div1 easy problem, remember that I am in low energy mode. It took me a while to read the statement. In this problem, you have to turn a board containing either Black or White cells into a board such that, for each column, the maximum number of consecutive cells (in the column) of the same color is `N/2`. A step consists of picking a row and either painting all white cells black or all black cells white. Note that this is the same as painting the whole row white or black. Return the minimum number of moves. I was trying to solve this problem when I noticed something was missing : The constraints. Turns out that the problem statement didn't display them... This is the kind of bug that we could have caught before the match if there was a practice feature - Just saying.

So I needed to open the problem in the Java arena to read the constraints (If the maximum `N` is 8 the problem is straightforward, if the maximum `N` is 1000 it would be very difficult to make it run in time, Constraints are everything). Logging in the Java arena forces the web arena to close. Logging in the Web arena causes the Java arena to freeze. Both arenas need a constant connection, and I was hoping the web one didn't (maybe make it so having a constant connection enables chat and real time score update but is optional).

On the bright side, having to open the problem in the Java arena meant that the Greed plugin would generate test code for me.

How to solve div1 250

The constraints were: Matrix is size `N times N`, `N ` is even, `N <= 50`.

I think the key here is to understand how the columns work. There can only be at most one group of more than `N/2` consecutive cells of the same color in a single column. If no such group exists, we can ignore the column. So we end up with a problem in which each column has a group of at least `N/2 + 1` that we need to cut. The cool property here is that at least 2 rows will be shared by ALL of these groups. Even in an extreme case, if there were 2 groups in two columns, one group was topmost and the other bottommost and the number of consecutive equal colors was `N/2+1` in both columns; even in this case, the two groups would share two rows. It is nice that there will always be two shared rows, because we can always just paint one of them white and the other black and it will be enough to "cut" all the columns. In short: We can always solve the problem in 2 steps. So we only need to worry about two special cases : a) If the board is already correct then the result is 0 steps. b) Maybe the board can be solved in a single step, pick one of `O(n)` rows and paint either completely black or completely white and check if the board is correct. This leads to an `O(n^3)` algorithm.

class TaroJiroGrid:
 def getNumber(self, grid):
    n = len(grid)
    # result is at most 2
    res = 2
    
    def goodGrid(g):
        def goodColumn(i):
            last = '!'
            x = 1
            for j in xrange(1, n):
                x = x + 1 if g[j][i] == g[j-1][i] else 1
                if x > n / 2:
                    return False
            return True
        
        return all(goodColumn(i) for i in xrange(n))
    
    if goodGrid(grid):
        return 0
    
    for i in range(n + 1):
        for color in ('W', 'B'):
            ngrid = [ grid[j] if i != j else color * n for j in xrange(n) ]
            if goodGrid(ngrid):
                return 1
            
    return 2

So I still had to open the web arena, paste the code and submit... But first I wanted to test in the arena. I noticed something cool, unfortunately the screenshot wasn't saved for some reason. Each example had a checkbox next to it. Then there was a [Test All] button. So you can test multiple examples at once. Thus killing one of the main reasons most people use plugins... Unfortunately, when I tried to do this I had an error that EXACTLY ONE example must test. Apparently this is a bug.

The rest

I opened div1 500. Seemed difficult. Then I went to eat. Came back and Tried to chat in the arena, but scrolling text in the chat box turned out to be very difficult. I hope this gets fixed.

I wish the new arena didn't forcefully require a constant connection. It is very annoying to log back in when your internet has a small hiccup... Also hope that whatever makes me unable to use it in mobile gets fixed (either the browser or the arena, will research more later).

Random screenshot with some glitches:

Some of the glitches are my fault : I force a minimum font in the web browser. Some seem to have been caused by slow internet or something.

Tuesday, July 22, 2014

SRM 628: sleeping during the srm

Okay, so here is the thing. This is a 7:00 AM match. That is not usually a problem except I went to sleep at 4:00+something AM, at least I could register early to the match. So I put my alarm clock to 7:15 AM, I thought losing those 15 minutes wouldn't really change my score too much unless div1 medium is easy. I *really* woke up at 7:30. With 45 minutes the objective was to do at least div1 easy...

Div1 Easy: The one with sub-routines

Given `2 <= n <= 10^18` Find the maximum `x` such that : `(x ^ d(x) = n)` where `d(x)` is the number of divisors of `x`. If no such `x` exists, return -1.

The key is to notice that `d(x)` cannot really get too large. Any valid `d(x)` is `O(log(n))`, so we iterate through all the values of `d` such that `n` is a power of `x`. Until we reach a `d` that is very large.

For a given `d`, let's find `x : x ^ d = n` , this is the `d`-th root of `n`. We could use the pow function here, but it uses floating point numbers and you would need to be careful with precision and stuff... So we can just use a binary search for the largest `x` such that `x ^ d <= n`. Incidentally, if this `x` is 1, then we know that `d` has become too large and we can stop the search.

Once we have `d` and its valid value of `x` , all we need to do is verify that the number of divisors of `x` is equal to `d`. `d` will be at least 2, which means that `x` is `O(sqrt(n))`. We can count the divisors of `x` in `O(sqrt(x))` time. This yields an `O(n ^ (1/4))` algorithm. Nice?

The main method of my solution looks like this:

long findArgument(long n)
{
    long d = 2;
    long res = -1;
    while (true) {
        // find x : x ^ d = n
        
        long x = binarySearch(1, n+1, [&](long x) {
                return ipow(x,d) <= n;
        });
        if (ipow(x,d) == n) {
            // possible
            if (countDivisors(x) == d ) {
                res = std::max(res, x);
            }
        } else if (x == 1) {
            break;
        }
        
        d++;
    }
    
    return res;
}

Everything else are standard sub-routines ...

long binarySearch(long lo, long hi, std::function<bool(long)> P)
{
    // P(lo) true
    // P(hi) false
    while (lo + 1 < hi) {
        long ha = lo + (hi - lo) / 2;
        if (P(ha)) {
            lo = ha;
        } else {
            hi = ha;
        }
    }
    return lo;
}

const long INF = 1000000000000000002LL;
long ipow(long x, long d)
{
    // returns x ^ d if x ^ d < INF, else INF 
    long r = 1;
    while (d > 0) {
        if (d % 2 == 1) {
            if (r > INF / x) {
                r = INF;
            } else {
                r = r * x;
            }
        }
        d /= 2;
        if (x > INF/x) {
            x = INF;
        } else {
            x = x * x;
        }
    }
    return r;
}

long countDivisors(long x)
{
    long p = 1;
    long c = 0;
    while ( p <= x / p) {
        if (x % p == 0) {
            c++;
            if (x / p != p) {
                c++;
            }
        }
        p += 1;
    }
    return c;
}

Div1 medium: The one with expressions

You are given a tree of binary expressions. The expressions can either be the ADD function or the MAX function. You also have a list of (positive) values you can assign to each of the leaves in the tree. There are at most 2000 leaves. Return the maximum total value

Good luck with THAT. Solution is likely some sort of greedy but I am too asleep to do this.

.

Afterwards

During the challenge phase I read codes for div1 500. The idea appears to be that you can find out the maximum quantity of values that will be added up together in the final expression. If you have this number `x`, then you just have to pick the largest `x` integers from the values and add them together. Of course they had to pick THIS match to have an easy div1 medium...

Thursday, July 10, 2014

SRM 627: Did I really code THAT?

Is this just fantasy? I am not sure if this actually happened, backtracking with binary indexed trees? What?

Div1 250: The one with letters

Game starts with a multiset of alphabetic letters. In every step, two distinct letters are taken and removed. Eventually, the multiset will be empty or contain only repeated letters, those letters are called winning letters. Return all the letters that can possibly be winning letters.

My idea, which I am not totally sure is correct. Is to check for each letter `c`, if it is possible that it will win. Then all the moves have to have the intention that the specified letter remains at the end. This means that we should first try to clear as many letters different to `c` as possible. Eventually, a list of equal letters that are distinct to `c` will remain, and we need to get rid of them by using the `c` letters. If these repeated letters are less than `c`, then `c` can win.

The trick then is to always pick pairs of letters distinct to `c` , and pick the two letters that are currently most frequent.

class HappyLetterDiv1:
 def getHappyLetters(self, letters):
    res = ''
    # for each letter ch, can we do it?
    for ch in sorted(set(letters)):
        # get a dict of distinct letters and counts
        other = { x : letters.count(x) for x in letters if x != ch }
        while len(other) > 1:  # at least two different ones
            # sort by frequency
            least = sorted([ (other[x],x) for x in other.keys() ])
            # remove a letter of each of the two most frequent:
            a = least[-1][1]
            b = least[-2][1]
            def dec(y):
                other[y] -= 1
                if other[y] == 0:
                    other.pop(y)
            dec(a)
            dec(b)
        if (len(other) == 0) or (other[other.keys()[0]] < letters.count(ch)) :
            res += ch
    return res

Div1 500: The one with a graph and inversions

You have a graph of at most `1000` vertices. The graph is special in which it is a tree with an extra edge added onto it. (The graph is connected, undirected, lacks loops and has exactly `|V|` edges)

The value of a vertex `x` is `V[x]`.

For a simple path of vertices in the graph, we make a sequence `S` of the values of the vertices in visited order. Now count the number of "inversions" of the graph. This is the number of pairs `(i,j)` such that `(i < j)` and `(S[i] > S[j])`. Return the minimum number of inversions in a sequence of at least `K` elements.

.... So the trick is that the graph is a tree with exactly one extra edge. So there is only one simple cycle in it. So there are at most 2 simple paths connecting each pair of vertices. So there are at most 2000000 paths. 2000000 sequences. With backtracking, we can visit them all. The problem is that you need to calculate the sequence's number of inversions as you build the sequence...

Let's say you have already chosen some of the previous elements in a sequence, and want to add `V[x]` to the sequence, we can count the number of added inversions by calculating the number of elements in the sequence that are higher than `V[x]`. So we "just" need to use a data structure that adds elements, removes elements and counts the number of elements larger than a integer `x` in `O(log(N))`. And we need to update that sequence (add , then remove) as part of the backtracking. Easy... So I implemented a binary indexed tree , called it fenwik for some reason and it seems to work.

// for each pair of edges (x,y), there are at most 2 paths.
// thus there will be at most 2000000 paths.


int fenwick[2048]; //probably wrong name

void fenwick_add(int x, int a = 0, int b = 1024, int node = 0)
{
    if ( a <= x && x < b ) {
        fenwick[node]++;
        if (a < b - 1) {
            int c = (a + b) / 2;
            fenwick_add(x, a,c, node*2 + 1);
            fenwick_add(x, c,b, node*2 + 2);
        }
    }
}
void fenwick_remove(int x, int a = 0, int b = 1024, int node = 0)
{
    if ( a <= x && x < b ) {
        fenwick[node]--;
        if (a < b - 1) {
            int c = (a + b) / 2;
            fenwick_remove(x, a,c, node*2 + 1);
            fenwick_remove(x, c,b, node*2 + 2);
        }
    }
}
int fenwick_counthigher(int x, int a = 0, int b = 1024, int node = 0)
{
    if (a > x) {
        return fenwick[node];
    }
    if ( (a==x) || (b <= x) ) {
        return 0;
    }
    int c = (a + b) / 2;
    int r = fenwick_counthigher(x, a,c, node*2 + 1)
          + fenwick_counthigher(x, c,b, node*2 + 2);
    return r;
}


int getMinimumInversions(vector<int> A, vector<int> B, vector<int> V, int K)
{
    int N = V.size();
    vector<list<int>> g(N);
    for (int i = 0; i < A.size(); i++) {
        g[A[i]].push_back(B[i]);
        g[B[i]].push_back(A[i]);
    }

    const int INF = 20000000;
    int res = INF;
    memset(fenwick, 0, sizeof(fenwick));
    for (int s = 0; s < N; s++) {
        vector<bool> visited(N, false);
        int t = 0;
        int inv = 0;
        
        std::function< void(int) > backtrack = [&](int x) {
            visited[x] = true;
            t++;
            int oinv = inv;
            inv += fenwick_counthigher( V[x] );                
            fenwick_add( V[x] );
            if (t >= K) {
                res = std::min(res, inv);
            }
            for (int y : g[x]) {
                if (! visited[y] ) {
                    backtrack(y);
                }
            }
            fenwick_remove( V[x] );
            inv = oinv;
            t--;
            visited[x] = false;
        };
        backtrack(s);
    }
    
    
    
    return (res >= INF) ? -1: res;
}

Challenge phase

Got lucky, the first place in the room clearly did 500 without the data structure part (Was using a std::multiset, but multiset doesn't help in the counting higher elements part). So I just challenged with a large case.

Saturday, June 28, 2014

SRM 626: Not a math person

Another SRM.

Div1 250: The one with dice

Alice throws `a` dice with `b` faces each (Numbered 1 to `b`, all equally likely). Bob throws `c` dice with `d` faces each (same). The score is the sum of all the dice rools each player gets. Return the expected value of Alice score IN CASES SHE WINS. If it is impossible for Alice to win, return -1. All `a,b,c,d` are at least 1 and at most 50

Okay... For each die, can you calculate the probability to get `X` as the total result? Well, yes. This is an `O(a^2 b^2)` dynamic programming thing. Possibly other good methods too. The thing is that we can precalculate two lists that give us the probability Alice gets `X` and the probability Bob gets `Y`.

Then for each pair `(X,Y)` it is easy to get the probability that Alice gets X AND Bob gets Y. So consider only the cases `(X > Y)` and find the probability Alice wins. If the probability is 0, return -1.

Once again, for each pair `(X,Y)`, find its probability, if Alice wins in this case, then the probability that Alice gets that score, provided Alice wins the whole game is `P_A(X) * P_B(Y) / P_"Alice wins"`. The sum of these probabilities multiplied by `X` is the expected value.

My first instinct was to use Bayes theorem , maybe there is a way to simplify things with it?

Total complexity is `O(a^2 b^2 + c^2 d^2 + abcd)`, so around 3* 2500 * 2500 in the worst case :)


vector<double> findProbabilities(int a, int b)
{
    double dp[51][2501];
    
    memset(dp,0,sizeof(dp));
    dp[0][0] = 1.0;
    
    for (int n = 1; n <= a; n++) {
        
        for (int y = 1; y <= n*b; y++) {
            dp[n][y] = 0.0;
            for (int x = 1; x <= b && x <= y; x++) {
                dp[n][y] += dp[n-1][y - x] / b;
            }
        }
    }

    
    vector<double> res(a*b + 1, 0.0);
    for (int i = 0; i <= a*b; i++) {
        res[i] = dp[a][i];
    }
    return res;
}

double getExpectation(int a, int b, int c, int d)
{
    vector<double> Alice,Bob;
    
    Alice = findProbabilities(a,b);
    Bob   = findProbabilities(c,d);
    
    double pw = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                pw += Alice[A] * Bob[B];
            }
        }
    }
    
    if (pw <= 0.0) {
        return -1.0;
    }
    
    double ew = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                double p = Alice[A] * Bob[B]; //probability this happens
                ew += A * (p / pw);
            }
        }
    }
    return ew;
}
};

Div1 medium the one with negative edges

A directed graph, possibly with multiple edges and loops and cycles. Find the minimum cost to move from a vertex to another.... EXCEPT that you have at most `X` opportunities to make the cost of an edge negative when you use it. `X` can be is obscenely large.

Okay, no idea.

Div1 hard: The one with mirrors

There is a rectangle with mirror sides of some dimensions. Find each angle from which you can cast a ray from the bottom-left corner in such a way that the ray of light bounces exactly `b` times before touching any corner and finally reaches the top-right corner. `b` is at most 1000000000. For each of those angles, add the total distance traveled by the ray before hitting the corner. Calculate all that modulo 1000000007

I thought I had a good chance to solve this problem. Because of the old codejam problem Hall of mirrors I knew a lot of properties about these rectangles of mirror sides. For example, we could imagine that a bounce is the same as moving around a grid made of rectangles of those mirrors. I would have more luck explaining this if I had time to inkscape....

The thing is that if you imagine a grid of mirrors and make the coordinates in each of these mirror corners `(x,y)` . Then moving from the origin to this coordinate is valid if and only if: `(x,y)` are coprime and also the coordinates should share a diagonal with `(1,1 + b)`. In addition, `b` must be even.

Unfortunately, because `b` can be very large, just knowing that isn't enough. It needs some mathematical trick to avoid doing an `O(b)` loop.

The following passes examples 0 o 3.

const int MOD = 1000000007;

long gcd(long a, long b)
{
    while (b != 0) {
        tie(a,b) = make_tuple(b, a%b);
    }
    return a;
}

int findSum(int sideA, int sideB, int bounces)
{
    if (bounces % 2 == 1) {
        return 0;
    }
    long y = 1 + bounces;
    long x = 1;
    
    // find solutions to this:
    // 1 <= y - p <= y
    // gcd(y - p, 1 + p) = 1
    
    long sx = 0, sy = 0;
    while (y >= 1) {
        if (gcd(x,y) == 1) {
            sx += (x*x) % MOD;
            sy += (y*y) % MOD;
        }
        y--;
        x++;
    }
    sx = ( ((sx * sideA) % MOD) * sideA) % MOD;
    sy = ( ((sy * sideB) % MOD) * sideB) % MOD;
    return (sx + sy) % MOD;
}

I submitted it just for fun, it will time out with slightly large `b`. No one dares to challenge it so far.

Thursday, June 19, 2014

SRM 625: I semi-wrote it

Two days ago I was busy trying to come up with a way to finish SRM 624's editorial. I get how to solve div1 hard in the top level, I have no idea how to explain the initial step: Split in centers subtrees. I am fairly sure it is probably a known algorithm, can't find info. Was stuck there when I was contacted by the admins about SRM 625. They needed problems urgently and I apparently had some problems that were created long ago and could work.

I wrote all problems except div1 hard (the important one). They really weren't intended to be part of the same match. And I think that they go more towards the standard side of things. Please note that I've gone through a huge writer's block and I can't seem to come up with interesting problems anymore. But oh well. I just hope there are no big issues of the sort that cause match cancellations.

Div2 250: The one with challenge phase

Unlike the other problems, I thought of this one specifically for this match. Ever since multiple-answer problems were added to topcoder, I wanted to make something that takes advantage of it.

The problem is easy, given `y` find `(a,b,c)` such that `ab + c = y`, `(a,b,c)` must be integers between -1000 and 1000 different to 1 and 0. `y` is between 0 and 500. Otherwise any `(a,b,c)` is correct.

There is a bruteforce search here. But we can also get clever. Note that no matter which two values `(a,b)` you pick, there will always be a `c = y - ab`. So we can do `(a,b) = (1,1)` and then `c = y - 1`. This would be wrong because values cannot be 0 or 1. But we can try to pick values `(a,b)` that don't have this problem. For example `(a,b) = (2,2)`, now `c = y - 4`. This is almost fine, but `y = 4` or `y = 5` would yield a wrong value of `c`. We could handle those cases separately OR we could try something different. We would like `y - ab` to be distinct to 0 or 1. How about : `y - ab > 1` then `ab < y -1`. We can guarantee this to be true with `(a,b) = (2,-2)`. Then we have `c = y + 4`. So the result can always be `(2,-2, y + 4)`.

Div2 500: The one with sequences

This problem is very old. I think I thought of it for some TCO qualification round of a past TCO. I didn't even remember much about the problem. It was already developed but it was never used. So we mostly just enabled the problem for use.

If I remember correctly, this problem can be very tricky.

Div2 1000: Minimum Liars Again

A circle of friends. For each i, we have answers[i]:

  • If answers[i] is ?, we didn't ask friend i anything.
  • Else we asked the i-th friend if friend (i + 1) mod n is a liar. If friend i is a honest person, they would always tell the truth , else they would always lie. answers[i] is H if friend i said friend (i + 1) mod n is a honest person, else L.

Return the minimum number of liars in a consistent case. If no case can be consistent, return . 1.

This is another very old problem. It was also already developed, but for `n <= 20`. We decided `n <= 50` is better for this problem slot.

The idea is simple: If there are no ? in the input, the whole thing is cyclic. You can try making the first person a liar or not. If first person is a liar, and you know their answer, then you know if second person is a liar, then third person and so and so. Eventually, you will know if person `n - 1` is a liar or not, and this is important the conclusion we make about person 0 must be consistent with the hypothesis we made. Return the minimum number of liars in the consistent cases.

If there is at least one ? , the case is not really cyclic. You can just split the friends in groups of consecutive non-? friends. These cases are linear and you can use the same logic (try if first person is a liar or not). But this time, no consistency is needed.

Like this:

import itertools
class ConundrumReloaded:
 def minimumLiars(self, answers):
        
    def simulateLine(f, s):
        p = f
        liars = 0
        for i in range(1, len(s) + 1):
            h = (answers[i - 1] == 'H') ^ (not p)
            liars += not h
            p = h
        #print (f,s), (liars,p)
        return (liars, p)
        
    n = len(answers)
    q = -1
    for i in range(n):
        if answers[i] == '?':
            q = i
    INF = 10**9
    res = INF
    if q == -1:
        for f in (0,1):
            (liars, p) = simulateLine(f, answers)
            if p == f:
                res = min(res, liars)
    else:
        res = 0
        curr = ''
        for i in range(n + 1):
            j = (i + q) % n
            if answers[j] == '?':
                if curr != '':
                    res += min( (not f) + simulateLine(f, curr)[0] for f in (0,1) )
                    curr = ''
            else:
                curr = curr + answers[j]
        
    return -1 if res >= INF else res

Div1 250: When you are desperate combine palindromes and probabilities

Given a string. We pick one of its anagrams with uniform probability. Return the probability the picked anagram is palindrome.

This problem is also very old. I think I tried to use it in many matches, but (thankfully) we always found something more interesting and less standard. Until now.

Simply return: (Number of palindrome anagrams) / (Number of anagrams).

The number of anagrams is calculated by dividing `n!` by `r!` for each letter in word that is repeated `r` times.

So we just need to calculate the number of palindrome anagrams. First of all, if `n` is even, then all characters must appear an even number of times. If `n` is odd then only one character should appear an odd number of times. If `n` is odd, we already know what character is in the center and we can just remove it and treat it as if everything was even.

We have many letters that appear an even number of times. We know that we should put each of these letters in both sides of the palindrome. The order depends only on the first side. So actually, we should count the number of ways to order the `n/2` letters that remain after dividing each letter quantity by 2. This is the same as dividing `(n/2)!` by `(r/2)!` for each letter that is repeated `r` times.

As I type this, it appears that many people including div1 hard's writer have issues with precision in their solutions.

Div1 500: The one with blocks

The original version of this problem was meant for the last TCO rounds, I think last year. But was too complicated. In the original version , there were multiple goals and exactly two start cells. I don't even remember how was it solved.

In the current version, there are multiple start cells but only one goal. The game is symmetric so we can just begin at the goal and if it reaches a start cell, there is a solution if you begin at that start cell.

So we should just find the minimum number of holes to cut the path between the goal cell and any start cell.

Now analyze the allowed moves, you will see that the set of reachable cells is like this:

*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*

Where SZ are horizontal 2x1 pairs of cells that can be touched by a 2x1 face, the * are the cells that can be touched by 1x1 faces (including the goal) and AV the vertical 2x1 pairs of cells.

We can just ignore any start cell that is not *, no matter what we do, we can't reach them.

So now we just have a minimum cut problem. The trick is to take the 2x1 pairs of cells as edges then the cost to remove an edge equals the number of normal cells in the edge. You can also remove * cells, this means using an intermediary edge connecting entry and exit point for each * cell. Okay. The minimum cut requires you to get clever and implementation will be a headache but the concept is easy!

Thursday, June 12, 2014

*Taunts you*

In today's SRM 624 there was a problem that asked you to return something modulo 1000000009. Usually the modulo is 1000000007 or 1000000009 , but people have the impression it is 1000000007 most of the time.

Usually topcoder example tests in this kind of problem include a test case that makes use of the modulo. This is intentional so that people can catch a mistake in writing this constant. Unfortunately, today's problem didn't have such test case.

I didn't even notice. If you read my recap of SRM 624 you will see I even mistakenly used 1000000007 when typing the explanation. And yet I passed , well I have a secret weapon (not so secret).

This is my Greed c++ template:

#include <bits/stdc++.h>
// ${Contest.Name} - ${ClassName} (${Problem.Score} points) by vexorian
using namespace std;

struct ${ClassName}
{
${<if Problem.Description.Modulo}
    const int MOD = ${Problem.Description.Modulo};

${<end}
    ${Method.ReturnType} ${Method.Name}(${Method.Params})
    {
        return ${if Method.ReturnType.Array}{}${else}${Method.ReturnType;zeroval}${end};
    }
};

${CutBegin}
${<TestCode}
${CutEnd}

Take a look to the if Problem.Description.Modulo part. Yes, this part of my template actually generates the MOD constant for me :). One day I spent an afternoon or so making the Greed topcoder plugin able to parse the modulo constant and make it accessible to templates. This is why :).

SRM 624: Slooooooooow

I just did the most amazing c++ code, got to share.

Div1 250: The one with buildings

You are given a list of integers (building heights). The maximum number of integers possible is 4000 and the maximum integer possible is 4000. For each `0 <= i <= n`, find the minimum number of steps so that the final sequence has `i` equal integers. Every step is to increment a single integer by 1 (building a new floor in a building).

I had some difficulties getting to the right approach. At first trying to do things like sweeps and dynamic programmings :/ . Turns out the trick is to solve: What is the minimum cost to have `y` buildings of height `x` ? For each `x`, calculate those costs for all `y`, updating the values in a total cost function that considers all possible heights.

The trick is that there are only `O(N)` relevant values of `x` (those equal to the heights), and that it is easy to find, for each `y` the minimum cost to have `y` buildings of height `x`. Just pick the `y` largest buildings of height greater than or equal to `x` and increment them. This gives us an `O(N^3)` algorithm, however, for each `y`, you can reuse the result of `y-1` (you know the cost of fixing `y-1` buildings, just add a single building and we know the result for `y`.

This is an `O(N^2)` algorithm:

int minimum(vector<int> heights)
{
    int n = heights.size();
    // sort non-increasingly by height:
    sort(heights.begin(), heights.end());
    const int INF = 4000 * 4000;
    // cost[i] will have the cost to have i equal buildings
    vector<int> cost(n + 1, INF);
    cost[1] = 0;
    // for each height:
    for (int i = 0; i < n; i++) {
        int c = 0;
        // for each number of buildings i - j + 1:
        for (int j = i - 1; j >= 0; j--) {
            // the minimum cost to have i - j + 1 buildings of this height 
            c += heights[i] - heights[j];
            // remember the best
            cost[i - j + 1] = std::min( cost[i - j + 1], c);
        }
    }
    
    
    // the problem actually asks for the XOR of all values, but I suspect
    // this is because of limitations in the return value size.
    int res = 0;
    for (int i = 2; i <= n; i++) {
        res ^= cost[i];
    }
     
    
    return res;
}

Div1 450: The one with implementation

Classical problem. You are given a graph (no loops, bidirectional, connected, weighted, some weights are zero). Return the total number of shortest paths between vertices `0` and `N-1` modulo 1000000007. Note that zero-weight edges might make the number of shortest paths infinite, return -1 in that case. At most 2000 vertices and at most 2000 edges.

Okay, the complicated path is when paths are infinite. This can only happen if there is at least one shortest path that visits an edge with weight 0. So we need a way to know, for each edge, if it can belong to a shortest path.

The way to check if an edge `(u,v)` can belong to an optimal path requires us to know: The weight of the edge, the minimum distance `D` between 0 and `N-1`, the minimum distance `d` between 0 and `u` and the minimum distance `r` between `N-1` and `v`. If `d + w + r = D`, then the edge can be used in a shortest path. If any 0-weight edge is one of those, return -1.

The rest is to count the number of paths. How about `f(x)` which returns the number of shortest paths towards `N-1` starting with `x`. `f(N-1) = 1`, there is only one shortest path there. For the other vertices `x`, you find every `y` such that `(x,y)` can belong to a shortest path. Then `f(x) = sum(f(y))` , easy, right? . Just evaluate the vertices in non-increasing order of distance to `N-1`, since none of the edges used will have weight 0, we can do this sorting.

I broke a record with this problem. I used 4 lambdas in total. And I even nested lambdas this time, which is so unusual in c++ :).

const int MOD = 1000000009;



int count(int N, vector<int> A, vector<int> B, vector<int> C)
{
    // make the graph easier to use:
    vector<vector<int>> g(N);
    vector<vector<int>> w(N);
    vector<int> degree(N);
    for (int i = 0; i < A.size(); i++) {
        degree[--A[i]]++;
        degree[--B[i]]++;
    }
    for (int i = 0; i < N; i++) {
        g[i].resize(degree[i]);
        w[i].resize(degree[i]);
        degree[i] = 0;
    }
    auto addEdge = [&](int u, int v, int weight) {
        g[u][degree[u]] = v;
        w[u][degree[u]++] = weight;
    };
    for (int i = 0; i < A.size(); i++) {
        addEdge( A[i], B[i], C[i] );
        addEdge( B[i], A[i], C[i] );
    }
    
    //Dijkstra time!
    auto dijkstra = [&](int s, vector<int> &dist) {
        set<pair<int,int>> Q;
        const int INF = 2000000000; 
        dist.resize(N, INF);
        for (int i = 0; i < N; i++) {
            Q.insert( {INF, i} );
        }
        auto push = [&](int x, int d) {
            if (d < dist[x]) {
                Q.erase(Q.find( {dist[x],x}));
                dist[x] = d;
                Q.insert( {dist[x], x} );
            }
        };
        push(s, 0);
        while (! Q.empty()) {
            auto it = Q.begin();
            int x = it->second;
            Q.erase(it);
            
            for (int i = 0; i < degree[x]; i++) {
                int v = g[x][i];
                int we = w[x][i];
                push(v, dist[x] + we);
            }
        }
    };
    vector<int> dist, rdist;
    dijkstra(0, dist);
    dijkstra(N-1, rdist);
    
   
    // if any of the shortest paths visits an edge of weight 0, the result
    // is -1. It is possible to have 0s in the input and result diff to -1
    for (int i = 0; i < A.size(); i++) {
        if (C[i] == 0) {
            int u = A[i], v = B[i];
            for (int r = 0; r < 2; r++) {
                if (dist[u] + rdist[v] == dist[N-1]) {
                    //critical
                    return -1;
                }
                swap(u,v);
            }
        }
    }
    
    // And now the part in which we actually count the roads.
    vector<long> dp(N);
    vector<int> byRDist(N);
    for (int i = 0; i < N; i++) {
        byRDist[i] = i;
    }
    sort(byRDist.begin(), byRDist.end(), [&](int a, int b) {
        return rdist[a] < rdist[b];
    });
    for (int u: byRDist) {
        dp[u] = 0;
        if (u == N-1) {
            dp[u] = 1;
        }
        for (int i = 0; i < degree[u]; i++) {
            int v = g[u][i];
            int we = w[u][i];
            if ( (rdist[v] < rdist[u]) && (dist[u] + rdist[v] + we == dist[N-1]) ) {
                dp[u] += dp[v];
            }
        }
        dp[u] %= MOD;
    }
    return dp[0];
}

The rest

I was very slow, everybody else has a much better score. I hope many of them fail. I am confident in the correctness of my solutions and I don't suspect of any likely implementation bug. Typing this as the challenge phase begins.

During challenge phase: 8 minutes of challenge phase, and I doubt I'll score a successful challenge, there are some tough challengers in my room and they are very quick to find errors. There seem to be many ways to get 450 wrong.

Wednesday, June 04, 2014

SRM 623: Saved by the challenge phase

Div1 300: The evil one

This problem was evil. Let me explain, you have a board of maximum size 20x20, some cells contain apples, some pears, some are empty. A valid move is to move a fruit to an empty cell. You can do at most `K` moves. The objective is to maximize the size of a rectangle containing only apples. What is the maximum?

Say you already picked an arbitrary rectangle, there are `O(n^4)` of those. There are two things that are important about this rectangle: `a` : Number of apples in side, `p` Number of pears inside, `e` number of empty cells inside. You can also know the number of apples/pears/empty cells outside the square. Finding this values is a complicated implementation part. Also, I think too slow for python, so I tried to do something in c++, using lambdas of course. It took eons. But when I finally did it , I had a new problem...

Knowing that information about the rectangle we need to find the minimum number of moves needed to have only apples in the rectangle.

At first I was trying something very wrong: Move each pear to a (different) empty cell outside, then move all the apples inside. This took me ages to debug, and when I finally passed system tests with it, I was about to click submit when I started to wonder "Is this really it?" , and I noticed that noooo, the minimum number of moves is much trickier to find. Hence the examples were weak.

So what is the real optimal strategy? Well, the first issue is that we only need one empty cell in the board. We can keep reusing it to do all the moves. Then we have this: If there are any empty cells in the square, it is easy to see that it is optimal to just move any outside apples and cover them. The remaining case is when there are some pears inside the rectangle, at least one empty cell outside and enough apples outside. A single pear is treated like this: Move pear to empty cell, move apple to the position the pear was in, this leaves an empty cell in the apple's position, and we can repeat.

So in fact, the minimum number of moves is equal to:

  • 0 if all the cells already contain apples.
  • # of empty cells inside the rectangle, if the rectangle contains no pears.
  • # of empty cells + 2 * number of pears inside the rectangle, if and only if there is at least one empty cell in the board
int getBoard(vector<string> board, int K)
{
    // stuff to count the number of apples / pears / empty inside a rectangle:
    int W = board.size(), H = board[0].size(); 
    map<char, vector<vector<int>>> total;
    string chars = ".AP";
    for ( char ch: chars ) {
        vector<vector<int>> t(W+1, vector<int>(H+1,0));
        for (int i = 1; i <= W; i++) {
            for (int j = 1; j <= H; j++) {
                int x = ( board[i-1][j-1] == ch );
                t[i][j] = x + t[i-1][j] + t[i][j-1] - t[i-1][j-1];
            }
        }
        total[ch] = t;
    }
    auto rectangle = [&](char ch, int x0, int x1, int y0, int y1) {
        auto &t = total[ch]; 
        return t[x1][y1] - t[x0][y1] - t[x1][y0] + t[x0][y0];
    };
    
    int A = rectangle('A', 0, W, 0, H);
    int E = rectangle('.', 0, W, 0, H);
    
    int res = 0;
    for (int x0 = 0; x0 < W; x0++) {
        for (int y0 = 0; y0 < H; y0++) {
            for (int x1 = x0 + 1; x1 <= W; x1++) {
                for (int y1 = y0 + 1; y1 <= H; y1++) {
                    // the actual logic:
                    // inside:
                    int e = rectangle('.',x0,x1,y0,y1);
                    int a = rectangle('A',x0,x1,y0,y1);
                    int p = rectangle('P',x0,x1,y0,y1);
                    
                    // out side:
                    int na = A - a;
                    int ne = E - e;
                    
                    if (e <= na) {
                        int x = e;
                        int mov = x;
                        na -= x;
                        a += x;
                        e = 0;
                        ne += x;
                        if (mov <= K) {
                            if (p == 0) {
                                res = std::max(res, a);
                            } else if ( (p <= na) && (ne >= 1) && (mov + 2*p <= K) ) {
                                res = std::max(res, a + p);
                            }
                        }
                    }
                }
            }
        }
    }
    return res;
}

Div1 450: The difficult one

All I can think of is that we can imagine a different problem, the points stay static and the player moves either in a vertical line upwards (increasing `y` coordinate) or in a diagonal (incrementing `y` and incrementing or decrementing `x`).

Challenge phase

I took too long to solve div1 300, but I did learn that the example cases were weak. So I thought of a challenge case to make my initial solution fail (using only one empty square). Luckily my room wasn't very aggressive and I could find 2 solutions with that wrong idea and earned 100 points.

Wednesday, May 28, 2014

SRM 622: Anticlimatic

It was a long day, trying to finish the very late SRM 621 editorial. I eventually improvised something. I tried to learn as much as possible about suffix automata until I can at least explain the solution with some more detail than the sketches.

So then the match started, I wasn't too thrilled because it is a 9:00 PM match and they tend to be... boring and have too few coders. But oh well...

Div1 250: The one with roads

You are given a weighted complete directed graph (A map of roads connecting each pair of cities) and a integer `T`. On a special holiday, for each ordered pair of distinct cities `(i,j)` a bus must travel from city `i` to city `j`, the path used by the bus must be the shortest possible. If there are multiple shortest pats, the bus driver may pick any of them - We don't know which. A road between two cities is unsafe if it is possible that more than `T` buses will go through it. Find the total sum of lengths of unsafe roads.

The thing here is how to know if a road might be used by a bus. There are `O(n^2)` buses and each might use each road. For pair of cities `(a,b)` , then road `(x -> y)` is used if and only if: mindist[a][x] + road[x][y] + dist[y][b] = dist[a][b]. Where mindist[p][q] is the minimum distance between cities p and q and road[p][q] is the length of the direct road connecting them. We can find mindist[p][q] in `O(n^3)` with Floyd-Warshall, sure, we could do something even faster, but it is not needed because the logic (picking `a,b,x,y` is `O(n^4)`.

Code some cute python code, submit:

class BuildingRoutes:
 def build(self, dist, T):
    n = len(dist)
    allN = range(n)
    # turn string list dist into a matrix:
    dist = [ [ord(x)-ord('0') for x in row] for row in dist ]
    # number of times a road might be used
    roadcount = [ [0] * n for i in allN ]
    INF = 10**9
    # Floyd-Warshall
    mindist = [ [dist[i][j] for j in allN] for i in allN ]
    for k in xrange(n):
        for i in xrange(n):
            for j in xrange(n):
                mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j])
    # Find potential uses:
    for a in allN:
        for b in allN:
            if a != b:
                for x in allN:
                    for y in allN:
                        if x != y:
                            if mindist[a][x] + dist[x][y] + mindist[y][b] == mindist[a][b]:
                                roadcount[x][y] += 1
    res = 0
    for i in allN:
        for j in allN:
            if roadcount[i][j] >= T:
                res += dist[i][j]
    return res 

Unfortunately, I forgot to test the largest case before submitting the solution. I test the largest case and it turns out it times out. Very disappointing that it times out in python. And it is only by a few seconds. I had to remake the code in c++, lost plenty of points because of spending double time and recoding. I should know better, need to keep in mind that `50^4` is too bad for python.

int build(vector<string> dist, int T)
{
    // O(n^4) too slow in python, had to migrate code to c++ . arg
    int n = dist.size();
    vector<vector<int>> roadcount(n, vector<int>(n,0));
    const int INF = 1000000000;
    vector<vector<int>> mindist(n, vector<int>(n,INF));
    // Floyd-warshall
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            mindist[i][j] = dist[i][j] - '0';
        }
    }
    for (int k = 0; k < n; k++) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j]);
            }
        }
    }
    // for each pair of cities (a,b), find if a road(x,y) could be used:
    for (int a = 0; a < n; a++) {
        for (int b = 0; b < n; b++) {
             for (int x = 0; x < n; x++) {
                 for (int y = 0; y < n; y++) {
                    if ( (x != y) && (a != b) ) {
                        if (mindist[a][x] + (dist[x][y]-'0') + mindist[y][b] == mindist[a][b]) {
                            roadcount[x][y]++;
                        }
                    }
                 }
             }
        }
    }
    int res = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (roadcount[i][j] >= T) {
                res += dist[i][j] - '0';
            }
        }
    }
    return res;
}

Div1 500: The one with tree dp (Tree dp again...

You are given a weighted tree. We want to divide the tree into a minimum number `K` of connected subtrees such that their diameter does not exceed `K`. (Diameter is the maximum distance between two vertices in a subtree).

Okay, I didn't have time to solve . I got distracted and was already quite late because of 250. But here is what seems like a solution:

Imagine the root. Either THE root, or a node we are treating as root because we cut the edge above it. Then we have to decide if the children of the root will be used in the subtree or not.

In general, it is best to use a child instead of not using it. Because if we do not use it, we forcefully are creating a new tree, but if we do not, we might be able to avoid it. The issue is how to balance out the lengths of the subtrees that we will keep conencted to the root.

This is a neat trick: Let `M` be the maximum distance between the root and the nodes in one of the arms of the subtree. Then the maximum distance between the root and the nodes in the other arms cannot be greater than `K - M`. We can pick one of the children, decide it will be the one with distance `M`, and then the remaining children must have arms of distance of at most `K - M`. If the edge connecting the root to a child is larger than `K - M`, we cannot use this child and must create a new tree.

We need to solve a subproblem: Do the decision for a subtree rooted at a node `X`, such that the distance between `X` and any grand children cannot exceed `M` and the diameter cannot exceed `K`. This is solved in a similar way, picking a children of `X` to have the maximum distance.

All in all, we have to solve the problem for `f(X, M)` , `M` will not exceed 500, (because `K` is at most 500), so we will be fine.

Challenge phase / etc

Challenge phase already started, I will go check what hapens. Scheduling this post so that it is posted when the challenge phase ends.

Saturday, May 10, 2014

SRM 620, python fails again

This was the last of the 10 SRMs I decided to use python exclusively on. Also definitely the one in which the poor topcoder python support did the most harm.

Div1 250, the one with gcd

Game: You start with a pair of positive integer `(x,y)`, in each move you can move to `(x,x+y)` or `(x+y,y)`. You are given two paris `(a,b)` and `(c,d)` , where each integer is up to 1000000. You want to know if there is a pair `(x,y)` such that both `(a,b)` and `(c,d)` are reachable using it as a starting position. Return the largest possible sum `x+y` in that case else -1

The idea is to go backwards. From `(a,b)`, there are two candidate backward moves: `(a-b,b)` and `(a,b-a)`, however, since the integers must be positive, only one of them is valid: If `(a>b)`, then pick `(a-b,b)` , if `(a<b)` pick `(a,b-a)` , if they are equal, this is the time to stop, because else one will go negative.

So it is easy to make the two sets of pairs `(x,y)` that can reach both `(a,b)` and `(c,d)`, we could find if the two sets have an intersection, if they do, return the maximum sum. Easy:

Notice that when one of `a` or `b` is 1, and the other is 1000000, there are 1000000 elements in one list. So there can be way tons of elements and looking for the intersection is hard unless we go clever and use sets that work in logarithmic time to find if they contain stuff or not. (Or maybe a hash table, but hash tables really blow) In c++, it means std::set or std::map, in python:

class PairGame:
 def maxSum(self, a, b, c, d):
    # find all the pairs that can be reached going backwards from (a,b):
    def backwards(a,b):
        while a > 0 and b > 0:
            yield (a,b)
            if a >= b:
                a -= b
            else:
                b -= a
    # find the intersection between the two sets:
    inter = set( backwards(c,d) ) & set( backwards(a,b) )
    return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)

I submitted it and scored 225~ points. I was so proud of this code. I knew that people coding in c++ or one of the joke languages would be making totally awful things, like repeating the code used to go backwards from `(a,b)`, copy and pasting it to do it with `(c,d)` but doing the slightly different search.

Unfortunately, it times out with `(a,b) = (1000000,1)` and `(c,d) = (1000000,1)`, or similar. I could notice that it can barely run around 1 second when none of the integers is 2. So the only way to fix this is to take care of the cases when any of them is 1 individually (without making huge loops). This complicates the code a ton, and I needed tons of resubmits before getting the following code which might still be wrong:

class PairGame:
 def maxSum(self, a, b, c, d):
    # this wouldn't be necessary if python wasn't too slow for O(n) when n = 1000000
    if a == 1:
        # The path to (a,b) is (1,1), (1,2), ... (1,x), ... (1,b)
        #can we reach (1,x) going backwards from (c,d) ?
        #if so, what's maximum x that we can reach?
        while (c > 1) and (d > 0):
            if d > c:
                d = d % c
            else:
                c = c % d
                if c == 0 and d == 1:
                    c = 1
        if c == 1:
            return min(b,d) + 1
        else:
            return -1
    if b == 1:
        return self.maxSum(b,a,d,c)
    if c == 1 or d == 1:
        return self.maxSum(c,d,a,b)
        
    # Let's pretend the followign is the only real part of the code, okay?
    def backwards(a,b):
        while a > 0 and b > 0:
            yield (a,b)
            if a >= b:
                a -= b
            else:
                b -= a
    inter = set( backwards(c,d) ) & set( backwards(a,b) )
    return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)

(In fact, at first it seemed like the main problem was memory, but even after fixing the memory, it is still too slow

All those using c++ or Java or perhaps even C# wouldn't have this issue. It caused me to lose plenty of points in resubmits and I might fail still. The problem is much harder when `n = 1000000` is too slow. And the worst thing is, that difficulty wouldn't have changed much for c++/Java if the constraint was 500000, but it would have saved python. Well, too bad.

Div1 500

I had little time. All I can say is that the problem is the same as if you just picked a sequence of skills and sorted by this sequence of skills lexicographically. The rest is fuzzy

Monday, May 05, 2014

SRM 619: Nesting min-cost matching inside dynamic programming

Div1 250: The one with stacks

You start with up to 50 piles of stones. Two players play a game alternating turns. In a given turn, a player must pick three piles: `(a,b,c)` such that pile `a` has at least 2 stones. Pile `a` is split into two non-empty piles and their amounts are added to pile `b` and pile `c`. If a player cannot find any valid turn, they lose. Find if the first player is guaranteed to win or to lose assuming the two players play optimally.

During the match, I did a dynamic programming based on the fact that the game state can be described by two variables `(p,q)` where `p` is the number of piles with more than 1 stone and `q` the number of piles with 1 stone. This leads to an `O(n^2)` dp.

However, as it turns out, the problem can be solved in `O(1)` time. Note that every valid step will always reduce the number of piles by 1. It turns out that all cases with an even number of piles are losing states, so if the number of piles is odd AND there is at least one valid move, then the state is winning.

Need to prove that all even states are losing ones. This is obviously true for `n = 2`. So for `n >= 4`, we figure that the only losing states with `n-1` are those that consist entirely of 1s. However, it is impossible to reach such a state using any valid number of moves, because after any valid move, at least two of the piles will be greater than 1.

Div1 600: ouch

This problem's statement is very long to copy and to re-write. So I will skip to the solution.

Imagine you wanted to calculate `f(x,i)`, the minimum cost to assign skills to the sub-tree rooted at `x` if you already know that one of the skills chosen for `x` is skill `i`.

We can easily find `f(y, j)` for any child of `x` and skill `j`. Out of all the skills assigned to `x` and its children, it should be possible to make a permutation of skills. Let's focus on this permutation. There are two possibilities.

  • Use `i` in the permutation, meaning that all children must use skills different to `i`.
  • Use a different skill in the permutation for `x`. All children and `x` need to pick a unique skill.

Using the values of `f(y,j)` and some ad hoc stuff, you can easily build a table of costs for each case, the cost of assigning each of the employees to each skill in the permutation. Then you have a minimum-cost bipartite matching. Too difficult to implement in the little time I had after figuring this out.

Thursday, April 24, 2014

SRM 618: When python failed

This is the eighth of ten matches in which I promised to use python exclusively. This is also the first match in which I feel like I can claim I lost plenty of points because of this restriction.

Div1 250: The one with heteronormativity

So I am supposed to be solving this problem, and all I can think of after reading for the first time is that the problem's definition of family allows incest but is all about not allowing same-gender pairings :/ okay. ..

So a "family" is defined as a graph, in which each edge either has zero incident edges or exactly two incident edges. The two incident edges MUST be from two nodes with different genders. Also the graph must be topsorted. Given a graph of up of 100 nodes return if it could be a family or not. Basically, you have to assign genders to each node. It turns out that the input format guarantees that the graph is topsorted and that each node has 0 or 2 incident nodes. So all that we need to do is ask if we can assign the genders correctly.

If two nodes appear as the parents of a node, then their genders must be different. If we imagine a different graph, in which each of these pairs of nodes that must have a different gender are connected by a single edge. Then the graph must be bipartite. If the graph is bipartite, then the answer is that Yep, the graph could be a "family". Else not. Checking if a graph is bipartite is an easy application of the Depth-first search.

class Family:
 def isFamily(self, parent1, parent2):
    n = max(max(parent1),max(parent2)) + 1
    # Make a set of edges in the hopefully bipartite graph, they are undirected 
    E = set(zip(parent1, parent2)) | set( zip(parent2, parent1) )
    E.remove( (-1,-1) )
    
    # is the graph bipartite?
    color = [-1] * n
    result = [ "Possible" ]
    def dfs(x, c):
        if color[x] == -1:
            color[x] = c
            for (u,v) in E:
                if u == x:
                    dfs(v, 0 if c == 1 else 1)
        elif color[x] != c:
            result[0]  = "Impossible"
        
    for i in range(n):
        if color[i] == -1:
            dfs(i, 0)
        
    return result[0]

I took a bit of time mostly because of the above-mentioned distraction :P, but python helped me keep the code clean. I think it should pass unless 100x100 is suddenly too slow (I doubt it).

Div1 500: The one with strings.

You should count the number of maximal-length strings that use an alphabet of `n` letters and follow these properties:

  • No consecutive equal letters.
  • For each pair of (possibly equal) characters X and Y, string "XYXY" is NOT a subsequence of the string.

The first issue is how to know what is the maximum length possible. After plenty of analysis, I found that the maximum length possible is 2n-1. However, there are two methods to achieve it:

  • One way, is to surround a string of 2(n-1)-1 length using (n-1) alphabet characters, using two instances of one character different to those (n-1) ones. For example: A????????????A, where ????? contains no A.
  • The other way has 3 instances of the extreme character. A??????A!!!!!!!A, but also ?????? and !!!!!! cannot share characters.

So the rest is to count the number of valid patterns. It is not necessary to worry about which character goes to which position of the pattern, because we can just multiply the result by `n!`.

The rest is an `O(n^2)` dynamic programming, unfortunately, `n` is up to 5000 (I wish it was 1000), so no matter what I do, the thing will take at least 6 seconds. That's too much.

More info about the recurrence in the comments of the code:

MOD = 1000000007

def modFactorial(x):
    r = 1
    for i in xrange(1, x + 1):
        r = (r * i) % MOD
    return r
        

class LongWordsDiv1:
 def count(self, n):
    # The maximum number of letters is 2*n-1, although there are multiple ways 
    # to reach that maximum
    # Example, for n = 5
    # 012343210
    # 010234320
    # 010232420
    # 012103430
    # 012321040
    # 012131040
    # there should be 9, but can't think of the other 3
    
    dp = [0] * 5001
    dp[0] = 1
    dp[1] = 1
    for i in xrange(2,n+1):
        #1) 2n - 1 = 2 + (2(n-1) - 1)
        res = dp[i-1]
        #2) find a,b: 2n - 1 = 3 + 2a + 2b - 2
        #                 b  = (2n - 1 - 3 - 2a + 2) / 2
        #                 b  = (2n - 2a - 2) / 2
        #                 b  = n - a - 1
        a = 1
        while a < i:
            b = i - a - 1
            if b > 0:
                res += (dp[b] * dp[a]) % MOD
            else:
                break
            a += 1
        dp[i] = res

    return ( dp[n] * modFactorial(n) ) % MOD

It is possible I missed something in this analysis, though.

Monday, April 21, 2014

Topcoder SRM 617 recap: regular

This SRM took me by surprise, I think the programming contest calendar I use wasn't updated correctly or I didn't read it correctly the first time I read it, so I thought SRM 617 was another date. On Sunday, I got the Topcoder email and surprise!. It's a good thing that we no longer have that bug in which invitation emails don't get sent...

This is the 7-th of 10 SRMs in which I am using python exclusively. So far I don't think it has caused any negative impact to my rating. I think that unless you are in such a high rating position that you really need frequent div1 1000s, python's drawbacks are not a big deal and the pros tend to compensate for the cons :/

Div1 250: The one with cutting cake

You have the nonsensical notion of a one-dimensional cake of length `n`. Your friends are going to come visit, all you know is that the number of friends will be a proper divisor of `n`, but you are not sure which one. You want to make cuts to the cake in such a way that you can distribute the contiguous pieces evenly to your friends, no matter what the number of friends is. The pieces a friend receives may be of different length and the number of pieces each friend gets is not necessarily always the same, the only condition is that they are contiguous after cutting the cake and that the total length of cake each friend receives is equal. Return the minimum number of cuts needed so that this always works, regardless of the number of friends.

Well, the key is that the amount of cake should be the same for each friend, so for each `y` that is a proper divisor of `n` and `x = n/y`, there MUST exist cuts at `x,2x, 3x, ... `, The problem asks for the minimum, so we shouldn't do any more cuts than those `x,2x, 3x, ... `, for each `x`. The trick is that some values `kx_0` and `rx_1` may be the same for different divisors `y_0, y_1` . So we should just count all the numbers that are equal to some `k * (n//y)` where `k` is a integer, `y` is a divisor of `n` and `k * (n//y) < n`.

So simple simulation will be fine. During the match, I had the feeling that with `n <= 100000` and the 2 seconds limit, this simulation would be a bit slow for python. I did things like getting the divisiors in `O(sqrt(n))` time. And even after that, I made sure to test the worst input. The worst input is the number `n <= 100000` that has the most divisors. This is equal to the largest possible highly composite number or 83160 , in short. It turns out it wasn't necessary, also, with some knowledge of iterators and sets, the code can be much simpler than the code I submitted during the match:

class MyLongCake:
 def cut(self, n):
    cuts = set()
    for d in properDivisors(n):              # for each proper divisor d:
        cuts |= set( xrange(n/d, n, n/d) )   # add n/d, 2n/d, 3n/d, etc to set
    return len(cuts) + 1                     # return number of cuts + 1

def properDivisors(n):
    return (i for i in xrange(1, n) if n % i == 0)

During the challenge phase, you'd figure from reading other people's codes that the integers that get cut are exactly those who are not coprime with `n`. So result is actually `n - phi(n)`

Div1 800: strings

I opened this problem accidentally when I intended to open div1 500 :/ . It seemed like I had no idea how to solve it. Also, when a problem score is much smaller than usual, it is possibly much harder than usual (Sorry admins, everyone knows you are backward when choosing problem scores). so I skipped to div1 500, a bit too late, perhaps.

Div1 500: The one with pie and dolphins

You have 50 programmers, for each `i < n`, where `n <= 1000`, you should choose between:

  • Giving a dolphin to programmer `C_0[i]` and a pie to `C_1[i]`.
  • Or
  • Giving a dolphin to `C_1[i]` and a pie to `C_0[i]`.

After you give all these pies and dolphins, each programmer will calculate the absolute value of the difference between the number of pie and dolphin they got. You want to minimize the total sum of all these absolute values.

ooh boy. So I stared at the computer screen for more than half an hour. Everything I thought was non-sense for a good while. Eventually, I decided that the problem was most likely greedy. So the new issue was how to think of a greedy solution. If you process the decisions in the order of i, it is difficult to predict what will happen later and how it should affect your decision. But there is no need to do that. You can really do a decision whatever time you want. A decision will change the result in -2,-1,0,1 or 2, points. What we can try, is repeat this 1000 times: Pick the decision (of all not picked yet) that will change the score for the best. This is easily implemented through just simulation. Complexity is `O(T^2)` where `T` is the number of choices.

I was typing this, and had some bugs. It took longer than usual to debug because it is not easy to know if an answer is correct or not. (This was the first Topcoder problem ever that allowed multiple correct answers to the same test case). Eventually, I found out that in two lines I had i where I meant j. Too bad, coding phase already ended. I tried the solution in the practice room after fixing it , and it passes although the execution time gets awfully close to 2 seconds .

class PieOrDolphin:
 def Distribute(self, choice1, choice2):
    choices = zip(choice1, choice2)
    n = 50  # values of choice1, choice2 go from 0 to 49, we can assume 50 people
    pie = [0] * n
    dolphin = [0] * n
    res = [ 3 ] * len(choices)
    for i in range(0, len(choices)):
        best = (3, (0,0), -1, 3)
        for (j, (a,b) ) in enumerate(choices):
            if res[j] == 3:
                # try (a,b) or (b,a) for choice:
                for (x,y,z) in ( (a,b,1), (b,a,2) ):
                    # calculate how the score would improve
                    p = abs(pie[x] - dolphin[x] + 1) -  abs(pie[x] - dolphin[x])
                    q = abs(pie[y] - dolphin[y] - 1) -  abs(pie[y] - dolphin[y])
                    best = min(best, (p+q, (x,y), j, z) )

        (x , (a,b), bi, t) = best
      # print (a,b),' would improve by ', x, ' index ',bi
        res[bi] = t
        pie[a] += 1
        dolphin[b] += 1
    return tuple(res)

So why is this correct? Well, no idea, but I thought it should be correct because , in theore, each decision is used in the best way ever. And each decision should reduce the effectiveness of at most one decision, in theory. I have nothing formal to prove.

I know that the real answer involves turning the thing into a graph and doing some proofs to show that it is easy. I think that this graph solution eventually reduces to this greedy. But no idea yet.

So?

I think it was a fine problem set, if a bit unusual (odd problem scores, flexible answers in div1 500).

Monday, April 14, 2014

SRM 616 hard problem: ThreeLLogo

This editorial is http://apps.topcoder.com/wiki/display/tc/SRM+616. Since Topcoder insist on restricting the wiki for non-registered users, I will post the explanation here.

Link to problem statement

The first part of the explanation explains what the problem asks us to count. Just notice that the constraints say there will be at most 30 columns and at most 30 rows:


A change in format

This explanation will interpret the question slightly different from the statement. The purpose is that I find it easier to make explanation graphics under the following interpretation: We have a grid, some cells (white) which allow us to put shapes on them and some cells (black) that don't the "L" shapes occupy multiple cells, at least three.

Fill the grid

There are different approaches to this problem. The one intended by the problem setter is to see it as a dynamic programming problem. Instead of focusing on there being a need for just 3 L shapes and trying ad hoc counting, perhaps it is more straightforward to see this problem as a grid/bitmask dynamic programming one. Much like FourBlocks or FloorBoards

The idea is to fill the grid in row-major order. In each cell, we can leave it empty, or choose to start a L shape or perhaps continue an already started L shape. So given a row `x` and a column `y`, assume that all the previous cells (in row-major order) have already been filled.

In this example, the current cell we are deciding is green and the remaining undecided cells are red. Because there is an unfinished vertical arm of an L shape above this cell, we have two options:

  • Use cell `(x,y)` to continue the vertical arm of the L shape. We move to the next cell:

  • Use cell `(x,y)` to finish it. Then we also move to the next cell:

Assume we did the latter, now we need to decide what to do with the new cell `(x,y)`, this time the cell has a horizontally-adjacent L shape which must be finished. We have two options:

  • Continue this horizontal shape, we will need to add more of this shape in later steps:




    If we make this decision, the next step will have a blocked cell as `(x,y)`, we cannot put any shape on this cell, but due to the decision we are forced to place something, so the number of ways to fill the remaining cells correctly is 0.

  • Finish this horizontal shape:



    Let's pick this decision

After doing this, the new `(x,y)` position is a blocked cell, our only choice is to do nothing and leave this cell empty. Move to process the next cell, you will find a situation similar to the first example, we can choose to continue the above L shape or finish it, this time we'll pick to continue it.

This is another interesting situation, `(x,y)` is an empty cell, there is no vertical or horizontal L-shape that we must continue. We can leave this cell empty:

Or, because there are currently only two L shapes, we can start a whole new one:

Dynamic programming

Once we understand how is it possible to fill the grid in row-major order, we can see that, in order to count the number of valid ways, there are few things to consider:

  • `(x,y)` : The current position in the row-major order. There are `O(nm)` such pairs. We assume that all earlier cells have already been chosen.
  • `k` : The number of L shapes we still haven't created.
  • `V` : A set of the columns currently containing vertical arms of (currently) incomplete L-shapes. This is a subset of the `m` columns, but it can contain at most 3 elements, thus there are `O(m^3)` such sets. More specifically, you shouldn create L shapes at the right-most column, so `V` is only a subset of the first `m-1` columns and with some calculation you can find that there are 4090 such sets in total for `m = 30`.
  • The state of the cell to the left of `(x,y)`, does it contain an unfinished horizontal L arm? We can use the variable `"horz"` for this, if `"horz"` is 1, we must continue the horizontal shape else it is 0.

In other words, the number of ways to fill the cells painted red in the following two examples ( `(x,y)` and the later cells in row-major order ) should be exactly the same:

This allows an `O(nm^4)` approach through memoization of a recursive implementation of the idea to count the ways to fill the grid. But that is not the whole story. We need to be careful: In practice, there are 4 values for `k`, 2 values for `"horz"`, 4090 values for the `V` set and 31*31 values for the `(x,y)` (we need the base case `x = n` and the special case `y = m` to keep the code simple). Also, the result needs to be a 64 bits integer. Thus the memory usage is 4*2*4090*31*31 * 8 bytes ~= 239 MB, quite close to the 256 MB limit and thus we should avoid overhead. Also notice that slight modifications to the logic we used could have made us require more states. There is a 3 seconds limit but we should also avoid having too much of a constant factor. A key aspect of the implementation is how to deal with the `V` set. As is usual, we can use bitmasks, but of course we cannot downright use 30 bit bitmasks to store results in memory, a workaround is to index the bit masks.

Code

    
vector<string> grid;
int n, m;

long mem[31][31][4][MAX_MASKS][2];
map<int,int> getMaskId;
int validMasks[MAX_MASKS];

long f(int x, int y, int need, int maskid, int horz) {
    long & res = mem[x][y][need][maskid][horz];
    if (res != -1) {
        return res;
    }
    int mask = validMasks[maskid];
    if (x == n) {
        // base case
        res = ( ( (mask == 0) && (need == 0) ) ? 1: 0 );
    } else if (y == m) {
        res = ( (horz == 1) ? 0 : f(x + 1,0, need, maskid, 0) );
    } else if ( (1<<y) & mask ) {
        if ( (horz == 1) || (grid[x][y] == '#') ) {
            // if (horz == 1) must continue the left "L" shape,
            //  but also must continue the above one
            res = 0;
        } else {
            //1) Finish the L shape above.
            int nmask = mask ^ (1 << y);
            res  = f(x,y+1, need, getMaskId[nmask], 1);
            //2) Continue the L shape vertically
            res += f(x,y+1, need, maskid, 0);
        }
    } else if (horz == 1) {
        if (grid[x][y] == '#') {
            res = 0;
        } else {
            res = f(x,y+1, need, maskid, 1) + f(x,y+1, need, maskid, 0);
        }
    } else {
        //1) Do nothing
        res = f(x,y+1, need, maskid, 0);
        //2) Create a new L shape
        if ((grid[x][y] == '.') && (need > 0) && (y + 1 < m) ) {
            //2.1) Create a new L shape
            int nmask = mask | (1 << y);
            res += f(x,y+1, need - 1, getMaskId[nmask], 0);
        }
    }
    return res;
}
 

long countWays(vector<string> grid)
{
    this->grid = grid;
    n = grid.size(), m = grid[0].size();
    
    // generate all bitmasks with <= 3 turned on bits:
    int t = 0;
    auto addMask = [&](int mask) {  // don't mind the lambda syntax, it
        getMaskId[mask] = t;       // simplifies having to repeat this code
        validMasks[t++] = mask;    // for each number of 1 bits.
    };
    addMask(0);
    for (int i = 0; i + 1 < m; i++) {
        addMask( 1 << i );
        for (int j = i + 1; j + 1 < m; j++) {
            addMask( (1 << i) | (1 << j) );
            for (int k = j + 1; k + 1 < m; k++) {
                addMask( (1 << i) | (1 << j) | (1 << k) );
            }
        }
    }
    cout << t << " masks found for " << m << endl; 
    
    // run the recursive solution:
    memset(mem, -1, sizeof(mem));
    return f(0,0, 3, 0, 0);
}

Friday, April 04, 2014

SRM 615: Recursive, recursive

Already reached the half of this python experiment. This time , python hasn't really helped me and it wasn't an obstacle either. Quite mundane, really.

Div1 250: The one with ameba

First of all, this is a 7:00 AM match so I had a bit of issues getting brain to work at first. I didn't really understand the statement very well initially. Then the problem itself is quite special, ad hoc and cool. So it took a while. Even so, I was surprised when it turned out that tons of people had a much better score than I.

The problem (after tons of translation) is as follows:

An amoeba has size `s` , a integer. There is a list `X` of amounts of gel it met in chronologically order, so first it met `X[0]` gel, then `X[1]` gel, etc. `X` has at most 200 elements. Whenever the amoeba meets a gel with size exactly equal to its current size `s`. The ameoba is forced to eat the gel and thus double its size. `s' = 2s`. We don't know the initial size `s_0`, but we wonder how many integers `u` exist such that the final size of the ameoba cannot be equal to `u`.

All integers that do not belong to `X` are possible final sizes. For example, take a integer `a` that is not in `X` and make it the starting size. Since none of the gels the amoeba will meet is exactly `a`, the amoeba will never double its size and the final size is `a`.

So we should only care about the integers that are already in `X`. For each of those integers, answer: "Can the final size be equal to `X[i]`?".

Given a integer `a`, how do you know if it is a possible final size? Focus on the last integer in the sequence: `X[n-1]`. There are two possibilities in which the final size is `a`:

  • The size was already equal to `a` before the amoeba met gel `X[n-1]` and the amoeba didn't eat it. This requires two conditions to be true:
    • `a != X[n-1]`, because the amoeba must not eat it.
    • It is possible for the amoeba to reach size `a` after meeting the first `n-1` gels.
  • The size of the amoeba became `a` after meeting `X[n-1]`:
    • This means that `2X[n-1] = a`.
    • And also `X[i]` should be a possible size after meeting the first `n-1` gels.

So we have a recursive idea. The issue is that `n = 200`, so let's get clever. First find the set of impossible integers for the first 1 elements (easy), then use that set to build the set of impossible integers for the first 2 elements, then the first 3 elements and so and so. E.g: If you know the set of impossible elements for the first `n-1` elements, it is easy to know if `a` or `X[i]` are impossible or not to have after meeting the first `n-1` elements. This yields an `O(n^2log(n))` solution if you use a logarithmic set or `O(n^2)` if you use a hash table.

class AmebaDiv1:
 def count(self, X):
    impossible = { X[0] }
    for i in range(1, len(X)):
        impossible2 = set()
        # which are impossible for X[:i+1] ?
        for j in xrange(0,i+1):
            a = X[j]
            # is a possible?
            # a) ignore X[i],
            poss = False
            if X[i] != a and a not in impossible:
                poss = True
            # b) use X[i] to double
            if 2*X[i] == a and X[i] not in impossible:
                poss = True
            if not poss:
                impossible2.add(a)
        impossible = impossible2
    
    return len(impossible)

So I said python didn't help that much. It is a very procedural algorithm and well, I think that if I did it in c++ it would take the same amount of lines. There are possibly much better python implementations.

The rest

Tried the other problems, not very thrilled about having to write editorial because div1 550 seems hard and div1 950 is incredibly complex in what it asks for. Currently waiting for the challenge phase to end. Blog post will be posted automatically once it does so.

While watching the challenge phase, I learned a new solution and the reason everyone had such fast scores. Remember that either `a` is possible in the first `n-1` or `X[i]` is possible. It follows that the only starting integers you should try and simulate are those already in `X[i]`, if you test each of those integers as the starting one, you can easily find which `X[i]` are possible. The solution is then to simply do that.

Friday, March 21, 2014

SRM 613: The doubt

Third python SRM, this time I am not sure if the solution for div1 500 is doable in python, but I was very far from figuring it out so maybe the solution is actually quite fast.

Div1 250: cats

There are many cats in the horizontal axis. You know their integral positions. Each cat has to move exactly `M` units left or `M` units right. Find the minimum gap between left-most and right-most cat possible after a single move. Positions are between -100000000 and 100000000

I will find it hard to explain it editorial-style, because it is rather obvious that there should be a pattern like RRRR...LLLL, such that some cats on the left side of this pattern move right and the rest move left. There are only `O(n)` of these patterns so we can try them all.

My way of acquiring those patterns was to pick a position and tell cats to move towards that position. It passed examples, and submitted quite fast (I think it is because of python's shortness of code), then I noticed a potential bug. If I tell a cat to move towards itself, then it will always pick the left side (I used <= operator) instead of maybe trying the right side. So for a while I wondered if there is a case in which the cat in question needs to move to the other direction. I eventually found that 's not the case. You don't need to try both LLLLLLLL...LL and RRRRRRRRRR...R, because the end result will be the same. So I thankfully avoided a needless resubmission penalty.

class TaroFriends:
 def getNumber(self, X, T):
    def towards(z,x):
        return z + T if z <= x else z - T 
    
    def allTowards(x):
        l = [ towards(z, x) for z in X ]
        return max(l) - min(l) 
        
    return min( allTowards(x) for x in X)

Python seems to be making me trend towards functional-ish programming. Would you take a look how beautiful this code is? I checked out other people's codes and they are awful :)

Div1 500: GCD madness

So we should pick `N` numbers such that each number is between two bounds and the GCD of all the picked numbers is `K`. All the numbers in the input are between 1 and 10^9. Have fun

I didn't really do much.

The maximum number of distinct prime factors of `K` is 9, because the product of the first 10 prime numbers is larger than 10^9. So maybe if we focus on the prime factors... Each picked number must be a multiple of `p^r` if `p` is a prime factor of `K` and `r` its exponent in the factorization. Also, `r` should be the minimum exponent of this prime number for all the `N` numbers. Besides of this I didn't have much luck.

I initially made a huge mistake, thought `N` was up to 50, so I was coding some dynamic programming (which still had complicated parts like having to count numbers that are multiples of something but not multiples of a list of other numbers). When I figured this was completely wrong, I had tons of code and the match was ending, so I submitted it for challenge phase fun. Too bad the code was challenged very fast.

So

When python works, it works, It has been a while since I had such a relatively high score in div1 easy. It was all because of python's potential for concise code.

Tuesday, March 04, 2014

SRM 611: My first python SRM

The other day, I made the foolish decision to use Python exclusive in SRMs 611 to 620. I suspected it will be bad for my rating at least initially. I am not as experienced with python as I am with c++ and there is always a good chance the admins will include a problem in the set that is unsolvable in python. On the other hand, maybe the 250 has a quick to think solution and then python's conciseness would help me score some points.

Div1 250: The one with LCM

You are given two sets `A`, `B`, the LCM set of a set `S` is the set of all numbers `L` such that `L` is equal to the LCM of the numbers in a subset of `S`. Given `A` and `B` find if their LCM sets are equal. Numbers in each set are between `2` and `10^9`, inclusive.

Tough luck, it didn't appear this was going to be a quick problem to code, and at first I had no idea what to do.

Eventually, I thought of something that appears to be a solution. Given a set `X`, we can get rid of any number that can be created by taking the LCM of other numbers in the set. After doing this, set `X` becomes the minimal set such that its LCM set will be equal to `X`. If we find the minimal sets of `A` and `B`, they will be equal if and only if the LCM sets are equal.

Implementing was another issue. How can you tell if a number can be made from the other numbers in the set? Well, you would need to process the numbers in decreasing order. I focused on the prime decomposition of each number, if the exponent of at least one prime number is larger than the maximum ever exponent in the numbers smaller than it, then this number shouldn't be removed. I needed to code Eratosthenes sieve in python for the first time, it probably isn't so well coded.

I made a mistake, the first implementation would ignore the smallest number in each of the sets, so it thought `{7}` had the same LCM set as `{8}`. The slow submission plus the resubmit made me score incredibly low. I am not 100% sure about this approach, but if I fail system tests it will not make a difference, my ranking in the match will be very low either way.

import math

class LCMSet:
 def equal(self, A, B):

    # greatest common divisor
    def gcd(x,y):
        while y != 0:
            (x,y) = (y, x%y)
        return x
    
    # least common multiple:
    def lcm(x,y):
        return x * y / gcd(x,y)
    
    # Eratosthenes siege returns an iterator of prime numbers, we don't need
    # primes larger than sqrt(max( max(A), max(B) ))
    def sieve():
        t = int( math.sqrt(max(max(A), max(B))) + 1 ) 
        compos = [ False ] * (t + 1)
        for i in xrange(2, t + 1):
            if not compos[i]:
                yield i
                j = i + i
                while j <= t:
                    compos[j] = True
                    j += i
    
    # make a list of prime numbers
    primes = list( sieve() )
    
    # prime decomposition:
    def primeDecomp(x):
        d = dict()
        for p in primes:
            c = 0
            while x % p == 0:
                x /= p
                c += 1
            if c > 0:
                d[p] = c
        if x > 1:
            d[x] = 1
        return d

    # "compress" a set, find the minimal set that has a LCM set equal to X
    def compress(X):
        decomp = [ primeDecomp(x) for x in reversed(X) ]
        n = len(X)
        c = []
        for i in xrange(n - 1):
            good = False
            # for each p^r in the prime decomposition:
            for (p , r) in decomp[i].items():
                if r > max( 0 if p not in decomp[j] else decomp[j][p] for j in xrange(i+1,n) ):
                    good = True
            if good:
                c.append( decomp[i] )
        #smallest number is always there. Missing this line caused the resubmit
        c.append( decomp[n-1] )
        return c
        
    return ['Equal','Not equal'][compress(A) != compress(B)]

Div1 500: The one with spanning tree

You are given a complete graph of up to 20 vertices with weighted edges (It is given in the form of points in a map that you have to connect using a tree topology). Return the minimum standard deviation of the edges of a spanning tree.

Easy, right? Well, the `20` constraint hints me that some heavy dp or brute force is needed. I remembered a match in which we had some sort of brute force/dp through all spanning trees, but I didn't remember enough. Also, the constraint wasn't a good sign for whether the problem is solvable in python or not :/

The end

Plenty of very different solutions for 250. I have no idea if I will pass, I expect many system test fails.

What about python? Well, this wasn't a great match for comparisons. I think I would have taken very long to code a solution in c++ too. I am currently sleep deprived, I stayed up till very late last night trying to finish something that barely passes as an editorial for SRM 610. I guess there is a good reason I am doing this for 10 SRMs and not just a couple.

Update: Failed system tests

What a bad day. Div1 250 approach fails the following test case: [{168, 7476, 84, 890, 10, 4, 40, 3560, 37380, 8, 89, 20}, {84, 4, 20, 7476, 3560, 89, 712, 420, 74760, 1780, 14952, 8, 356, 10}], it is a wrong answer so this is likely a mistake with the algorithm and not python's fault.