Saturday, June 28, 2014

SRM 626: Not a math person

Another SRM.

Div1 250: The one with dice

Alice throws `a` dice with `b` faces each (Numbered 1 to `b`, all equally likely). Bob throws `c` dice with `d` faces each (same). The score is the sum of all the dice rools each player gets. Return the expected value of Alice score IN CASES SHE WINS. If it is impossible for Alice to win, return -1. All `a,b,c,d` are at least 1 and at most 50

Okay... For each die, can you calculate the probability to get `X` as the total result? Well, yes. This is an `O(a^2 b^2)` dynamic programming thing. Possibly other good methods too. The thing is that we can precalculate two lists that give us the probability Alice gets `X` and the probability Bob gets `Y`.

Then for each pair `(X,Y)` it is easy to get the probability that Alice gets X AND Bob gets Y. So consider only the cases `(X > Y)` and find the probability Alice wins. If the probability is 0, return -1.

Once again, for each pair `(X,Y)`, find its probability, if Alice wins in this case, then the probability that Alice gets that score, provided Alice wins the whole game is `P_A(X) * P_B(Y) / P_"Alice wins"`. The sum of these probabilities multiplied by `X` is the expected value.

My first instinct was to use Bayes theorem , maybe there is a way to simplify things with it?

Total complexity is `O(a^2 b^2 + c^2 d^2 + abcd)`, so around 3* 2500 * 2500 in the worst case :)


vector<double> findProbabilities(int a, int b)
{
    double dp[51][2501];
    
    memset(dp,0,sizeof(dp));
    dp[0][0] = 1.0;
    
    for (int n = 1; n <= a; n++) {
        
        for (int y = 1; y <= n*b; y++) {
            dp[n][y] = 0.0;
            for (int x = 1; x <= b && x <= y; x++) {
                dp[n][y] += dp[n-1][y - x] / b;
            }
        }
    }

    
    vector<double> res(a*b + 1, 0.0);
    for (int i = 0; i <= a*b; i++) {
        res[i] = dp[a][i];
    }
    return res;
}

double getExpectation(int a, int b, int c, int d)
{
    vector<double> Alice,Bob;
    
    Alice = findProbabilities(a,b);
    Bob   = findProbabilities(c,d);
    
    double pw = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                pw += Alice[A] * Bob[B];
            }
        }
    }
    
    if (pw <= 0.0) {
        return -1.0;
    }
    
    double ew = 0;
    for (int A = a; A <= a*b; A++) {
        for (int B = c; B <= c*d; B++) {
            if (A > B) {
                double p = Alice[A] * Bob[B]; //probability this happens
                ew += A * (p / pw);
            }
        }
    }
    return ew;
}
};

Div1 medium the one with negative edges

A directed graph, possibly with multiple edges and loops and cycles. Find the minimum cost to move from a vertex to another.... EXCEPT that you have at most `X` opportunities to make the cost of an edge negative when you use it. `X` can be is obscenely large.

Okay, no idea.

Div1 hard: The one with mirrors

There is a rectangle with mirror sides of some dimensions. Find each angle from which you can cast a ray from the bottom-left corner in such a way that the ray of light bounces exactly `b` times before touching any corner and finally reaches the top-right corner. `b` is at most 1000000000. For each of those angles, add the total distance traveled by the ray before hitting the corner. Calculate all that modulo 1000000007

I thought I had a good chance to solve this problem. Because of the old codejam problem Hall of mirrors I knew a lot of properties about these rectangles of mirror sides. For example, we could imagine that a bounce is the same as moving around a grid made of rectangles of those mirrors. I would have more luck explaining this if I had time to inkscape....

The thing is that if you imagine a grid of mirrors and make the coordinates in each of these mirror corners `(x,y)` . Then moving from the origin to this coordinate is valid if and only if: `(x,y)` are coprime and also the coordinates should share a diagonal with `(1,1 + b)`. In addition, `b` must be even.

Unfortunately, because `b` can be very large, just knowing that isn't enough. It needs some mathematical trick to avoid doing an `O(b)` loop.

The following passes examples 0 o 3.

const int MOD = 1000000007;

long gcd(long a, long b)
{
    while (b != 0) {
        tie(a,b) = make_tuple(b, a%b);
    }
    return a;
}

int findSum(int sideA, int sideB, int bounces)
{
    if (bounces % 2 == 1) {
        return 0;
    }
    long y = 1 + bounces;
    long x = 1;
    
    // find solutions to this:
    // 1 <= y - p <= y
    // gcd(y - p, 1 + p) = 1
    
    long sx = 0, sy = 0;
    while (y >= 1) {
        if (gcd(x,y) == 1) {
            sx += (x*x) % MOD;
            sy += (y*y) % MOD;
        }
        y--;
        x++;
    }
    sx = ( ((sx * sideA) % MOD) * sideA) % MOD;
    sy = ( ((sy * sideB) % MOD) * sideB) % MOD;
    return (sx + sy) % MOD;
}

I submitted it just for fun, it will time out with slightly large `b`. No one dares to challenge it so far.

Thursday, June 19, 2014

SRM 625: I semi-wrote it

Two days ago I was busy trying to come up with a way to finish SRM 624's editorial. I get how to solve div1 hard in the top level, I have no idea how to explain the initial step: Split in centers subtrees. I am fairly sure it is probably a known algorithm, can't find info. Was stuck there when I was contacted by the admins about SRM 625. They needed problems urgently and I apparently had some problems that were created long ago and could work.

I wrote all problems except div1 hard (the important one). They really weren't intended to be part of the same match. And I think that they go more towards the standard side of things. Please note that I've gone through a huge writer's block and I can't seem to come up with interesting problems anymore. But oh well. I just hope there are no big issues of the sort that cause match cancellations.

Div2 250: The one with challenge phase

Unlike the other problems, I thought of this one specifically for this match. Ever since multiple-answer problems were added to topcoder, I wanted to make something that takes advantage of it.

The problem is easy, given `y` find `(a,b,c)` such that `ab + c = y`, `(a,b,c)` must be integers between -1000 and 1000 different to 1 and 0. `y` is between 0 and 500. Otherwise any `(a,b,c)` is correct.

There is a bruteforce search here. But we can also get clever. Note that no matter which two values `(a,b)` you pick, there will always be a `c = y - ab`. So we can do `(a,b) = (1,1)` and then `c = y - 1`. This would be wrong because values cannot be 0 or 1. But we can try to pick values `(a,b)` that don't have this problem. For example `(a,b) = (2,2)`, now `c = y - 4`. This is almost fine, but `y = 4` or `y = 5` would yield a wrong value of `c`. We could handle those cases separately OR we could try something different. We would like `y - ab` to be distinct to 0 or 1. How about : `y - ab > 1` then `ab < y -1`. We can guarantee this to be true with `(a,b) = (2,-2)`. Then we have `c = y + 4`. So the result can always be `(2,-2, y + 4)`.

Div2 500: The one with sequences

This problem is very old. I think I thought of it for some TCO qualification round of a past TCO. I didn't even remember much about the problem. It was already developed but it was never used. So we mostly just enabled the problem for use.

If I remember correctly, this problem can be very tricky.

Div2 1000: Minimum Liars Again

A circle of friends. For each i, we have answers[i]:

  • If answers[i] is ?, we didn't ask friend i anything.
  • Else we asked the i-th friend if friend (i + 1) mod n is a liar. If friend i is a honest person, they would always tell the truth , else they would always lie. answers[i] is H if friend i said friend (i + 1) mod n is a honest person, else L.

Return the minimum number of liars in a consistent case. If no case can be consistent, return . 1.

This is another very old problem. It was also already developed, but for `n <= 20`. We decided `n <= 50` is better for this problem slot.

The idea is simple: If there are no ? in the input, the whole thing is cyclic. You can try making the first person a liar or not. If first person is a liar, and you know their answer, then you know if second person is a liar, then third person and so and so. Eventually, you will know if person `n - 1` is a liar or not, and this is important the conclusion we make about person 0 must be consistent with the hypothesis we made. Return the minimum number of liars in the consistent cases.

If there is at least one ? , the case is not really cyclic. You can just split the friends in groups of consecutive non-? friends. These cases are linear and you can use the same logic (try if first person is a liar or not). But this time, no consistency is needed.

Like this:

import itertools
class ConundrumReloaded:
 def minimumLiars(self, answers):
        
    def simulateLine(f, s):
        p = f
        liars = 0
        for i in range(1, len(s) + 1):
            h = (answers[i - 1] == 'H') ^ (not p)
            liars += not h
            p = h
        #print (f,s), (liars,p)
        return (liars, p)
        
    n = len(answers)
    q = -1
    for i in range(n):
        if answers[i] == '?':
            q = i
    INF = 10**9
    res = INF
    if q == -1:
        for f in (0,1):
            (liars, p) = simulateLine(f, answers)
            if p == f:
                res = min(res, liars)
    else:
        res = 0
        curr = ''
        for i in range(n + 1):
            j = (i + q) % n
            if answers[j] == '?':
                if curr != '':
                    res += min( (not f) + simulateLine(f, curr)[0] for f in (0,1) )
                    curr = ''
            else:
                curr = curr + answers[j]
        
    return -1 if res >= INF else res

Div1 250: When you are desperate combine palindromes and probabilities

Given a string. We pick one of its anagrams with uniform probability. Return the probability the picked anagram is palindrome.

This problem is also very old. I think I tried to use it in many matches, but (thankfully) we always found something more interesting and less standard. Until now.

Simply return: (Number of palindrome anagrams) / (Number of anagrams).

The number of anagrams is calculated by dividing `n!` by `r!` for each letter in word that is repeated `r` times.

So we just need to calculate the number of palindrome anagrams. First of all, if `n` is even, then all characters must appear an even number of times. If `n` is odd then only one character should appear an odd number of times. If `n` is odd, we already know what character is in the center and we can just remove it and treat it as if everything was even.

We have many letters that appear an even number of times. We know that we should put each of these letters in both sides of the palindrome. The order depends only on the first side. So actually, we should count the number of ways to order the `n/2` letters that remain after dividing each letter quantity by 2. This is the same as dividing `(n/2)!` by `(r/2)!` for each letter that is repeated `r` times.

As I type this, it appears that many people including div1 hard's writer have issues with precision in their solutions.

Div1 500: The one with blocks

The original version of this problem was meant for the last TCO rounds, I think last year. But was too complicated. In the original version , there were multiple goals and exactly two start cells. I don't even remember how was it solved.

In the current version, there are multiple start cells but only one goal. The game is symmetric so we can just begin at the goal and if it reaches a start cell, there is a solution if you begin at that start cell.

So we should just find the minimum number of holes to cut the path between the goal cell and any start cell.

Now analyze the allowed moves, you will see that the set of reachable cells is like this:

*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*
A  A  A
V  V  V
*SZ*SZ*

Where SZ are horizontal 2x1 pairs of cells that can be touched by a 2x1 face, the * are the cells that can be touched by 1x1 faces (including the goal) and AV the vertical 2x1 pairs of cells.

We can just ignore any start cell that is not *, no matter what we do, we can't reach them.

So now we just have a minimum cut problem. The trick is to take the 2x1 pairs of cells as edges then the cost to remove an edge equals the number of normal cells in the edge. You can also remove * cells, this means using an intermediary edge connecting entry and exit point for each * cell. Okay. The minimum cut requires you to get clever and implementation will be a headache but the concept is easy!

Thursday, June 12, 2014

Twitter

Around 2010 I started using twitter. That old account has turned into something. I participate in twitter a lot and I am very opinionated so it has taken a direction that may not be liked by everyone who just wants to talk topcoder stuff.

I really started to feel the need for a new , more professional / respectable twitter account, so I started a new one that will be focused on programming and mostly contests (you know, my job). My old account will probably just retweet those things. But if you want to follow me on twitter and only get programming (contests) opinions and updates and links to updates to this blog, you should follow the @fakevexorian account.

*Taunts you*

In today's SRM 624 there was a problem that asked you to return something modulo 1000000009. Usually the modulo is 1000000007 or 1000000009 , but people have the impression it is 1000000007 most of the time.

Usually topcoder example tests in this kind of problem include a test case that makes use of the modulo. This is intentional so that people can catch a mistake in writing this constant. Unfortunately, today's problem didn't have such test case.

I didn't even notice. If you read my recap of SRM 624 you will see I even mistakenly used 1000000007 when typing the explanation. And yet I passed , well I have a secret weapon (not so secret).

This is my Greed c++ template:

#include <bits/stdc++.h>
// ${Contest.Name} - ${ClassName} (${Problem.Score} points) by vexorian
using namespace std;

struct ${ClassName}
{
${<if Problem.Description.Modulo}
    const int MOD = ${Problem.Description.Modulo};

${<end}
    ${Method.ReturnType} ${Method.Name}(${Method.Params})
    {
        return ${if Method.ReturnType.Array}{}${else}${Method.ReturnType;zeroval}${end};
    }
};

${CutBegin}
${<TestCode}
${CutEnd}

Take a look to the if Problem.Description.Modulo part. Yes, this part of my template actually generates the MOD constant for me :). One day I spent an afternoon or so making the Greed topcoder plugin able to parse the modulo constant and make it accessible to templates. This is why :).

SRM 624: Slooooooooow

I just did the most amazing c++ code, got to share.

Div1 250: The one with buildings

You are given a list of integers (building heights). The maximum number of integers possible is 4000 and the maximum integer possible is 4000. For each `0 <= i <= n`, find the minimum number of steps so that the final sequence has `i` equal integers. Every step is to increment a single integer by 1 (building a new floor in a building).

I had some difficulties getting to the right approach. At first trying to do things like sweeps and dynamic programmings :/ . Turns out the trick is to solve: What is the minimum cost to have `y` buildings of height `x` ? For each `x`, calculate those costs for all `y`, updating the values in a total cost function that considers all possible heights.

The trick is that there are only `O(N)` relevant values of `x` (those equal to the heights), and that it is easy to find, for each `y` the minimum cost to have `y` buildings of height `x`. Just pick the `y` largest buildings of height greater than or equal to `x` and increment them. This gives us an `O(N^3)` algorithm, however, for each `y`, you can reuse the result of `y-1` (you know the cost of fixing `y-1` buildings, just add a single building and we know the result for `y`.

This is an `O(N^2)` algorithm:

int minimum(vector<int> heights)
{
    int n = heights.size();
    // sort non-increasingly by height:
    sort(heights.begin(), heights.end());
    const int INF = 4000 * 4000;
    // cost[i] will have the cost to have i equal buildings
    vector<int> cost(n + 1, INF);
    cost[1] = 0;
    // for each height:
    for (int i = 0; i < n; i++) {
        int c = 0;
        // for each number of buildings i - j + 1:
        for (int j = i - 1; j >= 0; j--) {
            // the minimum cost to have i - j + 1 buildings of this height 
            c += heights[i] - heights[j];
            // remember the best
            cost[i - j + 1] = std::min( cost[i - j + 1], c);
        }
    }
    
    
    // the problem actually asks for the XOR of all values, but I suspect
    // this is because of limitations in the return value size.
    int res = 0;
    for (int i = 2; i <= n; i++) {
        res ^= cost[i];
    }
     
    
    return res;
}

Div1 450: The one with implementation

Classical problem. You are given a graph (no loops, bidirectional, connected, weighted, some weights are zero). Return the total number of shortest paths between vertices `0` and `N-1` modulo 1000000007. Note that zero-weight edges might make the number of shortest paths infinite, return -1 in that case. At most 2000 vertices and at most 2000 edges.

Okay, the complicated path is when paths are infinite. This can only happen if there is at least one shortest path that visits an edge with weight 0. So we need a way to know, for each edge, if it can belong to a shortest path.

The way to check if an edge `(u,v)` can belong to an optimal path requires us to know: The weight of the edge, the minimum distance `D` between 0 and `N-1`, the minimum distance `d` between 0 and `u` and the minimum distance `r` between `N-1` and `v`. If `d + w + r = D`, then the edge can be used in a shortest path. If any 0-weight edge is one of those, return -1.

The rest is to count the number of paths. How about `f(x)` which returns the number of shortest paths towards `N-1` starting with `x`. `f(N-1) = 1`, there is only one shortest path there. For the other vertices `x`, you find every `y` such that `(x,y)` can belong to a shortest path. Then `f(x) = sum(f(y))` , easy, right? . Just evaluate the vertices in non-increasing order of distance to `N-1`, since none of the edges used will have weight 0, we can do this sorting.

I broke a record with this problem. I used 4 lambdas in total. And I even nested lambdas this time, which is so unusual in c++ :).

const int MOD = 1000000009;



int count(int N, vector<int> A, vector<int> B, vector<int> C)
{
    // make the graph easier to use:
    vector<vector<int>> g(N);
    vector<vector<int>> w(N);
    vector<int> degree(N);
    for (int i = 0; i < A.size(); i++) {
        degree[--A[i]]++;
        degree[--B[i]]++;
    }
    for (int i = 0; i < N; i++) {
        g[i].resize(degree[i]);
        w[i].resize(degree[i]);
        degree[i] = 0;
    }
    auto addEdge = [&](int u, int v, int weight) {
        g[u][degree[u]] = v;
        w[u][degree[u]++] = weight;
    };
    for (int i = 0; i < A.size(); i++) {
        addEdge( A[i], B[i], C[i] );
        addEdge( B[i], A[i], C[i] );
    }
    
    //Dijkstra time!
    auto dijkstra = [&](int s, vector<int> &dist) {
        set<pair<int,int>> Q;
        const int INF = 2000000000; 
        dist.resize(N, INF);
        for (int i = 0; i < N; i++) {
            Q.insert( {INF, i} );
        }
        auto push = [&](int x, int d) {
            if (d < dist[x]) {
                Q.erase(Q.find( {dist[x],x}));
                dist[x] = d;
                Q.insert( {dist[x], x} );
            }
        };
        push(s, 0);
        while (! Q.empty()) {
            auto it = Q.begin();
            int x = it->second;
            Q.erase(it);
            
            for (int i = 0; i < degree[x]; i++) {
                int v = g[x][i];
                int we = w[x][i];
                push(v, dist[x] + we);
            }
        }
    };
    vector<int> dist, rdist;
    dijkstra(0, dist);
    dijkstra(N-1, rdist);
    
   
    // if any of the shortest paths visits an edge of weight 0, the result
    // is -1. It is possible to have 0s in the input and result diff to -1
    for (int i = 0; i < A.size(); i++) {
        if (C[i] == 0) {
            int u = A[i], v = B[i];
            for (int r = 0; r < 2; r++) {
                if (dist[u] + rdist[v] == dist[N-1]) {
                    //critical
                    return -1;
                }
                swap(u,v);
            }
        }
    }
    
    // And now the part in which we actually count the roads.
    vector<long> dp(N);
    vector<int> byRDist(N);
    for (int i = 0; i < N; i++) {
        byRDist[i] = i;
    }
    sort(byRDist.begin(), byRDist.end(), [&](int a, int b) {
        return rdist[a] < rdist[b];
    });
    for (int u: byRDist) {
        dp[u] = 0;
        if (u == N-1) {
            dp[u] = 1;
        }
        for (int i = 0; i < degree[u]; i++) {
            int v = g[u][i];
            int we = w[u][i];
            if ( (rdist[v] < rdist[u]) && (dist[u] + rdist[v] + we == dist[N-1]) ) {
                dp[u] += dp[v];
            }
        }
        dp[u] %= MOD;
    }
    return dp[0];
}

The rest

I was very slow, everybody else has a much better score. I hope many of them fail. I am confident in the correctness of my solutions and I don't suspect of any likely implementation bug. Typing this as the challenge phase begins.

During challenge phase: 8 minutes of challenge phase, and I doubt I'll score a successful challenge, there are some tough challengers in my room and they are very quick to find errors. There seem to be many ways to get 450 wrong.

Wednesday, June 04, 2014

SRM 623: Saved by the challenge phase

Div1 300: The evil one

This problem was evil. Let me explain, you have a board of maximum size 20x20, some cells contain apples, some pears, some are empty. A valid move is to move a fruit to an empty cell. You can do at most `K` moves. The objective is to maximize the size of a rectangle containing only apples. What is the maximum?

Say you already picked an arbitrary rectangle, there are `O(n^4)` of those. There are two things that are important about this rectangle: `a` : Number of apples in side, `p` Number of pears inside, `e` number of empty cells inside. You can also know the number of apples/pears/empty cells outside the square. Finding this values is a complicated implementation part. Also, I think too slow for python, so I tried to do something in c++, using lambdas of course. It took eons. But when I finally did it , I had a new problem...

Knowing that information about the rectangle we need to find the minimum number of moves needed to have only apples in the rectangle.

At first I was trying something very wrong: Move each pear to a (different) empty cell outside, then move all the apples inside. This took me ages to debug, and when I finally passed system tests with it, I was about to click submit when I started to wonder "Is this really it?" , and I noticed that noooo, the minimum number of moves is much trickier to find. Hence the examples were weak.

So what is the real optimal strategy? Well, the first issue is that we only need one empty cell in the board. We can keep reusing it to do all the moves. Then we have this: If there are any empty cells in the square, it is easy to see that it is optimal to just move any outside apples and cover them. The remaining case is when there are some pears inside the rectangle, at least one empty cell outside and enough apples outside. A single pear is treated like this: Move pear to empty cell, move apple to the position the pear was in, this leaves an empty cell in the apple's position, and we can repeat.

So in fact, the minimum number of moves is equal to:

  • 0 if all the cells already contain apples.
  • # of empty cells inside the rectangle, if the rectangle contains no pears.
  • # of empty cells + 2 * number of pears inside the rectangle, if and only if there is at least one empty cell in the board
int getBoard(vector<string> board, int K)
{
    // stuff to count the number of apples / pears / empty inside a rectangle:
    int W = board.size(), H = board[0].size(); 
    map<char, vector<vector<int>>> total;
    string chars = ".AP";
    for ( char ch: chars ) {
        vector<vector<int>> t(W+1, vector<int>(H+1,0));
        for (int i = 1; i <= W; i++) {
            for (int j = 1; j <= H; j++) {
                int x = ( board[i-1][j-1] == ch );
                t[i][j] = x + t[i-1][j] + t[i][j-1] - t[i-1][j-1];
            }
        }
        total[ch] = t;
    }
    auto rectangle = [&](char ch, int x0, int x1, int y0, int y1) {
        auto &t = total[ch]; 
        return t[x1][y1] - t[x0][y1] - t[x1][y0] + t[x0][y0];
    };
    
    int A = rectangle('A', 0, W, 0, H);
    int E = rectangle('.', 0, W, 0, H);
    
    int res = 0;
    for (int x0 = 0; x0 < W; x0++) {
        for (int y0 = 0; y0 < H; y0++) {
            for (int x1 = x0 + 1; x1 <= W; x1++) {
                for (int y1 = y0 + 1; y1 <= H; y1++) {
                    // the actual logic:
                    // inside:
                    int e = rectangle('.',x0,x1,y0,y1);
                    int a = rectangle('A',x0,x1,y0,y1);
                    int p = rectangle('P',x0,x1,y0,y1);
                    
                    // out side:
                    int na = A - a;
                    int ne = E - e;
                    
                    if (e <= na) {
                        int x = e;
                        int mov = x;
                        na -= x;
                        a += x;
                        e = 0;
                        ne += x;
                        if (mov <= K) {
                            if (p == 0) {
                                res = std::max(res, a);
                            } else if ( (p <= na) && (ne >= 1) && (mov + 2*p <= K) ) {
                                res = std::max(res, a + p);
                            }
                        }
                    }
                }
            }
        }
    }
    return res;
}

Div1 450: The difficult one

All I can think of is that we can imagine a different problem, the points stay static and the player moves either in a vertical line upwards (increasing `y` coordinate) or in a diagonal (incrementing `y` and incrementing or decrementing `x`).

Challenge phase

I took too long to solve div1 300, but I did learn that the example cases were weak. So I thought of a challenge case to make my initial solution fail (using only one empty square). Luckily my room wasn't very aggressive and I could find 2 solutions with that wrong idea and earned 100 points.