Showing posts with label algorithm. Show all posts
Showing posts with label algorithm. Show all posts

Saturday, April 09, 2016

Google Code Jam 2016

I completely forgot about the qualification round. This morning I woke up and was planning to have a normal day when I accidentally noticed in my calendar applet that "Qualification Round" was on. What qualification round!? Oh no, the google one...



So I decide to register to the contest and it's 15 hours late. What are even the rules? Is it the same as usual? What's the cut-off? It turns out everything is mostly the same. I need 30 points and just the small inputs are enough to give you 37 points. So it should be easy to qualify.

Perhaps it's too similar to previous editions? I think 1000 t-shirts is pretty low, considering how much the attendance has grown since this number was established. Actually when I look at the past I think I would have never really put as much work on programming contest if, ages ago, when I was getting started, there weren't as many small rewards for simple things as they were. In the past, in codejam or topcoder open, qualifying to the elimination rounds was worth a t-shirt and there were extra little rewards for advancing in other rounds. In fact, the first codejam done using Google's own site had regional semifinals so if you were in the top 500 you'd already win a paid trip to a google office.

Back to the present. Qualifying, even 15 hours late should be pretty easy for a decrepitveteran contest participant such as I. But the real contest is to get a perfect 100/100. So what are my chances there? Looking at the score board it seemed like this qualification round had easier problems than previous editions, as a really considerable amount of competitors already had the perfect score. So I gave it a shot. Note that I am starting to write this before the end of the round, so I don't know if I scored properly.


Problem A (Large)


Problem Statement

This is a straight-forward problem, you just need to simulate the steps until all digits are found. A pressing question is what's up with impossible cases? Well, I actually just coded right away and tested all 1000001 possible inputs from 0 to 1000000, with a maximum number of 1000 steps to find all digits. Even then, the simulation is very fast and the only case that doesn't reach all digits is `N = 0`, which will always be 0 no matter how far you go. So with that knowledge I just submitted an easy code:



    const int MAX_ITERATIONS = 1000;
   
    long solve(long x)
    {
        long y = x;
        set<char> s;
        for (int i = 0; i < MAX_ITERATIONS; i++) {
            ostringstream st;
            st << y;
            for (char ch: st.str()) {
                s.insert(ch);
            }
            if (s.size() == 10) {
                return y;
            }
           
           
            y += x;
        }
        return -1;
        // not pictured: I/O stuff including translating -1 to "impossible"
    }



Problem B (Small and Large)


Problem Statement

Well, the small version can be solved using a Breadth-First Search. Just consider a string of +/- as a vertex in some graph and each possible step is an edge between its initial setup and its resulting setup. Then you want to find the shortest path between the input and +++++...+ .  With `N <= 10`, there are at most `2^10 = 1024` vertices in the graph, so this is quite doable.


But I wanted to solve the hard version. Clueless as I often are, I just did the small version hoping that it would help me think of the hard solution.  I concluded some greedy approach was needed because all permutations of cookies are reachable from any initial setup so cutting the search space wouldn't work (for me).

After coding the small version and being able to see some of the solutions it generates, I noticed a very obvious pattern:



+++-++---+--+
----++---+--+
++++++---+--+
---------+--+
++++++++++--+ 
------------+ 
+++++++++++++  



Basically keep flipping the largest sequence of cookies at the top that have the same sign (+ or -) as the top-most cookie. You will eventually have a bunch of happy cookies. And this is the optimal solution. Because this is the optimal strategy in a different problem where a step consists only about changing the sign of a prefix of a string, and the other options in the real problem are not better.




    int solve(string s)
    {
        int n = s.size();
        int step_count = 0;
        while (s != string(n, '+') ) {
            int t = 1;
            while ( (t < n) && (s[t] == s[t-1]) ) {
                t++;
            }
            for (int i = 0; i < t; i++) {
                if (s[i] == '+') {
                    s[i] = '-';
                } else {
                    s[i] = '+';
                }
            }
            step_count++;
        }
        return step_count;
       
       
    }
   
   
    /* _Used this to think of the real strategy which was pretty obvious in
        hindsight ... */
    int solve_slow(string s)
    {
        int n = s.size();
        map<string, int> dist;
        queue<string> Q;
        dist[s] = 0;
        Q.push(s);
        map<string, string> parent;
        while (! Q.empty()) {
            string s = Q.front();
            Q.pop();
            for (int t = 1; t <= n; t++) {
                string ss = s;
                reverse(ss.begin(), ss.begin() + t);
                for (int i = 1; i <= t; i++) {
                    char & ch = ss[i - 1];
                    ch = ( (ch == '+') ? '-' : '+' );
                }
                if ( dist.count(ss) == 0) {
                    dist[ss] = dist[s] + 1;
                    Q.push(ss);
                    parent[ss] = s;
                }
            }
        }
       
        string w = string(n, '+');
        assert( dist.count(w) == 1 );
       
        {
            stack<string> S;
            string x = w;
            while (true) {
                S.push(x);
                if (parent.count(x) == 1) {
                    x = parent[x];
                } else {
                    break;
                }
            }
            while (! S.empty()) {
                cout << S.top() << endl;
                S.pop();
            }
        }
       
        return dist[w];
    }



Problem C (Small and Large)


Problem Statement

My initial hunch was that a huge majority of the possible binary strings are valid coins. Because really, primes tend to be hard to find and we want strings that aren't primes. Even though they can't be primes in many bases. But I still thought it would take a while to find 500 coins. So I decided to do something with threads. I find threads easier in python so I used python. When I finished coding it and tested, it turned out that the code was very fast, so the threads weren't needed :/, but it's still a cool looking solution:



N = 32
J = 500

NUM_THREADS = 4

EASY_PRIMES = [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71]

import random
import threading
import sys

mutex = threading.Lock()

seen_coins = dict()

def add_if_new(coin, divisors):
    with mutex:
        if coin not in seen_coins and len(seen_coins) < J:
            seen_coins[coin] = divisors
            sys.stderr.write( str(len(seen_coins)) + ' already!\n' )
            sys.stderr.write( str(attempt_count[0]) + ' attempts\n' )
            sys.stderr.flush()
           
           
def work_done():
    with mutex:
        if len(seen_coins) >= J:
            return True
        else:
            return False

def random_coin():
    x = [ random.randint(0,1) for i in range(N) ]
    x[0] = 1
    x[-1] = 1
    return tuple(x)


def is_divisor(coin, b, d):
    m = 0
    p = 1
    for x in reversed(coin):
        if x == 1:
            m = (m + p) % d
        p = (p * b) % d
    if m == 0:
        return True
    else:
        return False

def get_divisors(coin):
    divisors = []
    for b in xrange(2,10 + 1):
        for d in EASY_PRIMES:
            if is_divisor(coin, b, d):
                 divisors.append(d)
                 break
        else:
            return None
    return divisors

attempt_count = [0]

def worker():
    while not work_done():
        coin = random_coin()
        divisors = get_divisors(coin)
        if divisors != None:
            add_if_new(coin, divisors)
        with mutex:
            attempt_count[0] = attempt_count[0] + 1
            #sys.stderr.write( str(attempt_count[0]) + ' attempts\n' )
   
threads = [ threading.Thread(name='worker%d'%i, target=worker) for i in range(NUM_THREADS) ]

for t in threads:
    t.start()

for t in threads:
    t.join()

sys.stdout.write( 'Case #1:\n')
for x in seen_coins:
    for y in x:
        sys.stdout.write( str(y) )
    for div in seen_coins[x]:
        sys.stdout.write( ' ' + str(div) )
    sys.stdout.write( '\n')
    sys.stdout.flush()
    



I think one thing that improves the execution is I only test if the coin is a multiple of a small list of primes. Instead of looking for coins that are any composite number , I add the condition that they are a multiple of small primes. Most composite numbers are multiples of small prime numbers, so this shouldn't be an issue in theory, and in practice, it isn't.



Problem D (Small and Large)


Problem Statement

First the small problem is super trivial. Because `S = K`, so you can just test positions 1,2,3,...`S` and it will be enough. Because if there is any G in the initial string, then the final string will definitely have a G in that same position.


The larger solution is more complicated but not that much. Even though it took me a couple of hours to think of it .

If you want to be able to find any case that contains a G, then we should focus on the hardest cases, the ones that have only one G in the input: GLLL, LGLL, LLGL, LLLG. It turns out that if you include a G in position 1, it spreads to many positions in the later iterations. So imagine we had four positions: ABCD in the first string.  We want to find which positions of the second string would be effected by positions A, B, C and D (E.g: If there is a G in position A, then the second position of the second string will contain a G. If there is a G in position B, then the second position of the second string will also contain a G, so position 2 of the second string can be classified as AB. If you repeat this you will find: A, AB, AC, AD, AB, B, BC, BD, AC, BC, C, CD, AD, BD, CD, D. Perhaps it will be easier if you read it as: AA, AB, AC, AD, AB, BB, BC, BD, AC, BC, CC, CD, AD, BD, CD, DD. In fact, with 3 steps we have: AAA, AAB, AAC, AAD, ... DDD. And so and so, it's the same as thinking of numbers in base K. So for C = 3, if we want positions that cover as many initial Gs as possible then we want the positions that have ABC and DAA (or DAB or ADC, anything containing a D). You can get the position index by converting from base k to base 10. This explanation is probably very bad. Sorry, I can't do better without drawings.




    long translate(vector<int> digits, int K)
    {
        long res = 0;
        long p = 1;
        for (int i = 0; i < digits.size(); i++) {
            res += p * digits[i];
            p *= K;
        }
        /*for(auto x: digits) cout << " " << x;
        cout << " -> "<<res<<endl;*/
        assert(res >= 0);
        return res;
    }
    list<long> solve(int K, int C, int S)
    {
        list<long> res;
        int p = 0;
        while (p < K) {
            vector<int> digits;
            for (int i = 0; i < C; i++) {
                if (p < K) {
                    digits.push_back(p);
                    p++;
                } else {
                    digits.push_back(0);
                }
               
            }
            res.push_back( 1 + translate(digits, K) );
        }
       
        if (res.size() > S) {
            //assert(false);
            return {};
        } else {
            return res;
        }
    }




Final results


As I was typing, the round ended, let's check the final results...

All my large solutions were okay. But tons of people had them okay as well. I am barely in position 1246. That's pretty low, but I did start the round 15 hours late... (excuses)





Saturday, April 11, 2015

Google Codejam 2015 Qualification Round : Part 4: C small

Part 1: Problem D in Emily

Part 2: Problem A in unnamed language

Part 3: Problem B in Julia

C small

Problem statement

Unlike the other Small problems I didn't really think of a solution of this right away. I had to take a day to do it.

The trick to the problem is that we can find in `O(n)` the product of the `x` first left-most numbers (for all `x`). And the same for the right-most numbers. So imagine if we had a substring `[a,b]`. If left[a-1] is equal to `i` and right[b+1] is `k` then all that we need to learn is if the substring is equal to `j`. If we can answer that question in `O(n^2)`, we have the whole problem solved. And we totally can answer that question! We just need to start with an `a`, then `b = a` and calculate the product for `b = a`, then it is easy to calculate the product when `b = a + 1`, just multiply one more number. And so and so for `b = a + 2`, etc.

The only extra complication is doing the multiplication with `i`, `j`, `k` and the signs... Well, really there are 8 distinct numbers, and we can easily index them and make a table, multiplying this index with this other index yields this other index...

So I tried free pascal. I used to code in Pascal, and I hated it. Now I don't really remember much. So I guess it counts? I had to relearn Pascal a bit. But I reached a solution... which failed. And then it failed again and again... I really thought the problem was in that big hardcoded matrix, and spent hours trying to think of a mistake there. Turns out that free Pascal's Strings are limited to 255 characters (What the hell, free pascal?). You have to use AnsiString for limitless strings.

program C;

function solution(s : AnsiString): String;
const
   CONST_I = 2;
   CONST_J = 4;
   CONST_K = 6;
var
   i, j, n, left, curr : Integer;
   right : Array[0..100000] of Integer;
   t : Array[0..7,0..7] of Integer = (
       (0,1,2,3,4,5,6,7),
       (1,0,3,2,5,4,7,6),
       (2,3,1,0,6,7,5,4),
       (3,2,0,1,7,6,4,5),
       (4,5,7,6,1,0,2,3),
       (5,4,6,7,0,1,3,2),
       (6,7,4,5,3,2,1,0),
       (7,6,5,4,2,3,0,1)
   );
   inputInts: Array[0..99999] of Integer;
   
    function ch2index(ch: Char): Integer;
    begin
        ch2index := CONST_K;
        if ch = 'i' then ch2index := CONST_I;
        if ch = 'j' then ch2index := CONST_J;        
    end;

begin
    
    n := Length(s);
    solution := 'NO';
    for i:=1 to n do begin
        inputInts[i - 1] := ch2index(s[i]);
    end;
    right[n] := 0;
    for i := n - 1 downto 0 do begin
        right[i] := t[inputInts[i]][right[i+1]];
    end;
    left := 0;
    for i := 0 to n-1 do begin
        curr := 0;
        for j := i to n-1 do begin
            curr := t[curr][inputInts[j]];
            if (left = CONST_I) and (curr = CONST_J) and (right[j+1] = CONST_K) then begin
                solution := 'YES';
            end;
        end;
        left := t[left][inputInts[i]];
    end;
end;


var
   X, L : Integer;
   w : AnsiString;
   ww : AnsiString;
   T : Integer;
   i,j : Integer;
begin
    readln(T);
    for i := 1 to T do begin
        Readln(L,X);
        Readln(w);
        ww := '';
        for j := 1 to X do begin
            ww := ww + w;
        end;
        //writeln(ww);
        writeln('Case #',i,': ', solution(ww) );
    end
end.

The end

I qualified! This was a lot more fun and far more stressful than usual.

Google Codejam 2015 Qualification Round : Part 3: B small in Julia

(Part 1: Problem D) (Part 2: Problem A)

After the disaster solving A-large, it became clear that I needed to do more than 2 problems to get an advancing score. But also both B and C seemed likely to be impossible in the languages I wanted. But I still didn't want to use my normal languages. I mean at least B-small seemed quite obvious. A typical problem we can solve with memoization. If we set the state to be equal to an array of 9 elements containing the number of eaters that have 1, 2, ... 9 pancakes. We need to notice that it never makes sense to create a plate that has more pancakes than the original plate from which we took the pancakes. Otherwise each minute just shifts the array...

Anyway, I decided to look for a new language to learn. But something a bit more usable than the other two. I tried Julia. It actually felt like cheating, because it turns out this is a pretty good language. There's a couple of things I don't like, like the 1-indexing. But it was overall a nice experience. Similar to python but with some differences. And apparently benchmarks say it is a bit faster. You can also make it less dynamically typed if needed. Anyway:

function solve(P)
    mem = Dict()

    function f(seq)
        if haskey(mem, seq)
            return mem[seq]
        end
        #println(seq)
        res = 1000000000
        if count(x -> (x != 0), seq) == 0
            # all zeros, base case
            res = 0
        else
            # let one minute go:
            a = [ seq[i] for i = [2:9] ]
            push!(a, 0)
            res = 1 + f(a)
            
            # move something:
            for i = [2:9]
                if seq[i] > 0
                    for j = [1:i-1]
                        #if j != div(i , 2)
                        #    continue
                        #end
                        for k = [1 : i - j - 1]
                            if seq[k] > 0
                                b = [x for x = seq]
                                b[k] -= 1
                                b[k + j] += 1
                                b[i] -= 1
                                b[i - j] += 1
                                res = min(res, 1 + f(b) )
                            end
                        end
                        # move to empty plate
                        if i - j < i
                            b = [x for x = seq]
                            b[j] += 1
                            b[i - j] += 1
                            b[i] -= 1
                            res = min(res, 1 + f(b) )
                        end
                    end
                end
            end
        end
        mem[seq] = res
        #print (seq)
        #print (" = ")
        #println(res)
        return res
    end
    
    seq = [ count(x -> x==i, P) for i = [1:9] ]
    
    return f(seq)
end

s = int(readline(STDIN))
for i = [1: s]
    n = int(readline(STDIN))
    q = split(readline(STDIN), " " )
    P = map( x -> int(x) , q)
    @printf "Case #%d: %d\n" i solve(P)
end

Coding this was easy except for googling how to do things. The logic I learned when learning python really helped.

Google Codejam 2015 Qualification Round : Part 2: A small/large

[Part 1]

My compiler

The other language I initially intended to use in codejam was an old compiler thing I coded in 2013. It was a school assignment that took me a couple of days to implement. It's just a C clone that doesn't even have arrays. But hey, at least it has integers. And can actually read. Although it can only read one integer, and write one integer. It doesn't have division either. But it compiles to 64 bits asm code which can be compiled to binary, so speed isn't a big issue.

Problem A - The one with shyness - Standing Ovation

I had to pick a good problem for this. But all seemed unsuitable. My first idea for problem A - Standing OvationWas to just bruteforce for the number of additional members. There's no reason to bring audience members who are shy, so their shyness should be 0. The maximum result is equal to Smax. For each candidate number of additional members, we can just simulate the process and check if everyone claps. So we can do this in an `O(S_(max) N)` loop quite easily.

For the large version, and also to solve using a language that has no arrays. We can use an on-line algorithm. We initially have `x` audience members with 0 shyness. We know those will be clapping. So we should worry about the rest. We encounter that there are `s_1` audience members with shyness 1. If `1 > x`, we need to invite one more audience member: `1 - x`. In fact, we can repeat this, now we know there are `max(1,x) + s_1` audience members that are clapping. We find that there are `s_2` members with shyness 2. And so and so.

The problem was that I needed to have a way to turn the neat input file into something my language can read. Then turn the language's output into something that looks like "Case #1: xxx". DISASTER, I actually learned that my asm code only does syscalls for reading bytes and then converts to integer and is not quite the same as a scanf or cin>>x... In fact, the whole thing really fails at reading files. Even ignored the format issues. I had to learn how to automatize having a terminal send keystrokes to my program for it to work. Apparently this is usually a thing that people need as there is a family of programs called expect that do this. Fortunately, after learning how to use expect I found that python has a pexpect module. This allowed me to create this nifty glue python program:

import pexpect
import sys

# call the solver program, return the result
def call_solver(smax, shy):
    try:
        child = pexpect.spawn ('./A')
        child.sendline('%d' % smax)
        child.expect( '.*' )
        for ch in shy:
            child.sendline ('%s' % ch)
            child.expect( '.*' )
        child.expect(pexpect.EOF)
        return int(child.before.split('\r\n')[-2])
    except:
        print "ERROR"
        return -1


T = int(raw_input() )
for i in range(1, T+1):
    s = raw_input()
    (smax, shy) = s.split(' ')
    smax = int(smax)
    print "Case #%d: %d" % (i, call_solver(smax, shy) )  
    sys.stdout.flush()

So now I am free to make my program using my language...

program Codejam2015A
{
    int n, a, s, i, cost;

    read n;
    read s;
    
    i = 1;
    cost = 0;
    //initial clappers (shyness = 0)
    while (i <= n) {
        read a;
        if (a >= 1) {
            if (i > s) {
                // must invite i - s
                cost = cost + (i - s);
                s = i;
            }
        }
        s += a;
        i = i + 1;
    }
    print cost;
    
    
}

Anyway, you can find the complete source code in the codejam score boards...

I submitted A-small, had a failure because it was slower than I thought, I tried to check if there was something wrong. It seems expect is slow in sending things.

I should have known that the reason it was slow would only get worse in A-Large, in fact, when I tried A.large, it turned out that it could barely solve one case per minute, even tho the program was, the glue stuff using expect was very slow, specially when there are 100000 line enters to send to the fake terminal... Failed A-large because of a time out.

Google Codejam 2015 Qualification Round : Part 1: D small in Emily

It's another google codejam qualification round and honestly, these rounds are getting old. We are supposed to spend a whole day solving problems, between the too easy Small inputs and the impossible Large inputs. And what for? Once you get enough points for the cut-off it's not worth it. Unless you planned to be connected at 19:00 and try to solve the problems as fast as possible, then your rank might actually mean something...

One time I had position 29 at a qualification round. Impressive! Except it's all fake. I didn't even qualify to the third round that year.

I actually might have some rants to make about the formats used in our competitions. Did you know that google found winning programming competitions is correlated with bad work performance? It's SCIENCE. (Supposedly). Although us fans of these contests might like to think there is some sort of mistake in this result of Google's data mining. We cannot just really say it's not true, without doing any actual analysis of the data used and how the conclusion was made.

Anyway, so I want to qualify to round 1, but I was bored of qualifications rounds. So I decided to make them interesting by doing something different to my routine of using c++ and trying to get a perfect score. Instead, I wanted to make special language choices. Use only languages that I designed myself or that I didn't actually learn before the day of the round.

Emily

Once I decided to use special languages. I knew that one of the languages to use today was Emily. It's a relatively new languages, so that's a bonus. Also because o> f luck I've been seeing many previews of its development and finding them interesting. Honestly tho I kind of made this decision without thinking a lot...

The day of the round I noticed I should start learning this language. I figured out a couple of issues. Emily doesn't seem to have a way to read input. Fortunately, it does have the ability to write things. Unfortunately, only strings that you cannot process or floats. For some reason, Emily in its 0.1 version only has floating point numbers. That was going to be a bit of a problem.

My initial plan was to only use Emily and the language I will mention next; and hope to advance using only them. *After* getting an advancing score, I could actually try to use other languages just for fun. When I opened the problems, this seemed difficult. All problems seemed to need arrays, or actual string manipulation or 2D arrays, which are worse to do in Emily. Eventually I noticed Problem D.

Problem D: Ominus Omino - (The one with n-ominos)

statement

When I read this problem, it seemed that the small input can be solved with a bunch of conditionals. And the results are just one of two strings. Conditionals are good news because at least they can be implemented in Emily. And we can print fixed strings. I would only need to use my text editor's regexes to turn input files into code that calls my Emily solution.

To solve this problem (The small version) you can just try all variations of R and C, and just try by yourself, is it possible to pick an n-omino that stops the other player from winning?

This is how my notebook looked after this:

The rest was just implementing everything on Emily. It was HARD. So many conditions. The interesting thing about Emily is everything is an unary function. And I mean EVERYTHING. This includes objects and numbers. Unfortunately, I couldn't think of a good way to think advantage of this in this problem.

\version 0.1

# Hi, I am using a language called Emily, it is in version 0.1 :/
# https://bitbucket.org/runhello/emily/wiki/Home


ALWAYSAWAY = "GABRIEL"
CANMAKELOSE  = "RICHARD"

min = ^a ^b (
    (a < b) ? a : b
)
max = ^a ^b (
    (a > b) ? a : b
)

case4 = ^R ^C (
    ( (max R C) == 4 && ( (min R C) >= 3 ) ) ? ALWAYSAWAY : CANMAKELOSE
)

case3 = ^R ^C (
    a = max R C
    b = min R C
    
    maxIs4 = ( (b == 3) ? ALWAYSAWAY : CANMAKELOSE )
    maxIs3 = ( ((b == 3) || (b == 2) ) ? ALWAYSAWAY : CANMAKELOSE )
    
    (a == 4) ? maxIs4 : (  (a == 3) ? maxIs3 : CANMAKELOSE )
    
)

even = ^X (
    ( (X == 0) || (X == 2) || (X == 4) )
)

case2 = ^R ^C (
    ( (even R) || (even C) ) ? ALWAYSAWAY : CANMAKELOSE
)

# woops, here was my mistake
case1 = ^R ^C (
    ALWAYSAWAY
)


mode = ^X(
   ( X==1? case1: X==2? case2: X==3? case3: X==4? case4: null ) 
]

solution = ^X ^R ^C (
    ( mode(X) ) R C
)


# Examples
println: solution 2 2 2
println: solution 2 1 3
println: solution 4 4 1
println: solution 3 2 3

I couldn't help but get carried away with the functional programming here. Case4 is a function that solves the case for tetraominoes, Case3 for tetraominoes, etc. mode is a function that picks which function to use according to X. I wanted this to just be an array, but apparently that's currently a bit difficult to do.

TCO 2015 Round 1A

I guess I advanced? Unless there is an implementation bug in my algorithms, which is quite possible. I know they are theoretically correct.

While solving the first two problems I kept having the feeling I was doing things the overcomplicated way.

250 points: The one with digits

You are given `1 <= L < R <= 100000`, find a pair `L <= a < b <= R` such that `a` and `b` have as many digits in common (ignore repeated digits).

So for a 250 in an "easy" round, this is not like the div2 easies. Can't just try all pairs `(a,b)`. In fact, the solution I used is quite complex (not hard, complex) and sounds like a division 2 hard thing?

The numbers will have at most 5 digits. So each number `L <= i <= R` will have at most `2^5 = 32` subsets of digits. The idea is to try them all and check if any number in the past had that subset of digits. Hey, it works.

// extract digits from x, call f(d) for each digit d we find.
void forEachDigit(int x, std::function<void(int)> f)
{
    while (x > 0) {
        f(x % 10);
        x /= 10;
    }
}

int maxsim(int L, int R)
{
    vector<bool> exists(1024, false);
    int best = 0;
    for (int i = L; i <= R; i++) {
        // create a bit mask of the digits in i:
        int mask = 0;
        forEachDigit(i, [&](int d) {
                mask |= (1 << d);
        });
        // for each subset of i's digits:
        for (int smask = mask; smask != 0; smask = (smask - 1) & mask) {
            // did we see this subset before?
            if (exists[smask]) {
                // if so, consider for maximum
                best = std::max(best, __builtin_popcount(smask) );
            }
            exists[smask] = true;  //whoops
        }
    }
    return best;
}

I did a wrong thing in my first submission, instead of marking exists[smask] = true for each subset I was marking exists[mask] = true for just the set...

500: The one with simulation

There is a graph in which all vertices have exactly one out-edge. You initially place tokens, at most one per vertex. In each step of the game, the tokens move to the vertex pointed by their out-edge. If at any time there are two tokens in the same vertex, you lose. Count the number of ways to place tokens without losing. The steps are repeated `K`<= 100000000` times.

So once two tokens end up in the same vertex, they will never separate again. So for example we can try simulating with a graph that has a token each. At the end of the simulation, some vertices will have multiple tokens. The trick is to notice that each of these tokens represent an initial vertex. If there are 4 tokens in a single vertex, it means there are 4 vertices that must all contain at most one token in total initially. So for each of the vertices at the end of the game, multiply (Number of tokens in vertex + 1).

The problem is that `K` can be pretty large. I used something similar to matrix exponentiation to simulate. But I think maybe it is possible to prove that you don't need to do too many simulations?

const int MOD = 1000000007;


// merge two simulation vectors
vector<long> merge(const vector<long> &A, const vector<long> &B)
{
    int N = A.size();
    vector<long> R(N);
    
    for (int i = 0; i < N; i++) {
        R[i] = 0;
        for (int j = 0; j < N; j++) {
            if ( (1LL << j) & B[i] ) {
                R[i] |= A[j];
            }
        }
    }
    
    return R;
}

int wayscnt(vector<int> a, int K)
{
    int N = a.size();
    // The "identity" vector each token in its initial vertex
    vector<long> I(N);
    for (int i = 0; i < N; i++) {
        I[i] = (1LL << i);
    }
    // B represents what happens after a single step:
    vector<long> B(N);
    for (int i = 0; i < N; i++) {
        B[i] = 0;
        for (int j = 0; j < N; j++) {
            if (a[j] - 1 == i) {
                B[i] |= (1LL << j);
            }
        }
    }
    // P[0] is what happens after one step
    // P[1] is what happens after two steps
    // P[2] is what happens after four steps
    //...
    // P[i] is what happens after 2^i steps 
    const int MAX = 30;
    vector<vector<long>> P(MAX, vector<long>(N) );
    P[0] = B;
    for (int i = 1; i < MAX; i++) {
        P[i] = merge(P[i-1], P[i-1]);
    }
    // with that in mind, we can find what happens after any number of steps
    vector<long> x = I;
    for (int j = 0; j < MAX; j++) {
        if ( (1<<j) & K ) {
            x = merge(x, P[j]);
        }
    }
    
    long res = 1;
    for (long y: x) {
        res = (res * (__builtin_popcountll(y) + 1 ) ) % MOD;
    }
    
    return (int)(res % MOD);
}

1000: The one with bipartite matching

Find the minimum total weight of edges to remove from a bipartite graph so that there is no perfect bipartite matching. o_O.

According to this: ItayGonshorovitzThesis.pdf, if we reduce the bipartite matching problem with flow and try to find the minimum cost to reduce the flow, that is NP-hard. Although with such a low constraint for `n`, maybe non-polynomial IS intended. I think that paper has interesting reductions for the problem so maybe it helps?

Saturday, March 29, 2014

SRM 614: Cursed weak test cases

Remember that this is the 4-th SRM in the python experiment. This time I felt really fast when solving the div1 250 pointer. When python helps, it really helps. Unfortunately...

Div1 250: Square, points, etc.

You are given up to 100 pair-wise distinct, lattice points with coordinates with absolute value up to 10^9. Find the minimum area square with lattice vertices and side paralel to the x and y axises that strictly contains at least `K` of the input points.

So I code my initial idea: One of the optimal squares will have a bottom-left corner equal to (x-1,y-1) for some point (x,y) in the input. Try all `O(n)` of such candidate corners. Once you fix a corner, you just need the side length. We can just binary search for the smallest `L` such that the square will strictly contain all those points.

FAST submission! I was first in room for a while and was happy.

Div1 525

This is a complex dp/implementation problem mix. Eventually decided to skip it. I had no idea how to work around the high constraint for `K`. And I am tired after staying late trying to do SRM 613's editorial. Plus I won't have to solve this problem anyway because I am not writing SRM 614.

Div1 1000

Oh boy, I am so happy I am not writing this editorial.

Back to Div1 250

So I decided it was best to start writing this blog post. As I was explaining the div1 250, I noticed: My logic is wrong! The test cases were awfully weak in this case. In reality, if the optimal corner is `(x-1,y-1)` , then `(x,y)` is not necessarily a point in the input. x and y could come from different points. The counter example is simple, points: `(0,3), (1,3), (2,3), (3,3), (3,2), (3,1), (3,0)` and `K = 6`. The best square has `x = -1` and `y = -1`.

By the time I noticed this, it was incredibly late. The resubmit was very costly, I hope to be able to compensate by finding similar mistakes in the challenge phase.

The python code is lovely though:

def binarySearch(f, lo, hi):
    # f(hi) and not f(lo)
    while lo + 1 < hi:
        ha = (lo + hi) / 2
        if f(ha):
            hi = ha
        else:
            lo = ha
    return hi

class MinimumSquare:
 def minArea(self, X, Y, K):
    points = zip(X,Y)
    
    res = 10**20
    # oops, initial submission had: for (x,y) in points instead of 
    for x in X:
        for y in Y:
            # find smallest square with bottom-left corner (x-1, y-1)
            def canIt(L):
                return sum( (x-1 < xi < x-1+L) and (y-1 < yi < y-1+L) for (xi,yi) in points) >= K
            
            if canIt(res) :
                res = min(res, binarySearch( canIt, 0, res) ) 

    return res ** 2

Take a look to the beautiful binary search, thanks to first class functions I can do binary search on a higher level, rather than mixing it with other code.

Post will be created automatically once the challenge phase ends.

Wednesday, January 15, 2014

The "empty chairs" bipartite matching algorithm

Today while browsing the TC forums I had a blast from the past. There was a time in which I didn't know what max flow or bipartite matching is, so I had to learn it. The process was difficult and long. While trying to understand max flow though, I did get some knowledge about solving bipartite matching without max flow. Or rather by a very simplified recursive logic that does max flow.

The good thing about the recursive logic is that it has a real life metaphor that makes it easy to remember and it is also very easy to implement. I am reposting this from an old topcoder forums post I wrote a long time ago. I thought it is good to have it in a more visible place:

Well, I have a logic for bipartite matching that makes an easy to remember and implement algorithm, not to mention it is fast:

This is as best as I could explain it:

int personInChair[m]; //personInChair[m] Points to the person that's currently on  chair i
bool foundChair[n]; //foundChair[i] is 1 if and only if the guy has already sit on a chair.
 
 
bool canSit[n][m]; //canSit[i][j] is true if and only if person i can sit on chair j.
 
bool alreadyTried[n]; //alreadyTried[i] is true if person i has already tried to find a new chair
 
bool findChair(int x)
// Person x will try and find a new chair...
{
    if (alreadyTried[x]) //alreadyTried just serves the purpose of
         return false;   //preventing infinite recursion and things like that
    alreadyTried[x]= true;
 
   for (int j=0; j<m; j++)
       if(canSit[x][j])
       {
           int otherGuy = personInChair[j];
           if ( otherGuy == -1)
           {
               //the sit is empty, just sit on it!
               personInChair[j] = x;
               foundChair[x] = true;
               return true;
           }
           else
           {
               //kindly ask the other guy if he can move to another chair:
               if ( findChair(otherGuy) )
               {
                   //he was able to move.
                   personInChair[j] = x;
                   foundChair[x]=true;
                   return true;
               }
               
           }
       }
    return false; //No chairs available right now.
}
 
int maximumChairs() //this will return the maximum number of people
                    // that sit on a chair.
{
    int sum=0;
    fill(personInChair,personInChair+m, -1); //nobody has sit yet.
    fill(foundChair,foundChair+n, false);    //
    while(true)
    {
        bool somethingChanged = false;
        fill(alreadyTried,alreadyTried+n, false);
        for (int i=0; i<n; i++) //Everybody should try to find a chair.
           if(! foundChair[i] )
           {
               if(findChair[i])
               {
                   somethingChanged=true;
                   sum++;
               }
           }
        
        if(!somethingChanged)
            break;
        
    }
 
    return sum;
}

I think that when I came up with that algorithm it was based on code I've seen from red coders. The chair metaphor makes it easy to remember. I wonder if this algorithm has a name / original author or if it is something that merely materialized in programming competitions out of need.

Anyway, it is not that fast. It is `O(n^3)` :)

Saturday, March 30, 2013

SRM 574 editorial supposedly ready

The editorial for SRM 574 is in a state that can be published. https://apps.topcoder.com/wiki/display/tc/SRM+574.

I would love to say that the huge delay was because I took too long to solve div1 hard quickly (as it usually happens). But this time I had bigger issues, while I actually sort of understood all the problems in the match rather quickly (thanks to help from admins in div1 hard). My on-going recovery from health issues made me take much longer than I planned/expected. Sorry for that, but I am feeling better now.

I think the div1 1050 explanation is terrible. Maybe I will improve it.

It was a fun problem set.

Friday, September 07, 2012

SRM 555: 5555555555555555555555

Yeah, I posted it in the TCO blog: http://community.topcoder.com/tco12/srm-555/. Let me talk of what was going on to me:

Waking up at 6:30

I had it all planned up, and set things up to wake up at 6:30, so I could register (Really did not expect it to get any close to the registrants limit).

Unfortunately, I spent just about the whole period from 05:00 to 06:00 waking up to the same lame dream: that I missed the SRM. Only to find out that it was 5:00 and was way early. Arrg, I think it happened at least three times.

Div1 255

Meh, it seems that my brain was not awake when I first tried to solve this problem. At first I thought it was about splitting in multiples of 5 instead of powers of 5. My first code even had funny tricks to avoid considering a single "0" as a number that begins with leading zeros... The whole plan was to simply get the results modulo 5 when translating bits to integer. It was so clever...

But it failed examples. So, it needed powers of five... I had to fix stuff, but had some lame bugs and it delayed me. At the end , 215 points. Is way too slow. My only hope was the div1 555...

Div1 555

I actually saw the correct solution (although overcomplicated by dp for the sub-problem instead of even more binomials.) instantly. Of course, after coding everything here were bugs and bugs. I reached a state in which I was passing all examples but the second-last and third-last. Which was very puzzling, because the example that was seemingly the most complicated (last one) was correct. Surely, with 32 minutes left, I would eventually find the mistake, right? ... right?

I tried many things. I noticed the main difference between the examples I was passing and those I was not was that they were the only cases in which Rcount != Ccount, so I was trying to find any bugs related to it. Then I tried other things, and many others. But I effectively had 31 minutes with the same wrong code and no idea what to do. I even doubted the approach, but it really seemed right.

During the challenge phase, I checked out the other solutions and they were using the same logic. After some time off I opened it in the practice room, and could finally see the mistake right away. It is such a stupid mistake that it makes the drop to 18XX rating even more sad. You can see the code I had during the contest. The bug is with subProblem(er/2, Rcount) (and its column analogous), it should be: subProblem(er/2, H), of course, because the first expression makes no sense. I think I quickly replaced a different bug with Rcount and Ccount without putting a lot of thought and then completely missed it. I did not notice that it was also a main trait of the test cases I was failing, that W!=Rcount and H!=Ccount.

Anyway, this whole thing of SRMs made with mixed problem setters that turn out to have very easy div1 250 and div1 500 that everyone solves except for me that take too long or have bugs is really getting old. It is a perfect rating killing storm.

Codes!

div 2 easy, is really easy:

int theMax(vector <string> board) 
{
int h = board.size(), w = board[0].size();
int res = 0;
// pick a row
for (int i=0; i<h; i++) {
// pick a column
for (int j=0; j<w; j++) {
int ones = 0;
// count the ones
for (int a=0; a<h; a++) {
for (int b=0; b<w; b++) {
// the xor operation ^, is great to simlate toggling...
int o = ( (board[a][b]=='1') ^ (a==i) ^ (b==j) );
ones += o;
}
}
res = std::max(res, ones);
}
}
return res;
}

div2 medium / div1 easy: The lack of a c++ library function to convert a integer to binary is really lame.

int getmin(string s) 
{
int n = s.size();
int dp[n+1];
dp[0] = 0;
int INF = 100;

// make the powers of 5:
long long p = 1;
string pow5[100];
int q = 1;
pow5[0] = bit(p); // the bit(x) function returns x in binary.
while (p <= (1LL<<62) / 5) {
p *= 5;
pow5[q] = bit(p);
q++;
}

// dp part:
for (int t=1; t<=n; t++) {
dp[t] = INF;
for (int i=0; i<q; i++) {
// does it end with the i-th power of 5?
int len = pow5[i].size();
if ( (t - len >= 0) && ( s.substr(t-len, len) == pow5[i] ) ) {
//Yes. Yes, it does.
dp[t] = std::min(dp[t], dp[t - len] + 1);
}
}
}
return ( (dp[n] >= INF) ? (-1) : dp[n] );
}

div2 hard: This guy was completely not standard and not easy (at least not for me.) I think I needed a lot of thought to reach the good method to do a dp program. Like I said in the TCO blog, it is all about what starting point you use for the analysis. I was trying bottom-up dp instead of top-down and it was a true nightmare. (For me, this is the hardest div2 hard in a while).

const int MOD = 555555555; 
typedef long long int64;
#define long int64
long dp[555][554][2][2];
struct MuddyRoad2
{
int theCount(int N, int muddyCount)
{
memset(dp, 0, sizeof(dp) );
dp[0][0][1][1] = 1;
dp[0][0][0][0] = 1;

//dp[i][m][p1][p2] returns the total number of ways
// to set muddy segments if:
// * We have already assigned segments higher than i
// * p1 is the parity of the number of paths for i+1
// * p2 is the parity of the number of paths for i+2
// * There are m muddy roads left to mark.
//
for (int i=1; i<=N-1; i++) {
for (int p1 = 0; p1 < 2; p1++) {
for (int p2 = 0; p2 < 2; p2++) {
for (int m=0; m<=muddyCount; m++) {
long & res = dp[i][m][p1][p2];
// do nothing
res = dp[i-1][m][ (p1 ^ p2) ][p1];
// place a 0
if ( (m > 0) && (i < N-1) ) {
res += dp[i-1][m-1][0][p1];
res %= MOD;
}
}
}
}

}
// this way p1^p2 = 1 for i=N-1 :)
return dp[N-1][muddyCount][0][1];
}
};
#undef long

Div1 medium: This dreaded problem was easy but that's not a good thing when you do it wrong for a lame reason and everyone else solves it.

 
long C[3200][3200];
int count(int H, int W, int Rcount, int Ccount, int S)
{
long res = 0;
memset(C, 0, sizeof(C));
// Pascal's triangle is a combinatorics problem coder's best friend
for (int i=0; i<=3199; i++) {
C[i][0] = 1;
for (int j=1; j<=i; j++) {
C[i][j] = (C[i-1][j-1] + C[i-1][j]) % MOD;
}
}
for (int r = (Rcount % 2); r <= Rcount && r <= H; r += 2) {
for (int c = (Ccount % 2); c <= Ccount && c <= W; c += 2) {
int tem = (H - r) * c + (W - c) * r;
if (tem == S) {
cout << r << ", "<<c<<endl;
int er = (Rcount - r)/2; //extra
int ec = (Ccount - c)/2;
long prod = 1;
prod = (prod * C[H][r]) % MOD;
prod = (prod * C[W][c]) % MOD;
prod = (prod * C[H-1 + er][H-1]) % MOD;
prod = (prod * C[W-1 + ec][W-1]) % MOD;
res += prod;
}
}
}
return (int)(res % MOD);
}