Tuesday, January 21, 2014

SRM 605

A slow day. It was an 8:00 AM match and I may or may not have stayed up till 3:00 AM the day after writing a very bad editorial. So I wasn't at 100% speed.

250: The one with hamburgers

You are given taste[] and type[], each hamburger is of type type[i] and has a taste value taste[i]. The joy to eat is equal to the total sum of the hamburgers chosen multiplied by the number of distinct types eaten. What is the maximum joy? (Note multiple hamburgers may be of the same type and that taste[i] can be negative). We can choose to eat zero hamburgers, so result should never be negative.

Let's say you fix the number of types `y`, then we should find the `y` types with the maximum sums of tastes and sum those tastes together. Picking the best taste sum of each type is a separate problem of its own. Normally, you should pick all non-negative tastes . The only reason to pick a negative taste is when there are only negative tastes of a given type. In that case, you should only pick one negative number, the largest one (closest to 0). Sometimes, even though a type has negative sum, it is a good idea to try it because it increases the number of types.

The rest is to solve the implementation of finding the best possible sum of tastes for each type and then the best sum of `y` types. I used the STL which makes everything cute:

int getNumber(vector<int> type, vector<int> taste)
{
    // first of all, save all tastes for each type.
    map<int, vector<int> > typeTastes;
    for (int i = 0; i < type.size(); i++) {
        typeTastes[ type[i] ].push_back(taste[i]);
    }
    vector<int> tasteResult;
    // for each type:
    for (auto &q: typeTastes) {
        auto &p = q.second;
        // sort tastes in non-increasing order:
        sort(p.rbegin(), p.rend());
        // Find the best sum, try to get all the non-negative numbers
        int s = 0;
        int i = 0;
        while ( (i < p.size()) && (p[i] >= 0) ) {
            s += p[i];
            i++;
        }
        // sometimes all numbers are negative... 
        if (i == 0) { 
            s += p[0]; //pick the largest
        }
        //add this sum to the list of types:
        tasteResult.push_back(s);
    }
    // now try the best number of types:
    sort(tasteResult.rbegin(), tasteResult.rend());
    int res = 0, sum = 0;
    for (int i = 0; i < tasteResult.size(); i++) {
        sum += tasteResult[i];
        res = std::max(res,  (i + 1) * sum );
    }
    return res;
}

450: The one with sets

You have a set of all natural numbers from 1 to `2N`, inclusive. Partition the set in two sets of the same size such that, after sorting the sets, the absolute value of the difference of their i-th elements is at least `K`. Given `N <= 50` and `K <= 10`, count the total number of valid ways to pick these partitions.

As usual, I wasted most of my time coding a dynamic programming that was actually wrong. I didn't at first notice that the sets should be sorted ...

Anyway, the trick is that `K <= 10`. Assume you decide the contents of the sets going from highest integer to the lowest. You begin deciding the position of `n = 2N`, then `n = 2N-1` and so and so. You can represent the integers added to each set using two things: a) The number of integers whose difference with the current number is at least one in each of the two sets and b) The bitmasks representing the set of the other integers that went to each of the two sets. After some inspection, you only need one bitmask and one count, the other set's can be found using `N` and `n`.

Anyway, let's say you want to add `n` to set A. From the bit masks and counts there are four things we should know:

  • `bA` : The number of bad numbers in A, numbers smaller than `n + K`.
  • `bB` : The number of bad numbers in B, numbers smaller than `n + K`.
  • `gA` : The number of good numbers in A, numbers that are at least `n + K`.
  • `gB` : take a guess

Since `A` and `B` are sorted, you can assume that the last `bA + gA` spots in `A` are taken and the last `bB + gB` spots in `B` are taken.

If we want to add `n` to `A` one of the following must be true:

  • The largest position in `A` will be matched to a good number in B
  • The largest position in `A` will be matched to nothing in `B`.

I am pretty sure this logic is correct, but when I learned it I had only 8 minutes to implement it and it is quite complicated to.

Wednesday, January 15, 2014

The "empty chairs" bipartite matching algorithm

Today while browsing the TC forums I had a blast from the past. There was a time in which I didn't know what max flow or bipartite matching is, so I had to learn it. The process was difficult and long. While trying to understand max flow though, I did get some knowledge about solving bipartite matching without max flow. Or rather by a very simplified recursive logic that does max flow.

The good thing about the recursive logic is that it has a real life metaphor that makes it easy to remember and it is also very easy to implement. I am reposting this from an old topcoder forums post I wrote a long time ago. I thought it is good to have it in a more visible place:

Well, I have a logic for bipartite matching that makes an easy to remember and implement algorithm, not to mention it is fast:

This is as best as I could explain it:

int personInChair[m]; //personInChair[m] Points to the person that's currently on  chair i
bool foundChair[n]; //foundChair[i] is 1 if and only if the guy has already sit on a chair.
 
 
bool canSit[n][m]; //canSit[i][j] is true if and only if person i can sit on chair j.
 
bool alreadyTried[n]; //alreadyTried[i] is true if person i has already tried to find a new chair
 
bool findChair(int x)
// Person x will try and find a new chair...
{
    if (alreadyTried[x]) //alreadyTried just serves the purpose of
         return false;   //preventing infinite recursion and things like that
    alreadyTried[x]= true;
 
   for (int j=0; j<m; j++)
       if(canSit[x][j])
       {
           int otherGuy = personInChair[j];
           if ( otherGuy == -1)
           {
               //the sit is empty, just sit on it!
               personInChair[j] = x;
               foundChair[x] = true;
               return true;
           }
           else
           {
               //kindly ask the other guy if he can move to another chair:
               if ( findChair(otherGuy) )
               {
                   //he was able to move.
                   personInChair[j] = x;
                   foundChair[x]=true;
                   return true;
               }
               
           }
       }
    return false; //No chairs available right now.
}
 
int maximumChairs() //this will return the maximum number of people
                    // that sit on a chair.
{
    int sum=0;
    fill(personInChair,personInChair+m, -1); //nobody has sit yet.
    fill(foundChair,foundChair+n, false);    //
    while(true)
    {
        bool somethingChanged = false;
        fill(alreadyTried,alreadyTried+n, false);
        for (int i=0; i<n; i++) //Everybody should try to find a chair.
           if(! foundChair[i] )
           {
               if(findChair[i])
               {
                   somethingChanged=true;
                   sum++;
               }
           }
        
        if(!somethingChanged)
            break;
        
    }
 
    return sum;
}

I think that when I came up with that algorithm it was based on code I've seen from red coders. The chair metaphor makes it easy to remember. I wonder if this algorithm has a name / original author or if it is something that merely materialized in programming competitions out of need.

Anyway, it is not that fast. It is `O(n^3)` :)

Saturday, January 11, 2014

SRM 604: More slow

Hmnn, another SRM in which speed in 250 can break or make it for a yellow coder. And I am too slow lately... I couldn't solve 550

Div1 250: The one with powers of 3

You start at `(0,0)`, in each step `(k >= 0)`, you can add/subtract `3^k` from exactly 1 of the coordinates. Is it possible to reach `(x,y)` in any number of steps? (possibly 0?)

First of all, `(0,0)` can be reached in 0 steps, disregard this special case. Otherwise, we assume that we make at least 1 step. How about we undo the first step? The one in which we only moved by 1 unit? Try the 4 possible directions for this first step and undo them. This modifies the problem to: Can we reach `(x,y)` using steps of length `3^k`, but this time `(k > 0)` ? Now since all steps changed a coordinate by a power of 3 different to 1, then both `x` and `y` must be multiples of 3. If that isn't the case, this is invalid. Else since both `x` and `y` are multiples of 3, we can actually just divide them to by 3. This converts all moves `3^k` to `3^(k-1)` which means that the problem is now exactly as it was initially, using moves `3^0, 3^1, ....`, so we can solve it recursively.

We stop recursion whenever we reach `(0,0)` or when multiples of 3 are needed and some of `x` or `y` is not a multiple of 3. This recursion *seems* to work in time.

I tested with what I think gives the maximum number of branching: {-348678440, 116226147} and it works just fine. I think it is unlikely it will time out. But I am unsure. High coders used a completely different approach.

bool able1(int x, int y)
{
    if (x == 0 && y == 0) {
        return true;
    }
    // can we if the first step is 3, second is 9, and so and so?...
    // (all must be a power of 3) greater than 1.
    if ( (abs(x) % 3 != 0) || (abs(y) % 3 != 0) ) {
        return false;
    }
    return able0(x / 3, y / 3);
}
bool able0(int x, int y)
{
    bool good = ( (x==0) && (y==0) );
    for (int i = 0; i < 4; i++) {
        good |= able1(x + dx[i], y+dy[i]);
    }
    return good;
}

string ableToGet(int x, int y)
{
    return able0(x,y) ? "Possible" : "Impossible";
}

Div1 medium, the one with a tree

I initially thought of a dynamic programming solution, and began to code it but took too long. 5 minutes before the end of the match, I noticed some issues with this idea. It was too slow and I wasn't sure how to handle a special case.

I now think of a slightly different dp, but my thoughts are not complete.

So?

A bad day, I just hope the 250 passes system tests.

Tuesday, January 07, 2014

Greed 2.0 release candidate

When this was released I was super busy with the SRM 602 editorial to share it. But now I am not so busy (although I am making SRM 603's editorial as we speak). Let me take a break and talk about the final feature list for 2.0. No more features will be added to 2.0, the rc is only to detect bugs.

Flexible Templates

Greed 1.6 was cool and I liked it, but 2.0 is far more powerful because of what it did to the templates. In 1.6 you could only modify two templates : source (source code file), test (the tester put inside the source code file) and maybe unit tests for java and csharp. This made Greed just slightly more powerful that the other plugins. The new Greed though, can be programmed to create any custom file/sections of files for you using any template you add. Each template is a powerful script that makes it easy to build anything using information from the problem and contest. For example, Greed by default will make some templates:

  • source: (same as before)
  • filetest: New version of the tester, saves the test cases in an external file, so you can edit this external file to add new test cases.
  • testcase: The text file that contains all the test cases .
  • problem-desc: The problem statement in cool HTML format.

So by default Greed will already do pretty much anything you'd expect from a topcoder plugin. Will generate a source code file for you, with automated testing of the example cases and an ability to add new test cases. It will also generate and store in disk a version of the problem statement.

However, what's new is that you can modify / change all of this. You can add new files to test. You can replace the files that are created or the templates used for them. For example, by replacing the filetest template with the test template, it will use another kind of tester that doesn't use a text file to store the test cases and instead has them stored as code.... Or, you could use the [dualcolor templates]. Very advanced testers for c++ and python that I developed and do tons of amazing things...

Create makefiles, project files for IDEs, separate source code and tester, store the problem's memory limit in some file so you can tell your testing environment to enforce it. it is all up to your needs and creativity.

Template options

With the advent of custom templates, the need to configure them became very important. There is a difference between the plugin and the templates. So hardcoding configuration options for the templates in the plugin wouldn't work too well. Instead, we added a way to allow templates to declare configuration options and the configuration file can write them.

Basically , the template definition in config file has an .options {} section, each option is a String field that is read by the template, this enables the following features.

Multi-process tester in c++

All the 3 c++ testers available have the option to work in multiprocess. Which means that they will call a different instance of the program's process to test each example case. This makes testing more accurate if your program uses global variables. It also allows the testing to continue even after a runtime error in an early example. The runtime error is also reported.

Problem statement themes

The problem statement template is very useful and can be customized through template options. The most important customization is the color theme:

It is also possible to customize the CSS either by adding an extra file or by replacing the style completely. Of course, since the problem statement is a template, you can also change everything about it.

I personally use the Stylish firefox plugin

Hooks

You can now set a program to run automatically whenever a template is created. I currently have a trigger template that is created automatically whenever I open a problem. The trigger causes firefox to open the problem statement and jEdit to open the source code ^^

Aids to make folder organization better

I added something that allows the code to be placed in better-organized folders. I currently programmed my Greed to put all TCO problems in a folder, and split the SRMs in groups of 25. It is very useful and much better to me than having all problems in the same folder, or a single folder for each SRM.

Aware of time and memory limits

Greed is the only plugin I know of that actually knows of the per-problem memory and time limits. It uses them for the problem statement, but your custom templates can also make use of them. My tester uses the time limit, for example.

Modulo detection

This is so useful: http://vexorian.blogspot.com/2013/11/bragging-about-incoming-greed-features.html

How to get it

Greed 2.0 release candidate: in release page. It has a couple of minor HTML bugs ashawat just found, if you don't like those bugs, you can fetch the source code from git and compile. Speaking of which, I have a version of Greed in my github fork that sets up my prefered settings by default (The folder organization, the dark gray HTML template and my testers): https://github.com/vexorian/topcoder-greed

Monday, January 06, 2014

SRM 603: uncertainty

What a match to be tainted by uncertainty. I have no idea if my 250 is correct and for a while I had no idea if my 500 was correct. Starting to write before the challenge phase and now I know that my 500 is at least correct in theory. I could have made an implementation mistake.

Div1 250

Two players have a tree. They play alternating turns. In a turn, a player must pick an edge and destroy the edge. This splits the tree into two components (also trees). The player then chooses a component to keep, the rest is discarded. The game ends when only one node remains. The nodes have some costs written on them. The first player wants to maximize this value, the other player wants to minimize it. Return the cost of the node if both players play optimally.

I had absolutely no idea what to do. I tried everything and couldn't think of a non-clever way to solve it. Eventually, I thought that since it is a div1 250, it shouldn't be so hard. It probably has a very easy solution that I am missing. Maybe a super easy solution that is, however, difficult to prove... I checked the example cases and noticed that they were all very simple... maek you think... Then I noticed that in all those cases, the returned value was the maximum cost node with degree equal to 1. After drawing in paper it sort of makes sense. I was late so I made a submission. Right now it seems that just about everybody is doing this. So maybe it is correct?

int findend(vector<int> edges, vector<int> costs)
{
    int n = edges.size() + 1;
    vector<int> degree(n, 0);
    for (int i = 0; i < edges.size(); i++) {
        // edge between i+1 and edges[i]
        degree[ edges[i] ]++;
        degree[i+1]++;
    }
    int best = 0;
    for (int i = 0; i < n; i++) {
        if (degree[i] == 1) {
            best = std::max(best, costs[i] );
        }
    }
    return best;
}

Div1 500

Given `n` count the number of pairs of two strings such that the second string is a rotation of the first string and the strings use only the first `k` letters of the English alphabet.

The number of rotations depends on the number of repetitions in the first string. If the string contains 2 repetitions (eg: abab), then the number of rotations is n/2. For each number of repetitions `r`, calculate the number of ways to have exactly that number of repetitions and multiply by `n/r`.

So in order to have `r` repetitions, `n` must be a multiple of `r`. So we are interested in all divisors `d` of `n`. With `n <= 1000000000`, we can find all the divisors in `O(sqrt(n))`.

The difficult part is to count the number of strings with an exact number of repetitions. Note that "aaaa" contains "a" repeated 4 times, "aa" repeated two times and "aaaa" "repeated" 1 time. We should only consider it as 4 repetitions. So if we calculate the number of strings with 2 repetitions as `K ^ (n / r)`, we have to subtract the strings with smaller repetitions that divide `r`... I couldn't think of a way to do this fast. I could only do it in `O(d^2)` where `d` is the number of divisors of `n`. At first I had no idea what an upper bound for the number of divisors is. We know we need an `O(sqrt(n))` algorithm to find them. But what about the actual number of divisors?

In my first excursions, I tried many numbers composed of the first few primes raised to some exponents. My best was 729.

Eventually though, I coded a bruteforce. It appears 1344 is the maximum number of divisors for `n <= 1000000000`. Let's see...

const int MOD = 1000000007;
vector<int> getDivisors(int n)
{
    vector<int> div;
    for (int i = 1; i*i <= n; i++) {
        if (n % i == 0) {
            div.push_back(i);
            if (n / i != i) {
                div.push_back(n / i);
            }
        }
    }
    sort(div.begin(), div.end());
    return div;
}
long modPow(long x, int y)
{
    long r = 1;
    while (y > 0) {
        if (y & 1) {
            r = (r * x) % MOD;
        }
        x = (x * x) % MOD;
        y /= 2;
    }
    return r;
}
long best = 1;
vector<int> primes = {2,3,5,7,11,13,17,19,23};

void backtrack(int p, long x, long divs)
{
    if (divs > best) {
        best = divs;
        cout << x <<" has "<<divs<<endl;
    }
    long y = x;
    int i = 0;
    if (p == primes.size()) {
        return;
    }
    while (y * primes[p] <= 1000000000LL) {
        y *= primes[p];
        i++;
    }
    //cout << "}" << endl;
    for (int j = i; j >= 0; j--) {
        backtrack(p+1, y, divs * (j+1) );
        y /= primes[p];
    }

}

int getNumber(int n, int k)
{
    // The bruteforce to find the maximum number of divisors
    //backtrack(0, 1, 1);
    
    vector<int> d = getDivisors(n);
    long total[d.size()];
    long res = 0;
    cout << d.size() << " divisors found"<<endl;
    for (int i = 0; i < d.size(); i++) {
        long x = modPow(k , d[i]);
        for (int j = 0; j < i; j++) {
            if (d[i] % d[j] == 0) {
                x = (x - total[j] + MOD) % MOD;
            }
        }
        total[i] = x;
        //cout << "for "<<d[i]<<" = "<<total[i]<<endl;
        res += (total[i] * d[i]) % MOD;
    }
    return (int)( res % MOD );
}

So?

The challenge phase is about to end and my solutions are standing. Let's see what happens

I don't really like 250s that have solutions that are easy but difficult to prove. How many people are getting points like me, coding it on a hunch? Well... maybe the solution is wrong,but Petr agrees, so... I gues it is right.

Thursday, January 02, 2014

Per problem memory and Time limits in Greed

Greed (at least the git version) is now the first plugin to support the per problem memory and time limits that were recently added to topcoder.

I had to cry for help from the arena maintainers to get this to work. I discovered many things I didn't know about the arena API, for example, the Memory limit thing was in it since years ago (huh?). Anyway, I finally got it to work. That was quite a pull request...

Why did I care so much to get this to work? Well, if some TC problems are going to have a custom memory and time limit, you need to know. Currently I am reading problem statements generated by greed instead of those by the arena (it is much better!), so I need them to contain this information. There is also something else though...

My test system is basically just a runcpp.sh script that sets things up correctly and puts the default Memory Limit (256) and runs things. Then it calls the Greed-generated source code file, which has a time limit feature. So my current TopCoder environment makes quite some use of these limits. If a problem gets new limits, they would break without using this information.

Making the environment take advantage of this feature was another issue. For time limit, it was relatively easy, my tester templates already had a time limit constant. I just modified them to use a time factor instead of set limit. My templates are now an official part of greed and you can find more info at: https://github.com/shivawu/topcoder-greed/tree/master/src/main/resources/templates/dualcolor

The memory limit was an adventure of its own. Memory limit was enforced by the runcpp.sh (and similar) scripts I used. But now there was a custom ML per problem. I first needed to store that information somewhere. I am not sure if this was the simplest way possible, but what I did was add yet-another template to greed. The memory-limit template:

    memory-limit {
        overwrite = force
        outputFileName = "limit/${Problem.Name}.txt"
        templateFile = "limittemplate.txt"
    }

(You also need to add memory-limit to the templates line). This would make greed generate a file ProblemName.txt at a limit sub-folder of the contest folder... The template file is just this:

${Problem.MemoryLimitMB}

So this txt file just contains the memory limit in MB.

Next step was to make the runcpp.sh script able to read this txt file (if it exists). The same with the python and Java ones. Posting the Java one for reference as it is the easiest to understand (The other ones need to do far more complicated things like calling ulimits and multiplying the MB limit by 1024...):

#!/bin/bash
#default
LIMIT_MB=256

#extract folder name from $1:
DIR=$(dirname $1)
#extract file name (problem name) from $1:
PROBLEM=$(basename $1)
PROBLEM=${PROBLEM%%.*}
#build the limit file name
LIMIT=$DIR/limit/$PROBLEM.txt
if [ -f $LIMIT ]; then
#echo "READ MEMORY LIMIT"
LIMIT_MB=`cat $LIMIT`
#else
#echo "NO MEMORY LIMIT, USE DEFAULT"
fi
#echo "MB = " $LIMIT_MB

echo "compiling..."
javac $1.java
if [ $? = 0 ];
then
java -Xmx${LIMIT_MB}m $1
fi

The trick is to modify the java command line, to use -Xmx${LIMIT_MB}m, where LIMIT_MB contains the read memory limit value.

So?...

I just hope that problems with custom memory limits are not too rare to justify all this work.

I am liking greed more and more every day. Nowadays I consider Greed not to be a TopCoder arena plugin, but an arena plugin workshop. The workshop just happens to come with an example simple arena plugin by default :)

The arena plugin survey

Calling all Greed users (and any other plugin users too). It is best to get surveyed in the plugin survey at : http://apps.topcoder.com/forums/?module=Thread&threadID=807988&start=0 I already made a post for Greed, so if you use it, plus that post.