Saturday, November 30, 2013

Topcoder SRM 598 Editorial part 1

http://apps.topcoder.com/wiki/display/tc/SRM+598

What an odd match. People keep saying the div1 medium was too easy. I just can't see it. From what I am looking at it is going to be very hard to explain. Whole match was incompatible with my editorial method. Problems are too hard to explain when they tend to be "obvious".

Now that I understand div1 hard better, I think that the "dirty tricks" solution I used during the match can be shown to have a very low probability to fail. From the analysis it sorts of follows that the number of beacons tends to be high. Except in cases were it should be easy to find the solution through the brute force anyway.

Thursday, November 28, 2013

SRM 598: Dirty tricks

Another week, another Topcoder SRM.

Div1 250: The one with bins.

We have up to 50 items of different weights `100 <= "item"[i] <= 300`, we have to put these items in bins that have a maximum capacity of 300. Which means that the total weight of the items in a bin cannot exceed 300. Return the minimum number of bins we can use to store all items.

It is unusual for an input to have a lower bound. The items' weight is at least 100. This means that a bin can hold at most 3 items. More so, if a bin ever holds 3 items, all 3 items must have weight 100. So we can just treat weight 100 items separately.

There is a case: `{100, 100, 100, 100, 100, 100, 200, 200, 200}` that shows that the optimal strategy is not always to put the 100-weight items in as many bins with 3 items as possible. However, the number of bins holding 3 items is finite and at most 50/3, we can just try all possible numbers of 3-item bins that we will use. 0, 1, 2, ... `h/100` (where `h` is the number of items of weight 100). Once you do this, all of the remaining bins will contain at most 2 items. We should maximize the number of pairs of items with total capacity <= 300.

int optimal2(vector<int> item)
{
    // solve if a bin can only hold at most 2 items.
    int cost = item.size();
    sort(item.begin(), item.end());

    // Each bin can now have at most 2 items
    int i = 0, j  = item.size() - 1;

    // Maximize the number of pairs of items that can enter a single bin: 
    while (i < j) {
        // simply match the smallest item with the largest item possible:
        while ( (j > i) && (item[j] + item[i] > 300) ) {
            j--;
        }
        if ( (j > i) && (item[i] + item[j] <= 300) ) {
            cost--;
            i++;
            j--;
        } else {
            break;
        }
    }
    return cost;

}

int minBins(vector<int> item)
{
    vector<int> not100;
    int count100 = 0;
    for (int x: item) {
        if (x != 100) {
            not100.push_back(x);
        } else {
            count100++;
        }
    }
    // It is possible for a bin to hold 3 items, but they must all be 100.
    int res = item.size();
    for (int a = 0; a <= count100 / 3; a++) { // number of bins that hold 3 items
        // the rest of the bins can hold at most 2 items:
        vector<int> tem = not100;
        for (int i = 0; i < count100 - a * 3; i++) {
            tem.push_back(100);
        }
        int cost = a + optimal2(tem);
        res = std::min(res, cost);
    }
    
    return res;
}

I made plenty of mistakes while coming up to this solution. Coded a wrong solution that failed test cases (the test case I mentioned above). Coded another wrong solution that passed all test cases, but I was very suspicious about it and could notice its mistake. I had the feeling this problem was going to have many wrong solutions and I was right. Unfortunately, during the challenge phase I remembered I am unable to read other people's codes. Many of the codes seemed overly complicated and possibly wrong, but I had no idea how to find a good case for them.

Div1 550

It seemed complicated, I eventually decided to test my luck and skip the div1 550 to open the hard problem.

Div1 950: The one with cities

You have a country/set of n (at most 50) cities. The cities have connections and these connections makes the graph of cities a tree (how suspiciously convenient). The plan is to put beacons in some of the cities so that, when evaluating the list of distances from a city to each of the beacons, the distances make a sequence that is a unique identifier for the city.

A basic `O(2^n)` solution would be to just try each subset of cities and place beacons on it and pick the smallest subset that allows the unique identification to work. It occurred to me that with backtracking and some cleverness, maybe we can optimize this solution to run fast enough and give a correct answer at least most of the time. I expected it to be quite hard to make system tests for this problem.

So I made some backtracking algorithm. All cities begin with beacons on them and we decide from city 0 to n-1 whether or not we remove the beacon from it. It is only possible to remove a beacon if all the remaining beacons allow the identification. Note that when all cities have beacons, the identification works just fine, because each city is the only city with a distance of 0 to its beacon. When doing the backtracking, make sure not to remove a beacon if it no longer allows identification of cities.

The backtracking wouldn't run in time. But my dirty trick idea was to make the backtracking search stop after a high number of steps. So that we use 2 seconds of time to do the search and return the best result we found in those limited 2 seconds.

It wasn't good enough, what if the optimal set of beacons to remove happens to be a set of the last indexed items? My new idea was to make this an unlikely event, my algorithm would randomly modify the positions of the input cities, lowering the chances of a crafted test case of occurring. Now, this approach would be wrong if there was a case that required a perfectly unique set of beacons to be removed, so the probability would be low. I had another trick idea: Spend half of the time in a greedy idea, giving priority to nodes with little edges so we can remove them.

Implementing the dirty tricks was not easy and took me quite a while. So much that in fact the only code I could submit was incomplete. It only did the randomized idea and it still only allocated half of the execution time for it. The randomized idea seemed to be returning a wrong result for a large case I had, so I sort of suspected it would either fail in the challenge phase or in the system tests.

Unfortunately, that didn't happen, and my very dirty brute force algorithm passed system tests. Good for my rating.

Comments?

Sunday, November 24, 2013

SRM 597 Editorial

It is up: http://apps.topcoder.com/wiki/display/tc/SRM+597. Don't forget to vote and comment. Feedback is important.

I really failed this time, first time I fail the deadline since the 96 hours deadline for div1 hard's editorial was added . The problem was that div1 900's is not difficult, but I completely underestimated the time it took me to code it correctly (had a very dumb bug last night and I couldn't debug it and kept falling asleep). Then explaining it was impressively hard.

Match comments

This was a good match. I was angry the last day because of missing it. I still think the 600 points problem wasn't suitable for medium. I ow think that 900 was better for that slot. Hmnn. The rest of the match had cool problems.

Wednesday, November 20, 2013

SRM 597: argh

The other day I accepted to write editorials to the TCO. Unfortunately they never warned me that some of the problems I would read could end up getting used outside of the TCO. I only learned that after the problems were sent to me. Had I know that I could miss SRMs just for helping with these editorials, it would have been a deal breaker. I hate missing SRMs. I really hate missing SRMs.

I had hopes that maybe if one of the problems I read gets used like this in a SRM, I could at least have a chance to write the other problems in the SRM. Nope. Then I tried to maybe get added as a tester to read the other problems so I could at least have progress in the editorial. Nope. So missing this SRM was useless. Week ruined.

Div1 600: ConvexPolygonGame

This is the problem that ruined my week. Two players take turns. It all starts with a convex polygon. In each turn, the player draws a polygon such that:

  • The previous polygon contains all the vertices of the new polygon.
  • The vertices of the new polygon are lattice points
  • No three vertices are colinear.

The player loses when it is impossible to draw such a polygon.

The trick is to notice that as long as a player can make a move, the player can always pick a minimal triangle that will win the game. We know that the game will end with a single triangle. Imagine this triangle came after a sequence of moves. This triangle will then be inside each of the polygons used in the game, including the first one. Then first player can just directly use this triangle and win.

So we just need to check: Is there at least one triangle with non-zero that can be drawn in the first turn? If so, first player draws a terminal one and wins, else the first player loses.

So far it is a cool little problem. The original version planned for TCO had `(-500 <= x,y <= 500)` constraint for the vertices of the first polygon. Which had a couple of cute solutions, for example, you could iterate through all the points inside the polygon and if you ever find a point that is not colinear to the previous two points, the job is done. Unfortunately, when migrating it to SRM format, they changed the problem by making the constraints much larger. So the clever conclusion that you only need to check if one drawable polygon exists becomes just problem obfuscation, because the real issue is to work up the implementation problem of checking for that condition. Apparently the idea is to use only the points that are close to the vertices, within 200000 distance or something like that.

I think this problem was a decent TCO finals easy with the small constraints. With the large constraints though, it becomes a SRM div1 medium with excessive implementation that doesn't make it more interesting.

Saturday, November 16, 2013

Bragging about incoming Greed features part 2

Last time I mentioned that Greed Topcoder arena plugin is getting a 2.0 version that has plenty of nice features. Between that last post and today, many things happened. The first is that Greed 2.0 is now officially in beta state. So you can now get the jar in the releases page. If you used Greed 1.5, the configuration has really changed, it might actually be more productive to start configuration all over from the beginning and try to reuse your templates if they were too complicated. (There is only one breaking change in the template format, the ;zeroVal was changed with ;zerovalue, but the configuration file has really changed).

Also important: I committed a couple of cool things that will also be part of the new version. These changes are not in the beta jar, so you might have to compile the git branch yourself if you want to try them now. If you can wait, that's better.

Modulo detection

Counting problems in algo contests tend to ask you for the result modulo 1000000007, or modulo 1000000009 or some other large prime. This is because everyone hates bignums. Anyway, it does add a bit of an annoyance when solving these problems, you have to paste the number in your code, preferably as a constant. In TopCoder, they have the bad habit of using the absurd format that uses commas as separators , so the constant appears in the statement as 1,000,000,007. So you actually have to remove the commas. This is if you actually think of using copy paste, maybe you don't and just type the number manually, possibly typing the wrong number (like using 1000000009 instead of 1000000007 or missing a zero).

We deserve better. Now that greed has access to the problem statement, it occurred to me that it could totally try to detect this modulo. It looks for the phrase ":modulo (some number)" and then saves the number in a convenient spot. If there are multiple of them, it saves the last one that is mentioned, because that tends to be one of the last phrases in the statement. It probably won't work all the time, but most of the time it shall be of some help.

My template looks like this:

${<if Problem.Description.Modulo}
    static const int MOD = ${Problem.Description.Modulo}; 
${<end}

The result is:

    static const int MOD = 1000000007;

Or whatever number is parsed. This constant is only added if the Modulo was detected. You know, automatically.

The colorful templates with generic tester

I added tweaked versions of my current templates so that it is easy to use them in greed out of the box after just tweaking two configuration lines.

More info

Tuesday, November 12, 2013

Topcoder Open 2013 editorial: TheGameDAG

The other month, I utterly failed and couldn't advance to round 2 of the TopCoder open. I didn't even get a t-shirt. Something after that I was contacted by an ominous figure, goes by the name of Rustyoldman. Apparently this year the TCO bloggers (read: Rustyoldman and .... misof) were going to write the editorials for the TCO onsite problems. The catch would be that this time editorial writer would have access to the problems long before the matches, so that editorials could be published fast.

I was undecided. On one side, that means money. On the other side, if my explanations get published at the same time as Rustyoldman's or, worse, misof's, then it would become very clear that my editorials are awful in comparison. I have a very fragile ego and I do not need any of this. Also, TCO onsite problems are ...hard. This was going to be a time sink.

But I eventually agreed. I only had time to write a small amount of editorials (possibly just the following one, I can't disclose more info). So here we go:

TheGameDAG

Problem statement available at: http://apps.topcoder.com/forums/?module=Thread&threadID=803574&start=0

Editorial available at: http://apps.topcoder.com/wiki/display/tc/TheGameDAG

This one problem was very hard. I needed a couple of weeks (of course, also busy with homework and the other editorials) to think of it. Ultimately I had to ask for help. Gotta thank bmerry for the help in getting me to understand this problem.

Friday, November 08, 2013

Some great incoming topcoder-greed features

So Greed is a topcoder Arena plugin that is very customizable (And I mean, very customizable).

When I started using these plugins I took advantage of its github to provide patches and suggestions. This helped shivawu implement some wonderful features that will make Greed 2.0 a plugin with unprecedented power (I think).

Custom templates

In the current stable version, there are 3 templates: source, tester and unit tests (optional for Java and C#). The 2-3 templates are used by Greed to generate the source and unit test files. The templates use a language of their own that is very powerful. So powerful that implementing python support almost didn't need much modification to the plugin itself, just the creation of templates for python and an update to make the plugin use them.

Why stop there, though? The new idea is that you can, define custom templates. Any template for any kind of file in addition to the source and test ones. The plugin will use its configuration to read a template and put the generated contents in some location that depends on contest name and problem name. Why is this important? Well, it adds plenty of options that weren't there before. After some tweaking I was able to customize greed to generate ACM-style input and output files and a source code file that takes input and generates output ACM style.

This is the input file generated for FoxAndGo3:

6

3
o.o
.o.
o.o

3
...
.o.
...

5
xxxxx
xxoxx
xo.ox
xxoxx
xxxxx

5
.xox.
.o.ox
x.o.o
ox.ox
.ox..

5
o.o.o
.ox..
oxxxo
..x..
o.o.o

3
...
...
...

But I really didn't want these input/output files, I like my meta testers just fine. However, this feature is also helpful for other things. How about automatically creating a project XML file for your favorite IDE's ? Or a build file for whatever build system you have? Separating tester and source code. Whatever you need you can now do it.

Problem statement HTML

The first good use for the custom templates is that, just by adding some info from the problem statement to the template engine. The plugin can now generate problem statement HTML files from a template. Because it is a template, there is plenty of flexibility in regards to the format of the problem statement. The default looks like this:

There are plenty of uses for saving the problem statements in files. If the internet goes down during the contest, you can pretty much still work in the problem - You can read the problem and the source code is in an external file.

After generation hooks

My favorite new feature is, however, the "After Gen Hook" as it makes everything work great now. You can tweak configuration so, whenever Greed finishes generating a file from a specific template, it runs a custom command. Whatever command you need.

In my case, I tweaked Greed to make it automatically open source code files in jEdit, and opens the problem statement in Firefox. Manually opening the generated source code files in jEdit was an annoying bottleneck.

How to get the features

They are currently in a test branch and are still in development. But if you want to give them a try, just get the contents of the tmpl-seq branch from their github. The command to build a jar file from the new sourcecode is ./gradlew fatjar.

The new configuration options are not documented, so you will need to take a look to at the default.conf file in the source/main/resources folder and do some reverse engineering.

Friday, November 01, 2013

SRM 596: Lack of speed will kill you

Well, that it is , a not great day because I took too long to solve 250 points and the other problems seem too hard. Starting this at 11:55.

Div1 250: The one with sequences

You start with a sequence of n zeros. You want to convert it into a desired sequence. There are two allowed moves: a) Increment a single element by 1. b) Multiple all of the elements by 2. The elements of desired are non-negatives less than 1001.

I noticed right away that you can first decide the number of double operations because you will need at most 9 (Multiply 1 by 2 ten times and you get 1024). So the rest is an easy subproblem: For each element in the desired sequence, find the minimum number of increments you need if you are going to multiply by 2 a fixed number of times. I used dynamic programming, maybe there is an easier method.

static const int MAX     = 1000;
static const int MAX_N   = 50;
static const int MAX_LOG = 11;
static const int INF     = MAX_N * MAX;

vector<int> desiredArray;

int mem[MAX + 1][MAX + 1][MAX_LOG + 1];
int rec(int x, int y, int d)
{
    int & res = mem[x][y][d];
    if (res == -1) {
        res = INF;
        if (x == y) {
            res = 0;
        } else {
            //increment
            if (x  + 1 <= y) {
                res = 1 + rec(x + 1, y, d);
            }
            //double
            if ( (2*x <= y) && (d > 0) ) {
                res = std::min(res,  rec(2 * x, y, d - 1) );
            }
        }
    }
    return res;
}

int getMin(vector<int> desiredArray)
{
    int res = INF;
    memset(mem, -1, sizeof(mem));
    for (int d = 0; d <= MAX_LOG; d++) {
        // If we do the double operation d times, what is the minimum number
        // of increment moves each element needs?
        int m = d;
        for (int x: desiredArray) {
            m += rec(0, x, d); 
        }
        res = std::min(res, m);
    }
    return res;
}

I took way too long to code. Everyone in the division summary had far better scores than I.

Update: During the challenge phase, I noticed that the answer is equal to the maximum number of bits (if not 0) plus the sum of 1 bits. Makes sense. I wish I coded my dp faster.

Div1 500: The one with bits

This problem strikes me as having too many complicating factors. I am starting to think that it is a brute force problem. Right now trying to think of a way to do that, but with 13 minutes left I doubt I will be able to do anything. My 250 score is too low, my only hope is that there is a greedy solution to 250 that most people are usuing and it is wrong, but I doubt that to be the case. I think the "Fix number of doubling" solution is straightforward to see and can't think of anything else.

Programming this post to appear exactly as the match ends.

Update

Now they are gonna have an update. This is the kind of matches that makes me hate the 250-500-1000 strategy. Putting two greedy problems in the first two slots ensures that even if I get to solve them, it will be low score and a wasted match. At least focusing on 1000 would be more interesting, and a low score in a match where everyone solve 2 problems is as disastrous for rating as a zero score anyway.

Wish I could opt out of writing the editorial ( I cannot ), also without practice rooms it will delay development of the editorial too. Just awful.