Tuesday, December 28, 2010

SRM 492, comments and explanation for div1 250

Before this match, I noticed that it was actually plausible for me to reach a 2100+ rating this time and get to see a red sector in my rating graph. But it did not happen.

Div1 250 - TimeTravellingGardener

As soon as I opened this problem, I knew it was going to take me a long time.

The idea is actually simple: imagine that at least two of the trees will stay at their original heights. Then we can pick two such trees and draw a straight line through their tops. Then we can iterate through all the other trees and see whether each top is already on the line, or else verify that the tree can be cut so that its top touches the line. Finally, we pick the pair of trees that requires the fewest cuts (time travels).

But for that, we need to ask: "Is a case that leaves fewer than two trees intact possible?" As a matter of fact, it is, and I only found out about it after coding the two-trees solution; example 1) actually has a case in which only one tree is left intact.

There is good news though: if we assume that exactly one tree will stay intact, then we are free to change all of the other trees' heights. The simplest thing to do in this case is to set those heights equal to the height of the tree that stays intact. In fact, we can just cut all the trees down to the minimum height, and then a straight line will go through all of their tops. Since every sequence of heights has a minimum, we can use n-1 as the result whenever it is not possible to pick two trees that form a valid straight line (previous step).

Back to that previous step, the checks needed for this iteration are tricky. Imagine trees i and j were picked to be part of the straight line, and we want to check whether tree k belongs to that line. Assume that i < j (without loss of generality). Also assume we have arrays x[] and y[] that hold the top points' (x,y) coordinates for each tree (x is the position of the tree and y its height). Then let (dx = x[j]-x[i]) and (dy = y[j]-y[i]). We can say that the slope is dy/dx. Then what about (tx = x[k]-x[i])? Well, for the top of tree k to belong to the straight line, (ty = y[k]-y[i]) must be equal to tx*(dy/dx)...

Then we have an equation:
ty = tx*(dy/dx)

It is always advisable not to use floating point numbers if it is not necessary (many people actually failed system tests because of precision errors), so we can change the last equation to:

ty * dx = tx * dy

Then we do not need floating point numbers for that. If (ty * dx == tx * dy) holds, it is not necessary to cut the tree; otherwise it is. There are still two things to consider, and this is the part that gave me the most trouble: a) it is not possible to "cut" a tree so that its height becomes larger than its original value, and b) it is not possible to "cut" a tree so that its height becomes negative.


For the first condition, note that it translates into: y[k] >= y[i] + tx*(dy/dx), because tx is the difference x[k]-x[i], so tx*(dy/dx) is the required difference between the tree's height and y[i].

For the second condition, note that it similarly translates into: 0 <= y[i] + tx*(dy/dx).

We can get rid of floating point calculations by multiplying both inequalities by dx; note that we have ensured dx is positive, so the directions of the inequalities will not change.

y[k]*dx >= y[i]*dx + tx*dy
(y[k]-y[i])*dx >= tx*dy
ty*dx >= tx*dy

0 <= y[i]*dx + tx*dy

If those two conditions are true, then it is possible to cut the tree to match the straight line; otherwise it is not possible, so there is no solution in which both trees i and j stay intact.

Putting things together, we can code a solution...


#include <vector>
#include <algorithm>
using namespace std;

int determineUsage(vector<int> distance, vector<int> y)
{
    int n = y.size();
    if (n <= 2) {
        return 0;
    }
    // prepare x[]; note that we just renamed the height array to y[].
    vector<int> x(n);
    x[0] = 0;
    for (int i=1; i<n; i++) {
        x[i] = x[i-1] + distance[i-1];
    }

    int res = n-1;
    // It is always possible to leave at least one tree intact: just
    // pick the minimum height tree, and make the other trees' heights
    // match it.

    for (int i=0; i<n; i++) {
        for (int j=i+1; j<n; j++) {
            // pick two trees i and j, j > i:
            int r = n-2; // r is the number of trees we send back in time...
            bool can = true;
            for (int k=0; k<n; k++) {
                if (k!=i && k!=j) {
                    int dx = x[j]-x[i];
                    int dy = y[j]-y[i];
                    int ty = y[k]-y[i];
                    int tx = x[k]-x[i];
                    // The conditions we found...
                    if (ty*dx == dy*tx) {
                        r--; // already on the line, no cut needed
                    } else if ( (ty*dx < tx*dy) || (y[i]*dx + tx*dy < 0) ) {
                        can = false; // would need to grow, or to become negative
                    }
                }
            }
            if (can) {
                res = std::min(res, r);
            }
        }
    }
    return res;
}


During the match, I noticed most of the aspects needed for the solution early, except the second condition (that tree heights must not become negative after the time travel), which cost me a long time of debugging.

Once I submitted the 250, I barely had time to look at the 550 problem. It seemed very interesting, but it is too bad the 250 was such a time sink; I just didn't have enough time to think it through.

I ended up losing plenty of rating, but at least it wasn't more than the amount I had won in the previous match.

Sunday, December 19, 2010

Member SRM 491 - Div1 600, PrefixTree

At first it was easy to be misled by the examples and think that maybe sorting each of the words and then building a trie would work, but that is not the case; one of the examples actually fails with this solution.

Anyway, once I noticed that was not going to work, I took a look at the constraints. There can be at most 16 words in the input, which suggested to me that the intended solution was likely exponential.

I think the main trick is to notice the relation between intersection and the optimal solution. For example, with the two words "aaaxy" and "yaaz", there is an intersection "aay" (order does not matter). It should be easy to see that an optimal trie is formed as long as we make sure that the intersection is a prefix of both words: for example, "aayxa" and "aayz". Note that the order inside the intersection does not matter and neither do the suffixes; for example, "ayaax" and "ayaz" will also give an optimal prefix tree.

From the previous paragraph, we can assume that if we had plenty of words in a trie, and the trie was optimal, then the total intersection of all the words must be a common prefix between them. (Note that this condition is necessary but not sufficient.) So, let us say that we are merging two different optimal tries, each generated from a different list of words. For all words in list A the common prefix will be equal to their intersection, and the same is true for list B. We want to merge both tries such that the final trie is optimal. For this trie to be optimal, we would need all of the words in both A and B to have a common prefix equal to the intersection of all words in A and B. This is actually possible, because if A1,A2,...,An were the elements of A, and B1,B2,...,Bm the elements of B, then we can assume that the common prefix among the elements of A consists of the characters in (A1 ∩ A2 ∩ ... ∩ An), and the common prefix among the elements of B consists of the characters in (B1 ∩ B2 ∩ ... ∩ Bm). The total intersection is (A1 ∩ A2 ∩ ... ∩ An ∩ B1 ∩ B2 ∩ ... ∩ Bm). The key here is that this total intersection is a subset of (A1 ∩ A2 ∩ ... ∩ An) and also a subset of (B1 ∩ B2 ∩ ... ∩ Bm); this is a known fact of set theory: ((A ∩ B) ⊆ B).

Because for optimal tries we want the prefixes to consist of the characters in the intersection, but the order does not matter, we can assume that the prefixes of A1,A2,...,An all start with the characters that are in (A1 ∩ A2 ∩ ... ∩ An ∩ B1 ∩ ... ∩ Bm). We can do the same for the elements of B. Then all the elements in both A and B will have a prefix in common that consists of the characters in the set (A1 ∩ A2 ∩ ... ∩ An ∩ B1 ∩ ... ∩ Bm). This common prefix becomes a common path in both tries, so we can combine the tries to form a single one. The size of the merged trie is: (size of trie A) + (size of trie B) - (number of nodes in common). The number of nodes in common is equal to the size of (A1 ∩ A2 ∩ ... ∩ An ∩ B1 ∩ ... ∩ Bm) plus 1 (because all prefix trees have a root node that represents the empty string).
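To make that formula concrete, here is a quick sanity check (a throwaway snippet of mine, not part of the solution) that applies it to the earlier "aaaxy" / "yaaz" example. The optimal merged trie ("aayxa" and "aayz" sharing the path root-a-a-y) indeed has 7 nodes:

#include <algorithm>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string a = "aaaxy", b = "yaaz";
    int ca[26] = {0}, cb[26] = {0};
    for (int i = 0; i < (int)a.size(); i++) ca[a[i]-'a']++;
    for (int i = 0; i < (int)b.size(); i++) cb[b[i]-'a']++;
    // The intersection of two multisets keeps the minimum count
    // of each character.
    int isize = 0;
    for (int i = 0; i < 26; i++) {
        isize += min(ca[i], cb[i]);
    }
    // A single-word trie has (length + 1) nodes; the +1 is the root.
    int sizeA = a.size() + 1, sizeB = b.size() + 1;
    // Nodes in common = intersection size + 1 (the shared root).
    cout << sizeA + sizeB - (isize + 1) << endl; // prints 7
    return 0;
}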


Did you see that? This means that in order to know the minimum size of a trie that combines the tries from lists A and B, we only need the optimal trie size for the words in A, the optimal trie size for the words in B, and the size of the intersection of all the words in A and B. So, if we are given a list of words, we just need to somehow split it into two parts A and B, find the optimal trie sizes for A and B, and use that procedure. Correctly guessing which parts A and B to pick is difficult, though. But we do not need to guess: we can just try all possible pairs (A, B) that partition the set of words into two parts, and then pick the pair that yields the minimum size after merging the tries.

The last paragraph implies a recursion. We need a function trieMin() that takes a set of words and returns its minimum trie size. It will try every subset A, find the complement B, call trieMin(A) and trieMin(B) to find the optimal trie sizes of A and B, and then pick the minimum total size. You should note that no matter which two complementary sets A and B we pick, the total intersection is always the intersection of all the words in the set originally given to trieMin.

trieMin will always take a subset of the original words array in the input. That means that if its size was N, there are at most 2^N subsets this function can take. With N=16, we can just memoize this function, so it never evaluates the same call twice. Note that A and B are always strictly smaller than the argument set, so the recursion is acyclic (which is necessary for the memoization to work).

Inside the trieMin function, we need to iterate through all possible subsets of the given set. If we iterate through all possible subsets of each subset of a set of size N, the total complexity is O(3^N). As a quick proof, try counting the number of steps. Begin by counting the steps caused by subsets of size 0: that is C(N,0) * 2^0 (because there are C(N,0) subsets of size 0, and each of them needs 2^0 steps). For the subsets of size 1, you'll need C(N,1) * 2^1, and we can repeat:


C(N,0) * 2^0 + C(N,1) * 2^1 + ... + C(N,N-1) * 2^(N-1) + C(N,N) * 2^N


Let us just use a little imagination, and add a couple of powers of 1:

C(N,0) * 2^0 * 1^N + C(N,1) * 2^1 * 1^(N-1) + ... + C(N,N-1) * 2^(N-1) * 1^1 + C(N,N) * 2^N * 1^0

If we know anything about the binomial theorem, then we can tell that the previous sum is equal to:

(2 + 1)^N = 3^N.
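If you would rather not trust the algebra, here is a quick brute-force check (a throwaway snippet of mine, not part of the solution): iterating over all subsets of each subset performs 2^popcount(mask) steps per subset mask, and the total should match 3^N.

#include <iostream>
using namespace std;

int main()
{
    const int N = 10;
    long long steps = 0, pow3 = 1;
    for (int i = 0; i < N; i++) pow3 *= 3;
    // Every subset "mask" contributes one step per subset of itself,
    // i.e. 2^popcount(mask) steps.
    for (int mask = 0; mask < (1 << N); mask++) {
        steps += 1LL << __builtin_popcount(mask);
    }
    cout << steps << " == " << pow3 << endl; // prints 59049 == 59049
    return 0;
}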

Therefore, the complexity of our algorithm is O(3^N), which runs in time for N=16. But we still need to actually implement it. First of all, we need a way to find the size of the intersection of the characters in a list of words; note that words may contain the same character more than once, so you need a multiset-like structure to represent a word's characters and their intersection. Then we need to represent a subset of the list of words and also to iterate through all the subsets of a subset of the list. In both cases, bit operations are very useful. We can represent a set by a bitmask integer such that if the i-th bit is 1, the subset includes element i. In order to iterate through all the subsets A of a set, just use the ( i = (i - 1) & superset ) trick explained in bmerry's tutorial. And to get the complementary subset B, just negate A and intersect it with the original bitmask.

#include <vector>
#include <string>
#include <algorithm>
#include <numeric>
#include <cstring>
using namespace std;

int n;
// Our set structure is simply a size [26] array that holds the count
// of each character in the word.
int swords[16][26];

// The memoization array...
int mem[1<<16];

// Returns the optimal trie size for the subset of
// words represented by the bits in mask.
int trieMin(int mask) {
    int & res = mem[mask];
    if (res == -1) {
        res = 851; // the largest possible trie size is smaller than 17*50

        // get the intersection size...
        int inters[26];
        fill(inters, inters+26, 50);
        int t = 0;
        for (int i=0; i<n; i++) {
            if (mask & (1<<i)) {
                for (int j=0; j<26; j++) {
                    inters[j] = std::min(inters[j], swords[i][j]);
                }
                t++;
            }
        }
        // (To intersect two sets, just take the minimum count for
        // each character.) The sum of these counts is the size
        // of the intersection.
        int isize = accumulate(inters, inters+26, 0) + 1;
        // +1 because of the root node, which also belongs to the
        // intersection.

        if (t == 1) {
            // only one word; its size is the optimal trie size.
            res = isize;
        } else {
            // Try all possible subsets and pair them with their
            // complements.
            for (int sub=(mask-1)&mask; sub>0; sub=(sub-1)&mask) {
                int nsub = mask & (~sub);
                int A = trieMin(sub);  // optimal trie size for the subset
                int B = trieMin(nsub); // ... and for its complement.

                // the total size when merging these tries is A+B - isize;
                // keep the minimum one.
                res = std::min(res, A+B - isize);
            }
        }
    }
    return res;
}

int getNumNodes(vector<string> words)
{
    n = words.size();
    // convert words to the set format:
    for (int i=0; i<n; i++) {
        fill(swords[i], swords[i]+26, 0);
        for (int j=0; j<(int)words[i].size(); j++) {
            swords[i][words[i][j]-'a']++;
        }
    }
    // initialize the mem array.
    memset(mem, -1, sizeof(mem));

    // recurse...
    return trieMin( (1<<n)-1 );
}

Saturday, December 18, 2010

Member SRM 491 commentary and explanation for the div1 easy problem.

FoxMakingDice
The 250, 600, 900 point distribution threatened to make this match repeat the pattern of SRM 490. So I knew I had to be fast on this problem, and I tried hard to be... It didn't work.

Ok... so the die always has 6 faces. We want to count the number of ways to assign the faces such that all pairs of opposite faces have the same sum, that sum is greater than or equal to K, and the faces are distinct numbers from 1 to N.

Err, wait, how exactly are two dice different? The statement said that they are equal if you can rotate one to become the other. Well, that was a little confusing to me. What really helped was to notice that the first example had N=6, K=7 (a normal 6-face die) and the result was 2.

Why two? Well, there are 3 pairs of numbers that give 7 as the sum, so there are C(3,3)=1 ways to pick the pairs, and then, according to the word of God, there are two ways to assign these pairs to the die. I decided to trust the problem setter on this.

So, for a fixed sum, we can count the number t of available pairs that give that sum; then 2*C(t,3) is the number of dice whose opposite faces add up to that sum.

Now we can just iterate from sum=K to 2*N (an upper bound on the sum value), and add up the totals.

Finding the number of pairs is easy: we can iterate starting from x=1. The opposite face must be (sum-x), so we need (sum-x > x) (because otherwise we would have already counted this pair) and also (1 <= sum-x <= N). We just need to count the values of x that follow those rules. This takes O(N) steps. The iteration over the sum value takes another O(N) steps, so the total complexity is O(N*N). The solution is then "simple", except for calculating C(t,3). For some silly reason I used Pascal's triangle, which introduced a silly bug: I had a [2001][3] array when I needed a [2001][4] array. Lame, lame. Instead, it is better to just use t*(t-1)*(t-2) / 6 (that's what you get from manually expanding C(t,3)).

long long theCount(int N, int K)
{
    long long ways = 0;
    // iterate through all possible face sums:
    for (int s=K; s<=N+N; s++) {
        long long t = 0;
        // count the number of face pairs that yield s as their sum:
        for (int x=1; (s-x>x) && (x<=N); x++) {
            if (s-x <= N) {
                t++;
            }
        }
        if (t >= 3) {
            // add C(t,3)*2 to the result; it neatly simplifies to:
            ways += (t*(t-1)*(t-2)) / 3;
        }
    }
    return ways;
}




PrefixTree
I was wrong; unlike SRM 490, this medium turned out to be approachable without massive implementation issues. It was also, in my opinion, beautiful. I will elaborate later.
Update: Explanation for div1 600


Hard problem (whatever the name)
I had enough time to open it, but was clueless. During the challenge phase I found out it could have been solved with min-cost max flow and binary search. That's crazy.

Challenge phase
This should underline neal_wu's awesomeness. Before the challenge phase, I noticed there was a 600-pointer submission by a blue coder. So I rushed to open it quickly at the start of the challenge phase, and just as I finished reading the first three lines, it was already challenged by neal_wu. He later said he quickly noticed the guy wasn't picking all the subsets, so he just gave it a random large challenge case.

Rating and issues
Well, I had a chance to see for a second what my rating would become if this match were rated: 2035! That would mean I recovered from the losses of the last match. Unfortunately, it appears that it won't be rated due to issues during the challenge phase :(.

Thursday, December 16, 2010

The latest topcoder editorials I've written

Hello all (I wonder if anybody actually reads this blog; I have made a good effort not to post links to it or talk about it in public). I have decided to rebrand this blog. Its name is now Doomed to debugging, and its focus will be programming and computers, but I will try to use it just for stuff related to me learning programming, and try as hard as possible to keep opinions and rants out of it. (I cannot give guarantees, though.)

I could not help but notice that this blog has had no entries since October. As much as I would like to say I was busy, I really wasn't. Though lately, and more than usual, I have been writing editorials for TopCoder matches. Let me recap all that has happened since October.

TCO Semifinals and wild card rounds.
This was fun. In total, I had to write explanations for 7 problems (the explanations for 2 problems were already done), all of which were of semifinal level (the TCO is a worldwide tournament). I have started to think that the editorial writer is the one that has to do the dirty work. You know, the problem setter gets to think of clever problems. The testers have to find flaws in them, and the director has to decide which problems are appropriate and which are not. Who's left? The editorial writer! The guy that has to actually write an explanation of the problems that people will be able to understand...

I'll have to admit, writing the editorial for the semifinals and wildcard rounds felt like less work than usual. Usually, when writing the editorial for an SRM, I spend most of the time actually trying to solve the problems, which is not easy at all; it is in fact nigh impossible sometimes. This time, I had quick explanations from the problem writers and also the most helpful bits of text that Petr Mitrichev had in his own blog. Actually, after reading the stuff Petr wrote, I couldn't help but feel like a third wheel. It was not clear to me why I needed to make much longer versions of what Petr said and turn them into three editorials...

Another thing that made it easier than a usual SRM's editorial was that I did not have to write the match summaries - those paragraphs at the top of each editorial that supposedly explain what happened during the match - they are incredibly hard for me to write.

TCO 2010 Semifinal round 1 (Explanation for the easy problem was written by Ivan Metelsky)
TCO 2010 Semifinal round 2 (Explanation for the medium problem was written by Ivan Metelsky)
TCO 2010 wildcard round

Please note that the editorials I write are subsequently posted to a wiki which every TC member can edit, so anything that resembles correct English probably came from a helpful editor and not me.

The most difficult part of writing those editorials was actually the hard problem in the wildcard round. I was unable to make my solution run in time in Java, and I ran out of time to work on a correct version; but it is supposed to work well in theory and it implements the correct ideas. The second issue I had was with the Semifinal 2 hard problem; until that day I had had little to no experience with range trees.


SRM editorials
SRM 487
SRM 488
SRM 490

I have broken a record and written three SRM editorials in a row! I have to make a clarification: the reason I get to write editorials so frequently is not exactly the score they get in the feedback post that usually accompanies editorials when they are posted. The reason is actually that I am usually available to write editorials when other approved editorial writers are not. Of course, the positive feedback does help, and I guess it would be possible for me to lose my approved editorial writer position if I got consistently bad feedback. I must confess I am usually shocked by the good feedback my editorials receive :).

Writing editorials for SRMs is very hard, because you must actually understand the solutions of the problems before being able to explain them, and SRM problems have gotten very hard lately. So I spend most of the time attempting to solve the problems; failing that, I have to ask the contest director for help or see if there are useful hints in the forums. Some behind the scenes:

SRM 487: I actually reverse engineered the division 1 hard problem's solution from the source code of the top placed coders. During the match I tried to solve this problem and kept elaborating on approaches that were wrong. This match was very enjoyable to me both as a coder and as the editorial writer; the graph coloring problem especially was just great.

SRM 488: This SRM... I must say that I seriously ran out of time when writing this editorial, because the division 2 and division 1 hard problems had intended solutions I was not able to understand. In the end things turned out all right, I think.

SRM 490: I hated this SRM while I was participating in it. By the time I opened the 250 it was pretty clear to me that it was going to be yet another mathy problem that would take ages for me to solve, and I already knew the medium problem was worth 550 points, which probably meant it was out of my league. In the end I got a very low score on the 250 problem and had almost no time to solve the 550. I was right about the preliminary solving idea for the 550, but I was hours of debugging away from solving it. I lost many rating points thanks to this SRM and once again failed to maintain a 2000+ rating for more than one match.

Writing the editorial for SRM 490 actually greatly improved my opinion of it. The 250 actually makes sense once you manage to picture how it works and explain it in text. The 550 was a very hard to implement problem, but it was the kind of problem that rewarded you for thinking before implementing. But the real savior was the 1000 pointer. I actually did not even open it during the match; that was a mistake. The 1000 pointer was a maze problem (I love maze problems), it was very interesting AND it had something similar to a linear recurrence... It has many elements I love. Though TopCoder's contest system sometimes tends to be excessively punitive of slow submissions.

Oh, and I like this editorial because I managed to finish it even though I had to study for and take a "Group work and mnemonics 5"... errr, I mean "Software engineering" final exam during the first 8 hours of the deadline. In the end things turned out all right.

SRM next Saturday
Member SRM 491 is set for next Saturday. Note that a large group of people are entering holiday and vacation periods, plus Saturday matches tend to be very active. I have the feeling this SRM will have many, many contestants, so I hope the problem set is fun and I also hope I recover my 2000+ rating.

That's how this entry ends. I'll return to debugging, err... programming some new features for Xye.

Friday, October 15, 2010

Ubuntu Jaunty to Maverick 'upgrade' : The aftermath

Yesterday I upgraded my Ubuntu 9.04 desktop to 10.10, mostly because repository support for Jaunty has ended. What I did was a very irresponsible and risky thing, simply because UBUNTU UPGRADES DON'T WORK. Ok, they sort of do, but even the smallest "dependency error" will at best doom you to hours of tweaking and at worst ruin your install forever. That wouldn't be a huge issue if dependency errors didn't happen ALL the time when upgrading from one Ubuntu version to another...

And that's just when upgrading between two versions that are 6 months apart. Upgrading from Jaunty to Maverick multiplies the headaches by 9 (no, not by 3) and is very risky.

Why did I do it?

Why did I wait until 10.10 instead of just upgrading to each next version as it was released? Well, I was very happy with 9.04 and I am also lazy. I didn't need to upgrade until Jaunty Jackalope support ended.

Why didn't I just do a clean install? Well, that's what I am asking myself. The thing is that I can withstand some hours of tweaking config and fixing errors, but I have been using this Ubuntu install for at least 5 years now (I am quite sure it started in Breezy Badger times) and thus I have tons and tons of config and installed packages. With a clean install, I would not have as much trouble getting the computer to work, but I would have to reinstall (read: download) tons of packages, and would also have to reconfigure things to suit me.

If you do not have the patience or skills to be locked into Ubuntu recovery mode (no UI) for 6 hours trying to fix stuff using the command line, you should definitely NEVER upgrade and ALWAYS do a clean install of the newer version; you'll live longer.

Anyway, if you are in my same situation, were using Jaunty and now want to upgrade, you have two solutions: a) Upgrade to each consecutive version step by step using the update manager. Since those upgrades are supported, they will give you fewer issues. I didn't do it because I have low bandwidth and that process would have taken me three weeks...

Or b) This:
* Edit /etc/apt/sources.list (for example gksu gedit /etc/apt/sources.list)
* Optional/recommended : Get rid of any repository that is not from official ubuntu, just in case. You can add those repositories back later.
* In that file, replace every instance of "jaunty" with "maverick"
* now open a command line and do "sudo apt-get update"
* Then do "sudo apt-get --download-only dist-upgrade"

That will download all the upgrades for your packages.

* To install (and possibly doom yourself), do "sudo apt-get dist-upgrade"

What will happen is that Ubuntu will try to update itself, and it will try very hard, but at some point it will fail, because a new package will conflict with an old package that was meant to be removed but for some reason wasn't. The whole thing will halt and tell you there were issues installing one of the packages. You have no choice but to decipher the terminal text and find the name of the package that is causing the conflict. Then do "sudo apt-get remove packagename". Chances are that about 20 packages depended on that package... so it will list a great deal of dependencies that cannot be met. Your only choice is to do "sudo apt-get remove packagename1 packagename2 ... packagenameN" for ALL the packages, including the one you want to remove and those that required it. Then you will have to do dist-upgrade again, and repeat, and repeat.

Eventually, dist-upgrade will finish. But you have probably removed a good deal of packages... so you had better try to at least get the basic stuff back:

"sudo apt-get install ubuntu-desktop"

Then cross your fingers.

What happened to me: It is not the first time I have upgraded Ubuntu, and it is not the first time I have upgraded between two versions separated by more than 6 months either. I was expecting all that dependency mambo. So I eventually reached the end of dist-upgrade. But when 10.10 booted... I had no mouse or keyboard in the graphical interface! ARRRGGG. Things like that can happen because, while trying to fix all those broken packages, your system got horribly misconfigured.

After hours of trying to overcome it by doing things like reconfiguring the X server and removing nvidia drivers, it finally struck me... perhaps I just needed to use the newer kernel. I was using the old one because I didn't update my grub menu.lst (since my setup is very old, updating menu.lst automatically would screw up the formatting and remove the Windows XP entry). So I modified it to use the newest kernel.

Then X crashed (darn). But it turned out to be a simple issue: the kernel no longer loads the "nv" driver but the "nouveau" one, so I just changed the driver used in xorg.conf.

After all of that, my keyboard and mouse worked. I am using 10.10 already. However, there is a horrible, horrible issue: my desktop's emblems and icon size data are lost! :( I will have to resize them and add emblems again :(


I seriously think that the whole Ubuntu upgrade mechanism should be full of gigantic warning signs before letting a user go through with it. I think there may be users out there finding howtos about upgrading to avoid a clean install and following them... No, people, do not upgrade unless you want to suffer. Do NOT upgrade.

Wednesday, June 02, 2010

How to turn a vector into a set

It is amazing how large the STL is and how it manages to surprise you every time you find a new trick.

So, in python it is often useful to do this:
v = [4,5,2,7,5]
s = set(v)
if 4 in s:
    print "YESS!!"

By doing set(v) you create a set from the given list, which you can then use to do 4 in s.

I was practicing topcoder's SRM 426, and once again I ran into a common theme: you have a vector<int> cardsReceived; does it contain int x?

The form I would have used until today, (find(cardsReceived.begin(), cardsReceived.end(), x) != cardsReceived.end()), is excessively ugly. But somehow it occurred to me to just try std::set's constructors:


set<int> recset( cardsReceived.begin(), cardsReceived.end() );
// ...
if( recset.count(x) ) {
    cout<<"YESSS!"<<endl;
}


That's right, do not underestimate the STL's constructors. Since this constructor takes generic iterators, you can use it for arrays as well:


int A[6] = {2,3,2,2,4,5};
set<int> rec( A, A+6 );


Do not forget also that constructors may be called like functions, creating a temporary:


int A[6] = {2,3,2,2,4,5};
if( set<int>( A, A+6 ).count(4) ) {
    cout<<"YESS!!"<<endl;
}



Off-topic: I once again ran into the classic issue with using HTML for simple communication. Thanks to HTML it is very difficult to use <, and you always have to remember to use &lt;. That is very dumb; I will begin to look for better ways to write entries for blogger.

Monday, May 10, 2010

Google code jam qualification rounds.

I've participated in google code jam since the 2008 edition. Considering how the 2009 one went, and seeing the problems in 2010's qualification round, I can predict with 95% confidence that in the 2012 google code jam:
- There will be 5 advancers to on site finals.
- The world champion will win 100 dollars.
- Rounds will last for four hours.
- Problems will be 350% more about implementation than they are now.
- Problems will be 10% as interesting as they are now.
- Filtering scoreboard by country will still be impossible.


Anyway, the positive thing about this round was that I finally learned my lesson. Nope, python is not the answer to bignum problems. Just because the contest organizers don't feel like you should use C++ to solve a problem does not mean you shouldn't. In fact, after some time in the chat I finally saw the light: GMP!

Instead of spending 30 minutes of each of the following GCJ rounds trying to remember how to use python's bizarre standard i/o (which for some reason is not as simple as cin >> n or n = sys.stdin.readInt()) (I am guessing all GCJ rounds from now on will use bignums), I can just use the GMP library. Because:

- It is free software!
- It works.
- The C++ binding makes clever use of operator overloading; it is easier to use than Java's bignums...

Anyway, so here is what I learned.

Installing GMP
In ubuntu: sudo apt-get install libgmp3-dev libgmpxx4ldbl

Using GMP in your code jam c++ template

Well, it is easy: after #include "gmpxx.h", simply use the mpz_class just as you would use a numeric type... Do not forget to change your compile command to link the new library (I added -lgmpxx -lgmp to my g++ command inside the script that runs gcj code).

After installing and preparing GMP I gave it a try and coded problem B in c++ using GMP; it was very easy. Granted, I am no fan of nonsensical names like mpz_class, so I used a typedef to call it big.

What follows is the elegant resulting c++ code I have fallen in love with:

#include <iostream>
#include "gmpxx.h"
typedef mpz_class big;

using namespace std;

//=========================================================
// program:
//
int N;
big t[1000];

big gcd(big A, big B) {
    while (B != 0) {
        big C = B;
        B = A % B;
        A = C;
    }
    return A;
}

big solve() {
    big T = t[0] - t[1];
    for (int i=0; i<N; i++) {
        for (int j=i+1; j<N; j++) {
            T = gcd(T, t[i] - t[j]);
        }
    }
    T = abs(T);
    return (T - t[0] % T) % T;
}

inline void init() {}

//=========================================================
// I/O:
//
int main()
{
    init();
    int C; cin >> C;
    for (int i=1; i<=C; i++)
    {
        cerr << "[" << i << " / " << C << "]" << endl;
        cin >> N;
        for (int j=0; j<N; j++) {
            cin >> t[j];
        }
        cout << "Case #" << i << ": " << solve() << endl;
    }
    return 0;
}

Saturday, January 23, 2010

Postmortem: UVA World finals warmup II

Contest link

Bleh, why did they have to make today's warmup the 14:00 GMT one instead of the last one? I had an exam during the warmup, and it seems I didn't have time to solve, or even open, the interesting parts of the problem set.

The exam
So, at 10:00 AM GMT-4 (exactly the warmup's start time) I was supposed to be taking an exam. This was so lame.

10:40: Exam begins. The professor hadn't arrived at 10:30, even though the exam's schedule was 10:00 AM - 12:00 PM...

Lame, only two questions. The first one was very easy using predicate transformer semantics (zero loops, conditional statements or anything of the sort).

Second question: even lamer. We had to use Hoare-Floyd logic to verify a program which had... no post-condition. Well, I got used to this during the course, since it seems that verifying a program also includes making the post-condition up. Anyway... the algorithm was simple: just a product computed using successive sums. The difficulty was that it had a do...while loop, and during the course we were only introduced to while loops. The professor was kind enough to include the do...while loop's rule... but THE RULE WAS WRONG. This is just my first exposure to Hoare-Floyd logic, but even I know there is no way in hell that B could be part of the post-condition of (do C while B), since we need B to be false to end the loop... So, I figured out the correct rule by myself, included an explanation of why the given rule was wrong, and a sort of proof that mine works, based on the fact that do..while can be converted to a while loop by copying the loop's contents to just before the loop...
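For the record, here is the rule I reconstructed, written from memory, so take it as a sketch rather than textbook notation:

If {P} C {Q} holds, and (Q ∧ B) ⇒ P, then {P} do C while B {Q ∧ ¬B}

It makes sense: after each execution of C we know Q; if B still holds, then Q ∧ B re-establishes P and C can safely run again; and when the loop finally exits, we additionally know ¬B, never B.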

So either I got a good 100%, or 50% if somehow we were supposed to use the wrong rule for the proof, or 0%, knowing my luck.

11:00: I am finished. I should probably go back home to actually try the warmup II contest... but I don't know if I should leave or wait for the official end time... always confusing.

11:40: Ok, some people are just handing him their tests and leaving. I had better do so as well.

12:20 : I am at home and ready to open the problems...

Problem D
So, I looked at the statistics, and problem D was by far the one with the most solutions. It turned out to be excessively easy. I actually double checked the statement just in case there was a cheap difficulty device, like having to use bignums or something. There wasn't, so I just solved it. I wonder how anyone could fail to solve at least 1 problem in this warmup...

Problem J
Apparently the second easiest problem. I first thought that it was another easy one, since I assumed that if light a can trigger light b, the converse is true as well. But then I noticed that this probably wasn't the case. I asked for a clarification to be sure, but the response never came to my inbox...

So, after inspecting the examples, I finally figured out that the graph is directed, which makes things a little harder. I quickly noticed that all nodes belonging to a cycle can be treated as a single node. So we can run an SCC algorithm and then work on the condensed DAG. Once we can assume the graph is a DAG, things are easy: just notice that you have to manually turn a light on if and only if there are no lights that can trigger it. This reads as "count all the nodes in the DAG that have in-degree equal to 0".

Coding the solution was hard, since I realized I had never actually implemented an SCC algorithm before. I had just blurry memories of CLRS' lesson about it, so I went to Wikipedia. It recommended Tarjan's algorithm and also included pseudo-code for it. It took me some time to convince myself that the pseudo-code was actually correct.
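For reference, this is roughly the shape of what I ended up with (a reconstruction after the fact, not my exact contest code; the size bound and names are made up): Tarjan's algorithm labels the components, and then we count the condensed nodes with in-degree 0.

#include <algorithm>
#include <vector>
using namespace std;

const int MAXN = 10000;   // hypothetical bound on the number of lights
int n;                    // number of lights
vector<int> adj[MAXN];    // adj[u]: lights triggered by light u
int comp[MAXN], low[MAXN], num[MAXN], counter, ncomp;
vector<int> stk;
bool onstack[MAXN];

void tarjan(int u) {
    low[u] = num[u] = ++counter;
    stk.push_back(u); onstack[u] = true;
    for (int i = 0; i < (int)adj[u].size(); i++) {
        int v = adj[u][i];
        if (num[v] == 0) {           // unvisited: tree edge
            tarjan(v);
            low[u] = min(low[u], low[v]);
        } else if (onstack[v]) {     // edge back into the current stack
            low[u] = min(low[u], num[v]);
        }
    }
    if (low[u] == num[u]) {          // u is the root of an SCC; pop it
        while (true) {
            int v = stk.back(); stk.pop_back();
            onstack[v] = false;
            comp[v] = ncomp;
            if (v == u) break;
        }
        ncomp++;
    }
}

int solve() {                        // adj[] must be (re)filled per test case!
    counter = ncomp = 0;
    fill(num, num + n, 0);
    fill(onstack, onstack + n, false);
    for (int u = 0; u < n; u++) {
        if (num[u] == 0) tarjan(u);
    }
    // A condensed node must be switched on manually if and only if
    // no edge from a different component enters it (in-degree 0).
    vector<bool> indeg0(ncomp, true);
    for (int u = 0; u < n; u++) {
        for (int i = 0; i < (int)adj[u].size(); i++) {
            int v = adj[u][i];
            if (comp[u] != comp[v]) indeg0[comp[v]] = false;
        }
    }
    return (int)count(indeg0.begin(), indeg0.end(), true);
}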

1:00 PM : The time I finished coding the problem coincided with lunch time. So I submitted it and checked the result... WA!

Lunch
I spent the whole lunch hour wondering whether the algorithm I had conceived was wrong. I ended up quite convinced that it should work.

Back to problem J - Today's blunder
14:00 (ish) - I had no choice but to inspect my solution. I created some test cases and noticed that it was failing many of the new ones. After a lot of manual debugging I finally noticed my mistake... I had something like:

for (int i=0; i<T; i++) {
    cin >> N;
    for (int j=0; j<N; j++) {
        adj[i].clear();   // <-- the mistake
        ...
    }


It was supposed to be adj[j].clear()... so some ghost edges could remain and ruin everything. I fixed this issue and got AC.

Problem B
14:40-ish. The next easiest problem was B. I quickly noticed it was just a dp/memoization problem. So I wrote a recurrence, but it allowed some cycles, so you had to solve the equation inside the recurrence to get rid of them (so that dp/memoization worked). There was also the extra work of having to get a) the prime divisors of a number and b) the number of primes between 1 and a number. Both are easy using a sieve; I modified Eratosthenes' sieve so that it stores each number's lowest prime divisor in its array: if no prime divisor has been found for a number, the number itself is prime, so use it as the lowest prime divisor of all its multiples that don't have one yet (a sketch of this follows).
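The modified sieve looks something like this (a quick sketch of the idea with a made-up limit, not the actual contest code):

#include <iostream>
using namespace std;

const int MAX = 1000000;   // hypothetical limit, pick whatever the problem needs
int lowdiv[MAX+1];         // lowdiv[x] = lowest prime divisor of x (0 = none yet)

int main()
{
    for (int i = 2; i <= MAX; i++) {
        if (lowdiv[i] == 0) {
            // No divisor was recorded for i, so i is prime; mark it
            // as the lowest prime divisor of every multiple that
            // does not have one yet.
            for (int j = i; j <= MAX; j += i) {
                if (lowdiv[j] == 0) lowdiv[j] = i;
            }
        }
    }
    // Example: read off the prime factorization of 84 by repeated division.
    for (int x = 84; x > 1; x /= lowdiv[x]) {
        cout << lowdiv[x] << " ";   // prints: 2 2 3 7
    }
    cout << endl;
    return 0;
}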

I was having problems with the example cases at first; it turns out my recurrence was wrong. In the end I finally figured out:

f(x) = ( 1 + sum( f(y) for each y such that y is prime and x divides y) ) / ( (number of good y primes ) / (number of primes between 1 and x) )

I actually finished this problem very close to the end of the contest.

Final result: 77th place. Hmnn, I never do well in world finals warmups...

Conclusion
I think that if it weren't for the time wasted on problem J, due to not knowing enough about SCCs plus the lame mistake, I could have solved another problem, or at least gotten more fun out of the contest. I need to practice AND study more theory.

Final thoughts
Why do blogs use HTML instead of something like BBCode or whatever wikipedia uses? Since they have to be paranoid about XSS attacks, they don't let you use many tags, and the format gets confused by things like C++ code, in which > and < are used... With BBCode, we wouldn't have so many problems...

Sunday, January 03, 2010

Making spells in wc3: still torture


I spent a lot of holiday time making spells for the failed hero contest at wc3c.net. My spells are somewhere around that page.

My conclusion: I have ZINC, which makes typing very fast. Let me self indulge by saying it really does help. Of course, it also allows me to code atrocious things like:


t = CreateTrigger();
TriggerRegisterAnyUnitEventBJ(t, EVENT_PLAYER_UNIT_SPELL_CAST);
TriggerAddCondition(t, function()->boolean {
    return BUILD_SPELL_ID == GetSpellAbilityId();
});
TriggerAddAction(t, function() {
    onRebuildStart( GetTriggerUnit() );
});


Yay! Anonymous functions... I wonder if people can actually read that code...

Another accomplishment of the year was the discovery of jEdit. It has also been amazing at increasing my efficiency; thanks to auto-complete I don't actually need to browse common.j every 5 minutes...

I have also made a lot of improvements to my linux command-line based warcraft map build system.

So, in theory I should now be able to make spells and maps quite quickly. Unfortunately, that's not true. Just like I thought, the bottleneck was never coding or compiling; it was testing. In fact, I think I could have finished the spells faster using the vJass from 3 years ago if map loading took 1 second instead of 40 seconds. The other major issue is that wc3's engine is full of idiosyncrasies that force you to keep tweaking and testing in the object editor until you find something that actually works. In fact, making these spells was a fight against the engine: I had to deal with pocket factory and storm, earth and fire, and both required a lot of reverse engineering.

The reality is that, regardless of all the work in the last few years to make coding easier (and we did succeed at that), making spells is still torture. But I have also figured out that we still lack specialized libraries, or at least I do. The spell making process involves using certain tools and processes over and over and over again. The spells I made are really just combinations of the same old 'processes' I've been using since 2004. The language has changed, the rules about how to code have changed, but it is still the same. Doing all these things over and over again is very repetitive.

I think some sort of library collection that has all these common processes abstracted, in a way that lets you just put them together like LEGO bricks, would be amazing and would seriously improve the speed of these things. I also think that maybe we need a better language; we've been messing with syntax and OOP concepts for ages, but maybe we need something that frees us from having to do all that attaching, and the defensive programming related to it, manually.