## Saturday, March 29, 2014

### SRM 614: Cursed weak test cases

Remember that this is the 4-th SRM in the python experiment. This time I felt really fast when solving the div1 250 pointer. When python helps, it really helps. Unfortunately...

## Div1 250: Square, points, etc.

You are given up to 100 pair-wise distinct, lattice points with coordinates with absolute value up to 10^9. Find the minimum area square with lattice vertices and side paralel to the x and y axises that strictly contains at least K of the input points.

So I code my initial idea: One of the optimal squares will have a bottom-left corner equal to (x-1,y-1) for some point (x,y) in the input. Try all O(n) of such candidate corners. Once you fix a corner, you just need the side length. We can just binary search for the smallest L such that the square will strictly contain all those points.

FAST submission! I was first in room for a while and was happy.

## Div1 525

This is a complex dp/implementation problem mix. Eventually decided to skip it. I had no idea how to work around the high constraint for K. And I am tired after staying late trying to do SRM 613's editorial. Plus I won't have to solve this problem anyway because I am not writing SRM 614.

## Div1 1000

Oh boy, I am so happy I am not writing this editorial.

## Back to Div1 250

So I decided it was best to start writing this blog post. As I was explaining the div1 250, I noticed: My logic is wrong! The test cases were awfully weak in this case. In reality, if the optimal corner is (x-1,y-1) , then (x,y) is not necessarily a point in the input. x and y could come from different points. The counter example is simple, points: (0,3), (1,3), (2,3), (3,3), (3,2), (3,1), (3,0) and K = 6. The best square has x = -1 and y = -1.

By the time I noticed this, it was incredibly late. The resubmit was very costly, I hope to be able to compensate by finding similar mistakes in the challenge phase.

The python code is lovely though:

def binarySearch(f, lo, hi):
# f(hi) and not f(lo)
while lo + 1 < hi:
ha = (lo + hi) / 2
if f(ha):
hi = ha
else:
lo = ha
return hi

class MinimumSquare:
def minArea(self, X, Y, K):
points = zip(X,Y)

res = 10**20
for x in X:
for y in Y:
# find smallest square with bottom-left corner (x-1, y-1)
def canIt(L):
return sum( (x-1 < xi < x-1+L) and (y-1 < yi < y-1+L) for (xi,yi) in points) >= K

if canIt(res) :
res = min(res, binarySearch( canIt, 0, res) )

return res ** 2


Take a look to the beautiful binary search, thanks to first class functions I can do binary search on a higher level, rather than mixing it with other code.

Post will be created automatically once the challenge phase ends.

## Friday, March 21, 2014

### SRM 613: The doubt

Third python SRM, this time I am not sure if the solution for div1 500 is doable in python, but I was very far from figuring it out so maybe the solution is actually quite fast.

## Div1 250: cats

There are many cats in the horizontal axis. You know their integral positions. Each cat has to move exactly M units left or M units right. Find the minimum gap between left-most and right-most cat possible after a single move. Positions are between -100000000 and 100000000

I will find it hard to explain it editorial-style, because it is rather obvious that there should be a pattern like RRRR...LLLL, such that some cats on the left side of this pattern move right and the rest move left. There are only O(n) of these patterns so we can try them all.

My way of acquiring those patterns was to pick a position and tell cats to move towards that position. It passed examples, and submitted quite fast (I think it is because of python's shortness of code), then I noticed a potential bug. If I tell a cat to move towards itself, then it will always pick the left side (I used <= operator) instead of maybe trying the right side. So for a while I wondered if there is a case in which the cat in question needs to move to the other direction. I eventually found that 's not the case. You don't need to try both LLLLLLLL...LL and RRRRRRRRRR...R, because the end result will be the same. So I thankfully avoided a needless resubmission penalty.

class TaroFriends:
def getNumber(self, X, T):
def towards(z,x):
return z + T if z <= x else z - T

def allTowards(x):
l = [ towards(z, x) for z in X ]
return max(l) - min(l)

return min( allTowards(x) for x in X)


Python seems to be making me trend towards functional-ish programming. Would you take a look how beautiful this code is? I checked out other people's codes and they are awful :)

So we should pick N numbers such that each number is between two bounds and the GCD of all the picked numbers is K. All the numbers in the input are between 1 and 10^9. Have fun

I didn't really do much.

The maximum number of distinct prime factors of K is 9, because the product of the first 10 prime numbers is larger than 10^9. So maybe if we focus on the prime factors... Each picked number must be a multiple of p^r if p is a prime factor of K and r its exponent in the factorization. Also, r should be the minimum exponent of this prime number for all the N numbers. Besides of this I didn't have much luck.

I initially made a huge mistake, thought N was up to 50, so I was coding some dynamic programming (which still had complicated parts like having to count numbers that are multiples of something but not multiples of a list of other numbers). When I figured this was completely wrong, I had tons of code and the match was ending, so I submitted it for challenge phase fun. Too bad the code was challenged very fast.

## So

When python works, it works, It has been a while since I had such a relatively high score in div1 easy. It was all because of python's potential for concise code.

## Wednesday, March 12, 2014

### No Google Code Jam for you!

The other day I made a #CoCPledge, I thought "Hey, Tech certainly has a diversity problem and if this can help, let's do it". Well, I don't really frequent conferences that much, so it was more of a "just in case" thing... Few minutes later, I noticed, there is no reason why this shouldn't apply to tournament on-sites. "Darn", I thought to myself, this certainly means I am not longer able to participate, attend or otherwise support a tournament that doesn't have a decent Code of conduct. If you have no idea why this is important, read this.

Well, "Not a big deal" - I thought - maybe this time the Code jam and TCO organizers will add a Code of Conduct. That's the time I noticed that I have no idea how to tell the tournament organizers to do that. The only way I could think of was to make two posts. One in topcoder's forums and one in the codejam mail list. Just asking, in all my pseudo-Socratic glory "will there be a CoC this time?".

As you can see if you head to those links, reception was...cold. In TC, I didn't get any official reply, although I did get a tweet showing slight interest:

In the case of codejam, it was worse, the codejam admin just replied with some jokes:

## Context

It is clear to me that competitive programming has the same diversity issues as programming as a whole. In fact, it may even be worse because the apparent proportion of guy coders vs. non-guy coders in competitive programming is humongous. We can speculate about the reasons for weeks, but there's no arguing this is a problem. If the likes of codejam and topcoder want more competitors, reaching to the other 50% of the population is a good step. Please note that the reason this 50% don't join has nothing to do with lack of affinity to coding or skill.

It reminds me of that other day in Topcoder arena chat before a SRM, when the guys started joking about how most topcoders don't have a girlfriend. (Note the assumption that we are all men and straight). And then the discussion devolved into some people "joking" about how girls are "stupid". I am going to argue that maybe, ... just maybe, that's the sort of thing limiting our progress towards a more inclusive competitive programming environment. We need to improve the environment and a simple first step is to show confirmation that it is safe and mature enough to have a CoC.

In the topcoder forum thread, I got replies regarding whether I noticed wrong conduct in the onsites. ... That's just not the point! I think the TCO onsite I attended was great and had great people (although my perception may have been partial and incomplete), but that just tells me that it wouldn't be difficult at all to have a CoC, if we are all decent people then that's no problem at all, right?

## So...

Everything indicates that I cannot support the Code jam this year. This means I won't participate in it, or make blog posts about it, and that includes my explanations and advances. You will not see something like this or this for codejam 2014.

It appears the same will happen about The Topcoder Open. Well, same deal, I will ignore the tournament. Also notice that this means I won't write any TCO editorials or contribute problems.

Please don't take this as a (god forbid) "boycott", it would only be a boycott if I expected tournaments to change because of missing my (quite irrelevant) "contributions". I am fairly sure the tournaments will be quite okay or maybe even better without me. And I am not asking anyone else to join me. This is more about keeping my promises. Just explaining why you won't see me involved in these tournaments as usual :)

## Tuesday, March 11, 2014

### SRM 612: Python SRM II

This was the second SRM in which I used python exclusively, I think I did just okay.

## Div1 250: The one with copy paste

You have one character, you want to reach smilies characters (smilies is between 2 and 1000). You have three operations: Delete one character. Copy all current characters to the clipboard and paste the characters from the clipboard (without replacing the current ones). What is the minimum number of steps?

Define state (s,c), where s is the current number of characters and c the number of characters in clipboard. Then a delete operation is to move to (s-1,c), a copy operation is to move to (s,s) and a paste operation is to move to (s+c,c). The rest is just to code a Breadth-first-search (Or you know, A Dijkstra where all edge weights are 1), because we want to find the minimum-length path from (1,0) to any ("smilies",c) (We start at (1,0) because there is one character and the clipboard is empty).

So I just needed to code BFS in python. I had a bit of difficulty because I forgot how to do queues in python... turns out you just do .pop(0) to remove and return the front element.

class EmoticonsDiv1:
def printSmiles(self, smiles):
v = [ (1,0) ]
dist = dict()
dist[(1,0)] = 0

def neighbors(s,c):
if s > 0:
yield (s-1, c) #del
yield (s,s)        #copy
yield (s+c,c)      #paste

# BFS
while len(v) > 0:
(s,c) = v.pop(0)
d = dist[(s,c)]
if s == smiles:
return d
for x in neighbors(s,c):
if x not in dist:
dist[x] = d + 1
v.append(x)



Why to do a BFS and not just a dynamic programming? The recurrence relation is cyclic. So, really, don't try to do dynamic programming here.

Note the cute usage of yield.

## Later

Tried div1 medium but seems difficult. Decided to skip the rest of this SRM and finish the editorial for SRM 611.

I think python really helped me in this problem. I was able to quickly modify my tester to quickly test for all possible values of smilies, so I am quite sure I passed. I had a delay remembering about pop(0), but I also coded very quickly thanks to python, I think I would have taken a bit more time in c++-

## Saturday, March 08, 2014

### SRM 611 Python review

SRM 611 was the first of 10 topcoder SRMs in which I am going to exclusively use python during the contest. During the match, I had a bit of a bad day. Now I finished the first part of the editorial so I had to solve all the problems (except the hardest one, for now). I have some things to say.

## Div2 250: The one with digits

The problem is just an implementation thing: Return if a number (provided as a string) is interesting. A number is interesting if for each digit d from 0 to 9, either it doesn't appear in the number or it appears in exactly two positions and there are exactly d other digits between those positions.

My first reaction was:

class InterestingNumber:
def isInteresting(self, x):
for d in range(10):
ch = chr(d + ord('0') )
if ch in x:
indexes = [ i for i in range( len(x) ) if x[i] == ch ]
if len(indexes) != 2 or indexes[1] - indexes[0] - 1 != d:
return "Not interesting"
return "Interesting"


It is okay, although not particularly interly or showcasing of what's different in python. What is interesting is extracting the indexes that contain ch in one line.

It can be simplified a bit, although it is unclear whether this makes the code clearer or not:

class InterestingNumber:
def isInteresting(self, x):
for d in range(10):
ind = [ i for i in range( len(x) ) if x[i] == chr(d + ord('0') ) ]
if (len(ind) > 0) and ( len(ind) != 2 or ind[1] - ind[0] - 1 != d ):
return "Not interesting"
return "Interesting"


Too imperative, maybe?

class InterestingNumber:
def isInteresting(self, x):
def validIndexes(d, ind):
return (len(ind) == 0) or (len(ind) == 2 and ind[1] - ind[0] - 1 == d)

def extractEqual(d):
return [i for i in range(len(x)) if int(x[i]) == d]

def validDigit(d):
return validIndexes(d, extractEqual(d) )

return ["Not i","I"][ all(validDigit(d) for d in range(10) ) ] + "nteresting"


The all() function is a useful builtin, returns True if all the stuff you provided it are True too.

I like this last one because it is almost pseudocode (to me). But I don't know, really. At least in general the code in this problem is quick to write in python.

## Div2 500: The one with LCM

problem statement

You are given a set S of up to 50 elements and a number x. All elements and x are between 1 and 10^9. Return "Possible" if there exists a subset S' such that the LCM (least common multiple) of all the numbers in S' is equal to x.

The solution is basically to just take the LCM of all elements in S that are divisors of x. Answer is Possible if and only if that LCM is equal to x.

The first part of all of this is to find the LCM of two numbers. We can use the formula lcm(a,b) = (ab)/gcd(a,b). Then we need to find the GCD... We use Euclid's algorithm. This takes me to how to implement Euclid's algorithm in python...

def gcd(a,b):
while b != 0:
(a,b) = (b, a%b)
return a

def lcm(a,b):
return a*b / gcd(a,b)


This is a good approximation. There is a slight problem and it is, what if 0 = a < b? The correct result in that case is b. This is not an issue in this problem specifically because no element is 0. But we can solve it by making sure b is always the smallest of the two.

In c++ you'd do :

if (a < b) {
swap(b,a);
}


But in python, there are some other ideas. For example: (b,a) = (min(a,b), max(a,b)) . But my favorite is:

def gcd(a,b):
(b,a) = sorted( (a,b) )
while b != 0:
(a,b) = (b, a%b)
return a


The sorted function will sort the tuple (a,b), so its result will be (min(a,b), max(a,b)), then we do the assignment in such a way that b will hold the minimum and a the maximum.

Now we need to extract the elements of S that are divisors of x (easy), and then take the LCM of them all... Our lcm function takes a pair of numbers, but this is what reduce was made for! reduce is another useful builtin with a functional origin story. Basically "reduce"(lcm, (a,b,c,...)) = "reduce"( lcm, (lcm(a,b),c,...) )... So calling reduce with lcm as argument will apply the lcm of all the numbers. A bit of a corner case is when the set of divisors of x in S is empty... Then reduce would get an empty set. For those cases, reduce would throw an exception unless you give it a default value. A good default value is 1.

def div2solution(S, x):
return x == reduce(lcm, [z for z in S if (x % z == 0) ], 1)


Reduce is one of my favorite things after starting to use python. It makes code so much simpler once you "get" it. The imperative alternative would be to set a variable to 1, do a loop and repeat lcm( variable, x[i] ) and things like that.

## Div2 hard: The one with elephants

This was a long implementation problem with a long problem statement that it is difficult for me to abridge and an even longer explanation for a solution that is quite obvious. So here is the problem statment and here's the editorial (Too bad topcoder is still hiding editorials to guest users, I guess those are *great* at SEO...).

The crux of the matter is, you need a brute force to pick all combinations of starting positions in the top and bottom rows. There are at most 5 columns, so there are at most 2^10 such combinations. So really, we need to iterate through all 2^(2n) subsets of the numbers from 0 to 2n-1. In the contest world, we usually use bitmasks, we are so used to them that we forget how horrific of an incomprehensible low level hack they are. The other option is to use backtracking, but that's a lot of code... In python, though, we have something better: itertools.combinations. It allows you to easily iterate through all the subsets of size r of some list.

The second issue with this problem is that you need to simulate a search that can start in the top,right,left or bottom side of the map, progress towards the opposite direction until it finds a "pond" or a square that is already blocked, and block all the cells in its path. This sort of simulation is best if we engineer it to use a single function. Nesting the function makes things just easier here:

class ElephantDrinkingEasy:
def maxElephants(self, theMap):
n = len(theMap)
mx = 0
for r in xrange(2*n + 1):                                #number of elements of subset
for prows in itertools.combinations(xrange(2*n), r): #subsets of r elements

# copy the contents of the map so we can overwrite them
sim = [ [ theMap[i][j] for j in xrange(n) ] for i in xrange(n) ]
res = 0

#the simulation in a cnvenient nested function
def simulate(x,y, dx,dy):
while (0 <= x < n) and (0 <= y < n):
if sim[x][y] == 'Y':
sim[x][y] = '#'
return 1
elif sim[x][y] == '#':
break
else:
sim[x][y] = '#'
(x,y) = (x + dx, y + dy)
return 0

for i in prows:
x = i % n
(y,dy) = (0, 1) if i < n else (n-1, -1)
res += simulate( x,y, 0,dy)
for j in xrange(2*n):
y = j % n
(x,dx) = (0, 1) if j < n else (n-1, -1)
res += simulate( x,y, dx,0 )
mx = max(mx, res)
return mx


## Div1 easy : The one I failed

Problem statement

The LCM set of a set S is the set of all numbers that can be generated by taking LCM of a non-empty subset of S. Given sets A and B, find if the LCM sets of the two sets are equal.

I failed this problem during the match. It turns out that my logic was mostly right, the mistake, however was in checking if it is possible to build a number out of some of the other elements in the set. During the match I did an overcomplicated mess based on prime factors, that was wrong. Now we know that this is actually the solution to the division 2 problem above. So we need to do the same thing "compress" both sets getting rid of the redundant numbers and then compare the compressed sets.

def compress(X):
return { y for y in X if not div2solution( [z for z in X if z != y] , y) }

def div1solution(A, B):
return compress(A) == compress(B)


Another shiny python implementation. [z for z in X if z != y] returns the elements in X that are not equal to y. Note that I return a set (surrounded by {} instead of a list, this saves me the issue of sorting after compressing (A and B are provided as tutples, so their orders may be different even though they represent the same sets).

## Div1 medium: The one with Spanning Tree

You are given 50 points. Find the Spanning Tree to connect those points through straight lines (ignoring that there might be intersections), but instead of a usual MST problem, we want the Spanning Tree with the minimum standard deviation of edge weights...

Long story short, we need to repeat the following for each dummy average m = (w_i - w_j)/2 + \epsilon where w_i and w_j are any two edges of the original graph. Find the minimum spanning tree using weights: (m - w_i)^2, where w_i is the original edge weight. Take the standard deviation of the w_i weights in this tree. One of the standard deviation of the O(n^4) trees you are going to try is going to be the answer. *Somehow* (This is a python post, not an editorial, read the long story short for what passes as a proof).

Anyway... This problem seems impossible in python through this O(n^6) approach. O(n^6) under two seconds in TC servers for n<=20 is a piece of cake for c++ and Java, but apparently not for python. This is my fastest python solution:

class Egalitarianism2:
def minStdev(self, x, y):
points = [ (float(p), float(q)) for (p,q) in zip(x,y) ]

def euclid(p, q):
return math.sqrt( (p[0] - q[0])**2 + (p[1] - q[1])**2 )

#Prim's MST algorithm:
def deviationPrim(m):
n = len(points)
par = range(n)
w = [ [ (m - euclid(points[i],points[j]))**2 for i in xrange(n) ] for j in xrange(n) ]
c = 0
reached = [ i == 0 for i in xrange(n) ]
parent = [0] * n
dist = [ w[0][i] for i in xrange(n) ]
r = [0.0] * (n - 1)

for i in xrange(n - 1):
v = min( (dist[j],j) for j in xrange(n) if not reached[j] )[1]
u = parent[v]
r[c] = euclid( points[u], points[v] )
c += 1
reached[v] = True
for j in xrange(n):
if dist[j] > w[v][j]:
dist[j] = w[v][j]
parent[j] = v

# Take the standard deviation of weights
m = sum(r) / (n - 1)
return math.sqrt( sum( (x-m)**2 for x in r) / (n - 1) )

res = 10.0**9
edges = [ euclid(a,b) for a in points for b in points ]
for (e1,e2) in itertools.combinations( edges, 2):
m = ( e1 + e2 ) / 2
res = min(res, deviationPrim(m + 1e-9) )

return res


A cool thing is how it takes the unordered pairs of edge weights to find m = (e_1 + e_2) / 2, instead of using indexes i and j, we can just use itertools.combinations yet again :).

Unfortunately, it is too slow. It needs 30 seconds to run for n = 20 in my computers, which means it should probably take 25 seconds in topcoder's servers. Could it be improved? Maybe, my python skills are still young and this is the first time I try Prim's in python. Either way, the c++ version of the code easily passes under 1 second. So it is at least much harder to pass in python.

## To sum up

Some great python opportunities in most of the match, except for div1 medium which is extremely difficult to pass in python if not impossible.

## Tuesday, March 04, 2014

### SRM 611: My first python SRM

The other day, I made the foolish decision to use Python exclusive in SRMs 611 to 620. I suspected it will be bad for my rating at least initially. I am not as experienced with python as I am with c++ and there is always a good chance the admins will include a problem in the set that is unsolvable in python. On the other hand, maybe the 250 has a quick to think solution and then python's conciseness would help me score some points.

## Div1 250: The one with LCM

You are given two sets A, B, the LCM set of a set S is the set of all numbers L such that L is equal to the LCM of the numbers in a subset of S. Given A and B find if their LCM sets are equal. Numbers in each set are between 2 and 10^9, inclusive.

Tough luck, it didn't appear this was going to be a quick problem to code, and at first I had no idea what to do.

Eventually, I thought of something that appears to be a solution. Given a set X, we can get rid of any number that can be created by taking the LCM of other numbers in the set. After doing this, set X becomes the minimal set such that its LCM set will be equal to X. If we find the minimal sets of A and B, they will be equal if and only if the LCM sets are equal.

Implementing was another issue. How can you tell if a number can be made from the other numbers in the set? Well, you would need to process the numbers in decreasing order. I focused on the prime decomposition of each number, if the exponent of at least one prime number is larger than the maximum ever exponent in the numbers smaller than it, then this number shouldn't be removed. I needed to code Eratosthenes sieve in python for the first time, it probably isn't so well coded.

I made a mistake, the first implementation would ignore the smallest number in each of the sets, so it thought {7} had the same LCM set as {8}. The slow submission plus the resubmit made me score incredibly low. I am not 100% sure about this approach, but if I fail system tests it will not make a difference, my ranking in the match will be very low either way.

import math

class LCMSet:
def equal(self, A, B):

# greatest common divisor
def gcd(x,y):
while y != 0:
(x,y) = (y, x%y)
return x

# least common multiple:
def lcm(x,y):
return x * y / gcd(x,y)

# Eratosthenes siege returns an iterator of prime numbers, we don't need
# primes larger than sqrt(max( max(A), max(B) ))
def sieve():
t = int( math.sqrt(max(max(A), max(B))) + 1 )
compos = [ False ] * (t + 1)
for i in xrange(2, t + 1):
if not compos[i]:
yield i
j = i + i
while j <= t:
compos[j] = True
j += i

# make a list of prime numbers
primes = list( sieve() )

# prime decomposition:
def primeDecomp(x):
d = dict()
for p in primes:
c = 0
while x % p == 0:
x /= p
c += 1
if c > 0:
d[p] = c
if x > 1:
d[x] = 1
return d

# "compress" a set, find the minimal set that has a LCM set equal to X
def compress(X):
decomp = [ primeDecomp(x) for x in reversed(X) ]
n = len(X)
c = []
for i in xrange(n - 1):
good = False
# for each p^r in the prime decomposition:
for (p , r) in decomp[i].items():
if r > max( 0 if p not in decomp[j] else decomp[j][p] for j in xrange(i+1,n) ):
good = True
if good:
c.append( decomp[i] )
#smallest number is always there. Missing this line caused the resubmit
c.append( decomp[n-1] )
return c

return ['Equal','Not equal'][compress(A) != compress(B)]


## Div1 500: The one with spanning tree

You are given a complete graph of up to 20 vertices with weighted edges (It is given in the form of points in a map that you have to connect using a tree topology). Return the minimum standard deviation of the edges of a spanning tree.

Easy, right? Well, the 20 constraint hints me that some heavy dp or brute force is needed. I remembered a match in which we had some sort of brute force/dp through all spanning trees, but I didn't remember enough. Also, the constraint wasn't a good sign for whether the problem is solvable in python or not :/

## The end

Plenty of very different solutions for 250. I have no idea if I will pass, I expect many system test fails.

What about python? Well, this wasn't a great match for comparisons. I think I would have taken very long to code a solution in c++ too. I am currently sleep deprived, I stayed up till very late last night trying to finish something that barely passes as an editorial for SRM 610. I guess there is a good reason I am doing this for 10 SRMs and not just a couple.

## Update: Failed system tests

What a bad day. Div1 250 approach fails the following test case: [{168, 7476, 84, 890, 10, 4, 40, 3560, 37380, 8, 89, 20}, {84, 4, 20, 7476, 3560, 89, 712, 420, 74760, 1780, 14952, 8, 356, 10}], it is a wrong answer so this is likely a mistake with the algorithm and not python's fault.

## Sunday, March 02, 2014

### We all hate the ads

Let's try something. If you like this blog and find it useful, you can use gittip to motivate me. Whenever the weekly tip goes to a value greater than or equal to 3.00 USD, I'll hide the ads in the blog once I notice. If it drops below 3.00, we get ads again. If it never goes anywhere close to 3.00 or even above 0.00, it is all right too, don't worry, don't have much expectations.