## Wednesday, May 28, 2014

### SRM 622: Anticlimatic

It was a long day, trying to finish the very late SRM 621 editorial. I eventually improvised something. I tried to learn as much as possible about suffix automata until I can at least explain the solution with some more detail than the sketches.

So then the match started, I wasn't too thrilled because it is a 9:00 PM match and they tend to be... boring and have too few coders. But oh well...

## Div1 250: The one with roads

You are given a weighted complete directed graph (A map of roads connecting each pair of cities) and a integer T. On a special holiday, for each ordered pair of distinct cities (i,j) a bus must travel from city i to city j, the path used by the bus must be the shortest possible. If there are multiple shortest pats, the bus driver may pick any of them - We don't know which. A road between two cities is unsafe if it is possible that more than T buses will go through it. Find the total sum of lengths of unsafe roads.

The thing here is how to know if a road might be used by a bus. There are O(n^2) buses and each might use each road. For pair of cities (a,b) , then road (x -> y) is used if and only if: mindist[a][x] + road[x][y] + dist[y][b] = dist[a][b]. Where mindist[p][q] is the minimum distance between cities p and q and road[p][q] is the length of the direct road connecting them. We can find mindist[p][q] in O(n^3) with Floyd-Warshall, sure, we could do something even faster, but it is not needed because the logic (picking a,b,x,y is O(n^4).

Code some cute python code, submit:

class BuildingRoutes:
def build(self, dist, T):
n = len(dist)
allN = range(n)
# turn string list dist into a matrix:
dist = [ [ord(x)-ord('0') for x in row] for row in dist ]
# number of times a road might be used
roadcount = [  * n for i in allN ]
INF = 10**9
# Floyd-Warshall
mindist = [ [dist[i][j] for j in allN] for i in allN ]
for k in xrange(n):
for i in xrange(n):
for j in xrange(n):
mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j])
# Find potential uses:
for a in allN:
for b in allN:
if a != b:
for x in allN:
for y in allN:
if x != y:
if mindist[a][x] + dist[x][y] + mindist[y][b] == mindist[a][b]:
res = 0
for i in allN:
for j in allN:
if roadcount[i][j] >= T:
res += dist[i][j]
return res


Unfortunately, I forgot to test the largest case before submitting the solution. I test the largest case and it turns out it times out. Very disappointing that it times out in python. And it is only by a few seconds. I had to remake the code in c++, lost plenty of points because of spending double time and recoding. I should know better, need to keep in mind that 50^4 is too bad for python.

int build(vector<string> dist, int T)
{
// O(n^4) too slow in python, had to migrate code to c++ . arg
int n = dist.size();
const int INF = 1000000000;
vector<vector<int>> mindist(n, vector<int>(n,INF));
// Floyd-warshall
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
mindist[i][j] = dist[i][j] - '0';
}
}
for (int k = 0; k < n; k++) {
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
mindist[i][j] = min(mindist[i][j], mindist[i][k] + mindist[k][j]);
}
}
}
// for each pair of cities (a,b), find if a road(x,y) could be used:
for (int a = 0; a < n; a++) {
for (int b = 0; b < n; b++) {
for (int x = 0; x < n; x++) {
for (int y = 0; y < n; y++) {
if ( (x != y) && (a != b) ) {
if (mindist[a][x] + (dist[x][y]-'0') + mindist[y][b] == mindist[a][b]) {
}
}
}
}
}
}
int res = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
if (roadcount[i][j] >= T) {
res += dist[i][j] - '0';
}
}
}
return res;
}



## Div1 500: The one with tree dp (Tree dp again...

You are given a weighted tree. We want to divide the tree into a minimum number K of connected subtrees such that their diameter does not exceed K. (Diameter is the maximum distance between two vertices in a subtree).

Okay, I didn't have time to solve . I got distracted and was already quite late because of 250. But here is what seems like a solution:

Imagine the root. Either THE root, or a node we are treating as root because we cut the edge above it. Then we have to decide if the children of the root will be used in the subtree or not.

In general, it is best to use a child instead of not using it. Because if we do not use it, we forcefully are creating a new tree, but if we do not, we might be able to avoid it. The issue is how to balance out the lengths of the subtrees that we will keep conencted to the root.

This is a neat trick: Let M be the maximum distance between the root and the nodes in one of the arms of the subtree. Then the maximum distance between the root and the nodes in the other arms cannot be greater than K - M. We can pick one of the children, decide it will be the one with distance M, and then the remaining children must have arms of distance of at most K - M. If the edge connecting the root to a child is larger than K - M, we cannot use this child and must create a new tree.

We need to solve a subproblem: Do the decision for a subtree rooted at a node X, such that the distance between X and any grand children cannot exceed M and the diameter cannot exceed K. This is solved in a similar way, picking a children of X to have the maximum distance.

All in all, we have to solve the problem for f(X, M) , M will not exceed 500, (because K is at most 500), so we will be fine.

## Challenge phase / etc

Challenge phase already started, I will go check what hapens. Scheduling this post so that it is posted when the challenge phase ends.

## Tuesday, May 20, 2014

### SRM 621: Invisible slowness

So another SRM day. I've been with headache since yesterday so running at half the steam...

## Div1 275: The one with the geometry

So you have a circle (signal) centered at (0,0) with radius S , S is a real number picked uniformly at random from 0.00 to Z. There are some other circles (cities) each with center (x,y) and radius r. Things are good if, for each city, the main circle (signal) either doesn't include the city or it includes ALL of the points inside the city. Return the probability that this happens. The answer can have an absolute or relative error or 1e-6 (whichever is smallest)

So that's a nice thing about the allowed error. After topcoder added the ability to include a custom checker in problem sets, I thought it was only a matter of time until someone makes a problem that allows less precision than the usual 1e-9. The first thing I did in this problem was take my special python tester template that allows you to test automatically even when there is a custom checker. I used CNP super powers to paste the usual python checker, but replacing the 1e-9 with 1e-6... I really wanted to have accurate tester results.

The first thing is to notice what makes a radius S valid for a specific circle with (x,y,r). A nice trick in problems involving plenty of circles is to consider only the distance between their centers. Eventually you will notice that if the distance between (0,0) is d, then there are two ranges in which is S is valid: S <= d - r or S >= d + r. Something that caused me to take a bit of time in this sub-problem, because of a special corner case: What if (0,0) is inside circle (x,y,r) ? Then d-r is negative. It turns out that it doesn't matter. S cannot be negative, and thus: S <= d - r or S >= d + r is still a good condition.

Once we know that condition. How to deal with the probability? You should see those d-r and d+r (for all cities) as the only points where the validity of S could change. So we can divide the [0,Z] interval using those points as delimiters. Now we can test each segment [a,b], we know that a or b are points in which the result could change, so all points exclusively inside (a,b) will have the same result. S is valid for (a+b)/2 if and only if it is valid for all points in interval [a,b]. For each valid interval [a,b], if any of those points is picked for S, it is valid just add (b - a) / Z to the probability.

class RadioRange:
def RadiusProbability(self, X, Y, R, Z):
cities = zip(X,Y,R)

def goodCity(x,y,r,s):     #city completely inside circle of radius s or not
d = (x**2 + y**2) ** 0.5
return ( (s <= d - r) or (s >= d + r) )

def goodRadius(s):         # each circle must be completely included or not
return all( goodCity(x,y,r,s) for (x,y,r) in cities )

def getEvents():
yield 0.0
yield Z
for (x,y,r) in cities:
d = (x**2 + y**2) ** 0.5
yield d + r
yield d - r

# remove invalid events (less than 0, more than Z), duplicates (using set)
# and turn into a list (so we can use indexes) and sort
# (possibly unnecessary to call sorted, the set likely already sorted)
evs = sorted( list(set( x for x in getEvents() if (0 <= x <= Z) ) ) )
good = 0.0
for i in range(0, len(evs) - 1):
p = (evs[i] + evs[i+1]) / 2
good += evs[i+1] - evs[i]

return good / Z


I knew how to solve the problem since the beginning, except dealing with the corner case that turned out not to exist anyway. I am not sure why it took me so long to solve it, but it did.

## Div1 500: tree stuff

Seems hard, I was tired and with headache so I went to lie down while I supposedly thought of the problem, but couldn't think of much.

## Challenge phase

During the challenge phase I discovered why people were faster than my in 275, it turns out that S is invalid if and only if it intersects with any circle (city) in the input. While most of the code remains the same, this makes reaching the code a bit easier and also removes the need for checking for validity of (a+b)/2at the end, instead you just do a line sweep. (Each event d-r increments the number of intersecting circles, each d+r decrements it, sum up intervals in which the count is 0.)

## Sunday, May 18, 2014

### Auto-test Topcoder problems with multiple correct answers

Topcoder have been improving the decade old contest system a bit. Things like allowing larger constraints, possibility of different time/memory limit per problem, etc. But the greatest change of them all, the one that actually brings problem variety rather than Fake Difficulty is that some problems allow multiple correct answers.

The first SRM to have these problems was SRM 617. The division 2 easy and division 1 medium allow multiple correct answers. They were interesting problems that would not have worked quite as well without this ability.

In SRM 617 div2 medium, you have to return two composite numbers that add up to n. n is at least 20 and at most 1000. It turns out there is a very cool and easy solution: Return (4, n-4) if n is even, else (9, n-9) if n is odd. But if this problem appeared in old topcoder, how would we have led coders to this solution? We would have had to specify tons of things. Like, the first integer must be the smallest composite possible that makes the result as intended. But then there would be odd cases in which (4, n-4) is an answer, and then the solution would be more complicated (not appropriate for div2 easy). Even if we did all that, the problem wouldn't be very nice, because the results of the example cases's results would give the solution away too easily...

However, the bad thing about these new problems is that, it is Topcoder, and many of us are used to Topcoder Arena plugins that generate automatic testers based on the example test cases. If we used our old arena plugins on these problems and the correct results of our solution didn't exactly match the expected solutions, the automated tester would tell us that the results are wrong. Even when they are actually right. We need some changes.

So after SRM 617 I made plans to work as hard as possible in making my Greed tester setup able to work around this a bit. This is the story.

## Identifying those problems

The first thing to do is to identify the problems that allow multiple correct answers for a test case. This requires help from the Topcoder Arena developers. The first thing I did was, during SRM 617 I told admins to remember that arena plugin API needs a way to detect this. Then I waited, and waited. Then I figured, there is a thread I know that seems to be monitored closely by Arena developers. Making post in this thread seems to be the only way I have to contact them, and so I did. I really the plugin API was documented at all and that each change was announced and explained and all new methods were known. But this is not the case. Even after receiving confirmation that this feature was in the API, I had to dig through the Jar looking for something that might be it.

problem.getComponent().isCustomChecker();
// where problem is com.topcoder.client.contestant.ProblemComponentModel


So now I just needed to add the feature to Greed so that any template was able to identify these problems and then we can do the magic.

I made a pull request, that PR adds a Problem.HasCustomChecker field. You can use it in templates: ${if ProblemHasCustomChecker}...${end} . The Pull request is still not merged and thus you will have to do some magic to compile greed using it. If you want...

## Modifying the tester templates

We also need to modify the tester so that it can make use of this information. But how?

My idea of how to allow automated testers in these problems consists of two parts:

• My tester already supports test results in which it is not known whether or not the returned answer is correct. And it can already report results as such (Using a ? symbol). Testing is still useful because you can get the results for all examples with a single run and you can also detect any example that times out or has a runtime error. So the first idea is to make the generated test code to make all test cases do this when the problem has this flexibility.
• However, that doesn't fix that maybe the coder wants/needs also to detect wrong answers. The only workaround to this is to allow user to code a checker method for the returned output.

In many problems, it is easier to check if an output is correct than to solve the problem. For example, the problem I described above. It is easy to check that the returned value is a pair of integers, that they are both positive, they add up to n and they are composite. So we can make a tester.

In other problems, it may not be so easy to know if a result is correct without solving the problem. But if you already know one correct result for that input, then it is still easy to write a grader. For example. A problem that returns the shortest sequence following some conditions. If you know that the provided example case result gives a 3 elements sequence, then we can tell that any sequence with more than 3 elements is wrong and can report as such making use of this result.

My idea is to have a checker method in the problem class. It should take input, resulting output and the example' expected output (if it exists) If checker method returns 0, then it is not known if the provided answer is correct. If return -1, we know it is wrong and if it is 1, we know it is correct. Initially, this method returns 0 in all cases, but it can be modified by the user.

I implemented this in python, it looks like this:

class SilverbachConjecture:
def solve(self, n):
if n % 2 == 0:
return (4,n - 4)
else:
return (9,n - 9)

# CUT begin
#-------------------------------------------------------------------------------
(n,) = testInput
#You can implement a custom checker.
# Return value: -1 is WA, 0 is unknown and 1 is AC.
# expectedAnswer is not None, then expectedAnswer is a known correct answer.
if len(returnedAnswer) != 2:
return -1
if (a + b != n) or (a <= 0) or (b <= 0):
return -1

def isComposite(x):
p = 2
while p*p <= x:
if x % p == 0:
return True
p += 1
return False

return 1 if isComposite(a) and isComposite(b) else -1


The trick is that the generic tester is a separate file, but I just updated it a bit to detect if a greedCheckResult method exists, if it does, then it does the required logic.

I am not sure yet how to do this in c++, yet, but I should try something tomorrow.

Besides of modifying the generic tester, I also modified my python template. This is the part that makes use of the HasCustomChecker field:

# ${Contest.Name} -${ClassName} (${Problem.Score} points) by vexorian :${<if Problem.Description.Modulo}
MOD = ${Problem.Description.Modulo}${<end}
class ${ClassName}: def${Method.Name}(self, ${Method.Params}): return${Method.ReturnType;zeroval}
${<if Problem.HasCustomChecker}${CutBegin}
#-------------------------------------------------------------------------------
(${Method.Params},) = testInput #You can implement a custom checker. # Return value: -1 is WA, 0 is unknown and 1 is AC. # expectedAnswer is not None, then expectedAnswer is a known correct answer. return 0${CutEnd}
${<end}${CutBegin}
${<TestCode}${CutEnd}


The relevant changes to the generic tester affect only the run_tests function:

def run_testcase( problemClass, methodName, testInExpected, caseNo, totalCase, testTimeOut, finalLines, compactMode ):
(testIn, expected) = testInExpected # so that it compiles in python3
if compactMode != ONLY_REPORT:
sys.stdout.write(TEST_COLOR + "Test " + str(caseNo) + COLOR_RESET + ": " + prettyStr(testIn)+"\n")
startTime = time.time()
instance = problemClass()
exception = None
try:
result = getattr(instance, methodName)(*testIn);
except Exception as e:
ln = None
for x in traceback.extract_tb(sys.exc_traceback):
if x == problemClass.__name__+'.py':
ln = x
if ln is None:
exceptionShort = '%s, (unknown line)'%( type(e).__name__ )
else:
exceptionShort = '%s, line: %d'%( type(e).__name__, ln )
exception = traceback.format_exc()

elapsed = time.time() - startTime   # in sec
if compactMode != ONLY_REPORT:
sys.stdout.write("Time: %.02f seconds.\n" % elapsed)

knownAnswer = (expected is not None)
if exception is not None:
if compactMode != ONLY_REPORT:
sys.stdout.write("RUNTIME ERROR: \n")
sys.stdout.write(exception)
correct = False
else:
if hasattr(instance, 'greedCheckResult'):
check = instance.greedCheckResult(testIn, result, expected)
correct = (check >= 0)
knownAnswer = (check != 0)
else:
correct = expected is None or tc_equal(expected, result)
if compactMode != ONLY_REPORT:
if not correct or not knownAnswer:
sys.stdout.write("\t%s\n" % prettyStr(expected) )
sys.stdout.write("\t%s\n" % prettyStr(result) )

res = '?'
if exception is not None:
res = 'E'
elif not correct:
res = 'X'
elif elapsed > testTimeOut:
res = 'T'
res = '+'

# final lines:
s = ( TEST_COLOR + "t" + prettyCase(caseNo, totalCase) + COLOR_RESET + ": " + colorTestResult(res) )
s += (" (%.2fs)" % elapsed)
if res in ('?', 'X'):
s += (" (" + prettyStr( result)+ ")" )
elif res == 'E':
s += (" " + exceptionShort)
finalLines.append(s)

if compactMode != ONLY_REPORT:
sys.stdout.write(" %s\n" % colorTestResult(res) )
sys.stdout.write( BAR_COLOR + ('='*BAR_LENGTH) + COLOR_RESET + "\n")
return res


I'll eventually make a more formal release of this update once the field is added to the official Greed and I figure out a clean way to do this in c++.

Sure , it is possible that the checker you implement has a bug and thus it reports your results as correct when they are not, you submit and fail system tests. But it is better than nothing.

Victory (for now): ## Sunday, May 11, 2014

### The greatest topcoder t-shirt never made

The other day I received my SRM 600 t-shirt. If you kept attention you'd remember I did terribly in SRM 600 / SRM 600.5 , so you are wondering what's up with that? Simply, turns out all the problem setters / testers who worked in SRM 600 got their t-shirts too. And I was the editorial writer, so this is actually the first t-shirt I got because of writing editorials rather than because of competing.

If you are wondering how the SRM 600 t-shirt looks like. In the front it says [SRM 600] in large letters the back , it has the new Topcoder logo. It is black.

But this blog post is not about that t-shirt. But about this old TC forums bit:

Members, admins, country wo/men:

We're trying something a little different for the TCCC T-shirt this year.

Thanks to the thread about best member quotes, http://forums.topcoder.com/?module=Thread&threadID=512175&start=0 we thought, "Wouldn't it be cool to see a member's quote on the back of the TCCC T-shirt?"

So, here's what we'd like you to do. Search your little hearts out for your favorite member quotes. Post each one individually, so the member community can use the feedback symbols (+/-) to show their like or dislike of each quote.

Then, on September 11, TopCoder will select the top five quotes. A quote will be chosen if it meets all of the following requirements: member popularity, originality and TopCoder/programming subject matter. On September 18 everyone (admins included) will vote from the chosen five for their absolute favorite.

There won't be any prize money given, but won't the winner feel warm and fuzzy knowing a little part of him or her is being worn by 2000 people? I know I sure would.

To get the ball rolling, I've posted a few of my favorites below.

C'mon, let's see some more now!

(This was eons ago, back when TCCC was a thing) sql_lall wrote:

I have this quote on my wall at home :) I'm wondering though, how long it'll be until someone changes their quote just for this thread. i.e. ""I memoized inside a ternary search and all I got was this lousy t-shirt" ;)

[Edit: removed my member quote suggestion, as I'm guessing most people are voting for the quote mentioned above. But then again, I could be wrong :p]

So, this is awesome, what happened with this t-shirt idea? Back then, topcoder decided t-shirt contents using the Schulze method. (Topcoder actually deciding anything using a poll now sounds too Alien of a concept). And another idea won: doh

I mean really, An Edison quote? EDISON?? Nobody likes that guy anymore. Disappoint.

## Saturday, May 10, 2014

### SRM 620, python fails again

This was the last of the 10 SRMs I decided to use python exclusively on. Also definitely the one in which the poor topcoder python support did the most harm.

## Div1 250, the one with gcd

Game: You start with a pair of positive integer (x,y), in each move you can move to (x,x+y) or (x+y,y). You are given two paris (a,b) and (c,d) , where each integer is up to 1000000. You want to know if there is a pair (x,y) such that both (a,b) and (c,d) are reachable using it as a starting position. Return the largest possible sum x+y in that case else -1

The idea is to go backwards. From (a,b), there are two candidate backward moves: (a-b,b) and (a,b-a), however, since the integers must be positive, only one of them is valid: If (a>b), then pick (a-b,b) , if (a<b) pick (a,b-a) , if they are equal, this is the time to stop, because else one will go negative.

So it is easy to make the two sets of pairs (x,y) that can reach both (a,b) and (c,d), we could find if the two sets have an intersection, if they do, return the maximum sum. Easy:

Notice that when one of a or b is 1, and the other is 1000000, there are 1000000 elements in one list. So there can be way tons of elements and looking for the intersection is hard unless we go clever and use sets that work in logarithmic time to find if they contain stuff or not. (Or maybe a hash table, but hash tables really blow) In c++, it means std::set or std::map, in python:

class PairGame:
def maxSum(self, a, b, c, d):
# find all the pairs that can be reached going backwards from (a,b):
def backwards(a,b):
while a > 0 and b > 0:
yield (a,b)
if a >= b:
a -= b
else:
b -= a
# find the intersection between the two sets:
inter = set( backwards(c,d) ) & set( backwards(a,b) )
return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)


I submitted it and scored 225~ points. I was so proud of this code. I knew that people coding in c++ or one of the joke languages would be making totally awful things, like repeating the code used to go backwards from (a,b), copy and pasting it to do it with (c,d) but doing the slightly different search.

Unfortunately, it times out with (a,b) = (1000000,1) and (c,d) = (1000000,1), or similar. I could notice that it can barely run around 1 second when none of the integers is 2. So the only way to fix this is to take care of the cases when any of them is 1 individually (without making huge loops). This complicates the code a ton, and I needed tons of resubmits before getting the following code which might still be wrong:

class PairGame:
def maxSum(self, a, b, c, d):
# this wouldn't be necessary if python wasn't too slow for O(n) when n = 1000000
if a == 1:
# The path to (a,b) is (1,1), (1,2), ... (1,x), ... (1,b)
#can we reach (1,x) going backwards from (c,d) ?
#if so, what's maximum x that we can reach?
while (c > 1) and (d > 0):
if d > c:
d = d % c
else:
c = c % d
if c == 0 and d == 1:
c = 1
if c == 1:
return min(b,d) + 1
else:
return -1
if b == 1:
return self.maxSum(b,a,d,c)
if c == 1 or d == 1:
return self.maxSum(c,d,a,b)

# Let's pretend the followign is the only real part of the code, okay?
def backwards(a,b):
while a > 0 and b > 0:
yield (a,b)
if a >= b:
a -= b
else:
b -= a
inter = set( backwards(c,d) ) & set( backwards(a,b) )
return -1 if len(inter) == 0 else max( x+y for (x,y) in inter)


(In fact, at first it seemed like the main problem was memory, but even after fixing the memory, it is still too slow

All those using c++ or Java or perhaps even C# wouldn't have this issue. It caused me to lose plenty of points in resubmits and I might fail still. The problem is much harder when n = 1000000 is too slow. And the worst thing is, that difficulty wouldn't have changed much for c++/Java if the constraint was 500000, but it would have saved python. Well, too bad.

## Div1 500

I had little time. All I can say is that the problem is the same as if you just picked a sequence of skills and sorted by this sequence of skills lexicographically. The rest is fuzzy

## Div1 250: The one with stacks

You start with up to 50 piles of stones. Two players play a game alternating turns. In a given turn, a player must pick three piles: (a,b,c) such that pile a has at least 2 stones. Pile a is split into two non-empty piles and their amounts are added to pile b and pile c. If a player cannot find any valid turn, they lose. Find if the first player is guaranteed to win or to lose assuming the two players play optimally.

During the match, I did a dynamic programming based on the fact that the game state can be described by two variables (p,q) where p is the number of piles with more than 1 stone and q the number of piles with 1 stone. This leads to an O(n^2) dp.

However, as it turns out, the problem can be solved in O(1) time. Note that every valid step will always reduce the number of piles by 1. It turns out that all cases with an even number of piles are losing states, so if the number of piles is odd AND there is at least one valid move, then the state is winning.

Need to prove that all even states are losing ones. This is obviously true for n = 2. So for n >= 4, we figure that the only losing states with n-1 are those that consist entirely of 1s. However, it is impossible to reach such a state using any valid number of moves, because after any valid move, at least two of the piles will be greater than 1.

## Div1 600: ouch

This problem's statement is very long to copy and to re-write. So I will skip to the solution.

Imagine you wanted to calculate f(x,i), the minimum cost to assign skills to the sub-tree rooted at x if you already know that one of the skills chosen for x is skill i.

We can easily find f(y, j) for any child of x and skill j. Out of all the skills assigned to x and its children, it should be possible to make a permutation of skills. Let's focus on this permutation. There are two possibilities.

• Use i in the permutation, meaning that all children must use skills different to i.
• Use a different skill in the permutation for x. All children and x need to pick a unique skill.

Using the values of f(y,j) and some ad hoc stuff, you can easily build a table of costs for each case, the cost of assigning each of the employees to each skill in the permutation. Then you have a minimum-cost bipartite matching. Too difficult to implement in the little time I had after figuring this out.