Starting to write this 4 hours before the contest ends, I might be away of the keyboard until the time it ends, so I think this is it, no more solutions from me. I am a bit downed because my C-large 1 will fail because of an overflow bug...

## Preparation

I did many things to prepare to this round. The usual ones were to set up the gmp library. It allows c++ to have arbitrary precision, and if you use a "typedef mpz_class big", the bignum support of c++ becomes one of the best you can find. This is a measure to shut any attempts to pull artificial difficulty or to brag about how some problems are not as good to be solved by C++ (lies, just get the right library, C++ always wins for contests, </extremist> ).

Last week, on Saturday I bought a precious CPU/Motherboard/RAM upgrade. I jumped from Core 2 Duo to an Ivy Bridge Core i5. In codejam, your processing power matters. Well, processing power is *always* useful for programming contests, because it speeds up your local testing. In the case of codejam, it can be a game changer. I quickly learned that the average algorithmic program takes half the time it used to take in my old Core 2 Duo. Also, I now have four Cores instead of two, if I somehow manage to use them all...

I had plenty of reasons to update my codejam template. The old template supported two processes, but it was a very rudimentary support - It split the input file in two, and ran a process on each half. With time I found it was not very effective, because usually one of the halves had a lot more processing needs than the other. Most of the times, I ended with a process finishing its job quickly and the other still taking quite some time. The other issue with the old template is, of course, that it only supported 2 processes, I need 4 processes so that I use all my CPU cores.

From experience out of TopCoder Marathons, I was able to finally make a template that creates three extra processes. Each process has to read the input entirely (This is a flaw that adds overhead, but is sort of needed not to complicate the problem, there are too many different ways a problem could need input/output). For each read case, the process verifies if it is taken by any of the other four processes, if not, starts solving the case. The main process then joins all the results together once they finish.

As I did this (around 2 hours before the contest), I noticed that C++0x has actual thread support, there are hopes I might be able to optimize this in a way that does not need four processes, but four threads, but that's work for another day.

I also had to set up the codejam command line tool. After the first time I failed a small input because I submitted the wrong output file, it became clear I needed to automatize things.

## Problem A - Tic-Tac-Toe-Tomek

Link to problem statementSo it is a 4x4 tic tac toe variation which begins with a single T (as opposed to X and O, the players' symbols) which acts as a wild card (It can be part of the winning line).

A pure implementation problem, you just need to be able to detect if anyone won. If no one won, then the game finished in a draw as long as there are no empty cells.

I think I took way too much time to solve this. I tried to be clever and made a function to quickly verify a line. Unfortunately, this cleverness brought a silly bug that leeched away precious time in debugging.

`char board[4][4];`

int sign(int x) {

return ( (x == 0) ? 0 : (x / abs(x) ) );

}

// Checks if a line has 4 player cells or 3 player cells and one T.

// Coordinates of the cells in the line are:

// (x0,y0), (x0+sx, y0+sy), (x0+2*sx, y0+2*sy), (x0+3*sx, y0+3*sy)

// Where sx = sign(x1 - x0), sy = sign(y1 - y0).

bool lineCheck(int x0, int x1, int y0, int y1, char p) {

map<char, int> c;

int y = y0;

int x = x0;

for (int i=0; i< 4; i++) {

c[ board[x][y] ]++;

x += sign(x1 - x0);

y += sign(y1 - y0);

}

return (c['T'] + c[p]) == 4;

}

string solve() {

// count empty cells:

int e = 0;

for (int i=0; i<4; i++) {

for (int j=0; j<4; j++) {

e += (board[i][j] == '.');

}

}

// Find out if any player wins:

bool wins[2] = {false, false};

string p = "XO"; // for i=0, check X, for i=1 check O

for (int i=0; i<2; i++) {

// diagonal

wins[i] |= lineCheck(0,4, 0,4, p[i]);

// anti diagonal

wins[i] |= lineCheck(3,-1, 0,4, p[i]); //My mistake was 4,0 instead of 3,-1

for (int j=0; j<4; j++) {

//horizontal lines (four of them):

wins[i] |= lineCheck(0,4, j,j, p[i]);

// vertical lines

wins[i] |= lineCheck(j,j, 0,4, p[i]);

}

}

if (wins[0]) {

return "X won";

} else if (wins[1]) {

return "O won";

} else if (e > 0) {

return "Game has not completed";

} else {

return "Draw";

}

}

## Problem B - Lawnmower

Link to statementHow about you read the statement so that I don't have to explain it?

Ok... There is one thing to understand. For each row, find the maximum height in that row. We can clearly just run the mower on that row with that maximum height as setting. The heights of all grass squares in that row will either become exactly what the pattern wants them to be, or a larger height (which we can cut by using another mower run).

Then we can do the same about columns, run the mower using a height setting equal to the maximum height in that column.

If we did any different kind of move. If we ever used a setting smaller than the maximum of a row or column, we would definitely have a wrong pattern. Some grass squares (the ones that are supposed to be equal to the maximum height of the column/row) would become smaller than what you wanted.

If after doing these moves for each row/column, the pattern is not equal to what we wanted, it should be impossible to do it.

The solution is then to verify. Each cell must be either equal to the maximum height of the column that contains the cell, or the maximum height of the row that contains the cell. Else it would not be possible to do it.

`int N , M;`

int a[100][100];

string solve( ) {

int rowMax[N];

int colMax[M];

// Find the maximum for each row:

for (int i=0; i<N;i++) {

int mx = 0;

for (int j=0; j<M; j++) {

mx = std::max(mx, a[i][j] );

}

rowMax[i] = mx;

}

// Find the maximum for each column:

for (int j=0; j<M; j++) {

int mx = 0;

for (int i=0; i<N; i++) {

mx = std::max(mx, a[i][j] );

}

colMax[j] = mx;

}

//Verify:

for (int i=0; i<N; i++) {

for (int j=0; j<M; j++) {

if ( (a[i][j] < rowMax[i]) && (a[i][j] < colMax[j]) ) {

return "NO";

}

}

}

return "YES";

}

## Problem C - Fair and Square - Small

Link to statementSince **B** is at most 1000, we can just do a straight forward check. For each integer between **A** and **B**, verify it is palindrome, find its square (if it exists) and verify if the square root is palindrome too.

I decided to skip the large inputs, I noticed that you only need the four small inputs to be correct in order to advance to round 1. So I rushed that objective.

## Problem D - Treasure - Small

Link to statementOuch. This is a problem I wanted to write since a long time ago. My inspiration for this sort of riddle about Interchangeable Antimatter Keys, comes from Xye. Tutorial 10 has a puzzle about choosing the right order to use the keys, but it was more complicated.

Ok, let us focus on the small input. The number of chests is at most 20. This is too large for us to just try a brute force of the 20! different permutations, but it is not too large for another approach that is also a bit standard...

Let us say you have opened a set of chests. Once you opened the chests, the order in which you opened the sets does not matter, only which of the chests you opened. From a single bitmask from 0 to 2^{20}-1, you can find out which chests were opened and calculate the number of keys of each type that you currently have (Sum between initial keys and the keys inside the chests you opened minus the keys needed to open all of the chests). When you know which keys you have, you know which of the chests you can open.

Let us just use dynamic programming. Make a function f(**mask**) that returns the lexicographically smallest order of the chests that are inside the **mask**, assuming that we have already opened the other sets.

f(0) is an empty vector, we have already opened all the chests, so the remaining chests are an empty set. This is the base case.

In any other case, calculate the number of remaining keys of each type. Then for each chest that can be opened, find out what would happen if we open it. For chest **j**, the solution of the remaining chests after opening **j** will be f(**mask** - (1<<**j**), if this solution exists, then we can append **j** to the front of the solution, and we now have a possible solution for f(**mask**), pick the lexicographically smallest one.

Overall result is: f(111111...1) (base 2, as many 1s as bits).

`int N, K;`

int initialKeys[40];

int chests[20][40];

int chestKeys[20];

int chestType[20];

vector<int> mem[1<<20];

void solve( ostream& cout ) {

// base case:

mem[0] = vector<int>(0);

const int INF = N*(N + 1);

// Try each of them:

for (int mask=1; mask <(1<<N); mask++) {

// A single element containing infinite is our way to mark a state

// as having no possible solution.

mem[mask] = vector<int>(1, INF);

// Count the number of keys of each type:

map<int, int> keys;

for (int i = 0; i < K; i++) {

keys[ initialKeys[i] ]++;

}

for (int j=0; j < N; j++) {

if ( !( (1<<j) & mask ) ) {

keys[ chestType[j] ]--;

for (int k = 0; k < chestKeys[j]; k++) {

keys[ chests[j][k] ]++;

}

}

}

// don't worry, if any number of keys is negative, this state will

// never be reached from an upper state, so its result would not matter

for (int j=0; j < N; j++) {

if ( (1<<j) & mask ) {

if (keys[ chestType[j] ] > 0) { // We can open chest j.

// Find out what happens if we open chest j:

vector<int> other = mem[ mask - (1<<j) ];

// If there is a solution, append j:

if ( (other.size() == 0) || (other[0] != INF) ) {

vector<int> nw( other.size() + 1);

nw[0] = j+1;

for (int i=1; i<nw.size(); i++) {

nw[i] = other[i-1];

}

// Remember the best solution so far.

mem[mask] = std::min( mem[mask], nw) ;

}

}

}

}

}

// Result is f( (1<<N) - 1):

vector<int> &res = mem[ (1<<N) - 1];

if (res[0] != INF) {

cout << res[0];

for (int i = 1; i < N; i++) {

cout << " "<<res[i];

}

} else {

cout << "IMPOSSIBLE";

}

cout << endl;

}

This is when the preparations I made started to fall down. The command line tool does not work for problems above the small input of problem C, because of the unusual way problem C works (two large inputs).

## C - large 1

So I advanced, but I have a reputation to maintain and I cannot afford having a low score...

This is when I noticed that 10^{14} is not so large. The square root of the numbers between 1 and **B** will be at most 10^{7}, since the numbers must be perfect squares, we only need to iterate through the possible square roots. For each number between 1 and 10^{7}, find if it is palindrome, then square it, if the
square is palindrome and the square is between **A** and **B**, then you got a solution.

So I downloaded the first large input. My processes, even with 4 cores were taking too long to solve it. I didn't notice that although Large 2 has 1000 input cases, large 1 has 10000 input cases... That is too much. I shut down the execution after 2 minutes. I had 6 minutes to make a faster solution and to submit it. It was difficult , but I was able to do it.

You only need a run from 1 to 10000000, to find out all of the "Fair and Square" numbers from 1 to 10^{14}, there are actually just a few of them, 39 (if I remember correctly), so you can just keep them in an array and then, when requested the result for **A** and **B**, look through the array and count the numbers between bounds

## C - large 2

I tried to think of ways to adapt the previous idea to numbers of 100 digits.

For example, since the square root must be a palindrome, then you can also brute force for the first half of the palindrome number. There are about 10^{25} ways to find the first half of the palindrome square root. It is still too large.

Hours later, it finally occurred me a way to fix this issue. Clearly, from what I could see from the 10^{14} solution, there are very few fair and square numbers. Why is it?

I then did something clever, I made the program show not only the fair and square numbers, but also their square roots. I noticed something interesting. Besides of 3, all of the other square roots were composed only of digits 0, 1 and 2.

If this rule is true, then the solution just became faster. There are about 3^{25} different ways to make the first half of the square root.

This solution was still a bit slow to run, even if I managed to split the work load in 4.

The question is, why should the digits be only 0,1 and 2? Well, imagine that you have a palindrome number, and you square it. That the square is palindrome as well would be quite strange... For the initial symmetry not to get lost, there would have to be no carriage in the multiplication. Whenever there is a 3 digit or higher (except when there is only one) there must be a carriage operation (Because as it is a product between a palindrome and itself, all combinations of digits will be used).

Most importantly, if a half of a string causes a carriage operation , then any other half of a string that adds any digits to it will also have a carriage operation. We can crop the search.

After cropping the search, even for B=10^{100}, the search takes little time to run. So we can solve the problem.

I wonder if there is any mistake in this analysis, I might find out later when it turns out that my solution is wrong.

`// I told you c++ can do big nums:`

// http://gmplib.org/

#include "gmpxx.h"

#define big mpz_class

struct solver

{

big A, B;

char x[100];

char y[100];

bool palindrome(string s)

{

int i = 0, j = s.length() - 1;

while (i < j) {

if (s[i] != s[j]) {

return false;

}

i++; j--;

}

return true;

}

int theCount, lim;

void find(int len, ostream& cout) {

// We try each string of 1,2 and 0 digits as the first half of the

// palindrome.

// Make them, and then raise that result to square. If the result

// is a p

bool good = ( len == 0 ) ;

if (len > 0) for (int k = 0; k < 2; k++) {

//There are two palindromes that have this first half.

// For example, for 12 : 1221 and 121.

// So each value of k tries one of them.

int l2;

if (k == 0) {

l2 = len*2 - 1;

} else {

l2 = len*2;

}

for (int i=0; i < len; i++) {

y[i] = x[i];

y[l2 - i - 1] = x[i];

}

y[l2] = 0;

// The square of the palindrome, if it is within limits:

if ( 2 * l2 - 1 <= lim) {

big z(y, 10);

big w = z * z;

// w must be also within limits.

if (w <= B) {

if ( palindrome(w.get_str()) ) {

// We should only continue if at least one of the palindromes

// has no carriage in its square.

good = true;

if( A <= w) {

// within range, count it.

theCount++;

}

}

}

}

}

if(! good) return;

// Only add 0 after the first digit.

if (len != 0) {

x[len] = '0';

find(len + 1, cout);

}

x[len] = '1';

find(len + 1, cout);

x[len] = '2';

find(len + 1, cout);

}

int solve(ostream & cout)

{

lim = B.get_str().length();

// Special case, 9 is the only one with a square root that contains

// 3 as a digit:

theCount = (A <= 9 && 9 <= B);

find(0, cout);

return theCount;

}

After I implemented this, I had a big issue, my results for C-large1 with this new solution were not correct against the previous C-large1 solution. So maybe there was a mistake in the analysis? After a lot of dribbling, I found out the sad truth, my first C large solution had a very silly overflow bug. I cared about using 64 bits integers in most of the problem, but I forgot them in the set<> structure that saves the precalculated results, making everything else worthless.

GMP was very useful indeed.

## D - large

Sorry, I couldn't do it.

I tried a max flow idea, but it was wrong, it has nothing that prevents cycles.

At the end, what we want is to make a forest out of all the keys. Each key can have the keys of at most one chest as its children. The initial keys have to be the roots. Can we make such a forest? No, really, can we? I have no idea how.

I am finishing this write up at 17:10, the post shall appear automatically at 20:00. Let us see if there are any surprises, I know I fail the first C-large because of overflow, but I might fail the other larges too :/-

## 10 comments :

It seems D is greedy. According to the top solutions, it seems.

2 at the ends and center is a great idea.

For C - sum of all digits in number cannot be higher than 9. This means that it can't be higher than 4 in half-number. This leaves us with all permutations of 1,2,3's and 0's with sum not more than 4 to check, which is really not much. (you also should think what you use to glue the halves - you can use "", "0", "1" or other numbers so that total sum is less than 9). Then you just find all needed numbers. Very fast. I think I could do 10^200 with the same code.

For D - I was 100 times reading this phrase about "not more than 400 keys" which in my opinion was pointing to graphs or dynamic programming, or something which treats keys individually, not by type, but I failed to come up with any reasonable solution.

I am still to review actual solutions submitted by people.

The solution I see now (which does Large input in few seconds) is really a brute-force with just one more smart idea to limit search tree. How stupid I was - I added just one line to my solution and now it works with the same speed.

Nice analysis, ... I also couldn't come up with D-large solution, and my D-small approach is exactly like what you described.

On C-large, very interesting that even I first thought of the 3^25 thing, ... that's still too large a number, I think ... but then the OEIS page has something really interesting, ... the palindromic square root has to have its sum of square of its digits to be strictly less than 10, ... only then if it is squared, it will also result in a palindrome. Of course I don't have the proof, that's why I'm waiting for some analysis ... I did solve it during the contest using python, not too bad, ... so all in all, I'm interested to see the analysis of D-large and the proof of C-large-2.

D Large - brute force with one more additional check:

Before each step of recursion do the following:

- sum up all remaining keys less chests and check that all numbers are positive (so before recursion we know that branch we going to check is theoretically solvable)

- pretend that keys do not disappear after they used and under that condition quickly check that the branch is solvable.

Just this two things speed up things enough to solve D-large in seconds. Without saying how stupid I was... I would say that we were cheated - I mean, the problem was really MUCH easier than we thought and that was the biggest problem coming up with solution. Instead of thinking about Dynamic programming, Graphs, Max Flows and all other clever things, we should have just think about optimizing brute force a little bit. Oh.

I got stuck on this problem. Would you mind explaining why sum can not be higher that 9?

They posted an analysis in the contest dashboard.

But it is all about how there should be no carry operations when raising to square.

Would anyone be willing to post the correct output file for problem B (lawnmower) large? I have tried a million times over but cannot get a correct solution accepted, though in every single attempt passed the small set with flying colors. I am trying to see which cases I messed up to nail where my problem is at. Thanks!

Hey try finding a solution to that problem on go-hero.net in your favourite language and run it on your input. Then you should get the correct output file and do a diff :)

Post a Comment