Sunday, October 06, 2013

c++ and python topcoder generic testers

A sequel to post: did I just spent all my Saturday customizing my Tester?. I have been improving the idea further. After adapting the ideas to python, it caused me to work harder in the c++ version.

Test case format

When I tried to adapt the ideas to python, I was able to have this useful format for test cases:

TEST_CASES = [
    ( [ ("---","---","---",) ], 0 ),
    ( [ ("-X--","---X","----","-X--",) ], 1 ),
    ( [ ("XXXX","---X","---X","---X",) ], 2 ),
    ( [ ("XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-",) ], 3 ),
    #Your custom test goes here:
    #( [ ], None ),
]

This would be an example for SRM 593 div1 250: HexagonalBoard. It is much simpler than the odd template stuff that I barely managed to have in c++ and it is shorter.

It led me to try to work harder on the c++ version, maybe there really was a way to use {} format without having to type input {} and output {}... It turns out that once you are using constructors you can totally use them:

template<class I, class O> vector<pair<I,O>> getTestCases() { return {    
    { { {"---","---","---"} }, {0} },
    { { {"-X--","---X","----","-X--"} }, {1} },
    { { {"XXXX","---X","---X","---X"} }, {2} },
    { { {"XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-"} }, {3} },
    // Your custom test goes here:
    { { {"XXXX","XXXX","XXXX","XXXX"} }, {3} },
    { { {"XX-","---","---"} }, {2} },
    { { {"XX","--"} }, {2} },
    { { {"-XX", "X-X", "XX-"} }, {2} },
    { { {"-XX","X-X","XX-"} }, {2} }
    { { {"XXX","XXX","XXX"} }, { } }
    //{ { {}}, {} },    
};}

You can see some custom examples I added today during the match and also while working on the problem for the editorial. An empty { } for the output means the output is unknown...

This works because we are calling constructors of an input and an output structs. The input struct has an implicit constructor to fill all the arguments of a problem. The output has two constructors, one that takes an output result, and one that takes no arguments. The reason {} syntax wasn't working for tuples, was that tuples apparently define only an explicit constructor? Either way once you switch to fixed structs it finally works.

This input format in c++ is quite useful. The compatibility with TopCoder's system tests format makes debugging during practice easier. Also it is easy to copy test cases you created and have stored in the file to the challenge window.

Making the problem-specific code as small as possible

The whole idea of the "meta-tester" is to have the code baggage that we need to automatically generate and include in the problem's file as small as possible. That's why I wanted to have most of the tester in a separate file that is not-specific to a single problem but works with any problem (It is also easier to edit the tester this way). Again, python made it way too easy. This is a sample file generated for python for the HexagonalBoard problem:

import math,string,itertools,fractions,heapq,collections,re,array,bisect
# SRM 593 - HexagonalBoard (250 points) by vexorian :

class HexagonalBoard:
 def minColors(self, board):
    return 0

# CUT begin
#-------------------------------------------------------------------------------
CASE_TIME_OUT = 2.00

TEST_CASES = [
    ( [ ("---","---","---",) ], 0 ),
    ( [ ("-X--","---X","----","-X--",) ], 1 ),
    ( [ ("XXXX","---X","---X","---X",) ], 2 ),
    ( [ ("XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-",) ], 3 ),
    #Your custom test goes here:
    #( [ ], None ),
]

def isTestDisabled(i):
    return False

#-------------------------------------------------------------------------------
if __name__ == '__main__':
    import sys, tester
    tester.run_tests(
        HexagonalBoard, 'minColors', TEST_CASES, isTestDisabled, 
        1381033582, 250, CASE_TIME_OUT, True
    )
# CUT end

look at, basically the only part code that is not useful for the problem is the call to the tester.run_tests(). Everything else - The problem class that will be sent to topcoder, the TEST_CASES array that allows us to add test cases and the isTestDisabled() function that allows test cases to be disabled - Is very useful to have in the problem's code file.

How does run_tests work? First, it needs to receive the TEST_CASES list and the isTestDisabled function. But we can also send it the HexagonalBoard class. Python is quite dynamic, so it can even instantiate objects from a variable that determines what class. minColor is between quotes, it is the name of the method. After instantiating HexagonalBoard() it needs to call the method, we use a trick like:

getattr(instance, methodName)( arguments list)

to use the method name to call the method from an instance.

The input / arguments list we are supposed to send this method. Python stores them in a list. And no matter the list, we can uncompress it when sending this group of arguments to the method by using the * python operator:

getattr(instance, methodName)( *some_list)

Since the python code was getting so small, I tried to work more in the c++ one. The first candidate for cleaning was the output struct. It is a struct that holds the return type, a boolean to say that there is no return value and two constructors:

struct output {
    int value; bool u=false;
    
    output(int value) { this->value = value; }
    output() { u = true; }
};

We can actually make this thing a template. Being a template we need the code to exist in the generic tester instead of being specific to the problem:

template<typename T> struct output {
    T value; bool u=false;
    
    output(T value) { this->value = value; }
    output() { u = true; }
};

The real code bloat came from the input struct though:

struct input {
    vector<string> board;
     
    int run(double & elapsed) {
        StringPath* instance = new StringPath();
        time_t s = clock();
        int r = instance->countBoards( this->board );
        elapsed = (double)(clock() - s) / CLOCKS_PER_SEC;
        delete instance;
        return r;
    }
 
    void print() {
        cout << "[" << Tester::pretty(board) "]";
    }
};

What complicates things in c++ is that the number of arguments the method needs might vary. I think it might be possible to do this all with a variable template, however, maybe the reason tuples use explicit constructors is because of a limitation in the variadic-ness? Also, it sounds very complicated to do it. So I looked for other ways to reduce the footprint.

The print() method exists because of the variable number of arguments. So the part of the generic tester that needs to print the arguments can't do it alone. However, there was too much code in the print() method. Also, it determined the format the arguments are printed instead of letting the generic tester do it. I managed to create a variadic template that prints any sequence of arguments of any type, following the tester's format:

/*==============================================================================
  Converts given argument into a string containing pretty-looking output
*/
string pretty(string s)
{
    return "\"" + s + "\"";
}
string pretty(double x)
{
    ostringstream s;
    s.precision(10);
    s << x;
    return s.str();
}
template <typename T> string pretty(T t) {
    ostringstream s;
    s << t;
    return s.str();
}
template <typename T> string pretty(vector<T> t)
{
    ostringstream s;
    s << "{";
    for (int i=0; i<t.size(); i++) {
        s << (i?",":"") << t[i];
    }
    s << "}";
    return s.str();
}

/*==============================================================================
  Prints a test case's argument list
  
  printArgs(a,b,c) ->  cout << "[" << pretty(a) <<","<< pretty(b) <<"," <<  pretty(c) <<"]"
*/
void printArgsCont() {}

template<typename T, typename... Args> void printArgsCont(T value, Args... args)
{
    cout << "," << pretty(value);
    printArgsCont(args...);
}

template<typename T, typename... Args> void printArgs(T value, Args... args)
{
    cout << "[" << pretty(value);
    printArgsCont(args...);
    cout << "]";
}

Variadic templates are just great life savers. Now the print() method in the output struct looks like this:

void print() { Tester::printArgs( /*list of comma-separated arguments*/ ) ; }

The larger code usage in input struct was the run method. It also implemented tons of things that should really belong to the tester code, like measuring time or constructing the problem class. Well, it turns out that since the constructor for the problem class has no arguments, the tester can instanciate the problem class (we would need it to be a template argument). The new run method can just take a pointer to the class instance. It just needs to call the method. It is also not necessary to measure time inside the run() method, the tester can measure the amount of time the run() method takes, the overhead would be minor.

struct input {
    /*variables that hold each of the arguments*/;

    int run( /*ProblemClass*/ * x) {
        return x->minColors( /* comma - separated arguments */ );
    }
    void print() { Tester::printArgs( /* comma - separated arguments */ ); }
};

Problem name

At one point the tester received the problem name both as a string and as a template argument. Was it possible to make it receive it just as a template argument? There is something called typeid(T).name() in c++. Unfortunately, it seems to return some random numbers before the class name in gcc. It turns out we can fix it...

/*==============================================================================
    Returns the name of a type in human-readable form. It likely only works 100%
    correctly with classes that don't belong to a namespace and is not very
    portable.
*/
template<typename T> string getTypeName()
{
    string s = typeid(T).name();
    int i= 0; 
    while ( (i < s.size()) && ('0' <= s[i] && s[i] <= '9') ) {
        i++;
    }
    return s.substr(i);
}

What if?

The issue with the input struct is that it has to store the arguments. What if an argument is called input? That would cause some compile errors. A fix is to call the struct _input, it is unlike TopCoder would name arguments starting with _. We can also just add _ to all the argument names. But a method that also reduces the amount of code is to name the arguments inside the input struct as just (p0, p1, p2, ...). Unfortunately, greed 1.5 didn't have an option to get the number / index of an argument. I actually had to do a pull request to add this . The next version of greed should have the feature, but you can also just get the latest source code from github and compile it. Or modify the test code template I'll attach soon so that it uses a _input struct

The result

This is how the c++ test code generated for HexagonalBoard looks like:

#include <bits/stdc++.h>
// SRM 593 - HexagonalBoard (250 points) by vexorian
using namespace std;

struct HexagonalBoard
{
    int minColors(vector<string> board)
    {
        return 0;
    }
};

// CUT begin
//------------------------------------------------------------------------------
const double CASE_TIME_OUT = 2.0;

bool disabledTest(int x)
{
    return 0 * x;
}
template<class I, class O> vector<pair<I,O>> getTestCases() { return {
    { { {"---","---","---"} }, {0} },
    { { {"-X--","---X","----","-X--"} }, {1} },
    { { {"XXXX","---X","---X","---X"} }, {2} },
    { { {"XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-"} }, {3} },
    // Your custom test goes here:
    //{ { {}}, {} },
};}

//------------------------------------------------------------------------------
// Tester code:
    #include "../tester.cpp"
    struct input {
        vector<string> p0;

        int run(HexagonalBoard* x) {
            return x->minColors(p0);
        }
        void print() { Tester::printArgs(p0); }
    };
    
    int main() {
        return Tester::runTests<HexagonalBoard>(
            getTestCases<input, Tester::output<int>>(), disabledTest, 
            250, 1381061147, CASE_TIME_OUT, true
        );
    }
// CUT end

The extra code that is not solution and not test cases is very small now. I wonder if this is the smallest it can get. Probably not if it is possible to do the input with variadic templates.

Tweaks to the tester

While modifying the auto-generated codes, I also tweaked the tester. Remember the dilemma between compact and not compact code? I think the appeal of the compact code is as a report. This way things like output results kind of pollute the report. I turned the whole thing into a hybrid. First it runs each test case and the output shows everything including parameter lists and results and the test cases are separated by a big part. At the end though, a report is shown. The report has just the results and a debug mesage won't clip it. It looks like this:

(I also found out a perfect character and format to represent a disabled case (test 5 in the image is disabled).

By modifying the boolean value in the call to the tester, you can switch to the old format...

Get stuff

3 comments :

Goutham said...

Well, I did that and the output is badly disfigured. I have attached a screenshot. Please fix it when you get time. Thanks :)

vexorian said...

That makes sense, templates need you to double escape the escape characters. This is probably your last fix:

Change every:
\033


With \\033

Goutham said...

It finally works! I had to do what you said and change \t and \n to \\t and \\n as well. Thanks a lot :)
Here is the final tester code in case someone wants it: http://pastebin.com/vF1jBL1v