Thursday, December 12, 2013

New version of my Greed tester and template

Shortly after I discovered the Greed topcoder plugin, I spent some time customizing its templates and later did some really cool stuff by making a generic tester. It has been working great so far.

During the weekend a discussion erupted at the Greed github about making Greed support running test cases in a different thread or process. The main reason people like this feature is that if your code depends on global variables' values and requires them to be initialized with 0 bytes, then running all test cases as part of the same process causes errors. I never really cared much about this, because I wouldn't ever use globals like that :/. But during the weekend I figured I had good reasons to use this sort of feature:

  • Sometimes I need to test other people's code locally and they have the sort of bad taste about global usage that I described. E.g., during the weekend I had to reverse engineer the solution rng_58 wrote for the SRM 599 Hard problem.
  • Most importantly, there is a much better reason to run in multiple processes: runtime errors. C++'s poor exception handling meant that in my old version of the tester, a single failed assert, segfault or exception would stop the execution of the whole thing. Even if the error happened only in test case 1, it would still halt the execution of all the remaining test cases.

So that is when I started to experiment. The first thing I did was make Greed 2.0's default test template get this feature. I did some hacky stuff: basically, the C++ program calls itself over and over again, once for each argument. But it works.

For my so-called "dualcolor" templates though, the requirements were higher because there are a lot more things to care about when printing results. The report at the end needs to know exactly what happened during the execution of the test cases. Another issue is that I wouldn't be able to have access to the command line arguments in the tester file, unless I modified the tester call syntax (which would break old source files generated by old versions of the template). So I had to do something else.

The eventual decision was to use fork and wait. fork() works great because it makes a copy of the current process. wait() is needed to wait for the child process to end. The one problem with this is that it effectively makes the feature unable to work under Windows (unless you use Cygwin, I think). I make these templates mostly for myself and I don't have time to port to WinAPI. If you use Windows and wanted this feature in my template, I guess it sucks when someone's software doesn't work fully on your OS of choice, eh? Of course, maybe Windows users still want the other cool features like the output colors and the easy-to-modify test cases, so when compiling on Windows only the multi-process feature is disabled. Metaprogramming to the rescue.

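// Native Windows has no fork()/wait(), so disable the feature there.
// The parentheses matter: && binds tighter than ||, so without them the
// Cygwin check would only apply to __WIN32.
#if (defined(WIN32) || defined(_WIN32) || defined(__WIN32)) && !defined(__CYGWIN__)
    #define DISABLE_THREADS
#endif
#ifndef DISABLE_THREADS
    #define DO_THREADS
#endif
#ifdef DO_THREADS
   // POSIX headers for fork() and wait()
   #include <unistd.h>
   #include <sys/wait.h>
   #include <sys/types.h>
#endif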
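
To make the fork and wait idea more concrete, here is a minimal sketch of running one test case in a child process and classifying what happened to it. This is not the actual code in tester.cpp: the function name runIsolated, the '+' and 'X' result letters and the use of the exit status to carry the verdict are all assumptions made for illustration. Only the letter E for runtime errors comes from the tester itself.

```cpp
#include <cstdio>
#include <cstdlib>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not taken from tester.cpp).
// Runs testCase in a forked child process and returns:
//   '+' if the child exited normally with status 0 (test passed),
//   'X' if the child exited normally with a non-zero status (test failed),
//   'E' if the child was killed by a signal (abort, segfault, ...)
//       or if fork()/waitpid() themselves failed.
char runIsolated(const std::function<bool()>& testCase) {
    // Flush pending output so the child does not inherit and re-print it.
    std::fflush(nullptr);

    pid_t pid = fork();
    if (pid < 0) {
        return 'E';  // could not create the child process
    }
    if (pid == 0) {
        // Child: run the test and report the verdict through the exit status.
        int code = 1;
        try {
            code = testCase() ? 0 : 1;
        } catch (...) {
            std::abort();  // turn any exception into SIGABRT
        }
        std::fflush(nullptr);
        _exit(code);  // leave without running the parent's cleanup code
    }

    // Parent: wait for this specific child and inspect how it ended.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return 'E';
    }
    if (WIFSIGNALED(status)) {
        return 'E';
    }
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        return '+';
    }
    return 'X';
}
```

Because each call gets a fresh copy of the process, a crash in one test case only kills that child, and the parent can keep going with the next case. Global variables modified by a test case are also reset for free, since the changes happen in the child's copy.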

The other downside is that communicating with the child process was too complicated for me to implement without using an external file. So the tester now needs an external file to work. Currently it is located in /tmp/.
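
A rough sketch of the external file idea follows, assuming the child writes its result as text and the parent reads it back after the child ends. The function name runAndCollect, the file path passed in by the caller and the plain-text format are hypothetical. The real tester.cpp may store different data in a different way.

```cpp
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not taken from tester.cpp).
// The child runs testCase and writes its result to the file at `path`.
// The parent waits, then reads the file into `result`.
// Returns true only if the child exited normally and the file was readable.
bool runAndCollect(const std::function<std::string()>& testCase,
                   const std::string& path, std::string& result) {
    std::remove(path.c_str());  // drop any stale result from an earlier run
    std::fflush(nullptr);       // avoid duplicated buffered output in the child

    pid_t pid = fork();
    if (pid < 0) {
        return false;
    }
    if (pid == 0) {
        // Child: compute the result and store it in the external file.
        try {
            std::ofstream out(path);
            out << testCase();
            out.flush();
            _exit(out.good() ? 0 : 1);
        } catch (...) {
            std::abort();  // exceptions become a runtime error (SIGABRT)
        }
    }

    // Parent: the file is only trusted if the child ended cleanly.
    int status = 0;
    bool ok = waitpid(pid, &status, 0) == pid
              && WIFEXITED(status) && WEXITSTATUS(status) == 0;
    if (ok) {
        std::ifstream in(path);
        ok = in.good();
        if (ok) {
            result.assign(std::istreambuf_iterator<char>(in),
                          std::istreambuf_iterator<char>());
        }
    }
    std::remove(path.c_str());  // clean up the temporary file
    return ok;
}
```

A file is the simplest channel because it survives the child's death: if the child crashes halfway, the parent sees the signal in the wait status and ignores whatever was written.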

Maybe some *n*x users will have trouble getting this to work. Or maybe it fails for some reason or makes testing a specific problem difficult. That is why the feature can be disabled by defining DISABLE_THREADS before including the tester.cpp file.

The result is great. It can catch the usual issues (aborted due to some std:: exception, division by zero, segfault). To be consistent with the python version, letter E is used to signify test cases with runtime errors. Here is a sample:

Keeping one line per test case in the report

Another feature that can be seen in the screenshot above is that each test case's entry in the final results report at the bottom is now guaranteed to be one line long. If a reported result is longer than that, it will be truncated. Of course, the line length can be configured in tester.cpp.
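
The truncation itself can be sketched in a few lines. The helper name fitToLine, the default width of 80 and the trailing "..." marker are assumptions for illustration. The actual tester.cpp may cut the text differently.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from tester.cpp).
// Returns text unchanged if it fits in `width` characters. Otherwise cuts it
// so the returned string is exactly `width` characters long, ending in "..."
// when there is room for the marker.
std::string fitToLine(const std::string& text, std::size_t width = 80) {
    if (text.size() <= width) {
        return text;
    }
    if (width <= 3) {
        return text.substr(0, width);  // too narrow for the "..." marker
    }
    return text.substr(0, width - 3) + "...";
}
```

For example, fitToLine("abcdefghij", 8) keeps the first five characters and appends the marker, so a very long expected or returned value can no longer push the report past one line per test case.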

Return of the super compact mode

By popular (one person) demand, the mode that prints only the "final" report and nothing else is back. I don't like it because it doesn't work very well when you use printing for debugging. But I guess there are people who don't do that.

To choose the output mode, find this line in the template:

        return Tester::runTests<${ClassName}>(
            getTestCases<input, Tester::output<${Method.ReturnType}>>(), disabledTest, 
            ${Problem.Score}, ${CreateTime}, CASE_TIME_OUT, Tester::FULL_REPORT
        );

Change Tester::FULL_REPORT to one of these:

  • Tester::FULL_REPORT : Full, verbose output and a report at the end.
  • Tester::NO_REPORT : Full, verbose output, but the bottom report is a single line with less info.
  • Tester::ONLY_REPORT : Only the report and nothing more.
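
One way to picture how the three modes relate is as a pair of yes/no decisions. The sketch below uses a hypothetical namespace and hypothetical function names, and only mirrors the descriptions in the list above. It is not how tester.cpp is actually written.

```cpp
// Hypothetical sketch (not taken from tester.cpp): the three output modes
// expressed as two independent decisions.
namespace sketch {

enum OutputMode { FULL_REPORT, NO_REPORT, ONLY_REPORT };

// Verbose per-test-case output is printed in every mode except ONLY_REPORT.
inline bool printsVerboseOutput(OutputMode mode) {
    return mode != ONLY_REPORT;
}

// The detailed report at the bottom is printed in every mode except
// NO_REPORT, which replaces it with a single summary line.
inline bool printsDetailedReport(OutputMode mode) {
    return mode != NO_REPORT;
}

}  // namespace sketch
```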

Get them

  • Test code template for Greed (needs Greed 2.0 unless you tweak something small): testtemplate.cpp
  • Generic tester (place it in the .. folder relative to where source files are generated): tester.cpp

The official Greed version 2.0 is probably going to include these templates and will just need a small configuration line to use them. Greed git already includes the older version.
