Sunfish-Like is a CLI tool for debugging and testing chess engine logic of the Molafish chess engine against its parent engine, Sunfish. It also serves as a detailed performance log, change log, and archive of previous Molafish versions.
Several customizable test groupings are accessible through the command line via the click Python package.
Test methodology and more detailed example usages for each type can be found below.
test.py move
Move generation comparison across single or bulk input PGN / UCI chess games.
test.py search
Search evaluation comparison for consistent output to a specified depth.
test.py perft
Isolated or comparative perft testing by depth limit or time limit.
The CLI provides several options and commands for testing with the format:
test.py [OPTIONS] COMMAND [ARGS]...
--debug
Normal output - default.
--full
Full, detailed output.
--silent
Output on error only.
move
Compares move generation data across a single game or multiple games.
search
Compares search evaluation data from a starting position.
perft
Perft tests an engine for performance and accuracy by time or depth.
perft-compare
Perft tests two engines by time or depth and compares results.
full
Executes move, search, and perft-compare commands with detailed output.
quick
Executes move, search, and perft-compare commands with minimal output.
Move generation and execution are tested with input UCI or PGN formatted games.
PGN inputs will be converted to UCI. For every listed move, the tool will:
- Execute the move
- Generate all legal response moves and their corresponding scores and flags
- Verify each engine has matching move outputs and scores and flags for those outputs
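The comparison step above can be sketched as a diff over two per-position move tables. This is a minimal illustration, not the tool's actual code: the `MoveTable` shape (UCI move string mapped to a score and a flags string) and the function name `diff_move_tables` are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape (an assumption for this sketch):
# UCI move string -> (score, flags)
MoveTable = Dict[str, Tuple[int, str]]


def diff_move_tables(molafish: MoveTable, sunfish: MoveTable) -> List[str]:
    """Return readable mismatches; an empty list means the engines agree."""
    problems = []
    # Walk the union of moves so that moves missing from either engine are caught.
    for move in sorted(molafish.keys() | sunfish.keys()):
        if move not in sunfish:
            problems.append(f"{move}: only generated by molafish")
        elif move not in molafish:
            problems.append(f"{move}: only generated by sunfish")
        elif molafish[move] != sunfish[move]:
            problems.append(
                f"{move}: molafish={molafish[move]} sunfish={sunfish[move]}"
            )
    return problems


if __name__ == "__main__":
    a = {"e2e4": (20, ""), "g1f3": (15, "")}
    b = {"e2e4": (20, ""), "g1f3": (10, ""), "b1c3": (12, "")}
    for line in diff_move_tables(a, b):
        print(line)
```

Comparing the union of both move sets, rather than iterating over one engine's output, reports a move that only one engine generates as well as score or flag differences on shared moves.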
By default, move processes 50 games from a bulk PGN file containing nearly 300 PGN formatted Aegon Tournament games from 1995.
test.py move --games 10
A successful run will print the average move generation and execution time for each engine.
Any mismatch encountered between engines will cause the tool to display the discrepancy and print the game moves list in UCI format.
Search evaluation begins from a provided position (defaults to the initial board position) and searches for the best move to the provided search depth (default=5). As with move generation testing, each returned move's attributes are compared with Sunfish's for equivalent results.
test.py search --max_depth 6
Perft testing can be executed in isolation with python test.py perft, or compared against Sunfish with python test.py perft-compare.
Perft tests have two modes:
- Time-limited, where the test runs for the specified number of seconds (default=1) and prints the depth reached, nodes reached, and an estimated nodes per second (nps).
test.py perft --test 'time'
- Depth-limited, where the test runs to the specified search depth (default=5) and prints the time taken to reach it.
test.py perft --test 'depth'
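The two modes can be sketched with a generic perft over any successor function. This is a rough illustration under stated assumptions, not Molafish's implementation: the toy tree with three children per node stands in for real move generation, all function names are hypothetical, and summing nodes over every depth in the time-limited mode is a choice made for the example.

```python
import time
from typing import Callable, Dict, Iterable, TypeVar

S = TypeVar("S")


def perft(state: S, depth: int, successors: Callable[[S], Iterable[S]]) -> int:
    """Count the leaf nodes exactly `depth` plies below `state`."""
    if depth == 0:
        return 1
    return sum(perft(child, depth - 1, successors) for child in successors(state))


def perft_depth_limited(state, successors, limit: int = 5) -> Dict[str, float]:
    """Run perft to a fixed depth and report the time taken to reach it."""
    start = time.perf_counter()
    nodes = perft(state, limit, successors)
    return {"depth": limit, "nodes": nodes, "seconds": time.perf_counter() - start}


def perft_time_limited(state, successors, limit: float = 1.0) -> Dict[str, float]:
    """Deepen one ply at a time until `limit` seconds have elapsed.

    The clock is only checked between depths, so the last iteration
    may overrun the limit.
    """
    start = time.perf_counter()
    depth, nodes = 0, 0
    while time.perf_counter() - start < limit:
        depth += 1
        nodes += perft(state, depth, successors)  # cumulative over all depths
    elapsed = time.perf_counter() - start
    return {"depth": depth, "nodes": nodes, "nps": nodes / elapsed}


def toy_successors(state: int):
    """Stand-in move generator (assumption): every node has three children."""
    return [state * 3 + i for i in range(3)]


if __name__ == "__main__":
    print(perft_depth_limited(0, toy_successors, limit=5))
    print(perft_time_limited(0, toy_successors, limit=0.1))
```

Because the node count at a given depth is deterministic, the depth-limited mode is the one suited to checking accuracy between engines, while the time-limited mode gives a throughput estimate.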
The two main tests used in evaluating Molafish combine move, search, and perft-compare into either a quick test just to ensure the engine is breathing, or a full test with enough games to ensure move generation is functioning correctly and enough time to adequately compare perft tests between engines.
- quick tests move generation across 2 games, search evaluation to a depth of 2, and perft comparison to a time limit of 1 second.
test.py quick
The equivalent test commands are:
move --games 2
search --max_depth 2
perft-compare --test time --limit 1
- full tests move generation across 100 games, search evaluation to a depth of 5, and perft comparison to a time limit of 10 seconds.
test.py full
The equivalent test commands are:
move --games 100
search --max_depth 5
perft-compare --test time --limit 10
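For reference, the two presets can be written down as data and expanded into the equivalent command lines. This is only a sketch: the `PRESETS` table and the `preset_command_lines` helper are hypothetical names, the argument values are copied from the lists above, and the `perft-compare` spelling follows the command listing.

```python
from typing import Dict, List

# Preset name -> argument lists, copied from the equivalent commands above.
PRESETS: Dict[str, List[List[str]]] = {
    "quick": [
        ["move", "--games", "2"],
        ["search", "--max_depth", "2"],
        ["perft-compare", "--test", "time", "--limit", "1"],
    ],
    "full": [
        ["move", "--games", "100"],
        ["search", "--max_depth", "5"],
        ["perft-compare", "--test", "time", "--limit", "10"],
    ],
}


def preset_command_lines(name: str, script: str = "test.py") -> List[str]:
    """Expand a preset into the individual command lines it stands for."""
    return [" ".join(["python", script, *args]) for args in PRESETS[name]]


if __name__ == "__main__":
    for line in preset_command_lines("quick"):
        print(line)
```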