The Fourth International Paderborn Computer Othello Tournament October 5.-6., 1996 [ With contributions from several program authors, edited by Michael Buro ] The Tournament: =============== Like last year, PB-IV was an ordinary IOS computer Othello tournament besides the facts that the field of participants was probably the strongest ever seen in a tournament and that 20 rounds were played. As usual, the tournament consisted of two parts: First, a single round-robin tournament was played. Thereafter, three groups were formed in which the return-games were played. Each program had 20 minutes thinking time per game. Participants & final standings: ------------------------------- rank program ios-handle author(s) pts/games architecture speed e-mail diff.avg. ------------------------ Group I ------------------------------------- 1 LOGISTELLO logtest M. Buro 17.5/20 MIPS-R10K/194 18 mic@research.nj.nec.com +7.60 2 HANNIBAL hannibal M. Piotte & L. Geoffroy 17.0/20 Pentium/166 9 piotte@cam.org +9.60 hs98@music.polymtl.ca 3 KITTY kitty I. Durdanovic 13.5/20 SparcUltra/143 13 igord@research.nj.nec.com +5.80 4 REV8.0 avenger J. Buijs 11.0/20 PentiumPro/233 17 buijs@chem.vu.nl +3.20 = KEYANO keyano M. Brockington 11.0/20 8xSparcUltra/143 104 brock@cs.ualberta.ca +2.60 6 MYTURN rapsac C. Wilders 10.5/20 PentiumPro/233 17 rapsac@iaehv.nl +4.30 ------------------------ Group II ------------------------------------ 7 ECLIPSE eclipse M.Giles & C.Springer 12.0/19 Cyrix/686-166+ 9 mgiles@tir.com +3.05 springer@math.umn.edu 8 BRUTUS Brutus L. Geoffroy & M. Piotte 11.0/19 Cyrix/686-150+ 8 see HANNIBAL +5.26 9 CASSIO Cassio S. Nicolet 10.0/19 PowerPC/200 ? nicolet@ecoledoc.ibp.fr -0.84 10 WIGALD Wigald R. Brunner 7.5/19 HP720/80 ? rtbrunne@iiic.ethz.ch -5.16 11 FLIPPER csimmons C. Simmons 7.0/19 Sparc-20/60 6 csimmons@oracle.com +0.63 12 MAST95 +) speed R. Kroonenberg 8.0/15 ? ? buijs@chem.vu.nl +5.73 ------------------------ Group III ----------------------------------- 13 WOLPERS wolpers F. Alvaro 5.5/18 DEC-AXP/300 ? othello@satan.vmsmail.ethz.ch -1.78 14 EXODUS exodus N. Hansson 4.0/18 Pentium/90 5 pt96nha@mail.student.hk-r.se -14.67 15 CORN corn O. Hansson 3.5/18 Sparc-5 ? dat96oha@ludat.lth.se -11.33 16 REVFW frenkel F. Weinreis & A. Frenkel 2.0/18 Pentium/166 9 alex@logotec.com -16.89 +) left the tournament after the first part Directing the tournament was an easy job. While sitting in front of my computer at home - my kids around me - I only had to copy & paste emails coming from IOS, start 'robin' once in a while, and send the results to Paderborn. The only things that made me nervous were the frequent modem-disconnects and LOG's last-iteration opinion-changes. This was the last Paderborn Computer Othello Tournament. I would like to thank all participants for their great interest in the tournaments and their enthusiasm over the years which gave the Paderborn tournaments the spirit of the Computer Othello Worldchampionships. As you might know, Igor Durdanovic has recently moved to Princeton, NJ. He has taken IOS with him. Its new address is (ftp.nj.nec.com or 138.15.10.2) 5000. So - maybe - next year, "Princeton I" will take place. /Michael Programmers' Thoughts and Excuses: ================================== AVENGER ======= About three weeks before the tournament my entire harddisk was wiped probably due to a bug in the Windows NT 4.0 beta release, and no backups at all. It is stupid, I know. I lost quite a lot of data in the process. The latest (greatest) version of 'AVENGER' was gone and also the opening-book I compiled during the last few months. Also some utility- programs didn't survive the kill. So I had to decide entering the arena with a very old (1990) version or not to play at all. Eventually I managed to rewrite some of the utilities and compiled a book of frequently played lines by strong IOS players in the hope this would give AVENGER some good positions to start with. After playing the first few games, it was easy to see that the opposite happened. The opposing programs knew this lines much better then AVENGER did, and specially against the top three programs (LOGISTELLO, HANNIBAL and KITTY) it suffered losses right out of the book. AVENGER surprised me by winning from ECLIPSE, the position it was in looked bad to me but ECLIPSE played a late mid-game blunder and lost the game due to this move. Also I didn't expect AVENGER to win from RAPSAC, but Cas's program was kindly playing a bad opening-line and got totally demolished after this. So I can still say that AVENGER is the strongest Dutch othello-player ;). The game with KEYANO was quite balanced, and I had the impression that both programs were of about the same strength. I don't know if Mark was using his multi-processor version at that time, but if he did, he definitely had a speed advantage. The major problem with AVENGER is the lack of a good opening-book and a untuned evaluation- function during the late mid-game. Also missing is a quiescence-search which makes the program sometimes vulnerable to corner-combinations and other tactical jokes. If I rip my chess-program of it's quiescence-search the rating goes down hundreds of points, so I think it must do something with othello as well. Really bad is AV's endgame, it was running at 550000 nodes/sec but it didn't manage to search any deeper than 21 empty on average. I must be missing something here, something everybody seems to know, but i don't. Basically AVENGER uses a PVS-search with a huge hash-table and response-killers for move- ordering. The history-heuristic used in previous versions was not used in the tournament- version. The evaluation function consists mainly of edge-tables, some double-edge-tables, corner-tables and potential-mobility. Running on a Pentium-Pro 233 it was able to search at a speed of 330k N/S during the mid-game. The depth of search was 14-16 ply on the average without pruning. The program is totally written in 'C' and compiled with VC 4.0 which produces a 50% faster code compared to GCC 2.7.x running under Linux. Next year I hope to have solved some of the major deficiencies, and maybe AVENGER will be a more worthy opponent for the top programs by that time. - Joost Buijs BRUTUS ====== BRUTUS is a PC program running under DOS. It was written in 1992, and has been in constant evolution since. It performs a fixed depth alpha-beta search with a variation of the famous Probability-Cut. It uses iterative deepening with a simple static evaluation function at the terminal nodes. The function is a weighted sum of the number of disks, the mobility, the edge-configurations and the X-squares. It computes more than 400k nodes per second on a Cyrix 686-P150+, searching on average 15 to 19 ply deep in the middle game at 30 min time control, usually solving the endgame with 25 or 26 empty squares. During the PBIV tournament, BRUTUS did badly on the first day, because of a bad last-minute setting of probability-cut levels and of a to well known non-learning opening book. A very good second day did help BRUTUS to get back up in the ranking. Most of the work on BRUTUS is stopped because of thekill new program HANNIBAL, but a new learning book for BRUTUS is still in work now. - Louis Geoffroy CORN ==== I didn't have very great expectations of CORN's performance in PBIV, but hoped to have an enjoyable event ... which it also turned out to be. My preparations consisted of giving CORN a new evaluation on Saturday morning... only a couple of hours before the tournament started. The new evaluation uses the same ideas as the old one, but divided the game into two stages instead of using the same coefficients through- out the game, and also had new values for the coefficients. The new evaluation turned out fairly OK, but had two problems. CORN grew somewhat too fond of x-square moves, and in some games played x-squares that I doubt were good (e.g. the first game against WOLPERS, or the game against BRUTUS). I adjusted the coefficients during the tournament, and think I got rid of this problem... play CORN and judge for yourself - I haven't made any changed in CORN since PBIV. The other problem is that when the evaluation changes from the opening to the mid-game set of coefficients, the score sometimes (way too often) jumps dramatically ... often downwards. :( CORN did about as expected during the tournament. It put up at least a fight against most of the programs (except against BRUTUS), but still lost most of its games. There were two games that I found especially interesting, the one against ECLIPSE and the second one against WOLPERS. In the game against E, E made a fairly early corner sacrifice (perhaps still in book ... I'm not sure) where CORN happily grabbed the corner. After that, E's evaluation stayed around +4 for most of the game, while CORN happily thought itself ahead due to the corner and its stable discs. E was (as expected) correct in the end: a 34-30 win for E. The game against WOLPERS was a turn-around of the above. This time it was CORN who made a quite early corner sacrifice to gain mobility, and who was happy through most of the game after that ... though it ended in a draw. I still have not understood why CORN likes the sacrifice in the game against WOLPERS, but not in the game against ECLIPSE. All in all, it was a very nice and exciting tournament, despite the rather unexciting end-result with Lucky LOG winning (as so often before). [ often lucky or often winning? :) /mic ] - Ola Mikael Hansson ECLIPSE ======= PB IV was a great tournament. E got off to a good start with a wins over KEYANO and LOG in the early rounds, but as often happens, new "bugs" appeared during the tournament. Most of the bad games were in part caused by screwy book behavior: following draw lines against KITTY, following a "draw" which turned into a loss vs HANNIBAL, and also a very weird line as black against FLIPPER (E should have lost but FLIPPER timed out), and yet another incorrect book line (an annual bug it seems!!) against WIGALD which turned ugly quick. ECLIPSE also blew a winning position against BRUTUS, due to the fact that despite a 50% endgame speed-up before the tournament, E is still a slug compared to the best endgame solvers. BRUTUS found the loss in 293 seconds and on the next move E did not come close before timing out. Lots of work left to due here. Another big weakness in E is the late mid-game; Here the evaluation is always less stable and frequently goes from a + to - score (for example the first CASSIO game). Perhaps Colin will have time to tweak the evaluation this summer. Also, it appears that to ever win this tournament E will have to give in and go selective. It was great to see so many good programs competing this year. I hope this tournament continues to grow each year. - Mike Giles EXODUS ====== This was EXODUS first tournament ever and I would like to say first of all that this was great fun so I hope the tradition with a really big tournament will continue in Princeton too. About my preparations there isn't very much to say I coded EXODUS during a few weeks before the tournament so I wasn't really excepting a high place. EXODUS had one of the worst brick-differences in the tournament though. And this is due to that the evaluation I used was tuned for early mid-game. So EXODUS did a lot of endgame blunders. And of course the shallow endgame search does it to (17 ply wld 15 ply perfect). EXODUS placed where I had excepted in the score-list but with some differences to what I predicted. The win against BRUTUS was one of those= differences there BRUTUS jumped on the edged due to a bad opening by EXODUS (book less). B thought himself far ahead around move 23 then he suddenly realized that he wouldn't run E out of moves and dropped to solve for loss. BRUTUS vs. EXODUS A B C D E F G H 1 |55|46|17|16|15|19|54|53| 1 2 |56|38|13|12|14|18|52|30| 2 3 |21|20| 7| 1| 6| 9|22|49| 3 4 |43|35|10|()|##|11|28|25| 4 5 |42|36| 2|##|()| 4|23|26| 5 6 |37|39|40|48| 5| 3|24|27| 6 7 |59|60|51|41| 8|34|47|31| 7 8 |58|45|44|29|32|33|57|50| 8 A B C D E F G H ##:() - 30:34 HISTORY: move ( time ) value 23. g5 ( 86.0) +1.07 24. g6 ( 16.0) -1.00 25. h4 ( 54.0) +0.98 26. h5 ( 41.0) +0.00 27. h6 ( 0.0) +0.76 28. g4 ( 1.0) +0.00 29. d8 ( 210.0) +0.66 30. h2 ( 20.0) +1.00 31. h7 ( 0.0) +0.45 32. e8 ( 18.0) +1.00 It was a good thing that there were so many different programs on many different levels this made the tournament interesting for everybody. So keep it up so we can meet again for Princeton I next year. - Niklas Hansson FLIPPER ======= While FLIPPER played poorly in this tournament, we learned quite a bit about the areas in which it is strong and the areas in which it is weak. FLIPPER plays a good endgame and can solve board positions about as fast as other top programs. However, FLIPPER plays poorly during the opening and mid-game. Additionally, FLIPPER has poor time management algorithms. (FLIPPER ran out of time on three games, two of which could have been wins with better time management software.) Additionally, FLIPPER has a minimal opening book. For the next tournament, we will improve our evaluation routines so that they work better during the opening and mid-game. We will add a more substantial opening book. And we will revamp our time management algorithms. - Charles Simmons HANNIBAL ======== About HANNIBAL: HANNIBAL was at its first participation to the annual Paderborn Othello tournament. HANNIBAL is a program written in C running under Linux. During the tournament, it ran on a Pentium 166MHz. With this setup, it evaluates 40-60K positions/sec in middle game, and 200-300K positions/sec during the end game. HANNIBAL uses an alpha-beta algorithm with selective pruning, providing 16-20 ply search during the middle game. HANNIBAL learned to play by itself, so no one really knows what it knows. I suspect it is heavily biased toward mobility optimization. I started working on HANNIBAL in January 1996. The first working versions was completed in February, but it couldn't play a decent game until June, when the learning algorithm was finally debugged correctly. The most recent feature is the learning opening book algorithm which was completed at the end of August. As a consequence, HANNIBAL's opening library was quite small during the tournament. About PBIV: I must admit that I started PBIV expecting HANNIBAL do to well, and I was not disappointed. HANNIBAL had the chance to place several long book line, which was welcome given the small size of the opening book. As I expected, HANNIBAL performed better playing white than black. With black it lost two games (to RAPSAC and LOGISTELLO) and drew one (to SPEED). With white it was undefeated, drawing only to KITTY. HANNIBAL finished with a very good disc differential, which surprised me because HANNIBAL is optimized to evaluate close position and does not perform well in large win/loss situations. The turning point of the tournament was probably the lost against LOGISTELLO on round 17 (after a win on round 12), giving away the lead to LOGISTELLO by half point, half point which could not be recored as both LOGISTELLO and HANNIBAL were undefeated during the last three rounds. I would like to draw attention to two of HANNIBAL's games of particular interest. The first one is HANNIBAL vs SPEED. During most of the game, HANNIBAL believed it had a decisive advantage against SPEED, but it evaporated near the end and the game solved for a draw! This demonstrate that it is still possible for a good program to find a position that it doesn't understand. Warning: it is in the opening book now, try at your own risk! HANNIBAL vs. SPEED -> ## A B C D E F G H 1 |43|49|50|36|54|53|58|47| 1 2 |44|42|34|35|37|38|40|57| 2 3 |15|29|27|28|26|39|25|60| 3 4 |17|14|21|()|##|10|41|59| 4 5 |22|16| 7|##|()| 3| 6|56| 5 6 |24|23| 9| 4| 1| 2|31|55| 6 7 |45|20| 8|12| 5|19|33|51| 7 8 |52|30|11|32|13|18|48|46| 8 A B C D E F G H ##:() - 32:32 The second interesting is one in which HANNIBAL followed a very long book line. So long, in fact, that it entered the endgame search right out of the book for a win. AVENGER vs. HANNIBAL -> ## A B C D E F G H 1 |49|51|21|27|42|43|58|59| 1 2 |52|48|12|26|31|34|60|57| 2 3 |22|32| 7| 1| 6| 8|33|36| 3 4 |23|10| 9|()|##|20|39|44| 4 5 |13|11| 2|##|()| 4|37|40| 5 6 |17|16|15|14| 5| 3|29|38| 6 7 |24|50|18|19|25|30|47|55| 7 8 |54|53|35|46|41|28|45|56| 8 A B C D E F G H ##:() - 27:37 About the director: I would like to thank the tournament director, Michael Buro, for organizing and directing the tournament. I believe this was a fantastic tournament because on the exceptional quality of all the participants. I look forward to next year's. - Martin Piotte KEYANO ====== I was graced with the use of a cluster of 28 Ultra SPARC processors for the tournament, thanks to Charles Leiserson's group at MIT. I spent most of the week before the tournament figuring out how to get ProbCut and APHID (my parallel game-tree library) working together in harmony. I managed the task on the Thursday evening on benchmark positions. However, I had not tested the combination of the two in a tournament-style game. This was an error. I got about three hours sleep on Friday night after discovering that, on occasion, a slave would drop off and never be heard from again! It wouldn't even leave a sane core file behind -- it was a disturbing problem, and one that I still haven't tracked down yet. Anyway, that is why KEYANO was continually crashing. It normally helped to be running on a small number of slaves. Depending on the game, there were up to 16 processors running ... and on occasion, it would run without a hitch. KEYANO didn't really deserve to finish =4th. The most obvious game that KEYANO won and should have lost was FLIPPER losing on time in a won endgame. In the pivotal 15th round game where KEYANO upset KITTY to make the top 6, KITTY had a win at 27 empty. However, KITTY played the incorrect move, and KEYANO had sufficient time (on 8 processors) to find the win at 26 empty. This was the high point of my tournament. However, I guess I was accumulating bad karma for my game in the second round against LOGISTELLO. For the first time ever, KEYANO had a winning position against LOGISTELLO. KEYANO and LOGISTELLO both knew the position was +4 for KEYANO at 20 empty, but there was a very elementary error in my parallel endgame code handler. KEYANO searched a losing move first and a winning move second. It knew what the winning move was, but since the score lay outside the alpha-beta window, a research wasn't executed on the WLD search, and the PV was never changed from the losing move. It was an elementary typo, worthy of a big red X on an undergrad programming assignment. KEYANO has never beaten LOGISTELLO in a tournament game. Next time ... things will hopefully be different. - Mark Brockington KITTY ===== As always when the time for PB-xy comes, old KITTY crawls down the tree (sharpening the claws on the way down = last minute preparation) :) and stretches her paws on the fastest computer Igor can find for her, this time SparcUltra/147MHz. After losing to LOG in the first round (as expected) and drawing ECLIPSE (unexpected), K went on winning vs. BRUTUS, SPEED, WIGALD, FRENKEL, CORN, EXODUS, CASSIO, WOLPERS, FLIPPER. Then it decided it had enough for lunch and started fooling around ... drawing RAPSAC, AVENGER, HANNIBAL. A Canadian bear was not in the mood for fooling around (no wonder, he barely made it into group 1) and managed a narrow win :( This put KITTY on the third place, not bad :) The return matches started well, win vs. RAPSAC, AVENGER. HANNIBAL and LOG were not nice :(, but this time, KEYANO agreed on a draw. This was enough for first place in number of draws :) but only third in overall score (standard place) :) Again, not bad for a four years old program, for which source, book(s), coefficients are available! - Igor Durdanovic LOGISTELLO ========== Well, what can I say. LOG didn't deserve to win this tournament since it showed a weak performance in a couple of games playing black and - more importantly - seized an already lost point from KEYANO due a fatal endgame error at 19 empty. Thanks KEYANO. Looking at the recent tournaments results it seems that I was too lazy lately while others were working hard to improve their programs. During the last year only LOG's book and book-algorithm has been improved and that's not enough to compete till the end of time. Isn't that a nice excuse? :) Recently, HANNIBAL gave LOG some instructive lessons after which I decided to work on LOG's search and evaluation function again. See you next year, guys. - Michael Buro WOLPERS ======= There are not a lot of things to say: I simply had not enough time to finish my program :-) I spent quite a lot of time to speed-up the endgame search of my program, which now evaluates about 600k nodes per second on an AXP/300 and solves at 21 or 22 empties in a 15 minutes game. And what did I have to see at PBIV? The other programs wld-solved already at 24 or 25 empties. This isn't just fair! Could anybody tell me the trick? Oh well, and there was of course another problem: My program wasn't prepared to play on a little endian machine, and so I had to turn off the most important feature of the evaluation routine (a 2*8 pattern in a fixed Hash table). I'd like to give you an example how badly WOLPERS used to evaluate positions, which lead into this nice position against CORN. WOLPERS vs. CORN, White to move: a b c d e f g h 1 . . . . . . . . 2 . . * . O * . . 3 O * O O * . . . 4 * * * * * * . . 5 O O O O * * . . 6 . O O * O * . . 7 . . * O . * . . 8 . O O O O * . . WOLPERS just played f8, expecting CORN to move g8. CORN played f7. The match was a draw in the end. Now dear fellows, I'd like you to answer me two questions: - How do you manage to finish your programs exactly one day before the tournament begins? - How for heaven's sake do you solve so early? I'm still waiting for a tournament on which I don't have to apologize for my program's poor performance. I was glad to participate PBIV for two reasons. First, it turned out that Ralf cooks surprisingly fine chili con carne. Second, it was amusing to comfort him when WOLPERS defeated WIGALD. - Alvaro Fussen