Construction and Elicitation of a Black Box Model in the Game of Bridge

05/04/2020 ∙ by Véronique Ventos, et al. ∙ 0

We address the problem of building a decision model for a specific bidding situation in the game of Bridge. We propose the following multi-step methodology: i) build a set of examples for the decision problem and use simulations to associate a decision with each example; ii) use supervised relational learning to build an accurate and readable model; iii) perform a joint analysis between domain experts and data scientists to improve the learning language, including the production by experts of a handmade model; iv) build a better, more readable and accurate model.


1 Introduction

Our goal is to model expert decision processes in Bridge. To do so, we propose a methodology involving human experts, black box decision programs, and relational supervised machine learning systems. The aim is to obtain a global model for this decision process that is both expressive and has high predictive performance. Following the success of supervised methods of the deep network family, and growing pressure from society for automated decision processes to be made more transparent, a growing number of AI researchers are (re)exploring techniques to interpret, justify, or explain "black box" classifiers (referred to as the Black Box Outcome Explanation Problem, Guidotti et al. (2019)). The task is to build, a posteriori, explicit models in symbolic languages, most often in the form of rules or decision trees, that explain the outcome of the classifier in a format intelligible to an expert. Such explicit models extracted by supervised learning can carry expert knowledge, Murdoch et al. (2019), which has intrinsic value for explainability, pedagogy, and evaluation in terms of ethics or equity (they can help to explain biases linked to the learning system or to the set of training examples). Learning a global model (i.e. one capable of explaining the class of any example) that is both interpretable and highly accurate has been identified as a difficult problem. Many recent approaches that build interpretable models adopt a simpler two-step method: a first step builds a global black box classifier that is as accurate as possible, and a second step generates a set of local explanations (linear models or rules) to justify the classification of a specific example (see for instance the popular LIME, Ribeiro et al. (2016), and ANCHORS, Ribeiro et al. (2018), systems).

We present here a complete methodology for acquiring a global model of a black box classifier as a set of relational rules, both as explicit and as accurate as possible. As we consider a game, i.e. a universe with precise and known rules, we are in the favorable case where it is possible to generate, on demand, data that i) is in agreement with the problem specification, and ii) can be labelled with correct decisions through simulations. The methodology we propose consists of the following elements:

  1. Problem modelling with relational representations, data generation and labelling.

  2. Initial investigation of the learning task and learning with relational learners.

  3. Interaction with domain experts who refine a learned model to produce a simpler alternative model, which is more easily understandable for domain users.

  4. Subsequent investigation of the learning task, taking into account the concepts used by experts to produce their alternative model, and the proposition of a new, and more accurate model.

The general idea is to maximally leverage interactions between experts and engineers, with each group building on the analysis of the other.

We approach the learning task using relational supervised learning methods from Inductive Logic Programming (ILP) Muggleton and Raedt (1994). The language of these methods, a restriction of first-order logic, allows learning compact rules, understandable by experts in the domain. The logical framework allows the use of a domain vocabulary together with domain knowledge defined in a domain theory, as illustrated in early work on the subject Legras et al. (2018).

The outline of the article is as follows. After a brief introduction to bridge in Section 2, we describe in Section 3 the relational formulation of the target learning problem, and the method for generating and labelling examples. We then briefly describe in Section 4 the ILP systems used, and the first set of experiments run on the target problem, along with their results (Section 4.2). In Section 5, bridge experts review a learned model’s output and build a powerful alternative model of their own, an analysis of which leads to a refinement of the ILP setup and further model improvements. Future research avenues are outlined in the conclusion.

2 Problem Addressed

2.1 Brief Introduction to the Game of Bridge

Bridge is played by four players in two competing partnerships, namely, North and South against East and West. A classic deck of 52 playing cards is shuffled and then dealt evenly amongst the players (13 cards each). The objective of each side is to maximize a score which depends on:

  • The vulnerability of each side. A non-vulnerable side incurs a small penalty when it does not make its contract, but earns a small reward when it does make it. In contrast, a vulnerable side faces higher risk and reward.

  • The contract reached at the conclusion of the auction (the first phase of the game). The contract is the commitment of a side to win a minimum number of tricks in the playing phase (the second phase of the game). The contract can either be in a Trump suit (♣, ♦, ♥, ♠) or in No Trumps (NT), affecting which suit (if any) gains extra privileges in the playing phase. An opponent may Double a contract, thus imposing a bigger penalty for failing to make the contract (but also a bigger reward for making it). A contract is denoted by nS (or nSX if it is Doubled), where n ∈ {1, …, 7} is the level of the contract and S ∈ {♣, ♦, ♥, ♠, NT} is the trump suit; a contract at level n commits the side to winning at least 6 + n tricks. The holder of the contract is called the declarer, and the partner of the declarer is called the dummy.

  • The number of tricks won by the declaring side during the playing phase. A trick containing a trump card is won by the hand playing the highest trump, whereas a trick not containing a trump card is won by the hand playing the highest card of the suit led.
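The trick-winning rule above, together with the standard convention that a contract at level n requires at least 6 + n tricks, can be sketched in a few lines of Python. This is only an illustration: the card encoding (rank character, suit letter) and the function names `tricks_required` and `trick_winner` are assumptions made for this example, not part of the authors' system.

```python
from typing import List, Optional, Tuple

RANKS = "23456789TJQKA"  # ascending rank order; "T" stands for the ten (assumed encoding)

Card = Tuple[str, str]  # (rank, suit letter), e.g. ("A", "S") for the Ace of spades


def tricks_required(level: int) -> int:
    """A contract at level n (1..7) commits the declaring side to 6 + n tricks."""
    if not 1 <= level <= 7:
        raise ValueError("level must be between 1 and 7")
    return 6 + level


def trick_winner(cards: List[Card], trump: Optional[str]) -> int:
    """Return the index, in play order, of the card that wins the trick.

    cards[0] is the card led; trump is a suit letter, or None for No Trumps.
    """
    led_suit = cards[0][1]
    trumps_played = [i for i, (_, suit) in enumerate(cards) if suit == trump]
    # If any trump was played, only trumps can win; otherwise only the suit led can.
    candidates = trumps_played or [
        i for i, (_, suit) in enumerate(cards) if suit == led_suit
    ]
    return max(candidates, key=lambda i: RANKS.index(cards[i][0]))
```

For instance, with spades as trumps, a low spade played on a heart lead beats the Ace of hearts, whereas in No Trumps the Ace of hearts wins the same trick.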

For more details about the game of bridge, the reader can consult ACBL (2019). Two concepts are essential for the work presented here:

  • Auction: This gives each player (the first being called the dealer) the opportunity to disclose coded information about their hand or game plan to their partner. Each player bids in turn, clockwise, using as a language the elements: Pass, Double, or a contract higher than the previous bid (where ♣ < ♦ < ♥ < ♠ < NT at each level). The last bid contract, followed by three Passes, is the one that must be played. Note: the coded information given by a player is decipherable by both their partner and the opponents, so a player can only deceive the opponents if they are also willing to deceive their partner. In practice, extreme deception in the auction is rare, but for both strategic and practical reasons, the information shared in the auction is usually far from complete.

  • The evaluation of the strength of a hand: Bridge players assign a value to the highest cards: an Ace is worth 4 HCP (High Card Points), a King 3 HCP, a Queen 2 HCP and a Jack 1 HCP. Information given by the players in the auction often relates to their number of HCP and their distribution (the number of cards in one or more suits).
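Both concepts are easy to make concrete. The following minimal Python sketch counts HCP with the values given above and compares two bids using the usual ordering of denominations within a level (clubs, diamonds, hearts, spades, No Trumps). The hand encoding (a dictionary from suit letter to a string of ranks), the names `hcp` and `outranks`, and the example hand are hypothetical choices made for illustration.

```python
HCP_VALUES = {"A": 4, "K": 3, "Q": 2, "J": 1}

# Denominations in increasing order within a level.
STRAINS = ("C", "D", "H", "S", "NT")


def hcp(hand):
    """Total High Card Points of a hand given as {suit letter: string of ranks}."""
    return sum(HCP_VALUES.get(rank, 0) for ranks in hand.values() for rank in ranks)


def outranks(new_bid, previous_bid):
    """True if new_bid = (level, strain) is a legal overcall of previous_bid."""
    new_level, new_strain = new_bid
    old_level, old_strain = previous_bid
    return (new_level, STRAINS.index(new_strain)) > (
        old_level,
        STRAINS.index(old_strain),
    )


# Hypothetical 13-card hand: seven spades headed by Ace-King-Jack.
example_hand = {"S": "AKJ8752", "H": "7", "D": "J863", "C": "4"}
print(hcp(example_hand))            # HCP of the example hand
print(outranks((4, "NT"), (4, "S")))  # NT ranks above spades at the same level
```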

2.2 Problem Statement

After receiving suggestions from bridge experts, we chose to analyse the following situation:

  • West, the dealer, bids 4♠, which (roughly) means that they have a minimum of 7 spades, and a maximum of 10 HCP in their hand.

  • North Doubles, (roughly) meaning that they have a minimum of 13 HCP and, unless they have a very strong hand, a minimum of three cards in each of the other suits (♣, ♦ and ♥).

  • East passes, which has no particular meaning.

South must then make a decision: pass and let the opponents play 4♠ Doubled, or bid and have their side play a contract. This is a high-stakes decision for which bridge experts have yet to agree on a precise formulation. Our objective is to develop a methodology for representing this problem, and to find accurate and explainable solutions using relational learners. It should be noted that Derek Patterson was interested in solving this problem using genetic algorithms, Patterson (2008).

In the remainder of the article, we describe the various processes used in data generation, labelling, supervised learning, followed by a discussion of the results and the explicit models produced. These processes use relational representations of the objects and models involved, keeping bridge experts in the loop and allowing them to make adjustments where required.

In the next section we consider the relational formulation of the problem, and the data generation and labelling.

3 Automatic Data Generation Methodology

The methodology to generate and label the data consists of the following steps:

  • Problem modelling

  • Automatic data generation

  • Automatic data labelling

  • ILP framing

These steps are the precursors to running relational rule induction (Aleph) and decision tree induction (Tilde) on the problem.

3.1 Problem Modelling

The problem modelling begins by asking experts to define, in the context described above, two rule-based models: one to characterize the hands such that West makes the 4♠ bid, and another to characterize the hands such that North makes the Double bid. These rule-based models are submitted to simulations allowing the experts to interactively validate their models. With the final specifications, we are able to generate examples for the target problem. For this section we introduce the terms:

  • nmpq exact distribution, which indicates that the hand has n cards in ♠, m cards in ♥, p cards in ♦ and q cards in ♣, where n + m + p + q = 13.

  • nmpq distribution refers to an exact distribution sorted in decreasing order (thus ignoring the suit information). For instance, a 2731 exact distribution is associated with the 7321 distribution.

  • |c| is the number of cards held in the suit c.
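As a small illustration of these terms, the sketch below computes the exact distribution, the (sorted) distribution and the length of a suit for a hand. The suit order spades, hearts, diamonds, clubs used for the exact distribution is an assumption, as are the hand encoding and the function names; tuples are used instead of four-digit strings so that suits of ten or more cards remain unambiguous.

```python
SUITS = ("S", "H", "D", "C")  # assumed order for the exact distribution


def suit_length(hand, suit):
    """Number of cards held in the given suit."""
    return len(hand[suit])


def exact_distribution(hand):
    """Suit lengths in the fixed order spades, hearts, diamonds, clubs."""
    return tuple(suit_length(hand, suit) for suit in SUITS)


def distribution(hand):
    """Exact distribution sorted in decreasing order, ignoring which suit is which."""
    return tuple(sorted(exact_distribution(hand), reverse=True))


# Hypothetical hand used only for illustration.
hand = {"S": "AKJ8752", "H": "7", "D": "J863", "C": "4"}
print(exact_distribution(hand))
print(distribution(hand))
```

Two hands with the same distribution can have different exact distributions, which is why rules that only care about the shape of a hand are written on the sorted form.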

Modelling the 4♠ bid

Experts have modeled the 4♠ bid by defining a disjunction of 17 rules that relate to the West hand:

\[
4\spadesuit \;\leftarrow\; \bigvee_{i=1}^{17} \left( C_0 \wedge V_i \wedge C_i \right) \tag{1}
\]

in which

  • $C_0$ is a condition common to the 17 rules and is reported in the listing of common 4♠ rules in the Appendix.

  • $V_i$ is one of the four possible vulnerability configurations (no side vulnerable, both sides vulnerable, exactly one of the two sides vulnerable).

  • $C_i$ is a condition specific to the $i$-th rule.

For instance, the vulnerability configuration and the specific condition of two of these rules are:

  • East-West not vulnerable, North-South vulnerable.

  • all of:

    • ,

    • exactly 2 cards among the Ace, King, Queen and Jack of ♠,

    • a 7321 distribution.

  • East-West not vulnerable, North-South vulnerable.

  • all of:

    • or ,

    • exactly 2 cards among the Ace, King, Queen and Jack of ♠,

    • a distribution with .
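The structure of such a rule-based model (a common condition, combined with a disjunction of rules that each pair a vulnerability configuration with a specific condition) can be sketched as follows. This is a hedged illustration rather than the experts' actual model: the common condition is reported in the Appendix and is replaced here by a placeholder that accepts every hand, the honours are assumed to be counted in spades, and all names (`Rule`, `satisfies_bid`, `shape`, `top_honours`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Hand = Dict[str, str]              # {suit letter: string of ranks}
Vulnerability = Tuple[bool, bool]  # (east_west_vulnerable, north_south_vulnerable)


@dataclass
class Rule:
    vulnerability: Vulnerability
    specific: Callable[[Hand], bool]


def shape(hand: Hand) -> Tuple[int, ...]:
    """Suit lengths sorted in decreasing order, e.g. (7, 3, 2, 1)."""
    return tuple(sorted((len(ranks) for ranks in hand.values()), reverse=True))


def top_honours(hand: Hand, suit: str) -> int:
    """Number of cards among Ace, King, Queen and Jack held in the suit."""
    return sum(1 for rank in hand[suit] if rank in "AKQJ")


def satisfies_bid(hand: Hand, vul: Vulnerability,
                  common: Callable[[Hand], bool], rules: List[Rule]) -> bool:
    """Common condition AND at least one (vulnerability, specific condition) pair."""
    return common(hand) and any(
        rule.vulnerability == vul and rule.specific(hand) for rule in rules
    )


# Placeholder for the common condition (assumption: the real one is in the Appendix).
common_placeholder = lambda hand: True

# One rule in the spirit of the first example: East-West not vulnerable,
# North-South vulnerable, two top spade honours (assumed suit), 7321 shape.
example_rule = Rule(
    vulnerability=(False, True),
    specific=lambda h: top_honours(h, "S") == 2 and shape(h) == (7, 3, 2, 1),
)

west = {"S": "AK87652", "H": "943", "D": "T5", "C": "2"}  # hypothetical hand
print(satisfies_bid(west, (False, True), common_placeholder, [example_rule]))
```

Keeping the vulnerability test separate from the hand predicate mirrors the way the rules are stated, and makes it straightforward to add the remaining rules to the list.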

To generate boards that satisfy these rules, we randomly generated complete boards (all 4 hands), and kept the boards where the West hand satisfies one of the 17 rules. The experts were able to iteratively adjust the rules as they analysed the boards that passed through the filter. Once the experts were satisfied with the samples, 8,200,000 boards were randomly generated to analyse rule adherence, and 10,105 of them contained a West hand satisfying one of the 17 rules. All rules were satisfied at least once. One of them, for example, was satisfied 16.2% of the time, and another 15% of the time.
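This generate-and-filter procedure is a form of rejection sampling, and the reported counts correspond to an acceptance rate of 10,105 / 8,200,000, roughly 0.12%. The sketch below shows the idea under strong simplifying assumptions: the filter is the rough description of the 4♠ bid given in the problem statement (at least 7 spades and at most 10 HCP), not the 17 expert rules, and the names `deal`, `rough_west_filter` and `sample_boards` are hypothetical.

```python
import random

RANKS = "AKQJT98765432"
SUITS = "SHDC"
DECK = [(rank, suit) for suit in SUITS for rank in RANKS]
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}
SEATS = ("N", "E", "S", "W")


def deal(rng):
    """Shuffle the deck and give 13 cards to each of the four seats."""
    cards = DECK[:]
    rng.shuffle(cards)
    return {seat: cards[13 * i:13 * (i + 1)] for i, seat in enumerate(SEATS)}


def rough_west_filter(hand):
    """Stand-in for the expert rules: at least 7 spades and at most 10 HCP."""
    spades = sum(1 for _, suit in hand if suit == "S")
    points = sum(HCP.get(rank, 0) for rank, _ in hand)
    return spades >= 7 and points <= 10


def sample_boards(n_trials, keep, seed=0):
    """Generate n_trials random boards and keep those whose West hand passes."""
    rng = random.Random(seed)
    return [board for board in (deal(rng) for _ in range(n_trials))
            if keep(board["W"])]


kept = sample_boards(5000, rough_west_filter, seed=1)
print(len(kept), "boards kept out of 5000")
print("reported acceptance rate:", 10_105 / 8_200_000)
```

Because accepted boards are rare, most of the generation effort is spent on rejected deals; this is acceptable here since dealing and filtering are cheap compared with labelling.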

Double Modelling

Likewise, the bridge experts also modeled the North Double by defining a disjunction of 3 rules relating to the North hand:

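\[
\text{Double} \;\leftarrow\; C_0 \wedge \left( C_1 \vee C_2 \vee C_3 \right) \tag{2}
\]

This time, the conditions do not depend on the vulnerability. The common condition $C_0$ and the three specific conditions $C_1$, $C_2$ and $C_3$ are, in that order, as follows: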

  • - for all , and not ( and ).

  • - HCP 13 and 1.

  • - HCP 16 and and 3 and 3 and 3.

  • - HCP 20.

The same expert validation process was carried out as for the 4♠ bid. Generating 70,000,000 boards resulted in 10,007 boards satisfying at least one 4♠ bid rule for the West hand and at least one Double rule for the North hand. All rules relating to Double were satisfied at least once. One of them, for example, was satisfied 69.3% of the time, and another 24.8% of the time.
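The two reported counts allow a quick back-of-the-envelope check of how rare the target situation is. The sketch below derives the joint acceptance rate (about 0.014%, or roughly one board in 7,000), a rough estimate of how often North's hand qualifies for a Double once West's hand qualifies for 4♠, and the expected number of random boards needed to obtain a given number of accepted ones. The conditional estimate assumes that both runs sample boards in the same way; the function names are hypothetical.

```python
def acceptance_rate(kept, generated):
    """Fraction of randomly generated boards that pass the filter."""
    return kept / generated


def expected_trials(target, rate):
    """Expected number of random boards needed to obtain `target` accepted boards."""
    return target / rate


# Counts reported for the 4-spade rules alone and for both rule sets together.
west_rate = acceptance_rate(10_105, 8_200_000)
joint_rate = acceptance_rate(10_007, 70_000_000)

# Rough estimate of P(North satisfies a Double rule | West satisfies a 4-spade rule).
double_given_west = joint_rate / west_rate

# Expected number of boards to generate in order to keep 1,000 of them.
boards_needed = expected_trials(1_000, joint_rate)

print(f"joint rate: {joint_rate:.6%}")
print(f"Double given 4 spades: {double_given_west:.1%}")
print(f"boards needed for 1,000 examples: {boards_needed:,.0f}")
```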

3.2 Data Generation

The first step of the data generation process is to generate a number of South hands in the context described by the 4♠ and Double rules mentioned above. Note, again, that East's Pass is not governed by any rules, which is close to the real situation.

We first generated 1,000 boards whose West hands satisfied at least one 4♠ bid rule and whose North hands satisfied at least one Double rule. One such board is displayed in Example 1:

Example 1

A board (North-South vulnerable / East-West not vulnerable) which contains a West hand satisfying the 4♠ bid rules and a North hand satisfying the Double rules:

10 3
J 8 4
A 7
Q J 8 7 6 2
A K J 8 7 5 2
7
J 8 6 3