Automated Configuration of Negotiation Strategies

03/31/2020 ∙ by Bram M. Renting, et al. ∙ Universiteit Leiden ∙ Delft University of Technology

Bidding and acceptance strategies have a substantial impact on the outcome of negotiations in scenarios with linear additive and nonlinear utility functions. Over the years, it has become clear that there is no single best strategy for all negotiation settings, yet many fixed strategies are still being developed. We envision a shift in the strategy design question from "What is a good strategy?" towards "What could be a good strategy?" For this purpose, we developed a method leveraging automated algorithm configuration to find the best strategies for a specific set of negotiation settings. By empowering automated negotiating agents using automated algorithm configuration, we obtain a flexible negotiation agent that can be configured automatically for a rich space of opponents and negotiation scenarios. To critically assess our approach, the agent was tested in an ANAC-like bilateral automated negotiation tournament setting against past competitors. We show that our automatically configured agent outperforms all other agents, with a 5.1% increase in negotiation payoff compared to the number two. We note that without our agent in the tournament, the top-ranked agent wins by a margin of only 0.01%.


1. Introduction

Since the 1980s, researchers have tried to design algorithms (or software agents) that can assist or act on behalf of humans in negotiations. Early contributors to this field include Smith, Sycara, Robinson, Rosenschein, and Klein; see Smith (1980); Sycara (1988); Sycara-Cyranski (1985); Robinson (1990); Rosenschein (1986); Klein and Lu (1989).

In 2010, the GENIUS platform was created to provide a test-bed for evaluating new developments in the field of automated negotiation. Alongside, the ANAC competition series was organized to stimulate the development of negotiation algorithms in academia. Every year, ANAC poses a new challenge for contestants to cope with. Today, the combined effort of GENIUS and ANAC has resulted in a standardized test-bed with more than 100 negotiating agents and negotiation scenarios that are readily accessible for research on automated negotiation Baarslag et al. (2015).

The negotiators are generally hard-coded software agents, based on a strategy with fixed parameters that are tuned at design time to optimize their behavior. The difficulty lies not in developing a negotiator, but in winning the competition, as both the configuration space and the space of negotiation scenarios are large, and the competing agents change every year.

This makes manual configuration on larger sets of negotiation instances tedious, time-consuming and impractical. Furthermore, note that evaluating a single strategy on a large set of negotiation scenarios takes too much time to be practical.

To avoid these difficulties, agents have been configured on smaller sets Matos et al. (1998). Attempts were made to automate this process, for example using genetic programming Holland (1992), but again only on specific and simplified test sets. For instance, agents were only tested in one or two scenarios, or merely optimized against themselves Eymann (2001); Dworman et al. (1996). The resulting agents are highly specialized, with unpredictable performance when negotiating outside of their comfort zone. No attempts have been reported at automating this configuration task on large-scale, broad sets of negotiation scenarios and opponent strategies.

In this work, we present a solution for the automated algorithm configuration problem for automated negotiation on large problem sets. We recreate a negotiation agent from the literature Lau et al. (2006) that is configured manually, combine it with contemporary opponent learning techniques and create a configuration space of its strategic behavior. To automatically configure this conceptually rich and highly parametric design, we use SMAC, a general-purpose automated algorithm configuration procedure that has been used previously to optimize the performance of cutting-edge solvers for SAT, MIP and other NP-hard problems. We note that here, we apply automated algorithm configuration for the first time to a multi-agent problem.

The aim of this work is to automatically configure a negotiation algorithm with no fixed or pre-defined strategy. This agent can be configured to perform well on a user-defined set of training problem instances, with few restrictions on the size of the instances or instance sets. To demonstrate its performance, we configure the agent in an attempt to win an ANAC-like bilateral tournament.

We show that we can win such a tournament with a comfortable margin of 5.1% in increased negotiation payoff compared to the number two. These margins are not observed in a tournament without our negotiation agent, where the winning strategy obtains a marginal improvement in negotiation payoff of 0.012%.

2. Related work

In this section, we discuss related work in the field of automated algorithm configuration, as well as some past applications in the research area of automated negotiation.

2.1. Automated algorithm configuration

In the literature, automated algorithm configuration is also referred to as parameter tuning or hyperparameter optimization (in machine learning). It can be formally described as follows: given a parameterized algorithm $A$, a set of problem instances $I$ and a cost metric $c$, find parameter settings of $A$ that minimize $c$ on $I$ Hutter et al. (2011). The configuration problem occurs for example in solvers for MIP problems Hutter et al. (2010), neural networks, classification pipelines, and every other algorithm that contains performance-relevant parameters.

These configuration problems can be solved by basic approaches such as manual search, random search, and grid search, but over the years researchers developed more intelligent methods to obtain the best possible configuration for an algorithm. Two separate parts can be identified within these methods: how new configurations are selected for evaluation and how a set of configurations is compared.
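To make the problem statement concrete, the following is a minimal sketch of random search, one of the basic approaches named above. The toy cost function, the single parameter `x`, and all function names are illustrative assumptions and do not come from the paper.

```python
import random

def average_cost(config, instances, cost):
    """Mean cost of one configuration over a set of problem instances."""
    return sum(cost(config, inst) for inst in instances) / len(instances)

def random_search(sample_config, instances, cost, n_samples, seed=0):
    """Return the sampled configuration with the lowest average cost."""
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(n_samples):
        config = sample_config(rng)
        c = average_cost(config, instances, cost)
        if c < best_cost:
            best_config, best_cost = config, c
    return best_config, best_cost

# Hypothetical toy problem: one real-valued parameter x, instances are targets,
# and the cost is the squared distance between x and the target.
toy_instances = [0.2, 0.4, 0.6]
toy_cost = lambda config, inst: (config["x"] - inst) ** 2
toy_sampler = lambda rng: {"x": rng.uniform(0.0, 1.0)}

best, best_c = random_search(toy_sampler, toy_instances, toy_cost, n_samples=500)
```

Note that every sampled configuration is evaluated on the full instance set here, which is exactly the expense that racing and model-based methods try to avoid.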

F-Race Birattari et al. (2010) races a set of configurations against each other on an incremental set of target instances and drops low performing configurations in the process. This saves computational budget, as not all configurations have to be tested on the full target instance set. The set of configurations to test can be selected either manually, as a grid search, or at random. Balaprakash et al. Balaprakash et al. (2007) extended upon F-Race by implementing it as a model-based search Zlochin et al. (2004), which iteratively models and samples the configuration space in search of promising candidate configurations.

ParamILS Hutter et al. (2009) does not use a model, but instead performs a local tree search operation to iteratively find better configurations. Like F-Race, ParamILS is capable of eliminating low performing configurations without evaluating them on the full set of instances.

Another popular method of algorithm configuration is GGA Ansótegui et al. (2009), which makes use of genetic programming to find configurations that perform well. This method does not model the configuration space and has no method to eliminate low performing configurations early.

The final method we want to mention is SMAC, an algorithm configuration method that uses a random forest model to predict promising configurations. It also includes a mechanism for eliminating unpromising configurations early, by comparing them with the incumbent (the best configuration found so far) on individual problem instances.

2.2. Automated configuration in negotiation agents

Earlier attempts at solving the automated configuration problem in automated negotiation mostly used basic approaches, such as random and grid search. The only advanced method used to configure negotiation strategies is the genetic algorithm.

Matos et al. Matos et al. (1998) encoded a mix of baseline tactics as a chromosome and deployed a genetic algorithm to find the best mix. They assumed perfect knowledge of the opponent's preferences, and their strategy is only tested against itself on a single negotiation scenario. Eymann Eymann (2001) encoded a more complex strategy as a chromosome with 6 parameters, again only testing its performance against itself and using the same scenario. Dworman et al. Dworman et al. (1996) implement the genetic algorithm in a coalition game with 3 players, with a strategy in the form of a hard-coded if-then-else rule. The parameters of the rule are implemented as a chromosome. The strategy is tested against itself on a coalition game with varying coalition values. Lau et al. Lau et al. (2006) use a genetic algorithm to explore the outcome space during a negotiation session, but do not use it to change the strategy.

3. Preliminaries

Automated negotiation is performed by software agents called parties, negotiation agents or simply agents. Agents that represent opposing parties in negotiation are also referred to as opponents. We focus solely on negotiations between two parties, which is known as bilateral negotiation. The software platform that we use for agent construction and testing is GENIUS Lin et al. (2014), which contains all the necessary components to set up a negotiation, allowing us to focus solely on agent construction.

In this paper, we use the SAOP as negotiation protocol, which is the formalization of the AOP in GENIUS. Here, agents take turns and at each turn either make a (counter-)offer, accept the current offer, or walk away. This continues until one of the parties agrees, or a deadline is reached, which is set to 60 seconds in this paper (normalized to $t \in [0, 1]$).
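The turn-taking described above can be illustrated with a short sketch. It is not the GENIUS implementation: rounds stand in for the real-time deadline, and the agent interface, the linear concession rule and the one-dimensional outcome space are all assumptions made for illustration.

```python
def run_saop(agent_a, agent_b, max_rounds):
    """Alternate turns until an agent accepts, walks away, or the deadline passes.

    Each agent is a function (last_offer, t) -> (action, outcome), where action is
    "offer", "accept" or "end", and t in [0, 1] is the normalized time.
    """
    agents = [agent_a, agent_b]
    last_offer = None
    for turn in range(max_rounds):
        t = turn / max_rounds
        action, outcome = agents[turn % 2](last_offer, t)
        if action == "end":
            return None                # a party walked away
        if action == "accept":
            if last_offer is not None:
                return last_offer      # agreement on the standing offer
            continue                   # nothing on the table to accept yet
        last_offer = outcome           # the (counter-)offer replaces the standing offer
    return None                        # deadline reached without agreement

def make_agent(utility, outcomes):
    """Hypothetical agent that lowers its target utility linearly from 1.0 to 0.4."""
    def agent(last_offer, t):
        target = 1.0 - 0.6 * t
        if last_offer is not None and utility(last_offer) >= target:
            return ("accept", None)
        candidates = [o for o in outcomes if utility(o) >= target]
        return ("offer", min(candidates, key=utility))
    return agent

# Toy scenario: the outcome is the share of a pie that goes to agent A.
shares = [k / 10 for k in range(11)]
agent_a = make_agent(lambda o: o, shares)
agent_b = make_agent(lambda o: 1.0 - o, shares)
agreement = run_saop(agent_a, agent_b, max_rounds=20)
```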

Besides a protocol we need a set of opponent agents to negotiate against and a set of scenarios to negotiate over. We call the combination of a single opponent and a single scenario a negotiation setting or negotiation instance .

3.1. Scenario

The negotiations in this paper are performed over multi-issue scenarios. Past research has already described how to define and use such scenarios in automated negotiation Raiffa (1982); Marsa-Maestre et al. (2014); Baarslag (2014). We adopt these standards in this paper and describe them briefly.

An issue is a sub-problem in the negotiation for which an agreement must be found. It can be either numerical or categorical. The set of possible solutions in an issue is denoted by and the Cartesian product of all the issues in a scenario forms the total outcome space . An outcome is denoted by .

Every party has his own preferences over the outcome space expressed through a utility function , such that , where a score of 1 is the maximum. We refer to our own utility function with and to the opponents utility function with . The negotiations are performed under incomplete information, so the utility of the opponent is predicted, which is denoted by .

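Each scenario has a Nash bargaining solution Nash (1950) that we will use for performance analyses. Equation 1 defines this solution as the outcome that maximizes the product of both parties' utilities, where $\Omega$ is the outcome space and $u$ and $u_o$ are our own and the opponent's utility functions.

$$\omega_{Nash} = \arg\max_{\omega \in \Omega} \; u(\omega) \cdot u_o(\omega) \qquad (1)$$

We simplify in this paper by eliminating the reservation utility and discount factor from the scenarios for the experiments.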
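For a finite outcome space, the Nash bargaining solution without reservation utilities can be found by enumeration: it is the outcome with the largest product of the two utilities. The sketch below assumes a small hypothetical pie-splitting scenario; the function and variable names are illustrative.

```python
def nash_solution(outcomes, u_own, u_opp):
    """Outcome that maximizes the product of both parties' utilities."""
    return max(outcomes, key=lambda w: u_own(w) * u_opp(w))

# Hypothetical scenario: the outcome is our share of a pie, in steps of 0.1.
pie_shares = [k / 10 for k in range(11)]
nash = nash_solution(pie_shares, u_own=lambda w: w, u_opp=lambda w: 1.0 - w)
```

Enumeration is only practical for moderately sized outcome spaces, but it needs to be done once per scenario and can be computed offline.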

3.2. Dynamic agent

We first create a Dynamic Agent with a flexible strategy equivalent to a configuration space. We implement a few popular components and add their design choices to the configuration space, increasing the chances that it contains a successful strategy. We refer to this configuration space (or strategy space) with . We name the constructed agent Dynamic Agent , with strategy .

The dynamic agent is constructed on the basis of the BOA-architecture Baarslag (2014). We use this structure to give a brief overview of the workings of the dynamic agent and its configuration space.

3.2.1. Bidding strategy

The implemented bidding strategy applies a fitness value to the outcome space and selects the outcome with the highest fitness as the offer, which is an approach used by Lau et al. Lau et al. (2006). This fitness function balances between our utility, the opponent’s utility and the remaining time towards the deadline. Such a tactic is also known as a time dependent tactic and generally concedes towards the opponent as time passes.

The fitness function in Equation 2 has three parameters:

  • A trade-off factor that balances between the importance of our own utility and the importance of reaching an agreement.

  • A factor to control an agents eagerness to concede relative to time, where . Boulware if , linear conceder if , conceder if .

  • A categorical parameter that sets the outcome where the fitness function concedes towards over time (Equation 3). Here, is the last offer made by the opponent and is the best offer the opponent made in terms of our utility.

(2)
(3)
Outcome space exploration

The outcome space is potentially large. To reduce computational time and to ensure a fast response time of our agent, we apply a genetic algorithm to explore the outcome space in search of the best outcome. Standard procedures such as elitism, mutation and uniform crossover are applied, and the parameters of the genetic algorithm are added to the configuration space.
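As a rough sketch of such an exploration step, the following genetic algorithm searches the Cartesian product of issue values for a high-fitness outcome, using tournament selection, uniform crossover, per-issue mutation and elitism. The default parameter values and the toy fitness function are assumptions for illustration and are not the settings used in the paper.

```python
import random

def explore(issues, fitness, pop_size=30, tournament=3, generations=20,
            crossover_rate=0.6, mutation_rate=0.05, elitism_rate=0.1, seed=0):
    """Search the Cartesian product of `issues` for a high-fitness outcome."""
    rng = random.Random(seed)
    population = [tuple(rng.choice(values) for values in issues)
                  for _ in range(pop_size)]
    n_elite = max(1, int(elitism_rate * pop_size))

    def select():
        # Tournament selection: the best of a few randomly drawn individuals.
        return max(rng.sample(population, tournament), key=fitness)

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_population = ranked[:n_elite]            # elitism keeps the best
        while len(next_population) < pop_size:
            parent_a, parent_b = select(), select()
            if rng.random() < crossover_rate:         # uniform crossover
                child = tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))
            else:
                child = parent_a
            child = tuple(                            # per-issue mutation
                rng.choice(values) if rng.random() < mutation_rate else value
                for value, values in zip(child, issues)
            )
            next_population.append(child)
        population = next_population
    return max(population, key=fitness)

# Hypothetical scenario: 4 issues with 5 values each; the fitness is the sum of values.
toy_issues = [list(range(5)) for _ in range(4)]
best_outcome = explore(toy_issues, fitness=sum)
```

Because of elitism, the best fitness in the population never decreases from one generation to the next.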

Configuration space

The configuration space of the bidding strategy is summarized in Table 1.

Description Symbol Domain
Trade-off factor
Conceding factor
Conceding goal
Population size
Tournament size
Evolutions
Crossover rate
Mutation rate
Elitism rate
Table 1. Configuration space in bidding strategy

3.2.2. Opponent model

The Smith Frequency model Van Galen Last (2012) is used to estimate the opponent's utility function. According to an analysis by Baarslag et al. Baarslag et al. (2013), the performance of this opponent modelling method is already quite close to that of the perfect model. No parameters are added to the configuration space of the Dynamic Agent.

3.2.3. Acceptance strategy

The acceptance strategy decides when to accept an offer from the opponent. Baarslag et al. Baarslag et al. (2014) performed an isolated empirical study of popular acceptance conditions. They combined acceptance conditions and showed that a combined approach outperforms its parts. Baarslag et al. defined four parameters and performed a grid search to find the best strategy. We adopt the combined approach and add its parameters (Table 2) to the configuration space of the Dynamic Agent. For more details on the combined acceptance condition, see Baarslag et al. (2014).
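The sketch below shows one plausible way to combine acceptance conditions using the four parameters of Table 2. The exact combination rule is not given in this text, so the rule used here (accept if the scaled opponent offer is at least as good as our next bid, or if the accepting time has passed and the offer exceeds a lower bound) is an assumption, as are the parameter names and default values.

```python
def accept(u_offer, u_next_bid, t, scale=1.0, gap=0.0,
           accept_time=0.99, lower_bound=0.0):
    """Assumed combined acceptance condition (hypothetical parameter names).

    u_offer:     our utility of the opponent's standing offer
    u_next_bid:  our utility of the bid we would send next
    t:           normalized time in [0, 1]
    """
    ac_next = scale * u_offer + gap >= u_next_bid   # offer is as good as our next bid
    ac_time = t >= accept_time                      # the deadline is close
    ac_const = u_offer >= lower_bound               # offer clears the lower boundary
    return ac_next or (ac_time and ac_const)
```

With `scale=1.0` and `gap=0.0`, the first condition reduces to accepting any offer that is at least as good as the bid the agent was about to make.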

Description Symbol Domain
Scale factor
Utility gap
Accepting time
Lower boundary utility
Table 2. Configuration space in acceptance strategy

3.3. Problem definition

The negotiation agents in the GENIUS environment are mostly based on manually configured strategies by competitors in ANAC. These agents almost always contain parameters that are set by trial and error, despite the abundance of automated algorithm configuration techniques (e.g. Genetic Algorithm Holland (1992)). Manual configuration is a difficult and tedious job due to the dimensionality of both the configuration and the negotiation problem space.

A few attempts were made to automate this process, as discussed in Section 2, but only on very specific negotiation settings with few configuration parameters. The main reason for this is that many automated configuration algorithms require evaluating each challenger configuration on the full training set. To illustrate, evaluating the performance of a single configuration on the full training set that we use in this paper would take ~18.5 hours, regardless of the hardware, due to the real-time deadline. These methods of algorithm configuration are therefore impractical.
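The arithmetic behind this estimate is simple: with a real-time deadline, the worst-case duration of one full evaluation is the number of instances times the deadline. The sketch below assumes that every session runs until the 60 second deadline (sessions that end in an early agreement are shorter), under which roughly 18.5 hours corresponds to about 1110 sessions. The function names are illustrative.

```python
def worst_case_hours(n_instances, deadline_seconds=60):
    """Upper bound on the time needed to run every instance once, sequentially."""
    return n_instances * deadline_seconds / 3600

def sessions_for(hours, deadline_seconds=60):
    """Number of full-length sessions that fit in the given number of hours."""
    return hours * 3600 / deadline_seconds

implied_sessions = sessions_for(18.5)   # assumes each session uses the full deadline
```

Faster hardware does not reduce this bound, because the deadline is measured in wall-clock time.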

Automated strategy configuration

We have an agent called Dynamic Agent , with strategy . We want to configure this agent, such that it performs generally well, using automated configuration methods. More specifically, we want the agent to perform generally well in bilateral negotiations with a real time deadline of . To do so, we take a diverse and large set of both agents of size and scenarios of size that we use for training, making the total amount of training instances . Running all negotiation settings in the training set would take minutes or hours, regardless of the hardware as we use real time deadlines.

Now suppose we have a setting for the Dynamic Agent based on the literature and a setting that is hand tuned based on intuition, modern literature and manual tuning that we consider baselines. Can we automatically configure a strategy that outperforms the baselines and wins an anac-like bilateral tournament on a never before seen test set of negotiation instances ?

4. Automated configuration

The goal of our work is to create an agent that can be configured to obtain a negotiation strategy that performs well in a given setting. This requires us to define what it means for a strategy to perform well. An obvious performance measure is the utility $o(\theta, \pi)$ obtained using strategy $\theta$ in negotiation instance $\pi$. As we are interested in optimizing performance on the full set of training instances rather than for a single instance, we define the performance of a configuration on an instance set as the average utility:

$$O(\theta, \Pi) = \frac{1}{|\Pi|} \sum_{\pi \in \Pi} o(\theta, \pi) \qquad (4)$$

where:

$o(\theta, \pi)$: utility of configuration $\theta$ on instance $\pi$
$O(\theta, \Pi)$: average utility of configuration $\theta$ on instance set $\Pi$
$\theta$: parameter configuration
$\pi$: single negotiation instance consisting of opponent agent $a$ and scenario $s$, where $\pi = (a, s) \in \Pi$
$\Pi$: set of negotiation instances
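A minimal sketch of this performance measure is given below: the instance set is built as every combination of an opponent and a scenario, and the performance of a configuration is its mean utility over that set. The opponent and scenario names and the lookup table of utilities are hypothetical placeholders for actual negotiation sessions.

```python
from itertools import product

def average_utility(config, instances, negotiate):
    """Mean utility of one configuration over a set of negotiation instances."""
    return sum(negotiate(config, inst) for inst in instances) / len(instances)

# Hypothetical instance set: every opponent paired with every scenario.
opponents = ["opponent_a", "opponent_b"]
scenarios = ["scenario_1", "scenario_2", "scenario_3"]
instance_set = list(product(opponents, scenarios))

# Hypothetical stand-in for running a negotiation session: a table of utilities.
toy_results = dict(zip(instance_set, [0.5, 0.7, 0.9, 0.6, 0.8, 0.7]))
toy_negotiate = lambda config, inst: toy_results[inst]

performance = average_utility({"name": "demo"}, instance_set, toy_negotiate)
```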

As stated in Section 3.3, automated configuration methods that require evaluation on the full training set of instances, thus requiring Equation 4 to be calculated, are impractical for our application. A second component that influences the number of required evaluations is the mechanism that selects configurations for evaluation. This is not a straightforward problem, as the configuration space is large, and simple approaches, such as random search and grid search, suffer from the curse of dimensionality.

4.1. SMAC

To solve the problem defined in Section 3.3, we bring SMAC, a prominent, general-purpose algorithm configuration procedure Hutter et al. (2011), into the research area of automated negotiation. We note that SMAC is well suited for tackling the configuration problem arising in the context of our study:

  1. It can handle different types of parameters, including real- and integer-valued as well as categorical parameters.

  2. It can configure on subsets of the training instance set, reducing the computational expense.

  3. It has a mechanism to terminate poorly performing configurations early, saving computation time. If it detects that a configuration is performing very poorly on a small set of instances (e.g., a very eager conceder), it stops evaluating and drops the configuration.

  4. It models the relationship between parameter settings, negotiation instance features and performance, which tends to significantly reduce the effort of finding good configurations.

  5. It permits straightforward parallelization of the configuration process by means of multiple independent runs, which leads to significant reductions in wall-clock time.

SMAC keeps a run history (Equation 5), consisting of configurations with their associated utility on negotiation instances, where each instance is described by a feature set. A random forest regression model is fitted to this run history, mapping the configuration space and negotiation instance space to a performance estimate (Equation 6). This model is then used to predict promising configurations, which are subsequently raced against the best configuration found so far, until an overall time budget is exhausted. We refer the reader to Hutter et al. (2011) for further details on SMAC.

(5)
(6)
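The racing step can be sketched as follows. This is a simplified illustration of the idea, not SMAC's actual procedure: the challenger is evaluated on a growing number of instances and is dropped as soon as its total utility falls below that of the best configuration found so far on the same instances. The doubling schedule, the comparison rule and the toy utility function are assumptions.

```python
def race(challenger, incumbent, instances, utility):
    """Return (winner, number of instances the challenger was evaluated on)."""
    challenger_scores, incumbent_scores = [], []
    batch_size, evaluated = 1, 0
    while evaluated < len(instances):
        batch = instances[evaluated:evaluated + batch_size]
        for inst in batch:
            challenger_scores.append(utility(challenger, inst))
            # A real configurator would reuse cached incumbent runs here.
            incumbent_scores.append(utility(incumbent, inst))
        evaluated += len(batch)
        if sum(challenger_scores) < sum(incumbent_scores):
            return incumbent, evaluated       # challenger rejected early
        batch_size *= 2                       # survive: compare on more instances
    return challenger, evaluated              # at least as good on all instances

# Hypothetical utility: a configuration is a single number scaled by the instance.
toy_utility = lambda config, inst: config * inst
toy_instance_list = [1.0] * 8

rejected = race(0.3, 0.8, toy_instance_list, toy_utility)
promoted = race(0.9, 0.8, toy_instance_list, toy_utility)
```

In this toy run the poor challenger costs a single evaluation, while the better one is evaluated on all eight instances before it replaces the incumbent.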

In order for SMAC to be successful in predicting promising configurations, it requires an accurate feature description of the negotiation instances that captures differences in complexity between these instances.

Automated algorithm configuration

Suppose we have a set of opponent agents and a set of negotiation scenarios, such that combining a single agent and a single scenario creates a new negotiation setting or instance. Can we derive a set of features for both the opponent and the scenario that characterize the complexity of the negotiation instance?

We approach this question empirically, by analyzing if a candidate feature set helps the automated algorithm configuration method in finding better configurations within the same computational budget.

5. Instance Features

The negotiation instances consist of an opponent and a scenario. We extract features for both components separately and then combine them into the feature set of an instance (Equation 7). This feature description is used by the configuration method to predict promising strategies for our Dynamic Agent.

(7)

5.1. Scenario features

A negotiation scenario consists of a shared domain and individual preference profiles. Ilany and Gal Ilany and Gal (2016) specified a list of features to model a scenario that they used for strategy selection in bilateral negotiation. Although the usage differs in their paper, the goal of modelling the scenario is the same, so we follow their approach. The features are fully independent of the opponent's behavior. An overview of the scenario features is provided in Table 3.

| Feature type | Description |
|---|---|
| Domain | Number of issues |
| Domain | Average number of values per issue |
| Domain | Number of possible outcomes |
| Preference | Standard deviation of issue weights |
| Preference | Average utility of all possible outcomes |
| Preference | Standard deviation of the utility of all possible outcomes |

Table 3. Scenario features
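The six features of Table 3 can be computed directly from a preference profile. The sketch below assumes a linear additive profile (issue weights plus a score per value) and uses the population standard deviation; both choices, as well as the data layout and the toy profile, are assumptions for illustration.

```python
from itertools import product
from statistics import mean, pstdev

def scenario_features(weights, evaluations):
    """Compute the Table 3 features for an assumed linear additive profile.

    weights:     issue -> weight (weights sum to 1)
    evaluations: issue -> {value: score in [0, 1]}
    """
    issues = list(weights)
    utilities = [
        sum(weights[i] * evaluations[i][v] for i, v in zip(issues, combo))
        for combo in product(*(evaluations[i] for i in issues))
    ]
    return {
        "n_issues": len(issues),
        "avg_values_per_issue": mean(len(evaluations[i]) for i in issues),
        "n_outcomes": len(utilities),
        "std_issue_weights": pstdev(list(weights.values())),
        "avg_utility": mean(utilities),
        "std_utility": pstdev(utilities),
    }

# Hypothetical two-issue profile.
toy_weights = {"price": 0.6, "color": 0.4}
toy_evaluations = {
    "price": {"low": 1.0, "high": 0.0},
    "color": {"red": 1.0, "blue": 0.5, "green": 0.0},
}
features = scenario_features(toy_weights, toy_evaluations)
```

Full enumeration of the outcome space is feasible here because the features are computed once, in advance, for each scenario.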

5.2. Opponent features

This section describes the opponent features in detail. For each opponent, we store both the mean and the cov of all features.

5.2.1. Normalized time

The time it takes to reach an agreement with the opponent.

5.2.2. Concession rate

To measure how much an opponent is willing to concede towards our agent, we use the notion of concession rate (CR) introduced by Baarslag et al. Baarslag et al. (2011). The CR is a normalized ratio $CR \in [0, 1]$, where $CR = 1$ means that the opponent fully conceded and $CR = 0$ means that the opponent did not concede at all. By using a ratio instead of an absolute value (utility), the feature is disassociated from the scenario.

To calculate the CR, Baarslag et al. Baarslag et al. (2011) used two constants: the minimum utility an opponent has demanded during the negotiation session, and the FYU, which is the utility that the opponent receives at our maximum outcome.

We present a formal description of the CR in Equation 8 and a visualization in Figure 1.

(8)
Figure 1. Visualization of CR

5.2.3. Average rate

We introduce the average rate (AR), which indicates the average utility an opponent has demanded as a ratio depending on the scenario. The two constants needed are the FYU, as described in the previous section, and the average utility the opponent demanded. The AR is a normalized ratio $AR \in [0, 1]$, where $AR = 0$ means that the opponent only offered his maximum outcome and $AR = 1$ means that the average utility the opponent demanded is less than or equal to the FYU. We present a definition of the AR in Equation 9 and a visualization in Figure 2.

(9)
Figure 2. Visualization of AR

The AR is another indication of the competitiveness of the opponent, based on the average utility demanded rather than the minimum utility demanded, as is the case for the CR.
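Both ratios can be sketched from their verbal descriptions. The code below assumes a linear normalization between the opponent's maximum utility of 1 and the utility the opponent receives at our maximum outcome, clipped at 1; this is a reconstruction for illustration, since Equations 8 and 9 are not reproduced in this text. Function and argument names are hypothetical.

```python
def _normalized_concession(demanded, fyu):
    """Map a demanded utility to [0, 1]: 0 at utility 1, 1 at or below `fyu`."""
    if demanded <= fyu:
        return 1.0
    return (1.0 - demanded) / (1.0 - fyu)

def concession_rate(min_demanded, fyu):
    """Assumed CR: based on the minimum utility the opponent demanded."""
    return _normalized_concession(min_demanded, fyu)

def average_rate(avg_demanded, fyu):
    """Assumed AR: based on the average utility the opponent demanded."""
    return _normalized_concession(avg_demanded, fyu)
```

For example, with the opponent receiving 0.4 at our maximum outcome, an opponent whose lowest demand was 0.7 has conceded half of the way, giving a ratio of 0.5.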

5.2.4. Default configuration performance

According to Hutter et al. Hutter et al. (2011), the performance of any default configuration on a problem works well as a feature for that specific problem. For negotiation, this translates to the obtained utility of a hand-picked default strategy on a negotiation instance. The obtained utility is normalized and can be used as a feature for that negotiation instance.

We implement this concept as an opponent feature by selecting a default strategy and using it to obtain an agreement with the opponent. We then normalize the obtained utility and use it as the default configuration performance (DCP) feature. We present the formal definition of this feature in Equation 10 and a visualization in Figure 3.

(10)
Figure 3. Visualization of DCP

5.3. Opponent utility function

As can be seen in Figures 1, 2, and 3, the actual opponent utility function is used to calculate the opponent features. SMAC is only used to configure the Dynamic Agent on the training set. As the opponent features are only used by SMAC, we can safely use the opponent's utility function to construct those features (Equations 8, 9 and 10) without giving the Dynamic Agent an unfair advantage during testing. The Dynamic Agent always uses the predicted opponent utility obtained through the model (Section 3.2.2), as is conventional in the ANAC.

We provide an overview of when the predicted opponent utility function and when the actual opponent utility function is used in Table 4.

Training Testing
smac N/A
Table 4. Opponent utility function usage

6. Empirical evaluation

We must set baseline configurations to compare to the result of the optimization. The basis of our Dynamic Agent is derived from a paper by Lau et al. Lau et al. (2006). Though some functionality is added, it is possible to set our agent’s strategy to resemble that of the original agent. We refer to this configuration from the literature as , its parameters can be found in Table 5.

Another baseline strategy is added, which is configured manually, as the literature configuration is outdated. A combination of intuition, past research, and manual search is used for this manual configuration, which we consider the default method for current ANAC competitors. We present the manually configured parameters in Table 5 and an explanation below:

  • Accepting: The acceptance condition parameters of set a pure strategy with parameters . Baarslag et al. Baarslag et al. (2014) performed an empirical research on a variety of acceptance conditions and showed that there are better alternatives. We set the accepting parameters of our configuration to the best performing condition as found by Baarslag et al. Baarslag et al. (2014).

  • Fitness function: Preliminary testing showed that the literature configuration concedes much faster than the average ANAC agent, resulting in a poorly performing strategy. We set a more competitive parameter configuration for the fitness function by manual search, to match the competitiveness of the ANAC agents.

  • Space exploration: The domain used in the paper has a relatively small set of outcomes. We increased the population size, added an extra evolution to the genetic algorithm and made some minor adjustments to cope with larger outcome spaces.

Accepting Fitness function Space exploration
Table 5. Baseline configurations parameters

6.1. Method

SMAC is run in embarrassingly parallel mode on a computing cluster by starting a separate SMAC process on chunks of allocated hardware. SMAC selects a negotiation instance and a configuration to evaluate on that instance and calls the negotiation environment GENIUS through a wrapper function.

Input

The training instances were created by selecting a diverse set of opponents and scenarios from the GENIUS environment. The scenarios have non-linear utility functions and vary in competitiveness and outcome space size (between 9 and 400 000). The scenario features were calculated in advance as described in Section 5.1, and the configuration space is defined in Section 3.2.

The opponent features, as defined in Section 5.2, can only be gathered by performing negotiations against the opponents. We gather these features in advance by negotiating 10 times in every instance with the manual strategy .

Hardware & configuration budget

We perform 300 independent parallel runs of SMAC for 4 hours of wall-clock time each, on a computing cluster running Slurm. To ensure consistent results, all runs were performed on Intel® Xeon® CPUs, allocating 1 CPU core with 2 processing threads and 12 GB RAM to each run of SMAC.

Output

Every parallel SMAC process outputs its best configuration after the time budget is exhausted. As there are 300 parallel processes, a decision must be made on which of the 300 configurations to use. To do so, the SMAC random forest regression model of Equation 6 is rebuilt and used to predict the performance of each of these configurations. The configuration with the best predicted performance is selected as the best configuration.

6.2. Results

The configuration process as described is run three times without instance features and three times with instance features, under identical conditions. There is now a total of 8 strategies: 2 baselines, 3 optimized without features, and 3 optimized with features. An overview of the final configurations is presented in Table 6.

Accepting Fitness function Space exploration
Table 6. Configurations overview

The obtained configurations are now analyzed with an emphasis on the following three topics:

  1. The influence of the instance features on the convergence of the configuration process.

  2. The performance of the obtained configurations on a never before seen set of instances.

  3. The performance of the best configuration in an ANAC-like bilateral tournament.

6.2.1. Influence of instance features

To study the influence of the instance features on the configuration process, we compare the strategies obtained by configuring with features and by configuring without features. Only the training set of instances is used for the performance comparison, as we are purely interested in the convergence towards a higher utility.

Every configuration is run 10 times on the set of training instances and the average obtained utility is calculated by Equation 4. The results are presented in Table 7, including an improvement ratio over the manually configured baseline.

| Average utility | Improvement ratio | Description |
|---|---|---|
| 0.533 | -0.307 | Literature |
| 0.769 | 0 | Manually configured |
| 0.785 | 0.020 | Configured without features |
| 0.770 | 0.000 | Configured without features |
| 0.792 | 0.029 | Configured without features |
| 0.800 | 0.040 | Configured with features |
| 0.816 | 0.060 | Configured with features |
| 0.803 | 0.044 | Configured with features |

Table 7. Performance of configurations on the training set
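The improvement ratios in Table 7 can be checked against the average utilities with the short sketch below, which assumes the ratio is the relative gain over the manually configured baseline. The dictionary keys are hypothetical labels, since the configuration names are not reproduced here. Recomputing from the rounded utilities matches the reported ratios to within about 0.001, which is consistent with the ratios having been computed from unrounded values.

```python
def improvement_ratio(utility, baseline):
    """Relative gain in average utility over a baseline configuration."""
    return (utility - baseline) / baseline

MANUAL_BASELINE = 0.769   # manually configured strategy, Table 7

# Average utilities from Table 7 (hypothetical labels).
table7_utilities = {
    "literature": 0.533,
    "without_features_1": 0.785,
    "without_features_2": 0.770,
    "without_features_3": 0.792,
    "with_features_1": 0.800,
    "with_features_2": 0.816,
    "with_features_3": 0.803,
}

ratios = {name: improvement_ratio(u, MANUAL_BASELINE)
          for name, u in table7_utilities.items()}
```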

SMAC is capable of improving the performance of the Dynamic Agent above our capabilities of manual configuration. We observe that configuration without instance features potentially leads to marginal improvements on the training set. Finally, we observe that the usage of instance features leads to less variation in final configuration parameters (Table 6) and to a significant improvement of obtained utility.

6.2.2. Performance on test set

Testing the configurations on a never before seen set of opponent agents and scenarios is needed to rule out potential overfitting. We selected a diverse set of scenarios and opponents for testing, such that the test set does not overlap with the training set.

Every configuration is once again run 10 times, this time on the set of test instances, and the average obtained utility is calculated by Equation 4. The results are presented in Table 8, including an improvement ratio over the manually configured baseline.

| Average utility | Improvement ratio | Description |
|---|---|---|
| 0.563 | -0.261 | Literature |
| 0.763 | 0 | Manually configured |
| 0.779 | 0.021 | Configured without features |
| 0.760 | -0.004 | Configured without features |
| 0.774 | 0.015 | Configured without features |
| 0.792 | 0.038 | Configured with features |
| 0.795 | 0.042 | Configured with features |
| 0.789 | 0.034 | Configured with features |

Table 8. Performance of configurations on the test set

It is now clear that strategy configuration without instance features is undesirable, as it potentially leads to a worse performing strategy. Configuration with instance features, on the other hand, still leads to a significant performance increase on a never before seen set of negotiation instances.

6.2.3. ANAC tournament performance of best configuration

The strategy configuration method is successful in finding improved configurations, but the results are only compared against the other configurations of our Dynamic Agent. No comparison is yet made with agents build by anac competitors. We now compare the performance of the best configuration that we found to the anac agents in the test set of opponents.

We select the best strategy based on performance on the training set and enter the Dynamic Agent in an anac-like bilateral tournament with a 60 second deadline. The Dynamic Agent is combined with the test set of opponents and scenarios. Every combination of 2 agents negotiated 10 times on every scenario, for a total of 38080 negotiation sessions. The averaged results are presented in Table 9. We elaborate on the performance measures found in the table:

  • Utility: The utility of the agreement.

  • Opp. utility: The opponent’s utility of the agreement.

  • Social welfare: The sum of utilities of the agreement.

  • Pareto distance: Euclidean distance of the agreement to the nearest outcome on the Pareto frontier in terms of utility.

  • Nash distance: Euclidean distance of the agreement to the Nash solution in terms of utility (Equation 1).

  • Agreement ratio: The ratio of negotiation sessions that result in an agreement.
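
The measures listed above can be sketched for a single bilateral session as follows. This is a minimal illustration and not the implementation used in the tournament. It assumes that an outcome is represented by a pair (own utility, opponent utility), that the Nash solution is the outcome maximising the product of both utilities, and that a session without agreement yields a utility of 0 for both agents. All function names and the toy outcome space are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two points in utility space."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def pareto_frontier(outcomes):
    """Outcomes for which no other outcome is at least as good for both agents."""
    return [
        u for u in outcomes
        if not any(v != u and v[0] >= u[0] and v[1] >= u[1] for v in outcomes)
    ]


def nash_solution(outcomes):
    """Assumed form of the Nash solution: the outcome maximising the utility product."""
    return max(outcomes, key=lambda u: u[0] * u[1])


def session_measures(agreement, outcomes):
    """Measures for one session; `agreement` is (own, opponent) utility or None."""
    # Assumption: a session without agreement yields utility 0 for both agents.
    point = agreement if agreement is not None else (0.0, 0.0)
    return {
        "utility": point[0],
        "opp_utility": point[1],
        "social_welfare": point[0] + point[1],
        "pareto_distance": min(distance(point, p) for p in pareto_frontier(outcomes)),
        "nash_distance": distance(point, nash_solution(outcomes)),
    }


def agreement_ratio(agreements):
    """Fraction of sessions that ended in an agreement."""
    return sum(a is not None for a in agreements) / len(agreements)


# Hypothetical utility pairs of all possible outcomes in a toy scenario.
outcomes = [(1.0, 0.0), (0.9, 0.5), (0.7, 0.7), (0.5, 0.5), (0.3, 0.9), (0.0, 1.0)]
# Hypothetical results of four sessions; None marks a session without agreement.
sessions = [(0.7, 0.7), None, (0.5, 0.5), (0.9, 0.5)]

for agreement in sessions:
    print(agreement, session_measures(agreement, outcomes))
print("agreement ratio:", agreement_ratio(sessions))
```

Averaging the per-session values over all sessions of an agent would then give table entries of the kind reported below; whether sessions without agreement enter the distance averages in this way is an assumption of the sketch.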

Agent                    Utility  Opp. utility  Social welfare  Pareto distance  Nash distance  Agreement ratio
RandomCounterOfferParty  0.440    0.957         1.398           0.045            0.415          1.000
HardlinerParty           0.496    0.240         0.735           0.507            0.754          0.496
AgentH                   0.518    0.801         1.319           0.118            0.408          0.904
ConcederParty            0.577    0.848         1.425           0.047            0.358          0.964
LinearConcederParty      0.600    0.831         1.431           0.046            0.350          0.964
PhoenixParty             0.625    0.501         1.125           0.263            0.468          0.748
GeneKing                 0.637    0.760         1.396           0.061            0.383          0.993
Mamenchis                0.651    0.725         1.377           0.087            0.360          0.927
BoulwareParty            0.662    0.786         1.448           0.043            0.319          0.968
Caduceus                 0.677    0.486         1.163           0.241            0.453          0.784
Mosa                     0.699    0.640         1.339           0.113            0.385          0.902
ParsCat2                 0.716    0.671         1.386           0.108            0.286          0.904
RandomDance              0.737    0.716         1.453           0.024            0.344          0.998
ShahAgent                0.744    0.512         1.256           0.188            0.389          0.821
AgentF                   0.751    0.605         1.356           0.100            0.367          0.918
SimpleAgent              0.756    0.437         1.194           0.212            0.470          0.801
Dynamic Agent            0.795    0.566         1.361           0.087            0.407          0.922
Table 9. Bilateral anac tournament results using (bold = best, underline = worst)

Using the Dynamic Agent with this configuration results in a successful negotiation agent that is capable of winning an anac-like bilateral tournament by outperforming all other agents (two-tailed t-test). It obtained a higher utility than SimpleAgent, the number two in the ranking, while also outperforming it on every other performance measure.

Since the presence of our agent in the tournament also influences the performance of other agents, we also ran the full tournament without our Dynamic Agent as a sanity check. The top 5 performers of this tournament are presented in Table 10, along with their margins over the respective next lower-ranking agent in terms of utility.

Agent        Utility  Margin
Mosa         0.715    3.01%
ShahAgent    0.736    2.43%
RandomDance  0.754    0.65%
AgentF       0.759    0.01%
SimpleAgent  0.759
Table 10. Bilateral anac tournament without

7. Conclusion

The two main contributions of this work are (1) the success of automated configuration of negotiation strategies using a general-purpose configuration procedure (here: smac), and (2) an investigation of the importance of the features of negotiation settings.

7.1. Configuration

Two baseline strategies were selected for our comparison. The first configuration is based on the publications from which we derived the agent Lau et al. (2006); Baarslag et al. (2014). The second configuration is based on intuition, recent literature and manual search, which we considered the default approach for current anac competitors. In Section 6, we automatically configured our Dynamic Agent using smac.

The configuration based on earlier work Lau et al. (2006) performed poorly compared to the manual configuration, and achieved 26.1% lower utility on our test set. The best automatically configured strategy outperformed both baseline configurations and achieved a 4.2% increase in utility compared to the manual configuration. From this, we conclude that the automated configuration method is successful in outperforming manual configuration.

Our experiments show that the automated configuration method can produce a strategy that can win an anac-like bilateral tournament by a margin of 5.1% (Table 9). This is particularly striking when considering that without our agent, the winner of the same tournament beats the next-best agent only by a margin of 0.01%.
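
The winning margin can be approximately reproduced from Table 9, assuming that it is defined as the relative difference in utility between the winner and the runner-up. The sketch below uses the rounded utilities of the Dynamic Agent (0.795) and SimpleAgent (0.756); the function name and the definition of the margin are assumptions, and the result deviates slightly from the reported 5.1%, presumably because the table entries are rounded.

```python
def relative_margin(winner_utility, runner_up_utility):
    """Relative utility advantage of the winner over the runner-up (assumed definition)."""
    return (winner_utility - runner_up_utility) / runner_up_utility


# Rounded utilities copied from Table 9.
DYNAMIC_AGENT_UTILITY = 0.795
SIMPLE_AGENT_UTILITY = 0.756

margin = relative_margin(DYNAMIC_AGENT_UTILITY, SIMPLE_AGENT_UTILITY)
print(f"margin over runner-up: {margin:.2%}")
```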

7.2. Features

We consider a set of features that characterizes the negotiation scenario as well as the opponent. Our empirical results indicate that when using the negotiation instance features, smac is able to find good configurations faster.

Overall, using smac in combination with instance features leads to less variation in the parameter settings between the final configurations obtained in multiple independent runs (Table 6, Table 7), as well as significant and consistent performance improvement. Furthermore, our results show that automated configuration without features does not always outperform manual configuration. Therefore, we conclude that the instance features presented in this paper are a necessary ingredient for the successful automated configuration of negotiation strategies.

7.3. Future work

For this initial step towards automated configuration of negotiation agents, the negotiation scenarios were simplified by removing the reservation utility and the discount factor. Now that we have demonstrated that our general approach can be successful, additional validation should be performed in more complex and different negotiation environments.

Over the years, it became clear that there is no single best negotiation strategy for all negotiation settings Lin et al. (2014). In this work, we have presented a method to automatically configure an effective strategy for a specific set of negotiation settings. However, if this set becomes too diverse, we inherently end up in a situation where the automatically configured best strategy may not perform well. Future work should exploit the strategy space of the Dynamic Agent by extracting multiple complementary strategies for specific settings, along with an on-line selection mechanism that determines the strategy to be used in a specific instance.

References

  • Ansótegui et al. (2009) Carlos Ansótegui, Meinolf Sellmann, and Kevin Tierney. 2009. A Gender-Based Genetic Algorithm for the Automatic Configuration of Algorithms. In Principles and Practice of Constraint Programming - CP 2009, Ian P Gent (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 142–157.
  • Aydoğan et al. (2017) Reyhan Aydoğan, David Festen, Koen V. Hindriks, and Catholijn M. Jonker. 2017. Alternating offers protocols for multilateral negotiation. In Studies in Computational Intelligence. Vol. 674. Springer, 153–167. https://doi.org/10.1007/978-3-319-51563-2_10
  • Baarslag (2014) T Baarslag. 2014. What to bid and when to stop. 338 pages. https://doi.org/10.4233/uuid:3df6e234-a7c1-4dbe-9eb9-baadabc04bca
  • Baarslag et al. (2015) Tim Baarslag, Reyhan Aydoğan, Koen V. Hindriks, Katsuhide Fujita, Takayuki Ito, and Catholijn M. Jonker. 2015. The Automated Negotiating Agents Competition, 2010–2015. AI Magazine 36, 4 (2015), 2010–2014. https://doi.org/10.1609/aimag.v36i4.2609
  • Baarslag et al. (2013) Tim Baarslag, Mark Hendrikx, Koen Hindriks, and Catholijn Jonker. 2013. Predicting the performance of opponent models in automated negotiation. In Proceedings - 2013 IEEE/WIC/ACM International Conference on Intelligent Agent Technology, IAT 2013, Vol. 2. IEEE, 59–66. https://doi.org/10.1109/WI-IAT.2013.91
  • Baarslag et al. (2011) Tim Baarslag, Koen Hindriks, and Catholijn Jonker. 2011. Towards a quantitative concession-based classification method of negotiation strategies. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7047 LNAI (2011), 143–158. https://doi.org/10.1007/978-3-642-25044-6_13
  • Baarslag et al. (2014) Tim Baarslag, Koen Hindriks, and Catholijn Jonker. 2014. Effective acceptance conditions in real-time automated negotiation. Decision Support Systems 60, 1 (2014), 68–77. https://doi.org/10.1016/j.dss.2013.05.021
  • Baarslag et al. (2012) Tim Baarslag, Koen Hindriks, Catholijn Jonker, Sarit Kraus, and Raz Lin. 2012. The first automated negotiating agents competition (ANAC 2010). Studies in Computational Intelligence 383, Anac (2012), 113–135. https://doi.org/10.1007/978-3-642-24696-8_7
  • Balaprakash et al. (2007) Prasanna Balaprakash, Mauro Birattari, and Thomas Stützle. 2007. Improvement strategies for the F-Race algorithm: Sampling design and iterative refinement. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 4771 (2007), 108–122. https://doi.org/10.1007/978-3-540-75514-2_9
  • Birattari et al. (2010) Mauro Birattari, Zhi Yuan, Prasanna Balaprakash, and Thomas Stützle. 2010. F-Race and Iterated F-Race: An Overview. In Experimental Methods for the Analysis of Optimization Algorithms, Thomas Bartz-Beielstein, Marco Chiarandini, Luís Paquete, and Mike Preuss (Eds.). Springer Berlin Heidelberg, 311–336. https://doi.org/10.1007/978-3-642-02538-9_13
  • Dworman et al. (1996) Garett Dworman, Steven O. Kimbrough, and James D. Laing. 1996. Bargaining by artificial agents in two coalition games: A study in genetic programming for electronic commerce. Proceedings of the First Annual Conference on Genetic Programming (1996), 54–62. http://portal.acm.org/citation.cfm?id=1595536.1595544
  • Eymann (2001) T Eymann. 2001. Co-evolution of bargaining strategies in a decentralized multi-agent system. AAAI Fall 2001 Symposium on Negotiation Methods for Autonomous Cooperative Systems (2001), 126–134. http://www.aaai.org/Papers/Symposia/Fall/2001/FS-01-03/FS01-03-016.pdf
  • Holland (1992) John Henry Holland. 1992. Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press. 232 pages.
  • Hutter et al. (2010) Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. 2010. Automated configuration of mixed integer programming solvers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6140 LNCS (2010), 186–202. https://doi.org/10.1007/978-3-642-13520-0_23
  • Hutter et al. (2011) Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. 2011. Sequential model-based optimization for general algorithm configuration. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6683 LNCS (2011), 507–523. https://doi.org/10.1007/978-3-642-25566-3_40
  • Hutter et al. (2009) Frank Hutter, Holger H. Hoos, Kevin Leyton-Brown, and Thomas Stützle. 2009. ParamILS: An automatic algorithm configuration framework. Journal of Artificial Intelligence Research 36 (2009), 267–306. https://doi.org/10.1613/jair.2861
  • Ilany and Gal (2016) Litan Ilany and Ya’akov Gal. 2016. Algorithm selection in bilateral negotiation. Autonomous Agents and Multi-Agent Systems 30, 4 (2016), 697–723. https://doi.org/10.1007/s10458-015-9302-8
  • Klein and Lu (1989) Mark Klein and Stephen C.Y. Lu. 1989. Conflict resolution in cooperative design. Artificial Intelligence in Engineering 4, 4 (1989), 168–180. https://doi.org/10.1016/0954-1810(89)90013-7
  • Lau et al. (2006) Raymond Y.K. Lau, Maolin Tang, On Wong, Stephen W. Milliner, and Yi Ping Phoebe Chen. 2006. An evolutionary learning approach for adaptive negotiation agents. International Journal of Intelligent Systems 21, 1 (2006), 41–72. https://doi.org/10.1002/int.20120
  • Lin et al. (2014) Raz Lin, Sarit Kraus, Tim Baarslag, Dmytro Tykhonov, Koen Hindriks, and Catholijn M. Jonker. 2014. Genius: An integrated environment for supporting the design of generic automated negotiators. Computational Intelligence 30, 1 (2014), 48–70. https://doi.org/10.1111/j.1467-8640.2012.00463.x
  • Marsa-Maestre et al. (2014) Ivan Marsa-Maestre, Mark Klein, Catholijn M. Jonker, and Reyhan Aydoğan. 2014. From problems to protocols: Towards a negotiation handbook. Decision Support Systems 60, 1 (2014), 39–54. https://doi.org/10.1016/j.dss.2013.05.019
  • Matos et al. (1998) Noyda Matos, Carles Sierra, and Nick R. Jennings. 1998. Determining successful negotiation strategies: An evolutionary approach. Proceedings - International Conference on Multi Agent Systems, ICMAS 1998 (1998), 182–189. https://doi.org/10.1109/ICMAS.1998.699048
  • Nash (1950) John F. Nash. 1950. The Bargaining Problem. Econometrica 18, 2 (1950), 155. https://doi.org/10.2307/1907266
  • Osborne and Rubinstein (1994) Martin J. Osborne and Ariel Rubinstein. 1994. A Course in Game Theory (1 ed.). Vol. 1. MIT press. https://doi.org/10.2307/2554642
  • Raiffa (1982) Howard Raiffa. 1982. The art and science of negotiation. Harvard University Press.
  • Robinson (1990) W.N. Robinson. 1990. Negotiation behavior during requirement specification. [1990] Proceedings. 12th International Conference on Software Engineering (1990), 268–276. https://doi.org/10.1109/ICSE.1990.63633
  • Rosenschein (1986) J. S. Rosenschein. 1986. Rational interaction: cooperation among intelligent agents. Ph.D. Dissertation. Stanford University, Stanford, CA, USA. http://www.osti.gov/energycitations/product.biblio.jsp?osti_id=5310977
  • Rubinstein (1982) Ariel Rubinstein. 1982. Perfect Equilibrium in a Bargaining Model. Econometrica 50, 1 (1982), 97. https://doi.org/10.2307/1912531
  • Smith (1980) Reid G. Smith. 1980. The Contract Net Protocol: High-Level Communication and Control in a Distributed Problem Solver. IEEE Trans. Comput. C-29, 12 (1980), 1104–1113. https://doi.org/10.1109/TC.1980.1675516
  • Sycara (1988) Katia Sycara. 1988. Resolving Goal Conflicts via Negotiation. The Seventh National Conference on Artificial Intelligence (1988), 245–249. http://www.aaai.org/Papers/AAAI/1988/AAAI88-044.pdf
  • Sycara-Cyranski (1985) K Sycara-Cyranski. 1985. Arguments Of Persuasion In Labour Mediation. Proceedings of the International Joint Conference on Artificial Intelligence 1 (1985), 294–296.
  • Van Galen Last (2012) Niels Van Galen Last. 2012. Agent Smith: Opponent model estimation in bilateral multi-issue negotiation. Studies in Computational Intelligence 383 (2012), 167–174. https://doi.org/10.1007/978-3-642-24696-8_12
  • Yoo et al. (2003) Andy B. Yoo, Morris A. Jette, and Mark Grondona. 2003. SLURM: Simple Linux Utility for Resource Management. Lecture Notes in Computer Science 2862 (2003), 44–60. https://doi.org/10.1007/10968987_3
  • Zlochin et al. (2004) Mark Zlochin, Mauro Birattari, Nicolas Meuleau, and Marco Dorigo. 2004. Model-based search for combinatorial optimization: A critical survey. Annals of Operations Research 131, 1-4 (2004), 373–395. https://doi.org/10.1023/B:ANOR.0000039526.52305.af