Aims: To implement a virtual tournament of a set of programmable players engaging in a round robin for the Iterated Prisoner’s Dilemma game, test different strategies for these players to follow, identify winning strategies and evaluate the final results.
Background: Imagine the following social interaction presented here as a kind of game that requires intelligent reasoning: You and your accomplish are both held prisoners, having been captured by the police and charged with a serious crime. The prosecutor interrogates you separately and offers each of you a deal. If one of you, the defector, incriminates the other, while the partner remains silent, then as the defector you will be convicted of a lesser crime and your sentence will be cut to one year for providing enough information to jail your partner. Meanwhile, your silent confederate will be convicted of a more serious crime and burdened with a four year sentence. If you both remain silent, and thus cooperate with each other, there will be insufficient evidence to convict either of you of the more serious crime, and you will each receive a sentence of two years for a lesser offence. If one the other hand, you both defect by incriminating each other, you will both be convicted of the more serious crime, but given reduced reduced sentences of three years for at lest being willing to provide information.
If you follow a purely selfish strategy as a player, the best outcome in terms of years in jail is that (a) you defect and the other person cooperates, (b) you both cooperate, and (c) you both defect. If both you and your accomplish are selfish, then you will both try to defect, hence this means 3 years in jail, which is the worst outcome. So if you could only trust each other, by following a strategy that is cooperative, you would both be better off than if you had acted selfishly. A more cooperative strategy will also give you benefits in the future, especially if you had to meet your accomplish after the sentence, engage in more crime, and as it likely need to play the game again, but now with knowledge of how your accomplice behaved in the previous game. This is called the Iterated Prisoner’s dilemma game.
Method: The project should design and implement, in Java, a centralised implementation of a generic virtual tournament framework for running the Iterated Prisoner’s dilemma game and test different types of player agents.
A type of agent will be characterised simply by the kind of strategy the agent is using in the game (e.g. always defect, randomly choose between defect and cooperate). Your work should show how to implement these kind of agents and evaluate their performance in terms of their cooperation characteristics.
A successful project will need to be a balanced project both in terms of the intelligent behaviour of individual players (strategies) and the generality of the software engineering techniques used to implement the tournament.
Early Deliverables
- Familiarisation with the idea of strategies and the identification of set of them.
- Implementation of strategies and the definition of an interface that allows agents to be selected and play against each other
- Definition of a tournament
- Report on the design of a tournament.
- Implementation of a tournament, where a user selects the number of agents, associates them with existing strategies, specifies the number of iterations and deploys a tournament
- Report on the Iterative Prisoner’s dilemma problem
- Report on previous work with the Iterative Prisoner’s dilemma
Final Deliverables
- A program that allows us to simulate tournaments of the iterative prisoner’s dilemma.
- A GUI that allows a user to create and manage tournaments.
- GUI should provide summary statistics that are tournament specific, i.e. a summary for each player/strategy and how it is compared with the other players/strategies in the tournament.
- The report should describe some of the theory behind strategies in cooperative settings.
- The report should provide an analysis and an evaluation of the different strategies, including an indication of which one is the best strategy, if any.
- The report should contain a design of the program and a discussion of any software engineering issues encountered.
- The report should describe any useful and interesting programming techniques that have been employed to develop the final prototype.
Suggested Extensions
- Make use of multi-agent system platform to support the tournament in a completely distributed setting.
- Use logic-based rules to specify the strategies and wrap them in the Java implementation.
- Use strategy recognition techniques so that agents can detect what strategies opponents are playing, and adapt their behaviour accordingly.
- Create a methodology for developing tournaments of this type.
Prerequisites: Excellent programming ability in Java. Good knowledge of Prolog / SQL is desirable but not strictly necessary.
Reading
- M. Nowak, R. May and K. Sigmund. The Arithmetics of Mutual Help, Scientific American, June 1995.
- R. Axelrod, W. D. Hamilton (1981), The Evolution of Cooperation, Science 211: 1390–96.
- K. Stathis, A game-based architecture for developing interactive components in computational logic, Journal of Functional and Logic Programming, 20(1), Jan 2000, MIT Press.