Game Theory and Applications in AI |
Billings et al, AIJ 2002 | The Challenge of Poker | Sanmay |
Sandholm, AI Magazine 2010 | The State of Solving Large Incomplete-Information Games, and Application to Poker | Available |
Southey et al, UAI 2005 | Bayes' Bluff: Opponent Modelling in Poker | Chun-Yi |
Littman and Stone, Decision Support Systems 2005 | A Polynomial-time Nash Equilibrium Algorithm for Repeated Games | Vishwas |
Greenwald et al, NIPS 2012 | Approximating Equilibria in Sequential Auctions with Incomplete Information and Multi-Unit Demand | Meenal |
Wunder et al, AAMAS 2011 | Using Iterated Reasoning to Predict Opponent Strategies | K. Alnajar |
Jordan et al, AAMAS 2007 | Empirical Game-Theoretic Analysis of the TAC Supply Chain Game | Available |
Archibald and Shoham, AAMAS 2009 | Modeling Billiards Games | Fang Liu |
Archibald and Shoham, IJCAI 2011 | Hustling in Repeated Zero-Sum Games with Imperfect Execution | Jose Cadena |
Hu and Wellman, JMLR 2003 | Nash Q-Learning for General-Sum Stochastic Games | Available |
Bowling and Veloso, AIJ 2002 | Multiagent learning using a variable learning rate | Fei Li |
Conitzer and Sandholm, ICML 2003 | AWESOME: A General Multiagent Learning Algorithm that Converges in Self-Play and Learns a Best Response Against Stationary Opponents | Available |
Reinforcement / Online / Optimal Learning and Sequential Decision-Making |
Lagoudakis and Parr, JMLR 2003 | Least Squares Policy Iteration | Boris |
Poupart et al, ICML 2006 | An Analytic Solution to Discrete Bayesian Reinforcement Learning | Qianzhou |
Ryzhov et al, Operations Research 2012 | The Knowledge Gradient Algorithm for a General Class of Online Learning Problems | Liangzhe |
Charlin et al, ICML 2012 | Active Learning for Matching Problems | Behrooz |
Chhabra and Das, AAMAS 2011 | Learning the Demand Curve in Posted-Price Digital Goods Auctions | Farzaneh |
Stone and Kraus, AAMAS 2010 | To Teach or not to Teach? Decision Making Under Uncertainty in Ad Hoc Teams | Marjan |
Vermorel and Mohri, ECML 2005 | Multi-Armed Bandit Algorithms and Empirical Evaluation | Available |
Das and Tsitsiklis, JEBO 2010 | When is it Important to Know You've Been Rejected? A Search Problem with Probabilistic Appearance of Offers | Available |
Seuken and Zilberstein, IJCAI 2007 | Memory-bounded dynamic programming for DEC-POMDPs | Huijuan |
Williams and Young, Comp. Speech & Lang. 2007 | Partially observable Markov decision processes for spoken dialog systems | Prithwish |
Li et al, AAMAS 2009 | Online Exploration in Least-Squares Policy Iteration | Mohamed |
Abbeel and Ng, ICML 2004 | Apprenticeship learning via inverse reinforcement learning | Allen |
Johns and Woolf, AAAI 2006 | A Dynamic Mixture Model to Detect Student Motivation and Proficiency | Sally |
Ganchev et al, UAI 2009 | Censored Exploration and the Dark Pool Problem | Mithun |
Tetreault and Litman, NAACL 2006 | Comparing the Utility of State Features in Spoken Dialogue Using Reinforcement Learning | Austin |
Goldman and Rao, MIT Sloan Sports Analytics Conf 2011 | Allocative and Dynamic Efficiency in NBA Decision Making | Andrew |