Online Decision-Making and Collaborative Filtering

Tom Hayes, TTI Chicago

Online Decision-Making is a setting in which an agent must make a sequence of decisions while lacking important knowledge affecting their outcomes. After making each decision, the agent receives feedback about how well it turned out. Our goal is to design algorithms which allows such an agent to adapt to the environment over time, and hopefully converge quickly to the best performance possible. One example is the classical “k-armed bandit” problem, by now fairly well understood, in which the agent must, at each time step, choose one of k options, each of which results in an unknown payout. In this case, a very simple and efficient algorithm is known, based on “multiplicative updates.” One application domain is machine learning, and in fact, the aforementioned algorithm is a key component in the celebrated Adaboost algorithm. Collaborative Filtering is a setting in which many agents try to share information about common interests, for the purpose of later decision-making. As opinions and tastes may vary wildly among these agents, this can be quite difficult. The difficulty is compounded by having only partial and noisy data, and by the likely presence of many adversarial agents, acting on behalf of other interested parties. Recommendation systems, such as those of Netflix and Amazon, or Google’s Pagerank, are examples. Another example is drivers using citizen’s band (CB) radio to warn each other about road conditions, traffic, speed traps, etc. I will describe some of my recent attempts to apply techniques from Online Decision-Making to develop better algorithms for Collaborative Filtering. This area seems ripe for new techniques.

Speaker Biography

Thomas Hayes received his Ph.D. in Computer Science in 2003 at the University of Chicago. He is currently a Research Assistant Professor at the Toyota Technological Institute in Chicago. His interests span discrete mathematics, algorithms, statistical physics, machine learning and probability theory. His most recent focus has been on online optimization, Markov chain Monte Carlo algorithms, and statistical group theory.