Programmatic Policies: A Promising Path to Reinforcement Learning in the Games Industry


Reinforcement learning (RL) algorithms have led to groundbreaking achievements, with systems mastering complex games like Go and StarCraft II at a grandmaster level. Despite these successes, the adoption of RL in the games industry remains cautious, primarily due to the non-interpretable nature of the learned policies and designers’ inability to manually modify them to meet specific design objectives. Additionally, these policies often require an excessive number of training samples.

This tutorial covers approaches that use programmatic policies written in domain-specific languages. These languages can define spaces of interpretable and modifiable policies that can be trained faster than neural policies. Participants will learn how to build systems that can write such programmatic policies. The tutorial includes a hands-on exercise using MicroRTS, a real-time strategy game.
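To make the idea of an interpretable and modifiable policy concrete, here is a minimal sketch in Python. It is an illustration only: the ordered rule list stands in for a program in a domain-specific language, and every name in it (`Rule`, `act`, the state keys, and the action strings) is hypothetical and not part of the MicroRTS API or of the tutorial material.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical game state: a few integer features a designer can read.
State = Dict[str, int]


@dataclass
class Rule:
    description: str                      # human-readable intent of the rule
    condition: Callable[[State], bool]    # when the rule fires
    action: str                           # hypothetical action name


def act(policy: List[Rule], state: State, default: str = "idle") -> str:
    """Return the action of the first rule whose condition holds."""
    for rule in policy:
        if rule.condition(state):
            return rule.action
    return default


# The policy reads top to bottom like a small program (all rules are assumptions).
policy = [
    Rule("defend when enemies are near",
         lambda s: s["enemies_near_base"] > 0, "defend_base"),
    Rule("train workers early",
         lambda s: s["workers"] < 3, "train_worker"),
    Rule("build barracks when affordable",
         lambda s: s["resources"] >= 5 and s["barracks"] == 0, "build_barracks"),
    Rule("otherwise harvest",
         lambda s: True, "harvest"),
]
```

Because the policy is an ordered list of readable rules, a designer could change its behaviour by editing, reordering, or removing a single rule, which is the kind of manual modification that is hard to do with a neural policy.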


Levi Lelis, University of Alberta: Dr. Levi Lelis is an Assistant Professor at the University of Alberta, an Amii Fellow, and a Canada CIFAR AI Chair. Prior to joining the University of Alberta, Levi was a Professor for six years at the Universidade Federal de Viçosa in Brazil. Levi was the program chair for the Symposium on Combinatorial Search (SoCS) in 2015 and for the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) in 2019. Levi has published more than 60 papers in AI and ML venues. One of these papers won a Distinguished Paper Award at IJCAI in 2023.

Psychophysiology in Games and UX Research


This tutorial will guide attendees through the basic knowledge needed to understand the nature of the most common physiological signals (e.g. EEG or GSR) employed in user/player experience research and their connection to higher-level cognitive and affective phenomena. For each of these signals, we will discuss the challenges connected to their collection and analysis and the opportunities they offer for understanding the player experience. We will highlight the latest research articles in the area and give pointers to useful resources such as datasets and software libraries.


Laurits Dixen, IT University of Copenhagen: Laurits Dixen is a PhD student at the IT University of Copenhagen and a member of the ITU brAIn lab. In his research, Mr. Dixen is investigating how deep learning algorithms such as convolutional neural networks and graph neural networks can be used to model the spatiotemporal behavior of EEG. During his PhD, he designed and conducted several studies involving recordings of multiple physiological signals.


Paolo Burelli, IT University of Copenhagen: Dr. Paolo Burelli is an associate professor at the IT University of Copenhagen where he heads the ITU brAIn lab. Together with his team, he investigates how machine learning can be leveraged to model and understand human perception and cognition, by replicating brain structures and their physiological behavior during gameplay.

Dr. Burelli has published more than 40 articles in international journals and conferences in the fields of user modeling, games, and machine learning and has been part of several editorial boards and conference organizing committees.


Statistical Forward Planning Algorithms


Statistical Forward Planning (SFP) is a group of robust and general AI techniques that use a simulation model to adaptively search for effective sequences of actions in various games and other problems characterized as Markov Decision Processes (MDPs), including Partially Observable MDPs (POMDPs).
SFP methods can operate without the need for prior training and can handle complex and dynamic environments. This tutorial will provide a tour of SFP from basics to finer points, complete with pointers to Python code (e.g. in OpenSpiel and other repositories). We will cover a number of powerful SFP algorithms including Monte Carlo Tree Search (MCTS), Rolling Horizon Evolutionary Algorithm (RHEA) and Monte Carlo Graph Search (MCGS), as well as handling partial observability with Information Set MCTS. We’ll also cover:

  • the relationship between SFP algorithms and Counterfactual Regret Minimisation (CFR)
  • incorporating policy and value functions (similar to AlphaZero)
  • efficient exploration functions for flat reward landscapes
  • handling combinatorial action spaces

Demonstrations will show that these algorithms can play a variety of video games surprisingly well and will provide insights into their working principles and behaviours. The tutorial will be suitable for those with no experience of SFP, but we also expect MCTS veterans to gain some fresh insights. We will conclude with a discussion of some of the most exciting challenges in the area.
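As a rough illustration of how an SFP method searches with a simulation model and no prior training, the following Python sketch shows a simplified, mutation-only variant of a Rolling Horizon Evolutionary Algorithm. It is not taken from the tutorial or from OpenSpiel: the one-dimensional toy game, the `step` forward model, and all parameter values are assumptions chosen to keep the example short.

```python
import random

GOAL = 10                 # assumed toy target position on a line
ACTIONS = (-1, 0, 1)      # move left, stay, move right


def step(state, action):
    """Toy forward model: move along a line; reward is negative distance to GOAL."""
    next_state = state + action
    return next_state, -abs(GOAL - next_state)


def evaluate(state, plan):
    """Simulate a plan with the forward model and return its total reward."""
    total = 0
    for action in plan:
        state, reward = step(state, action)
        total += reward
    return total


def mutate(plan, rng, rate=0.2):
    """Resample each action in the plan with probability `rate`."""
    return [rng.choice(ACTIONS) if rng.random() < rate else a for a in plan]


def rhea_action(state, plan, rng, n_candidates=20):
    """Improve the plan by mutation, then return its first action and the shifted plan."""
    best, best_score = plan, evaluate(state, plan)
    for _ in range(n_candidates):
        candidate = mutate(best, rng)
        score = evaluate(state, candidate)
        if score >= best_score:
            best, best_score = candidate, score
    # Shift buffer: drop the executed action and pad with a random one.
    return best[0], best[1:] + [rng.choice(ACTIONS)]


def run_episode(seed=0, horizon=5, max_steps=40):
    """Replan at every step and return the final position."""
    rng = random.Random(seed)
    state = 0
    plan = [rng.choice(ACTIONS) for _ in range(horizon)]
    for _ in range(max_steps):
        action, plan = rhea_action(state, plan, rng)
        state, _ = step(state, action)
    return state
```

Calling `run_episode()` returns the agent's final position. With this toy reward the agent should tend to finish near `GOAL`, although the search is stochastic and nothing is learned between episodes: all decisions come from simulating candidate action sequences at decision time.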


Simon Lucas, Queen Mary University of London: Simon is a full professor in the School of Electronic Engineering and Computer Science at Queen Mary University of London (QMUL) where he leads the Game AI Research Group. He was previously Head of the School of EECS at QMUL, and prior to that, Head of the School of Computer Science and Electronic Engineering at the University of Essex. He recently spent two years as a research scientist/software engineer in the Simulation-Based Testing team at Meta, applying simulation-based AI to automated testing.

Simon was the founding Editor-in-Chief of the IEEE Transactions on Games and co-founded the IEEE Conference on Games, was VP-Education for the IEEE Computational Intelligence Society and has served in many conference chair roles. His current research is focused on simulation-based AI and sample-efficient optimization. 


James Goodman, Queen Mary University of London: James Goodman is in the final year of his PhD on statistical forward planning in multiplayer board games. He is currently focusing on working with tabletop game designers to use AI techniques in TAG (the Tabletop Games framework) to test their games during the design phase. He (indecisively) holds first degrees in Chemistry, Mathematics and History, and an MSc in Computational Statistics and Machine Learning from UCL.


Marko Tot, Queen Mary University of London: Marko Tot is a PhD student at Queen Mary University of London and a member of the Game AI Research Group. He received a Microsoft Research PhD Studentship in 2020 and works on combining Learned Forward Models with Statistical Forward Planning.


Prompt Engineering for Science Birds Level Generation and Beyond

Large language models (LLMs) are widely utilized in game domains, including level generation, which is part of procedural content generation (PCG). ChatGPT4PCG is a competition that provides a platform for participants to utilize their skills in prompt engineering (PE) to generate levels for Science Birds that satisfy stability, similarity, and diversity requirements. In this tutorial, we will introduce various PE techniques from basic to advanced and provide hands-on experience with the ChatGPT4PCG competition platform by implementing tree-of-thought prompting, one of the more advanced PE techniques. We will also discuss the potential of PE in PCG (in general) and its importance to the future of PCG research.

IMPORTANT: If you plan to attend this tutorial, please visit


Pittawat Taveekitworachai, Intelligent Computer Entertainment, Ritsumeikan University: Pittawat Taveekitworachai is a second-year Master’s student at the Intelligent Computer Entertainment Lab at Ritsumeikan University. His research interests include procedural content generation, large language models, and prompt engineering.


Febri Abdullah, Intelligent Computer Entertainment, Ritsumeikan University: Febri Abdullah is a third-year Doctoral student at the Intelligent Computer Entertainment Lab at Ritsumeikan University. His research interests include procedural content generation, large language models, and prompt engineering.





© 2023 All rights reserved
