Temporal dynamics of prediction error processing during reward-based decision making

Marios G. Philiastides; Guido Biele; Niki Vavatzanidis; Philipp Kazzer; Hauke R. Heekeren

doi:10.1016/j.neuroimage.2010.05.052

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

Marios G. Philiastides, Max Planck Institute for Human Development, Max-Planck-Institut für Kognitions- und Neurowissenschaften (Author)
Guido Biele, Max Planck Institute for Human Development, Max-Planck-Institut für Kognitions- und Neurowissenschaften, Freie Universität (FU) Berlin (Author)
Niki Vavatzanidis, Max Planck Institute for Human Development (Author)
Philipp Kazzer, Max Planck Institute for Human Development (Author)
Hauke R. Heekeren, Max Planck Institute for Human Development, Max-Planck-Institut für Kognitions- und Neurowissenschaften, Freie Universität (FU) Berlin (Author)

Abstract

Adaptive decision making depends on the accurate representation of rewards associated with potential choices. These representations can be acquired with reinforcement learning (RL) mechanisms, which use the prediction error (PE, the difference between expected and received rewards) as a learning signal to update reward expectations. While EEG experiments have highlighted the role of feedback-related potentials during performance monitoring, important questions remain about the temporal sequence of feedback processing and the specific function of feedback-related potentials during reward-based decision making. Here, we hypothesized that feedback processing starts with a qualitative evaluation of outcome valence, which is subsequently complemented by a quantitative representation of PE magnitude. Results of a model-based single-trial analysis of EEG data collected during a reversal learning task showed that, around 220 ms after feedback, outcomes are initially evaluated categorically with respect to their valence (positive vs. negative). Around 300 ms, and in parallel with the maintained valence evaluation, the brain also represents quantitative information about PE magnitude, thus providing the complete information needed to update reward expectations and to guide adaptive decision making. Importantly, our single-trial EEG analysis based on PEs from an RL model showed that the feedback-related potentials do not merely reflect error awareness, but rather quantitative information crucial for learning reward contingencies.
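The learning signal described in the abstract can be illustrated with a minimal sketch. It assumes a generic delta-rule (Rescorla-Wagner style) update, which is not necessarily the RL model fitted in the article. The function names, the learning rate of 0.3, the initial expectation of 0.5, and the 0/1 reward coding are all illustrative assumptions. The sketch separates each PE into a categorical valence (its sign) and a quantitative magnitude (its absolute value), mirroring the two stages the abstract describes.

```python
def delta_rule_update(expected, reward, learning_rate=0.3):
    """One feedback event under a generic delta rule (assumed, not the paper's model).

    Returns the prediction error and the updated reward expectation.
    """
    prediction_error = reward - expected  # received minus expected reward
    new_expected = expected + learning_rate * prediction_error
    return prediction_error, new_expected


def simulate(rewards, learning_rate=0.3, initial_expectation=0.5):
    """Run the update over a sequence of rewards for a single option.

    Each trial records the signed PE, its valence (sign) and its magnitude.
    """
    expected = initial_expectation
    trials = []
    for reward in rewards:
        pe, expected = delta_rule_update(expected, reward, learning_rate)
        trials.append({
            "pe": pe,
            "valence": "positive" if pe > 0 else "negative",  # categorical part
            "magnitude": abs(pe),                              # quantitative part
            "expected": expected,                              # expectation after update
        })
    return trials


if __name__ == "__main__":
    # Hypothetical outcome sequence: two rewarded trials, then an unrewarded one,
    # as might happen when reward contingencies reverse.
    for trial in simulate([1.0, 1.0, 0.0]):
        print(trial)
```

In this sketch the unrewarded third trial produces a negative PE whose magnitude is larger than that of the preceding positive PEs, because the expectation has risen by then. A single-trial analysis of the kind described would relate such trial-wise PE values to the EEG signal.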

Details

Original language	English
Pages (from - to)	221-232
Number of pages	12
Journal	NeuroImage
Volume	53
Issue number	1
Publication status	Published - 25 May 2010
Peer-reviewed	Yes
Externally published	Yes

External IDs

PubMed	20510376
ORCID	/0000-0002-5009-1719/work/142235800

Keywords

ASJC Scopus subject areas

Neurology
Cognitive Neuroscience

Keywords

Decision making, EEG, Model, Prediction error, Reinforcement learning, Reward, Single-trial

Research portal of TU Dresden