Meta-cognitive Efficiency in Learned Value-based Choice

Sara Ershadmanesh, Ali Gholamzadeh, Kobe Desender, Peter Dayan

Meta-cognition, our ability to assess the quality of our own decisions, is important for regulating choices and has been extensively studied through assessing and modeling reports of confidence. However, these studies focus on immediate choices rather than sequential ones. The latter pose particular problems for meta-cognitive efficiency assessments, because the underlying difficulty of decision-making changes as the problems evolve. Here, we focus on sensitivity and bias of meta-cognitive judgments in learning/decision-making tasks in which outcome values must be learned across trials. We repurposed the central idea underlying the M-ratio, a popular meta-cognitive assessment measure in perceptual decision-making.

We used a two-arm bandit reversal task in which the rewards following a better or worse choice were drawn from normal distributions whose variance took one of three levels, each corresponding to a different task condition.
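As an illustration of this task structure, the following is a minimal simulation sketch. The abstract does not report the reward means, the three variance levels, the number of trials, or the reversal schedule, so every numeric value below (and the names `SD_LEVELS` and `simulate_bandit`) is a hypothetical placeholder rather than a parameter of the study.

```python
import random

# Hypothetical standard deviations for the three variance conditions;
# the actual values are not given in the abstract.
SD_LEVELS = {"low": 5.0, "medium": 10.0, "high": 20.0}


def simulate_bandit(n_trials=200, reversal_every=50,
                    mean_better=60.0, mean_worse=40.0, sd=10.0, seed=0):
    """Simulate a two-arm reversal bandit with normally distributed rewards.

    Returns a list of (better_arm, [reward_arm0, reward_arm1]) per trial.
    All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    better_arm = 0
    trials = []
    for t in range(n_trials):
        # Swap which arm is better at fixed intervals (assumed schedule).
        if t > 0 and t % reversal_every == 0:
            better_arm = 1 - better_arm
        means = [mean_worse, mean_worse]
        means[better_arm] = mean_better
        # Draw one potential reward per arm from a normal distribution.
        rewards = [rng.gauss(m, sd) for m in means]
        trials.append((better_arm, rewards))
    return trials


# One simulated block per (assumed) variance condition.
blocks = {name: simulate_bandit(sd=sd) for name, sd in SD_LEVELS.items()}
```

A fixed reversal interval is used here only for simplicity; the study may have used a different schedule.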

We built a Forward model of confidence, which characterizes the subjects’ choices and generates ‘first order’ confidence from the modelled probability of being correct; and a Backward model of confidence, which generates choices whose first-order confidence best matches the subjects’ confidence reports. The performances of the Forward and Backward models play the roles of d’ and meta-d’, respectively, in our measure of meta-cognitive efficiency, the MetaRL-Ratio.
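To make the idea of model-derived first-order confidence concrete, here is a minimal sketch under strong assumptions. The abstract does not specify the learning rule or choice rule, so this example swaps in a standard delta-rule value learner with a softmax choice rule, and it approximates the probability of being correct by the softmax probability of the chosen arm. The function name `forward_confidence` and the parameters `alpha`, `beta`, and `q_init` are hypothetical and are not taken from the authors' model.

```python
import math


def forward_confidence(choices, rewards, alpha=0.3, beta=0.2, q_init=50.0):
    """Return a per-trial confidence proxy from a simple value learner.

    choices: sequence of chosen arms (0 or 1)
    rewards: sequence of rewards received for those choices
    alpha:   learning rate (assumed)
    beta:    softmax inverse temperature (assumed)
    """
    q = [q_init, q_init]
    confidences = []
    for choice, reward in zip(choices, rewards):
        # With two arms, the softmax probability of the chosen arm reduces
        # to a logistic function of the value difference.
        diff = q[choice] - q[1 - choice]
        p_chosen = 1.0 / (1.0 + math.exp(-beta * diff))
        confidences.append(p_chosen)
        # Delta-rule update of the chosen arm's value only.
        q[choice] += alpha * (reward - q[choice])
    return confidences


# Usage: confidence rises as the chosen arm keeps paying above its estimate.
example = forward_confidence([0, 0, 0], [60.0, 60.0, 60.0])
```

In this sketch, confidence is computed before the outcome of the trial is observed, so it reflects only what has been learned from earlier trials.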

We found that the performance of the Backward model was consistent with previous measures of meta-cognitive sensitivity. We showed the benefits of the MetaRL-Ratio relative to previous measures, including its insensitivity to accuracy and bias. The MetaRL-Ratio differentiated simulated low and high meta-cognitive competence, and was suitably sensitive when different levels of noise were added to confidence judgments. Furthermore, the measure was robust across different levels of difficulty.

We plan to examine whether our measure is robust across gain and loss contexts, and across reversal and non-reversal conditions. We will extend this measure to a broader range of RL problems, including those in common use for assessing cognitive disorders.