Anne Collins runs her Computational Cognitive Neuroscience Lab at the University of California, Berkeley. One of the things she’s been working on for years is how our working memory plays a role in learning as well, and specifically how working memory and reinforcement learning interact to affect how we learn, depending on the nature of what we’re trying to learn. We discuss that interaction specifically. We also discuss more broadly how segregated and how overlapping and interacting our cognitive functions are, what that implies about our natural tendency to think in dichotomies – like MF vs MB-RL, system-1 vs system-2, etc., and we dive into plenty of other subjects, like how to possibly incorporate these ideas into AI.
- Computational Cognitive Neuroscience Lab.
- Twitter: @ccnlab or @Anne_On_Tw.
- Related papers:
- How Working Memory and Reinforcement Learning Are Intertwined: A Cognitive, Neural, and Computational Perspective.
- Beyond simple dichotomies in reinforcement learning.
- The Role of Executive Function in Shaping Reinforcement Learning.
- What do reinforcement learning models measure? Interpreting model parameters in cognition and neuroscience.
0:00 – Intro
5:25 – Dimensionality of learning
11:19 – Modularity of function and computations
16:51 – Is working memory a thing?
19:33 – Model-free model-based dichotomy
30:40 – Working memory and RL
44:43 – How working memory and RL interact
50:50 – Working memory and attention
59:37 – Computations vs. implementations
1:03:25 – Interpreting results
1:08:00 – Working memory and AI
Anne 00:00:03 Something that’s curious about working memory is how limited it is really. Right? Right. Like it’s, it’s very, uh, it’s very stupidly limited, right? Like three or four items, really. It’s, um, it’s, uh, you know, like if you’re, if you’re an AI person, you’re like, why would I bother, uh, <laugh>? And that’s the thing, right? It’s like we have this system that has a very low capacity, and I think AI sees that as a bug. Um, and I think it’s actually most likely a feature <laugh>.
Paul 00:00:35 Is that why I’m such a slow learner? Because I’m always just using my working memory. Do I need to back off and try to, uh, use my reinforcement learning more? What do I need to do? How do I learn better?
Anne 00:00:46 <laugh> So I have two answers to this <laugh>. This is Brain Inspired.
Paul 00:00:55 Welcome to Brain Inspired. It’s Paul. Reinforcement learning has been one of the greatest success stories tying together brains, behavior, and artificial intelligence. Long ago now, reinforcement learning algorithms that were developed in computer science were imported into neuroscience to account for the brain activity associated with how we learn. Uh, since then, a wide variety of algorithms and computations underlying various forms of reinforcement learning have been explored, along with the neural substrates possibly implementing those algorithms. However, our brains are highly complex entities, and as we’ve discovered more about learning, the story has become more complicated. It isn’t clear how and when various brain activities map onto the various particular equations used to describe how we learn. And people like Anne Collins, my guest today, are showing that reinforcement learning isn’t the only game in town in terms of how our brains learn. Anne is a professor at the University of California, Berkeley, where she runs her Computational Cognitive Neuroscience Lab.
Paul 00:01:59 And one of the things that she’s been working on for years now is how our working memory plays a role in learning as well. And specifically how working memory and reinforcement learning interact to affect how we learn depending on the nature of what we’re trying to learn. So in this episode, we talk about that interaction specifically. We also discuss more broadly, uh, how segregated and or how overlapping and interacting many of our cognitive functions are, and what that implies about our natural tendency to think in dichotomies, like model-free versus model-based reinforcement learning, system one versus system two, and so on. Uh, and we dive into plenty of other subjects, like how to possibly incorporate these ideas into artificial systems. You can learn more about Anne in the show notes at braininspired.co/podcast/154. Thanks to the Brain Inspired supporters, you people are the best. And it’s just so generous of you to take the trouble to send a few bucks my way each month to help me make this podcast. And I always look forward to, uh, our live discussions and our interactions. Thank you. All right. Here’s Anne.
Paul 00:03:09 Anne, um, I know that you’re not at SFN right now, the annual neuroscience meeting. And in fact, this, um, our, our discussion here is I think over a year in the making because I’d asked you so long ago, but you had decided to go and procreate apparently for the third time. Um, and, and you were telling me that that’s, that’s why you’re not at this annual neuroscience meeting. So, but I thought maybe that was your first child. So I was gonna ask you, you know, what it was, you know, how motherhood was, was treating your career, and, uh, otherwise, but, but you have three.
Anne 00:03:41 Yeah. Yeah, I have three. They’re, uh, five and a half, three and a half and, um, six months old now. Um, uh, I’m not going to lie, motherhood is rough with a career, especially if your partner has a career too. Um, actually with my first child, um, my husband, uh, wasn’t quite working full time yet. Um, and so we were able to travel and go to lots of conferences and stuff like that, which makes for some really interesting memories of, um, you know, being at SFN with a baby in the pouch and stuff like that. Um, uh, but yeah, um, I think the combination of having the other two, having full-time careers, and, um, just, you know, having lost the habit of traveling with Covid too, mm-hmm <affirmative>, yep, um, has really made it much harder this year.
Paul 00:04:32 Are are you done? Are you gonna keep going? What? I stopped at two and I have a surgery, uh, to, to prove it. <laugh>
Anne 00:04:40 <laugh>. Um, that’s a bit too much detail. Okay. Um, I <laugh>, um, I’m one of six children. Um, oh. So people like, feel like they can ask me this question. Um, I’m actually the fifth of, um, of six children. Um, but, um, no, I don’t think so. <laugh> I think, you know, like, it’s already pretty, pretty hard enough, um, at this point. Um, and, you know, I have three girls. Uh, they’re very lovely, but they’re also a handful, so, <laugh>. Um,
Paul 00:05:11 Yeah. All right. Well, I’m glad that we’re finally doing this. So I, I appreciate you, um, finally coming on to the podcast. It was a lot of emails back and forth on the making, so thanks
Anne 00:05:19 For the persistence, <laugh>.
Paul 00:05:21 Yeah. I am persistent. Um, so we’re gonna talk a lot about today about your work, um, relating reinforcement learning, uh, in the brain to working memory. And, and hopefully we’ll talk a little about, a little bit about, uh, attention as well. But I, I wanted to start by asking you, um, since you have, you know, you’ve worked a lot on the interactions between working memory and reinforcement learning, uh, I wanted to start by asking you just how you feel, how you would describe your outlook or your conception of learning and reinforcement learning, uh, has changed or been shaped throughout your career. C can you describe that, that sort of projection?
Anne 00:06:04 Yeah. So, you know, I, I thought about it since you kindly sent me, um, the questions to prepare a little bit. Um, and I had, that’s the question I had the hardest time with actually mm-hmm. <affirmative>, because, um, I got into this field, not in a traditional way, not that I think many people, there’s
Paul 00:06:22 No traditional
Anne 00:06:23 Career there, but, um, I think in France, it’s maybe, at least when I was there, it’s maybe even less traditional. Um, you know, there was no undergrad, um, in anything close to cognitive science. Um, I, I discovered cognitive science as part of, um, um, of my, um, uh, breadth requirement, um, in engineering school, um, you know, alongside, um, you know, painting and music and stuff like that. Um, so it was really, I was in a very STEM-oriented, um, undergrad and, um, you know, like, um, this field was considered very outside of, um, of scientific, um, rigor. Um, and because of that, I think I’ve, I’ve had this approach that like, I, I dove into problems and didn’t have much breadth or, or, or, uh, height of view either. And immediately saw that learning was fairly complex. And throughout my career, essentially, that’s been pretty much confirmed <laugh>, you know, like, it’s, um, it’s, um, there’s no, like, it’s complex and, um,
Paul 00:07:31 Uh, is it complex or, or complicated?
Anne 00:07:35 Um, both probably. Okay. So, um, yeah. Um, yeah, I think it’s both, uh, really, um, I was pretty lost at the beginning. I, I feel like we’re still pretty lost <laugh>. And also I felt from the beginning that there were, you know, many chunks to it, uh, which I think ma maps to the complex, um, part. Yeah. And, uh, and I’m still pretty convinced of that. So, um, so it hasn’t really changed much. It’s just, you know, like become more defined in a sense.
Paul 00:08:06 So a lot on this podcast, we, we’ve talked about, um, this current trend of reducing the dimensionality of like neural activity, right? And then, um, describing these lower dimensional manifold states that kind of, you can start mapping onto cognitive functions, but what you’re saying is that, um, learning is, uh, high dimensional. I mean, how high dimensional is learning and, and do we need to keep it at a, at a high D level, or are there dimensionality reduction techniques, you know, that we can, um, use, I mean, for instance, reinforcement learning, um, also, you know, just a has turned out to be a, uh, complicated affair also. Right? Um, so we call it learning. I guess that’s the lowest dimension that we can term that we can use. Um, yeah. But in your, in your, have you thought throughout your career that, um, it is even more complex than you originally thought? Or have you, uh, been able to sort of take some of those chunks away and hone in on, uh, what you think are, are fundamental principles?
Anne 00:09:08 Yeah, so I, I, um, I’ve gone more the direction of more complex than less complex. And I think that’s, um, um, my undergrad major was in theoretical math. And I think lots of people in this field come from math or physics and have a bias for elegance. And, um, you know, like, um, you know, unified theories and unified theories are wonderful and shared principles and stuff like that. Um, and I think if there’s room for that in the brain, but I think, um, I think we’re probably too much in that direction in the theory, um, of cognition. I think, um, uh, I think it’s more messy, you know, than we might like <laugh>, um, for an elegant, um, uh, theory. So I, so I do think it’s, it’s high dimensional. Um, but for your question as to whether I’ve been able to take out some chunks, um, yeah, I do think there are shared principles and maybe not shared principles, but shared computations.
Anne 00:10:18 Um, and, and I wasn’t sure if I should count that as a different dimension or not, right? If you’re doing the same computation, but apply it to different things. Um, in a sense, algorithmically speaking, it’s, um, the same dimension, but, um, it results in, uh, behavior that is differently dimensional, um, or more complex. Um, and so I think it depends a little bit how you look at it. And I think, you know, like the, the trend in looking at the brain and, you know, um, taking manifolds and seeing the dimensionality and stuff like that tends to apply more, I think, to representations than to, uh, changes in representations, um, which is learning, um, mm-hmm. <affirmative>, you know, like computations, uh, by themselves. So I don’t think it’s easy to define the dimensionality of computations, um, which I think learning is, um, yeah. Sorry, that’s a bit rambly.
Paul 00:11:18 No, that, that’s okay. I mean, I’m trying to grapple with, uh, the, the sort of the different levels that we talk about and the dimensionality. Yeah. And, you know, so there’s cognition and then there’s what happens in the brain, like the, the mechanisms and computations in the brain. Um, I, I just had Michael Anderson on to talk about his neural reuse and, and, um, ideas and thinking about going away from thinking about modules in the brain and thinking more in terms of interacting, uh, brain areas and how that, how the interactions, there are different players for different cognitive functions, but those, like the same brain area will get reused. Right. And you, you were just mentioning, you know, if you’re using the same computations in, uh, different areas, whether we should consider that, um, you know, a different, a different dimension, but
Anne 00:12:05 Go ahead. I don’t, I don’t mean in different areas. I mean, really. Um, so, so I think of a, of something very specific, which is, um, the cortico-basal ganglia circuits mm-hmm. <affirmative>. Um, so there’s loops that go from cortex to, um, striatum to the, uh, output of the basal ganglia through the thalamus and back. Um, and I think it’s an example where we have, you know, a decent idea of what computation or transformation those loops do. Um, and we know that there’s multiple of them, um, with different starting points in, in cortex, right? And so I think that’s an example of, you know, not intra-region computations, but like a network mm-hmm. <affirmative>. So that’s making a computation or at least has a, uh, some kind of algorithmic function or, you know, information transformation function, um, that, um, can be applied to different, uh, things, uh, which may lead to, uh, fairly different consequences on cognition and behavior. Um, does that make sense? Yeah.
Paul 00:13:09 Well, part of what we’re gonna talk about is, um, maybe not the computation itself, but the idea of, uh, you know, adding a working memory component to a reinforcement learning algorithm, right? And you, you do that within a computation, but Yeah. Uh, but then, you know, there are other cognitive functions like attention, et cetera, that maybe you’re gonna eventually throw into the, the, the equation. And would that mean, would that be a different computation or, you know, I’m trying to think about how to think about adding terms to an equation, right? And calling it a computation. Does that change the computation? Do we think of it as one computation? What I was gonna ask about is, you know, the modularity of the brain is, is, um, giving way to like interacting brain structures and like the, uh, basal ganglia, um, cortico-thalamic loops, you know, that you were talking about. Um, and I, I’m, I’m trying to think, you know, do we need to think of computations as modular or, you know, or as high dimensional, you know, how messy. Computations are clean and the brain is messy. Yeah. Right? How to, how to reconcile those two things?
Anne 00:14:16 Yeah. I don’t have a good answer, but the way I think about it again, is, um, is, um, as a Taylor expansion <laugh>, or, like I said, I did math in undergrad, but you know, like essentially we’re, we’re, we’re doing this thing, right, where we, um, we’re trying to understand, um, some aspect of cognition and, you know, via the brain. And, um, and you know, like we start with, you know, like a point approximation, okay. Like maybe learning is like behaviorist or, you know, like cause and effect or something like that. Um, and then we’ll go, okay, well actually it’s a bit more complicated than that. Let’s, you know, let’s approximate the function with a line. And, you know, you have two, you know, two, um, two, uh, two degrees of freedom, right? Um, and then you could add a third one, et cetera. And, and it’s a little bit like that, that I think about it.
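The Taylor-expansion analogy can be written out explicitly. What follows is just the standard expansion of a function f around a point a, offered to illustrate the analogy; the labels under each term paraphrase the conversation and are not a formal claim about learning.

```latex
f(x) \approx \underbrace{f(a)}_{\text{point approximation}}
  + \underbrace{f'(a)\,(x-a)}_{\text{first order: a line}}
  + \underbrace{\tfrac{1}{2} f''(a)\,(x-a)^{2}}_{\text{second order}}
  + \dots
```

Each added term refines the previous approximation without invalidating it, which is the sense in which later "orders" can only be reached after the earlier ones.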
Anne 00:15:18 Um, it’s, you know, um, at first approximation, you know, like it was okay to capture lots of learning with just the delta rule kind of RL model mm-hmm. <affirmative>. Um, but then you dig a bit deeper, uh, you know, and, and I could tell you how I got into working memory, um, if you’re interested later, please. But, um, okay. Um, but, you know, and you discover you have to add working memory, so now you have reinforcement learning and working memory, and they’re independent modules, you know, that just, they’re just, uh, mixed for output, but they’re independent. And then, you know, you dig a bit deeper and you discover, well, actually they’re not independent modules. They kind of like, um, have an impact on each other. And so you have to add that to the mix, et cetera. And, and to me, that’s the third order. And, you know, uh, and, uh, and, and it’s not the third order in the sense that it’s less important or less true, it’s just the third order in the sense that, um, you can’t really get to it until you’ve gone, uh, to the second order first. Like, you have to identify them as potentially independent, uh, modules before you can even start thinking about how they might be interacting. Um, and so I agree with you, it’s, it’s kind of tough, um, because I don’t think any brain region works on its own, but I still think we can try to isolate, you know, like first-order kind of, um, um, uh, computations they do, um, that will help us then understand how they talk, um, together, and without having to assume that they’re independent.
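The "second order" picture described here, two independent learners whose outputs are only mixed at choice time, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model from any specific paper: the function names, the softmax choice rule, the one-shot working-memory store with decay toward uniform, and every parameter value are hypothetical.

```python
import math

def softmax(values, beta):
    """Turn values into choice probabilities (beta = inverse temperature)."""
    m = max(values)
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def rl_update(q, action, reward, alpha=0.1):
    """Slow, incremental delta-rule update of the chosen action's value."""
    q = list(q)
    q[action] += alpha * (reward - q[action])
    return q

def wm_update(w, action, reward, decay=0.2):
    """Working memory (assumed form): forget toward uniform, then store the last outcome in one shot."""
    n = len(w)
    w = [x + decay * (1.0 / n - x) for x in w]
    w[action] = float(reward)
    return w

def mixture_policy(q_rl, w_wm, weight_wm, beta=8.0):
    """Independent modules, mixed only at the output stage."""
    p_rl = softmax(q_rl, beta)
    p_wm = softmax(w_wm, beta)
    return [weight_wm * pw + (1.0 - weight_wm) * pr for pw, pr in zip(p_wm, p_rl)]

# One rewarded trial on action 0, starting from uniform values over three actions.
q_rl = rl_update([1 / 3] * 3, action=0, reward=1.0)
w_wm = wm_update([1 / 3] * 3, action=0, reward=1.0)
policy = mixture_policy(q_rl, w_wm, weight_wm=0.8)
```

After a single rewarded trial, the working-memory module already strongly favors the rewarded action while the RL module has barely moved, which is the fast-versus-slow contrast discussed in the conversation. Letting the modules influence each other's updates would be the "third order" step.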
Paul 00:16:51 Well, yeah, it’s, so they’re the cognitive functions and the brain areas, and those are two separate things, obviously, but is wor this is a, well, maybe you’ll have an answer to this, is, is working memory a thing or is it working memory, like your model, right? Like your model just lumps them together, and then maybe that’s a separate thing? Or how do, where are the, the bright lines between cognitive functions, right? If, if they’re interacting, so there’s, interacting is like two different things, but they are interacting. Yeah. Or we could consider that interacting system one thing, or, you know, uh, so is working memory a thing <laugh>?
Anne 00:17:28 Yeah, I think it’s a thing. Um, so I think, I think it’s the animal researchers who have the answer. Um, you know, I think it’s a thing on its own. If, if you kill one, um, uh, the other one can still work. Um, right. So I think if somehow you were able to, um, you know, just take the approximation of my, uh, reinforcement learning and working memory model without interactions, um, I, I think if you were able to cancel out working memory, you would still be learning, just fairly slowly. Um, and if you were able to turn off reinforcement learning, you would still be using working memory to do something that looks like learning, um, uh, it would just have a fairly different, um, characteristic, um, uh, behavior. So it’s, I, I think it, the, the lesion in a sense of like, if you take it out, does it still work on its own? Um, is a sense in which you can say, well, it is a thing, um, even if it’s deeply enmeshed with something
Paul 00:18:32 Else, but in the brain, you would have to lesion a particular area and then tie that thing to a particular area. So, so then there, it’s still not clear how to go from the level of, you know, the implementation level in the brain to the cognitive function level that we have named in psychology, for instance.
Anne 00:18:48 Oh, I completely agree that it’s very non-trivial to do that mapping. Um, and that, that’s a lot of work, um, that, that’s, that’s an enormous amount of work to be done to do that mapping, and that goes through, you know, taking multiple approaches, you know, cross species, cross methodologies. Um, I, I fully agree
Paul 00:19:08 You’ll stick with humans though.
Anne 00:19:12 Uh, you know, I collaborate with, um, with people. I, I, I definitely won’t do nonhuman research myself. I’m not qualified, but, um, but I’m very, very happy to collaborate with, um, for example, uh, Professor Linda Wilbrecht here at Berkeley, and, you know, others elsewhere, um, uh, to try to bridge, uh, across species.
Paul 00:19:33 All right. Let’s back up and talk about reinforcement learning itself a little bit more, um, before we bring in your work with working memory. You, you’ve written about the, and other people have written, like Nathaniel Daw, about the dichotomy, uh, of model-free versus model-based reinforcement learning. Um, and, you know, these were thought to be totally separate, um, cognitive functions and in the brain, et cetera. And, and now it’s not so clear. Um, so I, maybe I’ll just ask you, you know, what is your view, and there’s a nice review that I’ll, I’ll point to that you’ve written, um, about this dichotomy. So, so what is your view on model-based versus model-free reinforcement learning? Or are they two separate things? Are they interacting? Are they one reinforcement learning thing? How should we think about that?
Anne 00:20:20 Yeah, and I, I’ll, um, uh, mention Jeff Cockburn, with, with whom I wrote this review. Um, so the title of the review is Beyond, uh, dichotomies. So <laugh>, I think that that tells you a little bit,
Paul 00:20:31 And maybe just for for listener’s sake, we, I guess you should describe model free versus model based. Sometimes I don’t, I forget to do that. <laugh>.
Anne 00:20:40 Yeah. Yeah. So model-free reinforcement learning is, um, uh, this approach where, uh, you assume you integrate, um, uh, the information about reward you’ve received for a given choice, uh, over time, and you summarize it into a single cached value. So you say like, if I go to this restaurant, on average, it has an expected value of 0.8, this one 0.6, or something like that. Uh, model-based, um, reinforcement learning, the way it’s framed most often is, um, that you, um, have a model of the world that tells you, if I do this action, I expect to be in this state, um, after that, and, uh, you have a model of, um, the outcomes, uh, which tells you if I am in this state, I also expect to experience this kind of valence. And, uh, that you use this information in an effortful way to plan your choices.
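The two definitions just given can be contrasted in a short sketch. It is only an illustration of the textbook framing, assuming a one-step restaurant choice; the state names, probabilities, valences, and learning rate below are made up for the example.

```python
def model_free_update(q, reward, alpha=0.5):
    """Model-free: fold each experienced reward into a single cached value."""
    return q + alpha * (reward - q)

def model_based_value(action, transition, outcome_value):
    """Model-based: compute the value on the spot from a world model.

    transition[action] maps each possible next state to its probability;
    outcome_value maps each state to the valence expected there.
    """
    return sum(p * outcome_value[state] for state, p in transition[action].items())

# Model-free: repeated visits are summarized into one number; the history is discarded.
q = 0.0
for reward in [1, 1, 0, 1, 1]:
    q = model_free_update(q, reward)

# Model-based: combine "where will I end up" with "how good is it there".
transition = {
    "go_to_restaurant": {"open_with_favorite": 0.2, "open_without_favorite": 0.7, "closed": 0.1},
}
outcome_value = {"open_with_favorite": 1.0, "open_without_favorite": 0.4, "closed": 0.0}
v = model_based_value("go_to_restaurant", transition, outcome_value)
```

The cached value `q` cannot react when the world changes (say, the menu on Mondays) until new rewards are experienced, whereas `v` changes as soon as the transition or outcome model is edited, at the cost of doing the computation at choice time.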
Anne 00:21:40 It’s like, oh, if I, um, go here, I expect the restaurant to be open, but, um, I know that on Mondays they don’t have my favorite food, so I actually don’t expect it to be that good, or something like that. And you combine that and you compute on the spot, um, what the value is. Uh, so it’s obviously a super popular, uh, framework, model-based versus model-free, and it, um, resonates a lot with the whole, uh, thinking fast and thinking slow. Um, there’s a whole, like, history of psychology, uh, around, uh, dual systems like this. And, uh, everybody likes it. So I think for that reason, uh, people think of it also as relating to habits versus goals, um, automatic
Paul 00:22:24 Versus effort. Yeah. But yeah, we love, we love dichotomies. Mm-hmm. <affirmative>,
Anne 00:22:29 We love dichotomies. Um, the short story, um, of our, uh, opinion paper is that, um, is that while this is a very, um, productive framework, it’s also very, um, oversimplified <laugh>. And, um,
Paul 00:22:48 She said she, she’s gonna have to wave like that every few minutes for the lights to come back on. Sorry to interrupt.
Anne 00:22:53 Um, I’m in a green building. Um, the green building turns off, uh, lights often, and I don’t move enough when I talk. Um, so I don’t mean oversimplified, but I think it’s, um, I think it’s, uh, it, it’s approximating things and, um, everyone who works directly in this field knows what those approximations are. But because the framework, um, is, um, very, um, um, seductive in a sense, it’s, you know, like it’s a mathematically well defined, uh, model that seems to map on well to many, uh, uh, many, uh, heuristics we have, um, it’s been taken a little bit too seriously, I think, by people who are less familiar with the details of where it can go wrong. And I think it leads to, um, it, it can potentially lead to very big issues of overinterpreting, uh, the findings, um, and, uh, over, uh, and also, you know, leading you down wrong avenues of research, um, too, um, so I’m happy to say more about that. Um, yeah. You know, if you want more details, but, um,
Paul 00:24:10 Yeah, gimme more details. I have follow up question as well, so.
Anne 00:24:15 Okay. Um, well, I, I think for example, that, um, you can make a reasonable case, um, you know, like for the model-free computations in the brain. I think, you know, like there’s been a lot of work around this, and we have a good sense of a network. So to me, model-free I can accept that it’s, um, it’s, uh, it’s a meaningful chunk, um, of, uh, learning behavior. Well,
Paul 00:24:43 This was model-free, sorry, model-free was the first kind of reinforcement learning discovered in the brain. Yeah. Um, based on, yeah, yeah. In a sense, you could consider it the most basic level of automatic learning or something, perhaps.
Anne 00:25:00 Well, that’s the thing is like, I don’t know that that’s true. I don’t know that it relates to habits. I don’t know, uh, at least directly that it maps onto habits. I don’t know that it maps onto automaticity. Um, um, I, I know that lots of behavior that’s well described by model-free-like, uh, models is not, um, you know, this kind of implicit, effortless, um, uh, automatic, um, kind of process. So I think there’s a bit of difficulty in mapping it to more, you know, um, broader concepts like this. Um, but, but, but I do think it has the benefit of, you know, like, uh, at least we have a good hypothesis for where and how it might be implemented in the brain. Mm-hmm. <affirmative>, um, as far as model-based goes, um, um, I think it’s both, uh, too much and too little <laugh>. Say more?
Anne 00:25:59 Um, yes. So it’s too much in the sense that I don’t think it’s a thing. Um, I think there’s too many components, um, in it for it to be thought of as a thing. Um, and there’s many things you could, um, say about that, but the simplest one is to say, well, you know, um, there’s the planning component, the planning component requires, uh, you know, like either simulating things and holding them in working memory or doing it some other way. Maybe, you know, we have some approximations like, um, I’m sure you’ve heard others. Um, um, but then there’s also learning the transitions or, you know, like representing the reward function. Um, so, so I think there’s multiple subcomponents of it. Um, and so thinking of it as a chunk doesn’t seem right. Um, and the second thing I was going to say is that it’s not enough in the sense that, um, we often read, you know, even in very prominent journals, in the abstract, something saying like, learning is well known to be either model-free or model-based mm-hmm. <affirmative>. And I think that’s a big, big problem to, uh, say something like that. It’s that there’s many aspects of learning that are not, um, to be put into either of those two, um, bins. Um, and that, uh, should go beyond. And if only because, you know, model-free and model-based are focused on this, um, on this type of environment with, um, uh, sequential decision making. Yeah. Um,
Paul 00:27:34 Yeah. So, so model based is, is too much and not enough, and mm-hmm. <affirmative>, it’s not a thing. And model free is a thing, <laugh>. So what is that? Like, how should we think
Anne 00:27:46 It’s more likely to be
Paul 00:27:47 A thing? It’s more likely to be a thing. Okay. So probability of 0.7 being a thing. Um, okay. Well, I was gonna ask you then, how, how should we think of, as opposed to a dichotomy, how should we think of the, like the gradient or transition? Is it just piling more cognitive functions necessary? What we, what we call model-based is, um, I guess we could continue to call it model-based, but just realize that it, um, is comprised of lots of different elements, perhaps.
Anne 00:28:18 Yeah. So it’s this question of, you know, if you imagine you have, um, if you imagine learning as a high dimensional space, and, um, you imagine you have the, you know, model-free dimension here, and you have model-based, and, and I think essentially that model-based is not a single dimension. It’s a, you know, like scatter plot. It’s a manifold and Oh, okay. Yeah. And, and the question is, um, the question is what’s, what’s, what are meaningful dimensions that we want to consider out of this? And I think, and, and what other meaningful dimensions are there around, uh, around this, right? And I think people kind of agree that model-free is one meaningful dimension in that, because we can somewhat isolate it. Um, but I think in model-based, um, it’s, it’s less clear. And I, and so I think the way forward is to say like, okay, what are, you know, like the key core ingredients that can be isolated that then get mixed together, um, to support learning?
Paul 00:29:20 Is there a better term than model based? Then how do we destroy the dichotomy? How do we correct the <laugh> dichotomy in words?
Anne 00:29:28 I mean, I, I think I, yeah, I think words are important, and in particular, not, um, not equating things that are known to not be equated. So I think, for example, that, um, it’s dangerous to equate model-free to habits. It’s dangerous to equate model-based to goal-directed. Um, I, I don’t think that’s a one-to-one mapping in any, um, any way. And I think, you know, uh, prominent model-based, model-free people would agree with me. Um, um, uh, you know, I’m not, I’m not picking a fight I haven’t discussed with people here <laugh>. It’s, um, um, it’s, it’s more of the way we, we approximate, um, what we say in papers, really. Um, so I don’t know what the right, um, wording, um, is. But I think we need to, and, and, and I think the strength of computational models is that you can, um, you know, you can say exactly what you mean, but the problem is when you put, you know, like when you take the model-based RL equation, um, it doesn’t mean goal-directed, right? It just means like, here’s the, you know, way to compute forward <laugh>, right. The plan. Um, yeah.
Paul 00:30:41 Okay. So maybe, maybe we should talk about working memory in relation to reinforcement learning. I, I think of it, do I have it right? That the working memory and reinforcement learning that you study, that you research as interacting, you would, that would be model-free reinforcement learning, right?
Anne 00:31:01 Yeah. Um, so,
Paul 00:31:03 So far, anyway,
Anne 00:31:04 <laugh>, um, you could call it model free, or you could call it model based. Um,
Paul 00:31:09 Well, is the interaction between the model, the model based component? Or is it No,
Anne 00:31:14 Even, even individually. Uh, I think, uh, because, so here’s the thing, right? It’s like if the choice you’re making only, um, constrains what reward you get, not what the next stimulus is, or next state you are in, there’s no way to distinguish whether you’re doing model-based or model-free RL in the classic sense, right? Both models make the same, uh, prediction. So in that sense, that’s why I’m saying, you know, like, I, I can’t project it onto those two, because, you know, they’re making the same prediction there mm-hmm. <affirmative>, so they’re collapsed, um, really. Um, that said, obviously I do think that working memory has more resemblance to model-based, um, uh, than model-free in the sense that it is, um, something that we think of as effortful, more cognitive, uh, more, um, you know, more flexible, uh, in the same way that, uh, the model-based, uh, process is. Um, but, but yes, you know, the, the, the behavior we’re looking at would have been modeled by the simplest model possible as a, as a model-free RL, uh, normally.
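The point that the two accounts collapse when a choice affects only reward, not the next state, can be checked with a toy calculation. This sketch assumes the model-based learner estimates its reward model with the same delta rule and learning rate as the model-free learner; the function names and numbers are hypothetical.

```python
def model_free_q(rewards, alpha, q0=0.0):
    """Cached value updated directly from experienced rewards."""
    q = q0
    for r in rewards:
        q += alpha * (r - q)
    return q

def model_based_q(rewards, alpha, r0=0.0):
    """Value computed by planning through a learned model in a one-step task.

    Because the choice does not influence the next state, the transition
    model is trivial and there is no future value to plan over.
    """
    reward_model = r0
    for r in rewards:
        reward_model += alpha * (r - reward_model)
    transition_prob = 1.0  # the action always leads to the same (terminal) outcome state
    future_value = 0.0     # nothing downstream depends on the choice
    return transition_prob * (reward_model + future_value)

rewards = [1, 0, 1, 1, 0, 1]
q_mf = model_free_q(rewards, alpha=0.3)
q_mb = model_based_q(rewards, alpha=0.3)
```

Here `q_mf` and `q_mb` are identical for any reward sequence, so choice behavior in such a task cannot tell the two accounts apart. They only diverge once actions change which state comes next, as in multi-step tasks.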
Paul 00:32:24 Yeah. Okay. Okay. Let me see if I can just describe the sort of overarching, um, uh, conclusion or story, uh, that you have that you continue to work on, but that you have thus far, um, come up with regarding the interaction between working memory and reinforcement learning. So, uh, and then you can correct me and then, but before <laugh>, I have a question about my own cognition. So if you gimme something hard to do, like it taxes my working memory, right? Then when I make an error, um, there’s a ill, I’ll, it’s, it gives me a large prediction error. Um, and therefore I actually, my reinforcement learning system then is allowed to learn fast. Whereas if you give me an easy task that doesn’t require much working memory, then when I make an error, it, it leads to a smaller, uh, reward prediction error, and that actually makes my reinforcement learning system, um, learn more slowly. Right. Okay. So that’s my summary of it.
Anne 00:33:30 So, so, so, so that’s nearly right. Okay. Um, it’s nearly right, but it’s not because it doesn’t require much working memory. Um, it’s because, um, it’s easy for you to use your working memory on this task. So if it’s easy for you, you can use working memory, um, without much effort.
Paul 00:33:51 Oh, right. Okay. It’s, it’s, um, there’s a lower bar to actually using my working memory to perform the task,
Anne 00:33:57 Right? So it’s easy, um, it’s easy to hold, you can easily retrieve it. Um, and so you’re using working memory for it. Um, and, and that’s what’s, uh, creating the interaction.
Paul 00:34:10 Okay. Okay. So my question about my own, so
Anne 00:34:13 It’s a bit of a
Paul 00:34:13 Yeah, yeah, yeah. Sorry. It’s, it’s subtle. Yeah. Sorry. Thank you for the correction. So, so my, my question is, I’m a a, sadly, I’m a really slow learner <laugh>, but, um, I think I have high working memory capacity. Maybe it’s easier for me to use my working memory. And so, um, is that why I’m such a slow learner? Because I’m always just using my working memory. Do I need to back off and try to, uh, use my reinforcement learning more? What do I need to do? How do I learn better? <laugh>?
Anne 00:34:47 So I have two answers to this <laugh>. One is, um, if you believe that my theory, um, generalizes to the real world mm-hmm. <affirmative>, uh, then yes, that’s what you need to do. Um, you need to dual task yourself or something like that, um, and make it harder for you to learn, um, and, you know, prevent your more explicit working-memory-like processes from, uh, you know, rescuing you, which will, um, enable like kind of slower things like RL, um, to, uh, be less blocked. Um, so
Paul 00:35:23 You’re saying I don’t try hard on, and, you know, at things, I, I don’t, I don’t do things that challenge me enough, perhaps.
Anne 00:35:29 No, I’m not saying that. Um, uh, because you might be trying hard to use your working memory. Um, I’m, I’m, I’m saying, uh, you should distract yourself a little bit. And actually, you know, it’s, it’s interesting how actually well known this is, uh, from a, a heuristic point of view. Um, my, my voice teacher when I used to study singing, uh, would make me do all kinds of crazy things to distract me while I was singing, and it would always result in better, uh, <laugh>, um, better, uh, motor learning, um, for, for singing. Um, so, you know, it’s a trick that’s known, I think, to educators. Um, and, but I thought
Paul 00:36:07 We
Anne 00:36:08 We’re not supposed discovered the review thing. Yeah,
Paul 00:36:10 Go ahead. Sorry.
Anne 00:36:13 I, I think it’s related, for example, to the concept of, um, spaced repetition or stuff like that, right? Um,
Paul 00:36:20 But we’re not supposed to multitask, right? Because then that means we don’t perform well on anything that we’re multitasking on. But what you’re saying is maybe you weren’t singing well while you were being distracted, but, uh, that led to better singing later. Is that what I should take from that?
Anne 00:36:37 Um, I think it, it led to not retrieving the thing that I was trying to explicitly retrieve. Oh. Which allowed, um, the more implicit system to, um, both make its own choices and its own predictions. Um, and, and so and so learn from them. I have no idea, right? <laugh>, this is, this is true. I don’t, I don’t study singing, but that’s, that’s what the implication
Paul 00:37:04 Would be. Let’s, yeah, let’s turn this episode into a self-help episode. Those are popular, right? So let’s
Anne 00:37:08 Do that. <laugh>. Yeah. So that’s my second answer actually. Um, it’s that, um, um, you know, I, I think it’s important as cognitive scientists, um, to think about, uh, applications of our findings. Um, but I also find it very terrifying, um, especially in my, uh, domain where I study learning, um, um, especially because I study it in a, you know, because I try to deconstruct learning. I try to study it in a very well controlled kind of environment that are not readily, uh, generalizable to the real world. Um, and, um, and, uh, I very much worry that, um, that, uh, people will do exactly what you’re saying is like, take my theory, right? And assume it generalizes to the real world and apply it. And, um, and for all I know, it could be, you know, it could lead to the opposite effect essentially because, um, because of the st expansion again thing, right? Like, because like, I’m trying to deconstruct it and I’m going step by step, but it’s possible that orders three, four, and five, you know, might slip around
Paul 00:38:21 Cause it’s a complex
Anne 00:38:23 System I’ve seen so far. Yeah. Um, exactly.
Paul 00:38:26 What do you think then, I’m sorry, this is an aside because there are a lot of neuroscientists, uh, who do make life recommendations, right? And, and take from their work, uh, lessons for behavior. And I know that that’s the ultimate goal, but are you skeptical of, say, a scientist who tries to make that transition into, you know, a life coach advice person,
Anne 00:38:54 <laugh>, person, person, person? Um, I won’t make a broad call on this. Come on. I think it’s probably fairly dependent on, uh, the specifics of, um, of what you study. Yeah. Um, you know, essentially the question is how generalizable, how much evidence do you have that the findings you have in the lab are, uh, broadly applicable, um, to real life? And I think plenty of people ask this question, right? Like, plenty of people, you know, take lab experiments and try to relate them to real life outcomes in some ways. Um, or try to apply them to more, uh, naturalistic experiments, whether that’s, you know, like, um, I dunno, I saw a recent paper, um, I think by Ross Otto’s, um, um, uh, team, um, looking at choice of pizza, uh, online or something like that, you know, and try to see like, okay, those principles we’re developing in the lab, like, do they explain things also for real kinds of decisions?
Anne 00:39:53 And so if you have, you know, done this kind of homework, and are able to say like, well, the principles I derived are generalizable, then, then sure, maybe you have the expertise needed to, um, to do that transition. Um, I personally, in my specific domain, don’t feel like we are there. Um, and so I personally would be very worried about, um, doing that jump. Um, especially because I think, you know, the people, um, who ask for this kind of advice tend to be vulnerable. Um, you know, people who want to improve, um, uh, might have needed to improve in a sense. And so, I, I, you know, I I would not want them to be my guinea pigs on that. Um, yeah.
Paul 00:40:39 So, well, since you said you’re not there, A, do you have an interest in getting there, and B, how long do you project
Anne 00:40:48 <laugh>
Paul 00:40:48 Where you would feel comfortable?
Anne 00:40:52 Um, I have an interest in getting there. Um, I, I confess that I’m, um, my, my primary interest is really fundamental research. Um, but, um,
Paul 00:41:08 Uh, does that mean u understanding the phenomena?
Anne 00:41:12 Yeah. Yeah. Um, yeah, that’s, that’s, that’s, uh, uh, that’s my primary interest. But, but I do have a genuine secondary interest in, uh, making this something useful for society.
Paul 00:41:26 <laugh>. Yeah. Well,
Anne 00:41:26 You, you, I mean, not that I don’t think,
Paul 00:41:27 Yeah, you’ve studied this in the context of schizophrenia. I know. And, you know, so disorder, so you, you are applying it in that sense, um, and linking it to behavior, but, um,
Anne 00:41:37 Absolutely. Yeah. Yeah. Yeah. And I, and I, and I do think, you know, um, I mean, maybe something that I’m even more interested in than, um, you know, computational psychiatry, um, might be, um, education actually mm-hmm. <affirmative>. Um, because I think, you know, I think, uh, there’s lots of individual differences in how people learn, and I think, you know, if we understood things better, we might be able to help, um, more there. But I definitely think that’s somewhere where we have to be very, very, very careful because the impact on individuals, individual individuals is, um, yeah. You know, could be very big.
Paul 00:42:20 So we need to really understand these
Anne 00:42:22 Positive or negative directions.
Paul 00:42:23 Yeah. Okay. So you, so it could be initial conditions, uh, and, and applying some diagnostic or some procedure could, uh, lead to <laugh> to, um, you know, widely divergent results.
Anne 00:42:36 Exactly what you said, right? Like, if I go and take like a class of, you know, uh, third graders and tell them to, and dual task them, you know, for the whole day, because I think they’re going to learn better. And then I discover that actually the impact of like, you know, <laugh> Yeah.
Paul 00:42:52 The kids get really fucked up
Anne 00:42:54 Of, of being dual tasked the whole day is that they lose all their motivation and they stop learning and, you know, so maybe they’re learning that better, but the like, downstream impact is completely different from what I thought. It’s like, well, you know, like how ethical was this experiment? You know, like, it’s,
Paul 00:43:12 Well, you have your children, uh, do you, do you think about that with your own children? Do you, do you apply this behaviorally to yourself and or, and we’re gonna talk more about what this is because, you know, the relation between working memory and, and reinforcement learning, but has it altered your own behavior regarding the way that you go about trying to learn things and or, um, teaching things to your children?
Anne 00:43:36 Um, I mean, I definitely think about it. Um, it hasn’t really, I wouldn’t say it’s altered it per se, but it, it makes me consider things. Um, um, but I do a lot of Duolingo though, <laugh>. Oh, you do? And
Paul 00:43:54 I, what does that, you have to stay with that, and I you have to stay with Dualling.
Anne 00:43:59 Oh, Duolingo. Oh, sorry. Yeah. Oh, I thought that was well known. Um, Duolingo is an app for learning languages. Um, I’m in a very, um, multilingual family. Um, so, um, anyway, I’m using that to, um, to learn languages. Um, but I’ve definitely noticed the, you know, like the few times when the algorithm messes up and repeats the same sentence twice in a row, it feels very obvious that I just, you know, like repeat what I just memorized. And that does not help me <laugh> store that information.
Paul 00:44:29 Well, it’s long term that it’s in your working memory still, you mean?
Anne 00:44:33 Yeah, exactly. Yeah. Yeah.
Paul 00:44:35 Uh,
Anne 00:44:36 Okay. Which, which, which makes me likely to get a point right now, but makes me much less likely to do well, uh, you know, when they ask me again the next day.
Paul 00:44:44 Okay. Well, since, since, um, okay, so we just said working memory, or, I did. So, so I guess the con the conclusion thus far, um, and you can tell me any new conclusions that you’ve had recently or, or are on the verge of, is that working memory is actually contributing to reinforcement learning, um, in your brain by contributing to the re reward prediction error? Did I say that correctly?
Anne 00:45:11 Yeah. Yeah. Um, and specifically to the expectation, um, of the outcome. So the reward prediction error is the difference between the reward we get and the reward we expected to get. And what we think happens is that, um, if working memory allows you to say, oh, I know what to do right now, um, it can also allow you to say, oh, I know that I should expect a reward for doing this right now. Um, and so that instead of getting the expectation of reward from the reinforcement learning system, we get it at least partly from the working memory system before the reinforcement learning system has, uh, learned it. So that makes the reward prediction error smaller, and that slows down learning.
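The mechanism described here can be written down as a toy delta-rule update. This is a minimal sketch under stated assumptions, not the lab's published model: the function names, the mixing parameter `wm_weight`, and the learning rate `alpha` are hypothetical, and working memory is assumed to already hold a perfect expectation of reward (1.0) for a rewarded stimulus-action pair.

```python
def update(q, reward, wm_expectation, wm_weight, alpha=0.1):
    """One learning step in which working memory shares the expectation.

    The expected reward is a blend of the working memory expectation and
    the RL value q. The RL value is then updated from the resulting
    reward prediction error (RPE).
    """
    expected = wm_weight * wm_expectation + (1.0 - wm_weight) * q
    rpe = reward - expected
    return q + alpha * rpe, rpe


def simulate(wm_weight, n_trials=10):
    """Repeat a rewarded trial and return the final RL value."""
    q = 0.0
    for _ in range(n_trials):
        q, _rpe = update(q, reward=1.0, wm_expectation=1.0,
                         wm_weight=wm_weight)
    return q


q_rl_only = simulate(wm_weight=0.0)  # expectation comes from RL alone
q_with_wm = simulate(wm_weight=0.8)  # working memory supplies most of it
print(q_rl_only, q_with_wm)
```

With the same number of rewarded trials, the RL value grows more slowly when working memory supplies most of the expectation, because each prediction error is smaller.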
Paul 00:45:58 So then classically working memory is thought of as the sort of online short term storage and manipulation of information. Yeah.
Anne 00:46:08 Yeah.
Paul 00:46:09 Would it be proper then to say working memory is also a learning, uh, mechanism?
Anne 00:46:16 Yeah. Uh, I think, I think it is. Yeah. Um, you know, I, I I think the, these words are complicated, right?
Paul 00:46:26 <laugh> no words are easy. Their, their meaning is complicated, right? <laugh>,
Anne 00:46:30 That’s true. Yeah. Yeah. Um, <affirmative>, it’s so, so no. Okay. You can’t call it a learning system in the sense that it’s not, um, it’s not going to be long term, but it is a learning system in the sense that if you are in a dynamical environment where you’re obtaining information and making decisions as a function of this information, um, working memory, uh, contributes to, um, managing that information and helping you make decisions, uh, better and better. Um, so, you know, you, you pick what you call learning. Okay? Right? Yeah. Um, but I, to me, uh, it’s a very clear player in the learning, uh, environment.
Paul 00:47:19 So I was trying to think of an example. Um, and let’s say you’re trying different, I think you said pizzas earlier or restaurants, or let’s say you’re trying different pizzas, right? You have a, a, you’re in a taste testing competition, and you have five different types of pizza, uh, and your job is to learn, or maybe we could say, decide or learn, you know, which, which is your favorite, right? So then you have to taste one, and you can eat your saltine cracker, but then you have to think, you know, you can, you go, okay, that one’s pretty, oh, is pretty good. And then on down the line with five, did you then learn what you prefer <laugh>, because that would be keeping it in your working memory, right? And that would probably be called learning. And I apologize if I’m taking us down a meaningless road here,
Anne 00:48:04 <laugh>. Um, I’ll, I’ll follow your example. I think it’s probably the timeline is probably a bit beyond what working memory, um, would be doing, but that’s okay. Um, I don’t think in the end what you have learned would be in working memory, but I think working memory would have contributed to the process of learning.
Paul 00:48:24 But then what you have learned is called memory and not, and not learning, right? So the learning is what I’m asking about because you, because there is a difference. Learning and memory are always associated, but the learning is the storing is, is the process by which one stores <laugh>, right?
Anne 00:48:42 Sure. So then it, it is part of the process. Yeah. Um, I, I, I don’t know if everybody would agree with you that learning is just the process, not the outcome. Uh, but, uh, I agree in that case, if you see learning as the process, then certainly, I think, um, I think that, um, um, uh, that, uh, working memory is an important part of the learning process.
Paul 00:49:04 Okay. So we, you’ve established that working memory, memory contributes to reinforcement learning. Um, and, you know, just going back to the idea of, you know, the, the anti modular brain, all the brain areas are interacting. Um, all the cognitive functions seem to be interacting. So, uh, does reinforcement learning itself also affect working memory? Is it, do they go back and forth?
Anne 00:49:30 Um, yeah. Yeah, absolutely. Um, I actually also wrote a review recently with, uh, a postdoc in my lab, Aspen Yoo, uh, where we tried to show that it’s, you know, bidirectional. Um, uh, and I think that’s actually better known. I think there’s been for a long time, uh, a bunch of, uh, you know, of, of computational models that show that. So, so it’s this idea that, you know, um, reinforcement learning helps you learn. Um, we, we tend to think of it as, you know, learning concrete actions or, you know, concrete choices like pick between A and B or, you know, press this key or that key, or, you know, like turn right or turn left, or something like that. Um, but, but there’s been for quite a while now, this idea that you can apply reinforcement learning to more, uh, abstract, um, uh, inner actions like deciding to store something in working memory or deciding to, um, get something out of working memory. And there’s, there’s a bunch of models including Michael Frank’s and, uh, Randy O’Reilly’s and, um, and others, uh, you know, uh, making good cases for this and for why it would be useful and, and for why reinforcement learning processes might help you, you know, learn, uh, how to use working memory in that sense.
Paul 00:50:56 All right. So we’re, we’re gonna throw another one in the mix here. Um, I just had Carolyn Jennings on the podcast to talk about her philosophical account of attention. Um, and so she has this, this view that attention is at the sort of whole organism level, uh, based on the, it’s the pri prioritization of, um, just the mental prioritization by a subject or a self. And so, of course, like, well, maybe not working memory, but attention, you can talk about it at the whole organism level. You can talk about it as, you know, the thousands of different cuts you take, visual, spatial feature, endogenous, you know, top down, bottom up. Um, so the, the question is, you know, what’s the difference between attention and working memory? Because it seems, you know, from like, um, non-human primate studies, everywhere you have a readout for attention in a neuron, there’s a readout for working memory as well. And it seems like they are inextricably intertwined. So I wanna throw attention in the mix and do, how do you distinguish between attention and working memory? And then what do you think about the effect of attention on reinforcement learning and in its interaction with working memory? And we’ll stop at three because it gets unwieldy.
Anne 00:52:16 Yeah. <laugh>, well, you know, you could, you could go beyond for sure, but, uh, yeah. Um, yeah. So I think attention is essential. Um, and I think
Paul 00:52:28 Is a, is attention a thing? Yeah.
Anne 00:52:29 I mean,
Paul 00:52:30 Working memory is a thing. Is attention a thing?
Anne 00:52:34 Yeah. No, I, I think, I think they’re separable things, although I do think that, um, you know, there’s a big overlap between working memory and attention. Um, and, and I think, you know, attention is also a term that is not necessarily defined the same by different people, um, same as working memory. So I want to be a little bit careful there. I’m also less of an expert in attention than, than working memory. So I’m <laugh>, I’m, I’m treading a little bit lightly here, um, but I think I want to, um, I want, I want to show two ways in which I think, um, attention or working memory can play a role in reinforcement learning that are separable. And then you tell me if you think that, okay, that matches two, two different things or one, okay. So I think one way we’ve talked about is working memory, um, is just holding information in mind.
Anne 00:53:36 Um, right. So in my example, it’s when you learn you, like you, you know, like you, you, you, um, um, for example, see a stimulus, make a choice, get a reward, and you just hold that in mind. So like, oh, I got one point when I, uh, pressed left, uh, when I saw, uh, a red triangle. And so that’s good. So I’m going to, uh, hold in mind to press left for the red triangle. Um, so that’s, um, that’s one function that working memory can have is like active hold in mind of something that you’ve constructed internally, um, this policy, um, that, that you can then reuse and that is useful for learning. A separate role, um, is, um, filtering. So if you think about it, um, reinforcement learning processes, um, they assume, um, a state space and an action space, right? Um, and, um, state spaces in the real world are, um, you know, approximately, infinitely high dimensional, right?
Anne 00:54:49 Like it’s, you know, the number of pixels on your retina now, you know, like it’s very high, photons coming in at any given time. Yeah. Um, but at any given time, only a very little bit of this, uh, matters, right? And, um, if you have this super high dimensional input, then, um, there’s no way your reinforcement learning system in the brain can learn anything about it, right? Um, and so it has to be fed some kind of state, um, and action that’s much lower dimensional, um, to be able to learn at any reasonable, um, rate. And so I think a critical role of attention in interaction with reinforcement learning is to provide that, uh, state space and action space, uh, over which the computations are going to, um, be applied. And so to me, that’s separable from the other role, um, that I mentioned, uh, previously, and it’s, you know, as important if not more. Um, but it seems to me like you can see here fairly different functions, um, uh, happening, even though I think working memory does need attention, and maybe this attention, uh, component also might somewhat rely on working memory to hold in mind what’s relevant. Um, so I, I, as usual, I don’t think they’re fully dissociable, um, but I still think that you might be able to isolate some functions.
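The filtering role described above can be sketched in a few lines of Python. Everything here is an illustrative assumption: the three stimulus dimensions, the rule that only color determines the rewarded action, the `attend` helper, and the learning rate. To keep the sketch deterministic, the learner is simply shown the rewarded action for each stimulus once, and the only thing that varies is which dimensions attention passes on as the state.

```python
from itertools import product

COLORS = ["red", "green", "blue"]
SHAPES = ["circle", "square", "triangle"]
TEXTURES = ["dots", "stripes", "plain"]
# Hypothetical task rule: only color determines the rewarded action.
CORRECT = {"red": 0, "green": 1, "blue": 2}


def attend(stimulus, dims):
    """Attention as a filter: keep only the attended dimensions."""
    return tuple(stimulus[d] for d in dims)


def train(dims, alpha=0.5):
    """One pass over all 27 stimuli with a tabular delta-rule learner."""
    q = {}
    for stimulus in product(COLORS, SHAPES, TEXTURES):
        state = attend(stimulus, dims)
        action = CORRECT[stimulus[0]]
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (1.0 - old)  # reward is 1
    return q


q_full = train(dims=(0, 1, 2))  # unfiltered: 27 distinct states
q_color = train(dims=(0,))      # attention to color only: 3 states
print(len(q_full), max(q_full.values()))
print(len(q_color), min(q_color.values()))
```

With the same 27 trials, the unfiltered learner has seen each state once, while the filtered learner has seen each of its three states nine times, so its values are much closer to the true reward.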
Paul 00:56:18 Uh, I fear I’m, I’m belaboring uh, a point here, but you know, so, so there’s a filtering. What you just said is there’s a filtering mechanism, um, that allows this infinite dimensional space, uh, to be filtered down to a lower dimensional space. Um, and in that sense, attention might be necessary for, would you say attention is necessary for working memory? Because working memory is by definition in a low dimensional space, um, seven plus or minus three or whatever it is. And then depending on your, the system of working memory you’re dealing with, you can only hold a few things in mind, right?
Anne 00:56:52 Yeah. Yeah. Uh, yeah, I think it’s necessary in, in, in my specific domain. You know, I think there’s different types of working memory, and actually the seven plus three, in my case, it’s more like three or four, um,
Paul 00:57:06 Mine’s like 12, but, um, no, I’m just kidding,
Anne 00:57:09 <laugh>. Um, but, um, um, but yes, I think it’s, uh, it’s, um, in, in this domain at least, I think attention is necessary. But I do know that there’s other domains of people who study short, short term visual working memory, for example, um, where they show, um, uh, effects that seem less dependent on attention. So again, not, not my, not my, not my specialty. So I, I don’t want to comment too much on that. Yeah.
Paul 00:57:41 Do you have, uh, plans to, you know, study attention and how it interacts with working memory and with reinforcement learning? I mean, I was referring to that, I think it was a current opinion piece, um, where you talk about the potential role of attention, um, because attention and working memory are both under the umbrella term executive function, and often you use the term executive function and its interaction with reinforcement learning, so mm-hmm. <affirmative>, is there interest in applying these different executive functions also, or is there just so much to do with working memory itself that it’s gonna keep you occupied, uh, until you’re ready to tell the world how to behave?
Anne 00:58:18 <laugh> <laugh>? Um, yeah, no, I’m, I’m looking into it. Um, although I think there’s been very nice work done already on attention by, you know, by Yael Niv’s lab, for example, by, um, um, Shiva Farashahi and Alireza Soltani and by others. Um, but, um, uh, the way I’m approaching it is more, um, less directly from this attention construct and more from the, um, this question of, okay, what are the inputs to this reinforcement learning function? Um, we’ve, as modelers we’ve taken for granted what those states and actions and rewards are. Um, but can I try to dig a little bit deeper into what are, you know, like what behavior tells us about, um, how, um, what states and actions, uh, we actually feed into those computations, um, internally. Um, and I think that will inform us on how executive functions and attention, um, play a role in reinforcement learning, but not by going directly, uh, through the classic attention route, if that makes any sense. Mm-hmm. <affirmative>,
Paul 00:59:39 So if I, um, if I model some behavior, you and or neural activity using model free reinforcement learning, or let’s not say model based, but the other, what some other kind of, uh, reinforcement learning algorithm and computation and it approximates, but doesn’t, you know, perfectly describe the behavior because it’s science. Do I think, um, do I think well that’s because the, the computation isn’t really what’s happening? Or do I think that it’s due to, um, neural variability and noise? Or do I think that there’s always this ongoing interaction between the different cognitive functions, like working memory and attention, and they’re having an effect that I’m just not pulling out because my task isn’t right, or I’m not asking the right question, et cetera? I mean, cuz all of these things are, it’s not like working memory turns on and then does its thing to reinforcement learning and then goes back to its cave or something. All these systems are always constantly interacting. So, um, I don’t know. That’s, maybe that’s too much of an open ended question.
Anne 01:00:46 No, I think I see what you mean is like, I mean, I’ll try to give a case example and you tell me if it’s representative, right? Okay. So, uh, in my, the key way in which I can reveal working memory, um, is by having people learn about a different number of things in parallel. Uh, sometimes two, sometimes three, sometimes six. Right? Um, and, and that’s what allows me to say, okay, there’s working memory and reinforcement learning because people learn at different speeds in different, uh, numbers of items. And, uh, reinforcement learning just can’t do that. Okay. Um, so now take an experiment in which you only have one number of things that you’re learning in parallel, or for example, all the time you’re learning about four things. I still know that working memory is playing a role, but I also know that I can’t use my model, um, to identify working memory in there because it’s not going to be identifiable Yeah.
Anne 01:01:46 Um, in this experiment, because I don’t have that leverage. Okay? So one answer is, well, you know, the experiment is wrong. I’m not going to be able to pull out working memory, but that’s a bit frustrating because I might have done this experiment so that I can manipulate other things and, you know, um, and answer other questions. Um, so the other thing I can do is, um, you know, still use a reinforcement learning model, for example, <laugh>, um, but be aware of its limitations, right? And say like, okay, I know in this case that this reinforcement learning model is not, um, only accounting for a reinforcement learning process, but also for contributions of a working memory process,
Paul 01:02:27 Even though you can’t tease them out,
Anne 01:02:29 Even though I can’t tease them out. And, you know, just knowing that is already going to limit, you know, the errors I make in interpreting my findings. If I find an effect of, you know, symptom X on the learning rate, I’m not going to attribute it directly to the brain’s reinforcement learning process. I’ll say, well, maybe it’s the brain’s reinforcement learning process, but it could also be, uh, from the brain’s working memory process, right? So I think the, the, the, uh, uh, you know, what’s important is knowing what you can and can’t conclude, um, from your models. And obviously you’re going to have different goals in different experiments. So sometimes it’s important that you capture as much variance in the behavior as you can, and sometimes it’s not. Um, um, and, um, what’s important is transparency, I think, around that.
Paul 01:03:25 Um, yeah. But we have to interpret in our conclusions is, which is sometimes perhaps the least, um, enjoyable part. Uh, perhaps, I don’t know. That’s a question, is that, is that in some sense it’s the most fun part, right? Because you really wanna understand, but then maybe your, that’s where your least one, maybe least confident is in, or should be least confident is in interpreting the results.
Anne 01:03:51 Yeah, I mean, it’s the riskiest part in a sense. It’s, uh, it’s, um, it’s where you like, you know, put yourself on the spot to a degree. And, and, and it’s the process of science. I think it’s where you can generate predictions in a sense, right? Saying like, well, you know, um, this is risky. This is how, um, I’m interpreting it. Um, and either I’m convinced that my results fully support this conclusion, um, in which case I need to make that case <laugh>, um, or I’m not, in which case I need to, uh, state, um, how I would strengthen, um, this, you know, say like, well, if that’s the case, then I should also see this and that, and it could be tested in this way, in that way. Um, you know, I, I’ll make a shout out to, um, this nice, um, uh, paper by, uh, Palminteri, um, Wyart and Koechlin, um, you know, that made the case for, uh, falsifying, um, models. So not just, you know, fitting models and saying that quantitatively they fit better than another model, but also saying that they make qualitatively different predictions than other models. Um, and that those qualitative predictions are, um, uh, borne out. Um, and, and that’s very task dependent. Like you said, it’s like you, you need to design the tasks that will answer, allow you, you to, um, to get the models, uh, to give those answers.
Paul 01:05:20 Do you think that it’s a better career move to just make wild claims, um, and strong interpretations? Um, so that, I, I’ve said this before on the podcast that when I was interviewing for a postdoctoral position, this, the faculty member that I was interviewing with said when he was a postdoc, his advisor told him to just say as much crazy shit as possible. And eventually, either something will be true or either way, it’ll get a lot of attention.
Anne 01:05:51 Um, <laugh>, um, I don’t know what to say. I mean, I’m not very good at this. I think I’m not a very good marketer. Um, right. That’s, um, and unfortunately, I think you’re right. You know, I think, um, I think, um, uh, uh, overstated claims get attention and, and in particular get editors’ attention Yeah. And, and get past desk review. Yeah. Um, you know, get, get past desk rejection and, um, yeah. And so, I mean, I, there’s research on this, right? <laugh>, it’s not just my impression, right? Um, it, it, it, it pays, um, for individuals’ careers. Um, I don’t think it pays for science, though. I mean, I feel like I’ve seen, I’ve seen lots of people follow, uh, through research directions that, you know, like I told them I thought were wrong. Um, you know, they spent two years in it, and then, you know, um, so like, oh, I guess I identify that model in this experiment, like, sorry, <laugh>. Um, so I, I, I would not recommend doing that, but, um, but unfortunately, you know, we live in an ecosystem, right? Um,
Paul 01:07:14 Right. You’re never supposed to admit when you’re wrong in our ecosystem.
Anne 01:07:19 Uh,
Paul 01:07:20 You get elected president of the United States when you, when you never admit that you’re wrong. So I don’t know, there’s, you know, all right. We’ll move on. Go ahead. No, go ahead. If you, if you wanna comment, go ahead, <laugh>.
Anne 01:07:34 No, I mean, it’s, it’s, it’s very unfortunate. I think that, um, that there are very strong incentives to, you know, like, to make, uh, too big claims, um, uh, to, yeah, um, in the system. I think it’s bad for science. Um, so no, I wouldn’t encourage anyone to do that. But at the same time, you know, I recognize the strength of the incentives,
Paul 01:08:00 Uh, speaking of saying crazy shit, um, you wanna talk about artificial intelligence for a few minutes, <laugh>,
Paul 01:08:07 If you want. How’s that for a segue? Um, so one of the, you know, one of the reasons why I wanted to talk about, uh, your work about, you know, working memory interacting with reinforcement learning, and then the broader picture of cognitive functions interacting and how it’s a complex mess. Um, you know, I’m wondering if there are lessons that, um, I’m not sure how much, how much you’re into the artificial, the deep learning world and artificial intelligence, um, explosion. I’m wondering what lessons we can draw from work like yours that is, you know, peering into the interactions between these. Because, you know, I know that there’s a lot of, um, attention being added to artificial networks, and the attention in transformers, we don’t, we can skip over, but there are other forms of attention that are being added to deep learning networks that are improving them in certain ways, making them more biologically realistic.
Paul 01:09:01 And, you know, not that AI cares about that, the AI world, but, you know, do you have, uh, advice or, or, or just in the broader scope of things, are there lessons to be drawn from this? So, one more, I’ll add one more thing. You know, the idea of like, you have your equation, right? Your reinforcement learning, working memory equation. Um, you know, is it right to just cleanly add that equation into an AI system and then say, well, now we’ve put working memory into it. Um, or like, do we need to understand all these interacting cognitive functions and their quote unquote mechanisms to implement them? I mean, how far do you think AI can get by just putting a clean computation in and an algorithm? Eventually they’re gonna have to interact, right? To get real intelligence. Do we need all those, all the components? What are the components? What, what, what is there to glean from this kind of work?
Anne 01:09:57 Um, thanks for the question. Yeah. So I, I do follow AI, although I don’t think anyone can follow it at the pace it goes.
Paul 01:10:03 <laugh> Or neuroscience, or neuro, or just the neuroscience of attention or working memory or any of them. Yeah,
Anne 01:10:09 Yeah, yeah. Yeah. But AI is pretty particularly, uh, crazy these days. Yeah. Um, but yeah, no, I do try to keep, um, uh, keep an eye on it. Um, so, um, I, I do think, um, crosstalk, uh, in the di in both directions, but in the direction of cognitive neuroscience towards ai, um, could be very beneficial to ai. And I don’t think it necessitates us to have figured out all the interactions and, uh, um, and stuff like that. I think we’re getting lots of, um, or, or precise, you know, details. I think we’re getting lots of insights as to things that matter, even if we don’t exactly, exactly how they matter, um, that can be used, um, that, that, that, that should serve as inspiration, if not exact models, um, for ai, um, um, agents. Um, I, I think working memory is an example. Um, although in that I’d love to see more tested, you know, I, I think in, I think, um, something that’s curious about working memory is how limited it is really. Right? Right. Like, it’s, it’s very, uh, it’s very stupidly limited, right? Like three or four items, really. It’s, um, it’s, uh, you know, like if you’re, if you’re an AI person, you’re like, why would I bother, uh, <laugh> considering such a,
Paul 01:11:38 Yeah, I mean, I’ve even seen, you know, in like the Neural Turing Machines where you know that you have a long term memory storage, it’s even sometimes called working memory because you could just retrieve it after a few steps, right? So, so in that sense, it’s working memory, but in the, you know, we would consider it more like long term memory or something.
Anne 01:11:55 Yeah. No, it’s long term memory. It’s, I mean, I, no system that has a high capacity, um, can be called working memory, I think. I think, and that’s the thing, right? It’s like we have this system that has a very low capacity, and I think AI sees that as a bug. Um, and I think it’s actually most likely a feature <laugh>, uh, although I don’t have proof, uh, of that, but I think it’s a feature in the sense of attention, essentially, of forcing, um, a bottleneck on processing, uh, that enables us to, uh, create a small state space over which being fast and flexible is much easier than over a big, um, and complex, um, state space. But, and so I think. Go ahead.
Paul 01:12:42 I was gonna say, but if I had unlimited or much less limited resources and computational speed, would I still need that? Because it, it does seem to be dependent on limited kind of resources.
Anne 01:12:56 Well, I think, you know, I think, um, I, I think if you are, you know, um, uh, someone trying to do automatic, um, like, uh, self-driving cars or something like that, right? Mm-hmm. <affirmative>, um, you care about how much resources you spend and how quickly you are able to adapt in a different, uh, environment, right? Um, um, even if you have much more, uh, than humans, uh, right, you have limited resources and a very limited amount of time, um, to, uh, react to a new situation. So, so I think, you know, even if you don’t take the exact three, four, um, number, uh, seriously, I think you, you should still take the, you know, more general idea of like maybe a, a bottleneck is a good principle, um, uh, seriously,
Paul 01:13:47 Just not our,
Anne 01:13:48 I think there’s also example,
Paul 01:13:49 Sorry, just not our dumb limited bottleneck, perhaps maybe a larger, uh, wider capacity bottleneck depending on how fast you need to make decisions, right?
Anne 01:13:59 Yeah. Maybe. I think, I think it’s an open question. You know, I think, I think, um, I think it’s something to be explored really. Um, and I think there’s other examples. You know, like there’s another side of my work that’s about structure learning, um, that, that shows very, um, that, that humans have those very strong biases to, to create complex, you know, uh, structure where they don’t necessarily need to do so. Um, and you know, that’s kind of an anti, uh, uh, anti-Occam’s razor kind of thing where you’re like actually complexifying things instead of simplifying them. And, um, and that seems like very counterintuitive in a sense. But, um, but then what we see is that by creating this kind of, um, uh, bigger structures, you end up being more flexible later on, um, for generalizing. And so, and, and I’ve seen actually recently a few AI papers like having this kind of ideas, um, in there.
Anne 01:15:00 So, so I do think, you know, I, I, I do think there are ideas, uh, that come out of this research that could usefully be exploited, not as is. Um, but, um, uh, as, you know, broad ideas that then can be translated into, uh, their own use form, uh, for AI. Hmm. I, I think there’s a more pragmatic answer to your question too, <laugh>, uh, which is that, um, uh, honestly, my impression of AI papers is that they’re terrible at analyzing behavior <laugh>, uh, mm-hmm. <affirmative>, I, I don’t know if you’ve seen that, but, um, uh, the performance of AI agents is often just like, how many points do they get? Sure. Something like that, right? And I think, and I think we, we could learn so much more about, um, about the representations and the, and the, and the computations that those agents, um, do with more careful, uh, analysis. And I think that’s something that cognitive scientists do very well, <laugh>, and could maybe teach, um, uh, a little bit, um, to more AI people.
Paul 01:16:08 How about the idea of, okay, so we’ve talked about the interaction between working memory and reinforcement learning, but, uh, in, in the brain you kind of think of these systems as mostly separable, right? Um, and so you could imagine implementing some working memory network, um, and then kind of having it connect with a reinforcement learning agent or network, right? In a robot or, or just a large artificial network. But then you take something like attention, a cognitive function, um, and we know that, you know, like I was mentioning earlier, attention and working memory are highly intertwined. Um, and just cognitively, you know, they’re highly intertwined, but also when you look in the brain, you see, you know, lots of neurons that have attention modulated activity and working memory modulated activity. So I think in AI or in deep learning, um, the idea would be to just, you know, add an attention network and have it connect to a working memory network, whereas in the brain, these things would be, you know, overlapping a lot, right? And, and widely distributed. Um, so I mean, would you imagine that in an AI system that, you know, you would need to build it to generate true intelligence or something more human-like that you would need to build it in the same way as it’s, um, as it develops in our brains, um, at the implementation level? Would you need to implement it in, in that same kind of overlapping way?
Anne 01:17:39 Um, I’m not sure. <laugh>, um, I’m not sure. I mean, when you think about transformers, um, which is, is the way I think of, um, you know, having most successfully implemented attention in, uh, in recent deep learning stuff,
Paul 01:17:56 Attention.
Anne 01:17:58 Yeah, of course, <laugh>, um, it’s, um, it, it is, it is embedded, right? In that sense, it is like, uh, it is, yeah. Uh, present throughout the networks, right? Um, so, um,
Paul 01:18:13 That’s true. Yeah, that’s true. I, I should have, I should have even thought of that before I started spouting off about, yeah, adding it as a module.
Anne 01:18:19 Um, but, but, but I think most, most of the time, um, the approach is more like you said before, is to create a separate module and make it, you know, like, you know, only regroup it, um, some more towards, you know, the end of the, of the, of the, of the process or something. Um, yeah, I, I, I really don’t know. <laugh>, it seems to me like the, the more, the more successful approach. Um, I don’t know. I guess I can think of examples of both, right? Like, like you said, the Neural Turing Machine has this kind of like modular, um, um, uh, structure that was very successful for, for certain applications. The transformer has that much more integrated, um, structure that’s, uh, successful for other approaches. Maybe that’s where we can learn more, you know, like if we go back to our dimensions at the beginning, right? Like if, you know, maybe the fact that it’s omnipresent tells us it’s less of a pure dimension <laugh> than we think, um, it is. Um, uh, you know, and maybe that’s where we need to do more work. I don’t know.
Paul 01:19:29 Is intelligence itself a super high dimensional space? Or, or, ’cause you know, we measure it with one number often.
Anne 01:19:37 Yeah. Well, I mean, we all know how controversial that is, right? <laugh>?
Paul 01:19:43 Yeah.
Anne 01:19:43 Yeah. Um, yes. I think, I mean, I, I have no expertise in this either, but I, I don’t doubt that intelligence is super high dimensional. Yeah.
Paul 01:19:54 You are a, you’re a classical singer. What does that mean? Yeah. Like what does that mean? Operatic
Anne 01:20:01 Opera? Yeah, classical, you know, mo stuff like that.
Paul 01:20:06 <laugh>. Oh, okay. You play an, you play probably multiple instruments. You’re multilingual. You probably play multiple instruments. You, you probably dual task on the instruments so that you can learn them better, right? <laugh>?
Anne 01:20:18 I play the cello.
Paul 01:20:19 Oh, of course you do. Okay. Well, I don’t know if you’ve ever been in a band. I don’t, uh, particularly like it when scientists name their band something like sciencey, and then especially if they sing about their science, that’s the worst, you know? But, uh, <laugh>, when I, even before I was in, uh, neuroscience before I got into grad school, I wanted to name my band Working Memory. You think that’s an okay band name? And I, I know it’s better than reinforcement learning
Anne 01:20:45 <laugh>. Um, it doesn’t ring very nice to me, but, you know, too sciencey.
Paul 01:20:51 Yeah,
Anne 01:20:52 Well, too clunky, right?
Paul 01:20:54 Working memory, I thought. I think it has a nice ring.
Anne 01:20:56 Yeah. Okay. Okay. Well, I’m not a native speaker, so, you know.
Paul 01:21:01 Oh yeah. I don’t know what the French would be. Okay, well, thank you so much, Anne, and, um, carry on the work into your ripe old age and stop having children, you know, it’s gonna interfere with your, your career even more so <laugh>. So thanks for being with me.
Anne 01:21:14 Oh, they’re lovely. And they’ll have their own career. <laugh> <laugh>,
Paul 01:21:33 I alone produce Brain Inspired. If you value this podcast, consider supporting it through Patreon to access full versions of all the episodes and to join our Discord community. Or if you wanna learn more about the intersection of neuroscience and AI, consider signing up for my online course, Neuro-AI: The Quest to Explain Intelligence. Go to braininspired.co to learn more. To get in touch with me, email paul@braininspired.co. You’re hearing music by The New Year. Find email@example.com. Thank you. Thank you for your support. See you next time.