BI 165 Jeffrey Bowers: Psychology Gets No Respect

Brain Inspired

Jeffrey Bowers is a psychologist and professor at the University of Bristol. As you know, many of my previous guests are in the business of comparing brain activity to the activity of units in artificial neural network models, when humans or animals and the models are performing the same tasks. And a big story that has emerged over the past decade or so is that there’s a remarkable similarity between the activities and representations in brains and models. This was originally found in object categorization tasks, where the goal is to name the object shown in a given image, and where researchers have compared the activity in the models good at doing that to the activity in the parts of our brains good at doing that. It’s been found in various other tasks using various other models and analyses, many of which we’ve discussed on previous episodes, and more recently a similar story has emerged regarding a similarity between language-related activity in our brains and the activity in large language models. Namely, the ability of our brains to predict an upcoming word can be correlated with the model’s ability to predict an upcoming word. So the word is that these deep learning type models are the best models of how our brains and cognition work.

However, this is where Jeff Bowers comes in and raises the psychology flag, so to speak. His message is that these predictive approaches to comparing artificial and biological cognition aren’t enough, and can mask important differences between them. And what we need to do is start performing more hypothesis-driven tests like those performed in psychology, for example, to ask whether the models are indeed solving tasks like our brains and minds do. Jeff and his group, among others, have been doing just that and are discovering differences in models and minds that may be important if we want to use models to understand minds. We discuss some of his work and thoughts in this regard, and a lot more.

0:00 – Intro
3:52 – Testing neural networks
5:35 – Neuro-AI needs psychology
23:36 – Experiments in AI and neuroscience
23:51 – Why build networks like our minds?
44:55 – Vision problem spaces, solution spaces, training data
55:45 – Do we implement algorithms?
1:01:33 – Relational and combinatorial cognition
1:06:17 – Comparing representations in different networks
1:12:31 – Large language models
1:21:10 – Teaching LLMs nonsense languages

Transcript

Jeff 00:00:03 You know, I’ve been doing a lot of work. My lab is working with neural networks. I’m a fan of neural networks in a lot of ways. Uh, I do disagree about how people are testing some of these neural networks and the claims that they’re making about these neural networks at the moment. And I was just struck by, there was almost no reference to psychology. You know, that that’s where the mind comes in. It’s like, how can one make a claim that this is the best model of human vision and ignore a hundred years of research in human vision carried out by psychologists? Psychology just gets no respect. I mean, we need more, uh, you know, people need to pay attention, cuz psychology has the, you know, beautiful data and relevant data for so many things.

Paul 00:00:55 This is Psychology Inspired. Just kidding. This is Brain Inspired. I’m Paul, but this is a psychology-inspired episode, uh, because my guest today is Jeffrey Bowers, a psychologist and professor at the University of Bristol. As you know, many of my previous guests are in the business of comparing brain activity to the activity of units in artificial neural network models, um, when humans or animals and the models are performing the same tasks. And a big story that has emerged over the past decade or so, um, is that there’s a remarkable similarity between the activities and representations in brains and in models. This was originally found in object categorization tasks, um, where the goal is to name the object shown in a given image, and where researchers have compared the activity in the models that are good at doing object categorization to the activity in the parts of our brains good at doing object categorization.

Paul 00:01:57 It’s been found in various other tasks using various other models and analyses, many of which we’ve discussed, um, on previous episodes. And more recently, a similar story has emerged regarding a similarity between language-related activity in our brains and the activity in large language models. Namely, the ability of our brains to predict an upcoming word can be correlated with the model’s ability to predict an upcoming word. So a common claim is that these deep learning type models are the best models of how our brains and cognition work. However, this is where Jeff Bowers comes in and raises the psychology flag, so to speak. His message is that these predictive approaches to comparing artificial and biological cognition aren’t enough. And that approach can mask important differences between artificial and biological networks. And what we need to do instead is start performing more hypothesis-driven tests like those performed in psychology, for example, um, to ask whether the models are indeed solving tasks like our brains and minds do.

Paul 00:03:03 So Jeff and his group, among others, have been doing just that and are discovering differences in models and minds, um, that may be important if we want to use models to understand minds. So we discuss some of his work and thoughts in this regard, and a lot more, of course. Show notes for this episode are at braininspired.co/podcast/165. On the website, braininspired.co, you can also learn how to get more Brain Inspired stuff by supporting it on Patreon, or you can sign up to get a short video series that I made about some of the open questions and challenges in understanding our brains, our minds and artificial intelligence. All right, thank you for being here, and thank you for your support. Here’s Jeff. Why do you hate brains and artificial neural networks so much?

Jeff 00:03:57 <laugh>. Yeah. A lot of people think that about me and then, you know, it’s, it’s really not true. I, uh, you know, I’ve been doing a lot of work. My lab is working with neural networks. I’m a fan of neural networks in a lot of ways. Uh, I do disagree about how people are testing some of these neural networks and the claims that they’re making about these neural networks at the moment. So it’s, you know, and I do think, you know, ultimately I think, you know, what I would like to do is improve neural networks to make them better models of the visual system. Or I, I’ve focused quite a lot on vision, but I’m also interested in other domains, language and so forth. So yeah, the goal isn’t to somehow reject them. Uh, the goal is to improve them. The goal is to have a better assessment of how similar they are to the human visual system or the other memory systems.

Jeff 00:04:53 Or language systems, because you really can’t improve a system until you have a better understanding of how good the correspondence is. So what I would also say is, you know, I do think that, in the case of vision, these so-called image-computable models, these models that actually take photographic images and classify them accurately in impressive ways, it is certainly, no doubt, an amazing engineering feat. So there’s no question about the engineering, uh, feat about this <laugh>, but I also think they’re not the only game in town. They’re not the only way to gain insights. And so I’m a fan of deep neural networks as a promising approach. I think it needs to be approached differently, but I’m certainly in my own research using them. Uh, but I also think models in psychology, uh, you know, maybe models in neuroscience, that are not image computable may nevertheless be insightful about how the brain works. And at the end of the day, a model really, I think, is designed to increase our understanding of the human brain. Uh, and that’s the ultimate criterion to judge a model by,

Paul 00:06:05 Well, you just mentioned the human brain, and I wanna come back to that. Um, but when did you get interested in helping improve these models? Was it, you know, with the advent of, you know, um, AlexNet or something, 2012? Cuz I know you have kind of a long history of, um, arguing for symbolic cognition mm-hmm. <affirmative>, um, over parallel distributed processing mm-hmm. <affirmative>, and then just to bring it back to the human mind or the human brain then mm-hmm. <affirmative>. Um, I thought your interest was more in the human mind than the human brain.

Jeff 00:06:35 Yeah. Well, good. I mean, they’re obviously, you know, I think the brain, you know, when it works, it produces mind <laugh>. Uh, but, uh, but you’re right in a sense, I think. So way back, you know, I got my PhD way back in like late, uh, or early 1990s, not that long after the PDP stuff was going on. And I found that very exciting. And it’s true, I was quite interested and I continue to be interested in, uh, you know, at the time I was kind of somewhat critical of, you know, or at least the big debate back then was distributed versus localist coding in psychology. And I was kind of at least open to the idea that localist coding, right, you know, so-called grandmother cells, uh, might be a useful way to think about how knowledge is coded in the brain, or at least not rule it out.

Jeff 00:07:28 Um, so way back when, I was interested in that debate. Uh, and then, yeah, I guess, you know, I continued to be interested all along, but, uh, with, uh, you know, AlexNet and all this excitement, I think what really engendered my interest, uh, lately, you know, the last, you know, seven, eight years now, is all these strong claims being made about how these are the best models of human vision or are fundamentally improving our insights into how the visual system works or the brain works. And I was just struck by, there was almost no reference to psychology, you know, that that’s where the mind comes in. It’s like, how can one make a claim that this is the best model of human vision and ignore a hundred years of research in human vision carried out by psychologists?

Jeff 00:08:19 And it just hit me as, you know, you go to the reference section of some of these papers and there’s virtually no, or, you know, very few, if any, uh, articles published in a psychology journal. And so, you know, I think that’s been a theme of my thinking. Cause, you know, I’ve dabbled in various different areas and I think one thing I’ve always tried to emphasize, you know, argue in different contexts, is that psychology has a lot to say. So I think sometimes people, you know, jump on neuroscience or, uh, artificial intelligence and they ignore psychology. And, you know, I’m fundamentally a psychologist, and when I see these claims being made with, uh, not enough reference to psychology, that just kind of irks me a bit, I suppose. So, you know, that got my attention.

Jeff 00:09:08 And so that got me back in. Like you said, I was way back in the nineties interested in some of these questions, and I kind of, um, got invested in, you know, researching, you know, things got updated. There’s lots to learn. Uh, and I was just, in a way, struck, you know, an interesting feature about how things have changed, I think, is back in the nineties or late eighties with the PDP, McClelland and Rumelhart and Hinton back then, um, they were using neural networks and they were really much more informed, I think, by psychology. So they really,

Paul 00:09:46 Right. They were psych, a lot of them were psychologists.

Jeff 00:09:48 Psychologists. And so they really focused on trying to account for psychological phenomena. Uh, and then, uh, you know, uh, was it 2012? Is it AlexNet or something about that time? Uh, and subsequently after that, a lot of excitement. Uh, but that crowd, you know, first, the actual remarkable engineering successes were computer science. And then people got more interested, uh, kind of starting to again relate this to the brain and mind, uh, more the brain, because they were more focused on trying to predict neural activations in the brain. And so there were claims again about links to humans, but there were fewer contributions from psychology. And I think that was the big empty space that I decided to walk into.

Paul 00:10:44 Well, that, uh, so I mean that’s often pointed to by, you know, modern people who use, let’s say, convolutional neural networks, people like Jim DiCarlo, Dan Yamins, and that ilk, is that these models, which were engineering feats and were not built to model specifically the visual system, even though convolutional neural networks were born out of, you know, Fukushima’s Neocognitron. Yeah. And what we knew about, um, the way that the human visual system is supposedly, um, set up. But the astounding thing was that, lo and behold, um, when you actually do compare these models to brains, the model does predict, through linear regression, um, neural activity. And, you know, sticking to vision, in even more recent work, um, you can find a cell and figure out what image, however bastardized and non-ecologically valid that image is, will drive that unit the most, right? In a living brain. And so that is sort of the astounding thing, right? Is that these quote unquote best models of vision, and you can riff on this, uh mm-hmm. <affirmative>, if you’d like, best models of our human vision, were not born out of neuroscience or psychology, but were born out of, uh, engineering, would be the claim,

Jeff 00:12:04 Right? Well, that’s the claim, I suppose. Uh, that’s the claim, <laugh>, yeah, that’s the claim. Uh, yeah. So, you know, I think the main kind of finding that, uh, is used as empirical evidence that these models are like the human brain is their ability to predict brain activations. Uh, and, you know, it looks impressive at first, but there’s a basic, you know, basic fact about statistics, which is that, you know, a correlation is not causation. I mean, and you can have confounds that can predict quite well. Uh, so you can predict off things that have no mechanistic similarity. Just the mere fact that something is predicting something else is, you know, I suppose it’s a necessary, but certainly not sufficient, condition to conclude that there’s a similarity there.
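
[Editor’s illustration, not part of the conversation.] The point that good prediction does not imply shared mechanism can be sketched with a small simulation. Everything here is made up for illustration: `cue_a` and `cue_b` are two hypothetical stimulus properties, the simulated "brain" response depends only on cue A, and the simulated "model" unit depends only on cue B. When the two cues co-vary in the stimulus set, a regression from model to brain looks good; when the cues are manipulated independently, as in an experiment, the prediction should collapse.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000


def r_squared(predictor, target):
    """Variance in `target` explained by a least-squares line on `predictor`."""
    slope, intercept = np.polyfit(predictor, target, 1)
    residuals = target - (slope * predictor + intercept)
    return 1.0 - residuals.var() / target.var()


def simulate(cue_a, cue_b):
    """Hypothetical responses: the 'brain' reads only cue A, the 'model' only cue B."""
    brain = cue_a + 0.1 * rng.normal(size=cue_a.size)
    model = cue_b + 0.1 * rng.normal(size=cue_b.size)
    return brain, model


# Observational stimuli: the two cues co-vary, as confounded cues do in natural data.
cue_a = rng.normal(size=n)
cue_b = cue_a + 0.3 * rng.normal(size=n)
brain, model = simulate(cue_a, cue_b)
r2_observational = r_squared(model, brain)

# Experimental stimuli: the two cues are manipulated independently of each other.
cue_a_exp = rng.normal(size=n)
cue_b_exp = rng.normal(size=n)
brain_exp, model_exp = simulate(cue_a_exp, cue_b_exp)
r2_experiment = r_squared(model_exp, brain_exp)

print(f"R^2 on confounded stimuli:  {r2_observational:.2f}")
print(f"R^2 on manipulated stimuli: {r2_experiment:.2f}")
```

The noise levels and sample size are arbitrary choices; the only thing the sketch is meant to show is that the same pair of systems can yield a high or a near-zero predictivity score depending on whether the stimulus set lets the confound do the work.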

Jeff 00:13:07 And what you normally do when you’re concerned that confounds might be driving a prediction is you run an experiment. You kind of vary independent variables, trying to rule out, uh, certain possible bases of the prediction, and say, okay, uh, you know, if I vary something, how does that impact the prediction? And, you know, psychologists follow, uh, you know, the basic experimental method where you manipulate things, you understand something about how something works by running an experiment. And those, to me, are the kinds of findings that are much more theoretically informative. You learn something if you, you know, if you manipulate something and you find what affects something and what doesn’t. That gives you some conceptual insight into what’s driving, uh, you know, how the system is operating.

Jeff 00:14:04 So, yeah. So I just think good predictions by themselves are not a secure basis for knowing how similar the systems are. It could just simply be a confound. In fact, I mean, prima facie, that just seems very likely, you know, that it is a confound, in the sense that, uh, what we know is, so, you know, I know you’re familiar with Brain-Score, and they’ve ranked hundreds of models, uh, on trying to get the highest brain score. But if you look at the models at the top of the leaderboard, it doesn’t matter, you look at any of these models, but even the models at the top of the board, they classify objects in about the most fundamentally different way from humans there could be, which is they rely on texture, and they largely ignore shape.

Jeff 00:14:54 So there’s just immediately a bit of a paradox there. It’s like, how can you have a model doing good predictions about human object recognition and miss, like, step number one? I mean, I’m not sure that you could have a more basic criterion in developing a good theory of vision, or at least of object recognition, than that your theory should care about shape and shape representations. I mean, there’s a lot more to say about what kind of shape representations you want, but let’s just start with step number one. It should encode shape. Now, the fact that a model that does not rely on shape gets a high brain score should raise serious alarm bells. And there is a straightforward way that that could happen, which is there could be a confound that’s driving the good prediction. So for instance, you know, the shape of an object will correlate with its texture.

Jeff 00:15:46 I mean, so, you know, airplanes have a certain shape and they’ll have a certain kind of texture, and cats will have a certain shape and a certain kind of, uh, texture. Now you can predict potentially a shape representation in a human brain on the basis of a texture representation in a model. And so you can make a good prediction, uh, but it’s due to a confound. And then in fact, people like Robert Geirhos did this lovely experiment. They ran a controlled experiment. They ran these studies where they, you know, they said, well, let’s run an experiment. Let’s attach the texture of an elephant to the shape of a cat and see how the model classifies that object. Because however high the brain score you get, you’d never know the answer. Is it texture or shape? Well, you couldn’t know. You’ll never know.

Jeff 00:16:35 You can get a hundred percent prediction, and you wouldn’t know on what basis the good prediction is being made. Uh, the way you find out is the way people have always done experiments. You manipulate something. And so Robert Geirhos and colleagues did that experiment, and they found, lo and behold, these models recognize things based on their texture. So, you know, it just seems like, well, hold on. There’s something seriously problematic with Brain-Score as a measure when you get a high score and you’ve failed on such a basic criterion. Now, I know it’s interesting. I still don’t think it’s, uh, um, we can talk more about this, but I don’t think it addresses the problem deeply. But nevertheless, it’s interesting that there are these new models, some transformer model with, I don’t know, 20 billion parameters trained on, I don’t know how many, 30 billion images or something like this.

Jeff 00:17:35 Yeah. And that model does start to pick up on shape rather than texture. And so people point that out, oh, scale is all you need. I mean, more, you know, more training. And then in fact, the model picks up on shape. And I guess my two responses to that are: it’s still the case that the top-scoring models on Brain-Score recognize things based on texture. So that raises questions about the predictivity measure by itself. I mean, and, uh, but then, you know, when people find that, oh, I can get a model to recognize a cat that has the skin of an elephant as a cat like a human does, uh, that’s great. Uh, but that’s still a pretty low bar in terms of our understanding of human vision and shape representation.

Jeff 00:18:29 We know a lot in psychology, from lots of research over time, about the nature of the shape representations that are employed by humans when they recognize objects. And, you know, so that’s fine, that’s great, that you’re recognizing based more on shape and not based on texture. But to claim that that’s then a model of human shape representation is a big jump. You know, again, you’ve now passed step one, but there’s lots of stuff we could do if you paid attention to the psychological research. And that, again, gets me back to my, you know, the thing is, as a psychologist, I’m really trying to, you know, get people to pay more attention to psychology. And there’s a lot of interesting research out there, and I think people would, uh, find out that these models are failing and are more dissimilar than they think.

Jeff 00:19:25 And, but the great thing is, you know, once you find a failure, I mean, the Geirhos study is a lovely example. As soon as you ran an experiment, you found that there is a failure, that it’s relying on texture and not shape. It immediately generates a hypothesis about the kinds of things you need to do. At least now you know, okay, our model has a particular type of problem. It has a problem that it’s not working on shape. And then people like Geirhos, their first intuition was to train the model differently, with the kind of data sets in which shape was more diagnostic than texture. So they changed the training space, which is fine, you know. I mean, the particular training they did was not particularly psychologically plausible. What they did is they said, okay, let’s take the shape of a cat and attach the texture of an elephant to the cat.

Jeff 00:20:19 You know, there’s a hundred different kinds of textures. You call it a cat each time. So now, you know, the shape and not the texture is diagnostic. So that’s obviously not human-like, but other people have used more realistic training environments and found that, okay, you can start getting in the direction of getting a shape bias with better training. And maybe different architectures will do a better job as well. Uh, but the thing is, when you ran an experiment, you knew what to try to fix. But if you just did Brain-Score and you got, oh, I got a brain score of 0.5, and then the next model gets a score of 0.56. I mean, what have you learned? I mean, what do you know? Have you learned, oh, that means it has more of a shape bias?

Jeff 00:21:03 No, you don’t know that. You have no idea what the increased predictivity advantage is due to. So, so anyway, that’s what’s good about an, I mean, an experiment not only starts removing confounds, which is important, but it also gives you the conceptual understanding of what the nature of the failure might be, or the nature, hopefully, of the success sometimes. I mean, it’s, it gives you a better understanding and that, you know, that’s just like, you know, running experiments, why, why, you know, it’s fine to run, you know, big regressions on big data, just observational data sets. But it’s better to run an experiment if you can. And this is a domain where you can run experiments. So why not?
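
[Editor’s illustration, not part of the conversation.] The cue-conflict logic Jeff describes (a cat shape carrying elephant texture) can be summarized with a simple score. The sketch below assumes the commonly used definition of shape bias as the fraction of shape-based decisions among all decisions that match either the shape or the texture label; the `Trial` record, the `shape_bias` helper, and the example responses are hypothetical and invented purely for illustration, not data from any study.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    shape_label: str    # category given by the object's outline
    texture_label: str  # category given by the surface texture
    response: str       # category chosen by the model (or the human)


def shape_bias(trials):
    """Fraction of shape-consistent choices among shape- or texture-consistent ones."""
    # Only genuine cue-conflict trials are informative.
    conflict = [t for t in trials if t.shape_label != t.texture_label]
    shape_hits = sum(t.response == t.shape_label for t in conflict)
    texture_hits = sum(t.response == t.texture_label for t in conflict)
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no response matched either the shape or the texture label")
    return shape_hits / decided


# Made-up responses from an imaginary texture-reliant classifier.
example_trials = [
    Trial("cat", "elephant", "elephant"),
    Trial("cat", "elephant", "elephant"),
    Trial("car", "clock", "clock"),
    Trial("bear", "bottle", "bear"),
    Trial("dog", "knife", "bird"),  # matches neither cue, so it is not counted
]

print(f"shape bias = {shape_bias(example_trials):.2f}")
```

Unlike a single predictivity number, a score like this says which cue is driving the classification, which is the kind of diagnostic information the experimental approach is meant to provide.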

Paul 00:21:48 I just, as you were talking about the shape versus texture, it made me wonder, and you would know this better than I, sorry, this is an aside, but are there disorders of human cognition where that is reversed and, and, um, you know, you could find a, maybe a model for some disorder in these, uh, great vision <laugh>.

Jeff 00:22:09 I mean, there’s certainly various forms of agnosias. I don’t, yeah. And so you definitely can have, I mean, normally I think of agnosias as disorders of shape perception. There’s various different ways in which shape perception breaks down. Uh, offhand, and I used to know this literature, but I haven’t, yeah, followed it in a while, I don’t remember, uh, like, you know, a form of agnosia in which people make judgements based on texture. It’s certainly the case that, you know, if you have prosopagnosia, you might be recognizing things on local features, which actually is a little bit like what a neural network might do sometimes as well. And so you might, yeah, recognize somebody because they have a big nose or, uh, you know, some particularly distinctive feature. So you’re not recognizing faces in the normal way. Um, so, but whether it’s texture or some local feature, it’s possible, but it just hits me as an odd

Paul 00:23:11 Yeah,

Jeff 00:23:12 It’s, it’s a, you know, it’s like maybe, okay, you train a model to classify objects accurately, and you get a good theory of disordered human vision. Maybe, maybe. But you, you’d have to then run experiments to test your theory. Okay. It’s just really like a form of, I mean, there’s gonna be a vast, large, interesting neuropsychological literature that would probably then something that you should run experiments to determine whether that’s true or not.

Paul 00:23:38 I mean, so running experiments, the, you know, a current, um, controversy maybe, or a, a current criticism of the way that, for example, neuroscience has been done the past 50, a hundred years, is that the experiments are too controlled. Um, and that the, um, animals and humans that we’re experimenting with, um, that because they’re not ecologically valid situations that we’re gonna get results that are tailored to the experiment or the experimenter and not how it’s actually working in the brain and or mind. And in these like, really highly controlled, low number of stimuli kind of experiments, that’s kind of, you know, what you could be getting. But then these giant data sets that the, the deep neural networks are trained on should, in theory, dissolve that they’re not ecologically valid. And, and actually I’m gonna ask you about what vision is for in a second, because object recognition and categorization is that what vision is for. But, um, in inherently, you know, these models that are trained on so many different types of images and categories, uh, you would think in principle would, if they are constrained, like human vision is constrained, would have a shape over texture kind of bias in, in that one example. I mean, there’s lots more examples that you list in your recent paper as well. Mm-hmm. <affirmative>.

Jeff 00:25:04 So is it, is the question should, uh, I mean, I agree that there are, I mean, there’s no doubt dangers on both sides. I mean, you could run experiments with, you know, artificial stimuli such that, you know, you’re missing the big picture, I suppose. Um, but, you know, so many of these studies, I mean, you know, what is artificial? Like a line drawing of an object? I mean, what’s so striking is we can recognize an object, uh, by its line drawing. In fact, there’s a kind of a, well, I can’t remember the author’s name offhand, but a study done in the fifties, I think it was, uh, he did this bizarre developmental study, uh, well, not bizarre, but kind of, I think, a bit ethically questionable today. But so

Paul 00:25:47 Today, yeah,

Jeff 00:25:48 He raised his child while kind of ensuring that he never saw a line drawing. So until two or three years old, they did their best, uh, not to let the child see a line drawing. And so they had to, I suppose, turn off the TV when cartoons were on or whatnot, but eventually, you know, it became too difficult to continue this thing. But the child was old enough, I don’t know, three or something like, or two or something like this. So the big moment happened: okay, take a look at this and tell me what you see. And they showed line drawings, and it was trivial. I mean, it wasn’t even a, you know, you certainly don’t have to train, um, uh, a child to see line drawings. So there’s an artifact.

Jeff 00:26:31 It’s artificial in some sense. I mean, until cavemen drew on walls, you know, there weren’t line drawings. Uh, but it was, uh, yes, that’s an artificial stimulus, but I think it’s very informative about, uh, how human vision works. I mean, illusions are such striking things to me. I mean, I just seems like, yeah, there are, I mean, sometimes, you know, there are real illusions that occur in the world and in natural context, but it just seems odd to me to think that those aren’t important constraints on how the human visual system operates. And to call ’em artificial would be Well, yeah, I mean that’s, you know, we kind of, that’s when, you know, an experiment is I, you know, you kind of find particular circumstances and it works, you know, when, when something happens. And that just seems, uh, important to me. I mean, I, but the thing is, in deep neural networks, and this is sort of what I’m doing with <inaudible>.

Jeff 00:27:31 I have a team of people. So in fact, the people on my team are doing all the modeling. I, you know, I can’t model myself, uh, <laugh>. So they’re doing all the actual hard work. And, um, the point is you can run experiments on deep neural networks. So that’s what you should do. That’s, you know, one of the main messages. You know, another message is, you know, there are other approaches to modeling as well that we shouldn’t dismiss out of hand. But what I’m doing in our lab is running experiments on deep neural networks. So there’s no reason why, if you’re absolutely committed to having an image-computable model, and I think this commitment is, you know, it’s a nice thing to have an image-computable model, but I just don’t think that’s the entry point to doing good research, you know?

Jeff 00:28:29 But, but anyways, you know, alright, say you want to work with image computable models and you want to ignore any model that’s not okay, that’s, you know, you’re free to do that. But what you really shouldn’t do is not run experiments with your model. You should run experiments and find out whether these, you know, artificial stimuli are producing results, uh, that resemble human results. And if they don’t, you might have, I mean, I’m sure there are some phenomena that are potentially not interesting. I mean, you know, I mean, it’s not that you have to count, you know, I mean, I, I think some phenomena, no doubt, more interesting than others, but, you know, like, but recognizing things based on shape is not one of the things you can ignore safely.

Paul 00:29:17 What, yeah. One of my favorite examples that you give, um, in fooling, um, artificial networks or showing that they don’t categorize the way that humans do is the single pixel example. Yeah. Mm-hmm. <affirmative>. Um, can you just describe that and then we’ll come back to, and then I’ll have some follow-up questions.

Jeff 00:29:33 Sure. Yeah. So this is, uh, an experiment run by Gaurav Malhotra where what we did is just add a confound to a stimulus set. So we used this very simple data set called CIFAR-10, which is just 10 categories each with, I think, 60,000 images. So compared to what’s normally used, ImageNet, it’s a much, uh, smaller, easier, um, data set to work with. And we simply added a pixel, like, you know, we had different conditions, but one of our conditions was a single pixel where you’d say, okay, every time you see a plane, put a pixel of a certain hue in this location, or at least in this small region of space. And every time you see a horse, you put it over here, and every time you see a bus, you put a certain pixel over there. So now there’s two kinds of diagnostic cues for the classification that a model in principle could pick up on. It could pick up on the overall, well, shape or texture, in fact, texture or

Paul 00:30:34 Texture <laugh>.

Jeff 00:30:35 Yeah, probably overall texture really. But anyways, some, you know, something that you think, yeah, the, the whole image itself is used, or alternatively you could do it, you know, with what’s sometimes called a shortcut. You could just exploit this single diagnostic pixel. And that’s what the model does. The model will decide, you know, what’s a lot easier because the model does, you know, the model just wants to classify it. Uh, you know, there’s no reason it should do it like a human. And so when, given that those data sets with this kind of confound that we just artificially inserted the model chooses that shortcut. And so what happens is that if you then, at test, give a picture of a, of a plane, but put the pixel where a bird was, the model will, you know, from a human person, you know, it looks exactly like, I can’t remember, I said a plane, uh,

Paul 00:31:24 Horse. Yeah, yeah.

Jeff 00:31:26 Whatever it was. Oh, yeah. But the model will jump and oh, it’ll say, oh, I think that’s a bird, because it’s just, yeah. All it knows is the pixel doesn’t learn any, it’s just focused on the pixel. So, and because the, because,

Paul 00:31:37 Sorry, I was gonna say, because the pixel in that case is the most predictive thing of that category.

Jeff 00:31:42 Yep. It’s the most predictive, or at least it’s the easiest thing to pick up. They’re both in principle, the human could, you know, pick up on either, and if they’re, you know, they could get essentially in, see both are, you know, at some level equally predictive. But, um, but the prediction on the, at the level of the, of the image is a more complicated, challenging set of, um, invariances. And it has to pick up, and it just picks up on the simpler feature that it can pick up on. Um, whereas a human being doesn’t even notice there’s a pixel there. Uh, right. So, and data sets are just full of confounds. And some, sometimes, you know, you know, it’s nice in our experiment, we, you know, we know what the confound is because we inserted it. But in, in naturalistic data sets, God knows how many confounds there are. And you know, some of ’em you won’t see, but, you know, some of ’em are things like what texture is correlated with shape, that that’ll certainly be one. And that’s no doubt, I think why models to our surprise. I mean, I don’t think the expectation was, uh, prior to Geirhos that, uh, the model, you know, was relying on texture not shape. I mean, but oh, turns out it did, because that’s just an easier feature to pick up on
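
To make the shortcut idea concrete, here is a minimal sketch of the kind of setup Jeff describes. It is not the code from the actual study: the tiny 64-pixel "images", the two classes, the cue value, the cue locations and the logistic-regression "model" are all assumptions chosen so the example runs in a second with NumPy. Each class has a weak, noisy whole-image signal and a perfectly predictive cue pixel. After training, the cue pixel is moved to the other class's location to see which of the two the classifier follows.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PIXELS = 64                      # toy "image" size (assumption)
CUE_VALUE = 5.0                    # brightness of the inserted cue pixel (assumption)
CUE_LOCATION = {0: 0, 1: 63}       # class 0 cue at pixel 0, class 1 cue at pixel 63
TEMPLATE = rng.choice([-1.0, 1.0], size=N_PIXELS)  # weak whole-image class signal

def make_images(labels, cue_labels):
    """Noisy images with a weak class pattern plus one cue pixel per image."""
    labels = np.asarray(labels)
    sign = 2.0 * labels - 1.0
    x = rng.normal(size=(labels.size, N_PIXELS)) + 0.1 * sign[:, None] * TEMPLATE
    for row, cue in enumerate(cue_labels):
        x[row, CUE_LOCATION[int(cue)]] = CUE_VALUE
    return x

def train_logistic(x, y, lr=0.1, steps=300):
    """Plain gradient descent on the logistic loss."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        w -= lr * x.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def predict(w, b, x):
    return (x @ w + b > 0).astype(int)

# Training set: the cue pixel always agrees with the true label (the confound).
train_y = rng.integers(0, 2, size=2000)
w, b = train_logistic(make_images(train_y, cue_labels=train_y), train_y)

# Test 1: cue still agrees with the label.
test_y = rng.integers(0, 2, size=1000)
clean_acc = np.mean(predict(w, b, make_images(test_y, cue_labels=test_y)) == test_y)

# Test 2: same kind of images, but the cue pixel of the *other* class is inserted.
swapped_cue = 1 - test_y
swapped_pred = predict(w, b, make_images(test_y, cue_labels=swapped_cue))
follows_cue = np.mean(swapped_pred == swapped_cue)

print(f"accuracy with consistent cue: {clean_acc:.2f}")
print(f"fraction of swapped images labelled by the cue: {follows_cue:.2f}")
```

In this toy version the classifier goes with the relocated pixel rather than the rest of the image, which is the behaviour described above; whether a deep network on real images does the same is an empirical question that the sketch does not settle.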

Paul 00:32:54 In something like, um, cancer diagnosis, right? Where, where a deep learning network is being used as a tool to look for anomalies or some, you know, detect pixels, you know, like little, that, that’s like little clusters of pixels, right? That might not be detectable by the human eye in that regard. You would want a, uh, superhuman, for lack of better term, a a machine that can attend to all pixels equally well and doesn’t run out of energy, and can make decisions based on taking in entire, you know, entire, um, images and statistically go over all the pixels and see if there is any anomaly. On the other hand, you know, that’s not how humans work. Um, even though we can get pretty good at that kind of stuff. And then there’s the question of whether we want these things to work like humans or to work differently than humans or, or better than, or just differently, or whether it even matters. Why would it matter for one of these networks to function the way that a human mind or brain does?

Jeff 00:33:57 Yeah. If you’re an engineer. I mean, uh, it does. I mean, you know, my, you know, my, my critique doesn’t apply. I mean, look, I do think if you take the example of, you know, these, these models that are trying to detect cancer and things or make diagnoses, I mean, it turns out that these models often pick up on things that you don’t want them to pick up on. Yeah. Because it turns out that there’s just, you know, is a particular logo on the hospital, you know, the corner, or there might be if people are lying down versus standing up, there might be some weird, uh, kind of configuration that you weren’t aware of, but actually the model picks up on that. So confounds are a problem in that engineering, pure engineering context as well. But as, but if you put away, you know, if you put aside the fact that, you know, let, let’s, you know if the confound is going to allow you to, on a test set, diagnose correctly, you know, you don’t actually care in a way.

Jeff 00:34:56 I mean, if it’s, if it’s picking up cancer cells or some extra correlated thing that’s associated with cancer cells, and you thought, you know, you thought it was picking up cancer, in fact, it’s picking up, you know, a, a, a byproduct of cancer, who cares, as long as you’re correctly identifying, uh, uh, some x-ray that’s start, you know, needs to be, uh, picked up. Um, and then it’s fine. I mean, you know, I don’t, you know, models can outperform players in chess and Go and, uh, yeah. You know, and that’s, you know, I have nothing but, uh, you know, kudos to any engineer that took this stuff. Fantastic. I mean, it’s impressive on who would’ve thought.

Paul 00:35:34 But, but also from an engineering perspective, there are people who are interested in build building, you know, general in generally intelligent agents, right? Someone who could detect cancer perfectly, paint your house perfectly, uh, I don’t know, tell you what cartoon you should watch next perfectly, whatever that means. Yeah. Um, and, and which is a, a different task than understanding the brain and mind. And I, I understand that you are more interested in understanding the brain and mind than, you know, building these sort of gen I, I know that you are interested in, in general, uh, generally intelligent mm-hmm. <affirmative>, um, agents. But I mean, how much of the mind and brain do we need to emulate if we want to build general, generally intelligent agents? And if you know what a generally intelligent, intelligent agent is, you should let me know.

Jeff 00:36:22 Yeah, yeah. Well, I mean, I mean, I think what, you know, one of the messages that we’re doing in our research is that there’s often, you know, multiple ways to get an outcome. There’s, there’s shortcuts. So, you know, you can pr you can classify images based in ways that, you know, you didn’t think, uh, you know, wasn’t the way you imagined. There’s probably multiple ways to achieve some goal. And, uh, yeah, if you can get a car that can drive you home safely at night based on training it on trillions of miles of, uh, of, you know, simulated driving and, and, you know, and things like that, you know, I, you know, I’d be happy to have a, a si you know, if once it works, I’m all, I’m all for it. And it’s, and, but, and, but the idea that that’s how human learn to drive would be, you know, absurd because it’s been trained on a trillion miles of, uh, of, of data.

Jeff 00:37:15 But, uh, so yeah, you know, if you can, if you could train a model on all, you know, relevant human skills and, you know, you just have a whole collection of, you train it with an exorbitant amount of data and the driving context and an exorbitant amount of training and language context, and an exorbitant amount of training on image data sets, and, you know, you know, and you know, in motor, you know, I dunno, motor robots that are somehow eventually, um, trained in, you know, ways, you know, you can capture all the forms of intelligence that would be useful for us if then I’m all, I’m all for it. It’s just, but I know, but like you say, I’m a psychologist, so, uh, you know, to me the question is to what extent is, is success on these tasks a demonstration that it works like the brain? But yeah, I, I, I, I don’t have any magic definition of what general intelligence, general artificial intelligence should be, but I guess a model that just was very useful in helping us in our daily life, uh, and doing all the things we wanna do, I suppose, uh, that would be, you know, a practical way. But, but don’t confuse that or just at least certainly don’t assume that that’s a human based solution.

Paul 00:38:32 You think it’s worth continuing with, like, with, I’m just gonna put everything under the deep learning, um, umbrella mm-hmm. <affirmative>, um, and then to understand our brains and minds better. Um, and if that’s correct, what, you know, what is the solution? You know, the term inductive biases is thrown around a lot. We need to put more, uh, constrain these systems in a way that we are constrained more, that, that we have to move about the world. Um, but, you know, are we gonna, are we gonna have to essentially emulate all things that humans and animals do to, uh, build and test these systems correctly? Or since there are multiple ways to skin a cat, are we gonna find 14 different unique solutions, um, until, you know, once you’re satisfied to satisfy to, to eventually satisfy you rather?

Jeff 00:39:26 So you’re, you’re asking me now back to whether to emulate the brain as opposed to just solving engineering, not

Paul 00:39:31 Just Well, yeah. Yes. I mean, whether, yeah. I mean, how much are we gonna have to emulate the brain to understand the brain? Yeah. And mind?

Jeff 00:39:41 Uh, yeah, it, it’s a good question. I’m skeptical that we’re gonna be able to, you know, use, I mean, you know, the one approach that, uh, people are adopting is that scale, you know, you know, scale is all you need, more training. Yeah. Basically use the architectures. Okay, maybe go from convolutional models to transformers or, but you know, you know, somewhere we got something as architecturally, we’re kind of close, but the training environment or the objective function or, you know, some, you know, some extra, uh, and that may be, and, you know, proof is in the pudding, run experiments and find out how far you get. I mean, but I mean, there are just such striking differences between what these models look like and what brains look like, and the assumption that, so, you know, all these models have, you know, have units that are effectively all the same as each other, other than the connection weights they have to other units.

Jeff 00:40:40 So, I mean, we’re, you know, and they’re not spiking, you know, so we have mo if you look at a brain, there’s, you know, hundreds of different types of neurons that have incredibly different, you know, diverse morphology and, and they vary in ways that would, you know, potentially be important computationally, they have different, you know, time constants as how long they integrate information. They have different, they conduct information at different speeds, and they have spikes in which the timing of, of spikes is really quite essential into whether information’s communicated. And, you know, there’s endless other kind of differences. And sort of the assumption is that we can ignore all that. Uh, we can kind of assume they’ll be functionally equivalent if we just train a model in which the only adaptive area of change is a synapse between units in which everything works in fixed time steps.

Jeff 00:41:34 And there is no other form of variation. Now, that may be true, but it’s, but you know, a, it would be, if you want a model of the biology of, I mean, it, you know, maybe as a psychologist, I, I don’t, I don’t, I I I I tend to imagine that these vari forms of variation matter, and they may give you some capacities that you wouldn’t otherwise have. Or even if they don’t give you additional capacities, they make you the certain they make you ma you man manifest your capacities in a certain way that are interesting from a psychological point of view. Uh, so just to ignore all of that is, is, um, is a big gamble. And, you know, you know, and, you know, again, test your model and see if it works. But, you know, I, I wouldn’t be surprised if, if you’re missing something important by ignoring all these forms of variation.

Jeff 00:42:24 And, and, you know, I guess the assumption is that people often talk about, well, you know, when we train a model endlessly, you know, with, you know, more and more, you know, we’re gonna eventually get there, it’s like some as if the, the learning of the model, some combina, you know, some of the learning is sort of like the in life learning, you know, the learning that happens in a lifetime. And a lot of this other training is sort of emulating the evolution in the first place. You know, somehow you just can get everything, you know, so it’s, you know, I don’t know which part of my outcome was a product of what, you know, what, what we’d call evolution and what we call learning. But at the end of the day, the answer I obtain is a lot like the brain. And, but the thing is, you know, evolution’s not constrained by changing just the weights between neurons.

Jeff 00:43:10 I mean, evolution produces whole varie of, you know, again, a wide range of morphological forms of neurons, all of which, you know, may change. You know, the architectures change, the, the, the properties, the neurons change, and all of that is outside the scope of these models that just change their weights. So whether you’re in the possible space of a solution, that’s gonna be a biologically, you know, interesting bio, you know, a close correspondence to humans is a big is is it’s certainly a failure at the, at the single cell, at the neuroscience level, at trying to understand, you know, if you say, what, why, you know, why do we have this morphological diversity of neurons? Well, you’re not, you’re not gonna get an answer from a model that has every neuron the same. Uh, whether, um, whether you’ll have a psychologically equivalent model that, okay, we could just ignore those forms of variance and we’ll get a, a functionally equivalent thing.

Jeff 00:44:06 Okay, well, that, that’s in the realm of possibility, but it’s not, it’s not a, I don’t think we should say it’s a safe bet, you know? And, uh, but one, one way you’re never gonna find out is doing more brain score. I mean, you’re gonna have to run experiments to see whether these models function in the way that a human visual system functions or a language system functions. Uh, but it just seems to me a very unsafe premise to assume that just changing weights, that the only relevant form of learning and computation in the brain is in synaptic changes, uh, and ignoring other forms of variation in the brain. I’m just repeating myself. It seems unsafe. Yeah.

Paul 00:44:54 Yeah. <laugh>. Um, so in some sense, okay, so I’m sure you’re aware of the idea of multiple realizability and people like Eve Marder’s work who show that, um, given like a, the same set of a small number of neurons, you can still, um, there are lots of different ways that you can come up with a behaviorally relevant behavior outcome to produce the behavior behavioral behaviorally relevant, um, outcome. <laugh>, I’m repeating myself now, <laugh>. Um, and, and in some sense though, like, so then you have this like problem space in, in Eve Marder’s, uh, work, it’s like the stomatogastric ganglion, right? So the, the, the problem space is how to, um, digest food and stuff and contract the muscles in a certain way, and that constrains, in some sense constrains the possible solutions, although there are lots of different solutions to do that. Um, and so one of the, again, I’m at a loss, maybe to ask you a question about this or maybe just ask, uh, you to comment on it.

Paul 00:45:58 One of the impressive things, um, maybe about neural network models is that even though they are so divergent from our biologically, uh, realistic because they’re bi biological brains, that maybe the, if the problem space is the same kind of problem space, they’ll converge on a solution within the space of possible solutions that are still valid, right? But then that goes, uh, back to the question of like, well, are we just giving them the wrong training data or are we asking them to solve the wrong problems? So this is where I was gonna ask you what our vision is for mm-hmm. <affirmative> is our, is our vision for object detection.

Jeff 00:46:35 Yeah. Yeah. Well, almost certainly that’s not the, uh, you know, I mean, we can classify objects and we can label objects, but, you know, our visual system obviously evolved from, uh, you know, our vision’s been around a long time. I don’t know how far back vision goes. And we, you know, evolution’s kind of descent with modifications. We’ve adapted our, our visual system is based on much more, uh, species long, long, long gone. And, uh, yeah, I don’t think they were labeling, uh, the images in their world. Uh, so the visual system is, uh, is no doubt. And, you know, and, and, and so our, his, you know, evolutionarily from timescale, it wasn’t all about that. And then, and then in our own lives, it’s not all about that. I mean, obviously we’re, you know, we, we, we, we manipulate objects, we walk around the world.

Jeff 00:47:32 Um, I mean, one, one of the ways that people talk about things is that, you know, what we’re doing is trying to infer a distal representation of the world. So we have this 2D image projected on our retina, and we’re trying to infer a three-dimensional representation of the world. And it’s an ill posed problem. I mean, we have, you know, going from 2D to 3D means that you’re in principle, there’s lot, you know, they’re infinite in principle, different worlds out there that can project the same 2D image on our, our, on our eyes. We need to kind of have some assumptions about, uh, the, the likely causes of this image on our retina. And, uh, that’s where these kind of various priors come in. But we’re trying, the argument is we’re trying to build a 3D representation because that, or, you know, a, a distal representation of that world.
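
The ill-posedness Jeff mentions can be illustrated with a small worked example. This is only a sketch under an assumed idealized pinhole camera (the function name `project` and the focal length are illustrative, not anything from the episode): every 3D point along the same line of sight lands on the same 2D image point, so the image alone cannot say which of them was out there.

```python
import numpy as np

def project(point_3d, focal_length=1.0):
    """Idealized pinhole projection of a 3D point (x, y, z) onto a 2D image plane."""
    x, y, z = point_3d
    return np.array([focal_length * x / z, focal_length * y / z])

near_point = np.array([1.0, 2.0, 4.0])
far_point = 3.0 * near_point          # three times farther away along the same ray

# Two different points in the world, one and the same point in the image.
same_image = np.allclose(project(near_point), project(far_point))
print(project(near_point), project(far_point), same_image)
```

Picking one of the many possible worlds therefore requires extra assumptions, which is the role the priors play in the account above.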

Jeff 00:48:21 Because once you have a, a distal representation, ideally you can do lots of things with that. You can reason about the objects. You can, you know, you can, uh, you can grab them from different points, you know, different perspectives. You can kind of, you can do lots of things. Whereas, you know what, what, and I mean, not all, I mean, models, you know, more and more people are developing self supervised models. And so not all the, all the models are only classifying images now. But yeah, most, you know, the models and brain score for the most part are, they don’t, they don’t compose a three-dimensional representation at all. They don’t, they’re, they’re taking a 2D surface representation of texture and labeling it. Uh, there’s no in, there’s no attempt to infer the distal representations of the objects in the world from those images. It’s just a, a 2D mapping onto, uh, a label.

Jeff 00:49:18 And that’s clearly not what our visual system is doing. Uh, again, what, what do you make of that fact that you get a high brain score in a model that’s not composing 3D representations, it’s not organizing the scene by Gestalt principles or performing cons, constancies, or, you know, all these features that we know about human vision, uh, are just not part of the task of, or they’re certainly not, they’re, they’re not necessary for a model that’s trained in a supervised way to classify 2D representations in a labeling task. Uh, and the hope, and, and it’s, I think it’s definitely worth pursuing, is that if you get much more sophisticated training, you know, a much more realistic, uh, you know, a more realistic data set to train, and you’re not classifying static images, but you’re predicting know, you know, predicting, uh, you know, images and you’re, you know, developing embodied representation in the world and you’re kind, you know, developing a robot that’s kinda learning and grabbing things, um, you know, almost certainly, I don’t doubt that a lot, you know, some of the things that we see in human, uh, visual properties are gonna manifest themselves in a much, you know, once you have a much more realistic training, uh, and a better objective function.

Jeff 00:50:45 Uh, I don’t, I wouldn’t be surprised at all if more and more of these, uh, properties of human vision start manifesting themselves in a deep neural network. But I also wouldn’t be surprised if some things are just outside the scope of current models are just not gonna ever find themselves in the human space because they’re just working with neural networks that, you know, for one thing, don’t spike. They don’t have, uh, morphological diversity in their units. Uh, and they’re not the product. They didn’t start from the evolutionary starting point that human visual system work from, which at, at times had very different functions in the first place. I mean, there’s lots of things that, you know, our, you know, our bodies, you never would’ve, uh, you never would’ve ended, you know, if you started from scratch and you tried to evolve something, you probably wouldn’t come up with a solution that humans have done.

Jeff 00:51:39 But the, the reason humans ended up with, uh, a solution that was sort of adaptive, having a weird evolutionary history, and you started, you know, once you get to this place, the best solutions here, but you never would’ve started there. And I don’t expect, I mean, even like the silly thing that our photoreceptors are in the back of a retina <laugh>, right? So we have all this light that’s, you know, is it’s going through three layers or more of, uh, of, uh, of, of other neurons in the retina. And we have a big blind spot because, you know, the ganglion cells are leaving from a part of there. And, uh, and you have veins blocking the light and all these different things. I mean, if you, you know, an engineer wouldn’t decide on adaptive grounds for that. I mean, it turns out that, you know, there’s lots of other constraints that you have. Uh, um, yeah. So

Paul 00:52:31 Humans have it hard. Animals and humans have it hard.

Jeff 00:52:34 Yeah. Yeah. I, I, I’m not sure, I think the octopus, there’s some, there’s some weird animal that doesn’t have them. The photo, the photoreceptor are the front of the retina. So there’s like one, I think it might be an octopus. I dunno, what if octopus have eyes? Maybe I got really wrong there, but there’s some animal that doesn’t do it. But I think for some reason, most animals have the photoreceptors at the back of the retina, which just from a surprise, you know? And then, so then, then the visual system is, you know, is confronted with problems, uh, that it wouldn’t otherwise. And so there are these interesting ways in which we have filling in processes, and there’s a whole there, you know, there, there I, ID, you know, people have developed, like people like Stephen Grossberg and others have developed interesting architectures that are trying, in part, trying to rectify the, the, you know, terrible signal that’s being projected to the, to the photo.

Jeff 00:53:25 But that’s just sort of like, so, you know, so there are, you know, there are actually important, you know, cognitive visual, low-level visual processes that may be only managed as they’re only required because the nature of the input. Uh, but you know, these models, you know, I mean, they don’t have those problems. In fact, they have just a equal tend to have equal acuity across the entire space, and they don’t have any. Yeah. So, uh, so whether if you, if you train the model more and more stuff, but you didn’t work on getting a more realistic retina, you may end up again with a very, you know, missing important parts of, of filling in phenomenon. And then, you know, things that, you know, might not, it’s not even a problem to solve.

Paul 00:54:15 Well, maybe octopi can’t fill in. Yeah. Not the, that’s something to test <laugh>. Um, you know, thinking about our evolutionary needs and, and projection, I mean, do you think that homeostatic needs and thi you know, life processes, uh, a sense of autonomy and, you know, the, the fact that our bodies regenerate and have to maintain itself. I mean, how important do you think that is to our quote unquote higher cognition?

Jeff 00:54:42 Ooh, that’s a good question. Uh, yeah, I mean, I think, I may not be answering your question, but things like, you know, you know, I, I, I, I sort of get the sense, you know, embodied cognition, I think, you know, sure.

Paul 00:54:59 But that’s just a robot where a robot is not, doesn’t have to be autonomous. We can just give it a battery, you know, it doesn’t have to, it doesn’t have to maintain itself. Mm-hmm. <affirmative>, for instance.

Jeff 00:55:09 Yeah. Well, yeah, I, I’m not a hundred percent sure I know what you mean, but I, but I, I do think that, you know, hey, all the, they’ll have know, achieving homeostasis is certainly to be important for the, you know, hypothalamus or the, you know, hypothalamus lower,

Paul 00:55:25 The lower, the lower brain,

Jeff 00:55:27 Lower the lower bits for sure. Uh, the higher bits I expect so too, I, I, I expect everything is gonna be at some level, uh, constraining, you know, whether that’s the place to constrain, you know, a theories of human vision. I, I, you know, just, you know, that may not be the place to start. I mean, I think, but

Paul 00:55:48 Do, do you think that we, uh, let’s say just humans, right, are implementing algorithms, straight up algorithms and objective functions, or is it messier than that? Or, you know, is that, is that just like the, an algorithmic solution to something, a necessary means to attain that thing, whatever the algorithm that’s needed at the time? Or do you think that we have just by hook or by crook come to, uh, develop a bag of tricks? I think someone like Gary Marcus might say, um, to, you know, to serve as objective functions and algorithms,

Jeff 00:56:22 Everything. I mean, I mean, the neural network is, you know, so, so by algorithm do you mean like symbolic, some kind of list? Kinda

Paul 00:56:30 Like No, just a, a set a, a set of, um, concrete predefined steps to determine a solution to a problem. Mm-hmm.

Jeff 00:56:38 <affirmative>, but I think the brain clearly can do that. I, I do, I do. You know, I, I mean, I do think the brain does have a bag of tricks. So I, I think there are, you know, uh, lots of, uh, hacky solutions and, uh, you know, satisficing, uh, in, in many different ways. So I, I, I don’t doubt that at all. But, you know, we can have an algorithm for, you know, addition and multiplication and, you know, we, you know, you know, so we can kind of very strategically, I mean, it, it might relate to system one, system two, Kahneman kind of thing. So we certainly can implement, uh, kind of

Paul 00:57:13 Rational of caution. Do you think?

Jeff 00:57:16 Uh, yeah, I don’t know of,

Paul 00:57:18 Of our cognition.

Jeff 00:57:20 I mean, I do, I do think we need some, one of the things that certain types of models of vision or language have are, you know, these compositional representations where you have kind of explicit encoding of, uh, relations and elements that can ab bind in, in various ways to these things. And that, again, it’s Gary Marcus, or people like, uh, you know, Jerry Fodor and, and, and in vision, people like John Hummel and, and Biederman with the geon theory, with structural description theories where you have an explicit coding of parts and the relationships between parts. Uh, I, I, I, I do think, I don’t, I, I do think that’s very likely an important component of human vision. I mean, I do think that you, you know, coding relate. So one of the things I, I, I was mentioning before that it’s not enough to, in theory of vision, just to say, oh, I can, I recognize this thing that has a looks, has a shape of a cat and the texture of an elephant.

Jeff 00:58:23 Oh, that must be a cat. So that’s okay, that’s good. Cuz humans do the same thing. Uh, but there’s a lot more about shape representation. So one thing about shape representations is that we recognize objects based on their parts and the relationships between the parts. And we have an explicit representation. At least that’s what we’d argue in some of our research and other people long before, uh, would say. And that thing that might be really hard for a lot of these models. I mean, even people like, um, uh, you know, some people like Biederman and John Hummel developed, uh, or it was Hummel and Biederman in 1992, had this great Psych Review paper where they kind of implemented the geon theory of object recognition into, uh, neural network, you know, 30 years ago now, more than 30 years ago. Uh, and it has a, a, a mechanism by which you do dynamic binding of, of relations and explicit, you know, explicit representations of relations and explicit representations of parts and associate these things.

Jeff 00:59:21 But even recently, people, like, people, none other than like, uh, Geoff Hinton are developing things like capsule networks and this, and this, this kind of, uh, he has this thing called GLOM. It’s, I think a conceptual idea, but he’s trying to understand the hierarchical relationships between parts and, uh, and explicit representations. So, you know, and, you know, and, and these are not image computable models. I mean, so he’s developing these non image computable models, uh, that, you know, would do terribly on brain score. They, I presume, would get a score of zero probably or something. But, uh, but he’s developing these models to say, uh, you need to develop models that code for relations. And he has, he’s not, he doesn’t do it in the way that Hummel uh, uh, did. Hummel had like synchrony of, of, of activation units. And, and, uh, you know, Hinton has some routing mechanism of, uh, you know, which I don’t really couldn’t tell you much about it really, but it’s, uh, but it’s, uh, you know, uh, an attempt to develop ex, you know, coding of explicit forms of relations and part relationships between these things.

Jeff 01:00:30 And, and they, in both cases, sacrifice, uh, at the moment, image computability, ultimately you’re gonna need an ulti. You know, ultimately the human visual system, uh, works, has a retina. And so, you know, you know, I agree, ultimately you wanna model that’s image computable, but you know, whether that’s the place to start or whether you should kind of work with a toy model in order to figure out how can you code for parts relations. Um, and everyone from, you know, John Hummel to Geoff Hinton have adopted the approach or play with toy models to work and try to understand these relations, which eventually you would try to implement in a, in a, in a, in a, in a full scale model that’s image computable. Uh, so I don’t know if that relates to algorithms or not. I mean, I think it relates to the idea of having an explicit encoding of parts and relations and a mechanism by which that’s achieved. Uh, but it’s not like an algorithm of, you know, you know, long division.

Paul 01:01:37 Yeah. Yeah. Part part of what, uh, one of the examples that you were just alluding to, or at least one of them, was, uh, and I don’t remember the, I know you’ve done work on this also, but basically, neural networks will judge if something is same or different based on like some trivial pixel, uh, displacement, whereas humans won’t be able to tell that difference. But humans will be able to tell if the relations between two Yeah. For example, lines are different. Um,

Jeff 01:02:04 That’s the kind of research I’m talking about. So we have this stuff where we, we, we kind of, so this is stuff by, uh, John Hummel and, uh, I think a PhD student of his, Stankiewicz, back in the nineties showed that humans are highly sensitive to this kind of the categorical relationship between object parts. Um, and, uh, in, in these simple line drawn kind of things, like kind of sim people might know these, Tarr and Pinker had some kind of simple stimuli, uh, and they kind of slightly based off those things. And, uh, yeah. So in our research showed these models, current models completely blind to these kind of, so they don’t care about those things. We even tried to train the model on to, okay, you know, okay, we’ll train it on these relationships and, you know, we, we left out one relationship, but we trained it on lots of other relationships, right?

Jeff 01:02:55 So the categorical relationship between parts matters, uh, here and here and here and here and here. And then we have the left-out one, and we say, okay, how does it do there? And the model ignores it there. So it’s just over and over. It’s not learning the relationship. It’s learning some particular local feature, but it’s not understanding the concept of a relation that then applies. And that’s where people like, uh, you know, John Hummel say, well, you’re never gonna get there, you need an explicit representation of a relation that can take on any kind of, you know, uh, object part. And it’s not a conjunction, it’s not a single unit or some distributed representation of a circle on top of a square. It’s a circle representation, a square representation, and a relationship between those two.

Jeff 01:03:45 And that same relation would be involved in representing a, a cube on top of a cone. And that’s kind of the stuff that people like, uh, Fodor and Pylyshyn argue for: compositional representations where the complex thing is composed of parts and the parts maintain their identity. You can combine them in new ways. Uh, but you don’t kind of just forget. You actually have a concept of loves, you know: John loves Mary, Mary loves John. It’s the same love. You know, it’s not like these are just different things. Uh, uh, hopefully it’s the same thing. Yeah. Who knows, maybe that could be argued. Yeah, <laugh>. That’s right. Yeah. That may be a good example. But, uh, anyway, so, you know, it’s possible some of these massive models will kind of pick up these things in a way that, you know, is compositional.

Jeff 01:04:38 But, you know, our experience, so in our lab we’ve used models that learn what are called disentangled representations. We take these, uh, variational autoencoders. You take an image and you reproduce an image, it’s some variant of the input, and you can train in a certain way, with certain kinds of loss functions, that leads the model to learn what are called disentangled representations. So it’s kind of: this unit in the hidden layer is coding for the x position of the object in the scene. And this is the y unit. And, you know, the level of activation of unit number three codes for its shape. So you learn these kind of grandmother-like representations. They’re not exactly grandmother cells themselves, but they’re kind of interpretable, localist. Uh, but those models don’t support, uh, combinatorial generalization.

Jeff 01:05:29 So even once you learn those things, you say, oh, great, that’s the answer. Now I should be able to recognize novel combinations. I’ve seen circles on the right-hand side, uh, and I’ve seen squares, and I’ve seen blue, but I’ve never seen a blue square on the right-hand side, you know, that combination. Having learned those disentangled representations, you still fail when you combine these features, uh, in a novel way. So there’s, I think, a real challenge about how you solve these kinds of combinatorial, compositional representations. It’s a challenge. And, you know, maybe these models will do it, but maybe you need something, again, qualitatively different than the models that we’re working with at the moment.
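
[Illustration: the held-out-combination test Jeff describes can be sketched in a few lines of Python. This is a minimal sketch, not the lab’s actual stimuli or code. The feature values and the helper names `held_out_split` and `each_value_seen` are hypothetical, chosen only to show the shape of the test: every individual feature value appears in training, but one particular combination never does.]

```python
from itertools import product

# Hypothetical toy feature values, chosen only for illustration.
SHAPES = ["circle", "square"]
COLORS = ["red", "blue"]
SIDES = ["left", "right"]


def held_out_split(held_out):
    """Return (train, test): all combinations except `held_out`, and `held_out` alone."""
    everything = list(product(SHAPES, COLORS, SIDES))
    train = [item for item in everything if item != held_out]
    return train, [held_out]


def each_value_seen(train, item):
    """True if every individual feature value of `item` occurs somewhere in `train`."""
    return all(any(t[i] == value for t in train) for i, value in enumerate(item))


# The combination from the conversation: a blue square on the right-hand side.
novel = ("square", "blue", "right")
train, test = held_out_split(novel)

print(len(train), "training combinations")
print("novel combination in training set:", novel in train)
print("each feature value seen in training:", each_value_seen(train, novel))
```

[A model that had truly learned separate, recombinable codes for shape, color, and position would be expected to handle the single test item. The claim in the conversation is that models with disentangled representations still fail on it.]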

Paul 01:06:22 I, I want to ask you about language in a moment, because you’ve done some fun tricks with language with the large language models. But before we leave the vision aspect, taking us back to comparing brains and, well, two different brains or two different models, or a brain versus a model, something that you write about a bunch is representational similarity, uh, analysis. And this is a way to basically, um, see the correlation between how a model and its given units are representing pairs of objects, for example, versus how a brain or a different model or something would. And you don’t like representational similarity analysis, uh, because you can get a high score saying that the representations are very, um, correlated between, let’s say, a brain and a model, and still you have these psychological disparities, um, in the psychological experiments. If I didn’t explain that correctly, or if I don’t have that correct, uh, you should correct me, mm-hmm <affirmative>, before I ask you a question here.

Jeff 01:07:25 No, that’s right. That, that’s right.
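
[Illustration: a minimal sketch of the kind of computation Paul is describing, using only the Python standard library. The activation values are made up, and the specific choices here (1 minus Pearson correlation as the dissimilarity measure, Pearson correlation to compare the two dissimilarity matrices) are assumptions for the sake of a small example. Published RSA analyses often use other distance measures and rank correlations.]

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rdm_upper(responses):
    """Dissimilarity (1 - r) for every pair of stimuli, as a flat list.

    `responses[i]` is the response pattern (one value per unit) to stimulus i.
    """
    n = len(responses)
    return [
        1.0 - pearson(responses[i], responses[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def rsa_score(system_a, system_b):
    """Correlate the pairwise-dissimilarity structure of two systems."""
    return pearson(rdm_upper(system_a), rdm_upper(system_b))


# Made-up responses to four stimuli. The two systems have different numbers
# of units, which is fine: only the stimulus-by-stimulus geometry is compared.
brain = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [3.0, 1.0, 2.0], [2.9, 1.2, 2.1]]
model = [
    [0.2, 0.9, 0.1, 0.5, 0.3],
    [0.3, 0.8, 0.2, 0.5, 0.2],
    [0.9, 0.1, 0.7, 0.2, 0.6],
    [0.8, 0.2, 0.6, 0.1, 0.7],
]

print("RSA score:", round(rsa_score(brain, model), 3))
```

[Note what the score does and does not establish: it says the two systems group the stimuli similarly, not that they arrive at that grouping by the same mechanism, which is the concern raised in the conversation.]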

Paul 01:07:27 Okay. Um, okay. So, and we’ve talked about those a little bit on the podcast, but we don’t need to go into detail on them. Um, another thing that people are looking at is the sort of low-dimensional, um, dynamical structure, uh, in artificial networks and brains, um, especially in, like, recurrent neural networks, and finding that there’s, you know, a similarity between the low-dimensional, um, structure on, let’s say, manifolds, um, and that the trajectory of neural activity through time is often similar between, you know, a model performing a task and a monkey’s neurons or something performing the same task. I don’t know how much into this kind of work you are, or if you have opinions on it. I was just curious if you did.

Jeff 01:08:15 Yeah. In terms of the temporal dynamics, yeah, I don’t know too much about the temporal domain, but almost certainly it’s gonna be another case of, um, it’s a kind of correlation. You’re taking some kind of representational geometry in the case of RSA. Yes. Yeah. I guess some temporal dynamics, uh, in the case of, uh, what you’re talking about, which I don’t know much about, uh, and correlating them with each other. But there is an interesting paper that came out. I saw it, you know, I can’t say too much about it, but it has to do with temporal domains, where they show that, uh, you can kind of predict, I hope I get this right, but you can predict ups and downs of Bitcoin based on neural activations in a monkey or, or in a rat.

Jeff 01:09:14 It is. So there’s some kind of ability to predict well above, you know, statistical chance based on relating brain activation of a rat and Bitcoin. And, I don’t understand the details of it, but the reason I’m thinking of it now is that it has to do with time dependencies. So it’s kind of like you can get these spurious correlations, and it hinges on time-dependent things. So it may not apply to stochastic things, probably. Well, I mean, as long as you have some kind of, you know, alpha rhythm or something in there, yeah, the state you’re in now is autocorrelated with what it was before, and you can kind of get these spurious correlations between systems. And it relates to the temporal aspect of both Bitcoin moving up and down and the neural firing of a rat moving in a maze.

Jeff 01:10:14 And you can just get spurious correlations between these things. I thought, you know, that’s, yeah, I think, you know, so I don’t know, but I just think it’s very easy to get correlations that are very misleading. And, you know, if it hadn’t been Bitcoin, if you’d run something that looked like a model, you might go, oh, look at that correlation. I mean, you only realize it doesn’t mean very much when it becomes an absurd correlation that you’re measuring. But if it had been, this is my model of, uh, you know, something, then it might seem quite impressive. So, I don’t know. I do think that these correlations that are found in RSA, and the correlations that are found in the kind of linear predictivity that you find in Brain-Score, uh, are dangerous, because they could be the product of confounds or some other factors as well. There’s another thing, which is just simply the number of independent-like factors. There’s some paper that came out showing that just having more sort of orthogonal-like representations in an architecture affords better predictivity, you know, independent of whether the representations are aligned. So there might be yet another way in which you can get these correlations that look impressive, but they’re not necessarily informative of similar mechanisms.
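
[Illustration: the general statistical point about autocorrelated signals can be shown with a small simulation. This is a sketch under stated assumptions, not a reconstruction of the Bitcoin paper Jeff mentions: random walks stand in for any two unrelated, strongly autocorrelated time series, and the trial count, series length, and 0.5 threshold are arbitrary choices.]

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def random_walk(rng, n):
    """Autocorrelated series: each value is the previous value plus noise."""
    position, path = 0.0, []
    for _ in range(n):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


def white_noise(rng, n):
    """Series with no time dependence: independent draws at every step."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def fraction_strong(make_series, trials=200, n=500, threshold=0.5, seed=0):
    """Fraction of independent pairs whose |correlation| exceeds `threshold`."""
    rng = random.Random(seed)
    strong = 0
    for _ in range(trials):
        a = make_series(rng, n)
        b = make_series(rng, n)  # generated independently of `a`
        if abs(pearson(a, b)) > threshold:
            strong += 1
    return strong / trials


walk_fraction = fraction_strong(random_walk)
noise_fraction = fraction_strong(white_noise)
print("independent random walks with |r| > 0.5:", walk_fraction)
print("independent white noise with |r| > 0.5:", noise_fraction)
```

[Every pair here is unrelated by construction, so any strong correlation is spurious. The autocorrelated series produce such correlations far more often than the white-noise series, which is why a standard significance test that assumes independent samples can be badly misleading for time series.]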

Paul 01:11:40 This all reminds me of the, um, dead salmon in an fMRI machine experiment in neuroscience. Yeah. <laugh>, yeah. Just, yeah. Wait, people put a dead salmon in an fMRI machine and, um, recorded a BOLD signal, which is supposed to suggest, uh, that there’s, you know, brain activity. I think they had the dead salmon performing a task or something. I don’t remember the specifics. There’s,

Jeff 01:12:04 I think it’s social psychology, mind reading.

Paul 01:12:06 Mind reading. Okay.

Jeff 01:12:07 Yeah. Something like that. Yeah. But I think that’s a danger. You can get things that look impressive, but, you know, then you can do a kind of reductio ad absurdum condition. Okay. And you find that, actually, well, the methods we’re using, it doesn’t mean they’re wrong, it just means you need to run an experiment to find out whether this correlation is really, uh, reflecting the similarities that you think it does.

Paul 01:12:36 All right. So we’ve focused on vision so far, and, uh, I am aware of our time here, but, um, you’ve done some fun things with large language models in this same kind of vein. I know you haven’t worked with them as much. Um, first of all, well, I’ll ask you the question and then you can kind of generally tell me, you know, what your thoughts are on large language models in general, uh, and especially, again, as models. Uh, you know, the same kind of work that, um, takes visual model activity and compares it to, um, neural activity, this same kind of linear regression readout, has been performed on data sets like fMRI and EEG in terms of predicting an upcoming word, which is how these large language models largely, uh, function these days. Um, so, but, uh, well, so maybe, what do you think in general of these large language models? And then I wanna ask about how they can learn nonsense languages just fine.

Jeff 01:13:35 <laugh>. Okay. Yeah. Well, I mean, like everyone else, I’m blown away by ChatGPT. It’s an amazing thing. Yeah, if you’d asked me, you know, a year ago whether you could have something that actually is so useful and interesting, I would’ve, you know, I didn’t see it coming. So, again, amazing. Uh, I mean, I know some people are criticizing it, and you can find cases, you know, maybe many cases, where it does silly things, but sure, overall it’s bloody impressive. So I don’t wanna suggest it’s not impressive. But the next step is, is it a good model of human language? And so that’s, um, a totally different kettle of fish. And, um, yeah, for one thing, it just hits me as implausible, in the sense that it just seems unlikely that, you know, language acquisition is a product of predicting the next word. So the idea that children are, so

Paul 01:14:39 It’s like solely a product of predicting the next word, or, or the prediction surely has to be a part of it.

Jeff 01:14:46 Um, well, you know, I would’ve thought what you’re predicting is, you know, you’re trying to communicate ideas, and you’re using words, uh, to achieve an end: you want the cookie, or you want, you know, someone to agree with you that that’s a doggy or something like this. So you’re gonna have reference to the world. And I don’t know too much about, uh, the syntactic capacities of, uh, one-and-a-half-year-olds, two-year-olds, but I really don’t think that, as the child is learning language, their primary objective is to predict the next word in a sentence. It hits me as implausible. So in a way, it’s sort of another example in my mind, you know, before we talk about any data, just a striking example of how remarkably similar something can be, because it just generates, you know, flawless, lovely text, uh, syntactically coherent and now meaningful in terms of useful things. Yeah. By meaningful, I don’t mean it,

Paul 01:15:53 Understands. Okay. All right. I was gonna ask you.

Jeff 01:15:56 No, yeah, I don’t wanna go there, but I would say that, you know, I can read this and go, that’s a, that’s a

Paul 01:16:03 Meaningful to you.

Jeff 01:16:05 2:1. It’s a 2:1 in my essay, you know, a student can get a, you know, a B, ah, uh, so, you know, it’s a pretty impressive text that they could write. But they can get that from a system that had a completely wrong objective function, you know? Or largely. I mean, it’s one where you first were predicting the next word, and then you have this weird kind of human-in-the-loop reinforcement component, which is, again, you know, obviously not part of the mechanism by which children learn to generate coherent text. So it’s just an example of how very different, you know, learning objectives, uh, can produce behaviors that are, you know, eerily similar to human performance. It doesn’t mean that they’re the same thing. Uh, in a way, you know, an example I gave, you know, a couple times on tweeter, tweet

Paul 01:16:57 A tweeter. I like that. Yeah.

Jeff 01:16:59 <laugh> <laugh>, the, uh, you know, if you have a model like DALL-E that generates images, you wouldn’t say, oh my God, that’s the best theory of human drawing or painting. It’s an amazing painting. You could sell the painting and win a painting contest. But, you know, it doesn’t seem like a plausible account that it’s the best theory of human painting. And, you know, you can have models that generate coherent text that’s meaningful to a human being, anyways, and, uh, it doesn’t mean it did it, uh, in a way like a human did it. Yeah. So I think it’s a nice example of how different something can be, uh, in its objective function and nevertheless emulate behavior, you know, in an impressive way. Um, yeah. But then you can say, okay, well, that’s fine, you know, uh, but Jeff, look, I can predict brain data. So it’s, again, a predictivity thing. But, you know, there’s a paper that came out in PNAS last year, I think, and they were claiming you can account for a hundred percent of explainable variance.

Paul 01:18:04 That’s Schrimpf et al., I believe. Yeah,

Jeff 01:18:05 That’s right. That’s the paper. And so they have a paper where what they’re claiming, you know, sounds very impressive. Now, the first thing you find out is that, with a hundred percent of the explainable variance, the explainable variance isn’t that much, because, you know, the amount of variance you can predict in one human brain from another brain is quite small. So, you know, it doesn’t account for a hundred percent of the variance. It’s a hundred percent of the explainable variance, and the explainable variance is very, very small. I think 10% or something like that. So, you know, it predicts a hundred percent of 10%. Um, but then if you look carefully at the paper, uh, you find out that it can predict, in some cases just as well, the brain activations in non-language areas.

Jeff 01:18:52 So it’s like, okay, you can predict a hundred percent of 10% both in language areas and non-language areas. So now I think, you know, what’s going on here? And that just, again, highlights that, well, maybe these predictions are not reflecting the mechanistic similarities that you think. There can be other ways in which you can get similarities that don’t reflect mechanistic similarities. It seems to me a problem that you can predict non-language areas just as well as language areas. I mean, what does that mean? That means that this predictivity stuff is not doing what you think it’s doing, um,

Paul 01:19:32 Or that language is distributed more highly and

Jeff 01:19:35 Well, I think, yeah, that may be, yeah, I think they give some explanation about semantics. Yeah. Anyway, you could do that, and it’s possible, but, you know, all of a sudden the story’s not quite so straightforward, and there are some questions that one should raise. Uh, um, yeah. So there’s a bunch of these studies. You know, there’s another paper where it’s even smaller, the amount of variance accounted for is, I don’t know, 0.007%. It’s significant, but it’s so small. So, you know, uh, I don’t know. I’m not very persuaded. I mean, the correlations are much more powerful in the domain of vision, and I’m not very persuaded by them. Right. So, you know, you go to the language domain, they’re much, much smaller, uh, and I’m not impressed by them either.

Paul 01:20:25 Yeah. But it’s a different, um, data set, right? Because in vision, a lot of the data sets are from non-human primates where you’re recording actual neurons. I’m not trying to argue with you, but, you know, I’m just thinking there’s probably a lot more variation in the fMRI BOLD signal and in the EEG. Um, yeah,

Jeff 01:20:40 Well, it’s definitely the case. I mean, what they call the noise ceiling, the amount one human brain can predict another, is much lower in language studies than it is in vision. So, yeah, you know, it’s not the fault of a model that the noise ceiling in humans is so low. It’s just that the data is much more noisy. But that does nevertheless, to some extent, weaken the evidence that one’s presenting when one says, uh, I account for a hundred percent of explainable variance, and then you find out, well, the noise ceiling is awfully low. Uh, so that is relevant.
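
[Illustration: the “a hundred percent of 10%” arithmetic, written out. This is a deliberately simplified sketch. It treats the noise ceiling as the fraction of total variance that is explainable at all, and uses Jeff’s rough “I think 10%” figure from the conversation rather than a number checked against the paper. The function name `raw_variance_explained` is hypothetical, and published benchmarks may normalize correlations rather than variances.]

```python
def raw_variance_explained(fraction_of_ceiling, noise_ceiling):
    """Convert a ceiling-normalized score into a share of total variance.

    fraction_of_ceiling: share of the *explainable* variance the model captures.
    noise_ceiling: share of *total* variance that is explainable at all,
        e.g. how well one participant's data predicts another's.
    """
    if not 0.0 <= fraction_of_ceiling <= 1.0:
        raise ValueError("fraction_of_ceiling must be between 0 and 1")
    if not 0.0 <= noise_ceiling <= 1.0:
        raise ValueError("noise_ceiling must be between 0 and 1")
    return fraction_of_ceiling * noise_ceiling


# "A hundred percent of 10%": figures as recalled in the conversation.
raw = raw_variance_explained(fraction_of_ceiling=1.0, noise_ceiling=0.10)
print(f"{raw:.0%} of total variance")  # prints "10% of total variance"
```

[The same normalized score can therefore correspond to very different amounts of raw signal, depending on how noisy the underlying recordings are.]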

Paul 01:21:16 So these large language models can learn English, they can learn Dutch, they can learn French, and, I don’t know, you know, all the different languages, and learn them very well. They can also learn impossible languages, which is what you’ve taught them to do. Um, and, you know, you used the term meaning, and then I <laugh> thought you were gonna start criticizing, you know, whether these large language models understand the meaning of what they’re producing. Right. And to me, you know, when I was reading your work on teaching these things nonsense, impossible languages, that speaks against the idea that large language models can understand the meaning of, uh, what they’re generating. But, uh, maybe you can clarify what you guys did. And then I don’t know if you have comments on that related to, mm-hmm <affirmative>, you know, what that means about meaning in large language models.

Jeff 01:22:09 Yes. So this is work with Jeff Mitchell. So, yeah, we just trained, you know, so this is a couple years back now. So this is pre-ChatGPT, and these are models that were purely predicting the next word. There was no human in the loop. And so it was generating syntactically coherent sentences. It wasn’t generating useful, you know, it would just ramble on and talk nonsense. So, I mean, the meaning wasn’t there, because it was just talking rubbish after a while.

Paul 01:22:44 But the meaning for us, it

Jeff 01:22:44 Was syntactic. Yeah. Even the meaning for us was missing, although at the sentence level it might be meaningful to us, but at the paragraph or narrative level, it would start rambling off. And, you know, that’s what the genius of ChatGPT was, I guess, to rein these models in to be more meaningful for us. Yeah. But anyways, the models could learn syntactically impressive sentences, and they could do various, you know, subject, um, verb agreement and number agreement between various, uh, grammatical, uh, markers. So, you know, it could do things, uh, quite impressively. So, in other words, syntactically correct sentences. Um, and you know, the thing that people have been saying recently as well, especially with ChatGPT, is, aha, Chomsky is wrong.

Jeff 01:23:36 I mean, look, we’ve demonstrated that you can learn syntax, uh, and these models have no universal grammar. It’s just some generic architecture. There’s nothing language-specific. I mean, in our case, it wasn’t a, uh, nowadays it’s a transformer, but, you know, that transformer can be used in visual domains. I mean, yeah, you have to modify things, but there’s nothing about language. Uh, and nevertheless, you can get these syntactically coherent sentences, and, um, so therefore Chomsky’s wrong. But I think that’s just a misunderstanding. Like, Chomsky’s story was, I mean, and I’m no expert on the details of Chomsky, for sure, uh, yeah, uh, Jeff Mitchell might know a lot more than me. Okay. But anyways, um, the basic idea is that the critical thing about human language is we have, uh, an ability to learn a language with relatively little data. A child is learning, uh, to generate syntactically competent, you know, uh, grammatically, uh, accurate sentences with relatively little, uh, input. That’s the claim, the kind of poverty of the stimulus, uh, argument.

Jeff 01:24:49 And so, but that comes at a cost, according to Chomsky. So you can learn quickly, but you have to kind of abide by the rules of universal grammar. So we have inductive biases that allow us to learn certain types of languages, and we can learn them relatively quickly. And so, uh, we’re constrained in the languages we can learn in a normal way, quickly, you know, in a natural way. Uh, but the benefit is we can learn quickly. Whereas large language models have exactly the opposite properties, which hits me as, you know, supporting Chomsky. Mm-hmm. <affirmative>. So these models need lots of data, uh, much more data than a human would require. And, uh, but they can learn anything equally well. There are no inductive biases. That’s the reason they can learn anything: there’s nothing constraining them to learn certain types of things.

Jeff 01:25:41 But, uh, so the benefit, I suppose, if you wanna call it a benefit, is it can learn anything. The cost is, it takes a lot of training. Hmm. But that hits me as not a falsification of Chomsky. You know, it doesn’t support Chomsky’s particular theory, and Chomsky’s theories have changed a lot over time, so it doesn’t support any particular theory. But it does seem to me to support the idea that there are important, uh, you know, inductive biases that, um, a human being needs in order to learn a language quickly. So, you know, that’s the main story, I think. And, you know, I’ve heard recently people, uh, you know, again, in informal conversations on social networks and Twitter and stuff, saying, ah, humans actually do have a massive amount of exposure to language, a child is exposed to a lot.

Jeff 01:26:39 But I don’t think it’s appropriate to count, you know, words that they overhear as a child. I mean, just passively, the words around you are not contributing very much, I don’t think, to language learning. There are studies, so I think this is right, you know, I’m saying something that I’m only 90% sure about. I think it’s Patricia Kuhl who did some interesting studies. It may not have been about syntax, it might have been about phonology, but aspects of language where, if a child is just passively, you know, being exposed to a language or a foreign language that’s on a radio or television, they don’t learn, right? So, just passive exposure. So when someone says, oh, a child’s learning, you know, he’s exposed to 50 million words a week, or something like that, just because that’s how many words are spoken around them, and that’s maybe less than, you know, ChatGPT, but nevertheless, you know, it’s not crazy, it’s like, well, I don’t think those are the relevant kinds of exposure.

Jeff 01:27:42 To what extent are children actually engaged in language, with parents speaking to the child? You know, that’s the more relevant number of episodes, not just, you know, a sea of words around them that they might largely be ignoring. Um, so I do think there is a challenge, in that these models require a lot of training. Every exposure is part of the learning environment of these models. The child’s learning is not gonna be anything remotely as large as what’s required to train ChatGPT. Uh, so I do think, you know, the large language models provide some evidence in support of the general hypothesis that humans have inductive biases to learn certain types of constructions. Uh, and they can do it, um, with relatively little data compared to what a model can. I mean, and there’s a parallel kind of case that we’ve studied in vision, where, well, it wasn’t us, uh, I think it was Samy Bengio and his lab a few years back, 2019 or something like this.

Jeff 01:29:04 So we kind of followed up on some of this work subsequently. But they found you can train a model to learn this: you take ImageNet and just convert all those million images in ImageNet. So there’s like a thousand categories, each with a thousand exemplars. And you just make them all random pixels, like TV static, yeah. So there’s just absolute randomness. There’s no structure amongst the 1,000 members of category one. There’s similarly no structure amongst the members of category two. So it’s just a million random patterns that are arbitrarily assigned to a thousand categories. And you train a model and it can learn it. And it doesn’t take that much longer than actual images. I mean, but that’s not like humans. And, you know, in the same way, there don’t seem to be inductive biases in our language models like that.

Jeff 01:29:52 It can learn impossible stuff that no human being is gonna learn. Now, to some extent, in the human case, with a language, you could deliberately, slowly kind of learn something about it, but it’d be like a problem-solving exercise. Yeah. I don’t think there’s any reason to imagine it’s the kind of language that would ever be learnable. In some cases, they’re just clearly not learnable. We had cases where the dependencies are beyond short-term memory capacity. Oh. So, you know, a human couldn’t do it, because you have to repeat a word, I dunno, 27 times or something like that, and on the 28th trial something happens. It’s like, well, you can’t do that, because it’s beyond our human capacity. But the models are exploiting that superhuman capacity to perform as well as they do.
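
[Illustration: a toy version of the “learning pure noise” point. This is not the ImageNet experiment described above, and it is not a deep network. As a stand-in, it uses a plain perceptron with more weights than training examples, which is enough to show the underlying idea that a sufficiently flexible learner with no helpful inductive bias can perfectly fit inputs and labels that have no structure at all. All sizes, the seed, and the function names are arbitrary, hypothetical choices.]

```python
import random


def make_random_dataset(rng, n_samples, n_features):
    """Structureless data: Gaussian-noise inputs with coin-flip labels."""
    inputs = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    labels = [rng.choice([-1, 1]) for _ in range(n_samples)]
    return inputs, labels


def predict(weights, bias, x):
    """Classify one input as +1 or -1 with a linear threshold unit."""
    score = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if score > 0 else -1


def train_perceptron(inputs, labels, max_epochs=1000):
    """Classic perceptron rule: nudge the weights after every mistake."""
    weights, bias = [0.0] * len(inputs[0]), 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, y in zip(inputs, labels):
            if predict(weights, bias, x) != y:
                weights = [w + y * v for w, v in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training example is now classified correctly
            return weights, bias, epoch
    return weights, bias, max_epochs


def accuracy(weights, bias, inputs, labels):
    """Share of examples whose predicted label matches the given label."""
    correct = sum(predict(weights, bias, x) == y for x, y in zip(inputs, labels))
    return correct / len(labels)


rng = random.Random(0)
# More features (100) than examples (20), so any labeling can be fit.
inputs, labels = make_random_dataset(rng, n_samples=20, n_features=100)
weights, bias, epochs = train_perceptron(inputs, labels)
train_accuracy = accuracy(weights, bias, inputs, labels)
print("training accuracy on random labels:", train_accuracy)
```

[Perfect training accuracy here reflects memorization, not learning of any regularity, since there is none to learn. The argument in the conversation is that a learner able to do this equally easily for structured and unstructured material is unlike a human learner.]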

Paul 01:30:40 When you’re science minded or have a science background like me, and you’re talking about all these different, you know, the poverty of the stimulus, um, and, uh, how children aren’t subjected to that many words, I mean, I just automatically think about experimenting on my son. What if we just stopped speaking to him, you know, completely, or started reversing, uh, nouns and verbs and

Jeff 01:31:03 Speak backwards. Speak backwards.

Paul 01:31:04 Yeah, speak backwards. Yeah. Oh, that’d be rough

Jeff 01:31:07 If you could, I think, yeah, I think, yeah, exactly. That’s the point. It’d be very hard to do that, but yeah, give it a go.

Paul 01:31:12 Do you know the, uh, the phenomenon, I’m sure you’ve seen this, where, you know, you can take every word in a sentence and sort of flip a couple letters and you still read the sentence fluently? Right. And you don’t even notice, like, differences often. Um, and even, you know, even when there are kind of harsh differences, even if you notice the difference, you can still read the sentence fluently. Um, maybe this is just another experiment for you to perform, to test these models psychologically, uh, how they perform with that kind of, like, anagram. I’m not sure what they’re called, but every word is kind of an anagram, right? <laugh> mm-hmm. <affirmative>.

Jeff 01:31:49 Yeah. We’ve actually done a little bit of work like that, but, uh, oh yeah. Okay. And they did surprisingly well, actually.

Paul 01:31:56 Is that because there’s so many typos on the internet? Or is there, there

Jeff 01:31:59 It’s a good question. In our case, no, I don’t think we trained it on, uh, on typos. So, uh, these letter transpositions, I mean, it’s not quite true that if you read a word and you transpose things, you don’t notice it. I mean, some words, yeah, are different, like form and from. Yeah, I mean, you can read those words, so clearly for many words you do notice. Uh, and it turns out there are some languages like Hebrew, there are some interesting studies in Hebrew, and I think it’s not only Hebrew, but for what it’s worth, some studies have been done in Hebrew, where, in Hebrew, when you swap two letters, you produce another word. In English, you swap two letters, you end up with just a nonsense thing.

Jeff 01:32:44 But in Hebrew, just by the nature of how it is, they’re very saliently different. So if you transpose letters in Hebrew, the visual system’s learned to be very attuned to the order of letters, whereas in English, because it doesn’t matter as much, people are a little bit less sensitive to those transpositions. So, you know, the training environment will impact the extent to which you’re sensitive to transpositions. Uh, and yeah, we have a paper that’s, I think, coming out now, showing that, to my surprise, the model did a surprisingly good job at accounting for the similarity between letter transpositions. You know, you take a word in which you transposed something, and then you compare it to the word without the transposition, and you ask how similar those internal representations are, uh, and they’re more similar than I would’ve thought. And the pattern of results kind of captures the experimental conditions that people have run. So sometimes these models do, uh, you know, I still think I could make the model fail if I tried harder. But I was, you know, I suppose, pleased. I don’t know, I try to enjoy finding faults in these models, but, you know, credit where credit’s due. The model did a good job.

Paul 01:33:56 Well, I was gonna say, to your surprise, you mean to your chagrin, because you hate, yeah, hate these models, right? Um, I don’t, yeah, I know, I know. Um, I don’t wanna give people that impression. Actually,

Jeff 01:34:05 Psychology just gets no respect. I mean, people need to pay attention, because psychology has, you know, beautiful data and relevant data for so many things, but somehow people think it’s more impressive to look at neuroscience than at the relevant behavioral data. So, I mean, the very simple version of this is that, you know, people would do an fMRI scan. You’d do some instruction or intervention to try to improve someone’s reading skills. And you kind of say, you know, Johnny has trouble reading, here’s the state of his brain as measured by fMRI before intervention. And then you do some intervention, and then you look and you see the brain has changed, the BOLD signal’s different post-test, and you think, oh ho, the intervention worked.

Jeff 01:34:58 Just like, well, no, I mean, we don’t know. The question is, does Johnny read better? I mean, behaviorally, that’s what matters. And if the reading did change, we know that the brain changed. I mean, you know, where the brain changed is interesting if you’re a neuroscientist, but it’s just not that relevant to, is it Johnny? I can’t remember who I’m talking about. Johnny. Yeah, Johnny. So it’s irrelevant to Johnny, it’s irrelevant to Johnny’s mother. I mean, the point is Johnny can now read. You need a careful behavioral measure of Johnny’s performance when reading, and we need, you know, a good behavioral intervention, hopefully designed in a study that has some, you know, logic given what we know about learning and memory.

Jeff 01:35:46 And, you know, so you kind of use psychological insights to design the intervention. You use psychological methods to evaluate the outcome of the intervention, to find out whether performance is better. And neuroscience doesn’t have any role in any of those stages, I don’t think. The neuroscience is just not a useful tool for the teacher. But it is certainly interesting intellectually. You know, I’d like to know where Johnny’s brain changed after instruction, that’s complete. I’m not criticizing neuroscience. I’m criticizing the idea that it’s useful for education and that we should invest in it. And, you know, grant funding’s a zero-sum game, and the neuroscience studies are expensive. Actually, so are psychological studies. You know, randomized controlled studies are also super expensive. They’re both expensive. So, you know, better to spend your money, if you’re interested in education, on psychology, not neuroscience. That’s sort of the short. But again, the theme is pay more attention to psychology.

Paul 01:36:58 I was thinking about naming this episode Deep Problems with Deep Learning, based on the title of your recent archive paper, Deep Problems with Neural Network Models of Human Vision. But now I think maybe I should name it Psychology Gets No Respect. What do you think? What’s a good name for the episode?

Jeff 01:37:17 I like that. I like that one. Why not go for that?

Paul 01:37:20 Okay, I’ll go for that. But it’s gonna be attributed to you. Jeff, thank you so much for spending the time with me. We could talk for a lot longer, but it’s late in your world and I have to go as well, so I really appreciate it. I had fun.

Jeff 01:37:33 Yeah, me too. I really enjoyed the conversation.

Paul 01:37:51 I alone produce Brain Inspired. If you value this podcast, consider supporting it through Patreon to access full versions of all the episodes and to join our Discord community. Or if you wanna learn more about the intersection of neuroscience and AI, consider signing up for my online course, Neuro-AI: The Quest to Explain Intelligence. Go to braininspired.co to learn more. To get in touch with me, email paul@braininspired.co. You’re hearing music by The New Year. Find them at thenewyear.net. Thank you for your support. See you next time.