BI 144 Emily M. Bender and Ev Fedorenko: Large Language Models

Brain Inspired

Large language models, often now called “foundation models”, are the model du jour in AI, based on the transformer architecture. In this episode, I bring together Evelina Fedorenko and Emily M. Bender to discuss how language models stack up to our own language processing and generation (models and brains both excel at next-word prediction), whether language evolved in humans for complex thoughts or for communication (communication, says Ev), whether language models grasp the meaning of the text they produce (Emily says no), and much more.

Evelina Fedorenko is a cognitive scientist who runs the EvLab at MIT. She studies the neural basis of language. Her lab has amassed a large amount of data suggesting language did not evolve to help us think complex thoughts, as Noam Chomsky has argued, but rather for efficient communication. She has also recently been comparing the activity in language models to activity in our brain’s language network, finding commonality in the ability to predict upcoming words.

Emily M. Bender is a computational linguist at the University of Washington. Recently she has been considering questions about whether language models understand the meaning of the language they produce (no), whether we should be scaling language models as is the current practice (not really), how linguistics can inform language models, and more.

0:00 – Intro
4:35 – Language and cognition
15:38 – Grasping for meaning
21:32 – Are large language models producing language?
23:09 – Next-word prediction in brains and models
32:09 – Interface between language and thought
35:18 – Studying language in nonhuman animals
41:54 – Do we understand language enough?
45:51 – What do language models need?
51:45 – Are LLMs teaching us about language?
54:56 – Is meaning necessary, and does it matter how we learn language?
1:00:04 – Is our biology important for language?
1:04:59 – Future outlook

Transcript

Ev 00:00:03 Whether we can study the interface between language and actual kind of thinking and reasoning capacities. I am dying to understand how that works. I think that’s the most intriguing thing.

Emily 00:00:16 It’s not clear to me that large language models are something that the world needs at all. Certainly not larger and larger ones. That’s not a, um, like phenomenon in the world that, you know, a mountain that we climbed because it was there kind of a thing, right. We, we created the mountain as we were climbing it,

Ev 00:00:32 But if you wanna build a system that can think, then it just seems a little bit misguided potentially to just try to think that language will just give you that. And again, I think the idea that it can comes from the fact that a lot of people think that language is what made us smart.

Emily 00:00:49 Does it matter whether machines are learning language as much as they’re learning it differently from humans? Um, I would say yes in two ways. One is, um, to the extent that we’re claiming that the machines are a model that we’re gonna use to study humans, then we need to be very clear about what the similarities and differences are, cuz that gives us the limits of the model. Um, and then secondly, if we’re gonna be building technology that people are using, the way in which the system was learned might put some, some limits or tell us something about the resulting system.

Speaker 0 00:01:24 This is Brain Inspired.

Paul 00:01:37 Hello, good people. I’m Paul. Large language models have taken over in the AI world, as you likely know. Uh, and their use has extended beyond language, as you also likely know. But in this episode, we’re focused on the language versions. So these are the models that are trained on enormous volumes of text, usually online, and can do things like generate human-like language, answer questions, and so on. And the most successful versions of them as of now are based on the transformer mechanism, which I won’t detail here, but basically works by learning the statistics of which words appear near other words. So one way they generate text is to look at what text has been produced so far and continuously predict which word might be good to insert next, based on the previous words, something called next-word prediction. Ev Fedorenko is a neuroscientist who runs the EvLab at MIT, and she, among others, has been testing how these models compare to how our brains process and generate language.

Paul 00:02:41 And it turns out next-word prediction seems to account for a large part of our language abilities, something we discuss during the episode. Emily Bender is a professor in computational linguistics at the University of Washington. And recently she has been considering questions like whether the models understand the meaning of the text they produce. The answer is no. Whether we should be scaling up the models as is the current trend. The answer also is no. So we discuss these topics as well. Another thing that we discuss is the relation between language and thought. Ev has amassed a large volume of data showing that the language network in our brains is distinct from our other cognition-related networks. Ev argues that this and other data indicate that language is not for complex thinking, as people like Noam Chomsky have argued, but instead language is for efficient communication. So I brought Emily and Ev together to discuss these and other topics. And I really, uh, enjoyed our discussion. I recommend you dive deeper into their works, which I link to in the show notes at braininspired.co/podcast/144. On the website, you can also learn how to support this podcast through Patreon, if you find it of value, and/or check out a free short video series I made called Open Questions in AI and Neuroscience. Enjoy.
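
The next-word prediction objective Paul describes can be illustrated with a deliberately tiny sketch. The code below is not a transformer and is not how any model discussed in this episode is built: it only counts which word follows which in a made-up toy corpus and returns the most frequent continuation. The function names and the corpus are hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_counts(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        following[prev][nxt] += 1
    return following

def predict_next(following, prev_word):
    """Return the most frequent continuation seen after prev_word, or None if unseen."""
    counts = following.get(prev_word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Toy corpus (an assumption for illustration; real models train on vastly more text).
toy_corpus = "the cat sat on the mat . the cat ate . the dog sat on the rug ."
model = train_bigram_counts(toy_corpus)
print(predict_next(model, "the"))  # prints: cat
```

A transformer conditions on the whole preceding text through learned representations rather than on a single previous word, but the training signal has the same shape: given what came before, guess what comes next.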

Paul 00:04:03 Ev, Emily, thanks for being on the podcast, and welcome. Emily, have, have you had a chance to look at my logo for Brain Inspired?

Emily 00:04:12 Uh, no I haven’t.

Paul 00:04:13 Okay. Cuz I think that you hate, look it up. Yeah. Go look it up. Go to braininspired.co. I think that you would really hate my logo, uh, based on pre, previous things that you have, uh, discussed. <laugh>

Emily 00:04:23 Yeah. So, so this is a logo that is definitely, um, leaning into the computational metaphors, shall we say?

Paul 00:04:30 Yes, definitely. Definitely. Okay. Just wanted to get that outta the way. Um, roughly where do you guys situate language, uh, among our cognitive abilities? Is, is the lang, is language the pinnacle of our cognition, uh, as individuals? We can get into social aspects of it later, but just as a, as a cognitive entity, where, where is language? Maybe we can start with you, Ev.

Ev 00:04:56 Sure. Um, well the thing to keep in mind is that, um, I’m a cognitive scientist and, um, I use neuroscience tools to study cognitive science. So my perspective is tainted by the knowledge I have from that domain. And so, um, well, or the positive version of that <laugh>. Um, and, um, uh, in my work, uh, I’ve been very interested in the relationship between, uh, language and, uh, thought or complex reasoning abilities. And my initial prior was that language is at the very core of those abilities, and I was very sympathetic to that kind of a view, but it turns out that empirically, the way that humans are built, um, language rather reflects some of the complexities that we have in our thoughts rather than creating them. So in a sense, um, I would take the evidence to date as suggesting that language is just kind of one tool in our cognitive toolkit. Um, and a lot of complex thought can happen, uh, without access to language.

Paul 00:06:01 Emily, do you agree with that?

Emily 00:06:03 Um, well, so I, I wanna say, you know, on things cognitive science and neuroscience, I definitely defer to Ev. My, my work is in linguistics and computational linguistics. And so I have no basis on which to compare language to other cognitive abilities. Um, I think that what I can say as a linguist is that language is an interface that allows us to work together with other people and also to work with ourselves across time. And my guess is that that will have lots of really interesting cognitive implications, but that’s not really answering the question of how does it fit into some hierarchy of cognitive abilities. Um, cause I don’t have an answer to that question.

Paul 00:06:37 But, but can’t you do some armchair philosophy, which is what got us to the place, um, pre-Ev, where we, where we sort of assume that language is the pinnacle, and, and according to your prior as well? I mean, is that an area, Ev, where philosophy has led us astray perhaps?

Ev 00:06:55 I think so. <laugh> I think philosophy has led the field astray many, many times, and I think it’s because relying on introspection is dangerous. We have intuitions, and intuitions can be helpful for generating hypotheses, but ultimately you can’t build stories on that. It’s just a starting point. Um, and because some of the tools for studying the relationship among different cognitive abilities just haven’t been around, um, you know, well, different tools became available at different times, but, uh, without those tools, we just couldn’t empirically ask these questions. And because, um, I think because some people have the situation that they talk to themselves when they think, um, I think that has led to this whole notion of language as being kind of the core essence of thought and the inability to think without language being there. And that just seems to be empirically, just doesn’t seem to be true.

Paul 00:07:46 What about the people who vocally talk to themselves all the time, who vocalize their, their thoughts, right. And can’t seem to think without doing so. Yeah.

Ev 00:07:55 That’s my daughter and my husband. Well, it depends on what you mean by not able to do, to think without doing so. That’s, again, an empirical question, right. Um, I think, um, <laugh>, I think some of that has to do with, um, humans being highly social, although, um, I think some, some people like that even do it when they’re alone. Um, again, I’m not saying that language can’t be helpful in, uh, structuring certain thoughts or maybe extending the kind of window over which we operate in our thought mental space. Um, but, uh, like I said, the evidence that we have from both brain imaging and individuals with severe language problems suggests that a lot of stuff can happen. Very complex reasoning can happen without relying on linguistic representations.

Emily 00:08:43 So I think,

Ev 00:08:44 And there’s interesting story. Oh, go ahead.

Emily 00:08:45 Go ahead. So I’m detecting in your questions, um, some implicit hierarchy of kinds of thought and kinds of reasoning. So when you say people who can’t think without speaking <laugh>, um, that suggests that there’s certain kinds of, of the more general thing that we call thinking that you consider sort of real thinking, right, and other stuff that isn’t, and then, and a value proposition where some of it is the stuff that, that we tend to do through language. Um, so I’m thinking about the experience of when I go to write something and I’m, you know, out for a walk and I have a really great idea. And then I sit down to write it and it feels like, oh, I actually had sort of the first, second and sixth parts of this. Yeah. And the stuff in between, I really have to sit down and work out. And as, as Ev says, if I set it down in language and sort of fix it there, then I can focus on those other connections in between. And that can extend what I want to do. Um, but I would not wanna call that kind of work real thinking to the exclusion of other things.

Paul 00:09:41 I mean, do you think that people conflate thinking with language and, and just that they’re the same thing, you know, there’s the language of thought, uh, hypothesis of, of our cognition, et cetera. I mean, do you think that’s one of the mistakes that people make is just conflating our thoughts with language?

Emily 00:09:59 So I, I have a story for you from, um, when I was in college. My, my mom came to visit once and I was really proud to show her what I was doing. And I brought her along to a syntax class I was taking, and my mom’s a poet and a personal essayist and a teacher of creative writing. So she’s really a language person, but in a different way. And she walked outta that class terrified, because she said, you are using language to study language. You’re gonna go crazy. Oh, right. <laugh> Um, and I think that, you know, we’re fine. Right. Syntacticians actually do manage to hold it together. It’s okay. Um, but I do think that there’s a danger that because so much of this discourse is done in language, and especially in, in written language, that we conflate that with the thing that we’re studying. And sort of analogously, um, there’s this, all this stuff going on, which maybe we’ll get into later in, in the hour, about whether or not language models are sentient.

Paul 00:10:52 And we have to, okay.

Emily 00:10:55 I hope not. But the thing that I wanted to bring out of that is, um, people who are taken in by the language models seem to project an inner life onto the language model. Um, and some of that discourse is around, well, see, it’s doing language, so therefore it must have an inner life. And my reaction to that is language is one of the key ways in which we become aware of the inner lives of other people. But that doesn’t mean that the language itself is the inner life.

Ev 00:11:22 That’s exactly right. I completely agree with this. I think it’s not unreasonable, you know, we’re Bayesian reasoning beings, right. And it’s not crazy to think that something that produces coherent, well-formed language has thoughts, because in most of our experience that’s the case. But of course it’s a fallacy. Of course, we’ve learned that you can learn a lot of rich statistical patterns of how words go together, including pretty sophisticated aspects of syntax that some syntacticians had argued you just can’t learn from experience, no matter how much experience you have, but that does not imply that there is complex thought there. I mean, one way to, um, also think about it is, for example, humans vary a lot in their reasoning capacities. And oftentimes you have people who speak very eloquently or fluid, fluidly and fluently, but it doesn’t necessarily mean that <laugh> there’s a lot of, uh, kind of rich complex thought behind. We’ve seen it in politics a lot in the last years.

Paul 00:12:19 Do, do you guys, I meant to ask you this upfront, and this is a total aside, so I apologize. Do you enjoy courting controversy and getting, uh, pushback? Because one thing that you kind of share is, so, so Ev has spent a lot of time, um, collecting evidence for and arguing, uh, against the idea that language is necessary for complex thought. Um, so this goes, you know, against a lot of, uh, intuitions and background philosophy. And of course, Emily, you’re writing these critical papers on language models, um, ethically, and you know, the fact that they don’t understand what they’re producing. So I assume that you guys both get a lot of pushback. Is that enjoyable to you?

Emily 00:13:01 Not particularly <laugh>, I’m not in it for the pushback that’s for sure.

Ev 00:13:06 Yeah. Yeah. Same. Um, I’m just doing my thing. Like, I, I wanna figure out how things work. Like I’m an empiricist. And so I’m just trying to understand how the system works and I’m happy to argue with people, but I think, you know, a lot of the pushback comes, uh, is, you know, it’s not, uh, incidental that we’re females like that has, um, oh yeah, yeah. You seem surprised. Yes. <laugh> I just

Paul 00:13:32 Don’t, I don’t think I want to believe that, but yeah. Okay.

Ev 00:13:35 Oh, well, you may want to believe, I, I wanna believe all sorts of good things too, but you know, like my most salient kind of awareness of these differences came from traveling, uh, back when I was like a postdoc or young faculty, with, uh, my husband, who is also a language researcher, and we would often go and give talks together, you know, sequentially, um, because we’re both there, people would wanna hear different anyway. And he was just amazed at how different the tone is of people who talk to me, uh, compared to the way that they talk to him. And I just kind of hadn’t really given it much thought before. Of course, I’m also raised by a female academic, Nancy Kanwisher, and she was always warning me about this, mm-hmm <affirmative>, and, uh, yeah, there’s definitely a lot of, um, expectation of, you know, like we owe people something, to explain to them every little thing without them bothering to even read the papers and, uh, you know, uh <laugh> we don’t <laugh>.

Emily 00:14:28 Yeah, yeah. And, you know, I sometimes look wistfully at the other kind of work that I do and wish I had more time for it. So, you know, multilingual grammar engineering and computational semantics, um, and I find myself putting more and more time into this discourse around, um, what is it that language models can do? Um, because I see the hype and the overclaims as harmful, um, and some folks are concerned with sort of the harm to the field of AI. And I frankly couldn’t care less, like AI is not my problem. I’m not trying to build better AI. I think computer science as a whole and AI in particular is vastly overfunded right now, um, to the detriment of the science in that area and to the detriment of the science in surrounding fields. Um, but I see, I see harms to people, right. Um, in terms of when we, uh, believe false things about what these systems are doing, then we end up putting too much faith and too much autonomy into automated systems. And so, um, not doing it for the pushback, but I sort of feel like this is a, a place where my knowledge and expertise in linguistics allows me to do some good in the world. And so that’s my motivation.

Paul 00:15:38 So, um, we could really go off the rails here ethically, and, and it’s not an ethics podcast, but, um, maybe let’s get back to what I was gonna ask about regarding the meaning and, and our, as humans, that we sort of grasp for meaning and anthropomorphize, and, uh, want to think that we are communicating with something sentient, something, you know, that is producing meaning. And is that because we, so there’s this, uh, kind of an approach, the computer metaphor of the brain, right, is that we get input, uh, signals, we process that input, and then we produce actions. And then there’s this alternative viewpoint that really we’re producing actions primarily, uh, to receive different inputs, right? So it’s, instead of a perception-to-action, uh, notion, it’s an action-to-perception, uh, notion. And I’m wondering if, if it’s the same, if we buy into the action-to-perception notion, I’m wondering if, if our grasping for meaning in other humans, uh, our cats and dogs, large language models, if, if that is the same sort of thing, where we’re, so we yearn for meaning so much, and we build meaning out of, uh, things so much, if that is sort of an analog of the taking action aspect.

Paul 00:16:53 Does that make sense?

Ev 00:16:56 I’m just saying like, we’re like actively filling in the gaps when there are real gaps. I mean, we certainly are very good at filling in cognitively, right? Like we have a rich understanding of the world and if something is not there, we can easily fill it in by assigning mental states and so on. And again, because it’s, cuz I think the, a big fallacy is, um, that we often can and do. And in fact, it’s necessary to assign mental states to, uh, entities that produce coherent language, which is other humans. And to understand what somebody’s saying to us, it’s really critical to think about their intentions and the whole context in which they’re saying something. But, um, it, it seems, and it seems really hard to, for people to get that you can just learn the regularities of language and produce language, um, and not have all of the stuff that usually comes along with language as part of the human brain.

Ev 00:17:46 Now I think there’s kind of an interesting flip side of that, which is how much of this richness can you infer from patterns in language, because of course, language reflects this generative process of, you know, um, thinking, and we talk about things we think about and feel and so on. Um, and that’s an interesting question. Different fields have different kind of takes and approaches to thinking about this. I, there’s obviously a lot of information about the world that’s reflected in language, but, um, as, um, a lot of work, uh, suggests, some of that knowledge seems quite brittle, not always generalizable in the same ways as what humans have. And presumably it’s because, um, you know, along with the linguistic regularities, humans get access to the physical world, um, they interact with, um, other entities using that same communication code, as well as with the objects, and engage in the events. And so it’s, um, richer, um, notion of meaning that they get. But, but I think you can get quite a lot from just the way that words, um, go together. Like that definitely has a lot of structure.

Emily 00:18:53 Yeah. Yeah. Um, I think to go back to your question of, uh, is our looking for meaning analogous to taking an action so that we get a percept back. Um, so this is, I, I feel outta my depth. Like I, again, I want to defer to Ev when it’s questions of like, what’s going on, how do,

Paul 00:19:09 No, come on. You have to, you have to,

Emily 00:19:12 Well, um, so my, my, my sense is that, you know, looking at language and how it’s used, so linguistic pragmatics, and also what we know from the child language acquisition literature, um, we are, uh, very, very good at taking in linguistic clues together with everything else we have, mm-hmm <affirmative>, and then using that to make inferences about the communicative intent of the person who uttered the thing. Um, and we are, we do it so quickly and so automatically, um, and including all these processes of, for example, ambiguity resolution. So computers are very, very good at finding multiple different syntactic analyses of a string, if you’re actually like running a parser that’s got a grammar in it. Um, my favorite example is “have that report on my desk by Friday.” Seems like a completely straightforward, unambiguous thing. Any, any scenario that you can imagine for that sentence,

Emily 00:20:02 It’s, you know, no one’s gonna say, but wait, what did, what did she mean? 32 different parses for that, given a reasonable grammar of English. <laugh> All right. Um, because “have” could be “cause to be,” right, or it could be “go ahead and take,” right. Um, the report could be about the desk or it could be physically on the desk, mm-hmm <affirmative>. Um, “by Friday” could refer to a time or it could refer to the author of the report, and then all those things can combine together in different ways so that you get 32 different readings. But none of that, like it’s just, it goes right past us, because we are using so much information, um, sort of general world knowledge, cultural knowledge, what we know about the person who said the thing. And, uh, linguists sometimes get tripped up when we are asking speakers for grammatical judgments, because we forget how much of that world building goes into creating that judgment.

Emily 00:20:52 And so you’ll get people saying, oh no, no, no, nobody would ever say that. Um, because we haven’t sort of given the time to say, okay, now in this kind of a situation where these things are going on and you need to emphasize that, now, is it a natural thing to say? Um, and we sort of work with this false assumption of null context, which just never exists. So all that to say that our interpretation of language is very fast and very reflexive. Um, and we have the, the sensation or the intuition that we are getting all that information out of the language, when in fact what we’re doing is we are pairing a whole bunch of information with the language to make some inferences.

Paul 00:21:32 Well, I’m tempted to just go down the road of the, um, of Ev’s work, comparing, um, large language models with, uh, the predictive activity of that with our, our brains. But maybe before that, what, what does that say about large language models? Are they not producing language the way we think of language? Because do, do we need a different definition of what they’re doing or a different word for what they’re doing than language?

Emily 00:21:58 So I think we often, the discourse often does get tripped up, um, because language could be the set of forms, or it could be the form-meaning pairing, or it could be the linguistic system that puts those things together. Um, and so I’ve, I’ve started talking about large language models as text synthesizing machines. Okay. Um, which still isn’t great, because a text, especially if you’re talking about like, you know, humanistic scholarship, a text also is something that’s got a lot of meaning in it. But the, um, the point is that the, what the large language models are doing is coming out with strings, and those strings are conditioned on very carefully modeled understandings, I don’t wanna use that word, very carefully constructed models of the patterns of the strings. Um, and embedded in those models is a lot of information about which strings are like each other and which ones tend to co-occur and, and, um, constraints, uh, that look like syntax and constraints that look like lexical semantics. Um, but none of it reflects any communicative intent nor any, um, connection to the world outside the strings. Um, so no grounding, no social understanding and so on.

Paul 00:23:09 And yet one of the things that Ev has shown with her group is that when you compare the activity of large language models with that of brains, uh, there seems to be this next-word prediction that’s happening in our brains as well as, as it does in the large language models. And this is kind of in the tradition now of the visual system in our brains, uh, having been compared to the, um, like convolutional neural networks. And we’ve talked a lot about that on the podcast. But Ev, how do, what, what does that mean? Uh, are we just missing, are, are we just grasping that one tiny aspect of language, next-word prediction? Uh, and, and so we need to find the other aspects, the grounding, the context, uh, what, what does it mean?

Ev 00:23:55 Yes, that’s, that’s a great question. So yes, that’s exactly, I think, how I think about it, except I don’t think that prediction is a tiny part <laugh>. So, um, we have a set of <laugh>, um, regions in our brain that respond very selectively to language. And, um, my group and other groups have previously shown that, um, these, uh, responses are sensitive to how predictable upcoming words are. So there is extra cost if words are unexpected, and that has been shown behaviorally as well in many different paradigms. So, um, uh, so there’s the system and, um, uh, as you pointed out, it indeed seems to be the case that if you take representations from modern language models, um, like the transformer architecture models, uh, and you build a model, a linear mapping model, between those representations and the neural representations extracted from that system, this language-selective system in the human brain, um, there is a good relationship.

Ev 00:24:51 You can learn a relationship so that you can then predict neural responses to some unseen stimuli. And I think it’s, it’s really intriguing. Um, and, um, I, it was very surprising to me how well that worked, but it also, um, kind of in my <laugh> kind of thinking about language and cognition, it came around the time where all of this stuff about separability of language and complex reasoning was becoming clearer and clearer. Um, and then if you think about it through that lens, then perhaps it’s not so surprising that we have this system that is a system that has some, that has learned some, um, mappings between forms and some rough approximation of meaning that could then be passed down to systems that actually do, uh, complex reasoning on those representations, be it social reasoning or abstract reasoning, like, and logic and math and, uh, whatever else.

Ev 00:25:46 And so, um, so yeah, I think looking at, um, uh, neural responses in this language system is, um, capturing this one aspect of, um, uh, linguistic regularities, which of course, but the reason that I don’t think it’s a tiny part of language is, of course, to, to succeed at predicting the next word, you have to learn to pay attention to all sorts of stuff in this signal, right? To how particular words go together, to some more abstract syntactic patterns, and learning all that as a kid is a nontrivial task, learning that as a model is a nontrivial task. But, and of course, you know, models and kids learn differently, presumably, and I think there’s a lot of interesting work to do to try to figure out how those differences may impact resulting representations and so on. But that said, it seems like we have this machine in our brain that does that, um, stores all of these knowledge representations that we acquire over our lifetime and then uses them to predict, um, upcoming, um, like how linguistic signals unfold, which presumably we do because it’s facilitative, it basically spreads workload over time better. So we don’t have to work as hard when things actually happen.
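
The “linear mapping model” Ev describes, fit on some stimuli and then used to predict responses to unseen ones, can be sketched roughly as follows. Everything here is an assumption for illustration: the data are synthetic random numbers standing in for model representations and neural responses, ridge regression is just one common choice of linear map, and the function names, array sizes, and `alpha` value are hypothetical. The transcript does not specify the actual analysis pipeline.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: solve (X^T X + alpha I) W = X^T Y for W."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def heldout_correlation(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit on training stimuli; return Pearson r per response channel on held-out stimuli."""
    W = fit_ridge(X_train, Y_train, alpha)
    Y_pred = X_test @ W
    yp = Y_pred - Y_pred.mean(axis=0)  # center predictions per channel
    yt = Y_test - Y_test.mean(axis=0)  # center observed responses per channel
    return (yp * yt).sum(axis=0) / (
        np.linalg.norm(yp, axis=0) * np.linalg.norm(yt, axis=0)
    )

# Synthetic stand-ins (assumptions): 200 "sentences", 20 model dimensions, 5 response channels.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))                    # stand-in for language-model representations
true_W = rng.standard_normal((20, 5))                 # hidden linear relationship, by construction
Y = X @ true_W + 0.5 * rng.standard_normal((200, 5))  # stand-in for noisy neural responses

# Train on the first 150 stimuli, evaluate on the 50 the mapping never saw.
r = heldout_correlation(X[:150], Y[:150], X[150:], Y[150:])
print(r.round(2))
```

Because the synthetic responses are linear in the inputs by construction, the held-out correlations come out high. With real data, the interesting quantity is how far the held-out fit falls short of that, and, as Ev notes later, a good fit between two sets of representations does not by itself show that the two systems work the same way.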

Paul 00:26:51 Emily, how did, does that sit well with you that a, a huge part of our language faculty is next word prediction, or just predicting upcoming words.

Emily 00:27:00 Oh, so I, I, um, first of all, again, you know, defer to the empirical work, right? So, so, you know, my work on language looks at the level of, you know, what can we say about the system? How can we model grammaticality judgments? And I like to think that that, that kind of work digging into syntax can then help inform the kind of studies, um, that, that Ev and her team and other researchers are doing. Um, so, but in, in terms of, you know, what are we finding that the brain is doing? It doesn’t matter what I want. Right. <laugh> It matters, you know, is, is there good empirical work going on? Um, but also, I, I think that there’s a difference between saying yes, humans have, um, you know, this facility where we predict what’s coming next. And, and, you know, I, I appreciate the explanation of, um, that allows us to smooth out the workload, um, by sort of trying to do some of that computation ahead of time.

Emily 00:27:53 It sounds like, um, that is very different to saying, and so therefore, a system trained with the task of doing next-word prediction, um, is getting at the heart of language, right. Is, is getting, and, and therefore, you know, understands what things mean, et cetera, et cetera. So, um, it’s interesting. Um, and, um, I think it’s a, it’s an interesting way to use the large language models, although with the huge caveat that the, um, nature and scale of the training data is so different between, you know, what something like, you know, even BERT is exposed to and, and what a human child is exposed to. Um, so like that, the analogy there, I think, breaks down a little bit. Um, but using the language models to sort of model that aspect of what humans do with language is interesting for sure. Um, and I noticed as Ev was talking that she was being very careful to reserve the word neural for things happening in actual brains. So while we’re talking about, can we have a different word please? Oh, yeah. And it would be nice to reserve neural for things that are actually neural rather than metaphorically neural.

Paul 00:28:58 That, that, that idea has flown. It’s, I, I think that there’s no going back. Is there, is there, Ev? There’s no going back, right?

Ev 00:29:05 I don’t know. I dunno. I mean, like as long as people define their terms. Yeah. I think terminology is very hard to change. I’ve fought some of those battles and I usually tend to give up eventually. I’ll just keep using them the way that they make sense to me and, um, ask that other people define what they’re talking about. But yeah. Um, but I think like, I mean, one other thing to say about this potentially like at least some similarity between the representations in the models and the human neural responses is that it’s for the first time, uh, and like, I, I always say this, I did not think this would happen in our lifetime, it’s for the first time that I think we can go beyond kind of, of these verbal, descriptive hypotheses about how things happen and maybe have like an actual implemented model of at least some aspects of how language might work.

Ev 00:29:51 It’s not, of course, finding a similarity between two sets of representations doesn’t mean that they’re doing things in the same way, but it’s a window, right? It could allow you then to try to manipulate the model architectures, the training objectives, the training, the learning algorithms, and try to see which of these things affect how well those representations, resulting representations, can capture human neural responses. And I think we can learn a lot through that kind of careful experimentation on the models now. Um, just because in and of itself, the fact that some big set of parameters provides some fit to human neural data, like that’s not the end point. Again, I see this, like, it’s a potential window to actually get beyond saying like, oh, this bit of the brain does syntax or whatever, uh, the field has been doing for the last few decades.

Paul 00:30:37 Emily was mentioning, um, that the language models learn differently. They learn from different data and presumably differently than humans learn. Does it matter how we learn language? So Ev, I know that you’re multilingual from a young age, right?

Ev 00:30:53 I was multilingual before I came to the US. But yeah, I mean, I’m still kind of bilingual, I guess. Russian is my native language. Yeah. But I used to speak a few others that I for

Paul 00:31:04 Here’s another aside before we continue, because I, uh, I wanna get this right. Is it true that fMRI studies have shown that people who learn, um, multiple languages from a young age, uh, when you, when they’re processing and producing those different languages, the representations are more clustered and overlapping than people who learn language at a later age where, uh, their native language is like, there’s this kind of a central cluster. And then the, uh, newly learned languages, uh, are more active kind of outside that cluster?

Ev 00:31:40 Um, we don’t see that. Oh, we basically, it seems like once you have, um, good proficiency in a language, it all loads on the same set of frontotemporal regions. Now of course, different languages have to be segregated within that system. Otherwise we’ll be confused all the time, right? So at some fine-grained level of multivariate responses, you can discriminate, you know, French from English or whatever when you’re producing it or understanding it. But if you achieve good proficiency, uh, in a language, it’ll all be in that same system for you, even if you learned it later in life.

Paul 00:32:10 So thinking about how to, okay. The difference between language and complex thoughts, and there’s an interface, right? So whatever language you’re using, um, needs to be passed on to your working memory system or your reasoning abilities, and vice versa to produce the language. Is fMRI going to, um, allow us to see that, uh, that interface, that exchange? Because it seems like that interface is perhaps one of the more important things to understand how we utilize language.

Ev 00:32:40 Yeah. It’s a great question. I mean, um, so, so like one, first a clarification. So like, when you say there’s like working memory computations that you need to hold some chunks of language as you’re producing, whatever, all of that. Um, so, okay. One step back. Uh, people used to think that we have a set of language brain areas that do some aspects of language. And then we have some perhaps central hubs, like for example, a working memory, like working memory integration, information integration hub, prediction hub, and then different domains implemented in different parts of the brain all draw on these centralized hubs. Now it doesn’t seem anymore like that’s how things work. So the kinds of computations that, uh, support language processing, which includes prediction, but also things like integration as many theories of syntactic complexity have Ted for many years, all the working-memory-based stories.

Ev 00:33:34 It’s kind of the opposite, the flip side of prediction, right? So there’s a cost in, um, thinking of what might come next, but then there’s also a cost in integrating elements into the, um, representation you’re building anyway. But all of these representations seem to be implemented locally within this, uh, language-specialized system. Um, and I think the same is likely true for other domains like music and, um, things like that. Um, but whether we can study the interface between language and actual kind of thinking and reasoning capacities, I’m dying to understand how that works. I think that’s the most intriguing thing to, um, uh, tackle next. And we’re trying to do some of this, like, what are the representations that the language system passes down to these areas that then, for example, reason about, you know, social relationships among people or physical constraints in the world, or whatever other, you know, abstract, logical kind of connections, uh, between things.

Ev 00:34:32 Um, and we don’t have great tools, um, for studying those kinds of questions. Um, the different tools that are available have, uh, uh, limitations. But, um, I think we’re trying to figure out if we can get really clever with tools like fMRI, which is, you kind of need whole-brain coverage because you wanna be recording from multiple systems at once. And most intracranial recordings, which you can do in humans, have the limit of very, very sparsely sampling the brain. And it’s very rare that you would have recordings from the language system and from some downstream, for example, abstract, uh, reasoning, um, uh, set of areas. And so we’re trying to see how far we can push fMRI to get at this. And I don’t know yet. Well, um, I feel like in the next decade, we’ll have a better sense.

Paul 00:35:18 Well, I know one of the things, uh, Ev, that you advocate for is studying animal communication as a proxy for studying language in humans. Um, and Emily, I don’t know how familiar you are with lots of animal cognition and communication, but I guess the question would be, can we really do that? I thought language was something special about humans and that other animals, be they non-human primates down to, you know, organisms like bacteria, can communicate, but not at the, but there’s a clear distinction between language and other animal communication. Uh, is language specific to humans? And, you know, is it a viable, do you think, Emily, it’s a viable option to study communication in non-human animals to understand something about human language?

Emily 00:36:06 So I think if we’re gonna, if we’re gonna go to communication in non-human animals, we are adding a layer of complexity, but also giving ourselves some distance that might be helpful. So, um, if you look at, um, say, you know, dolphins or whales where we don’t really know what’s going on, there seems to be some interesting complexity there. They certainly, um, you know, uh, whales. So I’m thinking here in, in Washington state, we have the Southern Resident orcas who are severely endangered, and every year there’s news reports about, you know, the new births in J pod and K pod, and, and, um, it’s, you know, there’s a specific set of individuals and they are in, in specific groups. And so I think that, um, you know, someone studying that may well have evidence that there’s social structure and communication. And so that gives us some distance. Of course, there’s the extra distance of environmental differences.

Emily 00:36:56 It’s like difficult to study marine mammals ’cause they don’t share their environment. Um, but we also then are working with something where we don’t know the code, right? We don’t really know, um, what’s in, in their communication system. And so that’s both sort of a, a further, um, difficulty and some possibly beneficial distance that might help depending on what kind of question you’re looking at. As to the question of whether language is special to humans, um, there’s some pretty foundational work by Charles Hockett, looking at design features of language to say, what does something have to be before we call it a language? Um, and whenever you’re doing that kind of definitional work, I think it’s worth keeping an eye on why you’re doing the definitional work. So if we’re saying this is what linguistics studies, linguistics studies languages, so things that have these properties are languages and they are in scope.

Emily 00:37:40 That’s just descriptive. If you’re saying humans are better than animals because we have language or humans are more sophisticated or more something, then it’s a more value-laden and different kind of a question. Um, so I think that it’s, it’s worthwhile to define our terms, as we were saying before, and talk about, you know, what, what is this thing that we’re calling language? What are the properties that we care about and why do we care about them? And if we’re looking at it from a neuroscience perspective, um, it could be, well, we care about how this system works in a human brain. And if we want to look to animal models, then we need to establish that it is analogous enough. And so that’s what we’re creating the properties.

Paul 00:38:19 Are you on board with that?

Ev 00:38:19 Just maybe like to add? Yeah, yeah, yeah. I think largely very much so. I mean, I, I think one important, um, thing that I always say is that, you know, there’s a lot of, um, um, continuity in biology and I’m just opposed to people making categorical transitions when there’s no evidence for such transitions. So I would be very surprised if our system is in some, uh, qualitative way different. Um, and I think that, um, uh, desire to postulate the qualitative difference comes from historically the focus on syntax, uh, which is a big component of language, but words are also important, and humans, unlike many other animals, can store a vast number of communicative signals. And I think that’s not to be underestimated. I think that’s, um, similarly important. Once you start thinking about word meanings and the complexity of communicative signals, then it becomes much more likely to be a continuum as opposed to some fundamental new circuit that we have evolved or some new brain region, some new way to process information, um, for which, you know, that may well be true, but so far, um, evidence for homologies is, uh, overwhelming and evidence for like fundamentally new kinds of computations that human brains can do that other animals can’t do is quite sparse.

Ev 00:39:35 But there’s a lot of exciting work, both in terms of like understanding the actual biology of human neurons compared to other neurons. Yeah. Uh, and there’s some interesting differences. So, um, I’m excited about the kinds of things we can learn about those potential differences, um, again, in the coming years.

Paul 00:39:49 So, so you don’t see a qualitative difference between, okay, so someone like Terrence Deacon, um, would argue that there’s a qualitative difference between the symbol-like structures that humans use for language and, and that constitutes language versus what he would call indexical and other kinds of referential signals that are used in the animal, uh, kingdom, whether or not there’s different neural architecture underlying it. Uh, you don’t buy that. It, there’s not a qualitative difference between the ability to use symbols, that essentially we’re a symbolic species and can pass these things down. I can write the word tangerine and you can understand it in four years, right? And that it’s somehow detached from the immediacy of the environment. Um, so you don’t buy that. There’s a qualitative, well,

Ev 00:40:37 I mean, <laugh> qualitative differences call for very strong evidence. And there is a very, very large field of animal communication that has shown that many features of human communication are also present. And yes, they don’t have writing systems. Okay. <laugh> So it’s harder to pass things down across many generations, but, um, animals communicate about things that are not, if you look at different species, you find evidence of communication about things that are not here and now necessarily. Um, and, um, again, like one thing, um, that I think is important that often gets kind of all lumped together is yes, humans are smarter and humans have language. It doesn’t mean that humans are smarter because they have language. And in both of these things, kind of the communication system we have and the thinking capacity we have, there may well be continuity as opposed to some fundamentally kind of different processes happening. But I think understanding in which ways we’re smarter, setting language aside, can also be a very fruitful thing, to try to crystallize some of these differences without bringing language into the, um, bucket, which sometimes just leads to muddled reasoning, because it’s very hard for us to think about thought without using language. And then it just all gets lumped together and leads to like a lot of literature.

Paul 00:41:55 Okay, guys. So let’s talk about large language models a little bit more. Um, what I, what I, okay, so Emily, this is not a knock against linguistics or anything, but one question is, do we understand language well enough to build, you know, useful large language models? Do we need to understand language? And I know a lot of what you do is apply your linguistic knowledge to kind of critically evaluate large language models. But do, do we understand language enough and do we need to understand it enough?

Emily 00:42:26 Yeah. Uh, so, so yes, I think, um, that we do need to understand language in order to build good language technology. Um, it’s not clear to me that large language models are something that the world needs at all, certainly not larger and larger ones. Um, that’s not a, um, like phenomenon in the world that, you know, a mountain that we climbed because it was there kind of a thing, right. We, we created the mountain as we were climbing it. And, um, without I think a whole lot of, of purpose in mind, aside from some very fuzzy thinking about AGI, which I think is, is way off the rails, um, in terms of building good language technology. Yes. I mean, we definitely need to, um, look into what we know about language and the more we know about language, the better we can build the language technology.

Emily 00:43:10 And, and this includes things like if we want to build language technology that works well across different languages, then looking into linguistic typology, which is the study of how languages are similar and different, can help us do that better. Um, if we want to build language technology that is well situated within its deployment context and not, um, discriminating against people because they speak differently or reproducing the discrimination that’s in language, then, um, sociolinguistics is a really rich and useful, um, starting point, right? And I’m not saying that typology or sociolinguistics are, you know, finished areas of study. Is there anything that’s a finished area of study? I think the, the areas of study that we’ve abandoned are all things where it turned out to be, you know, pseudoscience, um, and anything that, that had something real at its core, there’s always more to do. Right.

Emily 00:43:55 Mm-hmm <affirmative> So, yes, we could always learn more about language. Um, but what we are not as a field doing in NLP is getting the full advantage of what is known from linguistics. Um, and there’s, it’s a little bit frustrating to me. Sometimes people will write off linguistics because they took one formal syntax class, and by formal, I mean, formalist, I mean like minimalist program, and said, oh, this isn’t useful. And then decided that that was all of linguistics. Um, and that’s partially on linguistics because, um, within the field, especially in the US, there’s this culture of sort of putting syntax, especially that kind of syntax, on the top of the heap and sort of putting that forward as the pinnacle of linguistics. And so if we do do that and people from the outside come and look at what we’ve done and go, well, not useful, and miss all of the other stuff, um, we could be doing better on the linguistics side. Um, but I also, you know, will continue telling the language technology people to please keep paying attention to what you can get out of linguistics.

Paul 00:44:46 And, and do they, is there, is there a lot of, uh, hesitancy to listen to linguists?

Emily 00:44:51 Yes. Um, and I think that hesitancy comes from a couple of places. So one is, sometimes people do go and they encounter something that is esoteric and not helpful. Um, and I’ve tried to work against that by, you know, writing a couple of books saying here’s 100 quick vignettes about, the first one was morphology and syntax, and the second one was semantics and pragmatics. Um, and those came, um, initially out of frustration, um, as a reviewer for NLP conferences, um, around 2010, 2011, going, my God, these people don’t know the first thing about how language works. Right. <laugh> Um, so, okay, well, it’s not reasonable to ask people to go do a whole second degree. So what can I do? Well, here’s 100 things about how language works, and then I did 100 more. Um, but also I think that there’s something cultural in computer science, which is to say, um, especially machine learning, the whole point is to build the system that learns the things so you don’t have to. Mm-hmm <affirmative> And that leads to a direct devaluing of domain knowledge, and linguistics is one important domain for domain knowledge for NLP, um, that gets devalued and ignored. And, um, so I’ve put a lot of effort into trying to counteract that.

Paul 00:45:53 Ev, what do you, what do you think language models need? What, what would you want to see in, besides scaling up? Um, I, I mean, I know it’s a problem modeling, um,

Ev 00:46:01 I don’t know. I never said they need to be scaled up. No, no,

Paul 00:46:05 No, no, I didn’t. Don’t that’s

Ev 00:46:06 Necessarily good strategy, but yet,

Paul 00:46:08 No, I mean, that would be the, um, that’s sort of the default program, right, is to scale up, but, but even, you know, looking at, um, comparing our visual activity to convolutional neural networks, mm-hmm, <affirmative> the larger ones don’t perform as well, because they’re not mapped on as well to the structure of our brain. So I’m, I’m just saying that, you know, scaling up is not good for neuroscience. Like it’s not gonna buy you anything. Uh, but is it, you know, is there something that, that would, you think Emily can help you?

Ev 00:46:35 I think so. I mean, <laugh>, I, I think, um, we have a good system, a good general intelligence system that still a lot remains to be understood about, which is the primate brain. And, uh, um, I think, um, the, when people started seeing some of the successes of the language models, again, conflating all sorts of being able to capture linguistic regularities versus abstract generalizable world knowledge. Um, then, well, it also depends on the goals, right? Like if you wanna build a system that can solve problems for you, um, then sure. Maybe scaling it up and seeing how far you can push it is, is a reasonable thing to do. Although of course it’s not environmentally responsible and comes with a lot of, you know, its own issues, but, um, but if you wanna build a system, um, that can think, then, um, it just seems a little bit misguided potentially to just try to think that language will just give you that.

Ev 00:47:34 And again, I think the idea that it can comes from the fact that a lot of people think that language is what made us smart. Um, and we have a very nice, uh, somewhat modular, and I’m not bringing any of like the Fodorian baggage where, like, some things are encapsulated or anything like that, but it does seem there is division of labor in our brain. And presumably it’s because it’s metabolically and computationally efficient to build a system like that. And so perhaps instead of trying to make the language models larger and larger and train them on more and more data, you can take existing models, which actually do language really quite well. <laugh> In many ways they capture many regularities really pretty well. Um, and you can use linguistic tools, like Emily was saying, to probe the knowledge rep, see if the way they represent language is similar to how humans represent linguistic regularities.

Ev 00:48:21 But I think we have a working decent language module-ish, and then we can try to see, okay, how would we build a system that reasons about different aspects of the world? How would we build a system that does math, build a system that can, you know, interact with computers using like computer code, a system that, um, implements social reasoning, right? So this is kind of going to, um, this notion of having a bunch of distinct capacities. Some of them are relevant to particular domains. Some of them are abstract reasoning capacities, like the kinds of things that, um, are linked to fluid intelligence, just abstract reasoning, novel problem solving and so on, and then maybe try to combine those different solutions together, uh, and try to build interfaces. And, um, maybe that’s a better way to try to build a general intelligence system. Uh, and maybe it can be much more computationally efficient than, uh, trying to get, uh, some of this, again, like some of this, you may be able to get through language, but it’s just not necessarily the best way to design a system, I think, given what we know from human brains,

Emily 00:49:22 Right. So I think we’ve maybe found a point of disagreement finally, Ev, which is, um, yes, I don’t, I don’t think the language models are a nice language module in a system like that, because I think that they, they capture some things about language, but they don’t capture enough. Um, and so that’s one point of disagreement. And the other point, maybe this isn’t, um, disagreement with you, but disagreement with others that you were sort of, uh, modeling as you were speaking, um, is I don’t see the value of building a general intelligence. I think that we would, um, be much better served by building specific tools to help people do things. And, um, there are lots of places where building language processing tools could be extremely useful. Um, some of my favorite examples are, um, matching patients to clinical trials, um, helping comb through case law to find precedent. Um, automatic transcription is wildly, you know, useful, likewise machine translation. Um, so thinking about it as here are tools rather than running away with, um, okay, now we’ve got a language thingy, let’s build a general intelligence around that and then have it be a general purpose tool. I think we’re going to have a real hard time creating something that’s fit for purpose if we go that direction. Um,

Ev 00:50:33 No, that’s a great point. Um, yeah, that, that’s a great point. I mean, again, like I don’t particularly, I’m not an engineer, I don’t need to build tools. Like I see how some tools can be useful, but for me building something like a generalized intelligence system is another tool for probing the human brain. Like if we can build something, taking inspiration from the human brain, then maybe we can ask questions that we just can’t ask about how human brains work, because we lack the tools. But if we have a model that captures something about human neural responses, then we can try to understand, for example, how the language system passes information down to say the abstract, um, logical reasoning engine or social engine.

Emily 00:51:12 Yeah. And, and I have no objections to, to building scientific models. Um, but most of the people who are talking about building these things are not, are not doing it with that kind of a motivation. That’s

Ev 00:51:20 True. That’s true. Well, we at MIT do, <laugh> not only at MIT, of course, but a lot of people who I interact with are of course interested in fundamentally this interaction between the fields with a big goal of understanding how humans work.

Emily 00:51:36 Yeah. I want to live in your world. That’s true.

Ev 00:51:40 Come on over. Yeah.

Paul 00:51:46 Are, um, going, going maybe the, the other direction, um, instead of using language models, uh, as proxies, you know, for brain activity, are large language models teaching us anything about language? So linguistics can inform large language models, but are, are we getting anything in return, whether it’s, you know, our limitations or, um, what we’re particularly good at that they’re not? Are we learning anything about our own cognition through these language models?

Ev 00:52:18 I’d be very curious to hear what Emily asked about that.

Emily 00:52:21 Um, so I think there are linguistic questions around, and, and, and I’ve, um, mentioned this earlier, around what can be learned from, um, the phrase you used, that was experience. Um, and there, there are linguists who want to posit, um, innate knowledge of language and posit it on the basis of saying, see, people know these patterns that they can’t possibly have learned just from observing the data. Um, and I guess there’s a couple things I wanna say about that. The first is that, um, our experience with language is very, very different from, um, the input that a language model is getting about language, because our, our experience with language is always embedded in, you know, a physical and social experience. And even if part of what we’re doing is learning to predict the next word, um, it’s not that we’re just sitting there receiving strings of words, right.

Emily 00:53:08 Which is roughly what the, the language model gets to do. Um, so that’s, that’s sort of one direction. The, it is interesting, I think, to say, hey, look, these patterns of grammaticality that, um, you were using to posit, um, innate knowledge of language actually are calculable just from data, if you have enough data. Um, and then, um, I suppose if you wanna get into the nitty gritty of that argument, it becomes, okay, well, what’s an appropriate amount of data? Like how much do you need before you say this is, um, something that, that a child learning a language could reasonably have been expected to be exposed to? Um, so I think that there’s, there’s questions in that direction that are interesting. Um, I mean, I’m certainly for using computational models to do linguistics. Like that’s what I’m doing with grammar engineering. It’s a different approach. It’s basically saying let’s actually push the rule-based idea.

Emily 00:53:57 Um, but instead of having those rules be pen and paper, let’s actually write them down on a computer. So we can then test them over large data sets and find the phenomena that our rules don’t yet account for. And that allows us to move on to the next thing and the next thing. Um, and sort of in that same spirit, I could imagine using a large language model. Um, I mean, I wouldn’t build it for this purpose. Um, but given that they exist already and it doesn’t, you know, take very much more electricity to, to run them a little bit or to probe them, um, it could be interesting to say, okay, you know, what are the things that, um, can be picked up in this paradigm versus can’t? Um, and so I’m thinking of the work of people like Allyson Ettinger and Ellie Pavlick, who do really interesting work on sort of probing and trying to understand what kinds of things can be picked up from this observation of lots and lots of linguistic form, um, versus what seems to require more than that. And so I think that there is, there’s interesting studies that can be done, and I’m glad that people are getting some use out of them that way. Um, but like I said, I wouldn’t have gone to build a large language model just for that purpose. I think you can probably get at those questions otherwise, too.

Paul 00:54:58 Do we need, uh, so Emily, you, you’ve famously, you know, we’re not gonna go down the list of famous papers that you’ve written. Um, but, but thinking about the “Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data,” where you argue that large language models don’t understand, um, language, don’t understand the meaning of what they’re saying, uh, is meaning required for a language? And I guess a, a second question, which is maybe orthogonal, but does it, does it matter how we learn language?

Emily 00:55:32 Okay. And I’m glad you came back to that question because you, you asked it before and then we didn’t get to it. So is meaning required for language? Um, I think yes. The, the, the operational definition of language that I work with, and I’d be interested to hear if, if this fits in Ev’s, um, linguistic or language-related work as well, um, is that languages are symbolic systems. Um, they are systems that allow us to pair form and meaning in this open-ended way, sort of recombining large, but discrete sets of, um, the basic symbols into larger symbols. And also from the point of view of, of why we might build language technology, it’s about communication, right? And communication isn’t just passing strings back and forth. It’s about using the strings as clues that allow the, um, the interlocutor to reconstruct communicative intent or a good hypothesis about communicative intent.

Emily 00:56:24 Um, and so that’s, that’s where I see meaning being really key to language. Now, does it matter if we learn it the same or learn it differently? Um, so who are “we”? Right. So, um, I think there’s probably really interesting questions about what’s the range of human variability in how languages are learned. Um, how does that interact with cultural practices, um, about how, uh, children are spoken to, how does that interact with modality, and, uh, you know, always looking at that with a sort of expansive sense of normal and not, um, there’s better or worse ways of doing it. That’s sort of one of the, one of the pitfalls of that kind of research, is it’s often the, um, the questions can be asked descriptively and inclusively, or they can be asked, um, in a very discriminatory way. Um, or, or does “we” include in that concept machines?

Emily 00:57:12 Right. Um, and, uh, does it matter whether machines are learning language, inasmuch as they’re learning it differently from humans? Um, I would say yes, in two ways. One is, um, to the extent that we’re claiming that the machines are a model that we’re gonna use to study humans, then we need to be very clear about what the similarities and differences are, ’cause that gives us the limits of the model. Um, and then secondly, if we’re gonna be building technology that people are using, the way in which the system was learned might put some, some limits or tell us something about the resulting system, um, that we need to know about to have safe deployment, and where safety there includes sufficient transparency for the user that they know what’s going on and, and where this text that they’re encountering is coming from.

Paul 00:57:56 Agreed. Uh <laugh>

Ev 00:57:59 <laugh> uh, yeah, largely I, I think for, for the most, um, for the most part. I mean, I think, um, again, I’m really, um, excited. I have this renewed excitement about using these models to try to understand something about how humans learn. And in fact, I just, um, have a postdoc who started yesterday, Chengxu Zhuang, who comes from Dan Yamins’s group. Oh, uh, who is interested in exactly this question of how do you need to train a model differently on linguistic input, including cross-modal, um, uh, data or different learning algorithms, uh, which are more, uh, human childlike, or a different nature and amounts of input, and seeing whether models trained in these more likely ways, more similar ways to human children, capture something about responses in developing brains to language. And I think that’s a really cool and exciting enterprise. Um, and, uh, I’m optimistic.

Ev 00:58:56 I mean, I <laugh>, I’m a very strong optimist, so I’m optimistic that we can learn something that has been hard to learn with just having access to, uh, human neural data and being limited to these kinds of verbal hypotheses about how things happen. Um, in terms of symbolic versus non-symbolic, it’s a, it’s a hard, um, uh, it’s a, it’s a hard, hard issue. Uh, I don’t think we have great proposals for how symbols can be instantiated in neurons, um, which is not to say that that’s not how it is, but, um, it would be great to try to, uh, move a little bit more in that direction. And I think some of our thinking certainly is, uh, symbolic. And how much of that characterizes the language system? I think it’s, uh, <laugh>, I think there, I think I probably am less, less strongly on the symbolic side, purely symbolic side, than you, Emily, which is fine. Right. We can disagree <laugh>, that’s okay. Yeah. Yeah. Absolutely. But

Paul 01:00:04 What about our bodies? I mean, so there’s this grounding issue, right? That language needs to be grounded in the real world for meaning to, um, adhere. Do we, you know, thinking about how we learn language, is our biology important? Are our bodies, the embodiment that we have, is that important for language? But, you know, so Ev, like, you know, you’re gonna be training these different models, giving them different kinds of input, but they still don’t have, um, <laugh> bodies. They’re not grounded in the world, right, so to speak.

Ev 01:00:37 It’s a very good question. I mean, I think, um, the literature to pay attention to here, um, concerns evidence from, um, individuals with very different developmental experiences. There are individuals who are born without limbs, or individuals who are born blind, or individuals who are born deaf or have other differences in how they experience the world. Um, and, uh, one thing that we’ve learned from some of these studies is that, um, a lot of these perceptual and motor experiences just don’t seem to be critically needed to learn really sophisticated, uh, models of the world, to acquire sophisticated models of the world. So, so, you know, a striking example: like, congenitally blind individuals’ knowledge of the color space, or really visual concepts like glance versus glare and things like that, are very similar to those who, oh, to individuals, um, who have had access to visual input growing up.

Ev 01:01:33 And I think what this tells you is that a lot of that information is redundantly represented in the regularities in language. And I think there’s interesting questions you can ask given that, but, um, whether a model needs to interact, uh, perceptually and in motor ways, uh, with its environment, I think it’s an empirical question. Um, and I think will, you know, and people are trying to build embodied language systems. Um, and again, I’m less excited about building it for the goal of building it. Uh, like, I’m interested in how will the representations of linguistic meanings, for example, be different for those kinds of models compared to these disembodied, um, text synthesizers, as Emily called it.

Emily 01:02:19 I think there’s a lot to be really careful about here when we talk about the experiences of people with developmental differences, um, and, you know, congenitally blind folks and so on, um, because there’s a, there’s a move that the people who are interested in promoting the language models as minds, rather than as text-synthesizing machines, make, where they draw an analogy between language models and, um, the experiences of, of deafblind people, for example. Um, and, uh, it ends up just always coming across as terribly dehumanizing. And, and I mean, because it is, like, that analogy is inherently dehumanizing. Um, and I’m in the middle of an ongoing Twitter argument about it again <laugh>, because people keep doing it. Um, but I, I think that it’s, it’s not that any given aspect of our physicality, of our sensory apparatus, um, is inherently required.

Emily 01:03:10 Um, but rather, I think when, when you said redundantly encoded, I was actually thinking you were, I was making a prediction about you going a different direction based on, on my thoughts. Um, mm-hmm <affirmative>, uh, which is that, uh, our experience of the world is redundantly encoded, right? We have many different ways of experiencing the world, and we have also many different ways of, of performing intersubjectivity and sharing experiences of the world. Um, and I think that that’s what gives us the toehold in learning the linguistic system. And then once we’ve acquired a linguistic system, we can use it, um, like you say, to get at information that’s redundantly encoded in language, and it’s not just, um, distributional. So people will often say, you know, um, the, the distributional hypothesis is that meaning is use, right? Wittgenstein says meaning is use. And my retort to that always is: right, but use isn’t just distribution, right?

Emily 01:03:58 Use is use in some specific communicative context, which is embodied. And that’s probably important, but I don’t think any particular aspect of the embodiment is. Um, there’s facts of embodiment, but, um, there’s a lot of redundancy there. And so I think that, um, what’s, what’s missing is not so much embodiment as experience of the world, or what’s necessary is experience of the world. Um, and when we talk about the world, it’s important to keep in mind that the world is not just, um, the physical and natural world around us, but also the social world that we inhabit. Um, and so there’s, there’s a, a lot of richness and complexity there, um, that we are situated in. Um, and that seems to be, is it, is it necessary for learning is, um, maybe the kind of question that, that building these artificial models would help us answer, but it certainly is, um, a fact of how we learn, because, you know, we are all situated in our world and, and that’s where our learning is taking place.

Paul 01:04:59 I know that you’re both, both very busy people. So maybe we can just end on sort of an open-ended question, um, of, you were just talking about how optimistic, how foolishly optimistic you are. I put a judgment in there.

Speaker 4 01:05:13 I didn’t say foolishly. No. I said foolish.

Paul 01:05:17 Um, and Emily, it seems like you have, uh, more of a doom and gloom projection toward the future. I have children and we’re in the midst of, you know, battling the screen time issue. And my daughter talks to the phone to call a song up, and I’m somewhat terrified, um, for their future interacting with these things because of our, um, proclivity, because of our, uh, tendency to anthropomorphize. And, uh, but I don’t know that my, uh, terror is well founded, because we’re terrible at predicting the future as humans, as you know. And so I’m, I’m curious, you know, maybe we can start with you, Emily: am I right in thinking that it’s a doom and gloom scenario, or do you see light at the end of the tunnel and, uh, that good will, will come out of, uh, large language models and whatever happens next? Because next year we, we won’t be talking about transformers. We’ll be talking about something else, right?

Emily 01:06:17 Yeah. Let’s, let’s call them exfoliators, um, just to pick a random word <laugh>. Um, so I, I’m, I’m laughing at being characterized as not an optimist, cuz, cuz personally, who I am, I’m actually very optimistic. Um, and I think my optimism is rooted in a sense of, we collectively make our world, right? We, we live in it, we learn in it, and we also create it. Um, and so it’s up to us, right? Um, not, not any one of us individually, but all of us collectively, um, can make decisions, and we can, um, make decisions, you know, regarding language technology, for example, around, okay, well, what kind of regulation do we wanna put in place? What kind of transparency do we wanna require? What kind of data protection do we wanna require? And so on. These aren’t gonna be easy things to do, but they are things that we can do. Um, and I think when you’re talking about the fear that you have as a parent of, um, screen time and how your kids are gonna be interacting with things, um, I feel like we can be empowered through transparency in many ways. And I think that there’s a course correction that’s needed away from technology that’s leaning into our propensity to anthropomorphize and towards technology that is designed to be helpful tools to assist people in doing what we want to do and not, um, take advantage of or, as I say in one of my op-eds, abuse our empathy.

Paul 01:07:35 But that’s not gonna happen because technologists don’t care about that, right? Um, in a capitalist society, for example, it’s, there’s the regulation; whether we could actually do it well is a different question. Um, I, I tend to think that we’re not good at regulating things either, or, you know, that that’s just a, a different road that may be unwise, but I don’t, I just don’t see, um, <laugh> I don’t see the regulators, um, the well-intended regulators, catching up with the technologists.

Emily 01:08:09 So I, yeah. I mean, it’s not, it’s not something that’s necess... optimism isn’t saying, oh, it’s all gonna be okay, right? Optimism is saying we have the opportunity to try to make it okay. Yeah. Um, and I know that there’s a bunch of work going on right now in Europe around an AI act. Um, and I think that, um, what I’m hearing about is that there’s some sensible stuff in there and some stuff that’s missing the mark, and a lot of people who are working in this space around, um, technology policy and regulation are engaging. Like, that, that’s also a whole field of study. Um, and, um, yes, you’re right that the forces of capitalism have certainly pushed things in, you know, a specific direction. But I also think that most people working on technology are actually interested in working on it to make people’s lives better. Um, and, um, that there probably is a lot of goodwill that could be leveraged.

Paul 01:08:53 Uh, I’m the most pessimistic person here, I can tell <laugh>. Ev, do you share Emily’s

Ev 01:08:58 I, yeah, I think that was right on the mark. I mean, I will also say another thing. There’s, I was trying to remember the quote, but there’s these, um, actually a few quotes from, uh, from parents worrying about their children from centuries ago. Yeah. Cool. Where new technology comes <laugh>, yeah. I’m just saying, your kids are gonna be okay.

Paul 01:09:19 I know. I know. But kids

Ev 01:09:20 Come with their <laugh>. Yeah. You know, I don’t know. Um, I have a five-year-old and she likes her screens and she learns a ton from the screens. Sure. And so, what’s a learning mechanism? Like, ways to learn information have changed, but fundamentally, you know, humans come in with a very good brain with a lot of tissue, much more tissue to do abstract reasoning than animals do. That’s one difference. Again, it’s a, a quantitative difference rather than a qualitative one. But when you have a lot of space to play with that’s not taken up by perception and motor control, you can notice patterns, you can make generalizations, you can connect things that nobody else had connected before. Um, and that, um, makes a, you know, fun and powerful tool that we all have in our heads, much more so than probably any other tool that will ever be created by us. But

Paul 01:10:12 Yeah. Okay. I could go on and on about my children and the

Ev 01:10:16 Yeah, worries.

Paul 01:10:17 Yeah. Worries. But, um, I guess we’ll, we’ll, we’ll leave it here. Thank you both so much for your time and thank you for your work and success.

Emily 01:10:25 Thank you.

Ev 01:10:25 Thank you. It was very nice to meet you, Emily. Nice to meet you.

Emily 01:10:28 Likewise.

Paul 01:10:47 I alone produce Brain Inspired. If you value this podcast, consider supporting it through Patreon to access full versions of all the episodes and to join our Discord community. Or if you wanna learn more about the intersection of neuroscience and AI, consider signing up for my online course, Neuro-AI: The Quest to Explain Intelligence. Go to braininspired.co to learn more. To get in touch with me, email Paul@braininspired.co. You’re hearing music by The New Year. Find them at thenewyear.net. Thank you. Thank you for your support. See you next time.