Support the show to get full episodes and join the Discord community.
Jovo, as you’ll learn, is theoretically oriented, and enjoys the formalism of mathematics to approach questions that begin with a sense of wonder. So after I learn more about his overall approach, the first topic we discuss is currently the world’s largest map of an entire brain… the connectome of an insect, the fruit fly. We talk about his role in this collaborative effort, what the heck a connectome is, why it’s useful and what to do with it, and so on.
The second main topic we discuss is his theoretical work on what his team has called prospective learning. Prospective learning differs in a fundamental way from the vast majority of AI these days, which they call retrospective learning. So we discuss what prospective learning is, and how it may improve AI moving forward.
At some point a little audio/video sync issue crops up, so we switched to another recording method and fixed it… so just hang tight if you’re viewing the podcast… it’ll get better soon.
0:00 – Intro
5:25 – Jovo’s approach
13:10 – Connectome of a fruit fly
26:39 – What to do with a connectome
37:04 – How important is a connectome?
51:48 – Prospective learning
1:15:20 – Efficiency
1:17:38 – AI doomerism
Transcript
[00:00:03] Jovo: And I’m betting you it would get us to a place where people could achieve levels of consciousness or mindfulness or meditation that are unheard of currently. The normal kinds of statistical analyses and machine learning tools that one could apply to data, none of the theory applies when you have it as a network. My somewhat obnoxious prediction is that basically everything will be using principles of prospective learning, all the AI, all the modeling of humans and stuff. We’ll all be doing it relatively soon. [00:00:46] Paul: Hey, I’m Paul. This is Brain Inspired. Do I look sleep deprived? Because I am. And I’m on a little vacation to my former home, Durango, Colorado. But BI must carry on, so here I am. And my guest is Joshua Vogelstein. Jovo, which is what Joshua goes by, runs the NeuroData lab at Johns Hopkins University, which seeks to understand and improve animal and machine learning worldwide. That’s the tagline of his lab. Today, our discussion revolves around two main themes, along with my usual random tangents. So Jovo, as you’ll learn, is theoretically oriented and enjoys the formalism of mathematics to approach questions that begin with a sense of wonder. So after I learn more about his overall approach, the first topic that we discuss will be the world’s currently largest map of an entire brain, the connectome of an insect, the fruit fly, and we talk about his role in that large collaborative project, what the heck a connectome is, why it’s useful, and what to do with it and so on. The second main topic that we discuss is his theoretical work on what his team has called prospective learning. Prospective learning differs in a fundamental way from the vast majority of the rest of AI these days, which they call retrospective learning. So we discuss what prospective learning is and how it may improve AI moving forward. At some point, there’s a little audio/video sync issue that crops up, so we had to switch to another recording method and fix it. So just hang tight. If you’re actually viewing the podcast, it’ll get better soon. Show notes are at braininspired.co/podcast/189. If you want to experience this full episode and all other full episodes, you can support Brain Inspired on Patreon. Go to braininspired.co to learn how. All right, take two. Here we go. That’ll make sense in a second. Jovo, how’s the family? How’s the kids? How’s your shoulder? Just kidding about the shoulder. We didn’t get that personal last time. Good to see you again. [00:02:55] Jovo: Thanks. Um, yeah, I feel great. Family, kids are all great. We had a beautiful weekend of family events. We’re all together. And so it’s really nice. [00:03:05] Paul: Where’s the cell phone right now? [00:03:07] Jovo: Cell phone is very close now because no kids are home. But when they get home, the cell phone will adjourn to the car until they go to sleep. [00:03:15] Paul: Oh, you attach it to a drone. [00:03:18] Jovo: Adjourn. [00:03:20] Paul: No, I know, but. Oh, see, my kids are going to walk in the door any minute now, and so I’m going to have to close this door behind me. So I was imagining your kids would be home soon as well, but, yeah. [00:03:29] Jovo: No, they’re going to martial arts, and they’ll be home afterwards. [00:03:33] Paul: Cool. So, as usual, beautiful setting behind you. I’m jealous of your space. Anyway, the reason why I said all that at the beginning is because this is take two of our episode. So we tried this a few weeks ago, right? It must have been a few weeks ago.
A month ago, yeah. Yeah. And then we had a little postmortem after, and I told you, man, I was really off my game. And then we kind of went back and forth and mutually agreed to re-record it. So this is the first time I’m actually re-recording an episode that I hadn’t screwed up the recording on before. So congratulations. [00:04:11] Jovo: Thank you. [00:04:12] Paul: And thanks for doing it again with me. [00:04:14] Jovo: It’s my pleasure. I thoroughly enjoyed the first one. I look forward to this one, too. [00:04:18] Paul: Yeah. Should we, do we, do we let on why we are recording? I mean, it was basically my scattered brain. [00:04:26] Jovo: Well, I’ll just. My part, that’s taking responsibility for my part, I’d say, is I love the concept of a redo. I first started doing redos, I think, in middle school recess, where we would be playing tag football, and, like, someone would say they tagged me, and I would say I didn’t feel it, and then it was unresolved, and so we just did a redo. [00:04:46] Paul: Yeah. [00:04:47] Jovo: And more recently in life, I decided I love that concept, and I want to keep it going. And so I have redos constantly with my kids, with my wife, with myself, where I’ll say something or do something, and I just wasn’t thrilled, in retrospect, with how I did it. And so I’ll ask for there to be a redo, and I can explain why I want the redo, and we do it again, and I love the practice. And so this is just another manifestation of me doing a redo. [00:05:12] Paul: Well, in. What’s the phrase? Third time’s a charm. I hope we make the second time the charm this time. Right. [00:05:19] Jovo: I hope each time is charming. [00:05:21] Paul: Yeah. Okay. So this time, and I’m not gonna refer back to our previous conversation this whole time, but what I thought that we would start with is sort of your overall approach, because I don’t think. I think I neglected to highlight that last time. One of the things that strikes me in reading, especially this most recent paper that we’ll talk about a little bit later, is how into mathematics and formalisms and the theory side of things you are. And we’ll get into my difficulty in reading your latest paper later. Right. But I wonder how you see. So you’re interested in neuroscience and artificial intelligence, and I wonder how you see yourself navigating between them, among them, in your thinking, in your working, and your excitement and enjoyment. [00:06:15] Jovo: I love that question. I guess the thing that most motivates me any given day is trying to understand us. And by us, I mean something that can be quite general. Like, sometimes I literally mean, like, just me, like, what is happening inside of my body and my thinking right now. Sometimes I mean my family, my community, on earth, amongst all conscious beings. And I don’t know the answer in general, but I really like thinking about what is happening inside of us. How are we thinking, and how do we get to have these thoughts, these feelings, these emotions, these beliefs? And so, in exploring that, what I’ve come to decide is that the language of mathematics and statistics and, more recently, data science and artificial intelligence is the language that I think is best equipped today to characterize the kinds of things that we are doing, which, as far as I can tell, is making a bunch of decisions with a bunch of uncertainty in an environment that we’re navigating, with a bunch of other agents that seem to be doing approximately the same thing.
[00:07:26] Paul: So then, I mean, your work also, you know, you straddle artificial intelligence. You’ve done a lot of work in artificial intelligence. You’ve done a lot of work in neuroscience. But you didn’t answer my first question, and I asked, like, seven, which is, you know, how do you see yourself, like, navigating those two fields in particular? [00:07:41] Jovo: So what I’d like to think that we’re doing in our group is pushing artificial intelligence in the direction of being able to more formally and accurately characterize the kinds of things that thinking and learning beings are doing. [00:07:57] Paul: Okay. One of the things that I wanted to ask you is, it seems, and you can correct me, in artificial intelligence and machine learning and in neuroscience, that a lot of progress is made by tinkering. Right. And kind of making a guess and trying out a new algorithm, seeing how it works, and then sort of post hoc going in and thinking about why it might work, and then, you know, you ratchet it up. And, you know, in all of these fields and all the scientific fields, there’s this virtuous cycle that we’re supposed to do where we start with either experiment or model or theory, and then we just go round and round, and that’s how we progress in science. So, you know, but you seem like you’re a theory slash formalism first approach kind of person. So do you approach problems in general by starting with a theory and trying to build out a theoretical notion? [00:08:54] Jovo: I think I start with wonder. I start with, like, how is that happening? Like, how did that squirrel figure out where that nut was gonna be? Or, I don’t know if you watched the squirrel ninja warrior course thing online, that little YouTube video. But, like, how did they figure it out? Or. [00:09:14] Paul: No, I did. Yeah. Yeah, I know what you’re talking about. [00:09:17] Jovo: Or I have a two year old son, and he’ll say something to me, like, freezing. Like, how did he learn that word, freezing? It’s not like I gave him a thousand examples of freezing and temperatures, and he, like, somehow, over some supervised learning algorithm, figured it out. He just, today, he knows that word. Like, how did that happen? How does he know how to jump now? Like, I didn’t teach him. He just jumps all of a sudden, walks backwards. Like, I find all that stuff amazing and awe-inspiring, and I just wonder how it happens. [00:09:51] Paul: So then you don’t tinker, right? So you go from wondering to a sort of. Because you have an applied mathematics background, partly. And so does your mind just go straight to, like, how do I formalize this in the mathematical language? [00:10:04] Jovo: Yeah, I’d say it goes from wondering to trying to characterize it somewhat formally. And then the tinkering is like, the first time I try to write it down, it’s usually nonsense. Like, it’s not even wrong. It’s just like. [00:10:20] Paul: You mean, like, write down a formal. Like, mathematical formalisms? [00:10:23] Jovo: Yeah, yeah. It’s not even wrong. It’s just nonsense. I mean, it seems right to me at the time, but then I show it to someone else, and they’re like, that’s nonsense. So then there’s tinkering and trying to figure out the right, I’d say, like, conceptual framework to even characterize what’s happening. [00:10:40] Paul: That seems like a different kind of tinkering than, let’s say, in an. Well, let’s say, in an AI algorithm, right? So you. I think it seems like a different kind of tinkering. Would you agree with that?
That, you know, let’s add a whatever. Let’s add attention, or, you know, let’s add a convolution and see what happens or something. [00:10:59] Jovo: Yeah, I’d say it’s a conceptual tinkering rather than, like, a numerical experiment tinkering, but it’s not that we don’t do the other kind. Once we have a claim, the way we like to write our papers is there’s some theoretical claim, like, we have a theorem, like, this kind of thing will lead to this kind of outcome. [00:11:17] Paul: Yeah. [00:11:18] Jovo: Then we write an algorithm where we can say this algorithm has the properties that we know it will have, whatever, or gets this property that we care about, converges to the right answer or whatever. But then making it work in practice certainly requires a bunch of tinkering. And it’s not one-directional, in the sense of we say some theory thing, we come up with some algorithm thing, and then we’ll change the theory based on the results of the algorithms, often, and vice versa. So we’re mutually tinkering on both, because it’s not like there’s a right theory or a right algorithm. Algorithms and theories go together. It’s like, here’s the algorithm; that’s the manifestation of this theoretical construct that we desire. [00:12:02] Paul: So I think that, you know, I don’t remember, I can’t attribute the quote to anyone in particular. I’m sure it’s been attributed to Einstein, but the quote is something about. No, it can’t be Einstein, because I think it’s mostly referring to, like, sort of the experimental tinkering side, where advancements are. I don’t remember the exact quote either, but the idea is advancements are made by, you know, you do an experiment, and then you say, huh, that’s funny. You know, something unexpected, right? And then you go down that avenue. But is that the kind of thing that happens in the theory, algorithm, theory, algorithm back and forth? [00:12:41] Jovo: Oh, yeah. So today I was talking with one of my students, and she said to me, the results are bad. And I said, that’s great. That means we don’t understand something and we’re about to learn. And the results, you know, maybe were bad, maybe not, but we certainly learned a lot in the discussion. And so whenever I get, like, a result that’s unexpected, it’s exciting to me, because that means now there’s an opportunity to learn something. [00:13:11] Paul: Okay, well, so I’m not sure if tinkering is the right word, but you worked diligently for over ten years figuring out the statistics and whatever else you needed to figure out to make the connectome of a larval fruit fly brain, which you claim you’re best known for. Do you think you’re best known for that? [00:13:32] Jovo: Certainly recently. I mean, one thing that was really cool about that, one of my, say, proudest achievements to date, is National Geographic labeled that work as one of the top ten or top seven medical breakthroughs of the year last year. [00:13:48] Paul: A medical breakthrough. [00:13:50] Jovo: I don’t know what their criteria were, and I’m not judging it. I’m just saying I was stoked that it happened. [00:13:55] Paul: Yeah, well, you know, I didn’t ask you this before. What is the difference, you know, connectome-wise, before we talk about that work a little bit more in depth, between, like, a larva and an adult brain? [00:14:11] Jovo: Quite a bit.
So, you know, larvae are basically baby flies, and they’re pretty small, like, you know, on the order of thousands of neurons, versus the adult ones on the order of tens or hundreds of thousands of neurons. And, you know, depending on the species, it can be more or less. And then, like, the degree to which the larvae can behave and learn is different. So their cognitive repertoire is more limited. Their connectome, like the set of neurons and the set of connections between the neurons, is much smaller. It’s also still in development. So the connectome of a seven-day larva is different than a six-day one. For adults, that’s probably true to a certain degree, but after a certain time, it mostly stops developing and then it kind of achieves a steady state, people think. [00:15:01] Paul: So the reason why you went after a larva instead of an adult, then, is because of the level of difficulty. [00:15:07] Jovo: Perhaps we could. Yeah, we wanted to finish something. That was a thing we could finish. [00:15:12] Paul: Yeah. Well, I know that I’ve heard you say, and you told me last time, that this project began largely because you wanted to figure out how you could tell, by looking at a connectome, whether a fruit fly knew calculus or could do calculus. [00:15:29] Jovo: Yeah, that’s right. So this was an idea I had in grad school. I think it was, I don’t know, 20 years ago probably at this point, where I started taking a class on statistical pattern recognition. And the deal with almost all the theory in statistical pattern recognition is you start out where you have p features or dimensions, and they live in what’s called Euclidean space, which means there’s really no structure to the different dimensions. Each one is kind of its own independent thing. And statistics and machine learning and data science largely operate on data that kind of follows that model, or at least it’s the prevailing model for our data. However, in many real data sets, it’s not really a particularly good model. For example, in networks, the variables or the dimensions tend to be the edges, and those are between two nodes. And so there’s a structural, like, logical dependence, not a statistical dependence of which things happen with which other things, but a different kind of dependence, where this edge demands the existence of this node and this node. [00:16:35] Paul: I see. [00:16:35] Jovo: And there’s no such thing in Euclidean, in the normal way that people do statistics and machine learning. And so the implication of that is, the normal kinds of statistical analyses and machine learning tools that one could apply to data, none of the theory applies when you have it as a network. And so we spent about a decade tinkering with the theory to, like, pivot it little by little over to being able to characterize big, complicated networks. [00:17:02] Paul: So when you started off this project, when you conceived of it, not that you predicted, oh, this is going to take me six weekends to finish, and then it takes a little over ten years. Did you have a sense of how long it might take, relative to how long it ended up taking? [00:17:21] Jovo: Yeah, it was pretty clear from the outset that it was going to take, like, a decade. It was a lot of work. Like, the theory of statistical pattern recognition is all grounded in this one set of assumptions or axioms about the space of the data, which is Euclidean.
And we wanted to pivot it to a different and frankly more complicated setting that we thought is more accurate to describe the data. But there’s 100 years of work on the first bit and approximately zero work on the second bit. So less than 100 was a good guess. But some time, for sure. [00:17:56] Paul: I just realized we didn’t define connectome. Can you define connectome for me? [00:18:00] Jovo: Yeah, you know, it’s funny. There’s a bunch of people who publish things, and they define connectome in whatever way lets them publish and, like, use the word. But usually what people mean is, we’re talking about a brain, and there’s a set of nodes. These could be neurons, but sometimes they’re regions of brain, too. And then there’s connections among them. And so those could be synapses, or tracts if you’re talking about regions of brains. And so the connectome is a complete set of nodes, defined however you want, and edges, defined however you want. But it’s a complete set at that particular spatiotemporal resolution. [00:18:38] Paul: And you can have functional connectomes as well. Right. [00:18:41] Jovo: Well, people certainly use that word a lot. To me, it’s very unclear what it really means, in the sense of, now it depends on there being some statistical model underlying the dynamics of the activity, and which model you mean is, like, a whole nother thing to condition on. So if you just say connectome, you have to define what constitutes a node and what constitutes an edge. And in theory, the edges are measurable. Like, you can look into the brain and it’s an anatomical thing. Once you go to functional connectome, there’s no thing to look at that’s an edge. Now it’s a quantity that’s abstracted and must be estimated from time series data. And so I see why one would say it, and I’ve said it, and I have a bunch of papers on it, but I would kind of prefer the whole world just use correlation matrix instead, if that’s what they were talking about. [00:19:34] Paul: Okay, so. But what we’re talking about here with the fly connectome is the fly connectome. [00:19:40] Jovo: Is the first insect where we have every single neuron, pretty much, and every single connection between them, pretty much. I say pretty much because the imaging, like, the actual experiment of taking the images, is complicated and difficult. The process of stitching all the images together is complicated and difficult. The process of identifying all the cells and their processes and the synapses is, again, complicated and difficult. And, like, there’s errors in every step. [00:20:11] Paul: And this is the largest connectome we have. [00:20:14] Jovo: Yeah, this is the largest connectome, certainly, that we had at the time. Maybe someone has completed an adult Drosophila connectome at this point, but we don’t have it publicly available if they have done it. [00:20:25] Paul: Yeah. Okay. Because, I mean, the famous one is the C. elegans connectome, which is, like, 302 neurons, depending on whether it’s the. [00:20:32] Jovo: Hermaphrodite or the male. [00:20:34] Paul: Right. What is the male? [00:20:37] Jovo: Probably smaller. [00:20:38] Paul: I think it’s more efficient. You mean. [00:20:41] Jovo: I did not mean that. [00:20:43] Paul: No, no, no. But. Right, so. All right, so now this is publicly available. People can use it to do all sorts of things. One of the things that you’ve already worked on is looking at bilateral symmetry and comparing the left and right sides of the brain. So what did you find there?
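Before the bilateral-symmetry discussion below, a minimal sketch of the structural-versus-functional distinction Jovo draws here, in Python with numpy. The arrays are simulated placeholders, not real data: a structural connectome is a directly measurable matrix of anatomical connections, while a “functional connectome” is typically just a correlation matrix estimated from activity time series.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_timepoints = 5, 1000

# Structural "connectome": synapse counts between neuron pairs,
# directly measurable from anatomy (e.g., electron microscopy).
structural = rng.poisson(lam=1.0, size=(n_neurons, n_neurons))
np.fill_diagonal(structural, 0)

# Simulated activity traces (neurons x time), standing in for recordings.
activity = rng.standard_normal((n_neurons, n_timepoints))

# "Functional connectome": a correlation matrix estimated from the time
# series. There is no anatomical edge to point to; it is a model-dependent
# statistical quantity.
functional = np.corrcoef(activity)

print(structural)
print(np.round(functional, 2))
```

The structural matrix is something you can point to in the tissue; the functional one changes with the model and the recording, which is the sense in which Jovo would rather just call it a correlation matrix.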
[00:21:02] Jovo: Well, actually, I want to start by telling you why we thought we should even look at that question. [00:21:07] Paul: Okay. [00:21:07] Jovo: Yeah, yeah. You know, it’s an interesting thing with analyzing data. People like to think about the sample size and, like, how many samples you have. That kind of tells you what you can estimate about the data. So, for example, if you have one point, you can kind of guess what the mean is, but you have no idea what the variance is going to be, because there’s only one point. If you have two, now you can kind of guess the variance, but it’s probably not going to be a very good estimate. Now, with the connectome, we have one of them. So, like, we have kind of a mean estimate. It’s this one. But, like, what else can you really say? And it took us a long time to figure out what we can say where we could really get some confidence about the answer, meaning we could estimate confidence intervals or a p-value or anything like that. [00:21:55] Paul: Do you just mean what question can you even ask with it? [00:21:57] Jovo: Yeah, what question can we even ask and have a statistically rigorous answer to? And the answer was, you know, very few questions are actually particularly easily answerable with just one connectome. But it also turns out there’s lots of ways to think about a connectome. So you can think about there’s one connectome, or you can think about there’s, like, thousands of neurons, or, say, tens of thousands or hundreds of thousands of connections. And then you kind of realize there’s some flexibility in how you do the modeling. And you can say, for example, maybe the left side is a copy and the right side is a copy. And now we have this internal control, because we can look at how similar these two things are to each other. So that’s really why we looked at bilateral symmetry, because we had an internal control built into the data. [00:22:48] Paul: So then. So what did you discover? [00:22:50] Jovo: Well, the first thing we discovered, and this took probably a few years, was that the data was messy. [00:22:57] Paul: But you already. This came. Wasn’t this published right around the same time as the official publication of the insect connectome? [00:23:05] Jovo: Yeah, yeah. So we really were kind of working on it in parallel. And so when our collaborators, Marta and Albert and Michael, who are all in the UK, they’re the ones who really got all the data, did the imaging, lots of image processing and manual labor, you know, really the vast majority of the work in ten years, and got all the funding to do that. When they first gave us their estimate of the connectome, we looked at it and we’re like, nah, that’s not right. And they’re like, well, you know, what do you mean? And so we’ll say things like, well, the brain is a connected network. We know you can traverse from any neuron in the brain, following a path of connections, to get to any other neuron in the brain. And the first network they gave us didn’t have that property. There was, like, a whole big network over here and then a bunch of disconnected ones over here. And so we’re like, that’s wrong. And they’re like, oh, okay. So then they go back and they, like, figure out all the edges that they missed, and they give it to us again. And we’re like, nah, still wrong. And they’re like, how do you know? And it’s like, well, this neuron and this neuron on the left and the right side of the brain are really the same neuron.
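A minimal sketch of the first sanity check Jovo describes (is the reconstructed network actually one connected piece?), assuming the connectome is given as an edge list and using the networkx library; the neuron names and edges here are made-up placeholders.

```python
import networkx as nx

# Hypothetical reconstructed edges: (presynaptic, postsynaptic) pairs.
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n1"),
         ("n4", "n5")]  # n4-n5 is an island, which should raise a flag

G = nx.DiGraph(edges)

# "You can traverse from any neuron to any other neuron": for a directed
# graph, check weak connectivity (ignoring edge direction).
if not nx.is_weakly_connected(G):
    components = list(nx.weakly_connected_components(G))
    print(f"Suspicious: {len(components)} disconnected pieces:")
    for comp in components:
        print(sorted(comp))
```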
Like, they have the same identity, the same, say, developmental history and stuff like that. More or less, they should look very similar to each other. But this one, which you’ve labeled A on the left, actually looks identical to what you’ve labeled B on the right, and different from what you’ve labeled A on the right. And they’re like, oh, okay. So then they go back and they’re like, yeah, you’re right. You know, blah, blah, blah. We have to switch this one. Then we have to figure out what’s the real identity of this one. And that process, I mean, I’m talking about it kind of lackadaisically, but it required years of work, because we had to develop the algorithms to be able to check these things on, you know, hundreds or thousands or tens of thousands of edges. And the algorithm had to be efficient enough, and we had to be able to interpret the things. And there was a lot of back and forth just till we got to a connectome where we’re like, yeah, we can study this thing. [00:24:58] Paul: So, and I don’t want to, like, just throw a wrench into our conversation already, but, I mean, did that work get easier as big data got bigger and big compute got bigger? I mean, when this started, there was already big data, right? But, you know, I’m not sure where we are on the exponential curve, right, where we were ten years ago versus where we are now. [00:25:24] Jovo: What I’d say is, my experience is, for every applied data set I’ve ever worked with, approximately the first year of work is me finding flaws in the data that mean, like, the kind of default analysis one would do wouldn’t work. And that’s been true for 20 years. It’s always been about a year of basically cleaning up the data. [00:25:48] Paul: Is that because the big data precedes the right big algorithms, big compute? [00:25:53] Jovo: Or I think it’s just because data are messy, and, like, it doesn’t matter if it’s big or small. Like, you know, even when I was playing with, like, single-cell ephys data, there’s, like, this big spike in the middle of the thing. It’s like, why is there a big spike there? It’s like, oh, I kicked the table then and I didn’t mark it down or whatever. It’s like, well, now what do we do? Do we think of it as two separate sections? Do we figure out how to do an analysis that’s robust to random big spikes? Do we just delete that and call it NaNs? Now we have to adjust our algorithm to be able to deal with NaNs. There’s so many options. But it’s a process of figuring out, like, there are anomalies that don’t kind of fit in the theoretical assumptions of what the data should look like, and we have to change the algorithms and/or the data to address them. [00:26:39] Paul: So then going back to, like, what this connectome is useful for, top ten medical breakthroughs, right? What are some of the things that people are doing with it? And what are some of the things that people should be doing with it? [00:26:55] Jovo: Well, I think the connectome, like, when we have one connectome for some species, the way I like to think of it is, it’s a resource. And so, similar to the Sloan Digital Sky Survey in cosmology or the Human Genome Project in genomics, what it does is it lets people who are interested in understanding, say, circuit mechanisms in the connectome, but other things in the other branches of science: you have some idea that you think might be relevant or real or explain something, and you get to go check it in this resource before you go do your experiment.
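A minimal sketch of what that resource check can look like in practice, again with networkx and hypothetical node names: before designing an experiment around a hypothesized pathway, test whether the connectome even permits it.

```python
import networkx as nx

G = nx.DiGraph([("sensory_1", "inter_1"), ("inter_1", "motor_1"),
                ("sensory_2", "inter_2")])  # hypothetical edge list

# Hypothesis: sensory_2 drives motor_1 via some pathway.
# If no directed path exists in the connectome, the hypothesized circuit
# can't be right (in this animal), so skip the experiment.
if nx.has_path(G, "sensory_2", "motor_1"):
    print("Pathway exists; the hypothesis survives the constraint.")
else:
    print("No pathway; rule the mechanism out before experimenting.")
```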
And so think about, in cosmology, before there were sky surveys, you’d have to rent telescope time. It takes about six months. You hope it’s not cloudy that night, because if it is, you have to wait another six months. If it’s not, now you can look up at the sky and try to get some data on your thing. But once the Sloan Digital Sky Survey existed, the way that we did cosmology was completely different. Now, you check the resource first. If the resource says your hypothesis can’t possibly be right because of whatever, you don’t ever do that experiment. You don’t look in the sky at that time. You keep thinking and come up with a better experimental design that fits within the constraints of what we know is even possible. And so the connectome really is a constraining mechanism. It says, you have some idea about how this circuit works. Well, if it’s not in this connectome, it can’t be right. Now, obviously, there’s caveats to that. It could be right in a different fly or whatever. But, like, in general, it constrains theories so that the experiments that we do are checking for things that could possibly be right. And that’s how people are actively using the Drosophila connectome today: investigating all sorts of circuits where they had multiple competing ideas for what the explanation was, and now they have this thing they can use as a reference, and they can make their experimental design more efficient and more precise. [00:28:53] Paul: So you could potentially think about how some area or some circuit computes. Flying. Flying, let’s say. Right? And then you would use the connectome as a resource to potentially design your own model and constrain what your model should look like. You should be able to implement, quote unquote, flying in a model based on the connectome data. Is that. [00:29:25] Jovo: Yeah, exactly. Right. Like, if in your model you thought, like, this part of the brain connected to this part of the brain, and you go to the connectome and you’re like, oh, that’s not there. Okay, that model’s wrong. You don’t ever have to do the experiment or, like, collect a whole nother connectome. If these whole brain regions are not connected in this one fly, it’s, like, unlikely to be the mechanism that explains flight in flies. [00:29:50] Paul: Right, right. Is there something that you don’t have time to do that you would suggest others do? I mean, I’m asking a terrible question, because maybe you want to tackle the thing that you want to tackle. Right. But is there a question that you think should be being worked on that isn’t? [00:30:11] Jovo: Oh, I don’t know. I’m most excited about learning. For me, inference, just being able to guess the right thing, so to speak, basically, calculators do that just fine, and it’s not interesting to me. Now, I understand explaining behavior is interesting to lots of people, and I’m not saying it’s not interesting in general. I’m just saying what I want to discover and learn about is how learning happens, rather than how the inference happens after it’s learned. And so with respect to the Drosophila, they actually do and can learn a whole bunch of different things. And so what I’m most interested in thinking about is how to understand how Drosophila connectomes change, or, say, develop, such that they learn some things and not other things. [00:31:00] Paul: So then you would need a handful of connectomes at larval stage one.
A handful of connectomes at larval stage. I don’t know how many larval stages there are, up to adult. Right. And see how they change. [00:31:12] Jovo: That would be one way of doing it. The way I’ve been thinking about doing it is, if we can train one hemisphere and not the other, now we have an internal control. Or if we can just select a fly that happens to be good at doing something with the left side of its brain but not with the right side of its brain, so we don’t even have to train it. We just find one that, through luck or genetics or development or whatever, happens to have this internal control built in, where the left side of its connectome gets it and the right side doesn’t. [00:31:46] Paul: But then, I mean, my first thought, I was like, well, you could just breed and select, right? But then it’s more innate. Then the line between innate and learned is blurred. So I guess you’d want to select for flies that don’t come with it prepackaged, but that you could teach something to. [00:32:06] Jovo: Yeah, I mean, you know, this word learn is kind of funny in machine learning. Branches of machine learning have done a pretty good job formally describing what learning constitutes. However, I would say, and Iris van Rooij really made this super clear to me in conversations with her, that, like, that’s not what we’re doing. Like, that’s a formalism that has lots of nice properties and some very similar properties to what we’re doing, and it’s not. [00:32:36] Paul: It. Yeah. Okay. Yeah, because I was going to ask, like, how many more connectomes are on the way? [00:32:44] Jovo: Oh, probably a lot. [00:32:46] Paul: I mean, fly, right? Like, to, you know, just. Just sticking with the fly. [00:32:51] Jovo: Well, there’s an adult fly presumably coming very soon. And then, like, a lot of the bottleneck in the research is developing the technology to be able to do things at scale. And so we kind of suspect it’ll be a similar thing, like the Human Genome Project. Like, the first one took about a billion dollars and, like, a decade, but now it’s, like, $1,000 and, like, I don’t know, maybe a week. And it’s only been a couple of decades since the first one, so we suspect technology development to follow a similar trajectory. [00:33:26] Paul: Yeah, that makes sense. So the connectome includes, I guess, dendritic, all the dendrite, all the dendritic synapses. Right. [00:33:36] Jovo: Well, it’s kind of one of the, like, I’d say, dirty statistical properties. [00:33:42] Paul: Sorry. [00:33:42] Jovo: It’s just, like, when we get a connectome, what we actually measure is an electron microscopy image of the entire 3D brain. So we get everything that one can measure at that resolution that’s static at that time. One of the things you can derive from that is which neurons connect to which other neurons and, say, how many connections there are between any pair of neurons, and you can say that’s the connectome. But you could also include, say, the volume of each synapse, the location of each synapse, how many vesicles, whatever, anything you want. It’s all in there. There’s no one that can stop you from calling that the connectome. And sometimes people do mean all that other stuff, and whether you need all that other stuff to explain any particular cognitive phenomenon is totally unknown. [00:34:34] Paul: So a given connectome is frozen in time. And what we know is that the brain, you know, dendritic spines are just turning over all the time.
Our brain structure is constantly in flux. Is that built into, like, those statistical properties built into the connectome, or is it. Is it simply. Sorry about my ignorance here. Is it simply a structure? [00:34:57] Jovo: It is a structure. But what I would say is, it’s not that obvious how much in flux the anatomy of our brain is once we’re adults. There’s certainly a deterioration, and we’re always losing brain cells and synapses, especially if we’re drinking or, like, getting punched in the head a lot. But other than that, it’s really not clear, first of all, how much change there is. So there’s lots of changing, but whether the actual strength between a pair of neurons changes a lot over time, no one knows, because no one measures it in brains, because it’s real hard. It’s certainly possible that the changes are such that homeostatic equilibrium is preserved, meaning that the connection strength between any pair of neurons is static, despite the fact that where the synapses are has changed. [00:35:44] Paul: But again, that’s a thought experiment. And I thought you wouldn’t, because I was going to ask you, would it matter if people did measure, and the brain is constantly in flux in terms of new connections being formed, and it doesn’t preserve the homeostatic average connection strength between neurons, right? Would that matter? I was going to ask you that, and I thought, oh, he’s not going to like that question. [00:36:07] Jovo: No, I love that question. I don’t know. No one knows. No one has any clue. Presumably it will matter for certain things and not matter for other things. And, like, then the fight is, will people be like, it does matter, it doesn’t matter. But, like, matter for what? [00:36:23] Paul: It matters for, like, Twitter bickering, I think. [00:36:26] Jovo: Matters for Twitter bickering. But my experience of when I Twitter bicker, I tend to Twitter bicker with people who I really profoundly respect in their knowledge and intellect about this particular topic. Our Twitter bickers are almost always resolved by a phone call, and it’s like, well, what did you mean when you said this? And they’re like, I meant this. And I’m like, oh, I agree with that. [00:36:49] Paul: Right. Is Twitter bicker an actual phrase? Because I just said it and it sounds good, but if I start seeing it a lot, I want credit for it. [00:36:58] Jovo: You know, I will duly credit you. [00:37:01] Paul: If you start using it a lot. Yeah. Okay. So the last time we talked, we got. I won’t say heated, but we both got a little. I don’t know if defensive is the right word or. I think I wasn’t articulating my points, and I wasn’t maybe understanding your points as well as I should have. What’s that? [00:37:25] Jovo: Real-time Twitter bicker. [00:37:27] Paul: Is that a thing? That’s a thing. [00:37:29] Jovo: I think that’s what we did. [00:37:30] Paul: Oh, that’s what we did. I thought you just looked up whether that was a phrase. Yeah, so. And what we were talking about is the difference between necessity and sufficiency. And I was asking, you know, about structure, how important it is, which was not a well-formed question. And you made the point, which I agree with, that, of course, structure is necessary to know. And I referred to Eve Marder in the past, who has made this point as well, that structure is necessary to understand function, but it’s not sufficient. We both agree with that. Part of the reason I was getting maybe defensive, or, I’m not heated, I don’t know what the word is.
Emotional or something, is because what I wanted to ask you, and I wasn’t articulating it well. What I wanted to ask you is whether you have received. So, therefore, because anatomy is necessary but not sufficient to understand function, we all agree on that, have you received pushback? That, oh, this work that you’ve done, it’s just not important. Do you feel that pushback at all? [00:38:42] Jovo: Oh, yeah. So I want to say two things. First of all, I would say anatomy is required for certain kinds of explanations of function, but not others. [00:38:53] Paul: I agree. [00:38:54] Jovo: In Marr’s levels, like, at the top level, computation, anatomy is totally irrelevant. It’s neither necessary nor sufficient. It just doesn’t show up. But, again, going into the Marr levels, if you want to talk about mechanism, now I think you need anatomy. So I wouldn’t say anatomy is always required to explain function. I’d say for certain levels of explanation, it is required, and in particular, it’s required at the mechanistic level of explanation. In terms of pushback, huge pushback. [00:39:25] Paul: We didn’t get into this last time, so I’m glad that we’re getting into it. [00:39:28] Jovo: Yeah. So I’ll just give, like, a few concrete things. Like, first of all, there was no real funding mechanism for connectomes. So we did the research just on the back of other funding, but there weren’t grants where it’s like, yeah, we’re gonna study connectomes, and NIH or NSF is gonna give us money. [00:39:48] Paul: Really? Did you get rejected, funding-wise? Cause that would surprise me, actually. [00:39:54] Jovo: There just weren’t solicitations for it. We kind of did it in, like, you know, the ways that you do things to get grants. The other thing is Cosyne. So Cosyne, Computational and Systems Neuroscience, is one of my favorite conferences. I went for years and years, and when I started publishing on connectomes, we started submitting abstracts to Cosyne on connectomes. And we did this for about ten years, and every single year it was rejected. And the reason was, it’s really not that interesting, or the analysis you did wasn’t that interesting. But meanwhile, these were Cosyne abstracts for work that we had published in Science, Nature, Cell, Nature Methods, PNAS. I don’t know, name your other journal. Nature Neuroscience. Every journal you could ever hope to publish anything in, we had published these results, and yet consistently, Cosyne said, it’s not that interesting. And so at some point, like, maybe last year, after they rejected our abstract, which was a Science paper that NatGeo thought was one of the top ten discoveries of the year, I was like, hey, I hear that you’re saying it’s not that interesting. Lots of people disagree with you. Like, just FYI, lots of people disagree with you, and it’s clearly computational and systems neuroscience. So what I said is, maybe change the name of the conference to Computational and Systems Neuroscience But Not Connectomes, or whatever you want. Like, you’re allowed to not be interested in it. I’m not interested in lots of stuff. I just wanted more transparency in what they really are interested in. And they were really receptive. Like, we had nice conversations about it, so I’m not being critical at all of them. It’s just, like, that was a pushback we got. Like, they didn’t think it was interesting. [00:41:45] Paul: So you.
But you don’t, in general, feel exasperated by getting a healthy amount of pushback? Because it must feel somewhat exasperating to you. [00:41:59] Jovo: I don’t think it’d be as fun if everyone agreed. [00:42:03] Paul: Okay. All right, so that wasn’t as heated as last time. So that was nice, right? I didn’t perseverate on the structure function. Will we ever find out? How important is it on a scale of one to ten? Those sorts of ridiculous questions. But is there anything else that we didn’t cover about this connectome work that you think is important to mention? [00:42:29] Jovo: Well, there’s a connection to humans and human consciousness and intelligence and disease that we didn’t touch on yet. And I do think, although so far we only have a fly connectome, I’m interested in the existence of a human connectome, and I do think it will be important for explaining many things about us. I don’t know how important on a scale of one to ten, but pretty important. [00:42:53] Paul: So, consciousness. [00:42:55] Jovo: Yeah. [00:42:56] Paul: How is the. How. How is a human connectome gonna help us? What’s the word? I better understand. Explain? [00:43:05] Jovo: Well, maybe we can start a little bit simpler and talk about how a human connectome could be impactful in supporting, say, medical research. So, for example, let’s talk about addiction. There’s many experiments we can do in mouse models of addiction, where we give mice drugs or whatever, and we see that they start exhibiting addictive-like behaviors. If we could also, say, measure the connectome of a mouse after it’s become addicted to something, and, say, a bunch of clones, effectively, that haven’t become addicted to something, we could get deeper insight into the underlying mechanism of addiction at the neuroanatomical level, and that could lead to more effective treatments to help people who are suffering from addictive behaviors to be able to free themselves from those behaviors and live a life that is more conducive to the way they want to be. And so that’s, like, one concrete example of how I think connectome research could have real-world implications for dealing with human illness or suffering. But I think, I hope, it’s really just the start. Like, insofar as the anatomy is involved in any degree of human suffering, I’d like to think that with a deeper understanding of the anatomical underpinnings associated with any element of suffering, we could divine more effective treatments, or, better yet, prognostic tools and interventions before people start having these debilitating issues, to prevent those outcomes. [00:44:40] Paul: Oh, like, if there is degeneration of a certain area, then you could treat that area with anti-degeneration drugs or. [00:44:47] Jovo: Something, for example. Or something like psychiatric issues. Often there’s a long window, say, of years or decades leading up to, say, a psychotic break or something like that, where if you could predict a priori, like, say, ten years earlier, that this person is more susceptible to that kind of thing happening in the future, you could start developing therapeutics, potentially even non-invasive ones, like different kinds of therapies or meditation practices or who knows what, to help pivot their connectome towards a range where they’re less likely to suffer from these kinds of disorders.
[00:45:26] Paul: Okay, well, so this connects also, then, to what you just said about consciousness, because I know that you’re interested in connecting neuroscience to spirituality, and that you’ve already looked into the difference between novice meditators and expert meditators and found some differences in their brain activity or structure. [00:45:49] Jovo: Their functional connectomes. [00:45:50] Paul: Functional connectomes. There you go. All right. Yeah. So talk a little bit about that. [00:45:55] Jovo: Yeah. So. [00:45:58] Paul: And how that connects to consciousness. Sorry to interrupt, because, I mean, they’re related, but they’re different. [00:46:04] Jovo: Yeah. So when I’m in a state of presence and mindfulness, and, you know, really any cognitive state of mind that I get to, presumably there’s a different activity happening in my brain that corresponds to this moment. And so I imagine I, like, get really frustrated about something. Now there’s some activity in my brain. Now somehow I’m able to calm my nervous system. There’s a different set of activities. But if I want to transition my anatomy to be the kind of anatomy such that I tend to not lose presence, or I tend to be able to regain presence more quickly, like, I don’t see why that’s not a possible set of scenarios. And arguably, say, the meditation practice that people do for years or decades, maybe what’s happening is they’re actually changing their anatomy such that their brain is more easily able to transition to a state of presence or mindfulness from reactivity or something like that. So I don’t see any reason, a priori, that neuroscience research can’t be extremely informative about how to essentially train ourselves to be the kinds of people that we would want to be, whatever it is. Maybe you want to be more conscious or mindful. Maybe you want to be more aggressive. Like, it could be anything. I’m not making a judgment call on how we want to change our personalities. But often, we do have a preference. Like, I’d like to be more like this. And so what’s the underlying anatomy that would tend to make it easier to be like this? [00:47:37] Paul: Two things. One, it’s a shame that it takes ten to 15 years to get to that point, and presumably we could speed that up a little bit if we had better training, etcetera. But two, is it the case that, let’s say, an expert meditator, or whatever kind of meditation you’re doing, also has a different baseline, uh, conscious, subjective experience? Like, so if you wanted to be more mindful, right. Um, is an expert meditator more mindful over coffee than a novice meditator? Like, is their baseline. Does their baseline change because their brain structure changed? Right. Would. Would you guess? Do you know? [00:48:24] Jovo: I certainly do not know. Um, I want to address that earlier thing you said, and I want to relate it back to, say, like, running marathons. So people have been running marathons since, like, the times of the Greeks, right? That’s where it started. It’s been thousands of years. It was only recently that people even realized it was possible to run a marathon under 3 hours; before that, no one knew whether it was possible. And now people are starting to ask whether it’s possible for a human to run a marathon under 2 hours, because people are getting pretty close to that. And the thing that changed wasn’t, like, thousands of years of evolution or genetics.
It was better training, better understanding of physiology and better training, or maybe just better selection of people and more focused training. But the point is, like, we can do really amazing things once we understand how to train our bodies into those states. And so it’s weird to think about meditation as, like, a competitive sport, but just for the moment, imagine somehow someone figured out how to make meditation a competitive sport. [00:49:30] Paul: Like yoga, like modern yoga, like the. [00:49:32] Jovo: Way people compete in yoga. I promise you, that incentive structure for humans would lead societies to come up with ways of making meditation training extremely fast and extremely efficient. And I’m betting you it would get us to a place where people could achieve levels of consciousness or mindfulness or meditation that are unheard of currently. [00:49:57] Paul: But do you think, if you asked an expert meditator. So if you ask, like, a great writer or comedian or artist, they would say that there’s no shortcut, that you have to put in the work. I mean, there’s a certain level that you’re born with or whatever and start off with, and you can train really hard, but there’s no shortcut. [00:50:17] Jovo: What the fuck do they know? I mean, how do they know? That they don’t have a shortcut doesn’t mean there is no shortcut. It’s just, like. [00:50:25] Paul: Well, it’s just. But it’s, like, 99 out of 100 experts would say that. Right? So that’s pretty, like. Well, you know, that’s a pretty good sampling. I just bullshit that number, of course. But. [00:50:36] Jovo: But it’s a logical fallacy. The fact that they don’t know that there’s a shortcut is not evidence that there’s no shortcut. All that means is they don’t know of one. And I agree. I also don’t know of one. Maybe no one knows of one, but that’s not actually evidence that there isn’t one. [00:50:51] Paul: Yeah, I just kind of think it’s a beautiful thought that there isn’t one. You know? I mean, I think that they’re both beautiful thoughts, but there’s something. There’s something appealing about having to put in the hard work that other people aren’t willing to put in. Right. [00:51:09] Jovo: I promise you, just like people figured out shortcuts to running marathons faster, it’s not like now it’s easy to run a marathon. Like, the work won’t end just because someone figures out a shortcut. There will always be more levels of depth and consciousness and meditation to achieve. Like, the one in you that’s afraid that we’ll all achieve enlightenment and somehow that’ll be bad, I think, can rest assured. [00:51:34] Paul: I thought enlightenment was the end, though. I thought that was it. [00:51:38] Jovo: I believe that it is the search and the journey, and I don’t think there is a time where we’ll get there. [00:51:46] Paul: Okay. All right. Okay. Well, let’s move on, because we have about a half an hour left, maybe, and I want to kind of take our time a little bit talking about prospective learning. This is something that you’ve been working on recently. You just sent me the. Is it the second draft? Because it’s quite different than the first draft that you sent me of the same thing. Because it looked like it’s a. Is it a NeurIPS entry or. [00:52:12] Jovo: So we have a white paper that we published in a conference called CoLLAs, which is a conference on lifelong learning agents, I guess is what it’s called. That was a white paper where we introduced this concept of prospective learning.
We didn’t have any particularly compelling theoretical results. We had, like, a couple of nice illustrations of what we were talking about. We now have a preprint that we’re circulating to friends where we have the first kind of theoretical results, demonstrating, I think, relatively clearly that prospective learning is a coherent concept, and it’s different from the kinds of learning that we typically talk about in machine learning. [00:52:54] Paul: Okay. Yeah. So, just to be clear, it’s prospective learning, because when I say it, I could probably slur it, and it might sound like perspective. But this is in contrast to retrospective learning. Did you guys invent the term prospective learning? [00:53:10] Jovo: I don’t think so. I mean, people have talked about retrospective and prospective intelligence and cognition, metacognition as well, for at least a decade, presumably longer. I hadn’t heard of anyone specifically saying learning, but when they talk about it, they’ve included the concept of learning. So I wouldn’t say that we invented the concept or the phrase. What I would say is, we’ve tried to formalize it in a particular way, but it’s a thing that others have said in the past. [00:53:44] Paul: Okay. Okay, so this is in the previous, though, um, arXiv paper that you sent me before, was from what’s termed the Future Learning Collective. The title of the paper is Prospective Learning: Principled Extrapolation to the Future. And, uh, I think the manuscript begins with, all learning is future oriented. I may be misquoting, but it’s something like that. Um, so where did the idea come from, and why are there 700 authors on it? And what’s the problem that you’re trying to solve? [00:54:17] Jovo: Yeah, so Hava Siegelmann was a DARPA program manager several years ago, and she created a program called Lifelong Learning Machines, where she brought to my attention the fact that machine learning algorithms have this issue. Classical machine learning algorithms have this issue, which is, if you train them to perform, say, task A, and then you train them to perform task B, they will forget how to perform task A. And so then if you give them samples from task A again, they just do poorly. And it’s called catastrophic forgetting. It’s not just a machine learning problem. People have also noticed this in humans and non-human animals, that there’s ways of catastrophically interfering with the learning process by introducing a new task. So it’s, in general, an issue with learning, and in particular, AI was really failing at this kind of problem, and she created a DARPA program to mitigate the issues. We were on that DARPA program, and we realized pretty quickly what was happening with the existing approaches to try to mitigate this issue, which is, you would train a machine learning algorithm to perform well on task A. Then you would do some tricks so that it would learn how to perform well on task B also, without forgetting task A. And then whichever task you told it it now had to perform, it would do well on that task. And that was, you know, a lot of work from a lot of people, and very impressive. However, it’s limited in a few ways, and in particular, the way that we’ve identified, one of the limitations, is that if there’s a way to predict what task is going to happen next, then you don’t have this issue.
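A toy illustration of the catastrophic forgetting described here, not the actual benchmarks from the DARPA program: a single linear classifier trained with SGD on a task A, then on a task B whose labels conflict with A’s, loses task A entirely. Everything below (the data, the learning rates) is a made-up minimal example in Python.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(flip):
    # Two Gaussian blobs in 2D; task B flips the labels of task A.
    X = np.vstack([rng.normal(-1, 0.5, (100, 2)),
                   rng.normal(1, 0.5, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)
    return X, (1 - y) if flip else y

def sgd_train(w, X, y, lr=0.5, epochs=50):
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))       # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)   # logistic-loss gradient step
    return w

def accuracy(w, X, y):
    return np.mean((X @ w > 0) == y)

Xa, ya = make_task(flip=False)  # task A
Xb, yb = make_task(flip=True)   # task B: same inputs, opposite labels

w = np.zeros(2)
w = sgd_train(w, Xa, ya)
print("after A, accuracy on A:", accuracy(w, Xa, ya))  # high
w = sgd_train(w, Xb, yb)
print("after B, accuracy on A:", accuracy(w, Xa, ya))  # collapses: forgetting
```

Mitigating this, the “tricks” Jovo mentions, is what the lifelong-learning literature is about; his prospective point is that when the task sequence itself is predictable, a learner can do better than merely not forgetting.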
And so imagine instead of being told that you’re doing task a or task b, it’s just switching from task a to task b to task a to task b, and you don’t know when. The existing algorithms, some of which can handle this situation, what they do is they retrospectively respond. So after they make a bunch of mistakes on task b, they realize, oh, this must be task b, and therefore they switch to task b. So after a switch, they’re always making at least one mistake to discover which task they’re on, and then they can do well. And what we realized is that if the tasks are following some predictable dynamics, which isn’t necessarily the case, but if it is the case, then one could predict that the task is going to switch, and one could then prospectively do well, rather than retrospectively react to the fact that there has been a switch. [00:57:01] Paul: By dynamics here, the key is that you’re simply adding time into the algorithms, having them keep track of time, so that they can say, I was in task A for ten minutes, and then I was in task B for five, and then I was in task A for another ten. And so I bet I’m going to be in task B for another five, would be the guess. And if it is that structured kind of series of changes, then, formally, it behooves the learning algorithm to keep track of that and predict that the task is about to switch.
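To make that concrete, here is a minimal sketch in Python. This is not code from the paper; the ten-minute/five-minute schedule is just the hypothetical Paul describes above, and the function names are our own. It shows how a learner that tracks absolute time can know the active task, and anticipate the next switch, before making a mistake:

```python
# Hypothetical fixed schedule: task A for 10 minutes, then task B for 5,
# repeating forever (period = 15 minutes).
PERIOD, A_LEN = 15.0, 10.0

def task_at(t_minutes: float) -> str:
    """Which task is active at absolute time t under the fixed schedule."""
    return "A" if (t_minutes % PERIOD) < A_LEN else "B"

def minutes_until_switch(t_minutes: float) -> float:
    """How long until the schedule next switches tasks."""
    phase = t_minutes % PERIOD
    return (A_LEN - phase) if phase < A_LEN else (PERIOD - phase)

# A prospective learner can load the right policy ahead of the switch,
# instead of waiting to accumulate errors and reacting retrospectively.
for t in (0.0, 9.5, 10.0, 14.9):
    print(f"t={t:5.1f} min: task {task_at(t)}, "
          f"switch in {minutes_until_switch(t):.1f} min")
```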
I mean, arguably, not all problems require a temporal discounting, and some of them benefit from it, and sometimes it doesn’t make sense without it. So the deal with reinforcement learning is beautiful theory and very nice numerical experiments in certain settings. For example, video games, you can now use reinforcement learning algorithms to learn how to play video games, to basically human level performance with essentially the same amount of training experience as humans. That is a massive accomplishment from a large group of people that have been working on this for now, decades. [01:00:32] Paul: Yeah. [01:00:33] Jovo: However, the algorithm continued to have certain problems. For example, and this is like from a paper recently, if you just like, change the colors of the video game now, the algorithm does really poorly, or you shift everything over by a few pixels. The algorithms perform very poorly. So those are problems even in the video game level. When you get into the real world, the amount of training data that’s required for these algorithms to work is typically so large as to mean that people don’t really use reinforcement learning algorithms for the most part, in deployment for real world problems. Now, that’s not to say that they won’t or they couldn’t. It’s just as my understanding the current, say, discipline is not what’s done. Now, why is that? Well, I don’t know why that is, but we have a slightly different approach to dealing with time than the way reinforcement learning does. And there’s a few interesting differences. So one is just that our framework doesn’t require, it doesn’t have all the same machinery, in particular, reinforcement learning. The whole problem is set up that you’re making a decision that will impact what happens in the future. There’s many problems with times where that’s just not the case. Like, for example, you want to build an AI algorithm to detect whether something fell into the pool. You say you have a public pool and you really don’t want people falling in at night when there’s no lifeguard. However, the camera’s performance is going to degrade over time. You know that a priori because gunk’s going to get on the lens and like electronics are going to start going bad. And so you really want to be able to have an AI system that’s going to be adapting to changes over time even though the detection algorithm isn’t impacting whether people actually go into the pool or not. So that’s not a reinforcement learning problem, but that is a problem where you really care about encoding time in the algorithm so that, you know, say, when you should replace the system or update the data or something like that. [01:02:42] Paul: So, but in the, let’s say like the video game example that you gave, right, where you move a few pixels and then all of a sudden you can’t, the RL algorithm can’t perform at all. I mean, there are cases where prospective learning is better suited to be implemented and worse. Right. So I just immediately thought, well, what would you need prospective learning for that? Would that be a solution for that? [01:03:06] Jovo: I think it’s not clear. What I would say is reinforcement learning kind of evolved out of a perspective on, say, how learning should happen in basically a natural intelligence. And there’s really elegant theory, partially observable Markov decision processes is a formal theory underneath most of the theory behind reinforcement learning. And there’s lots of algorithms around it. 
But to a large part, it’s distinct from 100 years of kind of classical learning theory and probably approximately correct learning theory, which doesn’t have time in it at all. And so one of the implications of this, I’d say, historical accident of the different evolution of learning algorithms and theories is that reinforcement learning can’t easily leverage a lot of the machinery that was built up in the rest of statistical learning theory, because it uses different language and a different problem setting and things like that. So what we tried to do is take the thing that is, like, the bread and butter, basically how all AI is deployed in the world today, which has probably approximately correct learning theory underlying it, where you assume there is no time, you just try to learn stuff, and say, well, let’s see what happens if we include time. And the simplest, most obvious result that we prove now is that there are many kinds of settings where classical approaches like empirical risk minimization, which we know provably will converge to the best possible answer for classical learning problems, we can prove will fail for problems where you need time, and a simple modification to that, which we call time-aware empirical risk minimization, succeeds. And so the point is just, if you literally include time as an input, like, you have whatever images as the input, but also when the images were taken, just as another feature, you just include it, and you include it in a particular way, then you get these really nice results where things can be changing in time and you still perform really well over time. [01:05:15] Paul: So last time that we talked, I brought up that time is inherently part of transformers because of what are called positional encodings, which just say, this word came at one, this word came at two, this word at three, et cetera. And I said, well, that’s kind of like time, right? And you made the distinction that, basically, it’s not absolute time, it’s relative time. And the same thing goes for, like, recurrent neural networks, which keep track of the sequences, essentially, in their internal dynamics, more or less. So I think I just summarized that. Okay, but then in this latest paper, you added time to a transformer, which you hadn’t told me about before. So what’s going on there? [01:06:05] Jovo: Yeah, so everything you said, I think, is exactly right. Positional encoding in transformers converts the relative position of the token, or the word in a sentence if you’re using transformers in language modeling, to not just a number but, actually, a matrix that represents time. That’s positional encoding. We changed that in our paper so that, instead of encoding the relative position, it encodes the absolute time. And we’re not the first ones to have changed that. Other people have done that; there are several papers on autoregressive transformers. And so we’re just pointing out that this notion of encoding relative time can be used for encoding exact time. And our real contribution is just that, if you do that, then empirically you can learn things where time is changing, where empirical risk minimization, such as a transformer that doesn’t deal with absolute time, like a vision transformer if it’s on images, tends to fail when things are changing in time.
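As a toy illustration of the “time as just another feature” idea, here is a sketch under our own assumptions, not the paper’s code; the drift model and numbers are invented for the example. A classifier whose label boundary drifts over time fails on the future when trained the classical way, but tracks the drift once the timestamp is appended as a feature:

```python
# Toy "time-aware ERM": the label boundary drifts with time, so a classifier
# that ignores time underfits the future, while the same classifier with the
# timestamp appended as an extra feature can track the drift.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
t = rng.uniform(0.0, 1.0, n)            # absolute time of each sample
x = rng.uniform(0.0, 1.0, n)            # the "real" feature
y = (x > 0.3 + 0.4 * t).astype(int)     # decision boundary drifts upward

order = np.argsort(t)                   # train on early times, test on later
train, test = order[: n // 2], order[n // 2 :]

X_plain = x[:, None]                    # feature only
X_time = np.c_[x, t]                    # feature plus absolute time

clf_plain = LogisticRegression().fit(X_plain[train], y[train])
clf_time = LogisticRegression().fit(X_time[train], y[train])

print("ignores time:", clf_plain.score(X_plain[test], y[test]))
print("time-aware:  ", clf_time.score(X_time[test], y[test]))
```

Because the drifting boundary is linear in the pair (feature, time), even a plain logistic regression recovers it once time is a feature; the time-blind model is stuck with whatever boundary fit the past.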
[01:07:07] Paul: So presumably, I mean, I already mentioned, right, and I think you mentioned, that there are situations where prospective learning is optimal and situations where it’s not necessarily optimal. Are you envisioning, like, an overall system? Well, the question is, how would a system know when the statistical structure of the task switching is predictable enough to then turn on a prospective learning algorithm? Or would you just leave it on all the time? I mean, can it be detrimental to a system, right, if you’re implementing prospective learning when you should be implementing, you know, whatever i.i.d.-type, what is it, probably approximately correct-type learning? [01:07:54] Jovo: Yeah. So for sure, time is another variable, which means if you include time, you’ve added variance. And so if there’s no signal in time, you’ve simply added variance and no signal, so your signal-to-noise ratio has gotten worse, and your performance will be worse. I would argue that for almost any real-world problem, time does matter; things are changing in time in the real world. The question really isn’t whether time matters, but whether it matters in a way that we can usefully predict, right? And so if it’s changing in such a chaotic way that it’s impossible to learn how it’s changing with the amount of data that we get, then including time will not be useful. And that’s an empirical question. But it’s not a particularly different empirical question than, like, should you use a random forest or a support vector machine? It’s just, empirically, what works better for a particular application? And I think the answer is going to be, almost always, you’ll want to include time, because certain things will be changing in a predictable fashion that we can learn, and it’s worth it. But that remains to be seen. [01:09:09] Paul: So for you and I. Yeah, I think I got the grammar right there. Time kind of resets every day, right? Based on sunrise and sunset. So we have, like, daily, circadian time. We also have time since the beginning of the year, time since we popped out of wombs or bellies, whatever. Does time need to get reset in these kinds of models? Or does it go into the past to infinity as well? Is there a usefulness to having layers or levels of time? Because you use the example of traffic, right? You have a sense of when there’s going to be traffic. But you could often tell that by the sun, and that resets every day. And if you started over every day, then it would be at the same time every day. [01:10:01] Jovo: Yeah. So I want to say a few things. First, I’m pretty sure we all came out of wombs so far. [01:10:07] Paul: Yeah, I meant. Never mind. Yeah, I was picturing a natural childbirth type scenario as opposed to a C-section. Yes. [01:10:16] Jovo: Cool. And then in terms of traffic, it’s a great example, because you have to learn when traffic is relative to sunrise. It’s not genetically built into us. And in fact, that will change depending on where you are on Earth; Alaska versus Florida, there are different traffic patterns corresponding to the sun. So you’ll have to learn the spatiotemporal dependencies of that through just experiencing the world. And if you do, then you’ll be better at predicting when you’ll arrive at a place than if you didn’t learn that. So it’s a great example of why you should care about time.
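One simple way to realize the layered time Paul asks about next is to give a model both a cyclic daily coordinate and raw absolute time, so 23:59 and 00:01 look close together for daily patterns like traffic, while slower trends remain visible. This is our own illustrative sketch, not anything from the paper:

```python
# Layered time features: a cyclic coordinate for time of day (so 23:59 and
# 00:01 are neighbors, useful for daily patterns like traffic) plus raw
# absolute time (useful for slow trends like seasons or sensor degradation).
import numpy as np

def time_features(t_hours: np.ndarray) -> np.ndarray:
    """t_hours: absolute time in hours since some fixed origin.
    Returns columns [sin(day phase), cos(day phase), absolute time]."""
    day_phase = 2.0 * np.pi * (t_hours % 24.0) / 24.0
    return np.c_[np.sin(day_phase), np.cos(day_phase), t_hours]

# 8 a.m. on day one and 8 a.m. on day two share the cyclic coordinates
# but differ in the absolute-time column.
print(time_features(np.array([8.0, 32.0, 23.99])))
```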
[01:10:59] Paul: But so then do we need to have, like, a layered structure of keeping track of time? Right? Because there are certain things that happen seasonally, and I know not to go look for peaches in the winter. Right. [01:11:17] Jovo: Yeah. But you learned that. [01:11:19] Paul: Yeah, of course. But that’s a different scale, a different scale of time, of keeping track of time. Right. [01:11:26] Jovo: Yes. So I make no claim about what the right scale is or how many scales. I think, in general, it makes sense to think about multiple timescales of learning. In fact, there’s an interesting thing here, which is that the timescale of learning has to be, like, matched appropriately, in a way. Like, if you learn slower than the things are changing, you can’t ever learn it. So if you have a learning rate, then your learning rate has to be fast enough to keep up with the dynamics of the thing, but slow enough that you actually do learn the thing. And there’s lots more work to do to explore and understand, like, the optimal learning rate for certain kinds of temporal dynamics. [01:12:12] Paul: One of the things that you do in the paper, and you’ve mentioned this a little bit already in our conversation, is compare prospective learning to other approaches to learning as things change: continual learning, lifelong learning, meta-learning. So there are these other approaches, and, going back to the usefulness of mathematical formalism, one of the things I really enjoyed in this latest draft is, um, you know, there’ll be, in bold, a section entitled comparison to lifelong learning. And then you start with a sentence: if we use the English language, lifelong learning and prospective learning are the same thing; if we use mathematics, they are not. I’m paraphrasing, but every comparison section starts with that. So that is the usefulness of the formalism. And you and I were speaking in English. So when I ask you a question, I wonder if it, not gets under your skin, but if you automatically think, well, it’s kind of useless to ask that question in English, about timescales or, you know, something like that, because what we would need to do is just immediately formalize it and test it in theory. [01:13:26] Jovo: No, I think, you know, English is great, and, like, it’s how we can. [01:13:33] Paul: We don’t want to. We don’t want to. You don’t want to start communicating in clicks and beeps? [01:13:37] Jovo: Well, I’m open to it, I guess. I would say, like, the formal language also includes spoken language. I mean, it could be English, or any other language too. Like, you know, I think people like to pretend the formal language is a totally separate thing, but if you look at even the statement of a theorem, and certainly the proof, there are a lot of words. It’s not just math symbols, and so there’s a dance of them together. And I think the formal language can really help refine and make precise the English language, or any other language that we’re speaking. And at the end of the day, we, you and I and other people, communicate using spoken language, sometimes interlaced with some formalisms. But it’s really helpful to be able to explain things and understand things at the level at which we communicate. [01:14:27] Paul: Well, and you and I are also communicating with facial expressions and hand gestures and all that.
But that’s a separate thing, even less formal. So I want to move on. I want to move on because I want to make sure we talk about a couple other topics. But where are we? It’s still new, prospective learning, right? Where are we in the trajectory? When will you feel like it has reached a certain level, not that it has made it, but that it has become useful? And so, yeah, where is it now and where is it going? [01:15:00] Jovo: Well, my somewhat obnoxious prediction is that basically everything will be using principles of prospective learning: all the AI, all the modeling of humans and stuff. It will all be doing it relatively soon. That means the next five to seven years. [01:15:23] Paul: So if you do that on top of what’s being trained these days, that means you’re wasting way, way more power, right? [01:15:31] Jovo: Well, I actually think a lot of the failures of AI, not the moral failures or the capitalism failures, but the actual, like, technical failures, where systems don’t perform as intended by the designer, often have to do with the fact that things are changing in time, and the people designing the system failed to integrate that awareness into the training and the model and stuff. And so the most famous example of this is Google Flu Trends, where, I don’t know if you remember this, but several years ago, Google announced they had a big new flu trend predictor. And based on what people were searching for, they could predict how much flu there would be in the upcoming days or weeks or months or whatever, and they made a big PR splash about it, and everyone was very excited about it. And then relatively soon afterwards, people realized that if you literally just used the most naive model possible, where you just predict whatever happened last time, it worked better. [01:16:33] Paul: Than Google’s Flu Trends did, or something like that. Yeah. [01:16:40] Jovo: It was a big embarrassment for Google in this space. But the reality is, I would say, that because they didn’t really address the fact that there’s time, that was one of their big failure modes. And so, obviously, predicting what’s going to happen in the future is inherently about understanding time and the relationship between what’s happening and the time that things happen. So I really do think it will make things more efficient and better. Now, I have fear around who’s in charge of AI today and what the implications of more efficient AI are for, say, the rest of the humans on the planet and the rest of the planet. But I don’t think it will lead to more consumption of resources in the way I think you were implying. Rather, hopefully it’ll be more efficient algorithms, because you’ll be able to predict things in time with less data. [01:17:38] Paul: Maybe we can kind of mix it in, in the last few minutes here, because we talked a little bit about your take on AI doomerism last time, which is roughly that. Well, what is your take on AI? So AI doomerism is the idea that the robots are going to kill us eventually and we need to worry about it. [01:18:02] Jovo: Yeah. And my take on AI doomerism is that AI is a tool, and it’s a very powerful tool. That means it will harm lots of people and help lots of people. And you don’t need to guess about the future. We know today it is currently harming lots of people in various ways.
And so we can talk till we’re blue in the face about who we expect it to harm in the future, and whether it’s all of us or just the rich people or just the poor people or whatever. But the reality is, it’s currently having, I think, relatively profound effects on all of us, because it’s a powerful tool. And I’m much more interested in the conversation of understanding who it’s harming today, and how we feel about that, and what we would like to do in order to change it, to make it less harmful for, say, the billions of people on Earth that essentially have no stake in, and no ability to change, what it’s doing, but are impacted by it. [01:19:01] Paul: I mean, so one pushback that you could get on that is that, sure, there are problems these days, but the robots taking over and killing all of humanity would be, like, the worst problem, because then we’d all be wiped out. I mean, I’m just. But what you’re saying is that the relative weight of what we should worry about right now is right now, and we should fix what we can fix right now. [01:19:26] Jovo: I think I’m saying something like that. But what I’m saying is, it depends who you are. Like, the worst thing that could happen is it could kill me today. Now, it is killing lots of people today. So it’s already catastrophic for all those people that are suffering catastrophically because of it today. The people who talk about AI doomerism are often not the people who are suffering today; they’re people who are worrying about it or, like, getting funded for it. You know, there are also lots of people talking about AI doomerism who are being impacted by it today in many ways. But my point is, like, the worst thing is very subject-specific. And, like, if everyone else dies and I’m alive, you know. It’s like, if I die, that’s real bad for me. And, like, I don’t need to imagine everyone else also dying to know me dying would suck. [01:20:14] Paul: But you’re obnoxious, so it’s good for the rest of the world. [01:20:17] Jovo: You know, it’s to be determined, I guess, by St. Peter or whoever. [01:20:23] Paul: Right? Right. So, yeah, when we talked last time, I mentioned the book Superintelligence by Nick Bostrom. And you tried to read the book; you couldn’t. I suffered through the whole thing. Nothing against Nick Bostrom, but I mentioned that in almost every sentence was the word maybe, in extrapolating that this may happen, that may happen. And if you put them all together, it’s such a low probability. And you made the point that, yeah, I mean, these are all probabilities, that everything is a probability in the future. And so why don’t we work on what we can work on right now? [01:20:56] Jovo: Yeah, I mean, I guess my perspective on that is, everything is maybe. I don’t know. I’m a huge fan of epistemic humility and generally acknowledging that I literally do not know the right answer, ever. [01:21:10] Paul: I’m with you, man. [01:21:11] Jovo: And so then that comes down to computing probabilities and costs associated with those things, and I don’t know the right estimate of the costs or the probabilities. So I’m not saying what one should or should not worry about. My only claim is, today it is clear there’s lots of harm being caused by people running AI algorithms. And I care. I care about all those people that are suffering. I care about all the non-people that are suffering, and I also care about potential future people. But those people might not exist.
Like, for example, if there’s an asteroid that kills everything in 100 years, and AI would have done it in 200 years, it just doesn’t fucking matter that AI would have done it in 200 years. All of that effort was wasted. [01:21:59] Paul: Well, that asteroid may hit us, maybe, right? [01:22:04] Jovo: Like, Gaia or the Earth probably will be fine either way, whatever that means. I would like ecosystems to continue, because I live in them and my children live in them, but I certainly don’t know the right thing to happen. [01:22:22] Paul: Before I let you go, I wanted to bring up the doomerism just because you had mentioned the people running AI algorithms, and efficiency, and the problem with that. But to bring it right back to prospective learning, a couple of quick things, and I’ll let you go. One, do you feel, like, competition, like getting scooped or something, from other people who find the value of incorporating time and a prospective, forward-looking approach? [01:22:52] Jovo: I don’t tend to worry about being scooped, but I think I say that from a place of, like, profound privilege. My life is great. Like, if I get scooped, my life will be great. [01:23:03] Paul: But is there competition surrounding you that you know of, however you feel about it? [01:23:08] Jovo: Yeah, I would say, like, transformers are kind of onto this idea, and, like, one could think of what we’re doing, or maybe part of what we’re doing, as trying to formalize some of the ideas that are kind of implicit in transformers and attention and stuff like that. So I would say, like, maybe almost everyone is kind of working on this problem, and we’re just trying to formalize it. Like, a great success would be if we could prove when transformers can solve certain kinds of time-varying problems and when they can’t, and, you know, we haven’t done that at all yet, but that would be a cool thing. So I kind of think lots of people are working on this one way or the other, and I hope to contribute to that, because I like it and it’s fun. But, yeah, I think it’ll make progress with or without us. [01:24:03] Paul: Okay. The last thing I wanted to ask about is, you know, I mentioned earlier, you know, do you keep track forever? And then I was thinking, as you learn, let’s say you’re learning musical instruments, right? So sometimes it behooves you: if you learn piano, you’re probably going to be better at trombone or something, and then if you learn a new musical instrument, you’re probably going to be even better at that, having learned piano and trombone, even though they’re different tasks. There’s overlap, right? I cannot conjure it; maybe you can. I know that there have been published studies, and I know that there are cases where forgetting is a benefit to learning new things. And I’m wondering if you’ve thought about forgetting in terms of prospective learning and the value that it might bring when appropriate. [01:24:55] Jovo: Sure. So, computationally, forgetting can serve two purposes. One, if you have a finite memory, then you’ll want to forget stuff that’s not that useful for the future. Otherwise, you’ll forget the stuff that is useful for the future, and that would suck. That’s one purpose. The other is with respect to inference: if you have a large memory bank, it can take time to search it for the right answer. Whereas if you forget a bunch of stuff, in particular useless stuff or redundant stuff, then you can search it faster, and so you can do inference faster.
So those are two real, unambiguous purposes of forgetting from a computational perspective. I don’t know if that answers your question, but that’s why forgetting is useful in general. [01:25:43] Paul: Yeah. Okay. In a formalistic way, as usual. Jovo, did we do it? Take two. Do you think we did it this time? [01:25:50] Jovo: Pretty happy with it. I also. I’m getting rained on, which I feel pretty excited about. [01:25:54] Paul: Oh, there’s a thunderstorm out my window as well, but I’m not getting rained on. Well, let’s say goodbye real quick. So thanks for doing it again. I’m glad we got a redo, and I hope you’re happy with it. Keep up the good work. [01:26:07] Jovo: Thanks, man. I really appreciate your time. [01:26:24] Paul: I alone produce brain inspired. If you value this podcast, consider supporting it through Patreon to access full versions of all the episodes and to join our Discord community. Or if you want to learn more about the intersection of neuroscience and AI, consider signing up for my online course, Neuro-AI: The Quest to Explain Intelligence. Go to braininspired.co to learn more. To get in touch with me, email paul@braininspired.co. You’re hearing music by The New Year. Find them at thenewyear.net. Thank you. Thank you for your support. See you next time.