Steadcast
Talk the Talk

136: These Languages Are Anchors (with Mary Walworth)

April 25, 2026 · 2h 10m · 21,103 words

Show notes

What will happen to the languages of climate refugees? Dr Mary Walworth has been working with the small island community of Nusi in Papua New Guinea, which was recently featured in an episode of Pole to Pole with Will Smith. What's it like doing media for a large audience, and more importantly, how do we help speakers hold onto their language? Dr Walworth joins us for this episode.

Timestamps
00:00 Start
00:45 Intros
04:00 What's coming up on this episode
06:58 New patron shoutout and spruik
09:05 News: Air Canada CEO resigns over English communication
14:34 News: Banning foreign language election information in the UK
23:54 News: Can LLMs pass the Wug Test?
34:03 News: Does closing your eyes help you hear better?
39:48 Related or Not: Theme
40:23 Related or Not: GORGE, ENGORGE, and GORGEOUS
48:09 Related or Not: ANY and MANY
51:54 Related or Not: GUAVA, GUACAMOLE, GUARANÍ, GUARANÁ
01:00:20 Interview with Mary Walworth
01:45:52 Word of the Week: bimbofication
01:54:00 Word of the Week: glottophobia
01:55:50 Word of the Week: liveness check
01:57:54 Comment from Colin: CLAN and PLANTA
01:59:03 Comment from Lauren: crash blossoms
02:02:49 The Reads
02:08:09 Outtake

Transcript

0:00Hedvig has said that she's on whey. I wonder what that means. That she is bulking up. She is in a bulking regime for she is on whey protein. Oh, right. I think that she meant to spell no way, meaning there's no way I'm getting on a podcast with you guys. Understandable. The older I get, the more Groucho Marx I am becoming, right? Like, why would I want to be part of something that would have me? Terrible.

0:30Hello and welcome to Because Language, a show about linguistics, the science of language. My name's Daniel Midgley. Let's meet the team. First up, it's all around, man about town, Ben Ainslie. Ben, hello. From around town, I send you my salutations.

1:02Ben, could you please prove that you're a human and not an AI? Um, okay.

1:10So the way I notice that I'm the most human is I, to Hedwig's eternal chagrin, use dictation on my phone a lot because I really hate typing texts. And one of the most frustrating things about it is clearly my mode of speech just is not very like the corpus of data that natural language processors have been programmed on because I'll use words like shall and they'll, they'll, it just, it can't, it doesn't, it doesn't fuck with stuff like that. It's like, that's 1% of instances when people say a word that sounds like shell or shield,

1:46um, or shield that they'll actually be saying shall. So I'm just going to correct it to one of these other ones. And then I have to go through all of my text in a thing that saves me no time at all and go, uh, no, that's, that's this word. And no, that's this word. And no, that's this word. Um, so yeah, I think just the way that I speak kind of definitively classes me as not AI. Well, you, you had me when you tripped over the word processor. So I, I was convinced then. Also, I'm grumpy sometimes and like not American sounding.

2:17So, and I don't mean in terms of accent, but I don't have that like, sure thing, buddy. I can help you out with that. Um, demeanor, which all AI seems to have. Mm. Thank you. I'm convinced. Next it's good human, Hedvig Skirgård. Hedvig, same request. Prove to me that you're a human. Prove to you that I'm a human. You could tell me something that only you would know about me. Oh, only I would know about you? You have weird coffee cups that are like double walls, like transparent.

2:51They're not weird. They're vacuum sealed. Oh, sorry. Nineties. It's very unlikely that you would know that. Mm-hmm. I smashed one of those earlier today. I like them, but they're unusual. Did you? Yeah. Oh, they are. Not in a fit of rage, to be clear. I just dropped one. I don't know how I would prove that I'm a human. And I can't even refer to my very human sense of anxiety because I don't have one. I do have unwarranted confidence, but bots have that too. So I could actually be an AI. Sorry about that. And you do have that kind of like boundless Americano, like good vibe optimism as well.

3:26So you're like squarely in the slot for that. That's a persona. That's a sona. We need to bring him down with more like Aussie, European, whatever it is that we have. If we haven't done it by now. Yeah. I've got to be honest, Hedvig. I've been, as we'd established in our last show, I've been slogging away at that task for 15 years. It's just, I'm seeing precious little movement on the frontier. Can't beat my boundless sense of optimism. Mm-mm. It was put in there by American exceptionalism too young and too fervently.

3:57At least publicly. Well, today on the episode, we're talking to Dr. Mary Walworth of the French National Center for Scientific Research. She studies language loss. She's been working with people on, now I hope I get this right, Enousie Island in Papua New Guinea. Sometimes it's called Tench Island and they speak a language called Nousie or sometimes Tench, Tennis, which is Tench. Well, as Ben might have guessed, Ben, do you want to guess why there's the name Enousie

and Tench for this island and people and language? Which one do you think is local and which one do you think is something else? Let's play the game. I'm guessing that Tench is what the French did and Enousie is like all the other way around. Is it French? Is it British? I don't think it's French, but it is the name of a colonial general who, or lieutenant or something, who saw the island. But Hedvig, surely they have the Ench ending in their words because it's in the word French.

4:59I mean, I was picturing an E at the end of Tench, like Tenche or something. No, it's just like if you want to Google the island, you can Google both. Enousie is becoming more and more common, I've seen, but sometimes in older sources people will refer to it as Tench. So it's good to be aware of both, but people prefer to be called Enousie or Enousie. And the language is Nousie, but it's Enousie Island. What's the difference between Enousie and Nousie?

You would know this, Hedvig. I don't know this, and I think it's possible, I can ask Mary. Mary and I also happen to be friends, which was really funny because when Daniel said like we're going to interview Dr. Mary Walworth, I was like, great. She's like, used to be in an office next to me. Like I know her very well. And apparently Daniel didn't know and the people at, who was it? It was National Geographic who set us up, also didn't know. So that was fun. Yeah, I had no idea. And I've been trying to stay away from Max Planck Institute people.

6:01And yet they're just, whenever there's something interesting, they're there. Lousy. Lousy with Planckers. There's a lot of us. But you asked about the difference between Enousie and Nousie. I don't know the difference. I think it's possible that you can maybe sometimes call the language Enousie as well. I don't know the difference. Okay. I might flip back and forth indiscriminately during the interview with Mary. And we're interviewing Mary because she was recently on an episode of Pole to Pole with Will Smith, who is an actor and a rapper.

6:34There we go. We were interested in a couple of things. We were interested in what climate change is doing to minority languages. But we were also interested in, like, what goes into communicating linguistics to the public on a big scale, like a television program. So that's all coming up in this episode. Or a statewide radio show, Daniel. Well, I'm still coming to grips with that one. You know, our first live episode of the year is coming up, and that means we're inviting all of our patrons, both free and paid, to attend.

We'll be announcing the date and time shortly. But to get in, you need to be a patron. So sign up. Patreon.com slash BecauseLangPod. Does anybody want to read our new, latest patrons? I can do this thing. Would you please? Patrons are great. And we've got some new patrons. Our newest supporter is Becky. At the listener level, we've got Renee on the beach, bumping it up from free. Also, Simon A. with a yearly membership. Hello, Simon. All at once, eh?

7:38Big spender. And Susa K. bumping it up from the friend level. And at the friend level, speaking of which, K-C-H. K-C-H. I'm going to go with K-C on that one. I think it's Case. I feel like it's Case. Case. C-A-Y-C-E. Listeners, you decide. K-C or K-C-H. And Benedict, one of the very best kinds of eggs, both with a yearly membership. And our newest free patrons, Autumn, Tom C., Sam, Matt D., Barbara B.

8:12Is that Iara or Lara? That's a Lara. That's a lowercase L. Okay. Lara. Interestingly, Daniel pronounced it as Ilara, which very confusing. Lara. Sarah and Oliver B. To all of those humans, welcome to the fam. I should also issue, like, a general, like, if we say your name on this show and we say it wrong, we probably can't go back and, like, edit it. But if we read you at the end every time and we say it wrong every time, I don't know about you two, but, like, I would like to know.

8:48I would like people to wait just the right number of times to maximize how guilty I feel about having got it wrong. That's what I would do in your series. I think it's somewhere between three and five. Somewhere between three and five is that. That awkward limit. Mm-hmm. Well, let's get to some news. This one was suggested by Jill via email. hello@becauselanguage.com. Hi, Jill. Hey, Jill. This has been in the French-language news this week in Quebec, and I thought I'd send it along in case no one has already mentioned it.

9:20So Jill is going to give us the rundown here. After a plane crash in New York that killed the pilot and co-pilot, the CEO of Air Canada, Michael Rousseau, was criticized for a video where he gave his condolences to the family in English only, although the pilot was a Francophone Quebecer. How about that? Mm-hmm. Mm. That's not going to sit well with the folks in Quebec. Mm-mm. Lots of Quebecers and politicians viewed this as a sign of disrespect. Not the first time Rousseau has been in hot water about his lack of French skills.

9:55Air Canada is subject to the federal government's language laws, so the CEO is supposed to be able to communicate in French. What do you think? Fair? Interesting. I wasn't aware that – so I'm interpreting that last piece of the story to mean any organization or company that is – do they say federally in Canada? Are they a federation across the provinces? It is the federal government's language laws. Yeah. So any organization or company that is operating at a federal level, meaning they have instances across Canada, the expectation is what?

10:35That all members of their organization are bilingual? Yeah. That all members of senior management are bilingual? Like where – I'd be really interested to know what that law is. Well, I'm looking at the text of the Charter of the French Language, which governs language use for businesses in Quebec, in particular Bill 96. That's the most recent revision. And if you're doing business in Quebec, you have to offer services in French. You have to make the signage with French being prominent. For big companies, you have to have a French council that meets every six months.

I'm not seeing any provisions here that every single employee has to speak French or that, in fact, every CEO has to speak French. I mean, if we take a step back for a second, the optics are shockingly bad. Shockingly bad. Yeah. Regardless of what the law is. Like if you are giving condolences to a family in not their language, that's a really bad look. Not good. I think there's a strong expectation that a CEO in a communicatory role who's doing a job like this,

11:41there is kind of an expectation that that's going to be available in all languages, if only just for symbolic reasons. I mean, there's also the idea of a law and a norm, right? Right. So, there might not be a legal basis for the CEO being expected to speak French. But if you've got an entire province or at least a good proportion of a province who will hate you if you don't or are not able to, just even from a business perspective of like you can't be fucking off that bigger chunk of your potential customer base with that kind of regularity.

12:20But if we think about it, this is, I think, a particular thing to Canada because there are also some people, there are multi-cultural nations where this wouldn't, this has more in it than just the language of the pilot and this company, right? This is like a part of like a general trend where French is losing ground in Canada overall.

12:51And this is like one facet of it. I mean, there's news like this, like I know there's a guy, I think, in Winnipeg who gets traffic tickets in English and refuses to pay them unless he also gets them in French and stuff like that. So like this is, I'm not saying that these are necessarily the same, but there's like a general feeling of like that the Canadian government isn't respecting French. And this is one instance of it. Jill continues, if the reaction seems somewhat extreme, this sort of thing happens frequently here, where American slash English Canadian companies are supposed to operate in French to do business in Quebec,

13:26but often they don't or the service is not as good. So Quebecers end up having to speak English to access services, while people here in Montreal, for example, do tend to know at least a bit of English. Generally speaking, the rest of Quebec is much less bilingual. Jill has another interesting angle here. I haven't been following the story in the English language news, but another interesting thing about this was the different angle in the two languages. And if you look at the URLs of two articles that Jill sent, and we'll slap them both up in the show notes for this episode. One of them says, this is the English version, Michael Rousseau retires.

14:03That's the verb that gets used. And then in the French, it translates to, what is the word? It's démission. The word is démission. Is that dismissed? Resign. It's resign. Not just retire, but actually resign. So there's a little bit of a different way of looking at things in the English speaking articles about this, and then in the French speaking articles about this. Kind of fun. Good data point. Jill finishes up, love the show. Thanks, Jill in Montreal. Thanks, Jill.

14:33Now, Hedvig, you shared this one on our Discord. And feel free to jump in if you know about it. I've written a bit on it, but... No, you read your bit that you've written. I know some things, but you've got it prepped. Okay, I'll start off. This is in the UK. I'm getting this from an Instagram post by Sophia Smith Galer. There's also an article by Martin Shipton in Nation.Cymru. Link in the show notes. Here's what's going on. There is a piece of legislation called NC37.

It's an amendment to the Representation of the People Bill, and it is, quote, a new clause that would make it a criminal offence to publish or promote election campaign materials in any language other than English or the native languages of the British islands during the short campaign period before an election. Criminal offence to give out election information in other languages. So you wouldn't be able to put something out in Arabic for, like, majority Arabic-speaking constituents or anything? Correct.

15:35Ooh. Ooh. Ooh. But you would be able to in the various Celtic languages. Yeah, like Scots Gaelic or whatever. Welsh. Yeah, Scots Gaelic. You could be spicy and try Irish and see what happens.

15:55But yeah, this was proposed by the Conservative Party, and it's obvious that they feel like the other parties are catering to populations that they shouldn't be. And that means some, I don't know, an unfair advantage, but maybe also more broadly, they're feeling like they're promoting a sense of the multicultural Britain that they don't. Right. And this is at an age where, like, you know, we all know that, like, AI transcription and Google Translate

16:26aren't perfect, but they're not that bad. And now I'm projecting myself mentally from, like, the influx of linguists who will contact me later. But, like, it's not that bad. I went to Japan recently, and I was in the shop, and I was like, what is this product? And I did, and I was like, it's not a pivotalization, but I can figure out what this is. My smartphone was an invaluable tool in non-English-speaking countries. It's wild how useful it is. Now, here's the argument from the Tories. Tory leader Kemi Badenoch said, this is from the Nation article,

16:59Tory leader Kemi Badenoch said, campaigning with different messages in a foreign language is a deliberate ploy to exclude those who do not speak that language. That's English speakers, I guess. It is not in the service of integration. It is about cynically driving a wedge between groups. Comment? Oh, God. Isn't this similar to a discussion that was had in Australia, in Sydney, a couple of years ago, where there were campaign material in Mandarin? Is this ringing a bell for anyone? No, I didn't know the story.

17:30I think there was a scandal because my memory of this event was that there was different campaign messages for, I think it was the New South Wales level politics, not nationwide. So maybe you guys in Perth didn't pay attention to it, but if you lived in Canberra, you noticed it. But in that instance, I believe there was different campaign messages in the different languages, and that was a problem. But people found that out because there are bilingual people,

and also you can fairly easily figure out how to translate material. So the problem wasn't necessarily that they were different languages, it was that they were making different promises or doing different things. Secret messages. Yes. But they're not that secret, right? Yeah, yeah, yeah. The reason why the journalist wrote the story is because it was relatively easy to find out. Yeah. But you're making a rational argument, though, Hedvig, and that is never going to work with people who are not making arguments in good faith.

18:34And the Conservative Party in the UK is not putting forward the argument in good faith. The things that they are saying, they do not believe. That's, I know we tend to, I know you subscribe to this, Hedvig. Daniel, I think you're a little bit more cynical in this regard than I tend to be. But I usually don't ascribe malice when stupidity is a more obvious solution or answer to a question. But in this particular instance, when we get to conservative politicking and campaigning,

19:08I see malice far more than I see stupidity. There's so much stupidity and malice around. How can we keep track of which one it is? I mean, why is it that accommodating minority groups is labeled as divisive? Because they don't vote for conservative people. Oh, I see. So let me make sure I got it right. Racism, sexism, language discrimination, not divisive. Trying to fix those things, divisive. Have I got it? Trying to fix them by not making everyone homogenous.

19:39Yeah. And I would also, I would go a step further and say, and this is where my real cynicism kicks in. This is from the same playbook as various voter exclusion laws in the United States. This is a group of people who want to make it harder for certain groups of people to vote because those groups of people will not fucking vote for them. So we don't want people writing materials for, I don't know, the Pakistani community, the Jamaican people. I mean, those are terrible examples.

20:10The Cantonese community, the Polish community, the, you know, et cetera, et cetera, et cetera. Because those communities are not going to fucking vote for the Tories. But also, I mean, it also betrays just the basic thing, which is that, like, they want a homogenous sort of not fully ethno state, but a little bit cult states, cultural states. And they don't want a multicultural state. And that's sort of the basic assumption.

20:40And they're also, of course, probably trying to win voters from the parties who are even more xenophobic than they are. And I have to say that if they were sitting in that campaign room and choosing between different options, then this one directs the ire at the other politicians. Which is, like, you are bad because you are writing campaign material in Urdu.

21:11I get it. Which, I'm being very cynical, it's a little bit better than we should throw out the people who speak Urdu. I agree with you. It's like one step up in the hell. I agree with you in principle, but I can't help but notice that the undertone of the message is, like, these guys who want to write shit for the dirty brown folk who are, like, fucking up our country. Like, fuck those guys. Mm-hmm. Yeah. So it's kind of about who we want voting and who we want it to be harder to vote.

21:41It's also a discussion about whose language deserves to be represented in the public square and whose language should be excluded. So they're mounting multiple attacks on the people themselves and the people who represent them. One of the things that I imagine is quite challenging in this specific regard in England is, unlike in the United States, Canada, Australia, as much as we have a huge, very vocal block of human beings in all of those places who are like, this is a white majority country, this is what we should be always, blah, blah, blah, blah, blah.

22:15We are settled places. So there's a really easy comeback to that, which is, we fucking stole this land. Like, this isn't, this by definition is a place where people come to from other places and they, whereas England isn't that, right? That's not its historical lineage. As in, it obviously is, because there's been just like waves and waves and waves of people for the last like several thousand years who have come to the UK. But culturally and historically, it is like England where English people come from.

22:50And this was like not a quote unquote multicultural place. This was a, and fighting back against that, I imagine adds an extra layer of like difficulty because there's this like, I don't know, this standard you can bear to be like, this is England. But when you are the British Empire or the French Empire, which are, I think, the two best examples of this, and you're colonizing a lot of other places, and you need labor at home, and you're saying to people like, oh, you guys in the Caribbean, why don't you come and settle over here and work in our factories?

23:25And then be like, oh, no, we brought the culture from our empire to our empire's heart. And like, oh, no. So like, there is some of that as well, which I think happens both in UK and France, where they have colonized a lot of other places. And then those people have moved around freely, or not freely, but have sometimes moved to the imperial core. And then they're like, oh, no, the brown people from the perimeter came to the center? Let's move on to our next story.

23:56It's time for a wug test. Remember the wug test? I do not remember the wug test. Berko Gleason's little squiggle. It looks like a little sort of bird. It looks like a peep, if you know a peep. I do know the marshmallow things. Yeah, American candy. And the test is to show it to a child, show one, and say, this is a wug. And then you show two of them, and say, now there are two. Fill it in, Ben. Wugs. Congratulations. You passed the wug test. Right.

24:26It's a test if kids know how to form regular plurals. Aha. So let's do one. Hey, Ben, I like to spelf. I love spelfing. Guess what I did all day yesterday? Yesterday, I... Spelfed. Okay. Not spelved, surely. Well, anyway. Oh, okay. Okay, interesting. I don't know. Okay. So we give these tests to children to see if they've internalized the rules for English. Can they do plurals? Can they do past tense? And if children can, then can a large language model also?
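The regular patterns the wug test probes can be written down as a few spelling rules. The sketch below is a deliberately simplified, spelling-only illustration in Python; the function names `regular_plural` and `regular_past` are invented for this example, and it ignores irregular forms, consonant doubling, and pronunciation.

```python
VOWELS = "aeiou"


def regular_plural(noun):
    """Apply the regular English plural spelling rule to a (possibly made-up) noun."""
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in VOWELS:
        return noun[:-1] + "ies"
    return noun + "s"


def regular_past(verb):
    """Apply the regular English past-tense spelling rule to a (possibly made-up) verb."""
    if verb.endswith("e"):
        return verb + "d"
    if verb.endswith("y") and len(verb) > 1 and verb[-2] not in VOWELS:
        return verb[:-1] + "ied"
    return verb + "ed"


print(regular_plural("wug"))   # wugs
print(regular_past("spelf"))   # spelfed
```

A child (or a model) that produces these forms for words it has never seen is showing that it has the rule, not just a memorized list.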

24:59Ah. Hmm. This is work from Dr. Nicoleta Planteladou of the Universitat Autònoma de Barcelona, published in PLOS One. They gave large language models wug tests on different languages, and the models were ChatGPT 3.5, ChatGPT 4, Grok 3, BERT, DeepSeek, and Mistral. Now, remember, large language models don't always have access to the internals of words. Remember the strawberry problem from last year? Yes. Yeah. Yeah. So you ask a large language model how many Rs there are in strawberry, and it will say the wrong number, because it's not actually tokenizing within words and counting things.

25:36It's sort of, like, relying on, like, probabilities, and, like, often when people ask how many there are, the answer is three. So I'll say three. Yep. Now, some of the languages they used are high-resource languages with a lot of data, like English and Spanish. Some are rather smaller languages with low resources, like Greek and Catalan. But another thing that they wanted to check for was complexity. Some languages are more complex in certain ways, and some are less.

26:09Like, does it have tensed forms? In English, you've got buy and bought. That's something different. And you've got lots of verb inflections. So the list that they had from least complex to most was English, Spanish, Catalan, and Greek. The reason they chose it in that order, well, they used Grambank, actually, and they were looking at grammatical features. And English came up less complex because it doesn't have as many features encoded with affixes, which is what gives higher fusion scores.

26:43Why did they say least to most? Yes, fusion is the score I made up. What? Tell me about fusion. Well, fusion, technically, is a thing in theoretical linguistics that means that particular, like, it means, like, how much material you have fused to each other. So you can think of it in a way as, like, how much, like, affixes and stuff you have stuck on each other versus if you have a language where things are more freestanding. There are more aspects to fusion, specifically for Grambank. And we have a bunch of features in there that say, like, do you have a past tense marker on your verb?

27:15Do you have a plural marker on your noun, etc.? And then we count up for each language how often the answer was yes. And then we say that if you answered yes more often to such questions, if you have something bound on something else, you have a higher fusion score. And me and Hannah Haynie created this score because we were trying to understand our principal components analysis that we were getting out. I've written a little post about it on our wiki if you want to learn more. But basically, it's a very simplified way of saying how much sort of things do you have stuck onto other things.
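As a rough illustration of the counting described here, and not the actual Grambank procedure, a score like this can be sketched as the share of "is it bound?" questions that a language answers yes to. The function name, the feature names, and the yes/no values below are invented placeholders, not real Grambank data.

```python
def fusion_score(answers):
    """Share of coded features answered 'yes' (1). Uncoded features (None) are skipped."""
    coded = [value for value in answers.values() if value is not None]
    if not coded:
        raise ValueError("no coded features to score")
    return sum(coded) / len(coded)


# Hypothetical languages with made-up answers: 1 = yes, 0 = no, None = not coded.
language_a = {
    "past_marker_on_verb": 1,
    "plural_marker_on_noun": 1,
    "case_marker_on_noun": 0,
    "person_marker_on_verb": None,
}
language_b = {
    "past_marker_on_verb": 1,
    "plural_marker_on_noun": 1,
    "case_marker_on_noun": 1,
    "person_marker_on_verb": 1,
}

print(fusion_score(language_a))  # two of three coded features are 'yes'
print(fusion_score(language_b))  # every coded feature is 'yes'
```

Under these assumptions, a language that sticks more markers onto its words ends up with a score closer to 1.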

27:53Okay. Well, the least to most list is English was the least fusional, Spanish, Catalan, and Greek was most. Yeah, that seems right. They also looked at something else, the informativity dimension, which is how much stuff you have to do. I also made that one with Hannah. Go on then. So, basically, we have a bunch of questions that's just like, do you make a distinction between male and female in your pronouns? Do you make a distinction between inclusive and exclusive we?

28:24You can think of these as information points. You can say, no, we don't make a distinction. We leave it ambiguous. Or you can say, yes, we make a distinction grammatically. So, every time you say we, you have to decide whether it's inclusive or exclusive. You can't leave it ambiguous. Or at least there's an option for you to switch grammatically. And so, we count up for all of those questions that ask about a sort of distinction. We count up for all of those if you said yes or no. And if you say yes, it means that for these particular concepts that we checked for, you made a grammatical distinction more often.

29:00You as a language. And if you answered no, then for these concepts, there's more ambiguity. So, we call that informativity. I found out later after we published this article that there is, the term informativity also means something else. But that is what it means here. Okay. Well, Greek was the highest in informativity, followed by Spanish. English, again, was the lowest. All right. So, now here's the question. Which kinds of languages would do better?

29:31Would they do better if they were big languages with lots of online resources and lots of data to pull from? Or would they do better if they were simpler languages, less complex in terms of fusion and informativity? Which would do the WUG test best? Simple. Okay. I'm going with simple. You're going with simple. High fusion.

29:56No, it should be, it should do better if they're high. Mm-hmm. The problem is also that these scores are primarily targeting what's called bound morphology. So, like the WUG, WUGS, like when it's stuck to the word. There are languages where you might mark plural with a separate word. That's separated by a space. And that should be easier maybe for the large language model to tokenize because that's a separate token.

30:26It doesn't need to look within. But these scores don't grab onto that. So, they don't know as, well, for some of them it knows, but not for some. Okay. Well, here's the result. The winner was the languages with more data had better results on the WUG test. Ah, boom. Even if they were more complex. Yep. So, this is the bitter lesson. If I use the term bitter lesson, have we heard that one? I'm not sure. Very sweet. It's from an essay from 2019.

30:58The bitter lesson means we've tried knowledge-based approaches. We've tried using human knowledge on computer problems. And it turns out that what we should have been doing was just giving it tons of data, tons of statistics, and tons of compute. Works better every time. To the chagrin of linguists. I also just want to say, large language models are just about there on the strawberry problem and other ones. People who develop these things have been hammering away at problems like the strawberry problem on the internals of words problems. They don't seem to be tokenizing on characters, which is what you'd expect them to do.

31:31Instead, the models seem to be doing a kind of reasoning approach where they're able to take the words, break them down into an array, and then do stuff to the arrays. So, for example, I tested this on Claude and ChatGPT. What word do I get if I take every third letter in the word rebelliousness? Letter three, letter... And so what it's able to do, it says, let's take every third letter counting from the start. It puts the word rebelliousness, and then it says one, R, two, E, three, B.

32:01And once it's got that in a row, then it's able to take letters three, six, nine, and twelve, and you get the word blue. See what it's doing? It doesn't have access to the internals of words, but it's able to do an operation where it takes the word apart, and then it's able to manipulate that. Yeah. So that's what's happening. Interesting. As soon as we talk about the inner workings of large language models, I get lost. It is very high-order thinking stuff. And I'm, like, a pretty smart guy, I think.
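The letter-by-letter operation described above is easy to reproduce by hand. The following Python sketch only illustrates the steps as narrated (spell the word out with numbered positions, then pick every third one); it makes no claim about how any particular model does this internally.

```python
word = "rebelliousness"

# Step 1: spell the word out as numbered letters (1 r, 2 e, 3 b, ...).
numbered = list(enumerate(word, start=1))

# Step 2: keep the letters whose position is a multiple of three.
every_third = "".join(letter for position, letter in numbered if position % 3 == 0)

print(every_third)  # blue
```

The word has 14 letters, so the positions kept are 3, 6, 9, and 12.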

32:33Like, I know a lot of stuff about a lot of stuff. But as soon as people start talking about, like, how the actual large language model, like, learns, and how the tokens are constructed and broken down and all that kind of stuff, I'm just kind of like, the?

32:46How about you and I try to explain it like someone's five? No, no. But I think one thing that is confusing is that when large language models first came out, they were mainly just large language models. Now we're seeing agents that have other tools incorporated as well. So, for example, it used to be that when you asked ChatGPT, like, what's three plus four? It would just look at, like, what do people usually answer when you say three plus four? And it would answer that. Now there's a little bit more of an approach of, like, let's detect that the user is asking a math question

33:20and then add on a calculator that it can use. So this is a little bit more similar to what Daniel described here. So it's like, oh, I've detected that you want to look inside of this word. I'm going to pull on an extra tool I have that helps me look inside this word. So it's like you're giving more tools to the large language robot. So it's no longer just doing the vectorizing deep learning thing. It's doing, it's got a, yeah, a calculator and a magnifying glass.
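The "detect a math question, then hand it to a calculator" idea can be sketched as a toy router. Everything here is a simplifying assumption made for illustration: the names `NUMBER_WORDS`, `OPERATIONS`, `to_number`, and `answer` are hypothetical, and the detection is a hand-written pattern match. Real assistants generally let the model itself decide when to call a tool, so this shows only the general shape of the idea, not how any particular product works.

```python
import re

# Hypothetical, deliberately tiny vocabulary for the sketch.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
OPERATIONS = {
    "plus": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
    "times": lambda a, b: a * b,
}


def to_number(token: str) -> int:
    """Turn 'three' or '3' into an integer."""
    return NUMBER_WORDS[token] if token in NUMBER_WORDS else int(token)


def answer(question: str) -> str:
    """Route arithmetic questions to the calculator 'tool'; otherwise fall back."""
    words = "|".join(NUMBER_WORDS)
    ops = "|".join(OPERATIONS)
    pattern = rf"\b(\d+|{words})\s+({ops})\s+(\d+|{words})\b"
    match = re.search(pattern, question.lower())
    if match:
        left, op, right = match.groups()
        # The tool computes the result instead of guessing the likeliest text.
        return str(OPERATIONS[op](to_number(left), to_number(right)))
    return "(no tool matched: fall back to plain text generation)"


print(answer("What's three plus four?"))  # 7
```

The point of the design is that the arithmetic is computed rather than predicted, which is the difference between the "calculator" and the plain language model in the description above.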

33:52Tell you what, let's just do one last one. Oh, by the way, who sent me this one? It was me. Thank you, me.

34:03Last one. If you're trying to hear sound in a noisy environment, should you close your eyes or should you keep them open? I'm a shockingly bad person to ask this question to. Or maybe I'm the exact right person to ask this question to. My hearing, my capacity to differentiate sounds from background noise is atrociously bad. Like it has been the scourge of every significant partner I have had in my life.

34:34If they walk out of the room and a little bit of echo is introduced to their voice as they potentially yell from the other room, I'm done. I'm gone. I got, I got nothing. Like it goes from like, hey Ben, like what, uh, what did you want to have for this?

34:52That is what I hear. Um, and if yeah, like parties or the, the one that kills me is like restaurants with big tables. So I'm like far away from someone, I'm cooked. I'm cooked. Same. I also struggle a little bit with it. It's also good to just remind us that like the human brain and ear is very advanced. Because if you look at like the incoming acoustic signals that you're receiving, especially in an environment like that, it, it, for example, it's pretty hard for computer.

35:23It used to be pretty hard for computers to do. And I think it still is in many circumstances. It's really hard to distinguish noise from signal and your ear is like evolutionary engineered, if you can call it that, to, to do this for human voices. Which brings me to my question, Daniel, you asked if it's better to close your eyes or not. Is it human speech? And is there a human I can see? There is no human you can see. It is a recording and it's not human speech. The sounds were canoe paddling, drumming, lark chirping, train and keyboard noise.

36:00Masked by 70 decibels of pink noise. What is pink noise? Ben, do you know about white noise? I do. It's when all, that's when all frequencies are randomly, evenly able to occur. So high and low pitch sounds. So what acoustic phoneticians like to do is other kinds. So where like there's brown, pink and other ones, which is just like different biases on which frequencies are able to occur.

36:31So I think pink is more high pitched. Is that true? And brown is more low pitched. Okay, cool. Well, it's been shown that closing your eyes enhances auditory sensitivity under fairly clear conditions. But they wanted to find out what happened when the target sound was masked by noise. This is work from Dr. K. Ni of Shanghai Jiao Tong University and a team published in the Journal of the Acoustical Society of America. All right. So we know what they were looking for. But some people looked at a video. They looked at a photo or they closed their eyes.

37:03What the team found was, quote, taking blank visual stimulation as the baseline, closing your eyes, elevated detection thresholds. Meaning you needed more sound to hear the sound when you closed your eyes. Oh, wait. Okay. The opposite of what I would have assumed. Elevated the threshold. That's interesting. By contrast, they say watching a video or a photo lowered thresholds by a few decibels, contrary to the belief. Hang on a second. Are you watching a video or a photo of the sound you're looking for?

37:37Not necessarily. So if you're looking for keyboard sound in pink noise, the video you're watching is not going to be of a person typing on a keyboard? Correct. Okay. That's fucking wild. Why would that possibly happen that way? Also, why is, what is the, because normally when it's human speech, we look at a human's face to see their mouth movements and stuff to try and infer what's happening. So that if we don't hear everything, we can be like, oh, well, it looked like Daniel was making an M, so I'm going to assume he made an M.

38:09But canoes don't work like that. They don't make M sounds. And especially if I'm looking at a video that isn't a canoe and I'm hearing canoe sounds, maybe seeing, yeah, that's crazy. I would have thought for sure eyes closed would have done better. It's the opposite of the turn the radio down when you look for the house number phenomenon. So they say that closing the eyes puts you in a state known as neural criticality, which increases how strongly it filters incoming information.

38:41In other words, disengagement is disengagement. Oh, closing your eyes turns down your ears? Closing your eyes turns down your ears. Wow. I would not have thought that. Turns up the filters. Wow. With this kind of thing. It's an unusual result. And that's why this really caught my attention. Do you know what I would love to test now? My "yes, and" to this study is, is there a difference between having your visual stimuli removed via being in an entirely dark space versus having your eyes closed?

39:21Is there something to do with actually the act of closing the eyes that is turning the ears down? And if you're in like a dark cave or whatever, but your eyes are open, do you still have your ears turned up? That'd be cool. I want to know that. Further work by Ben Ainslie. Nice. That's that one's free. You, you, you canny linguists. You can just take that one. That's that's a gift from Ben Ainslie straight to you. And that's the news. And now it's related or not. This time our theme comes from Gordon.

39:53Hey, get ready. It's time to start the game. Which of these words come from words which were the same? Are they related or not? The end of the game is in the name. Are they related or not? Play it at home or on a train. Are they related or not? Say it out loud or in your brain. Are they related or not? Thank you, Gordon. Very cute. Charming. Charming. Charming.

40:23Okay. This one comes from Tigertronia and Pharaohcat on the Discord, but it's been raised in times past by Ariaflame and Dermohurl earlier, so we're finally doing it. This one's about gorge. Gorge. On food. That's one meaning. What else? It's a low frequency word, but I know it. Awesome. To gorge oneself and also a cleft in the earth. Mm-hmm. Oh, yeah.

40:53It's also your neck, like gorge de pigeon. Gorge. I've never heard in English gorge used as neck or throat. Do you know when you want to kill someone, you might use a garrotte? Garrotte, yes. Not related, surely. I was just in my head. I was going, I was obviously really in my head, and I went, what? You saw me in real time figure out that I didn't know what I was talking about. How about this? I felt something rise in my gorge.

41:24Oh. That sounds like it could be your groin. Yeah, no. I've never heard that before. That is the lowest frequency of the three, for sure. Well, we're going to step through this pair-wise like we have before. Let's see at what point we get off the bus. Okay. So, the first one, gorge as in to engorge yourself and engorge. Gorge and engorge. Related or not?

41:50Yes, related. I'm still on the bus, yeah. Yeah, I'm definitely related on that one because it's the same word with a prefix attached to it. And pretty obviously means inside of one's gorge, right? We all said yes, and the answer is yes, obviously. This one is pretty straightforward. Gorge comes from Old French gorge, which probably, which is the same word but with a French accent. I picked that up. Oh, good. I'm glad that my French is so awesome.

42:21It's probably related to Latin gurgulio, which goes back to a Proto-Indo-European root, gora, and then you just say it twice, gong, gong, and it sounds like somebody's swallowing a whole bunch. Wow. So, there we go. It's an onomatopoeia. It's an onomatopoeia, as so much of language is. Next, let's add to that, not just gorging yourself and to engorge, but gorge your neck. This is your gorge right here. I'm going to have to go related on that, even though I do not believe that gorge actually means neck

42:53and that you've just made it up in this very moment for this game. But, yeah, to ingest, to put in the throat, that strikes me as very related. Me too. Yeah. I'm on the bus. We're still traveling because these are related. You put it in your neck. Actually, this sense of gorge, meaning the outside of your neck, is the older sense, and it was only the outside of your neck, not the inside. That's why you engorge, meaning you put it in. Oh, okay. Gotcha, gotcha, gotcha. I guess throat handles that.

43:23Throat's kind of a funny word because it means the front of your neck and inside, like I have a sore throat, but it doesn't mean the back. Yeah. What a weird word. Yeah. And neck in English mainly means the back, question mark. I think so, but it could be the whole thing. Yeah, I'm going to go no on that. I think neck refers to the entire, like neck is the equivalent of torso, but for this portion of the body. I would also say that torso is mainly the front. No. Oh, I feel that.

43:54The other thing is the back. I feel that. Dang. And if I say I have pain in my neck, I don't have pain here, do I? Yeah, but here's the thing. If we say we have pain in our neck or you wouldn't, I don't know. The front is your throat. No, no, I agree. I agree that throat is the front. I think that neck and torso are collective groupings of things and then back or throat are more specific areas upon those collective groupings.

44:26Interesting. You know, it's kind of like the way the word day could mean like all of a day, 24 hours, or it could just mean the bright part. Yeah, true, true, true, true. Or if you got shot in the neck with a crossbow, it wouldn't really matter what part of the neck that crossbow was now sticking out of. Dang, that's true. We'd be like, that guy's been shot in the neck. No, I agree. I just think that it's like, it's like, it's the whole thing, but with a flavor of mainly the back. Okay, fair enough. Okay, we sorted out neck and throat, but let's get on to the next meaning of gorge, which is a narrow valley.

45:02Also related to all of those? I'm going to go yes as well. It's the throat of like the river or the canyon or whatever, right? I also said yes, I can see the similarity. Simple metaphor. We went to a really nice gorge on Crete when we went on our honeymoon. Not the one that everyone goes to, a smaller one. And gorges are beautiful. We're not like the other honeymooners. I didn't know the other one. It was just like a four-hour drive or something and then like a six-hour walk. And then like it was, I'm not doing that much hiking.

45:36Anyway, it was really cool. And there was like a very old cave in there with like a really like old temple that was like 2,000 years plus. It's very cool. Was it a grotto? Like gorge. Ooh. It might have been a grotto. Because if it was a grotto, then you might have seen something that was grotesque. Ah, that's fun. I did not know that connection. But now it's time to get back to gorge. The answer was, yeah, a narrow valley is related to the other senses of gorge.

46:08Now it might be time to get off the bus. Okay. Gorgeous. Ooh. If something's gorgeous, is that related to gorge?

46:18I'm going to go no. Maybe somebody saw a pigeon's neck and thought it looked great.

46:24No, I think we are all still on the bus. And I think this has to do with the sense that like a woman's like neck and decolletage is like very sexualized and very like hot. Yeah. So I think that a woman who has like a very pretty neck and maybe front area here is gorgeous. I thought so too, but I went weak positive on this one. So Ben's a no. Daniel, you know this game. You're on the bus or you're off the bus.

46:56You got yes. You're going to be doing like a school camp, like hanging out the window kind of nonsense. I'm eyeballing the exits. I think, is this bus, have I missed my stop? What? Okay. Answer. Related. Oh. So the OED says it comes from Old French gorgias, elegant or finely dressed, of uncertain origin. But then Etymonline steps in and says, perhaps a special use of gorgias, necklace. And in fact, Oxford admits that it might be like there was a special kind of necklace.

47:30Sorry. I thought you meant without a neck. I was like, that doesn't sound very gorgeous at all. They say necklace here in Australia, don't they? Well. Hey there, neckless. When you said necklace, I assumed you meant neckless, the same way as backless, that like the dress exposes the skin. A neckless dress. Sorry about that. Well, how do I disambiguate that? I'll say it not Americanly. Necklace. There we go. A necklace that was really nice looking and very gorgeous. So all related. There we go. Thanks to Tigertronia, Pharaohcat, Ariaflame, and Dermohurl.

48:05Hope you didn't actually hurl because that would be a disgorgement. All right. Next one. This one's from James. But Pharaohcat wondered this one as well many years ago, I found, on our Discord.

48:16Any. As in, do you have any money? Mm-hmm. And many. She has many money. Oh, that's fun. I like this. She has many things. Any and many. These go back a ways, don't they? Oh, these must be old words, right? Super friggin' old words. Okay. I like this.

48:38I'm going to let Hedwig take the lead. Or you, Daniel. I said, sure. Why not? Again, I'm lukewarm on this one. I say, sure, as well. It reminds me of, like, willy-nilly, which is, there used to be an old negative prefix. So will and nil. Nil is to not want. Which means, does he want or does he not want willy-nilly? Yeah, it's someone who's, like, fickle. Okay. And, I mean, it shares nothing with any and many except that it's a prefix. Um, but, uh, well, let's think of when we used it.

49:12Like, I want any, uh, cat and I want many cats. That sounds, that sounds like they belong in a sort of paradigm, doesn't it? Yeah, yeah, yeah. Well, they're both determiners, aren't they?

49:25Yes. That was the, that was the lukewarmest yes so far. It's just, I know that there's a lot of linguists who listen to this as well. Determiners, like, a bit of a funny class of words. And sometimes I get lost in my own definitions and I start worrying. This, that, these, those, any, many. Any is a funny word because it's only used for negatives and questions. Like, do you have any money or I don't have any money. But you can't say, I have any money.

49:57Can't do it. Sounds good. Just, any is a function in, uh, the programming language I like to use, R. So you can say, like, if any of these is true, then do blah, blah. Mm. Ah. That's a fun one. That's a good way to do booleans. That's cool. Yep. All right. So, um. So, I haven't answered. I'm going to go. I say yes. I'm going to go no to be a contrarian primarily to put a little dramatic tension to the game. But also because, because they're really old words, I could see, and they're short words,

50:31I could see that maybe things have gotten rounded off and ended up close together, but started off not close together. So, let's see. Ben wins. Any and many, not related, coming from totally different Proto-Indo-European roots. Any is from oi-no, one or unique. Many, perhaps from menegh, meaning copious, but Etymonline points out the pronunciation of many was altered by influence of any. Yeah. Ah, okay.

51:01So, would we call that, like, a cheat related, maybe? Yeah. Resemblance based on squishing. It's not. Well, look, let this be a lesson to all of our listeners. Play the devil's advocate. Be a contrarian. It works out well. You know, then I wondered, what did many look like before, before any came around? And it was probably something, it was, oh, go ahead, go ahead. Is there a velar nasal in it? There is a velar.

51:32It's gah. I'm being cheating, I'm using Swedish, so it's monga. Monga. Well, it was something like maneg with a G. Sometimes it had an O, like money, like the money that you spend. That's how it looked. But by Middle English, 1500s, 1600s, we start seeing many. So, there you have it. It grows together. Thanks, James and Pharaohcat. Finally, from Diego, with support from Hedwig. Diego says, this is on our Discord. I'm wondering if guava.

52:04Oh, yeah. Guarani and Guarana might be related, with the Guarani languages being the possible common factor. I know Guarani and Guarana are definitely from South America, but I'm not sure about guava. And then, Hedwig, you added guacamole. This is, Ben, I don't know if you know about this, but like, I'm trying to get more items related or not, and I don't want to look them up. So, what I do is I prod my friends when we go for Friday beers, and then I drunkenly write

52:34a bunch of things in our Discord, and then I forget about it. And you wake up the next day and go, oh. Yeah. What did I post? So, I think, so we had Guarani, Guava, and... Could you guys explain what the things other than Guava are? Oh, there's an ethnic group who have a language. That's part of it. That's Guarani. Guarani. Yes. Is that where Guyana gets its name? Different again. Yeah. I think that's Guinea, the same as in Guinea.

53:07Oh, yes, of course, like the many Guineas, yeah, yeah, yeah, yeah. Yeah, I think that's that one.

53:13And then there's a sort of berry thing that you can make that's related to coffee. It's a fruit with beans on it. Some people are using it as a coffee substitute, and there's also a famous soft drink made from not the beans, but the rest of it. Okay. Yeah. And, of course, we know guacamole because it's tasty and wonderful and slippery. Oh, yeah, I think I added guacamole to just be like... Yep, you did. Now, we're not wondering if they come from the same language. Simply tying back to the same language won't be enough, because they probably do.

53:45But the same word? Hmm.

53:49Yeah. So, we need to ask ourselves where the guava comes from, which is actually surprisingly hard, I have discovered in my... Really? Well, as in, to intuit where food comes from is actually quite hard. Obviously, I could just fucking Google it, and it's really easy. But the industrialization of food has made inferring or guessing where stuff has come from extraordinarily difficult, I think.

54:20Because, okay, sure, for like really big famous things like potatoes and tomatoes and the nightshade family coming from the new world in the Columbian exchange, fine. But guavas, A, are not super readily available as a foodstuff. Like, you generally need to go to pretty niche supermarkets to get yourself some fresh guava when they're in season and that sort of thing. Or you can only get like guava paste from like Indian supermarkets and that sort of stuff.

54:50I'm going to go unrelated based on the fact that guava does not come from the place where the ethnic groups come from. I'm going to guess that guava is not from South America, and it is in fact from the South Asia region. Okay. So you're putting guava on its own. I am. The rest of them related, do you think? I think we're going to need a list here. Maybe the berry and the ethnic group are related.

55:22Okay. Ben says that the guarani and the guarana. Oh, and by the way, I think guacamole is unrelated to all of them. Okay. Not connected. You've picked out that pair. I have a theory about the guacamole, the guava, and the berry, which is, now I'm getting stuck in my head. Guarana. Guarana. What if gua is like fruit? Okay.

55:52Or to pound, maybe?

55:56Do you pound all of them? Yeah, yeah, you can.

56:01Interesting. So if it's like round thing or fruit or to pound, and that's in common, and then like the ending is like, oh, it's this one. I know that guacamole is so named because mole is the thing you make guacamole in. I'm pretty sure. No, mole is the sauce, isn't it? Mole is like a kind of... Oh, actually, no, you're right, it's like a thin... No, a molcajete is what you make the guacamole in, and the mole is, you're right, it's like things with like a thin, runny consistency, generally speaking, or like...

56:33Yeah, so I got guacamole, so... So if gua is like round thing or fruit or pound, and then ka, we don't know, and mole is like sauce, then guacamole is that. Okay, okay, okay. I don't know, but I'm having fun, and I think I'm going to go with my theory of gua in those fruit things being like round thing, fruit, egg, pound, and then we have the leftovers

57:05are guarani, the people and language, right? Mm-hmm. Which I'm going to guess is unrelated. I don't see... The odd one out. Unless it's like the people who eat those things.
