Rise of the Music Machines

May 16, 2023 · 50 min · 6,179 words


Show notes

On this show we explore three different AI and machine-generated music technologies: vocal emulators that allow you to deepfake a singer or rapper’s voice, AI-generated compositions and text-to-music generators like Google's MusicLM and OpenAI’s Jukebox, and musical improvisation technologies. We listen to the variety of music these technologies generate, and two guitarists face off against an AI in improvised guitar solos.

Highlighted moments

It wasn't thinking of the solo as having any kind of development. It was thinking of the solo as just a space in which you could just put notes.

46:05

Transcript

Introduction

0:00Last year, 10 musicians got a call from Google to help them do one of the most mind-numbing jobs you could imagine. The company had five and a half thousand clips of music from YouTube. Music that sounded like this. Music that sounded like this and this and this and the job was to listen to every single

0:43one of these clips and describe them in words.

Music Description Example

0:47Like take this example. Music that sounded like this. The description one of these musicians came up with was, this is a remix of an R&B soul piece. There's a male vocal singing in a laid-back manner, joined by an auto-tuned male vocal. The atmosphere of the piece is groovy, and there's a feel-good aura to it. This piece could be used in the soundtrack of a sitcom. Music that sounded like this and this and this and this and this and this and this and

1:21Well, they finished the job.

Project Outcome

1:23And I crunched the numbers. These 10 people listened to 38 straight days worth of music. 92 hours each. Typing out their descriptions one by one before moving on to the next clip. They used a total of 370,000 words to describe all of the clips. Google ran the music and the words through a deep learning model, with a goal of figuring out what words correlate with what musical sounds.
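Editor's note: the numbers crunched above can be re-derived with a few lines of arithmetic. This is only a rough sketch using the figures quoted in the episode (10 listeners, 38 days of audio, five and a half thousand clips, 370,000 words); the variable names are illustrative and the per-clip average is a derived estimate, not something stated in the show.

```python
# Figures as quoted in the episode.
listeners = 10
total_days_of_music = 38
clips = 5_500
total_words = 370_000

# 38 straight days of audio, expressed in hours.
total_hours = total_days_of_music * 24

# Split evenly across the ten musicians.
hours_each = total_hours / listeners

# Average description length per clip (derived, not quoted in the show).
words_per_clip = total_words / clips

print(f"{total_hours} hours in total, about {hours_each:.1f} hours per listener")
print(f"about {words_per_clip:.0f} words per clip description")
```

Under these assumptions the split comes to a little over 91 hours per person, which is in line with the "92 hours each" figure once rounding is taken into account.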

1:57The outcome, if the project went well, would be the ability for any of us to say anything to Google, and it would generate brand new music based on our instructions.

MusicLM Prototype

2:08And at the end of January in 2023, it released a prototype, MusicLM, a language-to-music generator. Okay, Google, use AI to generate techno-accordion music. Okay, using AI to generate techno music with an accordion.

2:33Okay, Google, play me music from a club in the 1950s. Okay, using AI to generate music from a club in the 1950s.

2:48Okay, that was really bizarre.

2:52I don't know what to say about that one. I'm not sure what that was supposed to be.

Philosopher's Perspective

2:56This is Robin James, a philosopher of music. Yeah, I feel like it was trying to combine, like, Jerry Lee Lewis and Frank Sinatra or something. Right. Okay. Okay, Google, now play music from a club in the 60s.

3:18Maybe there's a little, like, surf guitar. Maybe there's a little, like, Serge Gainsbourg. Mod? Yeah. I mean, it's AI, you've got to remember, like, no human being did this, right? Well, yeah, it just shows that, like, it's being trained on, you know, these sort of broad data sets.

3:38From Slate, this is Hi-Fi Nation, philosophy in story form. Recording from Princeton University, here's Barry Lam. I actually like it. There's nothing particularly aesthetically interesting to me about AI-generated music that's a perfect copy of human-made music. Because why would I listen to that? I'll just listen to the human stuff. But glitchy, weird, robotic stuff that doesn't sound like anything humans would make? That's fun to hear and to think about.

4:10That was accordion rap, according to Google Music LM. I wish AI music researchers would focus on that more. What you might call distinctively robot glitch aesthetics.

AI Music Projects

4:31Well, today's show is about machine-generated music. There isn't just one thing that is AI music. There are hundreds of different projects going on simultaneously that are trying to use machines to analyze, recombine, and organize sounds into something we recognize as musical.

4:52I'm nowhere near an AI optimist about most AI technologies today. But with music, I think it has the promise to unlock a lot of musical potential from humans. So today we're going to look at three different kinds of AI music projects. And I'm going to do something rare for Hi-Fi Nation. I'm actually going to give you my take about each project. What they mean for the future of musical creativity. So they got me in the likes and blasted the whip.

5:23Thought they laid me to rest, but the bag wasn't it. This is the story of what really went down in 96. If the music industrial complex doesn't sue it out of existence first.

Vocal Emulation Technology

5:44This is That Night in Vegas, featuring a deepfake AI Tupac Shakur. It was written and produced by an online artist who I only know as Hip Hop Intelligence on YouTube. The lyrics, the sound production, even the rapping and flow are all his. But the voice is Tupac Shakur's. The piece is an imagination of Tupac rapping the track about the night of his own death.

6:20Made real by an AI vocal emulator technology.

6:25Vocal emulators train and mimic the timbre of an individual's voice. So when you, the artist or producer, sing or rap into a microphone into your digital recording software, the timbre of your voice gets replaced by that of the deepfake. Hip Hop Intelligence is in all respects using his own voice to play Tupac Shakur's voice like an instrument,

6:59in the same way any musician can play any other sounds as an instrument. It's a mistake to think of the resulting track as an AI-generated track. It's a human-produced track using a vocal emulator, which is one kind of AI technology. Here's another piece produced by Hip Hop Intelligence, playing Eminem's voice in the same way. Okay. The chainsaw's back. I told you Shady's alive. I was gonna stay away, but the game's all whack.

7:29I've been watching from the side, and it ain't all that. Now I'll have to put the spikes on this baseball bat. And here's a track produced by someone named Yeezy Beaver. It's Kanye West's voice, singing the Plain White T's hit, Hey There Delilah. Hey there Delilah, what's it like in New York City? I'm a thousand miles away, because tonight you look so pretty. Yes, you do.

7:55Times Square can't shine if I see you. These kinds of novelty productions currently operate outside of the law. No legislation prohibiting them has been issued, and people aren't really making money from the productions yet, so no artist or record label has sued. But when the lawsuit eventually comes, there's going to need to be a kind of metaphysical argument to be made about how to classify vocal emulation. Like a lot of AI, vocal emulators occupy an in-between space from what has come before.

8:30If Kanye West's voice were sampled and replayed in a track, that is clearly the use of something owned by Kanye, or whoever owns the sampled recording. On the other hand, if a really good Kanye impersonator, who learned to mimic Kanye's voice based on hours and hours of listening to it, then proceeded to cover Hey There Delilah, well, then it's a cover. Kanye would have no claim to ownership of that voice. It's clearly someone else's voice, no matter how convincing.

9:02This isn't just theoretical. Singer Marc Martel is famously so good an impersonator of Freddie Mercury's voice that they actually used his voice as Freddie's voice in the Queen biopic, Bohemian Rhapsody. This is Marc Martel singing, not Freddie Mercury. Open your eyes, look up to the skies and see, I'm just a fool.

9:35Vocal emulators are exactly in-between a sample and an impressionist. A machine did take samples to learn how to mimic Kanye. But the mimicry itself isn't generated from the samples. It's making new sounds just like any person's voice. It's a machine vocal impressionist. So it's a little bit like sampling, but not. And a little bit like learning to impersonate someone, but not.

10:12That's AI Freddie Mercury, covering Thriller. The producers of these tracks are mostly not trying to fake us out or deceive us. Most of them are being transparent that these are not the real artists. And anyways, it's easy for any live artists to deny it's them. And dead artists, well, we know they can't sing new music. These productions are instead a lot like fan fiction for music.

10:43Nice to meet you, where you been? I could show you incredible things. Magic, banners, heaven, sin. Imaginative producers and musicians are trying to create parallel worlds where Freddie Mercury did cover Thriller, or where Tupac was able to survive his death to rap about it. And how cool is it that it's now possible for you to write a song for Adele and actually hear what it would sound like if she sang it? And these producers are not making money from these tracks

11:15any more than fan fiction authors are. They're doing something quite pure creatively, putting work out there to consume for us to talk about and debate, but not for profit. So, of course, it's not going to last. Because history shows that if anything shuts down musical creativity, it's not new technology. It's a lawsuit. The only way you can make money in the music industry this day... Philosopher of music Robin James... ...is not by composing or being a gigging musician.

11:48It's by owning copyright to song catalogs. That's why you see Justin Bieber just sold his catalog for, like, $2 million. Taylor Swift sold her catalog. Bob Dylan sold his catalog. And they're mostly going to this firm called Hipgnosis. Hipgnosis is basically making money buying up these song catalogs and licensing the copyright. The only way to make money in the music industry these days is by owning assets that appreciate. These asset owners see lawsuits as another way

12:20to make money from the IP that they own. In this context, I think we have a pretty good sense of where AI vocal emulation technology is headed. The timbre of a person's voice is probably going to be a kind of digital asset. Something that can be owned, licensed, purchased, and a source of revenue. At best, it'll go to the artists whose voices are being mimicked. But in all likelihood, it'll end up being owned by corporations who have far more money to acquire a portfolio of celebrity voices

12:54to license out for passive income. Right now, it's so early in the era of AI voice emulation that everyone's voice is still open source. So the most creativity is going to be happening now.

13:11You're on the phone. Your girlfriend, she's upset. She's going off about something that you said. She doesn't... My take is, in a few years, vocal emulator technology is going to be professionalized, industrialized, and monetized. And the beneficiaries are not going to be amateur producers. So enjoy your AI Kanye covering Taylor Swift for now. That's AI music technology, number one.

Fully Generative AI Music

13:57Hi-Fi Nation will return after these messages. Now let's talk about AI music technology, number two. Fully generative AI music. If you wanted to treat music as a series of notes on a page,

14:29then deep learning solved the problem of making music sound like human compositions long ago. This is Deep Bach from Sony CSL, one of Sony's sound research labs in Paris, and it's from 2016. Deep Bach is one of many deep learning models that has been able to generate Bach-sounding compositions for years.

14:58And here's AI Mozart, a completely AI-generated composition based on Mozart's piano sonatas.

15:14To a large extent, AI is revealing to us how most human music-making actually does work. This is Theodore Gracyk, philosopher of music at the University of Minnesota, Moorhead. There's a fairly low level of what we might call creativity in it. Gracyk thinks, like many people do, that music composers like Mozart, Bach, and the songwriters of today all seem to follow a similar career path. You start your career by stumbling upon some new way of writing music.

15:46And then? After a certain point, he's just copying himself. Mozart is just writing by rote. He just works on autopilot. Nothing original happens. His writing creativity has stopped. He's just self-plagiarizing. This might sound like a grumpy complaint from any critic over the ages, about anyone famous from Bach to Paul McCartney. But Gracyk thinks, in the age of AI, this kind of complaint goes from subjective value judgment to actually scientifically supported.

16:19How else could you explain, for example, how Mozart or Bach compositions can be so easy for a machine model to generate? In the minds of Bach and Mozart, it must be that they came across a pattern or template and then just wrote notes around that pattern and template. This is just how human creativity tends to work? It's exactly what the AI does when it studies Bach and Mozart.

16:50This is OpenAI's MuseNet from 2019. This is also a notes-generated piece. This time it's generating three overlapping sets of notes into a jazz arrangement, piano, drums, and bass. This is the next level of complexity, generating notes not just for one instrument, but for many. The reason that the patterns from notes on a page are so easily learned in an AI model is because the information is actually quite simple,

17:22even simpler than natural language, which has a vocabulary of hundreds of thousands. In music, you have 12 notes in at most seven octaves and rhythmic patterns between 80 and 140 beats per minute. The difference between jazz piano notes and bluegrass piano notes appears very quickly, even to human eyes. Generating compositions, though, was not enough of a technical challenge for AI researchers.
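Editor's note: to make the size comparison concrete, here is a minimal sketch that enumerates the symbolic "vocabulary" described above, taking the figures of 12 notes, seven octaves, and tempos from 80 to 140 beats per minute at face value. Which seven octaves are meant is not specified in the episode, so the range 1 to 7 below is an assumption, as are the variable names.

```python
# The twelve pitch classes of Western music.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Seven octaves; the choice of octaves 1 through 7 is an assumption.
OCTAVES = range(1, 8)

# Every distinct pitch a notes-on-a-page model would need to represent.
pitch_vocabulary = [f"{name}{octave}" for octave in OCTAVES for name in NOTE_NAMES]

# Whole-number tempos in the range mentioned in the episode.
tempo_range = range(80, 141)

print(len(pitch_vocabulary), "distinct pitches")
print(len(tempo_range), "whole-number tempos")
```

Even allowing for note durations and dynamics, a symbol set of this size is tiny next to a natural-language vocabulary of hundreds of thousands of words, which is the point being made.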

17:55Getting notes on a page isn't generating music as sound. It's generating music as abstract information, something we already knew computers were good at doing. What you're hearing are sounds played by a human or by computer-generated MIDI. Getting everything generated by an AI, composition and sound, is what they were trying to do at Google Music LM. Okay, club in the 70s. Okay, this will be super interesting.

18:30That one's better. Yeah, yeah. As a piece of music. Yeah, it's more legible as something funk disco. But again, in the 70s, that's going to be certain clubs, certain places, right? You know, it's not CBGBs, which was also a club in the 70s. Okay, let's do 80s. So clearly trying to reference new wave. Yeah, there's a little Blondie I hear in there.

19:01Yeah, yeah. Google Play Music from a club in the 90s.

19:10That's House. Yeah, sort of. Yeah. Isn't it? But again, it's sort of definitely privileging what's going on, I would say primarily in Europe at that point, because in the 90s, electronic dance music was, I mean, you'd find it in Chicago and Detroit, but, you know, they talk about the last decade as the year that electronic dance music broke the U.S. So, yeah, again, it's sort of a specific kind of cultural representation of what that era was. Okay, let's try the 2000s.

19:46That just made me laugh because it sounds like indie sleaze, which is what they're calling the sort of 2000s retro trend right now. And it's a perfect encapsulation of what people today think music in that era sounded like. But again, that would be primarily, like, white people. So the indie sleaze trend would be, like, nostalgia for LCD Soundsystem, right? Think about it, too. In the 2000s, this is the era of bling rap. So if you went to a hip-hop club, that doesn't sound anything at all like that.

20:16Do you want to try grunge? Sure. Yeah, I don't know. I mean, like, people argue about this to this day. Like, people say, oh, no, you know, Smashing Pumpkins, that ain't grunge. That was a little bit more metal than I would have placed in. But what do you think? I think it's at least as good as Nickelback. No.

20:43No, but, yeah, it sounds a lot like what you'd see on alt-rock radio, which is measured by a separate chart than hot alternative songs. So it's definitely in the universe. I don't think the goal of any of this completely AI-generated sound stuff is to create a new pop hit or an alt-rock hit or R&B hit or anything of the kind. If there's a market for it, my guess would be it'd be an ambient music, like moods.

21:15Let's listen to some of the mood genres. There's a couple. Let's start with Chill Out.

21:32All right, let's do Down Tempo. Yeah, I think that's going to be a sound that will be really popular, right? I bet you'll get a bunch of like AI YouTube channels where you can just listen to, you know, like the lo-fi steady beats to study relax to. So I think that's infinitely monetizable, right? Because for Chill Music, the point is people aren't really listening to it, right?

22:04They're just distracted.

22:07And there's a lot of content being generated by amateurs that requires music. You know, all the people who are doing their little exercise regimes on YouTube, you know, like to try to get attention, like they need exercise music. All the people doing podcasts need like soundtracking music. All the people doing YouTube channels. And I don't know at that scale how all these people can afford an actual musician to compose. Yeah. Right. You're pointing out that there's a market for non-copyrighted music from people who would

22:40never pay either for a bespoke commission for their amateur podcast or exercise video or whatever. This part of the market is not necessarily bad, right? And this is sort of the narrative of tech. In de-skilling certain things, like music composition, you make it more available to DIY creators. That's, I think, not inherently bad. This is the track that Google LM generates when you ask it to create dream pop.

23:20One take about AI-generated music that's actually listenable is that it's a populist art form. Theodore Gracyk. Most people conjoin music with something else, some activity, whether that's taking a run, working in the gym, working in their office, doing their homework, whatever. Taking some human activity and making it special or memorable seems to be the real function of music. And this is what the artificial intelligence bots are great for.

23:53They're actually creating music that is non-disturbing, so it functions really well as background enhancement to something else.

24:11This is Winds of Winter, a completely AI-generated piano piece by AIVA. The goal is for people to buy it as a piece of soundtracking, not a piece for music lovers to appreciate in a concert hall. The process of actually sitting in a concert hall, attending to instrumental music, is culturally

weird. The idea of sitting there, listening to music, and attending to it with your full attention, this is a historical anomaly created basically in the 19th century. It's not how the average person relates to music. Theodore Gracyk thinks that AI-generated music is pretty much most music that humans generate in terms of aesthetics, in terms of creativity. It's not going to be any better or any worse than what the typical human makes, because in the best case, it is making what the typical human makes.

25:14The AI bot here is going to familiarize itself with something like 20 to 50 times what any human being has ever experienced, okay? And then all it's going to do is find the recurring patterns. Well, that's the same thing our mind has been wired to do with basic musical schema, as they're often referred to, blueprints, right? It's making the music more or less the same way that humans make music. Robin James, on the other hand, thinks AI-generated music is interesting as a political and economic

25:49object, but as a technology that's going to take over human musical creativity? No way. AI tech enthusiasts and doomsayers are completely wrong about that. I think they exhibit what musicologists would call the just-the-notes approach to music, which sort of looks at music as like a text, and that's all that matters. And that perspective that AI can somehow become more creative than an individual person, it's centering this idea of the heroic individual and the idea that creation is an individual

26:20activity, and it's not. It's a participatory activity. I make good music by knowing what other good music sounds like and being in conversation with other works. If you think of music as a practice that people do together, AI music is not going to impact that part of the musical experience, especially as long as it's like a privately owned thing, right? That's creating private property for private businesses is never going to be engaged in the collaborative social process of music.

27:01Robin James is a philosopher of music.

27:05You're listening to an entirely generated piece of classical jazz in the style of Ella Fitzgerald from OpenAI's Jukebox.

27:21The notes, the music, the sound of her voice, all of it is created by a machine. The machine didn't even get fed lyrics. You have to try to make them out as you listen to it, and probably most of it is indiscernible. As for me, my take is that some AI-generated music is aesthetically interesting, uniquely

interesting, but only when it strays far from sounding like anything familiar. Because this is where musical breakthroughs happen. When at first things sound like unbearable noise, and then become coherent as something meaningful. The most interesting thing to me about Google Music LM is that it translates words into sound, which means that you can translate sounds into words. And when that's possible, that means you can translate anything into music that you can

28:27also translate into words.

28:32You're listening to the painting The Starry Night by Vincent van Gogh.

28:55That's right. You're listening to the painting. The painting, as described in words on Wikipedia. And then the words fed into Google LM to translate into music. Here it is again.

29:23This is the sound of The Scream by Edvard Munch. And this is Google LM's interpretation of Salvador Dali's The Persistence of Memory. It's the painting with melting clocks.

29:57One clock is melted over the edge of a table, another is draped over a branch, and there's a plate being swarmed with ants. This famous surrealist painting was inspired by Dali hallucinating after eating a rotten piece of camembert cheese. Anyhow, that whole description I just said, and more, was fed into Music LM to generate this sound of the painting. It may not be a radio hit, but the idea these pieces express is fascinating.

30:41It's the idea that however imperfect language is, as a meaningful description of other mediums of human expression, like visual art and music, it can serve as an ultimate translator. There is an approximate way for a piece of music to look, and there may even be music generated from our description of smells and touches.

31:15Google, what's the sound of a couple embracing? Maybe the right take about AI art generators is not only that we've got a scary new political and economic reality for artists, there's that. But maybe we've also discovered a tool to translate visual art to musical art, or musical art to visual art, or any one kind of art into another kind of art, and maybe that will serve

31:51to unlock new creative projects we haven't seen before. Okay Google, play me into the commercial break with some minimal house music.

32:06Hi-Fi Nation will return after these messages.

Musical Improvisation

32:14The final piece of AI music technology I want to introduce you to is musical improvisation.

32:36This is a machine-generated 12-bar blues solo. One out of dozens and dozens of solos generated as part of a research project. I once spoke to a creativity researcher who told me that she thinks improvisation is one of the purest sources of human creativity. She meant all improvisation. Music like jazz or jamming in rock, freestyle rap, comedy, sculpting. Anything that requires people to come up with something on the spot.

33:10If machines can improvise like humans, then maybe it's one step closer to modeling human creativity. But can it? Let's test it out.

33:26This is the backing track that researcher Anand Subramanian used for all of the 12-bar blues solos he generated in the key of C. I'm a professor at Universidade Federal da Paraíba in Brazil. I sent this backing track to two amateur guitarists, and I had them each come up with a solo on the spot to compare to the machine-generated solos that were part of Anand's research project.

33:57But first, what Anand did? So, in our case, we had a database, so we worked with licks. So we can find those licks all over the place. There's a textbook dedicated to blues licks. So our strategy was to break those licks and have one-bar lick and populate our database. And from that database of hundreds and hundreds of one-bar blues licks, Anand and his team tried to generate 12 bars worth of solos. If you just tell the machine to do it at random, you end up with solos that sound like this.

34:32It sounds awful. How do you get the machine to do better? At this point, you can do one of two things. You can tell humans to sit down and listen to all of the possible solos and label which

35:03are bad ones and which are good ones. And then you give all that to the machine, and it tries to figure out through math what the good solos have in common and what the bad solos have in common. That's machine learning, the kind of technology we've been talking about all season. Or you could do what Anand did, which is called mathematical optimization. This is where a person has to come up with a math formula that separates good solos from bad solos based on human knowledge of what sounds good, based on music theory.

35:38For instance, if two notes are in the pentatonic scale, then they make up a good solo. You program all that into the machine, and you have the machine generate solos around the music theory. What is the advantage of doing it this way versus a deep learning way? In Anand's way, he has complete control over the generative algorithm, and he can tell you exactly why two consecutive bars count as part of a good solo, and why two other ones

36:13are bad. We're going to look at a deep learning solo later. And for that technology, it's not even possible to understand how the machine ended up with the patterns it did.
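Editor's note: the contrast between assembling licks at random and scoring them with a hand-written, music-theory rule can be sketched roughly as follows. This is not Anand's actual model. The lick database, the scoring weights, and the use of a simple random search in place of a real mathematical optimization solver are all assumptions made for illustration; only the general idea (one-bar licks, a pentatonic rule, 12 bars) comes from the episode.

```python
import random

# C minor pentatonic pitch classes (C, Eb, F, G, Bb), as semitones above C.
C_MINOR_PENTATONIC = {0, 3, 5, 7, 10}

# Hypothetical one-bar licks, each a list of MIDI note numbers.
LICK_DATABASE = [
    [60, 63, 65, 67],  # C  Eb F  G   (all in the scale)
    [67, 70, 72, 70],  # G  Bb C  Bb  (all in the scale)
    [72, 70, 67, 65],  # C  Bb G  F   (all in the scale)
    [65, 66, 67, 70],  # F  F# G  Bb  (one chromatic passing note)
    [61, 62, 66, 68],  # entirely outside the scale
    [64, 66, 69, 71],  # entirely outside the scale
]


def lick_score(lick):
    """Fraction of the lick's notes that fall in the pentatonic scale."""
    in_scale = sum(1 for note in lick if note % 12 in C_MINOR_PENTATONIC)
    return in_scale / len(lick)


def transition_penalty(previous_lick, next_lick):
    """Penalize large leaps between the end of one bar and the start of the next."""
    return abs(next_lick[0] - previous_lick[-1]) / 12


def solo_score(solo):
    """Hand-written objective: reward in-scale notes, penalize jumpy bar changes."""
    reward = sum(lick_score(lick) for lick in solo)
    penalty = sum(transition_penalty(a, b) for a, b in zip(solo, solo[1:]))
    return reward - penalty


def random_solo(rng, bars=12):
    """The 'do it at random' baseline: pick any lick for each bar."""
    return [rng.choice(LICK_DATABASE) for _ in range(bars)]


def optimized_solo(rng, bars=12, candidates=500):
    """Stand-in for a solver: keep the best-scoring of many random solos."""
    return max((random_solo(rng, bars) for _ in range(candidates)), key=solo_score)


if __name__ == "__main__":
    rng = random.Random(0)
    baseline = random_solo(rng)
    best = optimized_solo(rng)
    print("random solo score:   ", round(solo_score(baseline), 2))
    print("optimized solo score:", round(solo_score(best), 2))
```

Because every term in `solo_score` is written by a person, it is possible to say exactly why one pair of consecutive bars scores well and another scores badly, which is the kind of control described above and which a learned model does not offer.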

36:29All right, let's bring in our guitarists, Fabrizio and Keshav. First, I had Fabrizio listen to a generated guitar solo without telling him whether the solo was generated or played by Keshav. Are you asking me to guess if it was Keshav or the AI?

37:00I'm not asking you to guess that, but you may. I'm asking, as a guitarist and appreciator of music, what do you think of that? And then you can guess if you want. Well, I'm not going to guess, because there's no way of succeeding at that. I thought he had a lot of interesting chromatic passages. It's very even in the way it's phrased. It's like a lot of notes that come at even spacing from each other, and there's not much repetition. Kind of a lot of notes, one after the other. So if I had to guess, I would say that was the AI. I hope that's right. But I hope I'm not insulting Keshav.

37:32Sit with Fabrizio's indecision at this point. Notice how nervous he is in trying not to insult Keshav. I had Keshav listen to a different improvised solo from the machine.

37:54It doesn't sound bad, but there's something oddly stilted about both of them. Like, it's hard to describe it in other terms than, like, the playing lacks a certain kind of humanity.

38:07Okay, so now here's the first of our human improvised solos for you to compare. Same backing track as the machine-generated ones.

38:56That was nice. I like that a lot. I'm guessing that was Fabrizio. It sounded really good. So Fabrizio, talk me through how you hear your piece now in light of the two machine-generated ones that you just heard. When I approached my piece, I was thinking, what are a couple of phrases I want to make sure to play? And where am I going to live in major and where am I going to live in minor? So I mostly did it in C minor, but at one point I threw in a C major scale because I was

39:28like, whatever. And I was thinking in terms of like phrases with spaces in between them. So I wasn't like playing notes all the time. There were moments where the bass was playing, the bass track was playing. Whereas the AI was just continuous. It was filling almost all the space that there was with notes. So you can definitely tell that that's a person playing as opposed to the AI, just because there's a kind of feeling to it. That's especially important in the blues. I mean, I would think of the blues as one of the hardest things for an AI to emulate because there's something

less easy to describe that feeling or that soul behind it that you can just kind of hear. Now let's play Keshav's improvised solo.

40:42I think that's very nice. And again, what's there that is not in the other track is there is this motif. And you can almost like sort of exit the solo with just like this melodic idea in your head, which I couldn't do when I was hearing the AI solo. That wasn't the only improvisation I had Fabrizio and Keshav do. The idea behind improv is to give a

musician or a machine the same backing track and to see what they could come up with on top of the same underlying music. I also found a deep learning generated solo, a heavy metal one. This one was all AI learned. No person gave the AI any hints, any mathematical formulas to generate the solos. It just learned from listening to all manner of heavy metal solos and constructed one on the spot. This is the backing track on which the AI, Fabrizio and Keshav were asked to improvise.

41:51Thanks to Vishwanath Subrahmanyan for that track. First, Keshav's one-take improvisation. I'm kind of just messing around, you know, almost like noodling, you know, and yeah, you know, I heard

42:32the chord progression and I thought, okay, you know, here's the mode I want to use. I think it was C Phrygian or, and so then, yeah, I'm kind of just like messing around, floating over it because it didn't have much of a groove for me to latch onto as opposed to the blues. I mean, blues, you really get, you know, you start grooving, you start feeling it and you get that repetition that Fabrizio was talking about. You're kind of telling a story. With this one, it was kind of like, all right, here's a heavy metal backing track that's not that rhythmic. So I'm just going to kind of, you know, pick the scale

43:03and, and shred a little bit.

43:12And here's Fabrizio's.

43:33Cool. Okay. I think it's really interesting that it ended up with a lot of similarities in that the lack of like rhythm to latch onto, it felt like it led to, in Fabrizio's solo, you know, kind of floating over the rhythm of the track and kind of trying some different things over the course of the solo, you know, getting even a little avant-garde with it at the end. For both of us, even though we approached the solo differently, we ended up in various ways, kind of fighting against the track.

44:15I was fighting the track and I couldn't enter into the rhythm of it. And so I was like, I'm going to play different things depending on what the drummer in the track is doing. And so that's, that's, that's what I aim to do. And now here is the AI solo.

44:51Oh, that was very interesting.

45:30I mean, it doesn't sound like a guitar at all, but it also kind of reminded me of like droids from Star Wars, you know, like making their beep boop noises. I was thinking what kind of pattern could the AI possibly have picked up that made it make those sounds, maybe faster notes. But as Keshav was saying, it did not have the attack. And so it just, it felt like it was this, some combination of a synthesizer and a flute making these intermittent sounds.

46:01But it also had that quality that the previous solo had that it wasn't going anywhere. It wasn't thinking of the solo as having any kind of development. It was thinking of the solo as just a space in which you could just put notes. No structure to it on either of the AI ones, old or new. When you guys talk about structure, you're talking about it in a kind of qualitative and like a musician vocabulary to me, but the blues sounded better to me. But you still thought there was no structure. Could you talk a little bit about what that means? You know, it's not like a mathematical structure in the sense that like an AI would be, you know, looking for.

46:34It's natural to me to describe it as like a narrative or sort of telling a story with the music or something like that. And I think that's what, you know, makes a good solo, you know, sound so distinctively human. The kind of development of a theme, it gives it kind of a narrative structure as opposed to just some kind of mathematical structure behind it. Like an emotional arc or even a plot arc. Like we talk about plots. Yeah, something like that. That's why it's natural to say that, you know, a good solo kind of tells a story over the course of it. What I think about it is sort of starting with an idea,

47:07like a musical idea, something you'll be able to sing and then enriching that idea as you develop it. You don't even have to think about guitar solos if you just take sort of songs, right? Usually there is a verse and then there's another verse. The other verse has maybe the same melody, but with a few changes or modifications that give you a sense of familiarity because you're still in kind of the same place that you were before, but you fight off the repetitiveness by adding some different musical elements.

47:37Or sometimes in music, of course, you can have like a certain statement being made and then a counter statement. But that was missing from the AI one, even in the blues case. And what I was hearing was that solo had no spaces. There were no spots in which the guitarist was signaling that a theme had just been completed. In that sense, the blues one was lacking something as well. The language we currently have to describe our experiences of human music

48:09is being tested in the era of machine music. We hear a tune in Keshav and Fabrizio's solos. We hear strings of notes when we hear the machine-generated ones. One has humanity, the other lacks it. But is there a pattern in the solos that sound tuneful? Or in the ones that have humanity? Because the tech optimists tell us that if there is a pattern, a machine will find it

48:39and will be able to replicate it. Or are we directly finding something unreplicable in human-made music? Something bordering on supernatural? Like a soul.

48:55Because if that's true, we have nothing to worry about. AI is never going to have that.

49:02That's it for Season 6 of Hi-Phi Nation. It's a shorter season because I have a book to write. Look out for that in the next year. If you want links to all of the technologies we've talked about, go to the webpage, hiphination.org, and look for the show page for this episode. I'll have links to Google MusicLM, OpenAI Jukebox, and everything else. If you want me to cover a certain topic in future seasons, you can also email me through the website. See you in Season 7. Hi-Phi Nation is produced, written, and edited by Barry Lam.

49:32Story editor for this season is Eleanor Gordon-Smith. For Slate Podcasts, Alicia Montgomery is VP of Audio, Derek John is executive producer of Narrative Podcasts, and Ben Richmond is senior director of operations. Follow Hi-Phi Nation on Facebook, Twitter, and Instagram at Hi-Phi Nation. That's H-I-P-H-I Nation. Complete transcripts, show notes, and reading suggestions for every episode are available at hiphination.org.
