AI generated music is a thing now. Should we be worried?

Authors
  • Ben Lesh

I dove into trying to answer just this question a few months ago. As a musician myself, I see both the potential and the risks. AI is absolutely revolutionizing music creation and the industry at large right now, and it's only just getting started.

First, What Is Suno?

Suno is a generative AI music platform that creates complete songs - vocals, lyrics, instrumentation, drums, everything - from basic text prompts and audio samples. You describe what you want, give it some sample audio to work with, and it builds a full, production-ready track. No musicians or instruments required. No studio. No production experience.

And at this moment, it's not producing generic loops or background noise, either. It generates emotionally coherent, genre-appropriate songs that sound like they were written and performed by humans.

Mostly.

And that's the key difference right now. It sounds like a human COULD have produced this.

Some Examples

Here is a small sampling of the songs I've produced using Suno. They range widely in genre, and I tried many different combinations of prompts and samples.

From others (not me):

What I Discovered

The first several generations I made absolutely blew me away with how good everything sounded. The vocals alone wowed me every single time! After three months of listening to and making generated music, however, I've noticed a distinct "sound" that emanates from it, and I can now easily identify whether something is AI generated.

Taking that "sound" word literally, there is a lot of background noise in most songs - from white noise to faint audio artifacts drifting in the background. It handles the big things very well, but the details of a quality production signal chain still elude it. And that is why everything begins to sound the same the more you listen to it. From what I can hear, the AI music engine is doing to audio what LLMs do to language - assembling things in predictable patterns that it thinks humans will understand and like.

It is not, for example, modeling the physics behind a sound - it's just mimicking it based on what it's heard. Which means that every sound you generate will, by its very nature, sound the same across every generation you make with it. Forcing it to try something new with a sound is actually quite fun - I made a lot of crazy things that way - but in the end, all the pieces end up sounding the same. Choose any instrument, make some funky sounds with it, and stretch it to its breaking point, but you will eventually notice that nothing "new" can ever come from it.

Comparison To My Own Music

When I write music, I spend a LOT of time designing the sound first. I tweak and alter a sound until it fits what I am after - usually either an emotional trigger or a specific rhythm or pattern. Only then do I begin adding in more sounds and forming it into a cohesive whole. It's an iterative approach that builds on one success after another. You end up with a finished product that has strong foundations and clear sonic boundaries.

That's not what I see happening in music generation. It first tries to find the emotional framework behind a prompt, then it tries to match sounds to that framework. Sometimes it works, and the emotion and sounds all line up, but a large percentage of the time it obviously fails somewhere in the process. It either pushes the emotional angle too hard, making it sound forced, or the sonic elements don't line up and you get a jarring shift that makes you scratch your head in confusion.

In the end, it's a fundamentally different approach to making music. And it shows both the strengths of this approach and its limitations. It is absolutely amazing at mass-producing a given pattern. I validated this when I had it generate nearly an hour of soft jazz tunes. It was nice to listen to as background music while doing laundry, but there was no spark within the music that made me think about how talented the musicians were or made me feel like this was a live concert played by real people. It was just too formulaic and repetitive in nature.

My Verdict

It's early days still for this technology. I expect most of my issues with it will be resolved within a few years' time. Given how far it's come in such a short timeframe, I fully expect AI-generated music to become the industry norm very soon. I also think that live music has nothing to worry about yet, and won't for the foreseeable future. The human connection embedded in a music performance is not repeatable by AI, and hopefully never will be. However, for musicians like me who have no desire to perform the music we create, AI-generated music will eventually push us out of the way. There will always be a niche market for human-made music and the human connection embedded within it, but when it comes to passive listening, AI will soon become the norm.