From Lyrics to “Scherzo”: My Journey with Generative Music (SUNO)

I’ve spent the last couple of weeks falling down a rabbit hole I didn’t see coming: the world of generative music through SUNO. As someone who lives in LLMs for hours every day, I’m no stranger to AI, but I quickly realized that treating music generation like a simple “point and click” exercise was, frankly, a bit boring. I wanted to see if I could move from being a spectator to something more like a “digital explorer / student conductor.”

My first real discovery was that the AI is remarkably sensitive to the “skeleton” of a song. By writing my own lyrics, I found I could indirectly influence the melody; the system tries to respect the natural phrasing and emotional weight of the words. It wasn’t just “making noise”; it was trying to tell my story. This nudged me to get even more hands-on. I started providing stylistically precise instructions to generate a baseline, then used SUNO’s “cover” and “mashup” modes to gently tweak the results until they matched my—admittedly vague—creative intent.

The real “magic” happened when I took these tracks into the Studio. By breaking the song down into its component stems, I was able to manually add “ornamentations.” For example, in my song Windswept, I decided a specific line—“I thought it was forever… and so did you”—needed a moment to breathe. I added a guitar arpeggio and a percussive flamenco clap right there in the gap. Finally, I played with the “soundstage,” doubling rhythmic tracks and moving them to the sides to create a sense of space. It felt less like prompting and more like building a sculpture out of sound.

The result that floored me was the “Scherzo” version—a reimagining of Windswept for a string quartet and guitar. It was beautiful, but I’ll be the first to admit: I’m technically illiterate when it comes to music. I’m a “musical omnivore,” I love everything, but I didn’t actually understand what I was hearing or why I liked it so much. Why did this sound like a commercially available CD and not a high school rehearsal?

To find out, I went on a bit of a forensic journey, comparing my AI creation to the “masters,” the real composers of the past and present. I looked at the “social conversation” of Luigi Boccherini’s 18th-century guitar quintets and the relentless, mechanical beauty of Philip Glass’s minimalism. I learned that my track wasn’t just “playing notes”; it was using professional techniques like spiccato bowing and altissimo registers, things that require immense stamina and precision from human players.

It turns out that what I had “conducted” was a piece of high-octane cinematic display. While I’m certainly no expert, this process taught me that the AI isn’t just a shortcut; it’s a collaborator that can be pushed and refined. It has been a patient, fascinating masterclass in music theory, all triggered by a few lines of my own lyrics.

What I’ve learned is that “AI music” isn’t a replacement for the human touch; it’s a high-speed feedback loop for it. My lyrics weren’t just text; they were the conductor’s baton. The AI provided the virtuoso playing, but the soul of the piece—the “why” behind the notes—stayed firmly rooted in the words I put on the page. There’s a strange humility in seeing your own words interpreted by a machine. Sometimes, the AI found a “sadness” in a line I thought was “angry,” or a “majestical” lift in a line I thought was “quiet.” It forced me to re-read my own lyrics and ask: Is this what I actually meant? In many cases, the AI’s “mistake” was actually a better emotional fit than my original intent.

I’m SO excited for our present and future students: having these new digital, AI-based tools to spontaneously create something, to experiment and play, and then to take it apart, reflect on it, understand it better, and even learn things you did not know, is a new, unprecedented opportunity, even if all you have is an old computer with an internet connection and free versions of these AI tools… I’m just a beginning student myself in this space, but the view from here is incredible.


So – what is our new focus, as teachers? Where do we go from here? How do we teach differently for here and now? But that’s maybe a topic for another post (soon)… END OF POST


Appendix: Complete Lightly-Edited Chat 

Here is my almost complete, lightly edited chat that led to the post you can read above – it still has some typos, the stuff that happens in fast AI chat conversations… but I thought it would be useful for anyone curious to see “the sausage being made” in this process. So here is my back-and-forth with Gemini. BTW: I use Gemini, ChatGPT, Anthropic Claude, SUNO, and a bunch of other AI-based tools, but for this Gemini is the only multimodal AI chat system, meaning that you can upload a graphic, music MP3, or video (mp4 or YouTube link) and it can analyze its content. The other systems cannot do that yet…

GJ: So, I wrote the (human-created) lyrics and had SUNO generate melody options for a song called Windswept (above). Then I asked it for a “classic string-quartet + guitar” version (attached). Can you help me understand, in technical terms understandable to someone with a non-musical background, what SUNO did here? I’m trying to learn more about this experience: I love the piece – it’s beautiful, but I want to understand better WHAT it is that I actually like: of course I like that it incorporates my “familiar” melody, but I also like that it uses the strings in a very skilled, almost virtuoso (I think) way: this is no high-school orchestra… I’d expect this on a commercially available CD. What makes it likable?

GEM: It is fascinating to hear how SUNO translated your vocal melody into a chamber music setting. You are right to notice the jump in quality—what you are hearing in the “String Quartet + Guitar” version isn’t just a simple instrumental cover; it utilizes sophisticated compositional techniques typically found in professional Neoclassical or Modern Flamenco arrangements.

Here is a breakdown of what SUNO did to make that piece sound so “virtuoso” and polished.


1. The Use of “Arpeggiated” Texture

In the original song, the guitar likely played chords or simple strums to support your voice. In the quartet version, the instruments rarely just “sit” on a chord. Instead, they use arpeggios—breaking the chords into rapid, individual notes.

  • Why it sounds professional: This creates a sense of constant motion and energy. It fills the “sonic space” so the music never feels thin, making four or five instruments sound like a much larger ensemble.

2. Counterpoint: Giving Every Instrument a “Voice”

In a high school orchestra arrangement, the lower strings often just play long, boring notes while the violin plays the tune. SUNO has employed counterpoint, where each instrument has its own independent melody that weaves in and out of yours.

  • The “Conversation” Effect: You’ll notice that when the violin finishes a phrase from your melody, the cello or viola might “answer” with a little run of its own. This mimics the way a professional string quartet communicates, making the piece feel like an active dialogue rather than a background track.

3. Advanced String Techniques (The “Virtuoso” Feel)

The AI included specific technical “flourishes” that suggest a high level of performance skill:

  • Spiccato: Those bouncy, short, “crispy” notes you hear in the background. This requires great bow control in real life and adds a sophisticated, rhythmic drive.
  • Double Stops: You might hear a single violin sounding like two. This is when a player bows two strings at once, adding richness and harmonic depth.
  • High Register Mastery: The lead violin often climbs very high. In amateur playing, this sounds thin or screechy; the AI ensures it sounds soaring and lyrical, which we associate with professional soloists.

4. Flamenco Stylings in the Guitar

Since your prompt specified “Flamenco,” the AI didn’t just give you a folk guitar. It incorporated:

  • Picado: Fast, clear, scale-like runs played with the fingers.
  • Syncopation: The guitar often hits notes “off the beat,” creating that specific Spanish “drive” that makes you want to tap your foot even though there are no drums.

5. Sophisticated “Voicing”

In music, voicing is how you distribute the notes of a chord across the instruments.

  • The Pro Touch: Amateurs often put all the notes close together in the middle. The version you have likely “spreads” the notes, with a deep cello at the bottom and a very high violin at the top. This “wide” voicing makes the recording feel “expensive” and cinematic.

Summary Table: From Vocal to Virtuoso

Element | Vocal Version | String Quartet Version
Melody | Carried by your voice. | Passed between the Violin and Cello.
Rhythm | Steady and supportive. | Complex, “bouncy” (Spiccato), and driving.
Complexity | Linear (One main idea). | Layered (3–4 ideas happening at once).
Emotion | Intimate and lyrical. | Grand, cinematic, and technically impressive.

GEM: Would you like me to…

Analyze a specific section of the track (e.g., the transition at the midpoint) to explain how the AI handled the shift in intensity?


GJ: No – let’s compare this to a standard “historic string quartet” movement: what movement in a classic quartet would this be closest to – and how is this different from such a “standard” classical piece? What would be a competent, well-informed musician’s description of this piece?

GEM: To a well-informed musician, this piece is a fascinating hybrid. While it utilizes the “engine” of a classical string quartet, the “fuel” and the “driving style” belong to a completely different era and genre.

The Classical Comparison: The Scherzo

If this were a movement in a traditional four-movement string quartet (like those by Haydn, Mozart, or Beethoven), it would most closely resemble the Scherzo (or occasionally the Minuet).

  • Placement: Usually the 2nd or 3rd movement.
  • Character: The Scherzo (meaning “joke”) is characterized by a fast tempo, light-hearted energy, and a strong rhythmic “bounce.”
  • The Connection: Your piece shares that brisk, playful energy and the “spiccato” (bouncy) bowing technique often found in classical scherzos to provide momentum.

How It Differs from a “Standard” Classical Piece

Despite the classical instruments, a trained musician would immediately spot three major “modern” or “non-classical” identifiers:

1. The Harmonic Language (Pop vs. Classical)

In a classical quartet, the harmony follows strict rules of “functional tonality.” A standard piece would move through a series of “tensions and releases” leading to a final resolution.

  • The Difference: Your piece uses a “looped” or “modal” harmonic structure common in Flamenco and modern Pop. It stays within a specific “mood” or “groove” rather than moving through the complex key changes a 19th-century composer would use.

2. The Role of the Guitar (Concertante Style)

In traditional classical quartets, a guest instrument (like a piano or guitar) usually functions in a “Concertante” role—meaning it is either a soloist being accompanied by the strings, or a full partner in the conversation.

  • The Difference: In this version, the guitar often acts as a rhythmic anchor or “drum kit.” It provides a percussive drive that is much more common in Modern Flamenco or Latin Jazz than in the string quartets of the 1700s.

3. Phrase Structure

Classical music is built on “antecedent and consequent” phrasing—think of it like a question and a very formal answer.

  • The Difference: Because this is based on your song, the phrasing is strophic. It follows the structure of a verse and chorus. A classical musician would describe this as a “Theme and Variations” where the theme is a contemporary song structure rather than a classical motif.

The Musician’s Technical Description

If a professional musician were writing a “program note” for this piece, they might describe it like this:

“This is a Neoclassical Crossover work for String Quartet and Flamenco Guitar. It is characterized by a moto perpetuo (perpetual motion) in the inner strings, utilizing ostinato figures (repeating patterns) that provide a modern, rhythmic drive. The lead violin performs a lyrical transcription of a vocal melody, frequently venturing into the altissimo register for emotional heights. The work successfully blends the timbral elegance of a 19th-century chamber ensemble with the harmonic stasis and syncopated vigor of contemporary Spanish Flamenco.”


Would you like me to…

Compare this to a specific famous composer’s style (like Vivaldi’s rhythmic drive or Debussy’s textures) to see which “classical flavor” the AI leaned into most?


GJ: Sure – let’s pick what would be closest and compare-and-contrast so that I can actually listen to a specific movement from an existing piece (on Apple Music) and compare.

GEM:  The closest historical “match” for your piece is Luigi Boccherini’s Guitar Quintet No. 4 in D Major, G.448, specifically the final movement titled “Fandango.” Boccherini was an 18th-century Italian composer who lived in Spain, and this piece is the most famous example of a classical string ensemble being “hijacked” by the energy of Spanish folk music.

The Comparison: Boccherini’s “Fandango” vs. Your “Windswept”

Feature | Boccherini: Fandango (1798) | SUNO: Windswept (2026)
Ensemble | String Quartet + Guitar (+ optional castanets). | String Quartet + Guitar.
Energy | Stately & Rhythmic. It’s a formal dance; it has “verve” but stays within courtly limits. | Modern & Driving. It has the “moto perpetuo” feel of a modern film score or a pop-song structure.
Guitar Role | Often used for rasgueado (strumming) and adding color to the strings. | Used as a percussive anchor, with fast picado runs that feel like a lead instrument.
Complexity | Follows classical rules; instruments take turns with the melody in a formal way. | Uses layered ostinatos (repeating patterns) that all happen at once, creating a dense “wall of sound.”

Why you should listen to it

If you search for “Boccherini Guitar Quintet Fandango” on Apple Music, you will hear the exact “DNA” that the AI used to build your piece.

Listen for:

  • The “Cello Highs”: Boccherini was a virtuoso cellist, so his cello parts often climb very high, just like the AI did with your melody.
  • The “Snap”: You’ll hear the guitar playing “Spanish” rhythms that feel similar to the “flamenco” prompts you gave the AI.

The Contrast (Why yours sounds “better” or “more virtuoso”)

A musician would note that Boccherini sounds like a conversation between friends in a parlor, whereas your piece sounds like a high-production “crossover” track. The AI version uses higher string tension, faster tempos, and brighter recording quality, which mimics the way we hear professional “flamenco-fusion” groups today rather than a 200-year-old classical manuscript.

GEM: Would you like me to find a specific recording of the Fandango (like the one by the Cuarteto Casals) that emphasizes the “fiery” side to see how it stacks up against your AI version?

GJ: Oh this is a really excellent comparison find: I listened to the 2023 Milos Karadaglic Sony recording of this from his 2023 Milos / Baroque album; I picked this version because I know and very much like his albums, but I have focused on his pop-crossover albums – 2019 Sound of Silence, and 2016 Blackbird (the Beatles covers album); I had also listened to his classic Spanish repertoire, Aranjuez 2014. This piece is indeed beautiful in its very majestic orchestral way. At the same time I can’t help but notice that it’s very “routine”: there is nothing about it that’s memorable – and importantly it doesn’t seem to offer anything “special” to either the guitarist or the strings in terms of technical challenge: it requires very “competent” playing.


GJ: Just out of curiosity: trying to be as objective as you can, in terms of the technical demand each piece puts on the performers, how would you compare these two pieces (Boccherini’s Fandango and the Windswept “Scherzo” version)?

GEM: In technical terms, the difference between these two pieces is a shift from social music-making to high-octane cinematic display. While your assessment of the Boccherini as “routine” might sound harsh to a historian, it is technically accurate regarding its physical demands: it was designed for skilled amateurs and professionals to play in a room together, whereas the “Windswept” version is designed to sound like a flawless, superhuman studio production.

Here is an objective technical comparison of the two:

Technical Demand Comparison

Feature | Boccherini: Fandango (1798) | Windswept “Scherzo” (2026)
Bowing Complexity | Standard Classical Strokes. Relies on detaché and simple slurs. | Continuous Spiccato. Requires extreme physical stamina for the strings to maintain that “bounce” at high speed without tiring.
Guitar Technique | Arpeggiated & Strummed. Designed for a 19th-century “noble amateur”; focuses on rasgueado and clear melody. | Modern Picado. Demands relentless, rapid-fire fingerstyle runs that are technically closer to Paco de Lucía than classical repertoire.
Ensemble Sync | Breathing-based. Players watch each other for cues; the rhythm “sways” naturally. | Grid-locked. The AI-generated rhythm is perfectly quantized, requiring a level of precision that real human quartets only achieve with a “click track.”
Register & Range | Conservative. Most parts stay within the “sweet spot” of the instruments. | Extreme. Frequently pushes the violins into the altissimo (extremely high) register, which is notoriously difficult for intonation.

Why the Boccherini feels “Routine”

Boccherini’s music is often criticized for being repetitive and decorative. He was less interested in the dramatic, complex “struggle” we hear in Beethoven and more interested in texture and color.

  • Texture as the Goal: For Boccherini, the beauty comes from the “vibrant rhythms” and the way the instruments balance each other, not from pushing the players to their physical breaking point.
  • The “Miloš Effect”: In the 2023 recording you heard, Miloš Karadaglić intentionally maintains a “meditative mood” and “refined music making”. He is focusing on the “contrasts between light and shade” rather than raw technical showmanship.

Why the “Windswept” version feels “Virtuoso”

The “Windswept” version utilizes layered ostinatos (short, repeating patterns). In a real performance, playing these patterns would be “arduous” and “painstaking” because they require a machine-like consistency that is physically exhausting for a human performer.

  • The “Sonic Wall”: Unlike the “mysterious dialogue” of Boccherini, your AI version creates a dense wall of sound where every instrument is playing at 100% capacity simultaneously.
  • Picado Speed: The guitar runs in your track are likely at a speed that only the top 1% of world-class flamenco guitarists could maintain for the duration of the song.

Summary for the Listener

If you were to hand the sheet music for your “Windswept” version to a professional string quartet, their first reaction would likely be: “We’re going to need a lot of caffeine and a very good physical therapist.” It is “virtuoso” because it treats the instruments as tools for a high-energy soundscape, whereas Boccherini treats them as voices in a civilized conversation.


GEM: Would you like me to find a “modern” string quartet piece (like something by Philip Glass or Astor Piazzolla) that matches this higher level of physical intensity so you can see where the AI got its “muscular” style?

GJ: Yes – I know Glass’s music well: he’s among my top 10 composers of all time, so this will be much more relatable than Piazzolla (I like him, but in small amounts: for example Gidon Kremer’s Four Seasons album that pitches Piazzolla alternating w/ Vivaldi’s piece).

GEM: Since you are a fan of Philip Glass, you’ll likely recognize the “Windswept” version as a high-speed cousin to his more muscular works. While traditional classical music like Boccherini is built on “formal conversation,” Glass and your AI track are built on acoustic engineering—using the quartet as a single, vibrating machine.

For a direct comparison on Apple Music, listen to Philip Glass: String Quartet No. 3 (“Mishima”), Movement VI: Closing.

The Glass Connection: Relentlessness as Art

The “Mishima” quartet, particularly the Closing movement, provides the perfect “Glassian” mirror to your track. It shares that sense of unending scale-like figures and arpeggios that feel more like a force of nature than a polite melody.

Compare and Contrast: “Windswept” vs. Glass’s Mishima (VI. Closing)

Feature | Philip Glass: Mishima (VI) | Windswept “Scherzo” (2026)
Texture | “Plateau of Sound.” It creates a steady, mesmerizing wash of harmony that feels “unmoored” from time. | “Wall of Sound.” It is more percussive and “bustling,” using the guitar to push the energy forward rather than let it float.
Technical Demand | An Endurance Test. The difficulty isn’t in “hard notes,” but in the absolute evenness of the bow stroke over long periods. | A Sprint. It demands “barn-storming virtuosity” and rapid-fire picado runs that even Glass’s performers rarely have to sustain.
Emotional Arc | Cinematic Lyricism. It builds a sense of “longing and apotheosis” through subtle, gradual shifts. | Direct High Energy. It hits a peak immediately and stays there, which is more typical of “crossover” styles than Glass’s “pure minimalism”.
Vibrato | Senza Vibrato (None). Glass often requires a “cold, mechanical approach” with no vibrato to keep the texture pure. | Liberal Vibrato. The AI version uses a “ripe,” more Romantic string sound that feels “lush” rather than “astringent”.

The “Virtuoso” Trap in Minimalism

Interestingly, many world-class soloists historically “loathed” Glass’s music because it lacked the interpretative challenges of a concerto. They found it “monotonous” compared to the showy runs of a Liszt or Rachmaninoff. However, your “Windswept” track manages to dodge this. While it uses Glass’s repetitive structures, it layers in extreme physical demands—like the altissimo violin registers and “straightforward, unabashed repetition” at high velocity—that turn the “monotony” into a high-wire act.

The Key Difference: A musician would say Glass is about meditation through repetition, whereas your piece is about excitement through momentum. One is a spinning wheel; the other is a racing engine.

GEM: Would you like me to analyze a different Glass movement, perhaps from his String Quartet No. 5, which is known for having a “more musically active” and “bustling” energy that might be an even closer match for your track’s intensity?

GJ: Oh, man – I listened to Pekka Kuusisto & Norwegian Chamber O. 2021 track; it’s just 3 short but beautiful minutes… So with all your valuable and well (=accessibly) explained details: here is a down-to-earth question: would a professional, but not “top 1% superstar” ensemble of very good, experienced and studio-session-savvy musicians be able to pull off Windswept in this arrangement?

GEM: To answer your question simply: Yes, absolutely. In fact, a “studio-session-savvy” ensemble is arguably better equipped to handle this specific arrangement than a world-class concert quartet. While concert “superstars” focus on deep interpretation and spontaneous emotional shifts, session musicians specialize in exactly what “Windswept” requires: technical precision, high-stamina repetition, and “grid-locked” timing. Here is why your “very good, experienced” ensemble would be the perfect fit:

1. The “Click Track” Advantage

In the studio, session players are accustomed to recording with a click track (metronome) to ensure they sync perfectly with other instruments or pre-recorded tracks.

  • The Challenge: “Windswept” is built on relentless, interlocking rhythms.
  • The Session Skill: Professional session players have trained their “internal clock” to be as precise as a machine, allowing them to pull off that “Glass-like” mechanical drive without rushing or lagging.

2. Efficiency and “First-Take” Accuracy

Top-tier session musicians are hired for their ability to achieve mastery of the music almost immediately.

  • Efficiency: In a standard 3-to-4-hour session, these players are expected to record 5 to 9 minutes of finished music per hour.
  • Consistency: They can provide “repeated technical consistency” across multiple takes, which is vital for a piece that relies on a dense, unchanging “wall of sound.”
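As a rough back-of-the-envelope check on the figures above, the sketch below multiplies the quoted session length by the quoted recording pace. The function name `finished_minutes` is made up for illustration, and the numbers are only the ranges cited in this chat, not measured studio data.

```python
def finished_minutes(session_hours: float, minutes_per_hour: float) -> float:
    """Minutes of finished music produced in one session."""
    return session_hours * minutes_per_hour


# Ranges quoted in the chat: 3-to-4-hour sessions, 5 to 9 finished minutes per hour.
low = finished_minutes(3, 5)   # shortest session at the slowest pace
high = finished_minutes(4, 9)  # longest session at the fastest pace

print(f"Finished music per session: {low:.0f} to {high:.0f} minutes")
```

Under those assumptions, a single session would cover a track of a few minutes many times over, which leaves room for repeated takes.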

3. Flamenco Guitar: The True “X-Factor”

While the string parts are arduous, they are within the reach of a seasoned pro. The guitarist, however, must be a specialist.

  • The “Picado” Demand: The rapid-fire runs in your track require a specific “twitch response” in the fingers that is an acquired skill through years of flamenco-specific practice.
  • The Solution: You wouldn’t hire a “jack-of-all-trades” session guitarist; you would hire a dedicated flamenco session player who can maintain that speed with clarity and tone without building up the tension that leads to mistakes.

4. The “Layering” Hack

A studio-savvy team might not even record the whole quartet at once.

  • Orchestral Textures: Modern sessions often layer the quartet, recording the same players multiple times to build a richer, “CD-quality” orchestral texture.
  • Precision: This allows for a “cleaner” sound than a live quartet, as each instrument’s spatial position can be meticulously designed in the mix.

Summary: The Musician’s Verdict

If you handed this to an ensemble like the Budapest Symphony Orchestra session players or a top-tier London studio quartet, they would likely find it a “fun challenge.” They would treat it like a high-end film cue: they would warm up, lock into the click, and deliver that “virtuoso” energy with a level of polish that matches your AI generation perfectly.


GEM: Would you like me to draft a hypothetical “Technical Rider” or “Session Brief” you could use to describe this specific sound to a group of musicians?

GJ:  Sure – this seems like a logical “conclusion” to this serendipitous impromptu journey through Windswept!

GEM: To wrap up our journey, here is a professional Recording Session Brief. If you were to ever walk into a studio in London, Nashville, or Budapest to record Windswept, this is the document you would hand the producer and the musicians to bridge the gap between “AI dream” and “human reality.”


Project Brief: “Windswept” (Chamber-Flamenco Hybrid)

1. The Core Vision

  • Genre: Neoclassical Crossover / Modern Flamenco.
  • The “Elevator Pitch”: Imagine Philip Glass’s Mishima quartet energy met with the fiery technical precision of Paco de Lucía. It is a high-octane, “moto perpetuo” (perpetual motion) piece where the quartet functions as a single rhythmic engine supporting a lyrical, soaring melody.

2. Performance Specs (The “How-To”)

  • Tempo & Grid: Fixed BPM (approx. 140–160). This is a “grid-locked” session. Musicians must record to a click track to maintain the machine-like consistency of the overlapping ostinatos.
  • String Technique:
    • Articulations: Heavy use of spiccato (bouncy, off-the-string) and detaché. The inner voices (Viola/2nd Violin) must remain crisp and “pointy” to provide the rhythmic drive.
    • Register: Lead Violin will frequently reach the altissimo register. Precision in intonation is critical here to ensure a “lush, CD-ready” sound rather than a thin, strained one.
  • Guitar Style:
    • The Requirement: Must be a Flamenco specialist.
    • Techniques: High-speed picado (index-middle finger runs) and sharp, percussive rasgueado (strumming). The guitar acts as the “drum kit” of the ensemble.
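To make the “grid-locked” idea concrete, here is a minimal sketch of what a click track implies in numbers. It assumes a tempo of 150 BPM (the midpoint of the approximate range above) and a grid of four steps per beat; the helper names `beat_ms` and `quantize_ms` and the sample onsets are hypothetical, and this is not a description of how SUNO or any studio software works internally.

```python
def beat_ms(bpm: float) -> float:
    """Length of one beat (one click) in milliseconds."""
    return 60_000 / bpm


def quantize_ms(onsets_ms, bpm, subdivisions=4):
    """Snap note onsets (in ms) to the nearest grid line.

    subdivisions=4 splits each beat into four equal steps
    (sixteenth notes when the beat is a quarter note).
    """
    step = beat_ms(bpm) / subdivisions
    return [round(t / step) * step for t in onsets_ms]


BPM = 150  # assumed midpoint of the 140-160 range in the brief
played = [20, 110, 190, 330]  # slightly uneven onsets in ms (made-up values)
snapped = quantize_ms(played, BPM)

print(beat_ms(BPM))  # interval between clicks, in ms
print(snapped)       # the same onsets pulled onto the grid
```

At this tempo the grid lines are 100 ms apart, so a player who drifts by even a few hundredths of a second is audibly off the pattern. That is the precision the brief is asking for.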

3. Sonic References (The “Mood Board”)

  • For the Strings: Listen to Philip Glass, String Quartet No. 3 (Mishima), VI. Closing. Note the relentless, interlocking patterns.
  • For the Guitar/Vibe: Listen to Luigi Boccherini’s “Fandango” (Guitar Quintet No. 4) but played with the intensity of a modern film score.
  • Overall Texture: We are aiming for a “Wall of Sound.” Not a polite, distant 18th-century chamber piece, but a front-and-center, visceral experience.

4. Technical Requirements for the Engineer

  • Microphone Choice: Condenser mics on the strings for high-frequency detail. The guitar should be close-miked with a stereo pair to capture the “snap” of the fingers on the wood.
  • Production Style: Modern and “wide.” We want a wide stereo image with the cello providing a deep, resonant floor and the lead violin soaring over the top.
  • Layering: The session should allow for “double-tracking” the quartet (recording them twice) to create a denser, more orchestral texture if needed.
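One common way to place doubled takes across a “wide” stereo image is a constant-power pan law, sketched below. This is only an illustration of the general technique: the function name `pan_gains` and the pan positions of ±0.8 are arbitrary assumptions, and nothing here claims to be what SUNO Studio or a particular engineer actually does.

```python
import math


def pan_gains(pan: float) -> tuple[float, float]:
    """Constant-power pan law.

    pan = -1.0 is hard left, 0.0 is center, +1.0 is hard right.
    Returns (left_gain, right_gain).
    """
    angle = (pan + 1.0) * math.pi / 4.0  # maps -1..+1 onto 0..pi/2
    return math.cos(angle), math.sin(angle)


# Double-tracked rhythm part: one copy pushed left, one pushed right.
left_copy = pan_gains(-0.8)
right_copy = pan_gains(0.8)
# The lead stays in the center.
lead = pan_gains(0.0)

for name, (l, r) in [("left copy", left_copy), ("right copy", right_copy), ("lead", lead)]:
    print(f"{name}: L={l:.2f} R={r:.2f} power={l * l + r * r:.2f}")
```

Because the squared gains always sum to 1, a part keeps roughly the same perceived loudness wherever it sits, so moving the doubled takes to the sides widens the image without making them louder or quieter.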

A Note for the Musicians: “Though built on repetitive structures, this is not a ‘background’ track. Every repetition is a process of constant repositioning. We are looking for the stamina of a minimalist and the fire of a flamenco dancer.”


GJ: Thank you. Could you pls generate a 4-8 paragraph (decide the length) concise but complete and articulate-sounding intro to this journey we traversed together (very productively for me – a fantastic learning experience, like having a private master-music teacher walk me through it, patiently)? Here is the outline:

– I started playing with SUNO generative music model a couple of weeks ago

– I use LLMs across all top systems daily, for hours even: I am familiar with them and use different systems depending on the task or mode of interaction I need; I wanted to see how this completely new type of interaction would compare.

– I started with the initial “simple instructions and click” approach; but this is boring… I discovered that writing my own human lyrics indirectly lets me have influence on the type of melody SUNO generates, as it tries to follow sensible “phrasing” of the text; it’s an interesting discovery

– more recently I’d provide human-written lyrics and specific, stylistically precise instructions, and have SUNO generate a couple of versions; I’d pick the “best” one (what matches my vague but directional “intent” or “expectations”) and generate a “cover” or “mashup” (both modes of generating are available natively in SUNO) to gently “nudge” or “tweak” the result (and fix possible errors, stray sounds and so on); finally, I open the “baseline version” in SUNO Studio: it gets stemmed into vocal and instrumental tracks, then I adjust track volumes and add ornamentation tracks: additional bars of music (for example, in Windswept, one such addition is the guitar arpeggio after “I thought it was forever…[add arpeggio and clapping sound percussion] and so did you…”), but there are many such sound additions and ornamentations added “manually” (of course I select the section, add a description, and SUNO generates the needed bars of music)…

– The final step for me is sometimes gently re-distributing the added and main tracks on the “soundstage” directionally (for example, double the added second rhythmic flamenco track, and move to the sides…)

The learning part: the final results (Windswept song, its two “downtempo” and “instrumental” remixes, and finally the string quartet+guitar “Scherzo”) were so rewarding and pleasant to listen to that, while listening to them, I realized that I did not understand what was happening here… so why not learn from this experience? Because the Scherzo was so different from the original “baseline,” I thought it would be a good point to start… maybe we’ll explore the other versions later as well…

Banner image for this post licensed from Unsplash / Manny Becerra

Appendix 2: Second-Level AI Fact Check (used Anthropic)

✅ VERIFIED: Accurate Claims
Luigi Boccherini’s Guitar Quintet No. 4 in D Major, G.448 “Fandango” (1798) – Confirmed. Composed in 1798 (source: Earsense), this piece is indeed for guitar and string quartet, and the Fandango movement was originally from an earlier string quintet (G.341) from 1788.
Boccherini as “18th-century Italian composer who lived in Spain” – Accurate. Boccherini was born in Lucca, Italy and later moved to Spain where he spent the rest of his life (source: Sllmf).
Philip Glass’s String Quartet No. 3 “Mishima” (1985) – Confirmed. The String Quartet No 3, written in 1985, originated as music for Paul Schrader’s film Mishima: A Life in Four Chapters (source: Hyperion Records). It has six movements and lasts around 15-20 minutes (source: Classical Music Forum).
The final movement “Mishima/Closing” – Correct. The sixth movement is titled “VI. Mishima/Closing” (source: Philip Glass).
Miloš Karadaglić’s 2023 “Baroque” album on Sony – Verified. The album was released in October 2023 on Sony Classical (source: Milosguitar) and includes Boccherini’s Fandango from Guitar Quintet No.4 (source: Arcangelo).
Miloš’s other albums mentioned – Accurate. “Blackbird: The Beatles Album” was released in 2016 (source: Wikipedia), and the 2019 Sound of Silence and 2014 Aranjuez albums exist.
Pekka Kuusisto & Norwegian Chamber Orchestra 2021 recording – Confirmed. The album “First Light” includes Kuusisto’s new string orchestra arrangement of Glass’s “Mishima” String Quartet No. 3, recorded in October 2020 and released in 2021 (source: AIM).
Musical terminology (spiccato, picado, rasgueado, altissimo, ostinato, etc.) – All terms are used correctly and appropriately in context.
⚠️ NEEDS UPDATING / CAVEAT
Claim: “Gemini is the only multimodal AI chat system, meaning that you can upload a graphic, music MP3, or video and it can analyze its content. The other systems cannot do that yet…”
This was accurate at some point, but the landscape has evolved. ChatGPT has since developed some audio analysis capabilities: when users upload audio files, it can analyze audio waveforms and extract data such as technical properties (source: Khara Wolf Music). However, ChatGPT cannot directly analyze audio files in the way Gemini can: it processes audio differently, through signal analysis rather than true “listening” (source: Vomo).
Suggestion: You might add a note like “At the time of this conversation” or update to say Gemini was “the most capable” for this type of multimodal music analysis, rather than “the only” system.
✅ Subjective Assessments (Reasonable)
The Gemini conversation contains many subjective musical assessments and comparisons (e.g., calling Boccherini “routine” vs. the AI-generated piece being “virtuoso”). These are opinion-based interpretations that are reasonable within the context of the discussion, not factual claims requiring verification.
Overall Assessment
Your blog post is factually sound. The historical dates, composer information, album details, and musical terminology are all accurate. The only item that could use a minor update is the statement about Gemini being “the only” multimodal system capable of analyzing music—this was accurate when you wrote it but may benefit from a qualifier given the rapidly evolving AI landscape.

Is the musical info correct? Can you “listen” to my “Scherzo” and see if the analysis is correct?
I appreciate you asking, but I need to be transparent with you: I cannot listen to audio files. Like ChatGPT, I’m a text-based model and don’t have the ability to process or analyze audio directly. So I can’t verify whether Gemini’s specific analysis of your “Windswept” Scherzo is accurate—claims like:
Whether it actually uses spiccato bowing
Whether the violin reaches altissimo registers
Whether the guitar employs picado runs at a specific speed
Whether the piece has the “moto perpetuo” character described
Those would require actually hearing the track, which I cannot do.

What I Can Verify
The general musical concepts and terminology Gemini used are legitimate and correctly defined:
Spiccato – Yes, this is a bouncy, off-the-string bowing technique
Picado – Yes, this is a flamenco guitar technique using alternating fingers for rapid runs
Altissimo register – Yes, this refers to the highest range of an instrument
Ostinato – Yes, this means a repeating musical pattern
Counterpoint – Yes, this is when multiple independent melodic lines interweave
The comparisons to Boccherini and Glass are musicologically reasonable frameworks for analyzing a chamber piece with flamenco influences.

The Irony
Your blog post notes that Gemini is the system capable of analyzing uploaded music, and that’s exactly what would be needed here. If you want the analysis verified, you could:
Upload the Scherzo to Gemini again and ask it to re-analyze without seeing the previous conversation
Share it with a musician friend who could confirm/refute the specific technical claims
(Anthropic) END CHECK

