AI Music

Reference-conditioned music, three honest modes.

Bring a song, vocal, or instrumental you already love. Give it new lyrics. AresGen renders a new track that keeps the original sonic identity, powered by MiniMax Music-01.

One engine · Three modes · MP3 output

  • 1 tuned engine: MiniMax Music-01 via aimlapi.com
  • 3 honest modes: song / voice / instrumental
  • Reference + lyrics input: file upload or URL
  • MP3 download: single file per render

Routed engine

aimlapi.com → MiniMax Music-01

One tuned engine, three honest modes (song, voice, and instrumental), all reference-conditioned.


By the numbers

One engine. Three modes. Two reference inputs. MP3 out.

  • 1 engine: MiniMax Music-01 (single platform)
  • 3 modes: song / voice / instrumental
  • 2 reference inputs: file upload or URL
  • MP3 output: single file, instant download

Backend surfaces

Three honest modes, one engine.

Song, voice, instrumental: each mode routes through the same reference-conditioned engine. No vendor drift between renders.

Song

Upload a reference song. Provide new lyrics. AresGen re-renders the track with the new lyrics while preserving the genre, tempo, and overall sonic identity of your reference.

  • mp3 / wav / ogg upload (max 10 MB) or audio URL
  • Lyrics required: custom text drives the new render
  • Output preserves reference tone and arrangement
Powered by MiniMax Music-01

Voice

Upload a reference vocal track. Provide new lyrics. AresGen generates a new vocal performance in the same vocal character, ready to drop into an existing instrumental.

  • Reference must be a clean vocal sample
  • Output preserves vocal timbre and character
  • Pair with the Instrumental mode for full-song workflows
Powered by MiniMax Music-01

Instrumental

Upload a reference instrumental. Provide lyrics. AresGen layers a generated vocal over the instrumental, matched to the tempo and key of the reference.

  • Reference must be vocal-free or vocal-light
  • New vocal layer aligned to the instrumental
  • No separate vocal stem in output: single mixed MP3
Powered by MiniMax Music-01

One platform

One tuned engine, not a multi-engine race.

aimlapi.com → MiniMax Music-01 · Reference-conditioned music generation

Single provider, intentional. No multi-engine drift between renders.

Capabilities

Honest gaps. No marketing fog.

Character preservation applies to surfaces with a vocal (Song and Voice), not Instrumental. Every other capability is enumerated below from the same facts module the audit gate enforces.

Music capabilities across 3 surfaces. Filled dot (●) indicates support; hollow dot (○) indicates no support.

Capability                         Song   Voice   Instrumental
Reference-conditioned              ●      ●       ●
Custom lyrics                      ●      ●       ●
Audio file upload (mp3/wav/ogg)    ●      ●       ●
Reference via URL                  ●      ●       ●
MP3 download                       ●      ●       ●
Vocal character preserve           ●      ●       ○
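
For readers who want to script against these rules, the capability matrix can be written down as plain data. This is only a sketch: the names MODES, CAPABILITIES, supports, and gaps are hypothetical, and the entries assume what the text above states, namely that vocal character preservation covers Song and Voice only and every other capability applies to all three modes.

```python
# Hypothetical encoding of the capability matrix; not AresGen's facts module.
MODES = ("song", "voice", "instrumental")

# Each capability maps to the set of modes that support it.
# Assumption: only vocal character preservation is limited to Song and Voice.
CAPABILITIES = {
    "reference_conditioned": set(MODES),
    "custom_lyrics": set(MODES),
    "audio_file_upload": set(MODES),
    "reference_via_url": set(MODES),
    "mp3_download": set(MODES),
    "vocal_character_preserve": {"song", "voice"},
}


def supports(mode: str, capability: str) -> bool:
    """Return True if the given mode supports the given capability."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return mode in CAPABILITIES[capability]


def gaps(mode: str) -> list[str]:
    """List the capabilities a mode does not support, sorted by name."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return sorted(cap for cap, modes in CAPABILITIES.items() if mode not in modes)
```

Calling gaps("instrumental") under these assumptions returns the single gap named above, and gaps("song") returns an empty list.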

How a render runs

Three steps, one contract

Click each step to see exactly what AresGen sends to MiniMax Music-01. No auto-advance. The chip you tap is the step you see.

Step: Upload reference

Drop in an mp3, wav, or ogg up to 10 MB, or paste a public audio URL. The reference is uploaded to the aimlapi.com purpose-tagged endpoint as either voice or instrumental input.
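
The input rules in this step can be checked locally before anything is uploaded. The sketch below is an illustration only, not AresGen's implementation: check_reference and Reference are hypothetical names, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and no call to aimlapi.com is made.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB
PURPOSES = {"voice", "instrumental"}  # the two purpose tags named in this step


@dataclass(frozen=True)
class Reference:
    source: str   # file name or public audio URL
    purpose: str  # "voice" or "instrumental"
    is_url: bool


def check_reference(source: str, purpose: str,
                    size_bytes: Optional[int] = None) -> Reference:
    """Validate a reference input against the limits stated on this page."""
    if purpose not in PURPOSES:
        raise ValueError(f"purpose must be one of {sorted(PURPOSES)}")

    # An http(s) URL is accepted as-is; its size cannot be checked locally.
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return Reference(source, purpose, is_url=True)

    # Otherwise treat the source as a file name and check type and size.
    extension = PurePosixPath(source).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    if size_bytes is None:
        raise ValueError("size_bytes is required for file uploads")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 10 MB upload limit")
    return Reference(source, purpose, is_url=False)
```

For example, check_reference("loop.wav", "instrumental", size_bytes=2_000_000) passes, while an 11 MiB file or a .flac file raises ValueError.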

Pricing

Metered, predictable, included.

Music renders are metered at 0.05 credits per generation, included in every paid AresGen plan. Free trial accounts start with enough credits for a handful of test renders. No separate music subscription.
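
As a worked example of the metering, the arithmetic at 0.05 credits per generation looks like this. The helper names cost_for and renders_for are hypothetical, and the sketch assumes a flat rate with no rounding or bundling rules beyond what is stated above.

```python
from decimal import Decimal

CREDITS_PER_RENDER = Decimal("0.05")  # rate stated on this page


def cost_for(renders: int) -> Decimal:
    """Credits consumed by a given number of renders."""
    if renders < 0:
        raise ValueError("renders must be non-negative")
    return CREDITS_PER_RENDER * renders


def renders_for(balance: Decimal) -> int:
    """Whole renders a credit balance covers at the flat rate."""
    if balance < 0:
        raise ValueError("balance must be non-negative")
    return int(balance // CREDITS_PER_RENDER)
```

Under these assumptions, 40 renders cost 2 credits, and a balance of 1 credit covers 20 renders. Decimal is used instead of float so that 0.05 is represented exactly.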

Use cases

Four honest workflows our customers run today.

Re-lyric an existing brand jingle without losing the original character.

Upload a clean vocal take of your brand jingle in voice mode and supply the new campaign lyrics. The vocal character carries over while the message updates.

Iterate lyrics on a reference track before booking studio time.

Use song mode to test different lyric drafts against your demo. Same arrangement, different message, every render.

Add a custom vocal layer over a loop you already own.

Drop a loop or instrumental into instrumental mode and supply the lyrics. Output is a mixed MP3 with the new vocal sitting on the loop.

Reuse a signature vocal style across multiple lyrics.

Upload a vocal-only reference in voice mode once, then render different lyrics against it for episodic content.

FAQ

Six answers before you ask.
