TECHNICAL DOCUMENTATION

How SHE works.

A technical overview of the engine architecture, input parameters, output formats, and generation pipeline. For teams evaluating SHE for integration.

HOW IT WORKS

Four layers, from physics to export.

SHE is not a neural network. It is a rule-based generation engine with four distinct layers of responsibility. Each layer has a single job and a clear boundary.

01

Core — Music Theory Primitives

The mathematical foundation. Intervals, timing grids, harmony math, and meter models that remain true regardless of genre or style. This is the universal physics of music.

02

Library — Validated Catalogs

Structured collections of chord progressions, rhythmic patterns, genre behaviors, mood profiles, and instrument definitions. Every entry is schema-validated and indexed. Labels like "Soul" or "Uplifting" map to specific, tested generation behaviors — not prompt interpretations.

03

Services — The Composers

Where musical decisions are made. Services choose progressions based on mood, generate basslines from harmonic context, plan section structure, and shape arrangement energy. Five domains: composition, selection, arrangement, rendering, and analysis.

04

Execution — Orchestration & Export

The conductor layer. Takes your parameters, routes them through the services, and exports the final package: audio files, MIDI, structured metadata, and machine-readable JSON/XML — all simultaneously.
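The four-layer split can be sketched as plain functions. Everything below is illustrative, assumed for the sake of the sketch, and is not SHE's actual module layout or API.

```python
# Illustrative sketch of the four-layer flow; all names and values here
# are hypothetical, not SHE's real module structure.

def core_grid(tempo: int) -> float:
    """01 Core: universal timing math (seconds per beat)."""
    return 60.0 / tempo

def library_lookup(genre: str) -> dict:
    """02 Library: a validated catalog entry behind a genre label."""
    catalog = {"house": {"progression": "I-V-vi-IV", "swing": 0.0}}
    return catalog[genre]

def services_compose(entry: dict, beat_seconds: float) -> dict:
    """03 Services: where the musical decisions are made."""
    return {"progression": entry["progression"], "beat_seconds": beat_seconds}

def execute(genre: str, tempo: int) -> dict:
    """04 Execution: route parameters through the layers, package the result."""
    return services_compose(library_lookup(genre), core_grid(tempo))
```

The point of the sketch is the boundary discipline: Core knows no genres, Library makes no decisions, Services never touch export.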

INPUT PARAMETERS

What you control.

Every generation starts with a set of parameters. Each one maps to specific musical logic in the engine — not a text prompt interpretation.

Genre

Selects from validated genre catalogs that shape rhythm, instrument choice, and harmonic style.

House, Jazz, Electronic, Soul, Ambient, Hip-Hop

Mood

Mood profiles influence harmonic color, rhythmic intensity, and arrangement density. Multiple moods can be combined.

Energetic, Melancholic, Uplifting, Dark, Peaceful

Key

Sets the tonal center for the entire piece. All harmonic decisions resolve relative to this key.

C, D, F#, Bb — any of the 12 chromatic pitches

Mode

Determines the scale and harmonic character. Modes shape whether music feels bright, dark, suspended, or resolved.

Ionian (major), Dorian, Mixolydian, Aeolian (minor), Phrygian
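The interval patterns behind these modes are standard music theory; a key plus a mode fully determines the available pitch classes. A minimal sketch (the function name is illustrative, and note spelling uses sharps for simplicity):

```python
# Deriving a scale from key + mode. Interval patterns are standard
# music theory; names here are illustrative, not SHE's API.

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

MODES = {
    "ionian":     [0, 2, 4, 5, 7, 9, 11],  # major
    "dorian":     [0, 2, 3, 5, 7, 9, 10],
    "phrygian":   [0, 1, 3, 5, 7, 8, 10],
    "mixolydian": [0, 2, 4, 5, 7, 9, 10],
    "aeolian":    [0, 2, 3, 5, 7, 8, 10],  # natural minor
}

def scale(key: str, mode: str) -> list[str]:
    """Pitch classes of the scale, spelled with sharps."""
    root = NOTES.index(key)
    return [NOTES[(root + step) % 12] for step in MODES[mode]]
```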

Chord Progression

Choose from a validated catalog of progressions, or let the engine select based on genre and mood context. Each progression carries metadata: formula, cadence type, and harmonic signature.

I–V–vi–IV, ii–V–I, vi–IV–I–V
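How a Roman-numeral formula resolves to concrete chords in a key can be sketched in a few lines. This minimal version uses major-scale degrees and the convention that lowercase numerals are minor triads; the engine's progression metadata (cadence type, harmonic signature) is not modeled here.

```python
# Resolving a Roman-numeral progression in a major key.
# A simplified sketch, not SHE's catalog logic.

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR = [0, 2, 4, 5, 7, 9, 11]
DEGREES = {"i": 0, "ii": 1, "iii": 2, "iv": 3, "v": 4, "vi": 5, "vii": 6}

def realize(progression: str, key: str) -> list[str]:
    root = NOTES.index(key)
    chords = []
    for numeral in progression.split("-"):
        degree = DEGREES[numeral.lower()]
        note = NOTES[(root + MAJOR[degree]) % 12]
        # Lowercase numerals conventionally mark minor chords.
        chords.append(note if numeral.isupper() else note + "m")
    return chords
```

For example, I-V-vi-IV in C major resolves to C, G, Am, F.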

Tempo

Beats per minute. Affects rhythmic grid density, swing behavior, and section energy calculations.

60–200 BPM (default: 120)

Song Form

Defines the structural blueprint: which sections appear, in what order, and how many bars each contains.

Full (intro–verse–chorus–bridge–outro), Loop, Verse-Chorus

Complexity

Controls harmonic and rhythmic density across the piece. Higher complexity increases chord-per-bar rates, syncopation, and voice movement.

Low, Medium, High

Resolution Style

Shapes how phrases and sections resolve harmonically — whether cadences feel final, suspended, or open-ended.

Resolved, Suspended, Open
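The parameter surface above can be sketched as a typed request object. Field names mirror this page; the validation rule follows the documented 60-200 BPM range and 120 default. Everything else (defaults, the class itself) is an assumption for illustration, not SHE's request schema.

```python
# Hypothetical typed view of the documented parameters; not SHE's
# actual request model.

from dataclasses import dataclass

@dataclass
class GenerationRequest:
    genre: str
    mood: list[str]                      # multiple moods can be combined
    key: str = "C"
    mode: str = "ionian"
    tempo: int = 120                     # documented default
    form: str = "full"
    complexity: str = "medium"
    resolution_style: str = "resolved"

    def __post_init__(self):
        if not 60 <= self.tempo <= 200:  # documented range
            raise ValueError("tempo must be 60-200 BPM")
```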

OUTPUT FORMATS

What you get.

Every generation produces four synchronized outputs. No post-processing required.

Audio
  • Full mix and individual stems (chords, bass, melody, drums, pad)
  • WAV format, 44.1kHz sample rate
  • Ready for playback, production, or direct integration
MIDI
  • 480 ticks per beat resolution
  • Separate tracks per instrument role
  • Every note carries velocity, timing, and duration data
  • Compatible with all major music production software
Musical Understanding
  • Key, tempo, mode, and time signature
  • Chord progression with per-section harmonic rhythm
  • Section markers with bar counts and energy density
  • Instrument roles and per-section muting/velocity profiles
  • Genre, mood, and style tags — embedded at generation, not inferred
JSON / XML
  • Complete harmonic and formal description of every piece
  • Schema-validated Pydantic models
  • Progression signature, cadence type, and formula
  • Paired with audio for supervised ML training workflows
  • Generation manifest with full parameter provenance
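What 480 ticks per beat means in practice: MIDI timing is stored in ticks, and tempo converts ticks to wall-clock time. A small sketch of the arithmetic (function names are illustrative):

```python
# Converting between beats, MIDI ticks, and seconds at 480 PPQ.

PPQ = 480  # ticks per quarter note, as documented

def beats_to_ticks(beats: float) -> int:
    return round(beats * PPQ)

def ticks_to_seconds(ticks: int, bpm: float) -> float:
    return (ticks / PPQ) * (60.0 / bpm)
```

At the default 120 BPM, one beat is 480 ticks and lasts half a second.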

Under the hood

These are some of the musical models that drive generation decisions. They illustrate the level of detail at which SHE operates, a level of control that prompt-based systems cannot reach.

Beat Strength Model

SHE uses a hierarchical emphasis model for every time signature. In 4/4, beat 1 carries the most weight, beat 3 is secondary, and off-beat subdivisions carry the least. This model drives how loud notes are played, where rests fall, and which chord tones are chosen on strong versus weak beats. Compound meters like 6/8 and 12/8 have their own models.
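A minimal version of such an emphasis table might look like this. The exact weights are illustrative assumptions; what the documentation specifies is the ordering: beat 1 strongest, beat 3 secondary, off-beat subdivisions weakest in 4/4, with compound meters carrying their own tables.

```python
# Illustrative hierarchical beat-emphasis tables; weights are assumed,
# only the documented ordering is taken from the page.

BEAT_WEIGHTS = {
    "4/4": [1.0, 0.4, 0.7, 0.4],             # per quarter-note beat
    "6/8": [1.0, 0.3, 0.3, 0.7, 0.3, 0.3],   # compound: two groups of three
}

def weight_at(meter: str, beat: int, offbeat: bool = False) -> float:
    w = BEAT_WEIGHTS[meter][beat]
    return w * 0.5 if offbeat else w  # off-beat subdivisions carry the least
```

A renderer can then scale note velocity by this weight, or prefer stable chord tones where the weight is high.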

Chord Tone Selection

When bass and harmonic roles choose notes within a chord, they follow a weighted palette. The root is chosen most often, followed by the fifth and third. Octaves, chromatic approaches, and stepwise walks toward the next chord are used less frequently. These weights ensure basslines and harmony feel musically grounded rather than random.
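A weighted palette of this kind is straightforward to sketch. The specific weights below are assumptions; the documented ordering (root most often, then fifth, then third, with octaves, chromatic approaches, and walks used less) is what the sketch preserves.

```python
import random

# Illustrative weighted chord-tone palette for a bass role.
# Weights are assumed; only their ordering follows the documentation.

PALETTE = {
    "root": 0.45, "fifth": 0.25, "third": 0.15,
    "octave": 0.07, "approach": 0.04, "walk": 0.04,
}

def pick_tone(rng: random.Random) -> str:
    roles = list(PALETTE)
    return rng.choices(roles, weights=[PALETTE[r] for r in roles], k=1)[0]
```

Over many picks the distribution matches the palette, so a bassline stays anchored to the root without becoming mechanical.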

Section Energy

Each section type has a baseline energy level that determines how busy it sounds. Intros and outros are sparse. Verses sit at moderate activity. Choruses and drops are dense and full. Bridges pull back for contrast. Builds ramp tension gradually. These baselines shape everything from note density to instrument layering.
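As a sketch, the baselines reduce to a lookup that scales everything downstream. The numeric values are assumptions on a 0-1 scale; the ordering (sparse intros and outros, moderate verses, dense choruses and drops, pulled-back bridges, ramping builds) follows the description above.

```python
# Illustrative section-energy baselines; values assumed, ordering from
# the documentation.

SECTION_ENERGY = {
    "intro": 0.3, "verse": 0.5, "chorus": 0.9, "drop": 0.95,
    "bridge": 0.4, "build": 0.7, "outro": 0.25,
}

def note_density(section: str, base_notes_per_bar: int = 8) -> int:
    """Scale a nominal note count by the section's energy baseline."""
    return round(base_notes_per_bar * SECTION_ENERGY[section])
```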

Swing & Feel

Rhythmic feel is controlled per generation. Swing ranges from straight (no swing) to heavy shuffle, with off-beat notes delayed according to genre conventions. A jazz output swings naturally. An electronic output locks to the grid. The engine detects which notes sit on off-beats and adjusts their timing accordingly.
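Detecting off-beat notes and delaying them reduces to a timing transform. In this sketch the swing ratio convention is an assumption: 0.0 is straight, and roughly 0.33 approaches a triplet shuffle.

```python
# Illustrative swing transform: delay off-beat eighth notes by a
# fraction of half a beat. The ratio convention is assumed.

def apply_swing(onsets_beats: list[float], swing: float) -> list[float]:
    swung = []
    for t in onsets_beats:
        # An off-beat eighth note sits on the ".5" of a beat.
        is_offbeat = abs((t % 1.0) - 0.5) < 1e-9
        swung.append(t + swing * 0.5 if is_offbeat else t)
    return swung
```

With swing 0.0 the grid is untouched (the electronic case); with a heavier ratio the off-beats drift toward the shuffle feel of a jazz output.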

GENERATION PIPELINE

From parameters to finished music.

A single generation runs through five stages. Each stage is logged and traceable.

01

Define

Set genre, mood, key, mode, tempo, form, and complexity.

02

Select

Engine selects progression, instruments, and rhythmic patterns from validated catalogs.

03

Compose

Services generate bass, melody, chords, drums, and pads — each role aware of the others.

04

Arrange

Section structure is planned: energy arcs, transitions, muting, and density per section.

05

Export

Audio, MIDI, metadata, and JSON/XML are produced simultaneously. Every file is labeled and traceable.
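The five stages above can be sketched as a traceable sequence. The stage bodies here are stubs and every value is illustrative; the point is the ordered, logged flow from parameters to package.

```python
# Illustrative five-stage pipeline with a trace log; stage bodies are
# stubs, not SHE's services.

def run_pipeline(params: dict) -> dict:
    trace = []

    def stage(name, fn, data):
        trace.append(name)  # each stage is logged and traceable
        return fn(data)

    data = stage("define", dict, params)
    data = stage("select", lambda d: {**d, "progression": "I-V-vi-IV"}, data)
    data = stage("compose", lambda d: {**d, "roles": ["bass", "melody", "chords", "drums", "pads"]}, data)
    data = stage("arrange", lambda d: {**d, "sections": ["intro", "verse", "chorus", "outro"]}, data)
    data = stage("export", lambda d: {**d, "outputs": ["wav", "midi", "json", "xml"]}, data)
    return {"package": data, "trace": trace}
```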

System requirements: SHE runs locally — no GPU required. The engine operates on standard hardware, including low-power devices. No cloud dependency for generation.

API ACCESS

Integrate SHE into your product.

The SHE API accepts a set of musical parameters and returns a complete generation package: audio files, MIDI, structured metadata, and machine-readable JSON/XML — all in a single response.

Request

Send your parameters — genre, mood, key, mode, tempo, form, complexity — and the engine generates a complete musical output.

// Example parameters
genre: "electronic"
mood: "energetic"
key: "C", mode: "dorian"
tempo: 128
form: "full"

Response

Receive a complete package: rendered audio, MIDI tracks, embedded metadata, and a structured manifest describing every musical decision.

// Response includes
audio: [full_mix.wav, stems/...]
midi: [chords.mid, bass.mid, ...]
metadata: {key, tempo, mode, ...}
manifest: {progression, form, ...}
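As a client-side sketch, building the request shown above is a matter of serializing the parameters. The field names follow this page, but the payload shape is an assumption, not a published API spec, and no endpoint or network call is modeled here.

```python
import json

# Hypothetical request payload for the example above. Field names follow
# this page; the payload shape is an assumption, not SHE's API spec.

def build_request(genre: str, mood: str, key: str,
                  mode: str, tempo: int, form: str) -> str:
    return json.dumps({
        "genre": genre, "mood": mood, "key": key,
        "mode": mode, "tempo": tempo, "form": form,
    })

payload = build_request("electronic", "energetic", "C", "dorian", 128, "full")
```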

Authentication

API access is granted per-organization. Each partner receives dedicated credentials and rate limits tailored to their use case.

Output Formats

WAV audio (44.1kHz), standard MIDI files, JSON metadata, and XML export. All formats are delivered simultaneously per generation.

Local Deployment

SHE can also run entirely on-premise. No GPU required. Standard hardware, including low-power devices, is sufficient for generation.