# VoiceForge - Advanced Voice Transformation Application

## Overview

VoiceForge is a web-based application that transforms a source voice to match the characteristics of a target voice. It extracts detailed vocal parameters from the source and target audio files, then processes the source audio so that the output takes on the target's vocal characteristics while preserving the source content.

The application uses a modern full-stack architecture with React for the frontend, Express for the backend, and FFmpeg for audio processing. It provides real-time feedback through a multi-stage processing pipeline with visual waveform representations and detailed parameter analysis.

## User Preferences

Preferred communication style: Simple, everyday language.

## System Architecture

### Frontend Architecture

**Framework**: React 18+ with TypeScript, using Vite as the build tool and development server.

**UI Component System**: The application uses shadcn/ui components built on Radix UI primitives, providing a consistent and accessible design system. Components follow the "new-york" style variant with custom Tailwind CSS theming.

**Design Philosophy**: Material Design, influenced by professional audio tools (Audacity, Adobe Audition) and modern productivity interfaces (Linear, Vercel). The design emphasizes:
- Technical transparency with visible audio processing steps
- Dual-context clarity distinguishing between source and target files
- Progressive disclosure showing complexity only when needed
- Confidence through continuous feedback at every processing stage

**State Management**: TanStack Query (React Query) handles server state and API interactions, with automatic refetching during active processing jobs. Local component state manages UI-specific concerns like file uploads and playback controls.
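
As a rough illustration of the "refetch while a job is active" behavior, the helper below decides whether a status query should keep polling. The `JobStatus` values and the 1000 ms default are assumptions for this sketch, not values taken from the codebase.

```typescript
// Hypothetical job states; the real names live in the app's shared schema.
type JobStatus = "pending" | "processing" | "completed" | "failed";

// Returns a polling interval in milliseconds while the job is active,
// or false once the job has reached a terminal state.
function jobRefetchInterval(
  status: JobStatus | undefined,
  intervalMs = 1000,
): number | false {
  if (status === undefined) return intervalMs; // no data yet: keep polling
  return status === "pending" || status === "processing" ? intervalMs : false;
}
```

With TanStack Query v5, a function of this shape could be passed along the lines of `refetchInterval: (query) => jobRefetchInterval(query.state.data?.status)`, assuming the query data exposes a `status` field.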

**Routing**: Wouter provides lightweight client-side routing, though the application is primarily a single-page interface focused on the voice transformation workflow.

**Key Components**:
- FileUploadZone: Dual upload interface for source and target audio files with drag-and-drop support
- WaveformVisualizer: Canvas-based audio waveform rendering with playback controls
- ProcessingPipeline: Visual representation of the multi-stage transformation process
- ParameterAnalysisPanel: Detailed display of extracted voice parameters with comparison views
- ComparisonPlayer: Synchronized audio playback for comparing source, target, and result files

### Backend Architecture

**Framework**: Express.js with TypeScript running on Node.js, using ESM module format.

**API Design**: RESTful API with the following key endpoints:
- `POST /api/upload`: File upload with audio analysis (returns file ID, duration, waveform data)
- `POST /api/transform`: Initiates voice transformation job
- `GET /api/transform/:jobId`: Polls job status and progress
- `GET /api/download/:jobId`: Downloads transformed audio result
- `GET /api/waveform/:jobId`: Retrieves result waveform data
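
A client might wrap these endpoints in small path builders, as in the sketch below. The paths come from the list above; the request and response shapes (`sourceFileId`, `targetFileId`, `progress`, and so on) are assumptions made for illustration only.

```typescript
// Path builders for the endpoints listed above.
const api = {
  upload: () => "/api/upload",
  transform: () => "/api/transform",
  jobStatus: (jobId: string) => `/api/transform/${encodeURIComponent(jobId)}`,
  download: (jobId: string) => `/api/download/${encodeURIComponent(jobId)}`,
  waveform: (jobId: string) => `/api/waveform/${encodeURIComponent(jobId)}`,
};

// Assumed shapes; the real ones are defined in shared/schema.ts.
interface TransformRequest {
  sourceFileId: string;
  targetFileId: string;
}
interface TransformJobStatus {
  jobId: string;
  status: string;
  progress: number;
}

// Starts a transformation job. Uses the global fetch; the relative URL
// assumes this runs in the browser on the same origin as the API.
async function startTransform(body: TransformRequest): Promise<TransformJobStatus> {
  const res = await fetch(api.transform(), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Transform request failed: ${res.status}`);
  return (await res.json()) as TransformJobStatus;
}
```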

**Audio Processing**: Utilizes FFmpeg and FFprobe for:
- Audio format conversion and normalization
- Metadata extraction (duration, sample rate, channels)
- Waveform data generation for visualization
- Voice parameter analysis (pitch, formants, spectral features, MFCCs)
- Voice transformation applying extracted parameters
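
For the metadata step, one plausible approach is to ask FFprobe for JSON output and read the fields from it. The sketch below parses that JSON; the `AudioMetadata` type and the `parseProbeOutput` name are hypothetical, and the exact FFprobe invocation used by the application is not documented here.

```typescript
interface AudioMetadata {
  durationSec: number;
  sampleRate: number;
  channels: number;
}

// Parses the JSON printed by a command such as:
//   ffprobe -v error -print_format json -show_format -show_streams <file>
// FFprobe reports duration and sample_rate as strings, channels as a number.
function parseProbeOutput(json: string): AudioMetadata {
  const probe = JSON.parse(json) as {
    format?: { duration?: string };
    streams?: { codec_type?: string; sample_rate?: string; channels?: number }[];
  };
  const audio = probe.streams?.find((s) => s.codec_type === "audio");
  if (!audio) throw new Error("No audio stream found");
  return {
    durationSec: Number(probe.format?.duration ?? NaN),
    sampleRate: Number(audio.sample_rate ?? NaN),
    channels: audio.channels ?? 0,
  };
}
```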

**Processing Pipeline**: Multi-stage transformation workflow:
1. Analyzing source voice
2. Analyzing target voice
3. Extracting features from both
4. Applying transformation algorithms
5. Matching target characteristics
6. Finalizing output audio
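
The stages above can be mapped to a single progress percentage for the UI. The sketch below weights all six stages equally, which is an assumption; the real pipeline may weight them differently.

```typescript
// Stage labels taken from the list above.
const STAGES = [
  "Analyzing source voice",
  "Analyzing target voice",
  "Extracting features from both",
  "Applying transformation algorithms",
  "Matching target characteristics",
  "Finalizing output audio",
] as const;

type Stage = (typeof STAGES)[number];

// Overall progress (0-100) when `stage` is `fraction` (0-1) complete,
// assuming every stage contributes an equal share.
function overallProgress(stage: Stage, fraction: number): number {
  const index = STAGES.indexOf(stage);
  const clamped = Math.min(1, Math.max(0, fraction));
  return Math.round(((index + clamped) / STAGES.length) * 100);
}
```

For example, being halfway through the third stage gives `overallProgress("Extracting features from both", 0.5)`, which evaluates to 42.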

**Storage Strategy**: In-memory storage (MemStorage class) for uploaded files, transformation jobs, and results. This approach is suitable for development and small-scale deployments but should be replaced with persistent storage (database + file storage) for production use.
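
A minimal sketch of what Map-based in-memory storage can look like is shown below. It is not the actual `MemStorage` class; the `InMemoryStore` name, the method names, and the record shapes are all hypothetical.

```typescript
interface StoredFile {
  id: string;
  name: string;
  data: Uint8Array;
}

interface Job {
  id: string;
  status: "pending" | "processing" | "completed" | "failed";
  progress: number;
}

// Everything lives in process memory, so all data is lost on restart.
class InMemoryStore {
  private files = new Map<string, StoredFile>();
  private jobs = new Map<string, Job>();

  saveFile(file: StoredFile): void {
    this.files.set(file.id, file);
  }

  getFile(id: string): StoredFile | undefined {
    return this.files.get(id);
  }

  createJob(id: string): Job {
    const job: Job = { id, status: "pending", progress: 0 };
    this.jobs.set(id, job);
    return job;
  }

  // Merges a partial update into an existing job; returns undefined if unknown.
  updateJob(id: string, patch: Partial<Omit<Job, "id">>): Job | undefined {
    const current = this.jobs.get(id);
    if (!current) return undefined;
    const next: Job = { ...current, ...patch };
    this.jobs.set(id, next);
    return next;
  }

  getJob(id: string): Job | undefined {
    return this.jobs.get(id);
  }
}
```

Keeping the store behind a small set of methods like these makes it easier to swap in a database-backed implementation later without touching the route handlers.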

**File Handling**: Multer middleware manages multipart file uploads with:
- 50MB file size limit
- MP3 format validation
- In-memory buffering for processing
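
The constraints above can be expressed as a small, dependency-free check. This is a sketch only: the `validateUpload` name, the error messages, and the accepted MIME types are assumptions, while the `mimetype` and `size` field names mirror the file object Multer provides.

```typescript
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024; // 50MB limit
// MP3 is normally reported as audio/mpeg; some clients send audio/mp3.
const MP3_MIME_TYPES = new Set(["audio/mpeg", "audio/mp3"]);

interface UploadInfo {
  mimetype: string;
  size: number;
}

// Returns an error message, or null when the upload is acceptable.
function validateUpload(file: UploadInfo): string | null {
  if (file.size > MAX_UPLOAD_BYTES) return "File exceeds the 50MB limit";
  if (!MP3_MIME_TYPES.has(file.mimetype)) return "Only MP3 files are accepted";
  return null;
}
```

In Multer itself, the size cap would normally be set through `limits.fileSize` and the type check through a `fileFilter` callback, with `multer.memoryStorage()` providing the in-memory buffering.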

### Data Schema and Validation

**Schema Definitions**: Zod schemas in `shared/schema.ts` provide type-safe validation for:
- VoiceParameters: 17+ vocal characteristics including pitch, formants, spectral features, MFCCs, and temporal data
- TransformJob: Job status tracking with progress percentage and error handling
- Upload/Transform requests and responses

**Type Safety**: TypeScript types are automatically inferred from Zod schemas, ensuring consistency between validation logic and type definitions across frontend and backend.
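
To show how a type can be inferred from a Zod schema, here is an illustrative subset of a job schema. The field names and status values are assumptions and may not match the definitions in `shared/schema.ts`.

```typescript
import { z } from "zod";

// Illustrative subset; not the real TransformJob schema.
const transformJobSchema = z.object({
  id: z.string(),
  status: z.enum(["pending", "processing", "completed", "failed"]),
  progress: z.number().min(0).max(100),
  error: z.string().optional(),
});

// The static type is derived from the schema, so the two cannot drift apart.
type TransformJob = z.infer<typeof transformJobSchema>;

// Validates unknown input (for example an API response) without throwing.
function parseJob(input: unknown): TransformJob | null {
  const result = transformJobSchema.safeParse(input);
  return result.success ? result.data : null;
}
```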

### Build and Development

**Development Mode**: Vite dev server with HMR (Hot Module Replacement) running through Express middleware, providing fast refresh during development.

**Production Build**: 
- Client: Vite builds optimized static assets to `dist/public`
- Server: esbuild bundles server code to a single `dist/index.cjs` file with selective dependency bundling for improved cold start performance

**Path Aliases**: Configured in both TypeScript and Vite:
- `@/`: Client source directory
- `@shared/`: Shared schema definitions
- `@assets/`: Attached assets directory

### Styling System

**CSS Framework**: Tailwind CSS with custom theme extending the base configuration

**Color System**: HSL-based color variables supporting light and dark modes with comprehensive semantic color tokens (primary, secondary, muted, accent, destructive, etc.)

**Typography**: Custom font stack, with design guidelines that specify a strict hierarchy for headings, labels, and technical data:
- Inter as primary sans-serif (via Google Fonts)
- JetBrains Mono for technical/monospace data

**Component Styling**: CSS custom properties for dynamic theming with hover and active state elevation effects

## External Dependencies

### Core Framework Dependencies

- **@tanstack/react-query**: Server state management and API request caching
- **wouter**: Lightweight client-side routing
- **react** and **react-dom**: UI framework
- **express**: Backend HTTP server
- **vite**: Frontend build tool and dev server

### UI Component Libraries

- **@radix-ui/***: Comprehensive set of accessible UI primitives (accordion, dialog, dropdown, popover, select, slider, tabs, toast, tooltip, etc.)
- **class-variance-authority**: Type-safe variant styling
- **tailwindcss**: Utility-first CSS framework
- **lucide-react**: Icon library

### Form and Validation

- **react-hook-form**: Form state management
- **@hookform/resolvers**: Form validation resolvers
- **zod**: Schema validation and type inference
- **drizzle-zod**: Database schema to Zod conversion (prepared for future database integration)

### Database (Prepared but Not Active)

- **drizzle-orm**: TypeScript ORM
- **@neondatabase/serverless**: Neon PostgreSQL serverless driver
- **drizzle-kit**: Database migrations and schema management

The database configuration in `drizzle.config.ts` is set up for PostgreSQL, but the application currently uses in-memory storage. The schema in `shared/schema.ts` is defined with Zod only; when database persistence is needed, equivalent Drizzle table definitions can be added, with `drizzle-zod` deriving the Zod schemas from them.

### File Upload and Processing

- **multer**: Multipart form data handling for file uploads
- **@types/multer**: TypeScript definitions

### Audio Processing (System Dependencies)

The application requires **FFmpeg** and **FFprobe** to be installed on the system where the backend runs. These are called via Node.js child processes for:
- Audio analysis and metadata extraction
- Waveform generation
- Voice parameter analysis
- Audio transformation
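
A minimal sketch of calling FFprobe from Node.js is shown below, assuming `ffprobe` is on the `PATH`. The helper names are hypothetical and the flags are one reasonable choice for JSON metadata output, not necessarily the ones the application uses.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Arguments for a quiet FFprobe run that prints container and stream
// metadata as JSON on stdout.
function buildProbeArgs(filePath: string): string[] {
  return ["-v", "error", "-print_format", "json", "-show_format", "-show_streams", filePath];
}

// Runs FFprobe as a child process and returns the parsed JSON.
// execFile passes arguments directly, without a shell, so file names
// with spaces or shell metacharacters need no extra quoting.
async function probeFile(filePath: string): Promise<unknown> {
  const { stdout } = await execFileAsync("ffprobe", buildProbeArgs(filePath));
  return JSON.parse(stdout);
}
```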

### Development Tools

- **@replit/vite-plugin-***: Replit-specific development plugins (cartographer, dev banner, runtime error modal)
- **typescript**: Type checking and compilation
- **tsx**: TypeScript execution for development and build scripts
- **esbuild**: Fast JavaScript bundler for production server build

### Utility Libraries

- **nanoid**: Unique ID generation
- **date-fns**: Date formatting and manipulation
- **clsx** and **tailwind-merge**: Conditional className utilities