Documentation ¶
Overview ¶
Package mel provides mel-frequency spectrogram generation and audio synthesis.
This package implements conversion between audio waveforms and mel-scale spectrograms, which are commonly used in audio processing and speech recognition. It supports:
- Converting WAV/FLAC audio files to mel spectrograms (saved as PNG images)
- Reconstructing audio from mel spectrograms using the Griffin-Lim algorithm
- Configurable mel filterbank parameters (frequency range, number of mel bands)
- STFT-based analysis and synthesis with customizable window sizes
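To illustrate what the filterbank parameters above control, here is a minimal, self-contained sketch of a triangular mel filterbank. It is not taken from this package: the function names (hzToMel, melToHz, melFilterbank, applyFilterbank) are hypothetical, and the HTK-style mel formula is an assumption, since the documentation does not say which mel scale variant the package uses.

```go
package main

import (
	"fmt"
	"math"
)

// hzToMel and melToHz use the HTK-style mel formula (an assumption;
// the package may use a different variant).
func hzToMel(hz float64) float64  { return 2595 * math.Log10(1+hz/700) }
func melToHz(mel float64) float64 { return 700 * (math.Pow(10, mel/2595) - 1) }

// melFilterbank builds numMels triangular filters over numBins linearly
// spaced frequency bins covering 0..sampleRate/2. numBins must be >= 2.
func melFilterbank(numMels, numBins int, fmin, fmax, sampleRate float64) [][]float64 {
	// numMels+2 edge frequencies, equally spaced on the mel scale.
	edges := make([]float64, numMels+2)
	lo, hi := hzToMel(fmin), hzToMel(fmax)
	for i := range edges {
		edges[i] = melToHz(lo + (hi-lo)*float64(i)/float64(numMels+1))
	}
	bank := make([][]float64, numMels)
	for m := range bank {
		left, center, right := edges[m], edges[m+1], edges[m+2]
		bank[m] = make([]float64, numBins)
		for b := range bank[m] {
			f := float64(b) * (sampleRate / 2) / float64(numBins-1)
			switch {
			case f > left && f <= center: // rising slope
				bank[m][b] = (f - left) / (center - left)
			case f > center && f < right: // falling slope
				bank[m][b] = (right - f) / (right - center)
			}
		}
	}
	return bank
}

// applyFilterbank projects one magnitude spectrum onto the mel bands.
func applyFilterbank(bank [][]float64, spectrum []float64) []float64 {
	out := make([]float64, len(bank))
	for m, filt := range bank {
		for b, w := range filt {
			out[m] += w * spectrum[b]
		}
	}
	return out
}

func main() {
	// Hypothetical parameters: 4 bands between 0 and 8000 Hz at 16 kHz.
	bank := melFilterbank(4, 257, 0, 8000, 16000)
	flat := make([]float64, 257)
	for i := range flat {
		flat[i] = 1
	}
	fmt.Println(applyFilterbank(bank, flat))
}
```

In this sketch, a larger band count gives finer frequency resolution, while the minimum and maximum frequencies bound the range the bands cover. These roughly correspond to what NumMels, MelFmin, and MelFmax appear to configure.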
Index ¶
- Variables
- func ISTFT(s *stft.STFT, spectrogram [][]complex128, numIterations int) []float64
- func LoadFlac(inputFile string) []float64
- func LoadWav(inputFile string) []float64
- func SaveWav(outputFile string, vec []float64, sr int) error
- type Mel
- func (m *Mel) FromMel(ospectrum [][2]float64) ([]float64, error)
- func (m *Mel) Image(buf [][2]float64) []uint16
- func (m *Mel) ToMel(buf []float64) ([][2]float64, error)
- func (m *Mel) ToMelFlac(inputFile, outputFile string) error
- func (m *Mel) ToMelWav(inputFile, outputFile string) error
- func (m *Mel) ToWavPng(inputFile, outputFile string) error
Constants ¶
This section is empty.
Variables ¶
var ErrFileNotLoaded = errors.New("wavNotLoaded")
Functions ¶
func ISTFT ¶ added in v0.0.3
func ISTFT(s *stft.STFT, spectrogram [][]complex128, numIterations int) []float64
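Inverse STFT synthesis is typically built on windowed overlap-add. The sketch below shows only that time-domain step, with no FFT and no Griffin-Lim phase iteration, and is not this package's implementation. The helper names (hann, frames, overlapAdd) are hypothetical, and a periodic Hann window with 50% overlap is an assumed choice under which overlapping windows sum to one.

```go
package main

import (
	"fmt"
	"math"
)

// hann returns a periodic Hann window of length n.
func hann(n int) []float64 {
	w := make([]float64, n)
	for i := range w {
		w[i] = 0.5 - 0.5*math.Cos(2*math.Pi*float64(i)/float64(n))
	}
	return w
}

// frames splits signal into Hann-windowed frames of length win,
// advancing by hop samples between frames.
func frames(signal []float64, win, hop int) [][]float64 {
	w := hann(win)
	var out [][]float64
	for start := 0; start+win <= len(signal); start += hop {
		f := make([]float64, win)
		for i := range f {
			f[i] = signal[start+i] * w[i]
		}
		out = append(out, f)
	}
	return out
}

// overlapAdd sums the frames back together at their hop offsets.
func overlapAdd(fr [][]float64, win, hop int) []float64 {
	if len(fr) == 0 {
		return nil
	}
	out := make([]float64, (len(fr)-1)*hop+win)
	for k, f := range fr {
		for i, v := range f {
			out[k*hop+i] += v
		}
	}
	return out
}

func main() {
	const win, hop = 8, 4 // 50% overlap
	signal := make([]float64, 32)
	for i := range signal {
		signal[i] = math.Sin(0.3 * float64(i))
	}
	rec := overlapAdd(frames(signal, win, hop), win, hop)

	// Only samples covered by two frames are reconstructed exactly,
	// so the first and last hop samples are skipped.
	maxErr := 0.0
	for i := hop; i < len(rec)-hop; i++ {
		maxErr = math.Max(maxErr, math.Abs(rec[i]-signal[i]))
	}
	fmt.Printf("max interior error: %.2e\n", maxErr)
}
```

A full inverse STFT would apply an inverse FFT to each spectrogram column before this step. The numIterations parameter suggests that ISTFT also refines phase iteratively, but the documentation does not describe that behavior.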
Types ¶
type Mel ¶
type Mel struct {
	NumMels  int
	MelFmin  float64
	MelFmax  float64
	TuneMul  float64
	TuneAdd  float64
	Window   int
	Resolut  int
	YReverse bool

	GriffinLimIterations int

	// VolumeBoost when loading spectrogram from image, can be a value like 1.666
	VolumeBoost float64

	// sample rate for output wav
	SampleRate int
}
Mel represents the configuration for generating mel spectrograms.
func (*Mel) FromMel ¶ added in v0.0.3
FromMel generates a wave buffer from a mel spectrogram and returns the wave buffer.
func (*Mel) ToMel ¶
ToMel generates a mel spectrogram from a wave buffer and returns the mel buffer.
func (*Mel) ToMelFlac ¶
ToMelFlac generates a mel spectrogram from an input FLAC audio file and saves it as a PNG image.