A DYNAMIC PROGRAMMING APPROACH TO CONTEXT-FREE VOICE TRANSFORMATION (MonAmOR3)
Author(s) :
Ozgul Salor (Middle East Technical University, Turkey)
Mubeccel Demirekler (Middle East Technical University, Turkey)
Abstract : In this paper, we present a dynamic programming approach to voice transformation (VT). The goal of VT is to modify the speech of a source speaker so that it is perceived as if spoken by a target speaker. The speech model used in this work is based on the MELP (Mixed Excitation Linear Prediction) speech coding algorithm. The designed system derives speaker-specific codebooks of line spectral frequencies (LSFs) from MELP's multi-stage vector quantization LSF codebook for both the source and the target speaker. These codebooks are used to train a mapping histogram, which is then used to transform LSFs from one speaker to the other. The baseline system uses the maxima of the histograms for the LSF transformation. This baseline has two shortcomings: the target LSF space is limited, and the independent mapping of successive frames causes spectral discontinuities. Both have been overcome by applying the dynamic programming approach, which models the long-term behaviour of the target speaker's LSFs while preserving the relationship between successive frames of the source LSFs during transformation. Both objective and subjective evaluations have been conducted, and they show that the dynamic programming approach improves the performance of the system in terms of both speech quality and speaker similarity.
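The difference between the baseline mapping and the dynamic programming mapping can be illustrated with a small sketch. The abstract does not state the cost function, so everything here is an assumption made for illustration: each frame is represented by a single codeword index (not MELP's multi-stage indices), the mapping histogram counts aligned source/target codeword pairs, the target speaker's long-term behaviour is approximated by a first-order transition histogram, and the search is a Viterbi-style pass over log-probabilities. All function names are hypothetical.

```python
import math


def train_histogram(aligned_pairs, n_src, n_tgt):
    """Count how often source codeword s is aligned with target codeword t."""
    hist = [[0] * n_tgt for _ in range(n_src)]
    for s, t in aligned_pairs:
        hist[s][t] += 1
    return hist


def train_transitions(target_seq, n_tgt):
    """Count transitions between successive target codewords (assumed model)."""
    trans = [[0] * n_tgt for _ in range(n_tgt)]
    for prev, cur in zip(target_seq, target_seq[1:]):
        trans[prev][cur] += 1
    return trans


def log_prob_rows(counts, smoothing=1.0):
    """Turn each row of counts into add-one-smoothed log-probabilities."""
    rows = []
    for row in counts:
        total = sum(row) + smoothing * len(row)
        rows.append([math.log((c + smoothing) / total) for c in row])
    return rows


def baseline_map(hist, source_seq):
    """Baseline: map each frame independently to the histogram maximum."""
    return [max(range(len(hist[s])), key=lambda t: hist[s][t])
            for s in source_seq]


def dp_map(hist, trans, source_seq):
    """Viterbi-style search for the best-scoring target codeword sequence."""
    if not source_seq:
        return []
    emit = log_prob_rows(hist)    # log P(target | source), from the histogram
    move = log_prob_rows(trans)   # log P(target_t | target_{t-1}), assumed
    n_tgt = len(trans)
    score = list(emit[source_seq[0]])  # no prior on the first frame
    back = []
    for s in source_seq[1:]:
        new_score, pointers = [], []
        for t in range(n_tgt):
            best_prev = max(range(n_tgt),
                            key=lambda p: score[p] + move[p][t])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + move[best_prev][t] + emit[s][t])
        score = new_score
        back.append(pointers)
    # Trace back from the best final codeword.
    t = max(range(n_tgt), key=lambda k: score[k])
    path = [t]
    for pointers in reversed(back):
        t = pointers[t]
        path.append(t)
    path.reverse()
    return path


# Toy data: 2 source codewords, 3 target codewords.
pairs = [(0, 0)] * 5 + [(0, 1)] * 4 + [(1, 2)] * 5 + [(1, 1)] * 4
hist = train_histogram(pairs, n_src=2, n_tgt=3)
trans = train_transitions([0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2], n_tgt=3)

print(baseline_map(hist, [0, 1]))   # [0, 2]
print(dp_map(hist, trans, [0, 1]))  # [1, 1]
```

In this toy case the baseline jumps from target codeword 0 to codeword 2, a transition never seen in the target data, while the dynamic programming path settles on the slightly less likely but smoother sequence. It also shows why the search is not restricted to histogram maxima: any target codeword can be selected if the sequence as a whole scores better.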