Analysing Deep Learning-Spectral Envelope Prediction Methods for Singing Synthesis. Bous, F. & Roebel, A. In 2019 27th European Signal Processing Conference (EUSIPCO), pages 1-5, Sep., 2019.
We investigate several hyperparameters of neural networks used to generate spectral envelopes for singing synthesis. Two perceptual tests are performed: the first compares two models directly, and the second ranks models by mean opinion score. With these tests we show that, when learning to predict spectral envelopes, 2d-convolutions are superior to previously proposed 1d-convolutions, and that predicting multiple frames in an iterated fashion during training is superior to injecting noise into the input data. An experimental investigation of whether to learn to predict a probability distribution rather than single samples was performed but turned out to be inconclusive. We propose a network architecture that incorporates the improvements we found to be useful, and we show in our experiments that this network produces better results than other state-of-the-art methods.
