Training Neural Networks with Different Perceptual Loss for Speech Super-Resolution

Authors: Yupeng Shi, Jiatong Shi, Xi Chen, Nengheng Zheng

Audio Samples

Descriptions

WB: the reference wideband speech.

NB: the narrowband speech, compressed with G.711.

GAN_MSE: the wideband speech estimated by the GAN-based SSR system trained with the mean square error loss.

GAN_PWF: the wideband speech estimated by the GAN-based SSR system trained with the perceptual weighting filter loss.

GAN_PM: the wideband speech estimated by the GAN-based SSR system trained with the psychoacoustic masking loss.

GAN_PE: the wideband speech estimated by the GAN-based SSR system trained with the perceptual entropy loss.
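
The NB condition is described as G.711-compressed. The page does not say which G.711 variant (mu-law or A-law) was used or how the signal was band-limited, so the following is only a minimal sketch: it assumes the mu-law variant and uses the continuous companding formula with uniform 8-bit quantization, not the bit-exact segmented codec. All function names are hypothetical.

```python
import math

# Assumption: mu-law variant of G.711 (mu = 255); the page does not specify it.
MU = 255.0


def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to the companded domain [-1, 1]."""
    x = max(-1.0, min(1.0, x))
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    y = max(-1.0, min(1.0, y))
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize_8bit(y: float) -> int:
    """Uniformly quantize a companded value in [-1, 1] to a code in 0..255."""
    return int(round((y + 1.0) / 2.0 * 255.0))


def dequantize_8bit(q: int) -> float:
    """Map an 8-bit code back to the companded domain [-1, 1]."""
    return q / 255.0 * 2.0 - 1.0


def codec_round_trip(samples):
    """Compress, quantize, dequantize and expand each sample."""
    return [
        mu_law_expand(dequantize_8bit(quantize_8bit(mu_law_compress(s))))
        for s in samples
    ]
```

Because the quantization steps are uniform in the companded domain, low-amplitude samples are reconstructed with finer resolution than high-amplitude ones, which is the usual motivation for companding speech at 8 bits per sample.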

Ten utterances (S1 to S10) are provided for each of the six conditions: WB, NB, GAN_MSE, GAN_PWF, GAN_PM, and GAN_PE.

Spectrogram

Spectrograms of several samples are shown below for visual comparison.

Table 1

D7_803_clean.jpg

D7_803_nb.jpg

D7_803_cnnNb_mseloss.jpg

D7_803_cnnNb_perceptualloss.jpg

D7_803_cnnNb_mpegloss.jpg

D7_803_cnnNb_entropyloss.jpg

Table 2

D11_808_clean.jpg


D11_808_nb.jpg

D11_808_cnnNb_mseloss.jpg

D11_808_cnnNb_perceptualloss.jpg

D11_808_cnnNb_mpegloss.jpg

D11_808_cnnNb_entropyloss.jpg

Table 3

D11_808_clean.jpg


D11_808_nb.jpg

D12_780_cnnNb_mseloss.jpg

D12_780_cnnNb_perceptualloss.jpg

D12_780_cnnNb_mpegloss.jpg

D12_780_cnnNb_entropyloss.jpg

Table 4

D12_792_clean.jpg


D12_792_nb.jpg

D12_792_cnnNb_mseloss.jpg

D12_792_cnnNb_perceptualloss.jpg

D12_792_cnnNb_mpegloss.jpg

D12_792_cnnNb_entropyloss.jpg
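
For readers who want to produce comparable plots from the audio samples, a log-magnitude spectrogram can be sketched as follows. This is an illustrative assumption, not the authors' plotting code: the frame length, hop size, Hann window, and the function names `hann` and `log_spectrogram` are all hypothetical choices, and a naive DFT is used so that the example needs only the standard library.

```python
import cmath
import math


def hann(n: int):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def log_spectrogram(signal, frame_len=64, hop=32, floor_db=-120.0):
    """Return one row per frame, each holding frame_len // 2 + 1 values in dB.

    Assumed parameters (not taken from the page): frame_len, hop, floor_db.
    """
    window = hann(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Apply the window to the current frame.
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for bin k; an FFT would be used in practice.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mag = abs(acc)
            db = 20.0 * math.log10(mag) if mag > 0.0 else floor_db
            row.append(max(floor_db, db))
        rows.append(row)
    return rows
```

In such a plot, a narrowband signal would be expected to show little energy in the upper frequency bins, while the wideband reference and the estimated wideband outputs fill them in to varying degrees.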
