Recognition rates under Subway, Car, and Babble noise, extracted from the paper:
M. Padellini, F. Capman, G. Baudoin, “Very low bit rate speech coding in Noisy Environments”, submitted to INTERSPEECH’05, Lisbon.

| Subway | LPCC | | | MFCC | | | AURORA MFCC | | | MMSE MFCC | | | PMC LPCC | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNR (dB) | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 |
| Corr (%) | 15 | 36 | 45 | 19 | 41 | 47 | 34 | 53 | 57 | 29 | 51 | 52 | 35 | 51 | 62 |
| Sub (%) | 82 | 59 | 50 | 76 | 53 | 46 | 59 | 39 | 36 | 65 | 42 | 41 | 41 | 36 | 29 |
| Ins (%) | 26 | 28 | 23 | 29 | 23 | 22 | 27 | 21 | 21 | 26 | 22 | 21 | 5 | 10 | 10 |


| Car | LPCC | | | MFCC | | | AURORA MFCC | | | MMSE MFCC | | | PMC LPCC | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNR (dB) | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 |
| Corr (%) | 31 | 57 | 61 | 30 | 55 | 58 | 45 | 61 | 64 | 41 | 53 | 55 | 51 | 72 | 75 |
| Sub (%) | 64 | 39 | 34 | 64 | 39 | 36 | 48 | 32 | 29 | 52 | 40 | 38 | 31 | 18 | 16 |
| Ins (%) | 34 | 22 | 21 | 26 | 22 | 21 | 22 | 17 | 17 | 26 | 22 | 22 | 6 | 5 | 5 |


| Babble | LPCC | | | MFCC | | | AURORA MFCC | | | MMSE MFCC | | | PMC LPCC | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNR (dB) | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 | 5 | 15 | 20 |
| Corr (%) | 29 | 54 | 59 | 28 | 51 | 57 | 36 | 54 | 59 | 33 | 50 | 54 | 45 | 63 | 72 |
| Sub (%) | 68 | 43 | 38 | 69 | 44 | 39 | 58 | 39 | 36 | 61 | 44 | 40 | 45 | 29 | 22 |
| Ins (%) | 38 | 29 | 27 | 37 | 30 | 30 | 33 | 23 | 23 | 30 | 27 | 27 | 19 | 14 | 11 |

Tables 1, 2 and 3: Recognition rates under Subway, Car and Babble noise.
(Corr: Correct, Sub: Substitution, Ins: Insertion)
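
The tables list Corr, Sub and Ins but not the deletion rate or a net accuracy. The sketch below derives both, assuming the figures follow the common scoring convention in which correct, substituted and deleted units sum to 100% of the reference, and insertions are counted on top of that. This convention is an assumption, since the page does not state how the rates were scored. The function name `derived_rates` is hypothetical, and the example values are taken from the Subway table (PMC LPCC at 20 dB, LPCC at 5 dB).

```python
def derived_rates(corr, sub, ins):
    """Derive deletion rate and net accuracy (both in %) from table entries.

    Assumption (not stated on the page): Corr + Sub + Del = 100,
    and accuracy is Corr minus Ins.
    """
    deletion = 100 - corr - sub   # remainder of the reference units
    accuracy = corr - ins         # insertions penalise the correct rate
    return deletion, accuracy


# Subway noise, PMC LPCC at 20 dB: Corr 62, Sub 29, Ins 10
print(derived_rates(62, 29, 10))  # (9, 52)

# Subway noise, LPCC at 5 dB: Corr 15, Sub 82, Ins 26
print(derived_rates(15, 82, 26))  # (3, -11)
```

Under this assumption the accuracy can become negative when insertions outnumber correctly recognised units, as in the LPCC column at 5 dB.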
Example of noise-corrupted files coded with the VLBR coder.
The noise model was trained on the first second of the noise-corrupted signal, which precedes the utterance.
The noise added is the mean harmonic profile of that same first second.
At an SNR of 5 dB, problems occur because the pitch estimation is not robust enough.
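
The idea of characterising the noise from the leading, speech-free second can be illustrated with a minimal sketch. The page does not specify the noise model or how the mean harmonic profile is computed, so the code below uses a plain average magnitude spectrum over non-overlapping frames as a stand-in. The function name `mean_noise_spectrum`, the frame length and the synthetic test tone are all assumptions made for illustration.

```python
import cmath
import math


def mean_noise_spectrum(signal, sample_rate, frame_len=64, lead_seconds=1.0):
    """Average magnitude spectrum of the leading segment of `signal`.

    Stand-in for a noise estimate: the leading `lead_seconds` are assumed
    to contain noise only (they precede the utterance).
    """
    lead = signal[: int(lead_seconds * sample_rate)]
    n_frames = len(lead) // frame_len
    if n_frames == 0:
        raise ValueError("leading segment is shorter than one frame")
    n_bins = frame_len // 2 + 1
    mean_mag = [0.0] * n_bins
    for f in range(n_frames):
        frame = lead[f * frame_len:(f + 1) * frame_len]
        for k in range(n_bins):
            # Naive DFT of one frame at bin k (no window, for clarity)
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            mean_mag[k] += abs(acc) / n_frames
    return mean_mag


# Synthetic example: a tone that falls exactly on bin 4 of a 64-point frame
rate = 640
tone = [math.sin(2 * math.pi * 4 * n / 64) for n in range(2 * rate)]
spectrum = mean_noise_spectrum(tone, rate)
print(spectrum.index(max(spectrum)))  # 4
```

A practical implementation would use an FFT library and a window function; the naive DFT here only keeps the example dependency-free.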
Clean speech original file: Play. VLBR coded speech file: Play.
Noisy speech files, Original and VLBR coded, for Subway, Car and Babble noise at SNRs of 20 dB, 15 dB and 5 dB.