
Noise Removal in Baby Cry Recordings: Comparative Analysis

noiselist.png

Introduction

Noise removal methods such as spectral subtraction and Wiener filtering are commonly used in speech processing, but their performance varies widely with the type of noise, especially for sudden noises such as drilling. Baby cries are high-pitched and change quickly, which makes noise removal harder. Past studies have focused on adult speech or general audio, and few address baby-specific cases. This is a gap, because clear cry signals are needed to detect health issues in babies early.

Method

Our experiments identify Blend 0.7 as the most effective method overall for enhancing baby cry recordings, particularly under continuous background noise. This hybrid method combines:

  • 70% Wiener Filter (for natural sound preservation)

  • 30% Spectral Subtraction (for strong noise reduction)

Key advantages:

  1. Delivers 8-10 dB noise reduction in constant background noise

  2. Scores 1.6-2.04 on the PESQ audio quality measure (higher than either single method)

  3. Minimizes digital artifacts that distort cry analysis
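
The 70/30 combination described above can be sketched per frequency bin as a weighted sum of two gains. This is a minimal illustration, not the project's actual implementation: it assumes both methods work on short-time power spectra, that the Wiener gain uses the simple estimate (Y - N) / Y, and that spectral subtraction uses a spectral floor of 0.01. The function names and the floor value are hypothetical. Because both methods scale the same noisy spectrum and the inverse transform is linear, blending the gains gives the same result as blending the two output signals.

```python
import math

def wiener_gain(noisy_power, noise_power):
    """Wiener gain xi / (1 + xi), with the a priori SNR xi estimated as max(Y/N - 1, 0)."""
    if noisy_power <= 0.0:
        return 0.0
    return max(noisy_power - noise_power, 0.0) / noisy_power

def spectral_subtraction_gain(noisy_power, noise_power, floor=0.01):
    """Power spectral subtraction gain; `floor` (an assumed value) limits over-subtraction."""
    if noisy_power <= 0.0:
        return 0.0
    return math.sqrt(max(1.0 - noise_power / noisy_power, floor))

def blend_gain(noisy_power, noise_power, alpha=0.7):
    """Weighted gain: alpha * Wiener + (1 - alpha) * spectral subtraction."""
    return (alpha * wiener_gain(noisy_power, noise_power)
            + (1.0 - alpha) * spectral_subtraction_gain(noisy_power, noise_power))

# Example: one bin with noisy power 4.0 and estimated noise power 1.0.
gain = blend_gain(4.0, 1.0)
```

The resulting gain would multiply the noisy spectrum in each bin before the inverse transform. Setting alpha to 1.0 or 0.0 recovers the pure Wiener or pure spectral subtraction behaviour.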

ss.png
wf.png
1. Overview of PESQ Scores

The PESQ (Perceptual Evaluation of Speech Quality) scores range from -0.5 to 4.5, with higher scores indicating better perceptual quality. The results show:

  • Most PESQ scores fall between 1.0 and 2.0, which corresponds to "Bad" to "Poor" on the 1-5 MOS listening-quality scale (annoying but intelligible).

  • The highest PESQ score achieved was 2.04 for white_noise_SNR10dB.wav using the Blend 0.7 method, which is still only "Poor" quality on that scale.

  • For most noise types, PESQ improvements are marginal (e.g., breaking_concrete ranges from 1.04 to 1.27).
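
For readers comparing these numbers with other work: raw PESQ scores on the -0.5 to 4.5 range are often mapped to MOS-LQO (roughly 1.0 to 4.5) with the logistic function from ITU-T P.862.1. The sketch below shows that mapping. The page does not state whether its scores are raw or already mapped, so this is illustrative only, and the function name is hypothetical.

```python
import math

def pesq_raw_to_mos_lqo(raw_score):
    """Map a raw PESQ score to MOS-LQO using the ITU-T P.862.1 logistic function."""
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * raw_score + 4.6607))

# Example: map the end points of the raw range and the best score reported above.
mapped = {raw: pesq_raw_to_mos_lqo(raw) for raw in (-0.5, 2.04, 4.5)}
```

The mapping is monotonic, so it does not change the ranking of methods. It only changes the scale on which differences are read.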

2. Impact of SNR (Signal-to-Noise Ratio)
  • Higher SNR (10 dB) generally led to better PESQ scores, but the improvement was modest. For example:

    • white_noise: PESQ improved from 1.40 (0 dB) to 2.04 (10 dB).

    • car_road1: PESQ improved from 1.09 (0 dB) to 1.60 (10 dB).

  • Lower SNR (0 dB) scenarios struggled, with PESQ scores often near 1.0 ("Bad" quality on the MOS scale).

breaking_concrete_SNR10dB_20250530_000001_comparison.png
emviroment_noise1_SNR10dB_20250530_000056_comparison.png
3. Performance by Noise Type
  • White Noise:

    • Best performance with Blend 0.7 (PESQ: 1.69–2.04).

    • Adaptive and Wiener Filter also performed well, especially at higher SNRs.

  • Car Road Noise:

    • Blend 0.7 consistently outperformed other methods (PESQ: 1.09–1.60).

  • Breaking Concrete / Environment Noise / Remodeling Noise:

    • Wiener Filter or Blend 0.7 worked best, but PESQ improvements were minimal (1.03–1.31).

    • These noises are more challenging, with PESQ rarely exceeding 1.3.

Tailoring Methods to Noise Types

For continuous noises (white noise, engine sounds):

  • Spectral Subtraction: Best SNR improvement (8-10 dB) but introduces artifacts

  • Blend 0.7: Best balance (PESQ 1.6-2.04) with good noise reduction

For sudden noises (door slams, construction):

  • Wiener Filter: Performs best (PESQ 1.03-1.16)

  • Spectral Subtraction: Degrades perceived quality even though it improves SNR
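
The SNR figures above can be made concrete with a short sketch. It shows one common way to mix a clean cry with noise at a target SNR (such as the 0, 5, and 10 dB conditions in the file names) and to measure the SNR improvement after enhancement. It assumes a time-aligned clean reference is available and that signals are plain lists of samples. The function names are hypothetical, and the project's own evaluation code may differ.

```python
import math

def power(samples):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)

def snr_db(clean, estimate):
    """SNR of `estimate` in dB, treating (estimate - clean) as the noise.

    Undefined (division by zero) if the estimate equals the clean signal exactly.
    """
    residual = [e - c for c, e in zip(clean, estimate)]
    return 10.0 * math.log10(power(clean) / power(residual))

def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR."""
    scale = math.sqrt(power(clean) / (power(noise) * 10.0 ** (target_snr_db / 10.0)))
    return [c + scale * n for c, n in zip(clean, noise)]

def snr_improvement_db(clean, noisy, enhanced):
    """Output SNR minus input SNR, in dB."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)
```

A high value from `snr_improvement_db` does not guarantee a higher PESQ score, which is the trade-off noted for spectral subtraction: removing more noise energy can add artifacts that a perceptual measure penalises.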

breaking_concrete_SNR0dB_20250529_235942_spectrum.png
white_noise_SNR5dB_20250530_000213_spectrum.png


Conclusion

  • For White Noise and Car Road Noise:

    • Blend 0.7 is the top choice, offering the highest PESQ scores (up to 2.04).

    • Adaptive methods also perform well for white noise at lower SNRs.

  • For Environment and Remodeling Noise:

    • Wiener Filter is most effective, though gains are modest (PESQ: 1.03–1.31).

ISDN2001/2002: Second Year Design Project
