
Why does the duration of a WAV file affect the values returned by Scipy find_peaks?

  • Joe Mills  ·  2 years ago

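    I am writing some very simple code to do frequency analysis on a WAV file (recorded from a microphone into Audacity): take the FFT of the file and find the peak frequency values in the result. As far as I can tell, I have followed the library documentation correctly.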

    The WAV file is a frequency sweep from 1 Hz up to 5000 Hz and back down to 1 Hz.

    When I apply Scipy find_peaks to the FFT data, the values it returns are multiplied by the length of the file (in seconds). See the plots below:

    Plots of: 1. Time-domain data. 2. Single-sided FFT. 3. FFT with peaks (note the different frequency range). 4. FFT with corrected peaks (peaks/secs).

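    The fourth plot shows that the values are correct if I divide the peak data by the file duration. I don't like this solution because I can't find any documentation that supports it. Does anyone know what I am doing wrong here?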

    My code:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    import scipy.io.wavfile as wavfile
    import scipy
    import scipy.fftpack
    import scipy.fft
    import numpy as np
    from scipy.signal import find_peaks as fp
    from matplotlib import pyplot as plt
    
    #read the wav file and print the sample rate
    fs_rate, signal = wavfile.read("1Hz-5k_sweep_phone_single.wav")
    print ("Sampling rate:", fs_rate)
    
    #calculate number of channels of signal
    l_audio = len(signal.shape)
    print ("Channels:", l_audio)
    
    #trim one channel, if channels = 2
    if l_audio == 2:
        signal = signal.sum(axis=1) / 2
        
    #print number of samples
    N = signal.shape[0]
    print ("Number of samples N:", N)
    
    #calculate duration of file
    secs = N / float(fs_rate)
    print ("File duration (secs):", secs)
    
    #calculate sample interval
    Ts = 1.0/fs_rate # sampling interval in time
    print ("Timestep between samples Ts:", Ts)
    
    
    t = np.arange(0, secs, Ts) # time vector as scipy arange field / numpy.ndarray
    
    #Do FFT
    FFT = abs(scipy.fft.fft(signal))
    #print("FFT size: ",FFT.shape[0])
    
    #split the FFT to one-sided range
    FFT_side = FFT[range(N//2)] # one side FFT range
    print("FFT upper side: ", FFT_side)
    
    #calculate the frequency bins.  Put into array
    freqs = scipy.fft.fftfreq(signal.size, t[1]-t[0])
    print("t:", t[1]-t[0])
    fft_freqs = np.array(freqs)
    #print("Frequency bins: ",fft_freqs)
    
    #Split the frequencies to one-sided range
    freqs_side = freqs[range(N//2)] # one side frequency range
    fft_freqs_side = np.array(freqs_side)
    print("size of upper-side fft: ",fft_freqs_side.shape[0])
    
    
    #do peak detection on data.
    h = (2.0e+04,2.0e+08)  #detected height range
    p = None  #prominence
    d = 100 #minimum distance between peaks (ignore closer peaks)
    peaks, _ = fp(x=FFT_side, height=h,prominence=p,distance=d)
    print("Peak detected frequencies (Hz):",peaks)
    print("number of peaks: ", peaks.shape[0])
    
    
    
    #data plotting
    fig, axes = plt.subplots(nrows = 2, ncols = 1, figsize= (12,9))
    
    ax1 = plt.subplot(411)
    ax1.margins(0.05)           # Default margin is 0.05, value 0 means fit
    ax1.plot(t, signal, "g",linewidth=0.1)
    ax1.set_xlabel('Time')
    ax1.set_ylabel('Amplitude')
    
    ax2 = plt.subplot(412)
    ax2.plot(freqs_side, abs(FFT_side), "b")
    ax2.set_xlabel('Frequency (Hz)')
    ax2.set_ylabel('FFT single-sided')
    ax2.set_xlim(0,5500)
    ax2.set_ylim(0,200000)
    
    ax3 = plt.subplot(413)
    ax3.plot(freqs_side, abs(FFT_side), "b")
    ax3.plot(peaks, FFT_side[peaks],"x",color='tab:orange')
    ax3.set_xlabel('Frequency (Hz)')
    ax3.set_ylabel('FFT single-sided with peaks')
    ax3.set_xlim(0,55000)
    ax3.set_ylim(0,200000)
    
    ax4 = plt.subplot(414)
    ax4.plot(freqs_side, abs(FFT_side), "b")
    ax4.plot(peaks/secs, FFT_side[peaks],"x",color='tab:orange')
    ax4.set_xlabel('Frequency (Hz)')
    ax4.set_ylabel('FFT single-sided: corrected peaks')
    ax4.set_xlim(0,5500)
    ax4.set_ylim(0,200000)
    
    fig.tight_layout() # Or equivalently,  "plt.tight_layout()"
    
    plt.show()
    

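    Several posts on Stack Overflow suggest that find_peaks is a preferred way to locate peaks in an FFT. I have read the find_peaks documentation and checked that my code follows the examples there, as well as various other examples that are claimed to work.

    I also searched Stack Overflow and Google for related questions but found nothing.

    As mentioned above, I have "solved" the problem by dividing the peak values by the duration of the WAV file, but I am sure I am doing something wrong elsewhere in the code.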

    2 Answers
        1
  •   Tino D    2 years ago

    That's because peaks are not frequencies; peaks are indices into the array you passed to find_peaks. You have to use:

    ax3 = plt.subplot(413)
    ax3.plot(freqs_side, abs(FFT_side), "b")
    ax3.plot(freqs_side[peaks], FFT_side[peaks],"x",color='tab:orange')
    ax3.set_xlabel('Frequency (Hz)')
    ax3.set_ylabel('FFT single-sided with peaks')
    ax3.set_xlim(0,55000)
    ax3.set_ylim(0,200000)
    

    When you divide them by the duration in seconds, you are effectively converting them back to Hz :D
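
    A minimal self-contained sketch of the point above, using a synthetic two-tone signal in place of the WAV file. The sample rate, duration, and tone frequencies are illustrative assumptions, not values from the question:

```python
import numpy as np
from scipy.fft import rfft, rfftfreq
from scipy.signal import find_peaks

fs = 8000            # assumed sample rate in Hz (illustrative)
secs = 2.5           # assumed duration in seconds (illustrative)
n = int(fs * secs)   # number of samples
t = np.arange(n) / fs

# Two test tones at known frequencies (stand-in for the recorded sweep)
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

spectrum = np.abs(rfft(x))       # one-sided magnitude spectrum
freqs = rfftfreq(n, d=1 / fs)    # frequency of each bin in Hz

# find_peaks returns positions (indices) into `spectrum`, not frequencies
peak_idx, _ = find_peaks(spectrum, height=0.1 * spectrum.max())

# Look the indices up on the frequency axis to get Hz
peak_hz = freqs[peak_idx]

print("bin indices:     ", peak_idx)
print("frequencies (Hz):", peak_hz)
print("indices / secs:  ", peak_idx / secs)
```

    For this signal the indices come out as 1100 and 2500, which map to 440 Hz and 1000 Hz. Dividing the indices by secs gives the same numbers, which is why the workaround in the question appeared to work.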

        2
  •   Joe Mills    2 years ago

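    Answering my own question here:

    When applied to FFT data, find_peaks returns the FFT bin at which each peak occurs, not its frequency. The bin indices still have to be mapped onto the frequency range.

    Given the bin size I specified, dividing by the file duration is correct.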
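
    The reason the division works can be written out briefly. This assumes the usual DFT bin layout that fftfreq produces, with f_s the sample rate, N the number of samples, and T the file duration (symbols chosen here for illustration):

```latex
T = \frac{N}{f_s}, \qquad
f_k = \frac{k\, f_s}{N} = \frac{k}{T}, \qquad 0 \le k < \frac{N}{2}
```

    So adjacent bins are 1/T Hz apart, and dividing a bin index k by the duration in seconds gives the same frequency as indexing the frequency array at k.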

    I could have checked this with:

    print(fft_freqs_side[51511])
    

    which returns a value of 5004 Hz.
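
    The same index-versus-Hz distinction applies to the distance argument in the question's code: find_peaks measures it in samples of the input array, so d = 100 means 100 bins rather than 100 Hz. A hedged sketch of one way to specify the separation in Hz instead. Here find_spectral_peaks and its parameters are hypothetical names introduced for illustration, and the test signal is synthetic:

```python
import numpy as np
from scipy.fft import rfft, rfftfreq
from scipy.signal import find_peaks

def find_spectral_peaks(spectrum, freqs, min_height, min_sep_hz):
    """Return peak frequencies (Hz) and heights from a one-sided spectrum.

    Hypothetical helper: min_sep_hz is converted to bins because
    find_peaks measures `distance` in samples of its input array.
    """
    df = freqs[1] - freqs[0]                         # bin spacing in Hz (1 / duration)
    distance = max(1, int(round(min_sep_hz / df)))   # find_peaks needs distance >= 1
    idx, props = find_peaks(spectrum, height=min_height, distance=distance)
    return freqs[idx], props["peak_heights"]

# Synthetic signal with two tones 20 Hz apart (illustrative values)
fs, secs = 8000, 2.0
n = int(fs * secs)
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 500 * t) + 0.6 * np.sin(2 * np.pi * 520 * t)

spectrum = np.abs(rfft(x))
freqs = rfftfreq(n, d=1 / fs)

# A 5 Hz minimum separation keeps both tones; 50 Hz keeps only the larger one
close_hz, _ = find_spectral_peaks(spectrum, freqs, min_height=100.0, min_sep_hz=5.0)
far_hz, _ = find_spectral_peaks(spectrum, freqs, min_height=100.0, min_sep_hz=50.0)
print("min separation 5 Hz: ", close_hz)
print("min separation 50 Hz:", far_hz)
```

    Because the bin spacing is 1/duration, a fixed distance in bins corresponds to a different separation in Hz for every file length, so converting from Hz keeps the behaviour consistent across recordings.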