
Why does the duration of a WAV file affect the values returned by SciPy find_peaks?

peak-detection fft scipy python

Joe Mills · 2 years ago

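I'm writing some very simple code to do frequency analysis of a WAV file (recorded from a microphone into Audacity): take the FFT of the file and find the resulting peak frequencies. As far as I can tell, I have followed the library documentation correctly.

The WAV file is a frequency sweep from 1 Hz up to 5000 Hz and back down to 1 Hz.

When I apply SciPy's find_peaks to the FFT data, the values it returns are multiplied by the length of the file (in seconds). See the plots below: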

Plots of: 1. Time-based data. 2. FFT single-sided. 3. FFT with peaks (note different frequency range). 4. FFT with corrected peaks (peaks/secs).

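The fourth plot shows that if I divide the peak data by the file duration, the values are correct. I don't like this solution because I can't find any documentation to back it up. Does anyone know what I'm doing wrong here?

My code: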

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import scipy.io.wavfile as wavfile
import scipy
import scipy.fft
import numpy as np
from scipy.signal import find_peaks as fp
from matplotlib import pyplot as plt

#read the wav file and print the sample rate
fs_rate, signal = wavfile.read("1Hz-5k_sweep_phone_single.wav")
print ("Sampling rate:", fs_rate)

#number of array dimensions: 1 for a mono file, 2 for a multi-channel file
l_audio = len(signal.shape)
print ("Channels:", l_audio)

#average the two channels down to mono, if the file is stereo
if l_audio == 2:
    signal = signal.sum(axis=1) / 2
    
#print number of samples
N = signal.shape[0]
print ("Number of samples N:", N)

#calculate duration of file
secs = N / float(fs_rate)
print ("File duration (secs):", secs)

#calculate sample interval
Ts = 1.0/fs_rate # sampling interval in time
print ("Timestep between samples Ts:", Ts)


t = np.arange(N) * Ts # time vector with exactly N samples (numpy.ndarray)

#Do FFT
FFT = abs(scipy.fft.fft(signal))
#print("FFT size: ",FFT.shape[0])

#split the FFT to one-sided range
FFT_side = FFT[range(N//2)] # one side FFT range
print("FFT upper side: ", FFT_side)

#calculate the frequency bins.  Put into array
freqs = scipy.fft.fftfreq(signal.size, t[1]-t[0])
print("t:", t[1]-t[0])
fft_freqs = np.array(freqs)
#print("Frequency bins: ",fft_freqs)

#Split the frequencies to one-sided range
freqs_side = freqs[range(N//2)] # one side frequency range
fft_freqs_side = np.array(freqs_side)
print("size of upper-side fft: ",fft_freqs_side.shape[0])


#do peak detection on data.
h = (2.0e+04,2.0e+08)  #detected height range
p = None  #prominence
d = 100 #minimum distance between peaks (ignore closer peaks)
peaks, _ = fp(x=FFT_side, height=h,prominence=p,distance=d)
print("Peak detected frequencies (Hz):",peaks)
print("number of peaks: ", peaks.shape[0])



#data plotting
fig = plt.figure(figsize=(12, 9)) # the four axes are created by plt.subplot below

ax1 = plt.subplot(411)
ax1.margins(0.05)           # Default margin is 0.05, value 0 means fit
ax1.plot(t, signal, "g",linewidth=0.1)
ax1.set_xlabel('Time')
ax1.set_ylabel('Amplitude')

ax2 = plt.subplot(412)
ax2.plot(freqs_side, abs(FFT_side), "b")
ax2.set_xlabel('Frequency (Hz)')
ax2.set_ylabel('FFT single-sided')
ax2.set_xlim(0,5500)
ax2.set_ylim(0,200000)

ax3 = plt.subplot(413)
ax3.plot(freqs_side, abs(FFT_side), "b")
ax3.plot(peaks, FFT_side[peaks],"x",color='tab:orange')
ax3.set_xlabel('Frequency (Hz)')
ax3.set_ylabel('FFT single-sided with peaks')
ax3.set_xlim(0,55000)
ax3.set_ylim(0,200000)

ax4 = plt.subplot(414)
ax4.plot(freqs_side, abs(FFT_side), "b")
ax4.plot(peaks/secs, FFT_side[peaks],"x",color='tab:orange')
ax4.set_xlabel('Frequency (Hz)')
ax4.set_ylabel('FFT single-sided: corrected peaks')
ax4.set_xlim(0,5500)
ax4.set_ylim(0,200000)

fig.tight_layout() # Or equivalently,  "plt.tight_layout()"

plt.show()

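Several Stack Overflow posts suggest that find_peaks is a preferred way to find FFT peaks. I have read the find_peaks documentation and verified that my code follows the examples there, as well as various other examples that claim to work.

I also searched Stack Overflow and Google for related questions but found nothing.

As mentioned above, I have "solved" the problem by dividing the peaks by the duration of the WAV file, but I'm sure I'm doing something wrong elsewhere in the code.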

2 answers | latest 2 years ago

Tino D · 2 years ago

That's because peaks does not hold frequencies; it holds indices into the array. You have to use:

ax3 = plt.subplot(413)
ax3.plot(freqs_side, abs(FFT_side), "b")
ax3.plot(freqs_side[peaks], FFT_side[peaks],"x",color='tab:orange')
ax3.set_xlabel('Frequency (Hz)')
ax3.set_ylabel('FFT single-sided with peaks')
ax3.set_xlim(0,55000)
ax3.set_ylim(0,200000)

When you divide them by the duration in seconds, you are effectively converting them back to Hz :D
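A minimal sketch of the same point on synthetic data. The two-tone signal, the 8 kHz sample rate and the 2 s duration are assumptions chosen for illustration, not the sweep from the question. `find_peaks` returns positions in the array, and indexing the frequency axis with those positions gives values in Hz:

```python
import numpy as np
from scipy.fft import fft, fftfreq
from scipy.signal import find_peaks

fs = 8000          # assumed sample rate in Hz (illustrative)
duration = 2.0     # assumed duration in seconds (illustrative)
n = int(fs * duration)
t = np.arange(n) / fs

# Two pure tones, at 440 Hz and 1000 Hz
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# One-sided magnitude spectrum and the matching frequency axis
spectrum = np.abs(fft(x))[: n // 2]
freqs = fftfreq(n, d=1 / fs)[: n // 2]

# find_peaks returns array indices, not frequencies
peak_idx, _ = find_peaks(spectrum, height=0.1 * spectrum.max())
peak_hz = freqs[peak_idx]   # look the indices up on the frequency axis

print("indices:", peak_idx)
print("frequencies (Hz):", peak_hz)
```

With a 2 s signal the bins are 0.5 Hz apart, so the raw indices come out at twice the frequency in Hz, which is the same scaling by duration that the question observed.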

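Joe Mills · 2 years ago

Answering my own question here:

When applied to FFT data, find_peaks reports the FFT bin at which each peak occurs, not its frequency. The bin numbers still have to be mapped onto the frequency axis.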

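Given the bin spacing I specified, dividing by the file duration is correct: the bins are fs_rate/N = 1/secs Hz apart, so bin k sits at k/secs Hz.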
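This relationship can be checked numerically. The following is only a sketch, with an illustrative sample rate and sample count that are not taken from the question's file. It compares `k / secs` against the frequency axis produced by `fftfreq` for every one-sided bin index `k`:

```python
import numpy as np
from scipy.fft import fftfreq

fs = 44100            # illustrative sample rate in Hz
n = 3 * fs + 123      # illustrative sample count (not a whole number of seconds)
secs = n / fs         # duration in seconds

# One-sided frequency axis, built the same way as in the question
freqs_side = fftfreq(n, d=1 / fs)[: n // 2]
bin_width = freqs_side[1] - freqs_side[0]

k = np.arange(n // 2)  # every one-sided bin index

print("bin width (Hz):", bin_width, "1/secs:", 1 / secs)
print("k/secs matches freqs_side:", np.allclose(k / secs, freqs_side))
```

Dividing by the duration therefore gives the same numbers as `freqs_side[peaks]`, but indexing the frequency array states the intent directly and does not depend on how the axis was built.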

I could have checked this with:

print(fft_freqs_side[51511])

which returns a value of 5004 Hz.
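The same bin-versus-Hz distinction applies to the `distance` argument used in the question (`d = 100`): SciPy documents it in samples, so for FFT data it means 100 bins, not 100 Hz. Below is a rough sketch of converting a spacing in Hz to bins. The test signal and the name `min_sep_hz` are hypothetical and only serve the illustration:

```python
import numpy as np
from scipy.fft import fft, fftfreq
from scipy.signal import find_peaks

fs = 8000                      # illustrative sample rate in Hz
n = 4 * fs                     # illustrative length: 4 s of audio
t = np.arange(n) / fs

# Two tones 30 Hz apart, plus a third one far away
x = (np.sin(2 * np.pi * 500 * t)
     + 0.8 * np.sin(2 * np.pi * 530 * t)
     + 0.6 * np.sin(2 * np.pi * 2000 * t))

spectrum = np.abs(fft(x))[: n // 2]
freqs = fftfreq(n, d=1 / fs)[: n // 2]
bin_width = freqs[1] - freqs[0]            # Hz per bin, equal to fs / n

min_sep_hz = 50                            # hypothetical minimum spacing, in Hz
distance_bins = max(1, int(round(min_sep_hz / bin_width)))

peak_idx, _ = find_peaks(spectrum,
                         height=0.1 * spectrum.max(),
                         distance=distance_bins)

print("distance in bins:", distance_bins)
print("peak frequencies (Hz):", freqs[peak_idx])
```

Here the 530 Hz tone is dropped because it lies within 50 Hz of the stronger 500 Hz peak. Passing `distance=50` directly would have meant 12.5 Hz at this bin width, and both tones would have been kept.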