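I'm writing some very simple code to do frequency analysis on a WAV file (for example, one recorded from a microphone into Audacity): run an FFT on the file and find the peak frequency values in the result. As far as I can tell, I have followed the library documentation correctly.
The WAV file is a frequency sweep from 1 Hz up to 5000 Hz and back down to 1 Hz.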
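For anyone who wants to reproduce this without the recording, a rough synthetic stand-in for such a sweep can be generated as below. This is only a sketch: the sample rate, the 5 s per half, and the linear sweep shape are assumptions, not properties of the recorded file, and `scipy.signal.chirp` is used purely for illustration.

```python
import numpy as np
from scipy.signal import chirp

fs_rate = 44100      # assumed sample rate in Hz (not taken from the recording)
half_secs = 5.0      # assumed duration of each half of the sweep, in seconds

# Time axis for one half of the sweep
t_half = np.arange(int(fs_rate * half_secs)) / fs_rate

# Linear sweep from 1 Hz up to 5000 Hz
up = chirp(t_half, f0=1.0, f1=5000.0, t1=half_secs, method="linear")

# A time-reversed copy sweeps from 5000 Hz back down to 1 Hz
down = up[::-1]

sweep = np.concatenate([up, down])
print("Number of samples:", sweep.shape[0])
print("Duration (secs):", sweep.shape[0] / fs_rate)
```

The resulting array can be passed through the same FFT and peak-detection steps as the recorded file.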
When I apply SciPy's find_peaks to the FFT data, the values it returns come out multiplied by the length of the file (in seconds). See the plots below:
Plots: 1. Time-domain data. 2. Single-sided FFT. 3. FFT with peaks (note the different frequency range). 4. FFT with corrected peaks (peaks/secs).
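The fourth plot shows that the values are correct if I divide the peak data by the file duration. I don't like this solution, because I can't find any documentation to support it. Does anyone know what I'm doing wrong here?
My code: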
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import scipy.io.wavfile as wavfile
import scipy.fft
import numpy as np
from scipy.signal import find_peaks as fp
from matplotlib import pyplot as plt
#read the wav file and print the sample rate
fs_rate, signal = wavfile.read("1Hz-5k_sweep_phone_single.wav")
print ("Sampling rate:", fs_rate)
#number of array dimensions: 1 for a mono file, 2 for multi-channel (samples x channels)
l_audio = len(signal.shape)
print ("Array dimensions:", l_audio)
#average the channels down to a single channel if the file is multi-channel
if l_audio == 2:
    signal = signal.mean(axis=1)
#print number of samples
N = signal.shape[0]
print ("Number of samples N:", N)
#calculate duration of file
secs = N / float(fs_rate)
print ("File duration (secs):", secs)
#calculate sample interval
Ts = 1.0/fs_rate # sampling interval in time
print ("Timestep between samples Ts:", Ts)
t = np.arange(N) * Ts # time vector (numpy.ndarray) with exactly N entries, one per sample
#Do FFT
FFT = abs(scipy.fft.fft(signal))
#print("FFT size: ",FFT.shape[0])
#split the FFT to one-sided range
FFT_side = FFT[:N//2] # one side FFT range
print("FFT upper side: ", FFT_side)
#calculate the frequency bins. Put into array
freqs = scipy.fft.fftfreq(signal.size, t[1]-t[0])
print("t:", t[1]-t[0])
fft_freqs = np.array(freqs)
#print("Frequency bins: ",fft_freqs)
#Split the frequencies to one-sided range
freqs_side = freqs[:N//2] # one side frequency range
fft_freqs_side = np.array(freqs_side)
print("size of upper-side fft: ",fft_freqs_side.shape[0])
#do peak detection on data.
h = (2.0e+04,2.0e+08) #detected height range
p = None #prominence
d = 100 #minimum distance between peaks (ignore closer peaks)
peaks, _ = fp(x=FFT_side, height=h,prominence=p,distance=d)
print("Peak detected frequencies (Hz):",peaks)
print("number of peaks: ", peaks.shape[0])
#data plotting
fig = plt.figure(figsize=(12, 9)) # the four panels are added below with plt.subplot
ax1 = plt.subplot(411)
ax1.margins(0.05) # Default margin is 0.05, value 0 means fit
ax1.plot(t, signal, "g",linewidth=0.1)
ax1.set_xlabel('Time')
ax1.set_ylabel('Amplitude')
ax2 = plt.subplot(412)
ax2.plot(freqs_side, abs(FFT_side), "b")
ax2.set_xlabel('Frequency (Hz)')
ax2.set_ylabel('FFT single-sided')
ax2.set_xlim(0,5500)
ax2.set_ylim(0,200000)
ax3 = plt.subplot(413)
ax3.plot(freqs_side, abs(FFT_side), "b")
ax3.plot(peaks, FFT_side[peaks],"x",color='tab:orange')
ax3.set_xlabel('Frequency (Hz)')
ax3.set_ylabel('FFT single-sided with peaks')
ax3.set_xlim(0,55000)
ax3.set_ylim(0,200000)
ax4 = plt.subplot(414)
ax4.plot(freqs_side, abs(FFT_side), "b")
ax4.plot(peaks/secs, FFT_side[peaks],"x",color='tab:orange')
ax4.set_xlabel('Frequency (Hz)')
ax4.set_ylabel('FFT single-sided: corrected peaks')
ax4.set_xlim(0,5500)
ax4.set_ylim(0,200000)
fig.tight_layout() # Or equivalently, "plt.tight_layout()"
plt.show()
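Multiple Stack Overflow posts suggest that find_peaks is a preferred way to find peaks in an FFT. I have looked at the find_peaks documentation and verified that my code follows the examples there, as well as various other examples that claim to work.
I have also searched Stack Overflow and Google for related questions, but found nothing.
As mentioned above, I have "solved" the problem by dividing the peaks by the duration of the WAV file, but I'm sure I'm doing something wrong elsewhere in the code.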
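One likely explanation, shown here as a minimal sketch on a synthetic signal rather than on the recording: `find_peaks` returns the positions (array indices) of the peaks in the array it is given, not values on a frequency axis. The FFT bin spacing is `fs_rate / N`, which equals `1 / secs`, so index `k` corresponds to `k / secs` Hz. That would be why dividing by the duration appears to work. The signal parameters below (8000 Hz sample rate, 3 s duration, tones at 440 Hz and 1000 Hz) are assumptions chosen purely for illustration.

```python
import numpy as np
import scipy.fft
from scipy.signal import find_peaks

# Assumed test signal (not the recording from the question):
# two sine tones sampled at 8000 Hz for 3 seconds.
fs_rate = 8000
secs = 3.0
N = int(fs_rate * secs)
t = np.arange(N) / fs_rate
signal = np.sin(2 * np.pi * 440 * t) + 0.8 * np.sin(2 * np.pi * 1000 * t)

# One-sided magnitude spectrum and the matching frequency axis
FFT_side = np.abs(scipy.fft.fft(signal))[:N // 2]
freqs_side = scipy.fft.fftfreq(N, d=1.0 / fs_rate)[:N // 2]

# find_peaks returns indices into FFT_side, i.e. FFT bin numbers
peaks, _ = find_peaks(FFT_side, height=0.5 * FFT_side.max())

# Look the indices up on the frequency axis to get values in Hz
peak_freqs = freqs_side[peaks]

print("Peak indices (bins):", peaks)
print("Peak frequencies (Hz):", peak_freqs)
print("peaks / secs:", peaks / secs)  # matches peak_freqs, since bin spacing is 1 / secs
```

In the code from the question this would mean plotting `freqs_side[peaks]` instead of `peaks` on the x-axis. The same applies to the `distance` argument of `find_peaks`, which is counted in samples of the input array (here, FFT bins) rather than in Hz.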