Specify minimum trigger frequency for recording audio in Python
I'm writing a script for sound-activated recording in Python using pyaudio. I want to trigger a 5s recording after a sound that is above a prespecified volume and frequency. I've managed to get the volume part working but don't know how to specify the minimum trigger frequency (I'd like it to trigger at frequencies above 10kHz, for example):
import pyaudio
import wave
from array import array
import time
FORMAT=pyaudio.paInt16
CHANNELS=1
RATE=44100
CHUNK=1024
RECORD_SECONDS=5
audio=pyaudio.PyAudio()
stream=audio.open(format=FORMAT,channels=CHANNELS,
                  rate=RATE,
                  input=True,
                  frames_per_buffer=CHUNK)
nighttime=True
while nighttime:
    data=stream.read(CHUNK)
    data_chunk=array('h',data)
    vol=max(data_chunk)
    if(vol>=3000):
        print("recording triggered")
        frames=[]
        for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
            data = stream.read(CHUNK)
            frames.append(data)
        print("recording saved")
        # write to file
        words = ["RECORDING-", time.strftime("%Y%m%d-%H%M%S"), ".wav"]
        FILE_NAME= "".join(words)
        wavfile=wave.open(FILE_NAME,'wb')
        wavfile.setnchannels(CHANNELS)
        wavfile.setsampwidth(audio.get_sample_size(FORMAT))
        wavfile.setframerate(RATE)
        wavfile.writeframes(b''.join(frames))
        wavfile.close()
    # check if still nighttime
    nighttime=True
stream.stop_stream()
stream.close()
audio.terminate()
I'd like to change the line if(vol>=3000): to something like if(vol>=3000 and frequency>10000): but I don't know how to set up frequency. How can I do this?
To find the frequency content of a signal, compute its Fourier transform, which takes you to the frequency domain (freq in the code). The next step is to compute the relative amplitude of the signal in each frequency bin (amp), which is proportional to the sound volume at that frequency.
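import numpy as np

# audio_array: one chunk of samples; sampling_freq: sample rate in Hz (RATE in the question)
spec = np.abs(np.fft.rfft(audio_array))  # magnitude spectrum of the chunk
freq = np.fft.rfftfreq(len(audio_array), d=1 / sampling_freq)  # frequency of each bin in Hz
amp = spec / spec.sum()  # relative amplitude per bin (sums to 1)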
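As a quick sanity check of the two FFT calls above, here is a minimal sketch that feeds them a synthetic tone and reads back the strongest bin. The helper name dominant_frequency and the 12 kHz test signal are made up for illustration; RATE and CHUNK are taken from the question.

```python
import numpy as np

RATE = 44100   # sample rate in Hz, as in the question
CHUNK = 1024   # samples per chunk, as in the question

def dominant_frequency(samples, rate):
    """Return the centre frequency (Hz) of the strongest FFT bin."""
    spec = np.abs(np.fft.rfft(samples))
    freq = np.fft.rfftfreq(len(samples), d=1 / rate)
    return freq[np.argmax(spec)]

# Hypothetical test signal: a pure 12 kHz sine, one chunk long.
t = np.arange(CHUNK) / RATE
tone = np.sin(2 * np.pi * 12000 * t)

peak = dominant_frequency(tone, RATE)
bin_width = RATE / CHUNK  # spacing between neighbouring bins, roughly 43 Hz
```

With a 1024-sample chunk the bins are about 43 Hz apart, so peak can only be expected to land within one bin width of the true frequency. That resolution is plenty for a 10 kHz cutoff.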
Note that 3000 isn't a sound volume either; it is just a raw sample value. The true sound volume information was lost when the signal was digitised. You are now working only with relative numbers, so you can simply check whether, e.g., 1/3 of the spectral amplitude in a frame lies above 10 kHz.
Here's some code to illustrate the concept:
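# index of the first frequency bin above 10 kHz
idx_above_10khz = np.argmax(freq > 10000)
# share of the relative amplitude below and above that bin
amp_below_10k = amp[:idx_above_10khz].sum()
amp_above_10k = amp[idx_above_10khz:].sum()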
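You can now trigger your program once the ratio amp_below_10k / amp_above_10k falls below a threshold of your choice. Equivalently, since the two sums add up to 1, you can trigger once amp_above_10k exceeds some fraction such as 1/3.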
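Putting the pieces together, the check could be wrapped up roughly as follows. This is only a sketch under a few assumptions: the function names high_freq_fraction and should_trigger and the parameters cutoff_hz, min_peak and min_fraction are invented here, the chunk is assumed to be mono 16-bit audio as returned by stream.read(CHUNK) in the question, and the loudness test uses the absolute peak rather than max(data_chunk).

```python
import numpy as np

def high_freq_fraction(chunk_bytes, rate, cutoff_hz=10000):
    """Share (0.0 to 1.0) of the spectral amplitude at or above cutoff_hz."""
    samples = np.frombuffer(chunk_bytes, dtype=np.int16).astype(np.float64)
    if samples.size == 0:
        return 0.0
    samples -= samples.mean()  # drop any DC offset so it is not counted as low-frequency content
    spec = np.abs(np.fft.rfft(samples))
    total = spec.sum()
    if total == 0:  # pure silence: nothing to compare
        return 0.0
    freq = np.fft.rfftfreq(len(samples), d=1 / rate)
    return float(spec[freq >= cutoff_hz].sum() / total)

def should_trigger(chunk_bytes, rate, min_peak=3000, min_fraction=1 / 3):
    """True if the chunk is loud enough AND mostly above the cutoff (assumed thresholds)."""
    samples = np.frombuffer(chunk_bytes, dtype=np.int16)
    if samples.size == 0:
        return False
    # widen to int32 so abs(-32768) does not overflow
    loud = np.abs(samples.astype(np.int32)).max() >= min_peak
    return bool(loud and high_freq_fraction(chunk_bytes, rate) >= min_fraction)
```

In the question's loop, the test if(vol>=3000): would then become if should_trigger(data, RATE):. Both thresholds are starting points to tune against real recordings, not recommended values.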