Specify minimum trigger frequency for recording audio in Python

I'm writing a script for sound-activated recording in Python using pyaudio. I want to trigger a 5s recording after a sound that is above a prespecified volume and frequency. I've managed to get the volume part working but don't know how to specify the minimum trigger frequency (I'd like it to trigger at frequencies above 10kHz, for example):

import pyaudio
import wave
from array import array
import time
 
FORMAT=pyaudio.paInt16
CHANNELS=1
RATE=44100
CHUNK=1024
RECORD_SECONDS=5

audio=pyaudio.PyAudio() 

stream=audio.open(format=FORMAT,channels=CHANNELS, 
                  rate=RATE,
                  input=True,
                  frames_per_buffer=CHUNK)

nighttime=True

while nighttime:
    data=stream.read(CHUNK)
    data_chunk=array('h',data)
    vol=max(data_chunk)
    if(vol>=3000):
        print("recording triggered")
        frames=[]
        for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
            data = stream.read(CHUNK)
            frames.append(data)
        print("recording saved")
        # write to file
        words = ["RECORDING-", time.strftime("%Y%m%d-%H%M%S"), ".wav"]
        FILE_NAME= "".join(words)
        wavfile=wave.open(FILE_NAME,'wb')
        wavfile.setnchannels(CHANNELS)
        wavfile.setsampwidth(audio.get_sample_size(FORMAT))
        wavfile.setframerate(RATE)
        wavfile.writeframes(b''.join(frames))
        wavfile.close()
    # check if still nighttime
    nighttime=True

stream.stop_stream()
stream.close()
audio.terminate()

I'd like to change the line if(vol>=3000): to something like if(vol>=3000 and frequency>10000): but I don't know how to compute frequency. How can I do this?

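To get at the frequency content of a signal you can compute its Fourier transform, which takes you to the frequency domain (freq in the code holds the frequency of each bin in Hz). The next step is to compute the relative amplitude of the signal (amp), i.e. the share of the total spectral magnitude that falls into each bin. The unnormalized magnitude (spec) scales with the signal level; amp only describes how that level is distributed over frequency.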

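import numpy as np

# audio_array: the samples of one chunk, e.g. np.array(data_chunk)
# sampling_freq: the sampling rate in Hz (RATE in the question)
spec = np.abs(np.fft.rfft(audio_array))  # magnitude spectrum
freq = np.fft.rfftfreq(len(audio_array), d=1 / sampling_freq)  # bin frequencies in Hz
amp = spec / spec.sum()  # share of the total magnitude in each bin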
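
To connect this to the loop in the question, the bytes returned by stream.read(CHUNK) first have to become a numeric array. The following is a minimal sketch, assuming 16-bit mono input as in the question; chunk_spectrum is a hypothetical helper name used here for illustration, not something provided by pyaudio or NumPy.

```python
import numpy as np


def chunk_spectrum(data, rate):
    """Return (freq, amp) for one chunk of raw 16-bit mono audio bytes.

    Hypothetical helper for illustration. `data` is assumed to be what
    stream.read(CHUNK) returns for a paInt16, single-channel stream.
    """
    # Interpret the raw bytes as signed 16-bit samples, then use floats for the FFT.
    samples = np.frombuffer(data, dtype=np.int16).astype(np.float64)
    spec = np.abs(np.fft.rfft(samples))                # magnitude per frequency bin
    freq = np.fft.rfftfreq(len(samples), d=1 / rate)   # bin frequencies in Hz
    total = spec.sum()
    # Guard against an all-zero chunk (digital silence) to avoid dividing by zero.
    amp = spec / total if total > 0 else np.zeros_like(spec)
    return freq, amp
```

With CHUNK=1024 and RATE=44100 the bins are spaced RATE / CHUNK, roughly 43 Hz, apart, so a 10 kHz cutoff can be located quite precisely within a single chunk.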

Note that 3000 isn't a physical sound volume either: it is a raw 16-bit sample value, and the absolute sound level was lost when the signal was digitized. You only have relative numbers to work with, so you can simply check whether, for example, 1/3 of the spectral magnitude in a frame lies above 10 kHz.

Here's some code to illustrate the concept:

idx_above_10khz = np.argmax(freq > 10000)
amp_below_10k = amp[:idx_above_10khz].sum()
amp_above_10k = amp[idx_above_10khz:].sum()

Since amp_below_10k and amp_above_10k sum to 1, you can now trigger your recording once amp_above_10k exceeds a fraction of your choosing, for example amp_above_10k > 1/3, in addition to the volume check.
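
Putting the volume check and the frequency check together could look roughly like the sketch below. It assumes 16-bit mono chunks as in the question; should_trigger and its parameters (vol_threshold, cutoff_hz, min_fraction) are hypothetical names and defaults chosen for illustration, and the volume test uses the largest absolute sample rather than max(data_chunk).

```python
import numpy as np


def should_trigger(data, rate, vol_threshold=3000, cutoff_hz=10000,
                   min_fraction=1 / 3):
    """Decide whether one chunk is loud enough and high-pitched enough.

    Hypothetical helper: `data` is raw 16-bit mono audio as bytes, and the
    default thresholds are illustrative values, not recommendations.
    """
    samples = np.frombuffer(data, dtype=np.int16).astype(np.float64)
    # Volume check: largest absolute sample value in the chunk.
    if samples.size == 0 or np.abs(samples).max() < vol_threshold:
        return False
    spec = np.abs(np.fft.rfft(samples))
    total = spec.sum()
    if total == 0:
        return False  # digital silence, nothing to analyse
    freq = np.fft.rfftfreq(samples.size, d=1 / rate)
    # Share of the spectral magnitude that lies above the cutoff frequency.
    fraction_above = spec[freq > cutoff_hz].sum() / total
    return bool(fraction_above >= min_fraction)
```

In the recording loop, the condition if(vol>=3000): would then become if should_trigger(data, RATE):. Both thresholds probably need tuning against real recordings, since microphone gain and background noise shift the numbers.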
