Recording audio with PyAudio in Python

Aug 31 2020

I am trying to record audio from the microphone with Python, and I have the following code:

import pyaudio
import wave
import threading

FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
CHUNK = 1024
WAVE_OUTPUT_FILENAME = "file.wav"

stop_ = False
audio = pyaudio.PyAudio()

stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE, input=True,
                    frames_per_buffer=CHUNK)


def stop():
    global stop_
    while True:
        if not input('Press Enter >>>'):
            print('exit')
            stop_ = True


t = threading.Thread(target=stop, daemon=True).start()
frames = []

while True:
    data = stream.read(CHUNK)
    frames.append(data)
    if stop_:
        break

stream.stop_stream()
stream.close()
audio.terminate()
waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
waveFile.setnchannels(CHANNELS)
waveFile.setsampwidth(audio.get_sample_size(FORMAT))
waveFile.setframerate(RATE)
waveFile.writeframes(b''.join(frames))
waveFile.close()

My code runs without errors, but when I play back the recording, I hear no sound in the final output file (file.wav).

Why does this happen, and how can I fix it?

Answers

1 Azr Sep 08 2020 at 03:31

Your code works correctly. The problem you are facing is due to administrator rights, that is, the process is not allowed to access the microphone. The audio file then contains constant 0 data, so no sound can be heard in the generated WAV file. I assume your microphone is installed and working properly. If you are not sure about the state of your audio setup, follow these steps for your operating system:

macOS: Go to System Preferences -> Sound -> Input, where you can watch the level bars move while you make some sound. Make sure the selected device type is Built-in.

Windows: Open the sound settings and test the microphone by checking "Listen to this device". You may want to uncheck it afterwards, because it plays your voice back through the speakers in a loop and can create loud feedback noise.
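To confirm that the file really contains only zeros, a minimal sketch like the one below can help. It assumes 16-bit PCM (the paInt16 format used in the question), where digital silence is stored as all-zero bytes; wav_is_silent is just an illustrative helper name, not part of any library.

```python
import wave


def wav_is_silent(path):
    """Return True if every sample byte in the WAV file is zero.

    Assumes signed 16-bit PCM, where silence is encoded as zero bytes.
    """
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.readframes(wav_file.getnframes())
    # any() over a bytes object is True as soon as one byte is non-zero.
    return not any(frames)
```

Calling wav_is_silent("file.wav") after a recording should return False once the microphone is actually delivering data. If it returns True, the stream was opened but only silence was captured, which is consistent with a permissions problem.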

Most likely you are using macOS. I had a similar problem because I was using the Atom editor to run my Python code. Try running your code from the macOS Terminal (or PowerShell if you are on Windows). If a popup asking for microphone access appears on macOS, press OK. That's it! Your code will record fine. As a test, run the code below to check whether you can visualize the sound, and make sure to run it from the terminal (no editor or IDE).

import queue
import sys
from matplotlib.animation import FuncAnimation
import matplotlib.pyplot as plt
import numpy as np
import sounddevice as sd

# Lets define audio variables
# We will use the default PC or Laptop mic to input the sound

device = 0  # ID of the audio device to use (default: 0)
window = 1000  # length of the plotted window in milliseconds
downsample = 1  # keep every Nth sample (1 keeps all samples)
channels = [1]  # list of audio channels to plot
interval = 30  # plot update interval in milliseconds

# lets make a queue
q = queue.Queue()
# Please note that this sd.query_devices has an s in the end.
device_info =  sd.query_devices(device, 'input')
samplerate = device_info['default_samplerate']
length  = int(window*samplerate/(1000*downsample))

# lets print it 
print("Sample Rate: ", samplerate)

# Typical sample rate is 44100 so lets see.

# Ok so lets move forward

# Now we require a variable to hold the samples 

plotdata =  np.zeros((length,len(channels)))
# Lets look at the shape of this plotdata 
print("plotdata shape: ", plotdata.shape)
# So its vector of length 44100
# Or we can also say that its a matrix of rows 44100 and cols 1

# next is to make fig and axis of matplotlib plt
fig,ax = plt.subplots(figsize=(8,4))

# lets set the title
ax.set_title("PyShine")

# Make a matplotlib.lines.Line2D plot item of color green
# R,G,B = 0,1,0.29

lines = ax.plot(plotdata,color = (0,1,0.29))

# We will use an audio call back function to put the data in queue

def audio_callback(indata,frames,time,status):
    q.put(indata[::downsample,[0]])

# now we will use an another function 
# It will take frame of audio samples from the queue and update
# to the lines

def update_plot(frame):
    global plotdata
    while True:
        try: 
            data = q.get_nowait()
        except queue.Empty:
            break
        shift = len(data)
        plotdata = np.roll(plotdata, -shift,axis = 0)
        # Elements that roll beyond the last position are 
        # re-introduced 
        plotdata[-shift:,:] = data
    for column, line in enumerate(lines):
        line.set_ydata(plotdata[:,column])
    return lines
ax.set_facecolor((0,0,0))
# Lets add the grid
ax.set_yticks([0])
ax.yaxis.grid(True)

""" INPUT FROM MIC """

stream  = sd.InputStream( device = device, channels = max(channels), samplerate = samplerate, callback  = audio_callback)


""" OUTPUT """      

ani  = FuncAnimation(fig,update_plot, interval=interval,blit=True)
with stream:
    plt.show()

Save this file as voice.py in a folder (say, AUDIO). Then cd into the AUDIO folder from the terminal and run it using:

python3 voice.py

or

python voice.py

depending on the name of the Python executable in your environment.

user0814 Sep 15 2020 at 09:00

Using print(sd.query_devices()), I see a list of devices like the one below:

Microsoft Sound Mapper - Input, MME (2 in, 0 out)
Microphone (AudioHubNano2D_V1.5, MME (2 in, 0 out)
Internal Microphone (Conexant S, MME (2 in, 0 out)
...

However, if I use device = 0, I can still receive sound from the USB microphone, which is device number 1. Is it that, by default, all audio signal goes to the Sound Mapper? Does this mean that if I use device = 0, I will receive the audio signal from all audio inputs, and that if I want the input from one particular device only, I have to choose its number x as device = x?
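As far as I know, the Sound Mapper entry does not mix all inputs; on Windows it normally forwards to whichever recording device is set as the system default, which would explain why device = 0 picks up the USB microphone. To pin the recording to one device, pass its index explicitly. The sketch below filters a device list down to entries that can record. input_devices is a hypothetical helper, and it only assumes that each entry is a dict with 'name' and 'max_input_channels' keys, as in the list returned by sd.query_devices().

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that can record audio.

    `devices` is any sequence of dicts with 'name' and
    'max_input_channels' keys, such as the list returned by
    sounddevice.query_devices().
    """
    return [
        (index, info["name"])
        for index, info in enumerate(devices)
        if info["max_input_channels"] > 0
    ]


# Typical use on a machine with sounddevice installed (not run here):
#   import sounddevice as sd
#   for index, name in input_devices(sd.query_devices()):
#       print(index, name)
#   # Then open the stream on the chosen index, e.g. the USB microphone:
#   stream = sd.InputStream(device=1, channels=1, callback=audio_callback)
```

The position of an entry in the list is the same number that the device argument expects, so the printed index can be used directly.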