
Is it possible or recommended to use a sounddevice stream via fastapi? #569

Open
asusdisciple opened this issue Nov 8, 2024 · 1 comment

@asusdisciple

So I want to send an audio stream to my FastAPI server, which processes it and sends it back to the client. It's pretty straightforward to build a streaming endpoint in FastAPI with:

from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/audio-stream/")
async def audio_stream_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            # Receive raw audio data from the client
            audio_data = await websocket.receive_bytes()

            # Process the received audio data
            pa = streaming_service.process_stream(audio_data)

            # Send processed audio data back to the client
            await websocket.send_bytes(pa)
    except Exception as e:
        print("Connection closed:", e)
    finally:
        await websocket.close()

However, I wondered if it makes sense to use sounddevice streams in this scenario to take advantage of the optimizations that are certainly there compared to my naive implementation. The question would be how to implement it: use sounddevice on the client side, send the stream through FastAPI as bytes, and then take the audio_data from my endpoint and convert it back into a sounddevice stream for processing.

@mgeier
Member

mgeier commented Nov 8, 2024

If you want to send uncompressed PCM data, this should be fairly simple.

I would create an audio callback function that writes into a queue (see examples) and, from a different thread (e.g. the main thread), repeatedly check for data in the queue and send it to the server via WebSocket (no example yet, see #415).
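
For illustration, a minimal client-side sketch of that approach (not from this thread) could look roughly like this, assuming 16-bit mono PCM and the third-party websockets package for the WebSocket client; SAMPLERATE, BLOCKSIZE and the endpoint URL are placeholders:

import queue

import sounddevice as sd
from websockets.sync.client import connect  # third-party "websockets" package

SAMPLERATE = 48000
BLOCKSIZE = 1024          # frames per block
q = queue.Queue()

def callback(indata, frames, time, status):
    """Runs in the audio thread: just copy the raw bytes into the queue."""
    if status:
        print(status)
    q.put(bytes(indata))

with connect("ws://localhost:8000/audio-stream/") as ws:
    with sd.RawInputStream(samplerate=SAMPLERATE, blocksize=BLOCKSIZE,
                           channels=1, dtype='int16', callback=callback):
        while True:
            # Main thread: forward each recorded block to the server ...
            ws.send(q.get())
            # ... and receive the processed block back (see playback sketch below).
            processed = ws.recv()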

If you want to play back the manipulated signal that comes back from the server, I guess you'll need quite a long queue for buffering (depending on the network latency).
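
A possible shape for the playback side, again just a sketch under the same assumptions (each block received from the server is exactly BLOCKSIZE frames of 16-bit mono PCM):

import queue

import sounddevice as sd

SAMPLERATE = 48000
BLOCKSIZE = 1024

playback_q = queue.Queue()

def output_callback(outdata, frames, time, status):
    """Audio thread: play the next processed block, or silence if none arrived in time."""
    if status:
        print(status)
    try:
        outdata[:] = playback_q.get_nowait()
    except queue.Empty:
        outdata[:] = b'\x00' * len(outdata)   # underrun: output silence

out_stream = sd.RawOutputStream(samplerate=SAMPLERATE, blocksize=BLOCKSIZE,
                                channels=1, dtype='int16',
                                callback=output_callback)

# Let a generous number of received blocks accumulate in playback_q before
# calling out_stream.start(), so short network hiccups don't cause dropouts.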

If you want to encode/decode the signal before/after sending it over the network, this gets a bit more complicated (and will need additional libraries), but it should be possible, too.
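
As one illustration of that last point (a sketch, not something from this thread): each PCM block could be wrapped as a small self-contained FLAC message with the third-party soundfile library, at the cost of per-block header overhead; a proper low-latency codec such as Opus would need yet another library:

import io

import numpy as np
import soundfile as sf

def encode_block(samples, samplerate=48000):
    """Encode one block of int16 samples as a self-contained FLAC chunk."""
    buf = io.BytesIO()
    sf.write(buf, samples, samplerate, format='FLAC')
    return buf.getvalue()

def decode_block(data):
    """Decode a FLAC chunk back into an int16 NumPy array."""
    samples, _ = sf.read(io.BytesIO(data), dtype='int16')
    return samples

# FLAC is lossless, so a block survives the round trip unchanged:
block = (np.random.uniform(-0.1, 0.1, 1024) * 32767).astype('int16')
assert np.array_equal(decode_block(encode_block(block)), block)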
