ConvoCloud is an app that provides a real-time visualisation of the semantic content of spoken language. It achieves this by integrating the technologies of Automatic Speech Recognition, Natural Language Processing and Word Cloud generation.
This first version currently runs in Google Chrome Desktop.
With fixes and additions, it could serve as an assistive tool for people with hearing impairments, capturing the topics of a spoken conversation. Other possible use cases include lectures, meetings and other scenarios where a semantic summary of what is being said would be useful.
The importance (and therefore the size) of a word in the cloud is currently determined by a) frequency, b) syntactic word class and c) character length. This weighting is still being adjusted.
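As a rough illustration of how these three factors could be combined, here is a minimal sketch. It is not the app's actual formula: the function name `word_weights`, the class multipliers and the length scaling are all assumptions made up for this example, and the input is assumed to be a list of (word, word class) pairs.

```python
from collections import Counter

# Hypothetical multipliers per coarse word class (assumed values,
# not taken from the app).
CLASS_WEIGHTS = {"NOUN": 1.5, "VERB": 1.2, "ADJ": 1.0}
DEFAULT_CLASS_WEIGHT = 0.8


def word_weights(tagged_tokens):
    """Return {word: weight} for a list of (word, word_class) pairs."""
    # a) frequency: how often each word occurs
    counts = Counter(word for word, _ in tagged_tokens)

    # b) word class: keep the first class seen for each word
    classes = {}
    for word, word_class in tagged_tokens:
        classes.setdefault(word, word_class)

    weights = {}
    for word, freq in counts.items():
        class_factor = CLASS_WEIGHTS.get(classes[word], DEFAULT_CLASS_WEIGHT)
        # c) character length: scales from 1 up to 2, capped at 12 characters
        length_factor = 1 + min(len(word), 12) / 12
        weights[word] = freq * class_factor * length_factor
    return weights


if __name__ == "__main__":
    sample = [("cloud", "NOUN"), ("cloud", "NOUN"), ("run", "VERB")]
    print(word_weights(sample))
```

Multiplying the factors keeps frequency as the dominant signal while letting word class and length nudge the result; a weighted sum would be an equally plausible choice.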
This is a Flask web app that utilises Flask-Bootstrap. RequireJS is responsible for loading the JavaScript modules. To send speech input to the web server for processing, the app uses SocketIO.
To capture speech input for the cloud, the Web Speech API is used.
The Python scripts that process the raw speech input and create semantically useful tokens for visualisation make use of the Natural Language Toolkit. The processing steps are: tokenisation, removal of stop words and swear words, and lemmatisation.
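The order of these steps can be sketched as follows. This is a dependency-free stand-in, not the app's code and not the Natural Language Toolkit: the word lists are tiny placeholders, `naive_lemma` is a deliberately crude substitute for a real lemmatiser, and all names here are hypothetical.

```python
import re

# Placeholder lists (assumptions); a real pipeline would use much fuller ones.
STOP_WORDS = {"the", "a", "an", "and", "is", "are", "of", "to", "in", "it"}
BLOCKED_WORDS = {"damn"}


def naive_lemma(word):
    """Crude stand-in for a lemmatiser: strips a trailing plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def process_transcript(text, lemmatise=naive_lemma):
    """Tokenise, drop stop words and blocked words, then lemmatise."""
    # 1. Tokenisation: lowercase runs of letters and apostrophes
    tokens = re.findall(r"[a-z']+", text.lower())
    # 2. Removal of stop words and swear words
    kept = [t for t in tokens if t not in STOP_WORDS and t not in BLOCKED_WORDS]
    # 3. Lemmatisation (pluggable, so a real lemmatiser can be passed in)
    return [lemmatise(t) for t in kept]


if __name__ == "__main__":
    print(process_transcript("The clouds are in the sky"))
```

Passing the lemmatiser in as a parameter means a proper one can replace the naive version without touching the rest of the function.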
The bright and beautiful word clouds are rendered using the jQCloud library.