Reuters |
Machine Learning System Listens to Hours of Prison Recordings
A jail in the Midwest recently used a machine-learning system created by a London firm, Intelligent Voice, to listen to the thousands of hours of recordings generated there each month. Search engines are moving beyond the web and into the messy real world, and they are finding odd things going on. Every call into or out of a US prison is recorded, because some inmates use the phones to conduct illegal business with the outside world and it is important to know what has been said.
However, the recordings generate a large amount of audio, which is prohibitively expensive to monitor with human ears. No one at the prison was aware of the code word until the software began sifting through the calls. The software noticed the phrase "three-way" coming up again and again; it was one of the most common non-trivial words or phrases used. The prison officials were at first astonished by the overwhelming popularity of what they presumed to be a sexual reference.
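The article does not describe how the software ranked phrases, so the following is only a minimal sketch of the general idea: count words and two-word phrases across transcripts while ignoring trivial words. It assumes the calls have already been transcribed to text; the stopword list, the function names and the sample sentences are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real system would use a larger one.
STOPWORDS = {"a", "an", "the", "to", "of", "and", "i", "you", "me", "can",
             "do", "for", "with", "on", "is", "it", "my", "in"}

def tokenize(text):
    """Lowercase the text and split it into words, keeping internal hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def common_phrases(transcripts, top_n=5):
    """Return the top_n most frequent non-trivial words and two-word phrases."""
    counts = Counter()
    for transcript in transcripts:
        tokens = tokenize(transcript)
        # Single words that are not stopwords.
        counts.update(t for t in tokens if t not in STOPWORDS)
        # Adjacent word pairs in which neither word is a stopword.
        counts.update(f"{a} {b}" for a, b in zip(tokens, tokens[1:])
                      if a not in STOPWORDS and b not in STOPWORDS)
    return counts.most_common(top_n)

# Invented example transcripts, not real call data.
sample_calls = [
    "Can you do a three-way with my cousin",
    "I need a three-way to the lawyer",
    "Set up a three-way for me tonight",
]
print(common_phrases(sample_calls, top_n=3))
```

On this toy input "three-way" ranks first because it is the only non-trivial term that recurs in every call. A human reviewer would still have to work out what the phrase means, as the prison officials did.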
No One Aware of the Code
They then worked out that it was a code. Prisoners are only permitted to call a few previously agreed contacts. So if an inmate needed to speak to someone whose number was not on their list, they would call one of their contacts or a close relative and ask for a "three-way" with the person they wanted to talk to, code for dialling a third party into the call. No one running the phone surveillance at the prison had been aware of the code until the software flagged it in the recordings.
This illustrates the speed and scale of analysis that machine-learning processes bring to the world. Intelligent Voice initially developed the software for UK banks, which have to record their calls to comply with industry regulations. As with prisons, this generates a large amount of audio data that is hard to search through.
Nigel Cannings, the company's CEO, said that the breakthrough came when he decided to see what would happen if he pointed a machine-learning system at the waveform of the voice data, its pattern of spikes and troughs, instead of at the audio recording directly.
Harnessing Powerful Existing Techniques
The approach worked remarkably well. Training his system on this visual representation enabled him to harness powerful existing techniques designed for image classification. He commented that he had built this dialect classification system based on pictures of the human voice. The trick enabled his system to develop its own models for recognising speech patterns and accents, and these proved as good as the best hand-coded ones available, models built by dialect and computer experts.
Neil Glackin, a developer at Intelligent Voice, said that in their first run they were getting something like 88% accuracy. The software then trained itself to transcribe speech using recordings of US congressional hearings, matching the audio with the transcripts.
The power of machines that listen and watch is not that they perform better than human eyes or ears; they still tend to perform worse when challenged with data from the real world. As with all applications of computation, their power lies in scale, speed and the relative cheapness of processing.
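The article gives no detail on how the "pictures of the human voice" were produced or which classifier was used. As a rough sketch of the first step only, the code below rasterises a one-dimensional signal into a small grid of pixels, the kind of two-dimensional input an image classifier expects. The function name, the image size and the synthetic test signal are assumptions made for illustration, not the company's method.

```python
import math

def waveform_to_image(samples, width=64, height=32):
    """Rasterise a 1-D signal (values in [-1, 1]) into a height x width grid of 0/1 pixels.

    Each column covers one slice of time. The pixels between that slice's
    minimum and maximum amplitude are set to 1, as in a typical waveform plot.
    Assumes len(samples) >= width.
    """
    image = [[0] * width for _ in range(height)]
    chunk = len(samples) / width
    for col in range(width):
        start = int(col * chunk)
        end = max(start + 1, int((col + 1) * chunk))
        segment = samples[start:end]
        lo, hi = min(segment), max(segment)
        # Map amplitude +1 to row 0 (top) and amplitude -1 to the bottom row.
        top = round((1 - hi) / 2 * (height - 1))
        bottom = round((1 - lo) / 2 * (height - 1))
        for row in range(top, bottom + 1):
            image[row][col] = 1
    return image

# Synthetic stand-in for speech: a tone whose loudness rises over time.
n = 6400
signal = [(i / n) * math.sin(2 * math.pi * 40 * i / n) for i in range(n)]
image = waveform_to_image(signal)

# Print the picture: quiet on the left, tall spikes and troughs on the right.
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```

A grid like this could then be fed to an off-the-shelf image classifier in place of a photograph, which is the reuse of existing techniques the passage describes.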