A neural network listened to people's voices and painted their portraits

Neural networks keep surprising us with their abilities: ten years ago, would you have believed that a computer could "revive" portraits of Marilyn Monroe and Dostoevsky? Prepare to be surprised again, because researchers from the Massachusetts Institute of Technology have created a neural network, Speech2Face, that can paint portraits of people just by listening to their voices. The technology is still far from ideal, but its ability to determine a person's gender, ethnicity and age is impressive.

To train the neural network, the researchers used the AVSpeech dataset, which contains millions of short video clips of thousands of people speaking. The video and audio tracks were separated so that the system could study each type of material in as much detail as possible. In the first stage, the VGG-Face algorithm analyzed the video fragments and produced portraits of the people appearing in them, shown from the front with a neutral facial expression. Another part of the algorithm studied the spectrogram of each voice and applied further adjustments to the portraits, yielding a rough picture of each speaker.
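The article does not say how the voice spectrogram is computed, so the following is only a minimal sketch of the general idea: a short-time Fourier transform turns a sound wave into a table of frequency strengths over time, which a network can then treat like an image. The function name `spectrogram`, the frame length, the hop size and the test tone are all assumptions made for illustration, not the Speech2Face preprocessing.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    Hypothetical helper for illustration; the parameters are assumptions,
    not values taken from the Speech2Face project.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # A Hann window softens the frame edges to reduce spectral leakage.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Cut the signal into overlapping frames and window each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One row per time frame, one column per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))

# Synthetic stand-in for a voice recording: one second of a 1000 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
```

Each row of `spec` describes one short slice of time, and each column one frequency band. A real voice would produce a far richer pattern than this single tone.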

A neural network that creates portraits from a voice is already a reality

If you compare a person's face from the video with the version proposed by the algorithm, you will find many differences. However, the researchers say that they never set out to produce the closest possible likeness of a person: the tone and intonation of a human voice are influenced by many factors, so a perfect result was never within reach. The neural network does cope with what matters to the researchers, namely accurately determining gender, ethnicity and age.

The authors noted that the algorithm is currently weak at determining age, but they say they can improve its accuracy. They also found that the algorithm is better at recreating faces of European and Asian appearance, but this is only because the training videos did not contain equal numbers of people of different ethnicities.

Why do we need such a neural network?

How might this technology be useful in the future? One option is a service that automatically generates a user's virtual avatar from his or her voice. The research also has great scientific value: by studying the data, scientists can look for a relationship between a person's appearance and voice. You can listen to the voices and look at the portraits recreated from them on the project's website.

