What is happening in this picture? Often the answer is obvious to anyone who just looks at the image, but building a machine that can describe it is much more complicated than it seems.
Google has developed a system, built on TensorFlow, that can generate these captions. That system has now been released under an open source license, which means that if you need to describe a large set of images, you can use it freely in your own development.
More accurate, faster, and now also open source
The technology, developed by the Google Brain team, is truly remarkable. According to its leaders, the project, called “Show and Tell”, reaches 93.9% accuracy: the artificial intelligence engine produces a short text describing what is happening in the photo.
Reaching this accuracy required training the algorithm on captions written by people, which, among other things, means the descriptions are well-constructed sentences and not just combinations of object names.
The system is also much faster: previously, each training step took three seconds on an NVIDIA K20 GPU, but in this new open source release that time drops to 0.7 seconds.
The practical applications are numerous, but one is particularly striking: making the web more accessible for people who cannot see content and instead “hear” it. Now they can also know what is happening in the images included in that content.