A web-based application for exploring the atoms of object recognition.

Rapid advances in Artificial Intelligence (AI) over recent years have allowed computers to see the world much as humans do. This has supported a variety of new technologies that are about to change how we live our lives, from Facebook’s automated face recognition and Google’s image search to Tesla’s self-driving cars. Despite this progress, a major gap remains between how humans and machines perceive the world. Our group has found that humans and modern AI algorithms rely on different object parts when recognizing objects.

Humans and AI rely on different parts of objects during object recognition.

The figure above shows which parts of an image are most informative for people vs. AI algorithms when judging the category of an object in that image. This example illustrates that humans and our chosen AI algorithm rely on different sources of information: the glasses’ frames seem to be most helpful for humans, whereas the lenses seem most useful for the AI algorithm. This suggests that teaching the AI algorithm to focus on the same information as humans may offer a path towards the design of smarter seeing machines.

clickme is a game that provides us with data to bridge the gap between AI and human object recognition.

Every time you play clickme you provide researchers with information about what image locations are relevant to the human visual system when assessing category information. We use this data in the following ways:

  1. To train the AI algorithm to act more human-like when identifying important image parts.
  2. To measure the extent to which human-like identification improves the object recognition capability of modern AI algorithms.
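As a rough illustration of how click data could become a training signal, the sketch below accumulates click locations into a smoothed importance map over an image. Everything in it is an assumption made for illustration: the function name `clicks_to_importance_map`, the (row, column) click format, and the Gaussian smoothing are hypothetical and are not taken from the clickme codebase.

```python
import math


def clicks_to_importance_map(clicks, height, width, sigma=2.0):
    """Turn (row, col) clicks into a height x width map that sums to 1.

    Hypothetical helper: each click adds a Gaussian bump so that nearby
    pixels also receive some importance.
    """
    heat = [[0.0] * width for _ in range(height)]
    for click_row, click_col in clicks:
        for r in range(height):
            for c in range(width):
                squared_distance = (r - click_row) ** 2 + (c - click_col) ** 2
                heat[r][c] += math.exp(-squared_distance / (2.0 * sigma ** 2))

    total = sum(sum(row) for row in heat)
    if total == 0:
        # No clicks were recorded, so return the all-zero map unchanged.
        return heat
    # Normalize so maps from images with different click counts are comparable.
    return [[value / total for value in row] for row in heat]


# Toy example: two clicks close together and one isolated click.
example_clicks = [(10, 12), (11, 12), (30, 5)]
importance = clicks_to_importance_map(example_clicks, height=32, width=32)
```

A map like this could then be compared against the image regions an AI algorithm relies on, which is one plausible way to encourage the algorithm to attend to the same parts as people.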

Our findings, discussed in our [arXiv] paper, suggest that this approach will significantly improve the performance of computers at recognizing objects in images. See below for a graph, updated daily, showing how AI improves at object recognition when it is forced to pay attention to the same information as humans.

Tracking the progress of AI recognition

Here we record the accuracy of an AI algorithm (VGG 16) in recognizing images from a large database of object images (ImageNet). The trace in red is the accuracy of a typical VGG 16, and the trace in blue is its recognition accuracy after it is trained to pay attention to the same parts of objects as humans.
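For readers curious about what an accuracy comparison of this kind involves, here is a minimal sketch. It assumes accuracy means the fraction of images whose single predicted label matches the true label; the page does not say which accuracy measure is plotted. The function name and the labels are made up for illustration and are not real results.

```python
def top1_accuracy(predicted_labels, true_labels):
    """Fraction of images whose predicted label equals the true label."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Toy labels only, invented for this example (not measured data).
true_labels = ["glasses", "dog", "car", "cat"]
baseline_predictions = ["glasses", "cat", "car", "dog"]  # stands in for the red trace
guided_predictions = ["glasses", "dog", "car", "dog"]    # stands in for the blue trace

baseline_accuracy = top1_accuracy(baseline_predictions, true_labels)
guided_accuracy = top1_accuracy(guided_predictions, true_labels)
improvement = guided_accuracy - baseline_accuracy
```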