Human and non-human primate perception unfolds on multiple time scales: some visual attributes are identified in under 200 ms, supported by the ventral temporal cortex (VTC). More complex visual inferences, however, such as recognizing novel objects, require more time and multiple glances. The high-acuity fovea and frequent gaze shifts help compose object representations across fixations. While rapid visual processing is well characterized, far less is known about how sequences of views are integrated. The medial temporal cortex (MTC), particularly the perirhinal cortex (PRC), may support this process, enabling visual inferences beyond the capabilities of the VTC by integrating sequential visual inputs.
Stanford researchers assessed the role of the MTC in object perception by comparing human visual performance with performance decoded from macaque VTC recordings. While humans and VTC perform comparably at short viewing times (<200 ms), humans significantly outperform VTC-supported performance with prolonged viewing. The MTC plays a key role in this improvement: humans with MTC lesions behave like VTC models. Eye-tracking experiments further revealed that humans deploy sequential gaze patterns for complex visual inferences. Together, these findings suggest that the MTC integrates visuospatial sequences into compositional representations, extending object perception beyond what VTC alone supports.
The researchers used a dataset of object images presented in varied orientations and configurations to estimate the performance supported by VTC responses and compare it to human visual processing. They implemented a cross-validation strategy in which each trial presented two typical objects and one atypical (oddball) object in a random configuration. Neural responses from high-level visual cortex were then used to train a linear classifier to detect the oddball object. This process was repeated many times, and the results were averaged to yield a performance score for distinguishing each object pair.
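A minimal sketch of this decoding analysis is shown below, assuming a logistic-regression readout and synthetic stand-ins for the recorded population responses; the array shapes and classifier choice are illustrative assumptions, not details from the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical stand-in for VTC population responses: one row per trial,
# one column per recorded unit. A real analysis would load recorded data here.
n_trials, n_units = 600, 200
X = rng.normal(size=(n_trials, n_units))

# Label per trial: which of the three presented objects is the oddball.
y = rng.integers(0, 3, size=n_trials)

# Linear readout of the oddball from the population response, scored with
# cross-validation and averaged, mirroring the procedure described above.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=10)
print(f"mean oddball-decoding accuracy: {scores.mean():.3f}")
```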
For comparison, a CNN pre-trained for object classification was used as a computational stand-in for VTC. Images were pre-processed for the CNN, and the same experimental setup was followed: a classifier was trained on the model's features to detect the oddball object across many trials. The model's accuracy was then compared with predictions based on neural responses, providing insight into how well the model's visual processing reflected human inference.
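One plausible way to set up such a CNN feature extractor, sketched under the assumption of an ImageNet-pretrained ResNet-50 with penultimate-layer features (the specific architecture and layer are assumptions, not stated in the source):

```python
import torch
import torchvision.models as models
from PIL import Image

# Load a pretrained network and drop its classification head so the forward
# pass returns penultimate-layer features rather than class logits.
weights = models.ResNet50_Weights.IMAGENET1K_V2
model = models.resnet50(weights=weights)
model.fc = torch.nn.Identity()
model.eval()

# The resize/crop/normalize pipeline this checkpoint expects.
preprocess = weights.transforms()

@torch.no_grad()
def cnn_features(image: Image.Image) -> torch.Tensor:
    """Return the 2048-d penultimate-layer response for one stimulus image."""
    return model(preprocess(image).unsqueeze(0)).squeeze(0)

# Features extracted this way would replace the synthetic X in the decoding
# sketch above, so model and neural data are scored with the same linear readout.
```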
The study compares human performance in two viewing regimes: time-restricted (less than 200 ms) and time-unrestricted (self-paced). In time-restricted tasks, participants must rely on immediate visual processing, as there is no opportunity for sequential sampling across eye movements. A three-way visual discrimination task and a match-to-sample paradigm were used to assess performance. The results showed a strong correlation between time-restricted human performance and the performance predicted from macaque high-level VTC. With unlimited viewing time, however, participants significantly exceeded the performance supported by VTC and by the VTC-based computational models, suggesting that extended viewing recruits neural mechanisms beyond the VTC.
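As an illustration of how the two regimes can be compared quantitatively, the sketch below correlates per-pair human accuracy with VTC-decoded accuracy. All values are randomly generated placeholders shaped only to mimic the reported pattern; they are not data from the study:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_pairs = 50  # hypothetical number of object pairs

# Placeholder accuracies, generated to mimic the reported pattern only.
vtc_acc = rng.uniform(0.5, 0.9, size=n_pairs)                        # VTC decoder, per pair
human_fast = np.clip(vtc_acc + rng.normal(0, 0.03, n_pairs), 0, 1)   # <200 ms viewing
human_unltd = np.clip(vtc_acc + 0.15 + rng.normal(0, 0.05, n_pairs), 0, 1)  # self-paced

r_fast, _ = pearsonr(vtc_acc, human_fast)
print(f"time-restricted humans vs VTC: r = {r_fast:.2f}")            # tracks VTC closely
print(f"mean unlimited-time gain over VTC: {np.mean(human_unltd - vtc_acc):.3f}")
```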
The study reveals complementary neural systems in visual object perception, where VTC enables rapid visual inferences within 100 ms, whereas MTC supports more complex inferences across sequential saccades. Time-constrained tasks align with VTC performance, but with more time, humans outperform VTC capabilities, reflecting MTC’s integration of visuospatial sequences. The findings emphasize the role of MTC in compositional operations, which extend beyond memory to perception. Models of human vision, such as convolutional neural networks, approximate VTC but fail to capture MTC’s contributions, suggesting the need for biologically plausible models that integrate both systems.
Take a look at the Paper. All credit for this research goes to the researchers of this project.
Sana Hassan, a Consulting Intern at Marktechpost and a dual degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, she brings a fresh perspective to the intersection of AI and real-life solutions.