ASDNet: A robust involution‐based architecture for diagnosis of autism spectrum disorder utilising eye‐tracking technology

Introduction

Autism Spectrum Disorder (ASD) is a condition that affects how people communicate and interact with others. It can be hard to diagnose ASD, especially in young children, because the symptoms vary widely and may not be obvious. However, early diagnosis and intervention can improve the outcomes and quality of life for people with ASD.

One of the methods that can help diagnose ASD is eye-tracking, which measures how people look at different stimuli, such as faces, objects, or scenes. Eye-tracking can reveal how people with ASD pay attention to social cues, process information, and express emotions. However, eye-tracking data can be complex and noisy, and it requires sophisticated analysis to extract meaningful features and patterns.

That’s why a team of researchers from Bangladesh, Canada, Saudi Arabia, the UK, and Australia developed a deep learning model called ASDNet, which classifies eye-tracking images to detect ASD with high accuracy and reliability. ASDNet is built on a neural network operation called involution, which is simpler and more efficient than the convolution used in a conventional convolutional neural network (CNN).

What is involution and how does it work?

Involution is a type of neural network layer proposed by Li et al. in 2021. Unlike convolution, which applies the same filter at every location of the input, involution generates a different filter for each location from the features found at that location. This makes involution more flexible and adaptive, so it can capture local variations and dependencies in the input.
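To make the idea of location-specific filters concrete, here is a minimal sketch in plain Python. It is an illustration only, not the ASDNet implementation: the function name involution2d, the one-layer kernel generator gen_weights, the single kernel group, and the zero padding are all assumptions chosen to keep the example short.

```python
def involution2d(x, gen_weights, k=3):
    """Single-group involution over an H x W x C nested list (stride 1, zero padding).

    gen_weights is a (k*k) x C matrix acting as a hypothetical one-layer
    kernel generator: it maps the feature vector at one pixel to k*k weights.
    """
    h, w, c = len(x), len(x[0]), len(x[0][0])
    pad = k // 2
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # 1) Generate a k*k kernel from the features at this pixel only.
            kernel = [sum(wt * v for wt, v in zip(row, x[i][j]))
                      for row in gen_weights]
            # 2) Apply that kernel to the neighbourhood, sharing it across channels.
            for u in range(k):
                for v in range(k):
                    ii, jj = i + u - pad, j + v - pad
                    if 0 <= ii < h and 0 <= jj < w:
                        kv = kernel[u * k + v]
                        for ch in range(c):
                            out[i][j][ch] += kv * x[ii][jj][ch]
    return out


# Usage: a 3 x 3 single-channel image of ones with an all-ones generator.
image = [[[1.0] for _ in range(3)] for _ in range(3)]
generator = [[1.0] for _ in range(9)]
result = involution2d(image, generator)
```

The kernel is recomputed inside the loop over pixels, which is what separates this from a convolution, where one kernel would be defined once outside the loop and reused everywhere.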

Involution can be seen as sitting between convolution and self-attention, two popular techniques in deep learning. Convolution corresponds to the case where the filters are fixed and do not depend on the location. Self-attention can be viewed as a more elaborate case, where the filter at each location is computed from the input itself by comparing that location with its neighbours. Involution can also combine the advantages of both, by using a mixture of fixed and dynamic filters.

How does ASDNet use involution to diagnose ASD?

ASDNet is a deep learning model that uses involution to classify eye-tracking images and diagnose ASD. The model takes three types of eye-tracking images as input: scanpaths, heatmaps, and fixation maps. A scanpath shows the sequence of eye movements over time. A heatmap shows how gaze duration is distributed across the image. A fixation map shows where the eye fixations landed.
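The three image types can be thought of as different views of the same list of fixations. The sketch below is a simplified illustration, not the rendering pipeline used for the ASDNet datasets (the source does not describe one). The function name rasterise_fixations and the (x, y, duration) record layout are assumptions, and real heatmaps are usually smoothed, which is omitted here.

```python
def rasterise_fixations(fixations, width, height):
    """Turn (x, y, duration) fixation records into three simple representations.

    Returns (scanpath, heatmap, fixation_map):
      scanpath     - ordered list of (x, y) points, i.e. the gaze sequence
      heatmap      - height x width grid of accumulated gaze duration
      fixation_map - height x width grid with 1 where any fixation landed
    """
    scanpath = []
    heatmap = [[0.0] * width for _ in range(height)]
    fixation_map = [[0] * width for _ in range(height)]
    for x, y, duration in fixations:
        # Ignore samples that fall outside the stimulus area.
        if 0 <= x < width and 0 <= y < height:
            scanpath.append((x, y))
            heatmap[y][x] += duration
            fixation_map[y][x] = 1
    return scanpath, heatmap, fixation_map


# Usage: three fixations on a 4 x 3 stimulus, two of them at the same spot.
records = [(1, 0, 0.2), (2, 2, 0.5), (1, 0, 0.3)]
path, heat, fix = rasterise_fixations(records, width=4, height=3)
```

In this toy example the scanpath keeps the revisit to (1, 0) as a separate step, the heatmap adds the two durations together, and the fixation map records only that the location was looked at.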

ASDNet consists of four involutional blocks, each followed by a max-pooling layer. The first block uses a mixture of fixed and dynamic filters, while the other three blocks use only dynamic filters. The output of the last block is flattened and fed into a fully connected layer, which produces a binary classification of ASD or non-ASD.
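A quick way to see what four block-plus-pooling stages do to an image is to trace the tensor shape through them. The sketch below assumes, purely for illustration, a 224 x 224 x 3 input, 2 x 2 max-pooling, and involution blocks that preserve both spatial size and channel count. None of these values come from the source, and trace_shapes is a hypothetical helper.

```python
def trace_shapes(height, width, channels, n_blocks=4, pool=2):
    """Return the (height, width, channels) shape after each block + max-pool stage.

    Assumes each involution block keeps the shape unchanged and each
    max-pooling layer divides height and width by `pool` (floor division).
    """
    shapes = []
    for _ in range(n_blocks):
        height //= pool
        width //= pool
        shapes.append((height, width, channels))
    return shapes


def flattened_size(shape):
    """Number of features fed to the fully connected layer after flattening."""
    height, width, channels = shape
    return height * width * channels


# Usage: an assumed 224 x 224 RGB eye-tracking image.
stages = trace_shapes(224, 224, 3)
features = flattened_size(stages[-1])
```

Under these assumptions the spatial size halves at every stage (112, 56, 28, 14), so the fully connected layer sees 14 x 14 x 3 = 588 values per image.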

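ASDNet also uses a technique called Monte Carlo Dropout, which keeps dropout active during inference so that some neurons are randomly dropped on every forward pass. Running the same input through the model several times then gives slightly different predictions, and the spread of those predictions serves as an estimate of the model's uncertainty. This helps assess the confidence and reliability of the model and flag predictions that may be false positives or false negatives.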
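The mechanics of Monte Carlo Dropout can be sketched with a toy model. The example below is a stand-in, not ASDNet: it uses a single logistic unit with dropout applied to its inputs, and the names predict_with_dropout and mc_dropout, the drop rate, and the number of passes are all assumptions for illustration.

```python
import math
import random
import statistics


def predict_with_dropout(features, weights, bias, drop_rate, rng):
    """One stochastic forward pass of a toy logistic unit with input dropout."""
    keep = 1.0 - drop_rate
    z = bias
    for f, w in zip(features, weights):
        if rng.random() < keep:
            # Inverted dropout: rescale kept inputs so the expected sum is unchanged.
            z += (f / keep) * w
    return 1.0 / (1.0 + math.exp(-z))


def mc_dropout(features, weights, bias, drop_rate=0.5, passes=100, seed=0):
    """Repeat the stochastic pass and summarise the predictions.

    Returns (mean probability, standard deviation); a larger standard
    deviation means the passes disagree more, i.e. higher uncertainty.
    """
    rng = random.Random(seed)
    probs = [predict_with_dropout(features, weights, bias, drop_rate, rng)
             for _ in range(passes)]
    return statistics.mean(probs), statistics.pstdev(probs)


# Usage: hypothetical features and weights for a single sample.
mean_prob, spread = mc_dropout([0.8, -1.2, 0.5], [1.5, 0.7, -2.0], bias=0.1)
```

The mean can be thresholded to give the ASD or non-ASD decision, while the spread indicates how much that decision should be trusted; with a drop rate of zero every pass is identical and the spread collapses to zero.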

How well does ASDNet perform and why is it better than other models?

ASDNet was trained and tested on two publicly available datasets of eye-tracking images from children with and without ASD. The first dataset was collected by the University of Minnesota, and the second by the University of Cambridge. ASDNet achieved an average accuracy of 97.54% on the first dataset and 97.62% on the second, outperforming state-of-the-art image classification models such as ResNet, DenseNet, and MobileNet, as well as earlier work on ASD diagnosis using eye-tracking.

ASDNet is better than other models for several reasons. First, it uses involution, which is well suited to eye-tracking images because it can capture the spatial variations and dependencies in gaze patterns. Second, it has fewer parameters and requires fewer computational resources than other models, making it faster and more efficient. Third, it uses Monte Carlo Dropout to attach an uncertainty estimate to each prediction, which makes its output easier to trust or to question.

What are the implications and limitations of ASDNet?

ASDNet is a promising tool that can help diagnose ASD using eye-tracking images, which are easy to collect and non-invasive. ASDNet can provide a fast and accurate diagnosis, as well as an uncertainty estimate, which can complement the existing clinical methods and facilitate early intervention. ASDNet can also be applied to other domains that involve eye-tracking, such as education, marketing, or gaming.

However, ASDNet also has some limitations that need to be addressed. First, it relies on a single modality, eye-tracking, which may not capture the full spectrum of ASD symptoms and behaviors. Second, it is trained on limited and imbalanced datasets, so it may not generalize to different populations and settings. Third, it has not been validated by clinical experts, which limits confidence in its use in clinical practice.

Therefore, future work should aim to improve ASDNet by incorporating more modalities, such as facial expressions, speech, or gestures, to provide a more comprehensive diagnosis. Moreover, more data from diverse and representative samples should be collected and annotated, to enhance the robustness and generalizability of the model. Finally, more collaboration with clinical professionals should be established, to evaluate and refine the model and its applications.

FAQ

How does the model compare with other methods that use eye-tracking for ASD diagnosis?

The model outperforms other eye-tracking methods for ASD diagnosis in both accuracy and efficiency. It achieves higher accuracy than methods built on hand-crafted eye-tracking features, such as gaze duration, fixation count, saccade length, and pupil size. It also achieves higher accuracy than methods that apply other deep learning models, such as CNNs, LSTMs, and GANs, to eye-tracking images. In addition, it has fewer parameters and a faster inference time than those image-based methods, making it more suitable for real-time and low-resource applications.

What are the advantages and disadvantages of using eye-tracking images for ASD diagnosis?

The advantages of using eye-tracking images for ASD diagnosis are that they can provide a non-invasive, objective, and quantitative measure of the social and cognitive impairments associated with ASD. Eye-tracking images can capture how people with ASD look at different stimuli, such as faces, objects, or scenes, and how they pay attention to social cues, process information, and express emotions. Eye-tracking images can also reveal the differences and similarities between people with and without ASD, and help identify the subtypes and severity of ASD.

The disadvantages of using eye-tracking images for ASD diagnosis are that they can be affected by various factors, such as the type of stimulus, the task, the individual, and the environment. Eye-tracking images can also be noisy and complex, and require sophisticated analysis to extract meaningful features and patterns. Eye-tracking images can also be insufficient to capture the full spectrum of ASD symptoms and behaviors, and may need to be combined with other modalities, such as facial expressions, speech, or gestures.

How can eye-tracking images be used for other purposes besides ASD diagnosis?

Eye-tracking images can be used for other purposes besides ASD diagnosis, such as education, marketing, or gaming. In education, they can help show how people learn, remember, and solve problems, and support feedback and guidance that improve learning outcomes. In marketing, they can help show how people perceive and respond to different products, advertisements, or brands, and provide insights for optimizing marketing strategies. In gaming, they can help show how people engage and interact with different games, genres, or platforms, and suggest enhancements to the gaming experience.

What are the ethical and legal issues related to using eye-tracking images for ASD diagnosis?

The ethical and legal issues related to using eye-tracking images for ASD diagnosis are similar to those related to using any personal data for medical purposes. These issues include privacy, consent, security, ownership, access, and use of the data. For example, eye-tracking images may contain sensitive personal information, such as identity, preferences, emotions, or health conditions, which could be misused if leaked to or accessed by unauthorized parties. Collecting, storing, and analyzing eye-tracking images also requires informed consent from the users or their guardians. Finally, eye-tracking images raise questions about who owns, controls, and benefits from the data, and how the data can be shared, transferred, or deleted.

Therefore, using eye-tracking images for ASD diagnosis should follow the ethical and legal principles and regulations for data protection, such as transparency, accountability, fairness, and respect for the rights and dignity of the users. Using eye-tracking images for ASD diagnosis should also involve the stakeholders, such as the users, the clinicians, the researchers, and the policymakers, in the design, development, and evaluation of the technology.

Source:

https://www.repository.cam.ac.uk/items/79ba09f1-c39e-48d2-bda9-fde79685c28e