Image segmentation, which includes the segmentation of organs, abnormalities, bones, and other structures, is a key problem in medical image analysis, and deep learning has made considerable progress in this area. However, collecting and curating medical images is costly and time-consuming, particularly because trained radiologists must often provide meticulous mask annotations, which makes it very difficult to develop and train segmentation models for new medical imaging data and tasks. Foundation models and zero-shot learning could greatly reduce these problems.
The field of natural language processing has already benefited from the paradigm-shifting capabilities of foundation models: neural networks trained on large amounts of data, usually without traditional supervised labels, with inventive training objectives that enable zero-shot learning on new data in various contexts. The recently released Segment Anything Model is a foundation model that has demonstrated impressive zero-shot segmentation performance on a variety of natural image datasets, and researchers at Duke University put it to the test on medical imaging data.
The Segment Anything Model (SAM) is designed to segment an object of interest in an image in response to prompts provided by the user. A prompt can be a single point, a set of points (including an entire mask), a bounding box, or text, and the model is expected to return a suitable segmentation mask even when the prompt is ambiguous. The main idea behind this approach is that the model has learned the concept of an object and can therefore segment any object it is pointed to. As a result, there is a good chance that it will perform well under the zero-shot learning regime and segment objects of types it has never seen before. Beyond this prompt-based task formulation, the SAM authors used a particular model architecture and an exceptionally large dataset, as explained below.
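To make the prompting mechanism concrete, here is a minimal sketch of point- and box-prompted inference with the publicly released segment_anything package; the checkpoint filename, image path, and prompt coordinates are placeholders, not values from the study.

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM checkpoint (local path is a placeholder).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

# Any RGB image; "slice.png" stands in for e.g. a rendered MRI slice.
image = cv2.cvtColor(cv2.imread("slice.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # the image embedding is computed once and reused

# Prompt 1: a single foreground point (label 1 = foreground, 0 = background).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[128, 96]]),
    point_labels=np.array([1]),
    multimask_output=True,  # ambiguous prompts yield several candidate masks
)
best_mask = masks[np.argmax(scores)]

# Prompt 2: the same call accepts an XYXY bounding box instead.
box_masks, _, _ = predictor.predict(box=np.array([64, 48, 192, 160]),
                                    multimask_output=False)
```

Because a single point can plausibly refer to several nested objects, SAM's multimask output returns a few candidates with quality scores, and the caller keeps the best one.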
SAM was trained in stages while its dataset of images and accompanying object masks (SA-1B) was being developed, in a three-step process. First, human annotators clicked on objects in a series of images and manually refined masks produced by SAM, which at that point had been trained on openly available datasets. Second, to increase the diversity of objects, annotators were asked to segment objects for which SAM had not yet produced confident masks. The final set of masks was created automatically by prompting SAM with a collection of points scattered in a grid over each image and keeping only the reliable and stable masks.
SAM is built to require one or more prompts in order to produce a segmentation mask. Technically, the model can be run without any prompt, but the researchers do not expect this to be useful in medical imaging, because an image usually contains much more than the one structure of interest. Because SAM is prompt-based, it cannot be used in the same way as most segmentation models in medical imaging, where the input is simply an image and the output is a segmentation mask, or several masks, for the required structure(s). The researchers suggest three key ways SAM could be applied to medical image segmentation.
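Since a prompt is always required, one hypothetical way to slot SAM into a conventional image-in, mask-out pipeline is to fix the prompt in advance, for instance a bounding box around the region where the structure of interest is expected. The adapter below is our own illustration of this contrast, not a method from the paper.

```python
import numpy as np
from segment_anything import SamPredictor

def segment_with_fixed_prompt(predictor: SamPredictor,
                              image: np.ndarray,
                              box: np.ndarray) -> np.ndarray:
    """Expose a conventional image -> mask interface by baking in a box prompt."""
    predictor.set_image(image)
    masks, scores, _ = predictor.predict(box=box, multimask_output=True)
    return masks[np.argmax(scores)]  # keep SAM's highest-scoring candidate

# Hypothetical usage: the box is a made-up region for illustration.
# mask = segment_with_fixed_prompt(predictor, rgb_image,
#                                  np.array([100, 80, 300, 260]))
```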
The first two involve using the Segment Anything model as it is, either to annotate data with a human in the loop or to generate masks for training new models; neither requires any adjustment to SAM. The third is to develop and fine-tune a SAM-like model specifically for medical imaging; a rough sketch of what that might look like appears below. Each strategy is then explained in turn. Note that since text-based prompting in SAM is still at the proof-of-concept stage, the researchers do not comment on it here.
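As a rough illustration of that third strategy, the sketch below freezes SAM's heavy image encoder and fine-tunes only its lightweight mask decoder on box-prompted medical images. This is a minimal sketch under our own assumptions, not the paper's method: the checkpoint path, the medical_loader data loader, and the choice of Dice loss are all placeholders.

```python
import torch
import torch.nn.functional as F
from segment_anything import sam_model_registry

# Load a pretrained SAM backbone ("vit_b" and the checkpoint path are placeholders).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")

# Freeze the heavy image encoder; only the small mask decoder is updated.
for p in sam.image_encoder.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(sam.mask_decoder.parameters(), lr=1e-5)

def dice_loss(pred_logits, target, eps=1e-6):
    """Soft Dice loss between predicted mask logits and a binary ground truth."""
    pred = pred_logits.sigmoid().flatten(1)
    target = target.flatten(1)
    inter = (pred * target).sum(1)
    return 1 - (2 * inter + eps) / (pred.sum(1) + target.sum(1) + eps)

# `medical_loader` is a hypothetical loader yielding one image per step:
# a 1x3xHxW tensor (long side already resized to 1024 px), a 1x1xHxW binary
# mask, and a 1x4 box prompt given in the resized coordinate frame.
for image, gt_mask, box in medical_loader:
    with torch.no_grad():
        embedding = sam.image_encoder(sam.preprocess(image))
        sparse, dense = sam.prompt_encoder(points=None, boxes=box, masks=None)
    low_res_logits, _ = sam.mask_decoder(
        image_embeddings=embedding,
        image_pe=sam.prompt_encoder.get_dense_pe(),
        sparse_prompt_embeddings=sparse,
        dense_prompt_embeddings=dense,
        multimask_output=False,
    )
    # Undo SAM's internal resizing/padding so predictions align with the label.
    pred = sam.postprocess_masks(low_res_logits,
                                 input_size=image.shape[-2:],
                                 original_size=gt_mask.shape[-2:])
    loss = dice_loss(pred, gt_mask.float()).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Freezing the encoder keeps the general-purpose visual features learned on SA-1B while adapting only the cheap decoding stage to medical data, which is one plausible design choice among several.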
Semi-automatic annotation ("human in the loop"). One of the main obstacles to building segmentation models in this field is the human annotation of medical images, which often consumes valuable clinician time. SAM could be used as a tool for faster annotation here, and there are several ways to do so. In the most basic scenario, a human user prompts SAM, which produces a mask that the user can accept or modify, potentially over several refinement iterations. Another option is the "segment everything" mode, sketched below, in which SAM is prompted with points spaced evenly in a grid over the image and produces masks for many objects at once, which the user can then label, select, and/or edit. Many more workflows are possible; these are only starting points.
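The "segment everything" workflow maps directly onto the package's SamAutomaticMaskGenerator, which prompts SAM with a regular grid of points and keeps only confident, stable masks for an annotator to review. A minimal sketch, assuming the `sam` model and RGB `image` loaded earlier; the thresholds shown are the library defaults.

```python
from segment_anything import SamAutomaticMaskGenerator

generator = SamAutomaticMaskGenerator(
    sam,                          # the SAM model loaded earlier
    points_per_side=32,           # density of the point grid over the image
    pred_iou_thresh=0.88,         # drop masks the model itself scores poorly
    stability_score_thresh=0.95,  # drop masks unstable under re-thresholding
)

proposals = generator.generate(image)  # one dict per candidate mask
for p in proposals:
    binary_mask = p["segmentation"]    # HxW boolean array to label or edit
```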
Check out the Paper.
Aneesh Tickoo is a consulting intern at MarktechPost. She is currently pursuing her bachelor's degree in Information Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. She spends most of her time working on projects aimed at harnessing the power of machine learning. Her research interest is image processing, and she is passionate about building solutions around it. She loves connecting with people and collaborating on interesting projects.