Ductal carcinoma in situ (DCIS) is a type of preinvasive tumor that sometimes progresses to become a highly deadly form of breast cancer. It accounts for about 25 percent of all breast cancer diagnoses.
Because doctors find it difficult to determine the type and stage of DCIS, patients with the condition are often overtreated. To address this problem, an interdisciplinary team of researchers from MIT and ETH Zurich developed an artificial intelligence model that can identify different stages of DCIS from an inexpensive and easy-to-obtain image of breast tissue. Their model shows that both the state and arrangement of cells in a tissue sample are important in determining the stage of DCIS.
Because tissue images are so easy to obtain, the researchers were able to create one of the largest data sets of its kind, which they used to train and test their model. When they compared their predictions to a pathologist's conclusions, they found clear agreement in many cases.
In the future, the model could be used as a tool to help doctors speed up diagnosis of simpler cases without the need for labor-intensive testing, giving them more time to evaluate cases where it's less clear whether DCIS will become invasive.
“We took the first step toward understanding that we should be looking at the spatial organization of cells when diagnosing DCIS, and now we’ve developed a technique that is scalable. From here, we really need a prospective study. Working with a hospital and taking this all the way to the clinic will be an important step forward,” says Caroline Uhler, a professor in the Department of Electrical Engineering and Computer Science (EECS) and the Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard and an investigator at MIT’s Laboratory for Information and Decision Systems (LIDS).
Uhler, co-corresponding author of a paper on this research, is joined by lead author Xinyi Zhang, a graduate student in EECS and the Eric and Wendy Schmidt Center; co-corresponding author G.V. Shivashankar, professor of mechanogenomics at ETH Zurich jointly with the Paul Scherrer Institute; and others at MIT, ETH Zurich, and the University of Palermo in Italy. The open-access research was published on July 20 in Nature Communications.
Combining images with AI
Between 30 and 50 percent of patients with DCIS develop a highly invasive stage of the cancer, but researchers don't know of any biomarkers that could tell doctors which tumors will progress.
Researchers can use techniques such as multiplex staining or single-cell RNA sequencing to determine the stage of DCIS in tissue samples. However, these tests are too expensive to be widely performed, Shivashankar explains.
In previous work, these researchers showed that an inexpensive imaging technique known as chromatin staining could be as informative as the much more expensive single-cell RNA sequencing.
For this research, they hypothesized that combining this single stain with a carefully designed machine learning model could provide the same information about cancer stage as more expensive techniques.
First, they created a dataset containing 560 images of tissue samples from 122 patients at three different stages of the disease. They used this dataset to train an AI model that learns a representation of the state of each cell in a tissue sample image, which it uses to infer the stage of a patient's cancer.
However, not all cells are indicative of cancer, so the researchers had to group them together in a meaningful way.
They designed the model to create groups of cells in similar states, identifying eight states that are important markers of DCIS. Some cell states are more indicative of invasive cancer than others. The model determines the proportion of cells in each state in a tissue sample.
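The grouping step described above can be sketched in a few lines. This is a minimal, hypothetical illustration only: it uses random stand-in data and ordinary k-means clustering in place of the paper's actual learned representation, keeping just the two ideas the article states, namely that cells are grouped into eight states and that each tissue sample is summarized by the proportion of its cells in each state.

```python
# Hypothetical sketch of the cell-state grouping described in the article.
# The per-cell feature vectors, k-means clustering, and sample assignments
# here are stand-ins; the paper's actual model may differ substantially.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy stand-in for per-cell features extracted from chromatin-stained images
cells = rng.normal(size=(500, 16))          # 500 cells, 16 features each
sample_ids = rng.integers(0, 5, size=500)   # which tissue sample each cell came from

# Group cells into 8 states, matching the eight informative states in the article
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(cells)
states = kmeans.labels_

def state_proportions(sample_id):
    """Fraction of a sample's cells in each of the 8 states."""
    s = states[sample_ids == sample_id]
    return np.bincount(s, minlength=8) / len(s)

# One 8-dimensional proportion vector per tissue sample
features = np.stack([state_proportions(i) for i in range(5)])
print(features.shape)  # (5, 8)
```

Each row of `features` sums to 1, so a downstream classifier sees each sample as a composition of cell states rather than as raw cells.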
Organization matters
“But in cancer, the organization of cells also changes. We found that it is not enough to know the proportions of cells in each state. You also need to understand how cells are organized,” Shivashankar says.
With this knowledge, they designed the model to consider the proportion and arrangement of cell states, which significantly increased its accuracy.
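One simple way to encode "which cells are close to which other cells," as described above, is to count how often each pair of states occurs as nearest neighbors. The sketch below is an assumption-laden illustration, not the paper's method: the positions and state labels are random stand-ins, and the nearest-neighbor co-occurrence matrix is just one plausible spatial feature.

```python
# Hypothetical sketch: a spatial feature capturing which cell states
# tend to sit next to which others. Positions and labels are toy data;
# the paper's actual spatial encoding may differ.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
n_cells, n_states = 300, 8
positions = rng.uniform(0, 100, size=(n_cells, 2))   # cell centers in the image
states = rng.integers(0, n_states, size=n_cells)     # per-cell state labels

# For each cell, find its nearest neighbor (k=2 returns the cell itself first)
tree = cKDTree(positions)
_, idx = tree.query(positions, k=2)

# Count how often state a has a nearest neighbor in state b, then normalize
co = np.zeros((n_states, n_states))
for i, j in zip(idx[:, 0], idx[:, 1]):
    co[states[i], states[j]] += 1
co /= co.sum()

# Flatten to a 64-dim vector that could be appended to the state proportions
spatial_features = co.ravel()
print(spatial_features.shape)  # (64,)
```

Appending a vector like this to the per-sample state proportions gives a classifier both pieces of information the article says matter: how many cells are in each state, and how those states are arranged relative to one another.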
“What was interesting for us was to see how much spatial organization matters. Previous studies had shown that cells that are close to the mammary duct are important. But it is also important to consider which cells are close to which other cells,” says Zhang.
When they compared the results of their model to samples evaluated by a pathologist, there was clear agreement in many cases. In cases that were not as clear-cut, the model could provide information about features of a tissue sample, such as the organization of cells, that a pathologist could use to make decisions.
This versatile model could also be adapted for use in other types of cancer or even neurodegenerative diseases, an area that researchers are also currently exploring.
“We have shown that with the right AI techniques, this simple staining can be very effective. There is still a lot of research to be done, but we need to take the organization of cells into account in further studies,” says Uhler.
This research was supported, in part, by the Eric and Wendy Schmidt Center at the Broad Institute, ETH Zurich, the Paul Scherrer Institute, the Swiss National Science Foundation, the U.S. National Institutes of Health, the U.S. Office of Naval Research, the MIT Jameel Clinic for Machine Learning and Health, the MIT-IBM Watson AI Lab, and a Simons Investigator Award.