CHNSpec Technology (Zhejiang)Co.,Ltd chnspec@colorspec.cn 86--13732210605
In traditional pathological diagnosis, a breast cancer tissue sample goes through more than ten processing steps, including fixation, embedding, sectioning, and staining. From sample delivery to report issuance, this often takes several hours or longer. During intraoperative frozen section analysis, the patient often remains under anesthesia while waiting for the result, so shortening this wait matters for surgical safety.
A study recently published in "Scientific Reports" combines a "label-free, stain-free" imaging approach with deep learning algorithms to address this clinical pain point.
When pathological images "lose" color
Pathology images are usually presented in the blue-purple tones of H&E staining, with clear boundaries between the cell nucleus and cytoplasm. Microscopic Hyperspectral Imaging (MHSI) instead scans tissue sections without any staining and records spectral information in 128 bands from visible light to near-infrared (397-1032 nm).
The direct challenge of this "stain-free" state is that the images lack morphological contrast, which makes them hard for the human eye to interpret directly. The advantage of hyperspectral data is that it records a continuous spectral curve for every pixel, and different biochemical components (such as proteins, lipids, and nucleic acids) show distinct reflection characteristics at specific wavelengths. Extracting diagnostically useful information from such high-dimensional, morphologically weak data has become a new problem in computational pathology.
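To make the data layout concrete, here is a minimal sketch of a hyperspectral cube in which every pixel carries a 128-value spectrum. It assumes the 128 bands are evenly spaced across 397-1032 nm (the article gives only the range and the band count), which would put band centers (1032 - 397) / 127 = 5 nm apart. The 64 x 64 image size, the random values, and the helper names `spectrum_at` and `nearest_band` are hypothetical and used for illustration only.

```python
import numpy as np

NUM_BANDS = 128                        # band count stated in the article
WL_MIN_NM, WL_MAX_NM = 397.0, 1032.0   # spectral range stated in the article

# Assumption: band centers are evenly spaced across the stated range.
wavelengths = np.linspace(WL_MIN_NM, WL_MAX_NM, NUM_BANDS)
band_step = wavelengths[1] - wavelengths[0]   # (1032 - 397) / 127 nm

# Hypothetical cube (height x width x bands); random values stand in
# for real measurements.
rng = np.random.default_rng(0)
cube = rng.random((64, 64, NUM_BANDS), dtype=np.float32)


def spectrum_at(cube, row, col):
    """Return the full spectral curve recorded at one pixel."""
    return cube[row, col, :]


def nearest_band(wavelength_nm):
    """Index of the band whose center is closest to the given wavelength."""
    return int(np.argmin(np.abs(wavelengths - wavelength_nm)))


pixel_spectrum = spectrum_at(cube, 10, 20)   # 128 values for one pixel
green_band = nearest_band(550.0)             # band nearest to 550 nm
```

Each pixel is therefore a 128-dimensional vector rather than the three RGB values of a stained image, which is where the "high-dimensional" description comes from.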
Transforming "section diagnosis" into "multi-instance learning"
The research team constructed a hyperspectral dataset containing 468 tissue sections from 60 breast cancer patients. Unlike traditional methods that make predictions on individual local fields of view, the researchers framed pathological diagnosis as a Multi-Instance Learning (MIL) problem: an entire tissue section is treated as a "bag," and the spectral cubes collected from 20 different regions on the section are the "instances" within the bag. The model must combine the information from all instances to output a diagnosis for the entire section.
This approach is closer to how pathologists actually read slides: they first browse globally at low magnification, then focus on suspicious areas to reach an overall judgment.
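The bag-and-instance framing can be sketched as a small data structure plus the simplest possible bag classifier. This is not the study's model: mean pooling followed by logistic regression is a stand-in baseline, and `SectionBag`, `mean_pool_predict`, the 16-dimensional features, and the random parameters are all hypothetical. Only the figure of 20 regions per section comes from the article.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SectionBag:
    """One tissue section: a bag of region-level instances with one label."""
    instances: np.ndarray   # shape (num_regions, feature_dim)
    label: int              # section-level label; regions carry no labels


def mean_pool_predict(bag, weights, bias):
    """Baseline bag classifier: average the instances, then apply a
    logistic regression to the averaged vector."""
    bag_embedding = bag.instances.mean(axis=0)
    logit = float(bag_embedding @ weights + bias)
    return 1.0 / (1.0 + np.exp(-logit))


rng = np.random.default_rng(0)
NUM_REGIONS, FEATURE_DIM = 20, 16   # 20 is from the article; 16 is arbitrary
bag = SectionBag(instances=rng.normal(size=(NUM_REGIONS, FEATURE_DIM)), label=1)
weights = rng.normal(size=FEATURE_DIM)   # placeholder, untrained parameters
prob = mean_pool_predict(bag, weights, bias=0.0)

# A bag is a set: shuffling the regions must not change the prediction.
shuffled = SectionBag(instances=rng.permutation(bag.instances), label=bag.label)
prob_shuffled = mean_pool_predict(shuffled, weights, bias=0.0)
```

Mean pooling weights every region equally. Attention-based pooling replaces the plain average with learned per-region weights, so that suspicious regions can dominate the section-level decision.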
Multi-level "attention" mechanism
To handle the characteristics of hyperspectral data, the team proposed a Multi-Scale Hierarchical Attention Network (MS-HAN), whose core design has three key levels:
1. Multi-scale feature extraction. Borrowing from the Inception structure, the network applies convolution kernels of different sizes in parallel at the same spatial resolution, capturing multi-granularity information that ranges from subtle spectral differences to local texture patterns.
2. Dual attention mechanism. Spectral channel attention first explicitly models the dependencies between bands and gives higher weights to the more informative bands; spatial attention then generates a two-dimensional heat map that locates diagnostically relevant regions of cell morphology without relying on pixel-level labels.
3. Hierarchical aggregation and prototype learning. To deal with the high intra-class variability of biological spectra, the model introduces a set of learnable "prototype vectors," soft-assigns instance features to these prototypes, and prevents mode collapse by constraining the entropy of the prototype usage distribution. Finally, a self-attention mechanism models the dependencies between different regions within the section, and attention pooling produces the representation of the entire section.
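Several of the building blocks named in this list can be sketched in a few lines of NumPy. The code below is not the MS-HAN implementation: the article does not give the exact layer definitions, so the squeeze-and-excitation form of the band attention, the dot-product similarity used for soft assignment, the entropy formula, and the tanh-scored attention pooling are common choices assumed here for illustration. All function names, dimensions, and random weights are hypothetical. The multi-scale convolutions and the spatial attention map are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spectral_channel_attention(cube, w1, w2):
    """Band reweighting in squeeze-and-excitation style (assumed form).
    cube: (H, W, B); w1: (B, R); w2: (R, B)."""
    squeezed = cube.mean(axis=(0, 1))                     # (B,) mean per band
    hidden = np.maximum(squeezed @ w1, 0.0)               # ReLU bottleneck
    band_weights = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid gate
    return cube * band_weights, band_weights


def soft_assign(features, prototypes, temperature=1.0):
    """Soft-assign instances to prototypes by dot-product similarity.
    features: (N, D); prototypes: (K, D) -> (N, K), each row sums to 1."""
    return softmax(features @ prototypes.T / temperature, axis=1)


def usage_entropy(assignments, eps=1e-12):
    """Entropy of the average prototype usage; largest (log K) when all
    prototypes are used equally, near 0 when one prototype dominates."""
    usage = assignments.mean(axis=0)
    return float(-(usage * np.log(usage + eps)).sum())


def self_attention(features, wq, wk, wv):
    """Single-head scaled dot-product attention across regions."""
    q, k, v = features @ wq, features @ wk, features @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=1)   # (N, N)
    return attn @ v


def attention_pool(features, v, w):
    """Score each region, softmax the scores, take the weighted sum."""
    scores = np.tanh(features @ v) @ w   # (N,)
    alpha = softmax(scores, axis=0)      # (N,) weights over regions
    return alpha @ features, alpha       # (D,) section-level embedding


rng = np.random.default_rng(0)
B, N = 128, 20          # bands and regions per section, from the article
D, K, H = 32, 8, 16     # feature, prototype, and hidden sizes: arbitrary

cube = rng.random((16, 16, B))
_, band_weights = spectral_channel_attention(
    cube, rng.normal(size=(B, 8)), rng.normal(size=(8, B)))

features = rng.normal(size=(N, D))   # stand-in for per-region embeddings
assignments = soft_assign(features, rng.normal(size=(K, D)))
entropy = usage_entropy(assignments)

context = self_attention(
    features, *(rng.normal(size=(D, D)) * D ** -0.5 for _ in range(3)))
section_embedding, alpha = attention_pool(
    context, rng.normal(size=(D, H)), rng.normal(size=H))
```

In a training loop, one way to discourage collapse onto a few prototypes would be to add a term that rewards higher `usage_entropy`; the article does not specify the exact form of the constraint. The `alpha` weights indicate which of the 20 regions contributed most to the section-level embedding.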
Trained with weak supervision from section-level labels only, the model achieved an accuracy of 86.7% and an AUC of 0.92 on an independent test set (94 sections), a statistically significant improvement over mainstream MIL baseline models such as TransMIL and CLAM.
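For readers less familiar with the two metrics, the sketch below shows how accuracy and AUC are computed from section-level labels and predicted scores. The six labels and scores are invented for illustration and are not data from the study. The function names and the 0.5 decision threshold are likewise assumptions.

```python
import numpy as np


def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of sections whose thresholded score matches the label."""
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    return float((y_pred == np.asarray(y_true)).mean())


def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive section
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]   # every positive-negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


# Toy example only: made-up labels and scores, not results from the paper.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

acc = accuracy(y_true, y_score)   # 4 of 6 sections classified correctly
auc = roc_auc(y_true, y_score)    # 8 of 9 positive-negative pairs ordered correctly
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.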
Skipping the staining stage and compressing the time cost
The aim of this research is not to replace pathologists but to explore a workflow of "optical sectioning" plus "AI primary screening." Omitting the staining step reduces the cost of reagents and consumables and, more importantly, significantly compresses the time from sampling to digital diagnosis. For time-sensitive scenarios such as intraoperative frozen sections, this "cut-scan-analyze" mode is expected to shorten the time patients wait under anesthesia.
Of course, this research is still at the proof-of-concept stage. The 60-case, single-center dataset is relatively small, and the model's performance on preparation artifacts, low cell density, or rare molecular subtypes still needs external validation on multi-center, large-sample data. In addition, hyperspectral imaging hardware is expensive, and moving from the laboratory into routine pathology departments will still require work at the engineering and health economics levels.