Article Commentary

Artificial intelligence-assisted ultrasound-guided focused ultrasound therapy: a feasibility study

Article: 2260127 | Received 26 Jun 2023, Accepted 12 Sep 2023, Published online: 25 Sep 2023

Abstract

Objectives

Focused ultrasound (FUS) therapy has emerged as a promising noninvasive solution for tumor ablation. Accurate monitoring and guidance of ultrasound energy are crucial for effective FUS treatment. Although ultrasound (US) imaging is a well-suited modality for FUS monitoring, US-guided FUS (USgFUS) faces challenges in achieving precise monitoring, leading to unpredictable ablation shapes and a lack of quantitative monitoring. The demand for precise FUS monitoring becomes even greater when complete tumor ablation requires controlling multiple sonication procedures.

Methods

To address these challenges, we propose an artificial intelligence (AI)-assisted USgFUS framework, incorporating an AI segmentation model with B-mode ultrasound imaging. This method labels the ablated regions distinguished by the hyperechogenicity effect, potentially bolstering FUS guidance. We evaluated our proposed method using the Swin-Unet AI architecture, conducting experiments with a USgFUS setup on chicken breast tissue.

Results

Our results showed a 93% accuracy in identifying ablated areas marked by the hyperechogenicity effect in B-mode imaging.

Conclusion

Our findings suggest that AI-assisted ultrasound monitoring can significantly improve the precision and control of FUS treatments, marking a crucial step toward the development of more effective FUS treatment strategies.

1. Introduction

Traditionally, surgery has been considered the standard treatment for various solid tumors. However, advancements in technology have led to a shift away from open surgery toward less invasive methods [Citation1]. In recent years, focused ultrasound (FUS) therapy, also known as high intensity focused ultrasound (HIFU), has emerged as a potential solution for noninvasive tumor ablation [Citation2]. FUS is an image-guided therapy that uses a focused ultrasound beam to target and treat cancerous tumors. The principles of FUS are similar to those of conventional ultrasound imaging, where ultrasonic waves pass harmlessly through living tissue. However, when the energy of the ultrasound beam is focused at high intensity, it causes a rise in temperature, leading to tissue heating [Citation3]. During FUS treatment, the temperature in the focal region can quickly rise above 80 °C, causing effective cell death even with short exposure times [Citation4]. FUS is performed on an outpatient basis, requires no incisions, and can result in rapid recovery, with the potential to improve the lives of millions of patients [Citation5]. Currently, FUS is offered at 799 treatment sites worldwide to treat various solid tumors, including those of the prostate [Citation6–8], liver [Citation9–11], breast [Citation12–14], kidney [Citation15–17], and bone [Citation18–20].

A successful tumor treatment with FUS requires precise guidance and delivery of ultrasound energy to the target site [Citation21]. This can be accomplished by integrating FUS with a medical imaging modality. Diagnostic ultrasound offers an acceptable solution for FUS guidance, allowing for real-time monitoring of therapeutic results and control of the FUS procedure for complete tumor ablation [Citation22]. Ultrasound-guided FUS (USgFUS) relies on hyperechogenicity, whereby the ablated tissue and therapeutic effects are evaluated through changes in the grayscale of conventional B-mode images [Citation23]. Ultrasound imaging offers several benefits for FUS guidance, including lower cost and greater accessibility, faster treatment times, and a strong correlation between the observed ultrasound changes and the region of necrosis in the tissue [Citation24]. USgFUS was first proposed in the early days of diagnostic ultrasound in the 1970s and remains one of the leading image-guided techniques for clinical FUS treatment [Citation24]. Over the past decade, several USgFUS devices have received regulatory approval for solid tumor treatment, including the Sonablate® 500 (Focus Surgery Inc., Indianapolis, IN) and Ablatherm® HIFU (EDAP TMS S.A., Vaulx-en-Velin, France) for the treatment of prostate cancer, and the Chongqing HAIFU (Chongqing Haifu Technology Co. Ltd., Chongqing, China) and FEP-BY HIFU System (China Medical Technologies Inc., Beijing, China) for various extracorporeal FUS treatments [Citation25].

Despite its clinical approval, USgFUS has not yet achieved widespread clinical acceptance or reached its full market potential. A significant challenge of USgFUS lies in the precision of monitoring with conventional B-mode imaging [Citation21]. Specifically, it is not possible to measure temperature rise during FUS treatment with diagnostic ultrasound, resulting in a lack of quantitative measurement of thermal dose and of precise imaging of the ablated volume [Citation22]. The success of monitoring during USgFUS treatment therefore relies entirely on the accuracy of detecting grayscale changes on B-mode images, even though ultrasound imaging has limited spatial and contrast resolution [Citation22]. Moreover, the appearance of hyperechoic spots during USgFUS treatment requires overheating the focal area to generate boiling bubbles, resulting in ablated areas with unpredictable shapes. It has also been shown that the hyperechoic spots appearing on B-mode images can fade after FUS exposure ends [Citation26]. Due to the lack of precise monitoring in current USgFUS devices, the physicians performing the treatment require extensive training, and therapeutic results are highly dependent on their expertise and experience.

The need for an accurate method for monitoring FUS becomes even more critical when complete tumor ablation requires the control and monitoring of multiple sonication procedures to cover the entire tumor [Citation27]. The absence of a precise monitoring mechanism increases the risk of cancerous cells surviving in the spaces between small ablated regions [Citation1]. To mitigate this, physicians often target and ablate a volume larger than the tumor and repeat the sonication process to ensure complete ablation. This can result in long treatment times and major side effects, such as collateral damage to healthy tissue and rectal wall burns during FUS prostate treatment [Citation28,Citation29]. Consequently, these limitations have prevented USgFUS from realizing its full market potential despite its overall benefits to patients. For instance, only 6% of 148,000 eligible patients with prostate cancer received FUS treatment in 2020, according to the Focused Ultrasound Foundation.

Recognizing the exceptional capabilities of ultrasound imaging, numerous studies have aimed to develop new methods and systems for USgFUS to enhance monitoring accuracy and improve therapeutic results. Various researchers have investigated alternative techniques for USgFUS, such as local harmonic motion (LHM) [Citation30], amplitude-modulated (AM) harmonic motion imaging [Citation31], ultrasound elastography [Citation32], contrast-enhanced ultrasonography [Citation33], and ultrasonic Nakagami imaging [Citation34]. Although these methods have shown promising potential for improving USgFUS procedures, it is crucial to acknowledge that they have not yet received clinical approval [Citation24]. At present, clinical USgFUS predominantly relies on conventional ultrasound B-mode imaging to provide feedback during ablation procedures [Citation25]. Consequently, there is an urgent need for new monitoring methods that enable physicians to administer USgFUS more accurately and efficiently, utilizing ultrasound hyperechogenicity.

In this paper, we present the AI-assisted USgFUS concept. This innovative method harnesses a trained AI segmentation framework alongside diagnostic ultrasound, enabling precise identification and labeling of ablated regions within the ultrasound B-mode images captured during FUS monitoring (Figure 1). To implement this approach, we developed an AI framework based on the Swin-Unet architecture. The developed AI framework was employed in conjunction with ultrasound B-mode imaging to meet the demand for real-time and quantitative monitoring of the ablated area during FUS ablation procedures. To assess the feasibility of our proposed AI-assisted USgFUS framework, we conducted an ex vivo experimental study using a USgFUS setup and chicken breast tissue. We first trained a supervised AI framework on 90% of the experimental data and then evaluated the real-time labeling performance of the trained AI network on the remaining 10%. The results presented in this paper demonstrate the accuracy and feasibility of using AI-assisted USgFUS for precise and quantitative monitoring of FUS treatment.

Figure 1. A comparative illustration of conventional USgFUS technology and the proposed AI-assisted USgFUS for precise monitoring of FUS treatment. The traditional approach relies on grayscale variations in ultrasound B-mode images to visualize the ablated area, while our novel method employs AI-assisted labeling and real-time highlighting for quantitative and accurate assessment of FUS treatment progress.


2. Materials and methods

2.1. FUS experimental setup

We designed and developed an ex vivo experimental setup using chicken breast tissue to evaluate the feasibility of the AI-assisted USgFUS method, as depicted in Figure 2. The experiments were specifically designed to obtain ultrasound B-mode images before, during, and after each FUS procedure, for both training and testing the AI algorithms that are part of our method. The setup consists of two primary components, as illustrated in Figure 3: a robotic-assisted FUS unit, which was responsible for accurate positioning of the focal region within the target tissue, and a robotic-assisted imaging unit, which used a linear ultrasound transducer to closely monitor the FUS procedure with a high level of precision.

Figure 2. Ex vivo USgFUS experimental setup. The arrangement comprises two main components: a robotic-assisted FUS unit responsible for precise positioning of the focal region within the chicken tissue, and a robotic-assisted imaging unit employing a linear ultrasound transducer to meticulously monitor the FUS procedure with high precision.


Figure 3. Schematic representation of the FUS system and its components. The system employs a single-element transducer (Model H-101, Sonic Concepts) for generating FUS, driven by a function generator (33210 A, Hewlett Packard), an RF amplifier (A-500, Electronic Navigation Industries), and an acoustic matching network (Sonic Concepts), all integrated into the FUS unit. A 3D motion system (5410 CNC Deluxe Mill, Sherline) is attached to the FUS transducer for accurate control of the focal region within the target tissue. B-mode images of the focal area are captured using a linear ultrasound probe (Ultrasonic Scanner Accuvix XQ) for FUS monitoring, with a robotic arm (Panda, Franka Emika GmbH) providing precise control of imaging.


The FUS system used a single-element transducer (Model H-101, Sonic Concepts, Woodinville, US) with an active diameter of 64.0 mm and a focal depth of 51.74 mm for generating FUS. The transducer was driven at the desired frequency and power using a function generator (33210A Waveform Generator, 10 MHz, Hewlett Packard, US), an RF amplifier (A-500, 60 dB fixed gain, Electronic Navigation Industries, Rochester, US), and an acoustic matching network (Sonic Concepts, Woodinville, US), all integrated into the FUS unit. A 3D motion system (5410 CNC Deluxe Mill, Sherline, Vista, US) was attached to the FUS transducer to enable precise control of the focal region inside the targeted tissue. A linear ultrasound probe (Ultrasonic Scanner Accuvix XQ, Korea) was used to capture B-mode images of the focal area for FUS monitoring. A robotic arm (Panda, Franka Emika GmbH, Germany) held the ultrasound probe for precise control of imaging. The FUS transducer was driven over a range of exposure times (5–20 s), acoustic powers (50–150 W), and frequencies (1–1.3 MHz) for the USgFUS experiments. The chicken breast tissue samples used in this study were procured postmortem from a local provider. The experiments did not involve any live animals, and the handling of the tissue was performed in compliance with applicable guidelines and regulations.

During our experiments, the sonications were strategically positioned side by side with a separation of 0.5 cm. This configuration was chosen based on the characteristics of the FUS transducer we employed. The transducer produces a focal region that spans approximately 1.2 cm along the beam axis and 1.4 mm in the transverse direction. To guarantee distinct, non-overlapping sonications, we placed the second sonication 0.5 cm away from the first in the transverse direction. Given the transducer's focal dimensions, this deliberate spacing ensured that the sonications would remain separate and not intersect.

2.2. AI analysis

This section provides a comprehensive overview of the supervised AI algorithm that we developed and trained to work in conjunction with ultrasound imaging for real-time monitoring of FUS procedures. The algorithm's framework and workflow are founded on a transformer model [Citation35] and comprise two main components, as depicted in Figure 4.

Figure 4. AI algorithm used in conjunction with US imaging. B-mode images captured before and after FUS treatment undergo a preprocessing stage, in which they are clipped to a predefined surgical region based on prior knowledge. These images are then resized to a 224 × 224 grayscale resolution. Subsequently, the images and their pixel-wise difference are concatenated to generate a three-channel image. This image serves as the input for the Swin-Unet model, which produces the segmented region of the FUS ablation as its output.


The first component is a preprocessing stage that accepts two B-mode images captured before and after the FUS sonication. These images are clipped based on prior knowledge of the sonication's coordinates, allowing the model to focus on the operational region and avoid processing extraneous areas. After clipping, both images are resized to a standardized size of 224 by 224. The pixel-wise difference between the pre- and post-sonication images is then computed, and negative pixel values are set to zero. The two images and their difference are concatenated to form a three-channel image, which serves as input to the Swin-Unet [Citation35] model for segmenting the region of interest.
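As an illustration, a minimal sketch of this preprocessing step is given below, assuming the B-mode frames are available as grayscale NumPy arrays; the `roi` crop window format and the final per-sample scaling are our assumptions, standing in for the prior knowledge of the sonication coordinates described above.

```python
import numpy as np
import cv2  # used only for resizing


def preprocess_pair(pre_img, post_img, roi, size=224):
    """Build the three-channel Swin-Unet input from pre/post-sonication B-mode frames.

    pre_img, post_img : 2-D grayscale B-mode frames (NumPy arrays)
    roi               : (y0, y1, x0, x1) crop window around the sonication site
                        (hypothetical format standing in for prior knowledge of
                        the sonication coordinates)
    """
    y0, y1, x0, x1 = roi
    pre = cv2.resize(pre_img[y0:y1, x0:x1].astype(np.float32), (size, size))
    post = cv2.resize(post_img[y0:y1, x0:x1].astype(np.float32), (size, size))

    # Pixel-wise change caused by the sonication; negative values are set to zero
    diff = np.clip(post - pre, 0.0, None)

    # Concatenate pre, post and difference into a three-channel image
    x = np.stack([pre, post, diff], axis=0)  # shape: (3, 224, 224)

    # Simple per-sample scaling to [0, 1] (illustrative; the exact scaling is not specified)
    return (x - x.min()) / (x.max() - x.min() + 1e-8)
```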

2.2.1. Swin-Unet

Swin-Unet combines the strengths of Swin Transformers and Unet [Citation35], featuring an encoder-decoder structure alongside skip connections [Citation36]. The core building block of this architecture is the Swin Transformer block, a hierarchical block that recursively applies a set of operations to a feature map in a multi-scale fashion. The Swin Transformer block employed in this study consists of five main operations: a windowed multi-head self-attention module (W-MSA), a shifted-window multi-head self-attention module (SW-MSA), a set of two feedforward layers (MLP) with GELU activations [Citation37], and layer normalization (LN) [Citation38]. These operations, together with skip connections [Citation39], a down-sampling operation, and an up-sampling operation, form the main structure of Swin-Unet.

The SW-MSA mechanism is a modified form of the standard self-attention mechanism used in transformers, where a sliding window is used to limit the attention to a local region around each position, and the windows are shifted to allow for global coverage. The feedforward layers consist of a series of linear transformations and activation functions, which are applied to the output of the attention mechanism. The process of down-sampling reduces the spatial resolution of the feature map by a factor of two, while conversely, the up-sampling operation increases it by the same factor of two. By applying a series of down-sampling and up-sampling operations to the feature map at multiple scales, Swin-Unet can extract features at different levels of granularity.

For an input image of size $W \times H \times 3$ and a patch size of $4 \times 4$, the encoder receives $\frac{W}{4} \times \frac{H}{4}$ non-overlapping patches, each with a feature dimension of $4 \times 4 \times 3 = 48$. With an input image resolution of $224 \times 224 \times 3$ and a patch size of $4 \times 4$, the encoder's input size is therefore $56 \times 56 \times 48$. The first component of the encoder is a linear embedding module that maps the input patches into a pre-defined dimension referred to as $C$. The output of the embedding module is then passed to the Swin blocks. Using notation similar to [Citation40], if the input of the $l$-th Swin block is denoted by $z^{l-1}$, the output of the block is formulated as follows:

(1) $\hat{z}^{l} = \text{W-MSA}\big(\text{LN}(z^{l-1})\big) + z^{l-1}$

(2) $z^{l} = \text{MLP}\big(\text{LN}(\hat{z}^{l})\big) + \hat{z}^{l}$

(3) $\hat{z}^{l+1} = \text{SW-MSA}\big(\text{LN}(z^{l})\big) + z^{l}$

(4) $z^{l+1} = \text{MLP}\big(\text{LN}(\hat{z}^{l+1})\big) + \hat{z}^{l+1}$
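To make the residual structure of Equations (1)–(4) concrete, the sketch below renders a pair of consecutive Swin blocks in PyTorch. The `nn.MultiheadAttention` modules are simplified stand-ins for W-MSA and SW-MSA (window partitioning, the cyclic shift, and the relative position bias are omitted), so this is a structural illustration rather than the authors' implementation.

```python
import torch.nn as nn


class SwinBlockPair(nn.Module):
    """Two consecutive Swin blocks (W-MSA then SW-MSA), following Eqs. (1)-(4)."""

    def __init__(self, dim, num_heads, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.w_msa = nn.MultiheadAttention(dim, num_heads, batch_first=True)   # stand-in for W-MSA
        self.norm2 = nn.LayerNorm(dim)
        self.mlp1 = nn.Sequential(nn.Linear(dim, mlp_ratio * dim), nn.GELU(),
                                  nn.Linear(mlp_ratio * dim, dim))
        self.norm3 = nn.LayerNorm(dim)
        self.sw_msa = nn.MultiheadAttention(dim, num_heads, batch_first=True)  # stand-in for SW-MSA
        self.norm4 = nn.LayerNorm(dim)
        self.mlp2 = nn.Sequential(nn.Linear(dim, mlp_ratio * dim), nn.GELU(),
                                  nn.Linear(mlp_ratio * dim, dim))

    def forward(self, z):                    # z: (batch, tokens, dim), i.e. z^{l-1}
        h = self.norm1(z)
        z = self.w_msa(h, h, h)[0] + z       # Eq. (1): residual around W-MSA
        z = self.mlp1(self.norm2(z)) + z     # Eq. (2): residual around MLP
        h = self.norm3(z)
        z = self.sw_msa(h, h, h)[0] + z      # Eq. (3): residual around SW-MSA (window shift omitted)
        z = self.mlp2(self.norm4(z)) + z     # Eq. (4): residual around MLP
        return z
```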

Swin-Unet employs the same self-attention mechanism as the Swin Transformer [Citation40], which is calculated as follows:

(5) $\text{Attention}(Q, K, V) = \text{SoftMax}\!\left(\frac{QK^{T}}{\sqrt{d}} + B\right)V$

where $Q$, $K$, and $V$ are the query, key, and value matrices, $d$ is the query/key dimension, and $B$ is the relative position bias.

In the encoder stage of Swin-Unet, the Swin blocks are followed by a patch merging block that concatenates the features of each set of 2×2 neighboring patches. This operation results in a reduction of the feature resolution by a factor of 2 after each merging block. To establish a connection between the encoder and decoder stages, Swin-Unet utilizes the same approach as vanilla Unet and creates a bottleneck by adding two consecutive Swin blocks while maintaining the feature dimensionality and resolution.

The decoder stage of the Swin-Unet architecture mirrors the encoder stage, utilizing Swin blocks to extract deep abstract features. However, instead of a patch merging layer, a patch expanding block is incorporated to increase the feature resolution through an up-sampling operation. This block reconfigures adjacent feature maps into a higher-resolution map while halving the feature dimension. For instance, the initial patch expanding layer applies a linear transformation that doubles the feature dimension prior to up-sampling. To preserve spatial information and fuse shallow and deep features, skip connections link corresponding Swin blocks in the two stages. These connections enable effective information flow, preventing the loss of pertinent information caused by the merging blocks.
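A compact sketch of the patch merging and patch expanding operations described above is shown below; the tensor layout (batch, height, width, channels) and the module names are our own choices for illustration, not the authors' code.

```python
import torch
import torch.nn as nn


class PatchMerging(nn.Module):
    """Concatenate each 2x2 patch neighbourhood, then reduce: (H, W, C) -> (H/2, W/2, 2C)."""

    def __init__(self, dim):
        super().__init__()
        self.reduce = nn.Linear(4 * dim, 2 * dim)

    def forward(self, x):                                   # x: (B, H, W, C)
        tl, tr = x[:, 0::2, 0::2, :], x[:, 0::2, 1::2, :]   # the four members of each 2x2 block
        bl, br = x[:, 1::2, 0::2, :], x[:, 1::2, 1::2, :]
        return self.reduce(torch.cat([tl, tr, bl, br], dim=-1))  # (B, H/2, W/2, 2C)


class PatchExpanding(nn.Module):
    """Double the spatial resolution and halve the channels: (H, W, C) -> (2H, 2W, C/2)."""

    def __init__(self, dim):
        super().__init__()
        self.expand = nn.Linear(dim, 2 * dim)               # doubles the feature dim before up-sampling

    def forward(self, x):                                   # x: (B, H, W, C)
        B, H, W, C = x.shape
        x = self.expand(x)                                  # (B, H, W, 2C)
        x = x.view(B, H, W, 2, 2, C // 2)                   # split the 2C features into a 2x2 spatial block
        return x.permute(0, 1, 3, 2, 4, 5).reshape(B, 2 * H, 2 * W, C // 2)
```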

2.2.2. Model’s training

During the training phase, the Swin-Unet parameters are learned using a set of masks labeled by human experts. Prior to input to the model, the images undergo preprocessing steps such as cropping, resizing, normalization, and augmentation to ensure that the data are suitable for training. Several augmentation techniques, including horizontal and vertical flipping and affine transformations, are applied to the cropped and resized images while avoiding significant perturbations of the data distribution.

To optimize the weight parameters of the model, a linear combination of two widely used loss functions, binary cross-entropy and soft Dice loss, is employed [Citation41]. This combination yields strong results in terms of segmentation quality, per-pixel accuracy, generalization, and robustness against adversarial attacks [Citation42]. The model is trained using a learning rate of 2e-4, a batch size of 8, a ReduceLROnPlateau scheduler, a maximum of 200 epochs, and the AdamW optimizer. The PyTorch library is used to train the model on a single RTX 3080 Ti GPU with 16 GB of memory. These settings enable the model to effectively learn and generalize the segmentation task.
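A condensed PyTorch sketch of this training configuration is shown below. The `SwinUnet` model and `train_loader` (assumed to yield batches of eight three-channel images with binary masks) are placeholders, and the equal 0.5/0.5 weighting of the two losses and the scheduler patience are our assumptions; the paper states only that a linear combination of binary cross-entropy and soft Dice loss was used.

```python
import torch
import torch.nn as nn


def soft_dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss on sigmoid probabilities for a binary ablation mask."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return (1 - (2 * inter + eps) / (union + eps)).mean()


def train(model, train_loader, device="cuda", epochs=200):
    bce = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", patience=10)  # patience assumed

    model.to(device)
    for epoch in range(epochs):
        model.train()
        running = 0.0
        for image, mask in train_loader:      # image: (B, 3, 224, 224), mask: (B, 1, 224, 224)
            image, mask = image.to(device), mask.to(device)
            logits = model(image)
            loss = 0.5 * bce(logits, mask) + 0.5 * soft_dice_loss(logits, mask)  # assumed equal weights
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            running += loss.item()
        scheduler.step(running / max(len(train_loader), 1))  # reduce LR when the epoch loss plateaus
```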

3. Results

In this section, we present the results of our study, which focused on assessing the effectiveness of our proposed AI segmentation framework for real-time labeling of ablated areas in ultrasound B-mode images. Our approach enables both qualitative and quantitative assessments of the proposed AI-assisted USgFUS method. The segmentation framework was developed using a deep learning algorithm and trained on a limited dataset of ultrasound B-mode images to accurately identify the precise volume of tissue ablated by FUS sonication.

Figure 5 presents a series of ultrasound B-mode images captured at different acoustic power levels (80, 90, and 100 W). Each image set includes the B-mode images corresponding to pre- and post-FUS sonication, as well as AI-labeled B-mode images depicting the ablated area. For each power level, the images are organized into two rows, each corresponding to a separate FUS sonication event. The top row presents ultrasound images from the initial 15 s FUS sonication, while the bottom row shows images from a subsequent 15 s sonication. Crucially, these sonications were applied sequentially, not simultaneously. After the first sonication was completed and its ablation detected by the AI segmentation model, a 30-s cooling period was observed. The transducer was then shifted 0.5 cm, positioning the next sonication below the initial ablated region. We emphasize that our model focuses on real-time detection of the most recent ablation, which is why the first lesion does not reappear in the AI segmentation after the second sonication. Our results demonstrate that the proposed AI segmentation framework identifies ablated tissue with high precision. The segmentation maps highlight the framework's ability to accurately determine the volume of tissue ablated by FUS sonication. Moreover, the average segmentation time during testing was only 22 ± 5 ms, indicating the framework's suitability for real-time applications. Our segmentation framework offers high accuracy and speed, making it a highly promising tool for real-time monitoring of FUS treatments.

Figure 5. Ultrasound B-mode images captured at various acoustic power levels (80, 90, and 100 W) during FUS sonication. Each image set demonstrates pre- and post-FUS sonication B-mode images, as well as AI-labeled B-mode images indicating the ablated area. Two rows of images represent two consecutive FUS sonication events; the top row corresponds to the initial 15 s FUS sonication, and the bottom row corresponds to the subsequent 15 s sonication. The second sonication took place 0.5 cm away from the ablated region created by the first sonication, following a 30-s cooling period.


To evaluate the performance of our AI segmentation framework in detecting the ablated area, Figure 6 presents a detailed analysis of the results obtained by comparing the AI-labeled ablated areas with the ground-truth annotations. The figure includes two types of images to demonstrate the effectiveness of our framework. First, a series of ultrasound B-mode images is presented in which the AI-labeled ablated area is compared with the actual area, with the ground-truth boundary of the ablated region denoted by red contours. These images provide a clear visual representation of the accuracy achieved by our AI segmentation framework in identifying the ablated regions in real time. Second, a set of color-coded images is presented that highlights the accuracy of our framework by distinguishing under-segmented, accurately segmented, and over-segmented regions. Under-segmented regions, where the framework failed to label ablated tissue, are indicated by orange pixels, while accurately segmented regions are represented by yellow pixels. Green pixels indicate over-segmented regions, where the framework falsely identified tissue as ablated.

Figure 6. Detailed performance analysis of the proposed segmentation framework. Orange pixels represent under-segmented areas where the framework did not accurately delineate the regions. In contrast, yellow pixels correspond to accurately segmented regions, showcasing the framework’s effectiveness. Lastly, green pixels indicate over-segmented regions, where the framework has segmented beyond the actual boundaries.

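One way to reproduce this color coding from a predicted binary mask and its ground-truth annotation is sketched below; the masks are assumed to be same-sized binary arrays, and the specific RGB values are illustrative rather than those used in Figure 6.

```python
import numpy as np


def segmentation_error_map(pred, gt):
    """Color-code a predicted binary ablation mask against the ground truth.

    Orange = under-segmented (missed ablated pixels), yellow = accurately segmented,
    green = over-segmented (falsely labeled as ablated).
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    rgb = np.zeros(pred.shape + (3,), dtype=np.uint8)
    rgb[gt & ~pred] = (255, 165, 0)   # under-segmented
    rgb[gt & pred] = (255, 255, 0)    # accurately segmented
    rgb[~gt & pred] = (0, 255, 0)     # over-segmented
    return rgb
```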

We further evaluated the performance of our proposed segmentation framework quantitatively by reporting the mean absolute error (MAE) and Dice score for six distinct sonication events, as shown in Table 1. Despite the limited training data, the proposed framework achieved noteworthy performance, with an average MAE of 0.04 and an average Dice score of 93.47%. These results demonstrate the high level of accuracy of our AI segmentation framework, which is crucial for real-time monitoring of FUS treatments.

Table 1. Quantitative evaluation of the proposed segmentation framework performance using Mean Absolute Errors (MAE) and Dice Scores for six distinct sonication events.
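For reference, a minimal sketch of how the two metrics can be computed from a predicted probability map and a binary ground-truth mask is given below; thresholding the prediction at 0.5 is our assumption.

```python
import numpy as np


def mean_absolute_error(prob, gt):
    """Pixel-wise MAE between the predicted probability map and the binary ground truth."""
    return float(np.abs(prob - gt.astype(np.float32)).mean())


def dice_score(prob, gt, threshold=0.5, eps=1e-6):
    """Dice overlap between the thresholded prediction and the ground-truth mask."""
    pred = prob >= threshold
    gt = gt.astype(bool)
    return float((2 * (pred & gt).sum() + eps) / (pred.sum() + gt.sum() + eps))
```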

To further assess the AI segmentation model's performance, Figure 7 provides a side-by-side comparison of the AI-labeled B-mode ultrasound images with post-ablation photographs of the chicken sample. This figure illustrates results from two sequential sonications at P1. Sonication 1 was applied first, followed by a cooling period, after which the transducer was shifted 0.5 cm to administer sonication 2. In the post-ablation photographs, the ablated areas are discernible by tissue color changes, which result from the cooking effect of the heat induced by FUS sonication. The comparison between the AI-predicted ablated areas and those evident in the post-ablation photographs suggests a promising approximation by our AI model. However, it is important to emphasize potential limitations of this validation: the ablated regions visible on B-mode ultrasound images may not be exact replicas of the actual ablated areas, primarily due to the boiling bubble phenomenon.

Figure 7. Comparison of AI-labeled B-mode ultrasound images with photographs of the chicken sample post-ablation from sonications 1 and 2 at P1.


4. Discussion and conclusion

FUS is a noninvasive therapeutic method that can be used for various medical applications, including tumor ablation. However, monitoring the FUS process is crucial to ensure its effectiveness and safety during treatment. Ultrasound is an excellent imaging technique that can be used for FUS monitoring, but USgFUS faces challenges in precise monitoring, resulting in a lack of quantitative measurement and unpredictable ablation shapes. These limitations have hindered the widespread clinical acceptance and market potential of USgFUS, despite its overall benefits.

This study investigates the feasibility of integrating AI into ultrasound imaging to label the ablated area in real time during FUS treatment monitored by B-mode imaging. The aim is to train an AI segmentation framework preoperatively that can be combined with USgFUS to provide surgeons with quantitative, intraoperative monitoring of FUS treatment. AI can provide fast computation, making it well suited to the real-time requirements of FUS monitoring. To validate the feasibility of the proposed AI-assisted USgFUS approach, we developed an AI framework based on the Swin-Unet architecture, trained it on data from ex vivo FUS experiments on chicken breast tissue, and subsequently evaluated its performance on held-out experimental data. Our AI segmentation framework demonstrated 93% accuracy in labeling ablated tissue areas discerned by the hyperechogenicity effect in ultrasound B-mode imaging. The findings from our study present the AI segmentation model as a promising tool to enhance B-mode ultrasound imaging during the monitoring of FUS sonication. However, several limitations should be noted. First, our AI segmentation model was deployed post-sonication. Second, our experiments were conducted on ex vivo samples rather than in a live environment. Lastly, our research was based exclusively on chicken breast tissue rather than a range of tissue types.

In our study, we presented results post-sonication, but it is important to underscore the real-time potential of our AI segmentation model in the context of FUS treatments. The model's efficiency, as evidenced in our experiments, allows it to identify and label the ablated region on ultrasound B-mode images within 22 ± 5 ms. This rapid processing translates to a capability of handling around 45 frames of ultrasound images per second. Given that prevalent ultrasound devices for FUS monitoring predominantly operate at a frame rate of 30 frames per second, our model is not only compatible with real-time monitoring but also positioned at the forefront of real-time FUS monitoring technologies.

The present study primarily offers a feasibility analysis of an AI-assisted USgFUS methodology, leveraging ex vivo experiments with chicken breast tissue as the foundational substrate for training and evaluating our AI segmentation algorithm. Nonetheless, the flexibility of our approach makes it inherently adaptable. The method holds significant promise for deployment across various tissue types and even under in vivo and clinical circumstances. To transition this technique for use in other tissues or in vivo conditions, a pivotal step involves curating corresponding data for training the AI model under these specific scenarios. Given a robust training dataset, we anticipate the AI model, much like the results showcased in this study, can seamlessly integrate with ultrasound systems, thereby enabling real-time labeling of ablated areas.

For clinical settings, where USgFUS already enjoys FDA approval and practical application, our AI-assisted method is primed for potential integration. Transitioning our approach to a real-world clinical environment entails gathering ultrasound B-mode images from treatments administered on live or targeted tissues. Collating a comprehensive dataset, consisting of ultrasound B-mode imagery from actual clinical scenarios, supplemented with expert annotations, would pave the way for training a versatile AI segmentation model. Once trained, this model is envisioned to serve as an invaluable tool during clinical procedures, offering real-time insights and guidance. Drawing inspiration from the current study, such endeavors can shape the trajectory of future research in this domain, merging the strengths of AI and USgFUS for enhanced patient care.

In addition to its potential for assisting with FUS monitoring, our AI segmentation framework can also address additional challenges associated with FUS treatment, particularly those related to precise FUS control and treatment planning. A key challenge in FUS tumor treatment is the placement of multiple sonication areas to cover the entire tumor. Our AI-assisted FUS monitoring solution can mark and label the margin of the ablated area in real-time and track the growth of the ablated area. This can greatly aid in FUS control, thereby improving the ablation treatment with a reduced risk of tumor metastasis, ensuring accurate ablation of tumors close to sensitive organs, and minimizing the treatment duration [Citation1].

First, the placement of multiple ablation areas increases the risk of tumor seeds surviving between the spaces. With our AI monitoring solution, any unablated areas of the tumor can be identified and displayed on the screen for surgeons to take necessary action. Second, the FUS control process is often lengthy, with surgeons repeating sonication to ensure complete tumor ablation. However, our AI-assisted monitoring solution can calculate and define the ablated volume using our AI segmentation framework as an input for a supervisory controlled system. This can speed up the treatment process. Finally, when treating tumors located near sensitive areas, precise control and monitoring are critical. Our solution can be integrated with cutting-edge technologies such as MRI-ultrasound fusion [Citation43], which can delineate tumor boundaries on live ultrasound images. This allows for the accurate ablation of tumor areas without encroaching on the sensitive regions.

In conclusion, our study demonstrated the potential of AI-assisted FUS monitoring to significantly improve the precision and accuracy of FUS treatments, address the limitations of FUS control, and assist with FUS treatment planning. Further studies are needed to explore the effectiveness and safety of this approach in clinical settings.

Acknowledgements

The authors gratefully acknowledge the financial support provided in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). Additionally, we extend our sincere appreciation to Josh Kazi for his valuable assistance in conducting experiments.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability

The code and data used in this publication are publicly available at https://github.com/hosseinbv/HIFU_Segmentation.git. The repository contains all the code used to generate the results reported in the paper, as well as the datasets and pre-trained models used in the experiments.

Additional information

Funding

The authors gratefully acknowledge the financial support provided in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).

References

  • Kennedy JE. High-intensity focused ultrasound in the treatment of solid tumours. Nat Rev Cancer. 2005;5(4):321–327. doi: 10.1038/nrc1591.
  • Kennedy JE, Ter Haar GR, Cranston D. High intensity focused ultrasound: surgery of the future? Br J Radiol. 2003;76(909):590–599. doi: 10.1259/bjr/17150274.
  • ter Haar G, Coussios C. High intensity focused ultrasound: physical principles and devices. Int J Hyperthermia. 2007;23(2):89–104. doi: 10.1080/02656730601186138.
  • Dubinsky TJ, Cuevas C, Dighe MK, et al. High-intensity focused ultrasound: current potential and oncologic applications. AJR Am J Roentgenol. 2008;190(1):191–199. doi: 10.2214/AJR.07.2671.
  • Swamy KM. Ultrasound for BP measurement and treatment in subjects with resistant hypertension. In International Symposium on Ultrasonics. 2015;22(24):108–119.
  • Blana A, Walter B, Rogenhofer S, et al. High-intensity focused ultrasound for the treatment of localized prostate cancer: 5-year experience. Urology. 2004;63(2):297–300. doi: 10.1016/j.urology.2003.09.020.
  • Gelet A, Chapelon JY, Bouvier R, et al. Local control of prostate cancer by transrectal high intensity focused ultrasound therapy: preliminary results. J. Urol. 1999;161(1):156–162. doi: 10.1016/S0022-5347(01)62087-1.
  • Azzouz H, De la Rosette J. HIFU: local treatment of prostate cancer. Eau-Ebu Updat. Ser. 2006;4(2):62–70. doi: 10.1016/j.eeus.2006.01.002.
  • Li C-X, Xu G-L, Jiang Z-Y, et al. Analysis of clinical effect of high-intensity focused ultrasound on liver cancer. World J Gastroenterol. 2004;10(15):2201–2204. doi: 10.3748/wjg.v10.i15.2201.
  • Vaezy S, Martin R, Schmiedl U, et al. Liver hemostasis using high-intensity focused ultrasound. Ultrasound Med Biol. 1997;23(9):1413–1420. doi: 10.1016/s0301-5629(97)00143-9.
  • Kennedy JE, Wu F, ter Haar GR, et al. High-intensity focused ultrasound for the treatment of liver tumours. Ultrasonics. 2004;42(1-9):931–935. doi: 10.1016/j.ultras.2004.01.089.
  • Furusawa H. MRI-Guided focused ultrasound surgery of breast cancer. In: Non-surgical ablation therapy for early-stage breast cancer. Springer; 2016. p. 173–181.
  • Furusawa H, Namba K, Nakahara H, et al. The evolving non-surgical ablation of breast cancer: MR guided focused ultrasound (MRgFUS). Breast Cancer. 2007;14(1):55–58. doi: 10.2325/jbcs.14.55.
  • Wu F, Wang Z-B, Zhu H, et al. Extracorporeal high intensity focused ultrasound treatment for patients with breast cancer. Breast Cancer Res Treat. 2005;92(1):51–60. doi: 10.1007/s10549-004-5778-7.
  • Illing RO, Kennedy JE, Wu F, et al. The safety and feasibility of extracorporeal high-intensity focused ultrasound (HIFU) for the treatment of liver and kidney tumours in a Western population. Br J Cancer. 2005;93(8):890–895. doi: 10.1038/sj.bjc.6602803.
  • Adams JB, Moore RG, Anderson JH, et al. High-intensity focused ultrasound ablation of rabbit kidney tumors. J Endourol. 1996;10(1):71–75. doi: 10.1089/end.1996.10.71.
  • Marberger M, Schatzl G, Cranston D, et al. Extracorporeal ablation of renal tumours with high‐intensity focused ultrasound. BJU Int. 2005;95 Suppl 2(s2):52–55. doi: 10.1111/j.1464-410X.2005.05200.x.
  • Chen W, Zhu H, Zhang L, et al. Primary bone malignancy: effective treatment with high-intensity focused ultrasound ablation. Radiology. 2010;255(3):967–978. doi: 10.1148/radiol.10090374.
  • Li C, Zhang W, Fan W, et al. Noninvasive treatment of malignant bone tumors using high‐intensity focused ultrasound. Cancer. 2010;116(16):3934–3942. doi: 10.1002/cncr.25192.
  • Huisman M, Lam MK, Bartels LW, et al. Feasibility of volumetric MRI-guided high intensity focused ultrasound (MR-HIFU) for painful bone metastases. J Ther Ultrasound. 2014;2(1):16. doi: 10.1186/2050-5736-2-16.
  • Peek MCL, Wu F. High-intensity focused ultrasound in the treatment of breast tumours. Ecancermedicalscience. 2018;12:794. doi: 10.3332/ecancer.2018.794.
  • Rivens I, Shaw A, Civale J, et al. Treatment monitoring and thermometry for therapeutic focused ultrasound. Int J Hyperthermia. 2007;23(2):121–139. doi: 10.1080/02656730701207842.
  • Yu T, Xu C. Hyperecho as the indicator of tissue necrosis during microbubble-assisted high intensity focused ultrasound: sensitivity, specificity and predictive value. Ultrasound Med Biol. 2008;34(8):1343–1347. doi: 10.1016/j.ultrasmedbio.2008.01.012.
  • Ebbini ES, Ter Haar G. Ultrasound-guided therapeutic focused ultrasound: current status and future directions. Int J Hyperthermia. 2015;31(2):77–89. doi: 10.3109/02656736.2014.995238.
  • Escoffre J-M, Bouakaz A. Therapeutic ultrasound. vol. 880. Springer; 2015.
  • Vaezy S, Shi X, Martin RW, et al. Real-time visualization of high-intensity focused ultrasound treatment using ultrasound imaging. Ultrasound Med Biol. 2001;27(1):33–42. doi: 10.1016/s0301-5629(00)00279-9.
  • Seip R, et al. High-intensity focused ultrasound (HIFU) multiple lesion imaging: comparison of detection algorithms for real-time treatment control. In: Proceedings of the 2002 IEEE Ultrasonics Symposium, vol. 2; 2002. p. 1427–1430.
  • Jenne JW, Preusser T, Günther M. High-intensity focused ultrasound: principles, therapy guidance, simulations and applications. Z Med Phys. 2012;22(4):311–322. doi: 10.1016/j.zemedi.2012.07.001.
  • Izadifar Z, Izadifar Z, Chapman D, et al. An introduction to high intensity focused ultrasound: systematic review on principles, devices, and clinical applications. J Clin Med. 2020;9(2):460. doi: 10.3390/jcm9020460.
  • Curiel L, Chopra R, Hynynen K. In vivo monitoring of focused ultrasound surgery using local harmonic motion. Ultrasound Med Biol. 2009;35(1):65–78. doi: 10.1016/j.ultrasmedbio.2008.07.001.
  • Maleke C, Konofagou EE. Harmonic motion imaging for focused ultrasound (HMIFU): a fully integrated technique for sonication and monitoring of thermal ablation in tissues. Phys Med Biol. 2008;53(6):1773–1793. doi: 10.1088/0031-9155/53/6/018.
  • Righetti R, Kallel F, Stafford RJ, et al. Elastographic characterization of HIFU-induced lesions in canine livers. Ultrasound Med Biol. 1999;25(7):1099–1113. doi: 10.1016/s0301-5629(99)00044-7.
  • Kennedy JE, ter Haar GR, Wu F, et al. Contrast-enhanced ultrasound assessment of tissue response to high-intensity focused ultrasound. Ultrasound Med Biol. 2004;30(6):851–854. doi: 10.1016/j.ultrasmedbio.2004.03.011.
  • Zhang S, Shang S, Han Y, et al. Ex vivo and in vivo monitoring and characterization of thermal lesions by high-intensity focused ultrasound and microwave ablation using ultrasonic nakagami imaging. IEEE Trans Med Imaging. 2018;37(7):1701–1710. doi: 10.1109/TMI.2018.2829934.
  • Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention; 2015. p. 234–241.
  • Orhan AE, Pitkow X. Skip connections eliminate singularities. arXiv preprint arXiv:1701.09175; 2017.
  • Hendrycks D, Gimpel K. Gaussian error linear units (GELUs). arXiv preprint arXiv:1606.08415; 2016.
  • Ba JL, Kiros JR, Hinton GE. Layer normalization. arXiv preprint arXiv:1607.06450; 2016.
  • Wu D, Wang Y, Xia S-T, et al. Skip connections matter: on the transferability of adversarial examples generated with ResNets. arXiv preprint arXiv:2002.05990; 2020.
  • Liu Z, et al. Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2021. p. 10012–10022.
  • Jadon S. A survey of loss functions for semantic segmentation. In: 2020 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB); 2020. p. 1–7. doi: 10.1109/CIBCB48159.2020.9277638.
  • Rajput V. Robustness of different loss functions and their impact on networks learning capability. arXiv preprint arXiv:2110.08322; 2021.
  • Marks L, Young S, Natarajan S. MRI–ultrasound fusion for guidance of targeted prostate biopsy. Curr Opin Urol. 2013;23(1):43–50. doi: 10.1097/MOU.0b013e32835ad3ee.