Tasks & Evaluation – Challenge on Ultrasound Beamforming with Deep Learning

Tasks

CUBDL comprises three tasks, all of them optional, and Task 1 is split into two subtasks. Participants may therefore submit results for as few as one and as many as four tasks or subtasks.

Task 1: Beamforming with deep learning after a single plane wave transmission

Task 1 has two optional subtasks:

Task 1a focuses explicitly on creating a high-quality image from a single plane wave that matches a higher-quality image created from multiple plane waves.
Task 1b gives more freedom to create an image that will be benchmarked against the highest SNR, CNR, gCNR, and contrast. These values can be better than those obtained with multiple plane wave transmissions.

Task 2: Beamforming with deep learning after a few plane wave transmissions

Task 2 imposes a maximum of 10 plane waves but lets participants choose from provided angles to create the best image quality possible.

Task 3: Beamforming with deep learning to achieve dynamic transmit focusing

Task 3 enables participants to compare the results of a deep learning dynamic transmit focusing implementation that will be useful with current transmit beamforming techniques implemented on most clinical systems today.

General Evaluation Metrics

The following metrics apply to Tasks 1-3:

Contrast
SNR
gCNR
Resolution (one of the following, depending on image):
- Single point FWHM (axial & lateral)
- Edge spread function (left half of phantom is speckle; the right half is anechoic)
Network complexity
- Total number of trainable parameters in the model
- Effective frame rate
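
As a rough illustration of how the region-based metrics above might be computed, here is a minimal NumPy sketch. It assumes commonly used definitions (contrast as a ratio of mean envelope amplitudes in dB, SNR as mean over standard deviation, gCNR as one minus the overlap of two histograms); the exact formulas, regions, and bin counts used by the challenge are not stated here, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def contrast_db(inside, outside):
    """Contrast as the ratio of mean envelope amplitudes, in dB (assumed definition)."""
    return 20.0 * np.log10(np.mean(inside) / np.mean(outside))

def snr(region):
    """SNR as the mean over the standard deviation of the envelope in one region."""
    return np.mean(region) / np.std(region)

def gcnr(inside, outside, bins=256):
    """gCNR as one minus the overlap of the two regions' normalized histograms."""
    lo = min(inside.min(), outside.min())
    hi = max(inside.max(), outside.max())
    p_in, _ = np.histogram(inside, bins=bins, range=(lo, hi))
    p_out, _ = np.histogram(outside, bins=bins, range=(lo, hi))
    p_in = p_in / p_in.sum()
    p_out = p_out / p_out.sum()
    return 1.0 - np.minimum(p_in, p_out).sum()

# Synthetic stand-ins for two image regions (not challenge data):
# Rayleigh-distributed envelope samples, with the "anechoic" region 20 dB weaker.
rng = np.random.default_rng(0)
speckle = np.abs(rng.normal(size=20000) + 1j * rng.normal(size=20000))
anechoic = 0.1 * np.abs(rng.normal(size=20000) + 1j * rng.normal(size=20000))

print("contrast [dB]:", contrast_db(anechoic, speckle))
print("speckle SNR:  ", snr(speckle))
print("gCNR:         ", gcnr(anechoic, speckle))
```

The number of histogram bins is a free choice in this sketch and affects the gCNR estimate slightly, so it would need to be fixed for any fair comparison.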

Speckle Statistics

For Tasks 1a, 2, and 3, we are additionally concerned with preserving speckle statistics. Given the more futuristic outlook of Task 1b, we will allow participants to obtain the highest possible SNR, regardless of speckle preservation. Speckle preservation will be measured as:

SNR = 1.91 (the mean-to-standard-deviation ratio of the envelope expected for fully developed speckle)
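
For context, the value 1.91 is the mean-to-standard-deviation ratio of a Rayleigh-distributed envelope, which is the usual model for fully developed speckle. The short sketch below checks this both analytically and with simulated data; the simulation (complex Gaussian scattering, sample count, seed) is an illustrative assumption and not part of the challenge's evaluation code.

```python
import math
import numpy as np

# Rayleigh distribution: mean = s*sqrt(pi/2), std = s*sqrt((4 - pi)/2),
# so the ratio is independent of the scale s.
theoretical_snr = math.sqrt(math.pi / (4.0 - math.pi))

# Simulated fully developed speckle: envelope of a zero-mean complex Gaussian signal.
rng = np.random.default_rng(0)
n = 200_000
envelope = np.abs(rng.normal(size=n) + 1j * rng.normal(size=n))
simulated_snr = envelope.mean() / envelope.std()

print(f"theoretical: {theoretical_snr:.3f}, simulated: {simulated_snr:.3f}")
```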

Speckle-Based Resolution

Because speckle is intended to be preserved for Tasks 1a, 2, and 3, we will also measure resolution using the autocorrelation of speckle.
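
One possible way to turn a speckle autocorrelation into a resolution number is to take the full width at half maximum (FWHM) of its central peak along each axis. The sketch below does this with an FFT-based (circular) autocovariance of the envelope. The choice of envelope data, the mean removal, the FWHM criterion, and all function names are assumptions made for illustration; the challenge's actual procedure may differ.

```python
import numpy as np

def fwhm(profile, spacing=1.0):
    """FWHM of a single-peaked 1-D profile, using linear interpolation."""
    profile = np.asarray(profile, dtype=float)
    half = profile.max() / 2.0
    peak = int(np.argmax(profile))
    # Walk outward from the peak until the profile drops to half maximum.
    left = peak
    while left > 0 and profile[left] > half:
        left -= 1
    right = peak
    while right < len(profile) - 1 and profile[right] > half:
        right += 1

    def crossing(i_out, i_in):
        # Interpolate between a sample at or below half (i_out) and one above it (i_in).
        y0, y1 = profile[i_out], profile[i_in]
        return i_out + (half - y0) / (y1 - y0) * (i_in - i_out)

    x_left = crossing(left, left + 1) if profile[left] <= half else float(left)
    x_right = crossing(right, right - 1) if profile[right] <= half else float(right)
    return (x_right - x_left) * spacing

def speckle_autocorrelation(patch):
    """Normalized 2-D circular autocovariance of a speckle patch via the FFT."""
    centered = np.asarray(patch, dtype=float) - np.mean(patch)
    power = np.abs(np.fft.fft2(centered)) ** 2
    acov = np.fft.fftshift(np.fft.ifft2(power).real)
    return acov / acov.max()

def speckle_resolution(patch, dz=1.0, dx=1.0):
    """Axial and lateral FWHM of the autocovariance peak (dz, dx: pixel spacings)."""
    acf = speckle_autocorrelation(patch)
    cz, cx = np.unravel_index(np.argmax(acf), acf.shape)
    return fwhm(acf[:, cx], spacing=dz), fwhm(acf[cz, :], spacing=dx)

# Synthetic speckle with a Gaussian PSF that is wider laterally than axially
# (hypothetical widths, in pixels).
rng = np.random.default_rng(0)
nz, nx = 256, 256
noise = rng.normal(size=(nz, nx)) + 1j * rng.normal(size=(nz, nx))
fz = np.fft.fftfreq(nz)[:, None]
fx = np.fft.fftfreq(nx)[None, :]
sigma_z, sigma_x = 1.5, 4.0
transfer = np.exp(-2.0 * np.pi**2 * ((sigma_z * fz) ** 2 + (sigma_x * fx) ** 2))
envelope = np.abs(np.fft.ifft2(np.fft.fft2(noise) * transfer))

axial_fwhm, lateral_fwhm = speckle_resolution(envelope)
print("axial FWHM [px]:", axial_fwhm, "lateral FWHM [px]:", lateral_fwhm)
```

The same `fwhm` helper could in principle be applied to a profile through a single point target, which is the other resolution measure listed under the general metrics.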

Image-to-Image Comparisons

With Task 1a, we have an additional opportunity to assess performance by how closely the result matches the image obtained with a large number of plane wave transmissions. Therefore, we will additionally assess the following image-to-image comparison metrics:

L1 Loss
L2 Loss
PSNR
Cross-Correlation
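
The four image-to-image metrics above could be computed along the following lines. This is a sketch under stated assumptions: the L2 loss is taken as a root-mean-square error, PSNR uses the dynamic range of the reference image, and cross-correlation is the zero-mean normalized form. The challenge may define or normalize these differently, and the function names and toy images are hypothetical.

```python
import numpy as np

def l1_loss(estimate, reference):
    """Mean absolute error between two images."""
    return np.mean(np.abs(estimate - reference))

def l2_loss(estimate, reference):
    """Root-mean-square error between two images (assumed form of the L2 loss)."""
    return np.sqrt(np.mean((estimate - reference) ** 2))

def psnr(estimate, reference):
    """Peak signal-to-noise ratio in dB, relative to the reference's dynamic range."""
    dynamic_range = reference.max() - reference.min()
    return 20.0 * np.log10(dynamic_range / l2_loss(estimate, reference))

def cross_correlation(estimate, reference):
    """Zero-mean normalized cross-correlation, between -1 and 1."""
    e = estimate - estimate.mean()
    r = reference - reference.mean()
    return np.sum(e * r) / np.sqrt(np.sum(e**2) * np.sum(r**2))

# Toy example: the estimate equals the reference plus a constant offset of 0.1.
reference = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
estimate = reference + 0.1

print("L1:", l1_loss(estimate, reference))
print("L2:", l2_loss(estimate, reference))
print("PSNR [dB]:", psnr(estimate, reference))
print("cross-correlation:", cross_correlation(estimate, reference))
```

The toy example shows why the metrics are complementary: a constant offset leaves the cross-correlation at 1 while the L1 and L2 losses are nonzero.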

Evaluation Location

Note that for Task 3, the general evaluation metrics, the preservation of speckle, and the speckle-based resolution will be measured both at and away from the transmit focus in order to assess the effectiveness of deep learning-based dynamic transmit focusing.

Summary

These tasks and metrics are summarized in the following table:

	Task	Objective	Metrics
Task 1a	Beamforming with deep learning after a single plane wave transmission	Task 1a is explicitly focused on creating a high-quality image from a single plane wave to match a higher-quality image created from multiple plane waves.	General; Preserve Speckle Statistics; Speckle-Based Resolution; L1 Loss; L2 Loss; PSNR; Cross-Correlation
Task 1b	Beamforming with deep learning after a single plane wave transmission	Task 1b gives more freedom to create an image that will be benchmarked against the highest contrast, SNR, gCNR, etc. These values can be better than those obtained from an image formed by multiple plane waves.	General
Task 2	Beamforming with deep learning after a few plane wave transmissions	Task 2 imposes a maximum of 10 plane waves but lets participants choose from provided angles to create the best image quality possible.	General; Preserve Speckle Statistics; Speckle-Based Resolution
Task 3	Beamforming with deep learning to achieve dynamic transmit focusing	Task 3 enables participants to compare the results of a deep learning dynamic transmit focusing implementation that will be useful with current transmit beamforming techniques implemented on most clinical systems today.	General*; Preserve Speckle Statistics*; Speckle-Based Resolution* (*measured both at and away from the transmit focus)

Scoring System

For each task or subtask, participants will be rank-ordered on each metric above, receiving one rank per metric. These rankings will be grouped into two categories: (1) image quality and (2) network complexity (because we are interested in balancing image quality with display frame rates). For each participant, we will average the ranks within each of the two groups and then sum the two group averages. The participant with the lowest sum wins. This scoring system is represented mathematically as follows:

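\[ \text{Score} = \frac{1}{T_I} \sum_{i=1}^{T_I} R_{I,i} + \frac{1}{T_N} \sum_{j=1}^{T_N} R_{N,j} \]

where T_I and T_N are the total numbers of image quality metric rankings and network complexity rankings, respectively, and R_{I,i} and R_{N,j} (notation assumed here) denote a participant's rank on the i-th image quality metric and the j-th network complexity metric.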
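
The scoring rule described in this section can be sketched in a few lines of Python. The sketch assumes that rank 1 is best, that tied participants share the average of the ranks they span, and that each metric comes with a flag saying whether higher values are better. None of these details are specified above, and the participant names and metric values are made up for illustration.

```python
def rank_participants(values, higher_is_better):
    """Return {participant: rank}, with rank 1 best and ties sharing an average rank."""
    names = sorted(values, key=lambda name: values[name], reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(names):
        j = i
        # Extend j over all participants tied with the one at position i.
        while j + 1 < len(names) and values[names[j + 1]] == values[names[i]]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2.0
        for k in range(i, j + 1):
            ranks[names[k]] = shared_rank
        i = j + 1
    return ranks

def mean_rank(metrics):
    """Average each participant's rank over a group of (values, higher_is_better) metrics."""
    per_metric = [rank_participants(values, hib) for values, hib in metrics]
    participants = per_metric[0].keys()
    return {p: sum(r[p] for r in per_metric) / len(per_metric) for p in participants}

def final_scores(image_quality, network_complexity):
    """Sum of the two group-average ranks; the lowest score wins."""
    iq = mean_rank(image_quality)
    nc = mean_rank(network_complexity)
    return {p: iq[p] + nc[p] for p in iq}

# Hypothetical results for three participants.
image_quality = [
    ({"A": 0.90, "B": 0.80, "C": 0.70}, True),   # e.g. gCNR, higher is better
    ({"A": 0.50, "B": 0.40, "C": 0.60}, False),  # e.g. FWHM in mm, lower is better
]
network_complexity = [
    ({"A": 1_000_000, "B": 100_000, "C": 10_000}, False),  # trainable parameters
    ({"A": 10.0, "B": 30.0, "C": 100.0}, True),            # effective frame rate
]

scores = final_scores(image_quality, network_complexity)
winner = min(scores, key=scores.get)
print(scores, "winner:", winner)
```

In this made-up example, participant B wins with a total of 3.5 despite not ranking first in either network complexity metric, which illustrates how the rule balances image quality against complexity.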