The 2023/2024 DeepAnatomy Project quickly branched out into several subprojects under the overarching topic of sarcopenia. Broadly speaking, the subprojects aim to make the development and application of deep learning models in the medical field more efficient and more accurate. Specifically, there is a focus on improving muscle segmentation to aid in the diagnosis and treatment of sarcopenia. The projects also address the migration and optimization of software tools and frameworks to meet the demands of modern AI-supported medical imaging.

Cooperation with UKE-Klinik

A collaborative project with the University Medical Center Hamburg-Eppendorf (UKE) has been initiated to quantify sarcopenia. The goal of this cooperation is to improve diagnostic accuracy and advance research in muscle segmentation through the use of neural networks.


In collaboration with the University Medical Center Hamburg-Eppendorf (UKE), we have developed a project to quantify sarcopenia. Our focus was on the use of neural networks to detect and analyze two relevant muscle groups: the abdominal wall muscles and the paravertebral muscles.

Sarcopenia is an age-related disease associated with a progressive loss of muscle mass, strength and function. It is a serious healthcare challenge, especially for older populations. Through our collaboration with the UKE, we aim to develop predictive models based on imaging data and clinical parameters. Our approach is to use neural network technologies to quantify muscle structures and identify and evaluate changes such as loss of muscle mass. This facilitates early detection and tracking of disease progression.

By developing precise quantitative measures for sarcopenia, we can not only improve diagnostic accuracy, but also optimize the effectiveness of prevention and treatment strategies. Our partnership with the UKE enables us to access high-quality clinical data and expertise to drive research and develop innovative solutions to the challenges associated with sarcopenia. The data we process are computed tomography (CT) scans. The images provide a detailed view of the body, showing organs, muscles, bones, fat and other components in different shades of gray. However, even if most of the components can be distinguished and named, an untrained eye (and a poorly trained algorithm) may find it difficult to clearly define the boundaries between, for example, muscles and organs.

For a reliable result, it must first be decided which type of deep learning network architecture is to be used. Much progress has been made in this field over the last decade, and new network architectures are still being presented in scientific publications. We chose a Convolutional Neural Network (CNN) because this type of architecture has proven itself in recent years for the segmentation of medical image data.

The training of a CNN requires the annotation of several layers of the CT scans. This means that the various muscle groups and the background have to be drawn in manually. This enables the algorithm to independently recognize patterns, structures and grey value differences, for example, and assign them to the annotated labels. During training, the annotated data is used as a reference to evaluate the accuracy of the segmentation results. Thus, the millions of parameters within the network are adjusted in an iterative process. This is done until a certain number of iterations have been completed. Ultimately, annotations play a key role in the development of accurate and reliable models.


Because the group was large, with 17 participants, a wide variety of individual projects emerged, each supporting the overarching goal.

Participants could freely come up with subproject ideas, based on problems or deficiencies in the infrastructure that they were able to identify. This approach enabled every team member to get involved based on their individual experience and skills.

The subprojects were typically implemented in small groups of at least 2 people and had a focus on either infrastructural topics or deep learning methodology.

To realize the annotations, there are several possibilities on a technical level. Given our limited expertise in muscle anatomy and the challenge of accurately identifying muscular structures, we decided to switch from CSI annotations to SPLAT.

CSI (Contouring. Snapping. Interpolation)

CSI is a macro module and a tool for interactive segmentation composed of sub-modules for contouring, snapping and interpolation. It implements a universal editor for creating and editing contours. The focus is on the use of free-form contours to enable flexible and precise contouring. Snapping aims to improve the accuracy of contours by automatically adjusting them to object boundaries, resulting in more accurate segmentation. Finally, CSI includes interpolation functions to supplement irregularly sampled contour points. Interpolation thus contributes to the smoothing and continuity of contours, improving the overall quality of the segmentation.

SPLAT (Sparse Label Annotation)

Similar to CSI, SPLAT is a method for labeling image regions in medical image processing. In contrast to conventional annotations, which only provide binary information such as background and foreground, SPLAT enables a more differentiated labeling of image regions. With SPLAT, not only background and foreground can be labeled, but additional categories such as undefined or uncertain areas can also be marked. This enables a more precise characterization of image regions, especially in complex medical images where the assignment of pixels to specific structures or tissues can be difficult.
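One way such an "undefined" category can be used is to leave the affected pixels out when a prediction is compared with the annotation. The following is only a minimal sketch of that idea: the label encoding (0, 1, -1) and the function name masked_accuracy are assumptions made for illustration and are not taken from SPLAT itself.

```python
# Hypothetical label encoding; SPLAT's actual encoding may differ.
BACKGROUND = 0
FOREGROUND = 1
UNDEFINED = -1  # pixel left unlabeled or marked as uncertain


def masked_accuracy(prediction, sparse_labels):
    """Fraction of labeled pixels that were predicted correctly.

    Both arguments are flat sequences of integer labels of equal length.
    Pixels labeled UNDEFINED are skipped, so they neither reward nor
    penalize the model.
    """
    if len(prediction) != len(sparse_labels):
        raise ValueError("prediction and labels must have the same length")

    labeled = [(p, t) for p, t in zip(prediction, sparse_labels) if t != UNDEFINED]
    if not labeled:
        return None  # nothing was annotated, so no score can be computed
    correct = sum(1 for p, t in labeled if p == t)
    return correct / len(labeled)


prediction = [1, 1, 0, 0, 1, 0]
sparse_labels = [1, UNDEFINED, 0, UNDEFINED, 0, 0]
print(masked_accuracy(prediction, sparse_labels))  # 3 of the 4 labeled pixels match
```

The same masking idea can be applied to a training loss, so that uncertain regions do not push the network in either direction.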

In computer science, neural networks are increasingly being used for image processing and analysis. In medical informatics in particular, they offer a great opportunity to analyze structures and patterns in imaging procedures such as MRI, CT and X-ray. Evaluation metrics are key to the development and improvement of these networks, so we focused on researching and implementing metrics relevant to our sarcopenia project.

Various models are trained for tasks such as the registration and segmentation of organs or other tissue structures. The evaluation of a single model can be done with many known metrics - but these can have different meanings for different models that have been trained/optimized for different applications.

In the "Model Evaluation" subproject of DeepAnatomy, we took a closer look at different models - with a focus on muscle segmentation - and compared/evaluated them. As this year's focus was on pathological muscle atrophy (sarcopenia), we particularly examined the models' ability to segment muscles in the abdomen and thorax.
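The text does not name the metrics that were compared. As an illustration of what a segmentation metric computes, the sketch below implements the Dice coefficient, a commonly used overlap measure; choosing it here is an assumption, not a statement about which metrics the subproject used.

```python
def dice_coefficient(prediction, reference):
    """Dice overlap between two binary masks given as flat 0/1 sequences.

    Dice = 2 * overlap / (size of prediction + size of reference),
    ranging from 0 (no overlap) to 1 (identical masks).
    """
    if len(prediction) != len(reference):
        raise ValueError("masks must have the same length")

    overlap = sum(1 for p, r in zip(prediction, reference) if p and r)
    total = sum(1 for p in prediction if p) + sum(1 for r in reference if r)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement in this sketch
    return 2.0 * overlap / total


predicted_mask = [1, 1, 1, 0]
annotated_mask = [1, 1, 0, 0]
print(dice_coefficient(predicted_mask, annotated_mask))  # 2 * 2 / (3 + 2)
```

A metric like this rewards overlap but says little about boundary quality, which is one reason the same score can mean different things for models optimized for different applications.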

Graphical tools are of great importance for monitoring the performance of neural networks in a targeted manner. In the "Blossom" project, we set ourselves the goal of improving the efficiency and transparency of neural network training processes by repairing and subsequently revising an existing solution. The first step was to become familiar with the existing cloud infrastructure so that the repaired software could be reintegrated into it.

The decision was then made to integrate new libraries in order to modernize and expand the visualization. A key approach is to provide real-time visualizations of the training loss and validation metrics throughout the training process. These visualizations make it possible to immediately identify patterns and trends in the training process and react accordingly. By monitoring training progress in real time, we gain immediate insight into the performance of our model. This enables us to recognize problems such as overfitting, underfitting or exploding gradients at an early stage and address them in a targeted manner.

The revision of Blossom also covered features that make working with the data more interactive, such as setting the granularity of the displayed data and delimiting different training sessions. A central aspect was the smoothing of the curves, since each training session can generate many thousands of data points that need to be visualized. For the same reason, performance issues had to be examined carefully.
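The source does not say which smoothing method Blossom uses. The sketch below shows two generic techniques that fit the description, an exponential moving average for smoothing and simple stride-based downsampling for limiting the number of plotted points; both function names and the default weight are assumptions.

```python
def exponential_moving_average(values, weight=0.6):
    """Smooth a metric curve; a higher weight (in [0, 1)) gives a smoother line."""
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")

    smoothed = []
    previous = None
    for value in values:
        if previous is None:
            previous = value  # the first point is kept unchanged
        else:
            previous = weight * previous + (1.0 - weight) * value
        smoothed.append(previous)
    return smoothed


def downsample(values, max_points):
    """Keep at most max_points values by taking every k-th point."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    step = max(1, -(-len(values) // max_points))  # ceiling division
    return values[::step]


loss_curve = [1.0, 0.0, 1.0, 0.0]
print(exponential_moving_average(loss_curve, weight=0.5))
print(downsample(list(range(10)), max_points=5))
```

Smoothing before downsampling keeps short spikes from being dropped silently, at the cost of flattening them.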

The aim of this complementary project was to write a module for the configuration and processing software used at Fraunhofer that implements the training configuration. The motivation for this small project was to reduce the amount of input required in the terminal when starting a training by moving that input into the graphical interface.

As a result, the terminal arguments for the file paths to the architecture file and the configuration file can be omitted, which keeps the command short and clear. The architecture file specifies the architecture, library, etc. in more detail, whereas the configuration file focuses on hyperparameters such as learning rate and loss function. Almost all settings for the neural network can now be made graphically in MeVisLab. MeVisLab is a graphical program in which the training data for the neural network is usually preprocessed by editing and grouping the data using modules. These modules can be combined into a network to create an entire pipeline through which the training data passes automatically, and this is exactly where the module written in this small project is placed.

Although a machine learning model learns independently, there is a complex interaction between different software programs to create an environment that enables this training. These programs can output status messages as well as errors. Logs are therefore a valuable tool for the developer to monitor and improve the model.

DeepAnatomy uses a dashboard with which trainings can be started and monitored. In particular, this dashboard is intended to display trainings graphically (see "Blossom") and to collect all relevant meta-information about a training. The aim of the Dashboard-Logger project was to independently extend the Dashboard web app so that logs from the training and from MeVisLab are easily accessible and displayed directly.

MeVisLab is a program developed by Fraunhofer, which is used for data preparation and configuration of deep learning trainings. Both programs run in separate Docker containers in the container orchestration software "Nomad", so it is a distributed system. While the UI in Nomad itself only offers cumbersome ways to view all the logs associated with a training, we have made them available in one central location and implemented useful additional functions. Because the system is distributed, the logs must first be collected centrally before they can be displayed: the log data is persisted in OpenSearch, into which the logs are pushed in real time via fluentd using the log shipper pattern. We can now request and display this log data via an API.
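To illustrate how logs stored this way might be requested page by page, the sketch below builds a search request body in the OpenSearch query DSL. The field names job_id and @timestamp, the page size and both function names are assumptions; the actual index layout and API of the dashboard are not described in the text.

```python
import json


def build_log_query(job_id, page_size=100, search_after=None):
    """Build a request body that returns one page of logs for a training job.

    The field names "job_id" and "@timestamp" are hypothetical.
    """
    body = {
        "size": page_size,
        "query": {"bool": {"filter": [{"term": {"job_id": job_id}}]}},
        # In practice a unique tie-breaker field should be added to the sort,
        # otherwise entries with identical timestamps may be skipped.
        "sort": [{"@timestamp": "asc"}],
    }
    if search_after is not None:
        # Continue after the last entry already shown (polling, infinite scroll).
        body["search_after"] = search_after
    return body


def next_cursor(hits):
    """Return the sort values of the last hit, or None if the page was empty."""
    return hits[-1]["sort"] if hits else None


first_page = build_log_query("training-42")
print(json.dumps(first_page, indent=2))
```

Passing the cursor returned by next_cursor back into build_log_query yields the next page, which suits both periodic polling for new entries and loading further entries on scroll.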

Various considerations on topics such as UI design, UX and the general flow of information were incorporated and interesting mechanisms such as polling, infinite scroll, etc. were worked on.

The dashboard is an internal MEVIS project that can start trainings with given files. Access to GitLab repositories is to be implemented so that these files can be taken directly from one or more repositories. This requires setting up repository access tokens and making it possible to integrate them through the front end and back end using so-called Vault keys.

Version 3.0 of Keras was released last year and offers new possibilities for writing deep learning networks in Python. In particular, the framework now provides the flexibility to choose between TensorFlow, JAX and PyTorch as a backend. Our task was to migrate from the previously used Keras 2 to the new version. To do this, some of the existing Python modules had to be modified in accordance with the Keras 3 migration guide. Docker is used to ensure that all Fraunhofer employees work with the same versions of all frameworks. It allows a so-called image to be built, which authorized users can then download. The image is built from a Dockerfile that specifies which versions are to be used. We built such an image and then ran the existing tests.

We then had to find solutions for the errors that occurred. Some of them could not be fixed at first because of version incompatibilities between Keras 3 and TensorFlow Probability. Gradually, fixes were found for the remaining errors, and even the version incompatibility was resolved by a new release of TensorFlow. Further integration tests were then carried out, in which training sessions were started on the cluster via Nomad. As these did not run at first for various reasons, we also had to work on solutions to get the tests to run.
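The backend choice mentioned above is made in Keras 3 through the KERAS_BACKEND environment variable, which has to be set before Keras is imported for the first time. The helper below is a small sketch of that step; the function name select_keras_backend is hypothetical, and the import itself is left as a comment so the snippet also runs where Keras is not installed.

```python
import os

# Backend identifiers for the three backends named in the text.
SUPPORTED_BACKENDS = ("tensorflow", "jax", "torch")


def select_keras_backend(name):
    """Set KERAS_BACKEND; must be called before the first `import keras`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unknown backend {name!r}, expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("jax")
# import keras  # Keras reads KERAS_BACKEND once, at import time
```

In a Docker-based setup, the same variable could instead be set in the Dockerfile, so that every user of the image gets the same backend.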


The outcomes of the projects include:

  • Development of precise predictive models for early detection and monitoring of sarcopenia progression.
  • The integration of SPLAT has increased the accuracy and fineness of our muscle segmentation. This method made it possible to not only distinguish between foreground and background, but also to identify areas that were challenging to classify. This led to an improved analysis of the muscle structure and contributed significantly to the quality of the results.
  • Simplified configuration of training processes via the MeVisLab module.
  • Advanced capabilities for monitoring training progress through the dashboard.
  • Improved workflow through the incorporation of GitLab URLs and the use of Vault keys.
  • Successful migration and integration of Keras 3, despite initial version incompatibilities.