WLST-2016

Our presentation at the Workshop on Large-Scale Tomography, titled “Applying Big Data Solutions to Big Tomographic Problems”

View the Project on GitHub 4Quant/WLST-2016

Applying Big Data Solutions to Big Tomographic Problems

The presentation for the Workshop on Large-Scale Tomography in Szeged, Hungary, in January 2016.

Abstract

With ever more efficient detectors, higher flux, and stable beamlines comes the ability to probe time and length scales that were previously unachievable. Of particular interest are massive-scale projects like the Human Brain Project, adult zebrafish imaging, and dynamic imaging. All involve thousands to millions of measurements at the highest possible resolutions to cover millimeter to nanometer length scales. The task of processing and analyzing such large collections of measurements is exceptionally difficult. We show how commodity-hardware-based ‘Big Data’ solutions can be adapted to the scientific domain to process terabytes' worth of measurements in a parallel, distributed manner. Building on the distributed frameworks of Apache Spark and Spark Imaging Layer, we have extended the common tomographic and image processing tools to work on these images, enabling the use of many machines in parallel and drastically increasing the speed and ease with which these large datasets can be stitched and analyzed. Our most recent developments enable the data to be analyzed and processed in real time using the latest streaming and micro-batch processing techniques. Uniquely, such an approach allows fault-tolerant, distributed analytics to process complicated datasets and eventually provide feedback to both experimentalists and their equipment, so that measurements can be adapted.

Relevant Links

Bio

Kevin Mader is the founder and CTO of 4Quant and a lecturer in Image Analysis at ETH Zurich. He focuses on turning big hairy 3D images into simple, robust, reproducible numbers without resorting to black boxes or magic. In particular, as part of several collaborations, he is currently working on automatically segmenting full-animal zebrafish images, characterizing rheology in 3D flows, and measuring viral infection dynamics in cell lines.