We start off with over 1000 mouse samples measured at the TOMCAT beamline of the Swiss Light Source at the Paul Scherrer Institute in Villigen, Switzerland. Each sample consists of 14 GB of image data as well as 98 genetic tags correlating each sample and phenotype to a specific pattern of inheritance. Together this corresponds to tens of terabytes of data, which analyzed conventionally would require dozens of scripts, cluster management tools, and a lot of patience.
With IJSQL from 4Quant you can now run such analyses as easily as a SQL database query (even from Excel if you wish). IJSQL handles loading the data, distributing it evenly, and optimizing queries, and it makes a cluster or even an entire cloud of computers work like one very fast machine using the latest generation of Big Data technology.
The first step is creating the cluster. This can be done using public clouds like those available at Amazon AWS, Google Compute Engine, or Databricks Cloud, or even on your own cluster. Once the Spark cluster has been created and you have the SparkContext called `sc`, the data can be loaded using the Spark Image Layer.
- `readImage[Float]` loads the data as decimal values, since our images show the mineralization density at every pixel.
- `*` indicates that all folders should be included, which in this case means over 1000 samples or 14 TB of data!
- The `cache` suffix keeps the files in memory so they can be read faster, as many of our image processing tasks access the images a number of times.
val uctImages = sc.readImage[Float]("s3n://bone-images/f2-study/*/ufilt.tif").cache
We can then move the data into our IJSQL database so that, instead of writing Scala code, we can use simple SQL commands for further analysis.
Although we execute the commands on only one machine, the data will be evenly distributed over all of the machines in the cluster (or cloud). We can show any of these images at any point with a one-line query, in the same way as the slice inspections in the steps below.
Once the table has been registered we can perform our analysis using the easy IJSQL interface (or use our Python and Java APIs to build your own analysis). The next steps for this bone analysis are:
- extracting the porosity data from the images
- analyzing the cells (small pores) inside
Since the measurements have some degree of noise from the detectors, we first clean up the images using a Gaussian Filter.
CREATE TABLE FilteredImages AS SELECT boneId, GaussianFilter(image) AS image FROM ImageTable
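To make the smoothing step concrete outside of IJSQL, here is a minimal pure-Python sketch of a separable Gaussian filter on a small 2D image. It is not the `GaussianFilter` implementation used by IJSQL; the function names, the `sigma` default, the 3-sigma cut-off, and the edge-clamping rule are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(sigma=1.0, radius=None):
    """Return a normalized 1D Gaussian kernel (weights sum to 1)."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumption: cut off at 3 sigma
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_filter(image, sigma=1.0):
    """Smooth along rows, then along columns (the Gaussian is separable)."""
    kernel = gaussian_kernel(sigma)
    rows_done = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(rows_done), kernel))

# A single bright "noise" pixel gets spread out over its neighbours.
noisy = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 9, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
smoothed = gaussian_filter(noisy, sigma=1.0)
```

Because the kernel is normalized, a perfectly flat image passes through unchanged, while isolated detector noise is damped.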
Any ImageJ plugin can also be used directly inside IJSQL. For example, a median filter with a radius of 3 can be applied with:
SELECT boneId,run(image,"Median...","radius=3") FROM ImageTable
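For readers unfamiliar with median filtering, the sketch below shows the idea in plain Python: each pixel is replaced by the median of its neighbourhood, which removes isolated outliers while keeping edges sharper than averaging does. The function name `median_filter` is hypothetical, and the square window is a simplifying assumption that may differ from the kernel shape the ImageJ plugin uses.

```python
from statistics import median

def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 square window.

    Windows are truncated at the image border (an assumption for this sketch).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [image[yy][xx]
                      for yy in range(max(0, y - radius), min(height, y + radius + 1))
                      for xx in range(max(0, x - radius), min(width, x + radius + 1))]
            out[y][x] = median(window)
    return out

# One hot pixel in an otherwise uniform image is removed completely.
spiky = [[1, 1, 1],
         [1, 100, 1],
         [1, 1, 1]]
cleaned = median_filter(spiky, radius=1)
```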
We also offer a number of 3D rendering options for when a single slice does not give enough detail. The rendering is performed on the cluster and only the final image is sent to your machine, so even huge images can be rendered quickly.
To segment the images into bone and air, we can either manually specify a cut-off or simply use an automated approach like Otsu, IsoData, or Intermodes.
CREATE TABLE ThresholdImages AS SELECT boneId, ApplyThreshold(image,OTSU) AS image FROM FilteredImages
As with the last steps, a slice can be immediately inspected for one or more images with
sql("SELECT image FROM ThresholdImages").first().show(1)
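As background on the automated option, Otsu's method picks the cut-off that maximizes the variance between the two resulting classes (here bone and air). The following is a minimal sketch for integer-valued pixels, not the `ApplyThreshold` implementation in IJSQL; the function name, the `bins` parameter, and the toy pixel values are assumptions for illustration.

```python
def otsu_threshold(pixels, bins=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels must be integers in [0, bins). Values <= t form one class
    (e.g. air), values > t the other (e.g. bone).
    """
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for t in range(bins):
        w_low += hist[t]
        sum_low += t * hist[t]
        w_high = total - w_low
        if w_low == 0:
            continue          # nothing below the threshold yet
        if w_high == 0:
            break             # everything is below the threshold
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        between_var = w_low * w_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t

# Toy bimodal data: dark "air" values and bright "bone" values.
pixels = [10] * 50 + [12] * 50 + [200] * 50 + [202] * 50
t = otsu_threshold(pixels)
segmented = [1 if p > t else 0 for p in pixels]
```

On clearly bimodal data like this, any cut-off between the two groups separates them equally well, and the sketch returns the lowest such value.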
From the segmented image, we can extract the cells by first creating a mask with all of the holes filled in.
CREATE TABLE MaskImages AS SELECT boneId, FillHoles(image) AS image FROM ThresholdImages
sql("SELECT image FROM MaskImages").first().show(1)
![Filled Holes](ext-figures/bone-filled.png)
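One common way to fill holes in a binary image is to flood-fill the background starting from the image border: any background pixel that cannot be reached from the border is enclosed and therefore a hole. The sketch below illustrates this approach in 2D with 4-connectivity. Whether `FillHoles` in IJSQL works this way is not stated here, so treat the function `fill_holes` and its connectivity choice as assumptions.

```python
from collections import deque

def fill_holes(mask):
    """Return a copy of a 0/1 mask with enclosed background regions set to 1."""
    height, width = len(mask), len(mask[0])
    outside = [[False] * width for _ in range(height)]
    queue = deque()

    # Seed the flood fill with every background pixel on the border.
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and mask[y][x] == 0:
                outside[y][x] = True
                queue.append((y, x))

    # Grow the "outside" region through 4-connected background pixels.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < height and 0 <= nx < width
                    and mask[ny][nx] == 0 and not outside[ny][nx]):
                outside[ny][nx] = True
                queue.append((ny, nx))

    # Everything that is not outside is bone or an enclosed hole.
    return [[0 if outside[y][x] else 1 for x in range(width)]
            for y in range(height)]

ring = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
filled = fill_holes(ring)
```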
CREATE TABLE CorticalImages AS SELECT thr.boneId, PeelMask(thr.image, mask.image) AS image FROM ThresholdImages thr INNER JOIN MaskImages mask ON thr.boneId = mask.boneId
sql("SELECT image FROM CorticalImages").first().show(1)
CREATE TABLE PorosityImages AS SELECT thr.boneId, PeelMask(run(thr.image,"Invert"), mask.image) AS image FROM ThresholdImages thr INNER JOIN MaskImages mask ON thr.boneId = mask.boneId
sql("SELECT image FROM PorosityImages").first().show(1)
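The two queries above combine the thresholded image with the filled mask. One plausible reading, and it is only an assumption about what `PeelMask` computes, is that the inverted threshold image is restricted to the mask, so that only the empty space inside the bone (the pores) remains. The sketch below shows that masking step and the resulting porosity fraction on a toy example; `pores_inside_mask` and `porosity` are hypothetical helper names.

```python
def pores_inside_mask(bone, mask):
    """Pixels that lie inside the filled mask but are not bone (0/1 images)."""
    return [[1 if m == 1 and b == 0 else 0 for b, m in zip(bone_row, mask_row)]
            for bone_row, mask_row in zip(bone, mask)]

def porosity(bone, mask):
    """Fraction of the masked region that is pore space."""
    mask_area = sum(map(sum, mask))
    pore_area = sum(map(sum, pores_inside_mask(bone, mask)))
    return pore_area / mask_area if mask_area else 0.0

# Toy example: a ring of bone with a single enclosed pore pixel.
bone = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
pores = pores_inside_mask(bone, mask)
fraction = porosity(bone, mask)
```

Restricting to the mask matters because a plain inversion would also count the air surrounding the sample as pore space.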
We can then identify the individual cells using component labeling.
CREATE TABLE LabelImages AS SELECT boneId, ComponentLabel(image) AS image FROM PorosityImages
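Component labeling assigns a distinct integer to every connected group of foreground pixels, so that each pore can later be measured on its own. A minimal 2D sketch using breadth-first search and 4-connectivity follows; it illustrates the technique and is not the `ComponentLabel` implementation, and the function name and connectivity are assumptions (real bone data would be labeled in 3D).

```python
from collections import deque

def label_components(binary):
    """Label 4-connected foreground regions; returns (labels, count)."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0 or labels[y][x] != 0:
                continue  # background, or already part of a labeled object
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] != 0 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

pore_image = [[1, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 1]]
labels, count = label_components(pore_image)
```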
We can also utilize component labeling to help us distinguish cells from vessels
CREATE TABLE VesselImages AS SELECT * FROM LabelImages WHERE obj.VOLUME>1000
CREATE TABLE CellImages AS SELECT * FROM LabelImages WHERE obj.VOLUME<1000
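The size-based split can be illustrated on a labeled image in a few lines of Python: count the pixels (or voxels) belonging to each label and compare the count against a cut-off. This is a sketch with hypothetical helper names and a toy cut-off of 3 rather than 1000. It mirrors the strict comparisons in the queries above, so an object whose volume equals the cut-off exactly lands in neither group.

```python
from collections import Counter

def object_volumes(labels):
    """Map each non-zero label to the number of pixels carrying it."""
    return Counter(v for row in labels for v in row if v != 0)

def split_by_volume(labels, cutoff):
    """Return (vessel_labels, cell_labels) using strict > and < comparisons."""
    volumes = object_volumes(labels)
    vessels = sorted(lab for lab, vol in volumes.items() if vol > cutoff)
    cells = sorted(lab for lab, vol in volumes.items() if vol < cutoff)
    return vessels, cells

labeled = [[1, 1, 1, 0, 2],
           [1, 1, 0, 0, 0],
           [0, 0, 3, 3, 3]]
volumes = object_volumes(labeled)
vessels, cells = split_by_volume(labeled, cutoff=3)
```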