1 Problem: Pursuing Criminals

In a wide range of crimes, from grand theft auto to child abduction, it is important to be able to pinpoint the location of a single vehicle and reconstruct its past movements. Technologies ranging from license-plate recognition at parking garages to on-the-ground officer deployment are used to address this problem, but finding a single car remains difficult, time-consuming, and expensive.

2 Solution: Real-time Traffic Camera Analytics

Traffic cameras are in widespread use for monitoring the movement of cars and identifying problems. The information from these cameras is rich, and combining the images from an entire network makes it possible to reconstruct the movements of a single vehicle.
Camera Grid

The image data are, however, difficult to process on their own; in particular, examining an entire network has typically required hundreds of expensive employees to screen the images by hand (roughly two cameras per employee). Electronic solutions, while they exist, are typically inflexible and scale poorly to the kinds of real-time, ad-hoc analyses required.

2.1 Real-time image processing

Camera to 4QL

Using our 4Quant SQL, it is now possible to process these streams in a flexible, scalable manner, querying the live stream of images as if they were all rows in a database.

SELECT * FROM TrafficCameras WHERE DETECT_CAR(image)

[Pipeline diagram: detected cars \(\rightarrow\) tracks \(\rightarrow\) segments \(\rightarrow\) final forms]

More importantly, many streams can be integrated together and queried as one coherent entity. For example, if a white car has been reported missing, thousands of cameras can be screened for cars by color:

SELECT position,color FROM TrafficCameras WHERE 
  DETECT_CAR(image) GROUP BY DETECT_CAR(image).COLOR
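Conceptually, the `GROUP BY` above is a group-and-count over per-frame detection records. A minimal Python sketch of that step (the record layout and field names are hypothetical stand-ins for `DETECT_CAR` output):

```python
from collections import Counter

# Hypothetical detection records, one dict per detected car,
# tagged with the camera position and the detected color.
detections = [
    {"position": "cam-01", "color": "white"},
    {"position": "cam-02", "color": "red"},
    {"position": "cam-07", "color": "white"},
    {"position": "cam-03", "color": "blue"},
]

def group_by_color(records):
    """Count detections per color, mimicking GROUP BY ... .COLOR."""
    return Counter(r["color"] for r in records)

counts = group_by_color(detections)
print(counts["white"])  # 2
```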

If we know the car was a white Subaru, we can further limit the results by taking only the white cars whose make is Subaru.

SELECT position FROM TrafficCameras WHERE 
  DETECT_CAR(image).MAKE="Subaru" AND 
  DETECT_CAR(image).COLOR="white"

Now we can start to integrate other information about the time (the last 10 minutes) and an idea of where the perpetrator might be (within 100 km of Zurich).

SELECT image,position FROM TrafficCameras WHERE 
  TIME BETWEEN 10:40 AND 10:50 AND
  DISTANCE(position,"Zurich")<100 AND
  DETECT_CAR(image).MAKE="Subaru" AND 
  DETECT_CAR(image).COLOR="white"
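A predicate like `DISTANCE(position,"Zurich")<100` can be evaluated with the standard haversine great-circle formula. A sketch (the city-centre coordinates are approximate, and the helper names are our own, not part of the query engine):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # mean Earth radius ~6371 km

ZURICH = (47.3769, 8.5417)  # approximate city-centre coordinates

def within_100km_of_zurich(lat, lon):
    return haversine_km(lat, lon, *ZURICH) < 100

# Bern (~95 km away) passes the filter; Geneva (~225 km) does not.
print(within_100km_of_zurich(46.9480, 7.4474))  # True
print(within_100km_of_zurich(46.2044, 6.1432))  # False
```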

The final step is to identify the license plate number and compare it against the ASTRA database to establish exactly who the culprit is and bring them to justice.

SELECT Astra.Owner.Picture FROM TrafficCameras
  INNER JOIN astraCarTable AS Astra ON
  DETECT_CAR(image).LICENCE_PLATE = Astra.ID
  WHERE
  TIME BETWEEN 10:40 AND 10:50 AND
  DISTANCE(position,"Zurich")<100 AND
  DETECT_CAR(image).MAKE="Subaru" AND 
  DETECT_CAR(image).COLOR="white"
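The inner join here amounts to looking up each detected plate in the vehicle registry and keeping only the matches. A minimal sketch (the registry contents, plate numbers, and field names are invented for illustration):

```python
# Hypothetical registry keyed by plate number, standing in for astraCarTable.
astra = {
    "ZH-12345": {"owner": "A. Example", "make": "Subaru"},
    "BE-99999": {"owner": "B. Sample", "make": "Opel"},
}

detected_plates = ["ZH-12345", "SG-00000"]  # second plate has no registry entry

def join_owners(plates, registry):
    """Inner join: keep only the plates present in the registry."""
    return [(p, registry[p]["owner"]) for p in plates if p in registry]

print(join_owners(detected_plates, astra))  # [('ZH-12345', 'A. Example')]
```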

Perp

2.2 How?

The first question is how the data can be processed. The basic work is done by a simple workflow on top of our Spark Image Layer. This abstracts away the complexities of cloud computing and distributed analysis. You focus only on the core task of image processing.

Beyond a single camera, our system scales linearly to multiple cameras and can distribute this computation across many computers to keep the computation real-time.

With cloud integration and Big Data frameworks, even handling an entire city network with hundreds of drones and cameras running continuously is straightforward, with no need to worry about networking, topology, or fault tolerance. Below is an example with 30 traffic cameras whose tasks are seamlessly and evenly divided among 50 different computers.
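At its simplest, dividing camera streams among workers is a round-robin assignment; the framework handles this transparently, but a toy sketch of the idea (with the camera and worker counts from the example above) looks like:

```python
def assign_round_robin(n_cameras, n_workers):
    """Map each camera index to a worker index, round-robin."""
    return {cam: cam % n_workers for cam in range(n_cameras)}

assignment = assign_round_robin(30, 50)
# With more workers than cameras, no worker receives two streams,
# leaving spare capacity for elastic scaling.
print(len(set(assignment.values())))  # 30
```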

2.3 What?

The images collected by the traffic cameras at a rate of 30 frames per second contain substantial dynamic information on not only cars but also buildings, landscapes, and people. The first basic task is the segmentation of the people, which can provide information on their number, movement, and behavior.


The segmented image above can be transformed into quantitative metrics at each time point. These metrics can then be processed to extract relevant quality assessment information for the tracks.
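The step from a segmented (binary) image to per-object metrics can be sketched with a simple connected-component pass; this toy pure-Python version (4-connectivity, nested lists instead of real image arrays) stands in for the component-labelling and shape-analysis stages:

```python
def label_components(grid):
    """4-connected component labelling on a binary grid (list of lists)."""
    h, w = len(grid), len(grid[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] and not labels[y][x]:
                current += 1  # start a new component, flood-fill it
                stack = [(y, x)]
                labels[y][x] = current
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = current
                            stack.append((ny, nx))
    return labels, current

def component_metrics(labels):
    """Area and centroid (row, col) per labelled component."""
    sums = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                area, sy, sx = sums.get(lab, (0, 0, 0))
                sums[lab] = (area + 1, sy + y, sx + x)
    return {lab: {"area": a, "centroid": (sy / a, sx / a)}
            for lab, (a, sy, sx) in sums.items()}

grid = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
labels, n = label_components(grid)
print(n)  # 2 components
```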

The data can then be broken down into small scenes where the number and flow of cars in each grouping can be evaluated.

The data can also be smoothed to show trends and car counts on a single image more clearly.
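One common form of such smoothing is a centred moving average over the per-frame car counts; a sketch (the sample counts are invented):

```python
def moving_average(counts, window=3):
    """Centred moving average over a car-count time series; near the
    edges, only the part of the window inside the series is used."""
    half = window // 2
    out = []
    for i in range(len(counts)):
        lo, hi = max(0, i - half), min(len(counts), i + half + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out

counts = [4, 7, 3, 8, 6, 9, 2]  # hypothetical cars per frame
print(moving_average(counts)[0])  # 5.5
```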

3 Technical Aspects

3.1 Streaming the Data

Once the cluster has been commissioned and you have the StreamingSparkContext called ssc (automatically provided in Databricks Cloud or Zeppelin), the data can be loaded using the Spark Image Layer. Since we are performing real-time analysis, we acquire the images from a streaming source:

val trafficCam1 = TrafficCameraReceiver("https://drone-8092")
val trafficCam2 = TrafficCameraReceiver("https://drone-8093")
val metaImageStream = ssc.receiverStream(trafficCam1 ++ trafficCam2)

Although we execute the command on one machine, the analysis will be distributed over the entire set of cluster resources available to ssc. To further process the images, we can take advantage of the rich set of functionality built into the Spark Image Layer:

def identifyCar(time: Double, pos: GeoPos, inImage: Img[Byte]) = {
  // Run the image processing steps on all images
  val carOutline = inImage.
    window(3s).
    run("Median...","radius=3"). // filter out the noise
    run("Rolling Background Subtraction..."). // remove static objects
    run("Threshold","OTSU") // threshold bright structures
  val carShape = carOutline.
    morphology(CLOSE,5). // connect nearby objects
    componentLabel(). // identify the connected components
    filter(_.area>50). // keep only the larger objects
    shapeAnalysis() // analyze the position and shape
  // return the size, make, and color based on the segmented image
  CarInformation(
                 size=carShape.area,
                 make=matchMakePattern(carShape),
                 color=identifyColor(inImage & carOutline)
                )
}
// apply the operation to all images as they come in
val carStream = metaImageStream.map(identifyCar)

The entire pipeline can then be started to run in real-time on all the new images as they stream in. If the tasks become more computationally intensive, then the computing power can be scaled up and down elastically.

4 Learn More

4Quant is active in a number of different areas from medicine to remote sensing. Our image processing framework (Spark Image Layer) and our query engine (Image Query and Analysis Engine) are widely adaptable to a number of different specific applications.

4.1 Technical Presentations

To find out more about the technical aspects of our solution, check out our presentation at:

5 Acknowledgements

Data was obtained from the Hazelwood traffic cameras. The analysis is powered by the Spark Image Layer from 4Quant. Visualizations, document generation, and maps provided by:

  • H. Wickham (2009). ggplot2: Elegant Graphics for Data Analysis. Springer New York. ISBN 978-0-387-98140-6. http://had.co.nz/ggplot2/book

  • Joe Cheng and Yihui Xie (2014). leaflet: Create Interactive Web Maps with the JavaScript LeafLet Library. R package version 0.0.11. https://github.com/rstudio/leaflet

  • Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. http://www.jstatsoft.org/v40/i01/

  • Yihui Xie (2015). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.10.

  • JJ Allaire, Joe Cheng, Yihui Xie, Jonathan McPherson, Winston Chang, Jeff Allen, Hadley Wickham and Rob Hyndman (2015). rmarkdown: Dynamic Documents for R. R package version 0.7. http://CRAN.R-project.org/package=rmarkdown

  • Knut Sveidqvist, Mike Bostock, Chris Pettitt, Mike Daines, Andrei Kashcha and Richard Iannone (2015). DiagrammeR: Create Graph Diagrams and Flowcharts Using R. R package version 0.7.