san flashcisco 09.04.16

notes from my talk, “computer vision with flash”, at the San Flashcisco user group meeting.

no slides, as this presentation is mainly demo- and code-focused, but plenty of links and text.


motion tracking via subtractive analysis

this demo uses code developed by Justin Windle. find the source on his blog, here.

main loop
happens in MotionTracker.track():

subtractive analysis:
the previous frame (BitmapData) is drawn into the current frame (BitmapData) with the DIFFERENCE blend mode; this results in an image that shows only changed pixels.
(difference pixel = lighter pixel – darker pixel; if the pixels are equal, the difference pixel is black.)

apply contrast filter to push nearly black (little-to-no motion) pixels to black, and bring brighter pixels (areas of motion) closer to white.

apply blur filter to blur out noise and blob-ify areas of more motion.

apply threshold to isolate all pixels that are not near-black (in this case, all pixels above 0xFF333333), and map to a new color (in this case, 0xFFFFFF).
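the chain above can be sketched in AS3 like so. this is an illustrative sketch, not Justin Windle's actual code; the contrast multiplier and blur amounts are made-up values, and the 0xFF333333 threshold and white mapping mirror the description above.

```actionscript
import flash.display.BitmapData;
import flash.display.BlendMode;
import flash.filters.BlurFilter;
import flash.filters.ColorMatrixFilter;
import flash.geom.Point;

function analyzeMotion (cur:BitmapData, prev:BitmapData) :void {
    // difference: draw the previous frame into the current one with the
    // DIFFERENCE blend mode; only changed pixels survive as non-black.
    cur.draw(prev, null, null, BlendMode.DIFFERENCE);

    // contrast: push near-black pixels to black, brighter pixels toward white.
    var c:Number = 5;                   // contrast multiplier (illustrative)
    var o:Number = 128 * (1 - c);
    cur.applyFilter(cur, cur.rect, new Point(),
        new ColorMatrixFilter([c,0,0,0,o, 0,c,0,0,o, 0,0,c,0,o, 0,0,0,1,0]));

    // blur: smooth out noise and blob-ify areas of motion.
    cur.applyFilter(cur, cur.rect, new Point(), new BlurFilter(8, 8, 2));

    // threshold: map every pixel above 0xFF333333 to opaque white.
    cur.threshold(cur, cur.rect, new Point(), ">", 0xFF333333, 0xFFFFFFFF);
}
```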

motion tracking:
BitmapData.getColorBoundsRect() returns a Rectangle containing all pixels detected as moved (those colored white in the previous step).

verify area of rect is above a constant value (in this case, 10% of source width/height), to ignore noise.
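the two motion-tracking steps above amount to a few lines; again a hedged sketch, with the 10% constant taken from the description rather than the original source.

```actionscript
import flash.display.BitmapData;
import flash.geom.Rectangle;

function getMotionRect (motion:BitmapData) :Rectangle {
    // bounding rect of every fully-white pixel left by the threshold step
    var rect:Rectangle = motion.getColorBoundsRect(0xFFFFFFFF, 0xFFFFFFFF, true);

    // ignore noise: require the rect to cover at least 10% of the source area
    var minArea:Number = 0.1 * motion.width * motion.height;
    return (rect.width * rect.height >= minArea) ? rect : null;
}
```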


FLARToolkit

download source here:
http://www.libspark.org/wiki/saqoosha/FLARToolKit/en

FLARToolkit is an AS3 marker-tracking engine, derived from NyARToolkit (Java), itself derived from ARToolkit (C).

ARToolkit detection algorithm explanation and diagram:
http://www.hitl.washington.edu/artoolkit/documentation/vision.htm

(FL)ARToolkit core algorithm (runs every frame)

create BitmapData snapshot of Video, and send to FLARMultiMarkerDetector.detectMarkerLite(), where the core tracking algorithm is implemented.
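in code, that step looks roughly like the following. class and method names are from the FLARToolkit sources as i recall them and may differ slightly in your version.

```actionscript
// capture the current Video frame into a BitmapData snapshot
var snapshot:BitmapData = new BitmapData(video.width, video.height, false, 0);
snapshot.draw(video);

// wrap the snapshot for FLARToolkit and run detection;
// detectMarkerLite returns the number of markers found this frame.
var raster:FLARRgbRaster_BitmapData = new FLARRgbRaster_BitmapData(snapshot);
var numFound:int = detector.detectMarkerLite(raster, 100);  // 100 = luminance threshold (0-255)
```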

FLARRasterFilter_BitmapDataThreshold
the snapshot is reduced to a greyscale brightness image by applying a luminance filter.
(luminance weights the green channel more heavily than red or blue; for more info see the comments on this post.)
the brightness greyscale image is then thresholded, resulting in a B/W image.
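one common way to build a luminance image in flash is a ColorMatrixFilter with Rec.601 weights, which lean on green. this is an illustrative sketch of the same idea; FLARRasterFilter_BitmapDataThreshold does its own per-pixel math internally.

```actionscript
import flash.filters.ColorMatrixFilter;
import flash.geom.Point;

// Rec.601 luminance weights: green dominates
var r:Number = 0.299, g:Number = 0.587, b:Number = 0.114;
var luma:ColorMatrixFilter = new ColorMatrixFilter([
    r, g, b, 0, 0,
    r, g, b, 0, 0,
    r, g, b, 0, 0,
    0, 0, 0, 1, 0
]);

// greyscale and snapshot are BitmapData instances of the same size
greyscale.applyFilter(snapshot, snapshot.rect, new Point(), luma);

// then threshold the greyscale image to black/white
greyscale.threshold(greyscale, greyscale.rect, new Point(), ">", thresholdValue, 0xFFFFFFFF);
```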

FLARSquareDetector
finds all possible outline areas (not clear on algorithm here).
video image distortion on each area is inverted according to information within a camera calibration file (camera_para.dat / FLARParams.dat).
returns a list of FLARSquare candidates.
each candidate is defined primarily by its four corners.

FLARColorPatt_O3
attempt to extract an image, formatted as a pattern, from the area within each detected outline.

FLARMultiMarkerDetector (continued)
match the extracted image against each loaded pattern.
create a FLARMultiMarkerDetectorResult for the best match.
each FLARMultiMarkerDetectorResult contains information about the quality of the match (“confidence”), the pattern id, a FLARSquare instance that describes the detected pattern’s contour, and the cardinal direction (U, R, D, L).

also calculate transformation matrix, based on FLARSquare for each detected marker.

camera_para.dat / FLARParams.dat
the camera parameters file is used to correct distortion caused by the camera lens that might adversely affect marker tracking (particularly with wide-angle lenses). info on how to create your own is here.

pattern generation
this marker generator, created by tarotaro, can generate pattern files from either a live camera feed or a loaded image. use the latter for a more reliable pattern file. tarotaro’s marker generator allows for patterns of varying resolution and % marker width (patternToBorderRatio in FLARManager > FLARPattern).

FLARToolkit forum
is here.

(FL)ARToolkit license
GNU GPL: all derivative works must adopt the GPL.
ARToolkit commercial licenses are available;
at the time of this writing, options include a single-use license and a multi-use, year-long license. i believe each has an up-front fee and a royalty.

FLARManager

FLARManager is a framework, written by Eric Socolofsky (that’s me!), that makes developing FLARToolkit applications simpler.

FLARManager overview
FLARManager constructor:
pass in a cameraParamsPath, a list of FLARPattern instances, and an IFLARSource.

  • cameraParamsPath: path to camera_para.dat / FLARParams.dat.
  • FLARPattern instances are containers for information about each pattern FLARToolkit should attempt to detect.
  • the IFLARSource provides the BitmapData object to the FLARToolkit marker detector.
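setup looks roughly like this. the exact constructor signature and FLARPattern arguments vary by FLARManager version, so treat the specifics here as illustrative; the FLARMarkerEvent constants are as i recall them.

```actionscript
// illustrative pattern list: path to a .pat file, plus pattern resolution
var patterns:Array = [ new FLARPattern("patterns/patt001.pat", 16) ];

// null source -> FLARManager creates a default FLARCameraSource
var flarManager:FLARManager = new FLARManager("FLARParams.dat", patterns, null);

flarManager.addEventListener(FLARMarkerEvent.MARKER_ADDED, onMarkerAdded);
flarManager.addEventListener(FLARMarkerEvent.MARKER_UPDATED, onMarkerUpdated);
flarManager.addEventListener(FLARMarkerEvent.MARKER_REMOVED, onMarkerRemoved);
```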

FLARManager initialization:

  • loads camera params file and all pattern files,
  • creates default source (FLARCameraSource) if none provided,
  • inits FLARRgbRaster_BitmapData (source for detection),
  • inits FLARMultiMarkerDetector (detector),
  • self-activates (begins detection on ENTER_FRAME).

main loop:

  • update source (redraw BitmapData)
  • get list of detected markers via FLARMultiMarkerDetector.detectMarkerLite()
  • manage detected markers (dispatch FLARMarkerEvents):
    • if no markers found, remove all active markers
    • check marker confidence against corresponding FLARPattern.minConfidence
    • compare detected markers against stored active markers; if distance is within MARKER_UPDATE_THRESHOLD, detected marker is just an updated marker; else, it’s a new marker
    • remove any stored (previously) active markers that didn’t get updated
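the new-vs-updated decision above can be sketched like this; the helper functions are hypothetical, not FLARManager’s actual internals, and FLARMarker.centerpoint / MARKER_UPDATE_THRESHOLD are used as described above.

```actionscript
import flash.geom.Point;

for each (var detected:FLARMarker in detectedMarkers) {
    var nearest:FLARMarker = findNearestActiveMarker(detected);   // hypothetical helper
    if (nearest && Point.distance(detected.centerpoint, nearest.centerpoint)
            <= MARKER_UPDATE_THRESHOLD) {
        updateMarker(nearest, detected);    // dispatches FLARMarkerEvent.MARKER_UPDATED
    } else {
        addMarker(detected);                // dispatches FLARMarkerEvent.MARKER_ADDED
    }
}
removeStaleMarkers();                       // MARKER_REMOVED for any marker not updated
```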

a FLARManager application can use FLARCameraSource or FLARLoaderSource as a source. it can also simulate input with mouse and keyboard, using FLARProxy.

at the time of this writing, FLARManager comes with four example files:
FLARManagerExample_2D.as: simple 2D marker detection
FLARManagerExample_PV3D.as: 3D detection for “augmented reality”
onAdded:

  • store FLARMarker
  • hash DisplayObject3D containers by pattern id (pattern id determined by load order)

on ENTER_FRAME:

  • loop through stored FLARMarkers,
  • get each FLARMarker.transformMatrix,
  • convert from ARToolkit format to PV3D format,
  • apply to DisplayObject3D.
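the ENTER_FRAME steps above reduce to a short loop; the matrix-conversion helper name here is illustrative, not the example file’s actual function.

```actionscript
private function onEnterFrame (evt:Event) :void {
    for each (var marker:FLARMarker in this.activeMarkers) {
        // look up the container hashed by pattern id in onAdded
        var container:DisplayObject3D = this.containersByPatternId[marker.patternId];

        // convert ARToolkit's transformation matrix to PV3D format and apply it
        container.transform = convertMatrixToPV3D(marker.transformMatrix);
    }
}
```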

FLARManagerExample_Flash3D.as: similar, but different transformation matrix conversion; still a bit glitchy.
FLARManagerExample_2D_Loader.as: load source from a swf instead of camera. useful for debugging.


Marilena

Marilena is a partial port of OpenCV, a computer vision library written in C++.
Marilena uses “Haar Cascades” to detect objects in an image, faces in particular.
my understanding is that Haar Cascades are data structures that specify a branching series of descriptions of graphic regions, which can be traversed in order to match a desired object. i think the face-tracking Haar Cascades start with the eyes, then move out from there.

i could be totally wrong about all of this though. it’s complicated stuff.

the code i have here is modified from squidder’s WebcamFaceDetection example, from here:
http://www.squidder.com/2009/02/26/realtime-face-detection-in-flash/

found that mario klingemann also did some optimization:
http://www.quasimondo.com/archives/000687.php

mario’s magic mirror:
http://incubator.quasimondo.com/flash/manic_mirror.php

mr. doob’s face-driven 3D:
http://mrdoob.com/lab/webcam/face_driven_3d/
