To the Graduate Council:

I am submitting herewith a thesis written by Sreenivas Rangan Sukumar entitled "Curvature Variation as Measure of Shape Information." I have examined the final electronic copy of this thesis for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Master of Science, with a major in Electrical Engineering.

Mongi A. Abidi, Major Professor

We have read this thesis and recommend its acceptance:

Michael J. Roberts
David L. Page
Andrei V. Gribok

Accepted for the Council:

Anne Mayhew
Vice Chancellor and Dean of Graduate Studies

(Original signatures are on file with official student records.)


Curvature Variation as Measure of Shape Information

A Thesis Presented for the Master of Science Degree
The University of Tennessee, Knoxville

Sreenivas Rangan Sukumar
December 2004


Acknowledgements

"Experience is the toughest of teachers; she gives me tests first and lessons later. What I learn is simply information; experience of information is knowledge. I've learnt that science is organized knowledge, but wisdom is organized life. And most importantly, I have learnt that I definitely have a lot more to learn…!"

It was not long ago that I used to think this section of the document was just another formality, until I realized its significance as a medium to express my gratitude to all the people without whose contribution the work I am presenting would have remained a dream.

Words are not enough to express what they have done for me. They have given me life, vision, and their happiness for my well-being, and but for their monotonically increasing affection I am sure I would not be what I am today. It is my pleasure to dedicate this work to my parents, Chellappa Sukumar and Malathi Sukumar.

There is an adage that says, "You can only take the horse to the pond but cannot make it drink." It has been an excellent learning experience under the academic guidance of Dr. Mongi Abidi. He showed me the pond in the Imaging, Robotics and Intelligent Systems Lab and provided me with the right kind of academic, financial, and philosophical support all through my pursuit.

I shall never forget the quotation on Dr. Page's desk: "It is not important what you learn, but it is important how you teach it to others." It takes a lot to be the unselfish teacher that you have been to me. Thanks a million, Dr. Page. I should also thank Dr. Andrei Gribok for the lively discussions of high technical impact on this work.

I would like to take this opportunity to appreciate the efforts of Dr. Andreas Koschan and Dr. Besma Abidi, whose rigorous review and feedback have added to my learning experience in the lab. Dr. Roberts has helped me with the documentation and review of my work. It would not be fair if I did not mention the efforts of Tak Motoyama in the data acquisition process.

A significant amount of the learning process at the graduate level has to be attributed to my peers at the lab. Faysal, Brad, and Yohan have been my inspirations toward a PhD degree, and the weekly brainstorming sessions with them have been a good platform to launch new ideas. I would like to extend a sincere thanks to Umayal for bequeathing her experience with the IVP Ranger to me, and I shall never forget our evenings at the "Motor Pool" scanning under-vehicle range data. It is my pleasure to acknowledge Ashwin, Madhan, Sampath, and Rishi, with whom I share "the" cherishable moments at Knoxville.

Sincere thanks to you all…


Abstract

In this thesis, we present the Curvature Variation Measure (CVM) as our informational approach to shape description. We base our algorithm on shape curvature and extract shape information as the entropic measure of the curvature. We present definitions to estimate curvature for both discrete 2D curves and 3D surfaces, and then formulate our theory of shape information from these definitions.

With focus on reverse engineering and under-vehicle inspection, we document our research efforts in constructing a scanning mechanism to model real-world objects. We use a laser-based range sensor for the data collection and discuss view fusion and integration to model real-world objects as triangle meshes. With the triangle mesh as the digitized representation of the object, we segment the mesh into smooth surface patches based on the curvedness of the surface. We perform region growing to obtain the patch adjacency and apply the definition of our CVM as a descriptor of surface complexity on each of these patches. We output the real-world object as a graph network of patches, with our CVM at the nodes describing the patch complexity. We demonstrate this algorithm with results on automotive components.


Contents

1 INTRODUCTION ........ 1
  1.1 Motivation ........ 2
  1.2 Proposed Approach ........ 4
  1.3 Document Organization ........ 6
2 LITERATURE REVIEW ........ 7
  2.1 Cognition and Computer Vision ........ 7
  2.2 Shape Analysis on 2D Images ........ 8
    2.2.1 Classification of Methods ........ 8
    2.2.2 Contour-Based Description ........ 10
    2.2.3 Region-Based Description ........ 13
  2.3 Shape Analysis on 3D Models ........ 16
    2.3.1 Classification of Methods ........ 16
    2.3.2 Feature Extraction ........ 17
    2.3.3 Descriptive Representation ........ 19
    2.3.4 Shape Histograms ........ 20
    2.3.5 Topology Description ........ 21
  2.4 Summary ........ 23
3 DATA COLLECTION AND MODELING ........ 26
  3.1 Range Data Acquisition ........ 26
    3.1.1 Range Acquisition Systems ........ 26
    3.1.2 Range Sensing Using the IVP Range Scanner ........ 27
  3.2 Solid Modeling from Range Images ........ 33
    3.2.1 Modeling Automotive Components for Reverse Engineering ........ 33
    3.2.2 Modeling Automotive Scenes for Under Vehicle Inspection ........ 36
4 ALGORITHM OVERVIEW ........ 39
  4.1 Algorithm Description ........ 39
    4.1.1 Informational Approach to Shape Description – Curvature Variation Measure ........ 40
    4.1.2 Curvature-Based Automotive Component Description ........ 41
  4.2 Building Blocks of the CVM Algorithm ........ 43
    4.2.1 Differential Geometry of Curves and Surfaces ........ 43
    4.2.2 Curvature Estimation ........ 45
    4.2.3 Density Estimation ........ 48
    4.2.4 Information Measure ........ 56
5 ANALYSIS AND RESULTS ........ 59
  5.1 Implementation Decisions on the Building Blocks ........ 59
    5.1.1 Analysis of Curvature Estimation Methods ........ 59
    5.1.2 Density Estimation for Information Measure ........ 63
  5.2 State-of-the-Art Shape Descriptors ........ 66
  5.3 Results of our Informational Approach ........ 70
    5.3.1 Intensity and Range Images ........ 70
    5.3.2 Surface Ruggedness ........ 70
    5.3.3 3D Mesh Models ........ 72
6 CONCLUSIONS ........ 81
  6.1 Contributions ........ 81


  6.2 Directions for the Future ........ 82
  6.3 Closing Remarks ........ 83
BIBLIOGRAPHY ........ 84
VITA ........ 100


List of Tables

Table 2.1: Qualitative comparison of 3D shape analysis methods with focus on algorithm efficiency ........ 24
Table 2.2: Qualitative comparison of 3D shape analysis methods with focus on effective description ........ 25
Table 4.1: Kernel functions ........ 52
Table 4.2: List of entropy-type measures of the form Σ_k p_k · φ(p_k) ........ 57


List of Figures

Figure 1.1: Engineering and reverse engineering ........ 2
Figure 1.2: Under vehicle inspection and surveillance ........ 3
Figure 1.3: Proposed approach ........ 5
Figure 2.1: Classification of shape description and representation, adapted from [Zhang, 2004] ........ 9
Figure 2.2: Shape contexts (reproduced from [Belongie, 2003]) ........ 11
Figure 2.3: Classification of methods on 3D data ........ 17
Figure 3.1: IVP Ranger SC-386 range acquisition system ........ 28
Figure 3.2: Triangulation and range image acquisition ........ 30
Figure 3.3: The process of calibration ........ 32
Figure 3.4: Graphical user interface ........ 33
Figure 3.5: Block diagram of a laser-based reverse engineering system ........ 34
Figure 3.6: Model creation ........ 35
Figure 3.7: Data acquisition for under vehicle inspection ........ 38
Figure 4.1: A circle and an arbitrary object ........ 40
Figure 4.2: Block diagram of our CVM as the informational approach to shape description ........ 41
Figure 4.3: Block diagram of the curvature-based vehicle component description algorithm, including patch decomposition and CVM computation ........ 42
Figure 4.4: Illustration to understand curvature of a surface ........ 44
Figure 4.5: Illustration that shows the effect of bin width on density estimation using a histogram ........ 49
Figure 4.6: Different methods used to estimate the density of the same dataset (reprinted from [Silverman, 1986]) ........ 51
Figure 4.7: Effect of bandwidth parameter on kernel density ........ 53
Figure 4.8: Resolution issue with Shannon-type measures ........ 58
Figure 5.1: Neighborhood of a vertex in a triangle mesh ........ 60


Figure 5.2: Curvature analysis – multi-resolution error analysis experiment with four different approaches to curvature estimation on triangle meshes ........ 62
Figure 5.3: Curvature analysis – error in curvature of a sphere at multiple resolutions ........ 64
Figure 5.4: Curvature analysis – variation in curvature for surface description ........ 65
Figure 5.5: Curvature-based descriptors ........ 67
Figure 5.6: Implementation of Shape Distributions ........ 68
Figure 5.7: Shape Distributions and its uniqueness in description ........ 69
Figure 5.8: Shape complexity measure using Shannon's definition of information ........ 71
Figure 5.9: Shape information and surface ruggedness ........ 72
Figure 5.10: Shape information divergence from the sphere – experimental results on superquadrics ........ 73
Figure 5.11: Surface description results – surface, curvature, and density of curvature of (a) spherical cap, (b) saddle, (c) monkey saddle ........ 75
Figure 5.12: Multi-resolution experiment on the monkey saddle – the surface, its curvature density, and the measure of shape information ........ 76
Figure 5.13: CVM graph results on simple mesh models: curvedness-based edge detection, smooth patch decomposition, and graph representation ........ 78
Figure 5.14: CVM graph results on automotive parts: curvedness-based edge detection, smooth patch decomposition, and graph representation ........ 79
Figure 5.15: CVM graph results on an under vehicle scene ........ 80


1 INTRODUCTION

Have we ever realized how easy it has been for us to locate a friend at the shopping center? How quickly we recollect something by looking at a photograph, and how accurately we approximate distance? It is indeed amazing to realize the design of 126 million receptors compactly packed into nerve endings and muscles that coordinate so impeccably well to process visual information that would require a bandwidth of 600 terahertz and a processing capability of 2 terabytes per second. We are just measuring the sensing capability of the eye, not to forget the extremely fast and meticulous brain that does the processing at that bandwidth with incredible accuracy and precision.

As computer vision researchers, we acknowledge the uncanny ability of the human visual system in object detection and recognition as we address the complexities involved in imparting this intelligence to a computer. The first and foremost computational hurdle is that of variability. A vision system needs to generalize across huge variations of an object due to viewpoint, illumination, occlusions, and many other factors, and still be very specific. For more than two decades researchers have fought such factors and the lack of important depth information in intensity images. With the increase in computational speed and capabilities of the electronic world, we now deal with 3D data. 3D sensors, in addition to having the capabilities of traditional cameras, require processing resources to extract depth information. By 3D data, we mean digitized representations of real-world objects that we can visualize and understand using a computer.

Computers can be programmed to understand a specific domain of objects by extracting features from their digital representation. An important feature used for image understanding is shape. Shape is interpreted as the geometric description of an object, and shape analysis refers to the process of feature extraction followed by feature matching. In this thesis we present the pipeline for 3D data collection and discuss a new shape analysis algorithm that we have developed. We base our algorithm on a feature that we define as the Curvature Variation Measure (CVM). We have implemented the algorithm in an application to reverse engineering and vehicle inspection that we elaborate on in Section 1.1.


1.1 Motivation

Computer-aided design (CAD) combined with computer-aided manufacturing (CAM) has revolutionized many engineering disciplines since the 1980s. In particular, CAD and CAM technologies have catered to the needs of automobile manufacturers. A designer can now rapidly fabricate a real-world tangible object from a conceptual CAD description. The process of designing and manufacturing components using a computer is often referred to as computer-aided engineering. In this context, we would like to introduce the idea of reverse engineering, which begins with the product and works through the design process in the opposite direction to arrive at a product definition statement. In doing so, it uncovers as much information as possible about the design ideas that were used to produce that particular product. By design ideas, we mean the shape and topology of the surfaces used at the time of modeling. At this point, we would like to emphasize that our focus is only on the geometric aspect of reverse engineering and not on the functional aspect of these mechanical components.

Reverse engineering aids the electronic dissemination and archival of information, in addition to offering the prospect of re-creating an out-of-production component. More recently, reverse engineering techniques have played a significant role in real-time rapid inspection and validation on the production line. The traditional approach to reverse engineering has been the use of coordinate measuring machines (CMMs), which require a probe in contact with the object at the time of digitization. Though CMMs are accurate, some applications demand non-contact digitization.

In Figure 1.1 we illustrate the process of reverse engineering as the reversal of CAM. We show that the reverse engineering of the disc brake involves acquiring 3D position data in the point cloud. We then represent the geometry of the object in terms of surface points and tessellated piecewise smooth surfaces. We now need to represent the point cloud in a form that the CAM system can interpret and manufacture.

Figure 1.1: Engineering and reverse engineering.


Another application that our research efforts target is under vehicle inspection. Vehicle inspection has traditionally been accomplished by security personnel walking around a vehicle with a mirror at the end of a stick. The inspection personnel are able to view underneath a vehicle to identify weapons, bombs, and other security threats. The mirror-on-a-stick system allows only partial coverage under a vehicle and is restricted by ambient lighting. The inspecting personnel are also at risk. As part of the Security Automation and Future Electromotive Robotics (SAFER) program, we aim at developing a robotic platform that deploys "sixth sense" sensors for threat assessment. We propose the idea of incorporating a 3D range sensor on the robotic platform. The idea is to be able to extract the 3D geometry of the undercarriage of automobiles. With prior manufacturer's information on the components that make up the undercarriage of the vehicle, we believe that it will be possible to identify foreign objects in the scene. For example, in Figure 1.2 we show the robotic platform and the 3D geometry of the scene containing the muffler, shaft, and catalytic converter. It will not be possible to extract the complete geometry of the undercarriage without dismantling the automobile. We hence need a representation scheme that maps the shape sensed from the scene to the CAD description and that is robust to occluded data.

Though vehicle inspection and reverse engineering appear to be different applications, they share the same processing pipeline as a computer vision task: designing a system that can capture the geometric structure of an object and store the subsequent shape and topology information. We discuss the use of laser-based range scanners for the extraction of 3D geometry and a curvature-based shape analysis algorithm, based on our CVM, to interpret surface topology.

Figure 1.2: Under vehicle inspection and surveillance (robotic platform, under vehicle scene, 3D geometry).
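The laser-based range scanners mentioned above produce range images: 2D grids whose entries encode the distance from the sensor to the scene. As a minimal, hypothetical sketch of how such a grid becomes a 3D point cloud, the code below back-projects each pixel through an assumed pin-hole camera model with illustrative intrinsics fx, fy, cx, cy (this is not the IVP Ranger's actual triangulation-based calibration, which Chapter 3 describes):

```python
def range_image_to_points(range_img, fx, fy, cx, cy):
    """Back-project a range image (2D grid of depths) into 3D points.

    A minimal pin-hole camera sketch; fx, fy, cx, cy are assumed
    intrinsics, not the scanner's actual calibration model.
    """
    points = []
    for v, row in enumerate(range_img):
        for u, z in enumerate(row):
            if z <= 0:          # skip invalid (no-return) pixels
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

The resulting per-view point clouds would then be registered and merged, as discussed in Section 1.2.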


1.2 Proposed Approach

Shape analysis is an age-old research topic and has been pursued since the dawn of image processing and computer vision. The literature on shape extraction from intensity images is vast and gives good insight into why vision research with intensity images has not been very successful. Most of the methods we have studied show promise with the richer and almost complete information available in 3D data. Though 3D data acquisition and processing is relatively new, there are a few important contributions in our context of shape similarity and shape description, all motivated by the challenge of object recognition. We survey the literature on shape analysis applied to intensity images and also summarize recent and ongoing work in 3D computer vision.

Computer vision systems seek to develop computer models of the real world through the processing of image data from sensors. In Figure 1.3(a) we present the flow diagram of our proposed approach. We begin with the data acquisition (Figure 1.3(b)) using laser-based range scanners and the process of creating CAD models using these scanners. We acquire range images using the laser-illuminated active range sensor from Integrated Vision Products Inc. (IVP). A range image is a 2D matrix with values proportional to the distance between the sensor and the object. We acquire range information from multiple views of the object to make sure that we have sufficient data to represent the object completely. We then transform the range data from the camera coordinate frame to the real world and integrate the multi-view point clouds into a single global reference frame. We reconstruct triangle meshes from the point clouds and use them as our input for the shape analysis.

We base our shape analysis algorithm on the part-based perception model [Stankiewicz, 2002]. With automotive components, our task is simplified because the components are man-made and manufacturing limitations restrict us to smooth (mostly planar and cylindrical) patches. We hence propose that the surface shape description of each of the parts and the connectivity of the parts can uniquely describe the object. In describing surfaces and surface complexity, we chose curvature to extract "shape information". We chose curvature because it is an information-preserving feature, is invariant to rotation, and possesses an intuitively pleasing correspondence to the perceptive property of "simplicity". We decompose the object of interest into a set of patches, assign a Curvature Variation Measure (CVM) to each of these patches, and represent the object as a patch adjacency graph. Our graph representation, when extended to scenes with occlusions, can still yield satisfactory results.

Consider the example in Figure 1.3 again. We first decompose the triangle mesh model into smooth patches. We show the disc brake model and decompose it into four parts, shading each part with a different color. We base our surface patch decomposition on the definition of curvedness in [Dorai, 1996]. Curvedness identifies sharp edges and creases.


Figure 1.3: Proposed approach. (a) Shape analysis based on our curvature variation measure – flow diagram (real-world object → data acquisition and modeling → surface patch decomposition → curvature variation measure → graph representation). (b) Data acquisition and modeling (multi-view range images → multi-view registration → view integration → mesh modeling). (c) Surface patch decomposition (compute curvedness → segment by region growing → connectivity of surface patches). (d) Curvature variation measure (curvature computation → density estimation → information measure).


We then perform region-growing segmentation and save the patch adjacency information, as illustrated in Figure 1.3(c).

Now that we have segmented the surface patches that make up the object, we compute the curvature variation measure on each of these patches (Figure 1.3(d)). We have borrowed concepts from Shannon's idea of measuring information in a probabilistic framework [Shannon, 1948]. We hence define the curvature variation measure as the entropy of curvature along that surface. We present a brief analysis of various curvature estimation methods on triangle meshes and reiterate the importance of bandwidth-optimized density estimation in stabilizing the information measure. Our modification of Shannon's definition of entropy is normalized and invariant to scale. The normalized, resolution-invariant measure attempts to quantify the complexity of the surface by a single number. Similar shapes at different scales will have equal measures.

1.3 Document Organization

The remainder of this thesis documents the theory and results of our data collection and CVM algorithm. Chapter 2 presents a survey of the literature on the shape analysis and description of 2D images and 3D models. Here we explain why methods in 2D cannot be extended to 3D and discuss the scope for extending the state of the art. Then, we present our experience with data acquisition using a laser-based scanner for creating 3D models of automotive components and scenes under the vehicle in Chapter 3. Chapter 4 documents the theory that supports our shape analysis algorithm. We test our algorithm on the acquired data and present our results in Chapter 5. These experimental results demonstrate the capabilities of our algorithm and its scope as an object recognition system. Finally, we conclude with possible extensions in Chapter 6.
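Before moving on, the entropy-of-curvature idea sketched in Section 1.2 can be illustrated in a few lines. This is a deliberately simplified stand-in: a fixed-bin histogram replaces the bandwidth-optimized kernel density estimate developed in Chapter 4, and the entropy is divided by log(n_bins) so the result lies in [0, 1]; the numbers it produces are illustrative only, not the thesis's actual CVM values:

```python
import math

def cvm_sketch(curvatures, n_bins=16):
    """Normalized entropy of a sample of curvature values.

    Simplified CVM illustration: histogram density estimate plus
    Shannon entropy, normalized by log(n_bins) to be unit-free.
    """
    lo, hi = min(curvatures), max(curvatures)
    if hi == lo:                 # constant curvature: no shape variation
        return 0.0
    counts = [0] * n_bins
    for k in curvatures:
        i = min(int((k - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[i] += 1
    n = len(curvatures)
    h = -sum(c / n * math.log(c / n) for c in counts if c > 0)
    return h / math.log(n_bins)
```

A patch of constant curvature (e.g. a sphere) scores 0, while a patch whose curvature values spread evenly over their range scores close to 1, matching the intuition that the measure quantifies surface complexity.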


Chapter 2: Literature Review 72 LITERATURE REVIEWIn this chapter we present a review of <strong>the</strong> research literature. In Section 2.1 weintroduce <strong>the</strong> reader to shape and its implication to computer vision and briefly reviewsome key methods on 2D images in Section 2.2. We discuss contemporary research in3D computer vision for shape analysis in Section 2.3 and summarize our survey inSection 2.4.2.1 Cognition and Computer VisionThe human cognitive system is designed to interpret sensory data with suchremarkable speed and accuracy that we fail to appreciate millions of computationsinvolved in a common event of identifying an object. An impressive component inhuman perception is our ability to recognize 3D objects from <strong>the</strong>ir 2D retinalprojections. Stankiewicz outlines human visual perception into three possiblehypo<strong>the</strong>ses, n<strong>am</strong>ely <strong>the</strong> feature model approach, alignment model approach and <strong>the</strong>part-based approach [Stankiewicz, 2002]. Feature models propose that <strong>the</strong> visualsystem does not match a precise numerical array of an object with ano<strong>the</strong>r butremembers a collection of features in memory. According to this approach <strong>the</strong> locationof <strong>the</strong> features in a particular image is less significant than its presence in <strong>the</strong> image.The feature model approach fails with increasing occlusions and is less reliable when<strong>the</strong> spatial relationship between <strong>the</strong> features and <strong>the</strong> image are vital in recognizing <strong>the</strong>object. Alignment models make use of <strong>the</strong> spatial information to compensate forviewpoint changes but do not consider occlusions. They can handle Euclideantransformations such as <strong>the</strong> rotation, translation and scaling and are accepted to berobust in comparison with <strong>the</strong> feature models. 
Part-based models operate by decomposing an object into its constituent parts. The approach uses image features to describe the shape of the parts in addition to documenting the relationships between parts. The part-based model has not met with great success in computer vision because intensity images carry insufficient information to segment objects into parts, but with increasing computational capabilities and improvements in sensor technology towards 3D imaging, part-based models are a promising prospect.


Shape is the geometric information invariant to a particular class of transformations such as affine transformation, translation, rotation and scaling, and is considered to be the "words" of the visual language. Shape analysis is an important aspect of image understanding. Since so many objects in our world are strongly determined by their geometric properties, the applications of shape analysis extend over a broad spectrum of science and technology. Indeed, when properly and carefully applied, shape analysis provides rich potential for applications in diverse areas, spanning computer vision, graphics, material sciences, biology and even neuroscience.

2.2 Shape Analysis on 2D Images

Shape description looks for effective and perceptually important shape features based on either shape boundary information or interior content. By perceptually similar shapes we are referring to shapes that are rotated, translated, scaled or affine transformed. Many shape representation techniques have been developed in the past, and shape analysis still remains an interesting field of research. A few such representation techniques are shape signatures, shape histograms, moments, curvature, shape context and the shape matrix.
We would like to direct the reader to [Zhang and Lu, 2004] for a recent and comprehensive survey on 2D shape representation for various applications.

2.2.1 Classification of Methods

Shape representation techniques are generally classified into two classes based on whether shape features are extracted from the contour only or from the whole region. Zhang and Lu [Zhang and Lu, 2004] subdivide each of these classes further into structural and global approaches based on the primitives used to describe the shape. They discuss methods that operate on the space domain and the transform domain to extract shape information and classify shape description methods as shown in Figure 2.1.

Contour-based approaches are more popular in the computer vision literature. Such methods assume that human beings discriminate shapes mainly by their feature contours. The contour-based approach is limited by noise and by data that do not have sufficient information (occlusions) in the boundary contour. Region-based methods are considered to be more robust and are dependable for accurate retrieval, as they attempt to extract shape information from the entire region and not just its boundary.


Figure 2.1: Classification of shape description and representation adapted from [Zhang, 2004]. The figure organizes 2D shape methods into contour-based and region-based classes, each subdivided into structural and global approaches (chain code, polygon, B-spline, invariants; perimeter, compactness, eccentricity, shape signature, Hausdorff distance, wavelets, scale space, autoregressive, elastic matching; area, Euler number, eccentricity, geometric moments, Zernike moments, Fourier descriptor, grid method, shape matrix; convex hull, medial axis, core).


2.2.2 Contour-Based Description

Contour-based shape representation techniques extract shape information from the boundary. There are generally two approaches to contour shape modeling: the continuous global approach and the discrete structural approach. The global approach makes use of feature vectors derived from the boundary to describe shape; the measure of shape similarity is the metric distance between feature vectors. The discrete approach represents the shape as a graph or tree of segments (primitives), and shape similarity is deduced by string or graph matching.

We begin our analysis with the contour-based global shape description methods. The most commonly used global shape descriptors are surface area, circularity, eccentricity, convexity, bending energy, ratio of principal axes, circular variance, elliptic variance and orientation. These simple descriptors are not suitable as standalone descriptors but are usually used to discriminate shapes with large differences or to filter false hits. Some of them are also used in combination with other descriptors for shape description. The efficiency of such descriptors is discussed in [Peura and Iivarinen, 1997].

A few space-domain techniques compute correspondence-based shape measures using point-to-point matching, where each point on the boundary is considered a contributor to shape. The Hausdorff distance is a classical correspondence-based shape matching method, often used to locate objects in an image and to measure similarity between shapes, as discussed in [Huttenlocher, 1992]. Given two shapes S_1 = {a_1, a_2, ..., a_p} and S_2 = {b_1, b_2, ..., b_q} represented as two sets of points, the Hausdorff distance is defined as

    H(S_1, S_2) = max{ h(S_1, S_2), h(S_2, S_1) },   h(S_1, S_2) = max_{a ∈ S_1} min_{b ∈ S_2} ||a − b||    (2.1)

where ||·|| refers to the Euclidean distance.

The Hausdorff distance measure is sensitive to noise but is useful for partial matching invariant to rotation, scale and translation. Rucklidge improves it with a new measure between two datasets, using a prohibitively expensive matching procedure that tackles different orientations, positions and scales [Rucklidge, 1997]. A more recent but similar approach to shape matching was introduced under the name "shape contexts" in [Belongie et al., 2002]. Shape contexts extract a global feature at every point, reducing the point-to-point matching to a matrix matching of contexts. To extract the shape context at a point p on the boundary, the vectors that connect p to each of the other points on the boundary are computed. The lengths and orientations of these vectors are quantized into a log-polar histogram map for that point p to account for additional sensitivity to neighboring points. These histograms are flattened and concatenated to form the context of the shape as shown in Figure 2.2.
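A minimal sketch of Equation 2.1 in Python may make the definition concrete; the point sets and the brute-force O(pq) nearest-neighbor search are illustrative assumptions, not the matching procedure of the cited work:

```python
import math

def directed_hausdorff(S1, S2):
    """h(S1, S2): for every point in S1, the distance to its nearest
    neighbour in S2; return the worst (largest) such distance."""
    return max(min(math.dist(a, b) for b in S2) for a in S1)

def hausdorff(S1, S2):
    """Symmetric Hausdorff distance of Equation 2.1."""
    return max(directed_hausdorff(S1, S2), directed_hausdorff(S2, S1))

# Toy example: a unit square and the same square shifted by 0.1 in x.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
shifted = [(0.1, 0), (1.1, 0), (1.1, 1), (0.1, 1)]
print(hausdorff(square, shifted))  # ~0.1: every point moved by 0.1
```

The asymmetry of the directed distance h is why the outer max over both directions is needed: h(S1, S2) can be small while h(S2, S1) is large when S2 has outlying points far from S1.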


Figure 2.2: Shape contexts, reproduced from [Belongie, 2003]. (a) A character shape. (b) Edge image of (a). (c) The histogram that forms the context of the point p. (d) The log-polar histogram. (e) Each row of the context map is the flattened histogram of one point context; the number of rows is the number of sampled points.

Davies [Davies, 1997] describes shape signatures as one-dimensional functions derived from the shape boundary points. Some shape signatures that can be found in the literature are the centroidal profile, complex coordinates, tangent angle, cumulative angle, chord length and curvature. Shape signatures are usually normalized in scale. Translational and rotational invariance is achieved by a shift-search procedure over the one-dimensional function extracted from the shape boundary. Shape signatures require further processing, in addition to their high matching cost, to overcome their sensitivity and improve robustness. Autoregressive models [Chellappa and Bagdazian, 1984] are stochastically defined predictor-based methods that depend on modeling the shape as a 1D function.

Boundary moments are extensions of shape signatures that reduce the dimensionality of the boundary representation. If z(i) is an extracted shape signature of a boundary, the r-th moment m_r and the r-th central moment µ_r can be estimated as shown in Equations 2.2 and 2.3:

    m_r = (1/N) Σ_{i=1}^{N} [z(i)]^r    (2.2)

    µ_r = (1/N) Σ_{i=1}^{N} [z(i) − m_1]^r    (2.3)

where N is the number of points representing the boundary.
The normalized moments are invariant to shape translation, rotation and scaling. As discussed in [Gonzalez, 2002], the amplitude of the shape signature can be treated as a random variable and its moments computed from its histogram. These moments are easily computable but have no physical significance.
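As an illustration of Equations 2.2 and 2.3, the sketch below computes low-order moments of a centroid-distance signature; the sampled boundary and the choice of signature are assumptions made for the example:

```python
import math

def centroid_distance_signature(boundary):
    """z(i): distance of each boundary point from the shape centroid."""
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    return [math.hypot(x - cx, y - cy) for x, y in boundary]

def moment(z, r):
    """m_r of Equation 2.2."""
    return sum(v ** r for v in z) / len(z)

def central_moment(z, r):
    """mu_r of Equation 2.3."""
    m1 = moment(z, 1)
    return sum((v - m1) ** r for v in z) / len(z)

# Unit square sampled only at its four corners (toy example).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
z = centroid_distance_signature(square)
print(moment(z, 1))          # mean distance to the centroid
print(central_moment(z, 2))  # variance: 0 for this symmetric sampling
```

Because every corner is equidistant from the centroid, the second central moment vanishes here; a denser or less symmetric sampling would give non-zero higher moments.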


Bimbo [Bimbo, 1997] implements elastic matching for shape-based image retrieval. A deformed template is generated as the sum of the original template and a warping deformation. The similarity between the original template shape and the object is obtained by minimizing a compound function, the sum of the strain energy, the bending energy and the deviation measure of the deformed template with respect to the object. He defines shape complexity as the number of curvature zero crossings and a correlation between the curvature functions of the template and the object. The classification is performed by a back-propagation neural network.

Most of the space-domain techniques discussed in the literature are sensitive to noise and boundary deviations. Spectral-domain techniques resolve these noise issues. The simplest of the spectral-domain descriptors are the Fourier descriptors [Zhang and Lu, 2002] and the wavelet descriptors [Yang et al., 1998]. They are derived from the one-dimensional shape signatures of the shape function. They are easy to compute and normalize, and they bypass the complex matching stages of the shape-signature-based methods. Zhang and Lu [Zhang and Lu, 2002] argue that the centroidal profile is the most efficient shape signature to use in combination with Fourier descriptors.

Structural shape representation is yet another approach to shape description, as shown in Figure 2.1.
With the structural approach, shapes are broken down into segments called shape primitives. Structural methods differ in the selection of primitives and in the organization of the primitives for shape representation. Some of the common methods of boundary decomposition are based on polygonal approximation, curvature decomposition and curve fitting. The result of the decomposition is encoded in a general string form that can be used with a high-level syntactic analyzer for shape comparison tasks.

The chain code described by [Freeman and Saghri, 1978] is a sequence of unit-size line segments with given orientations. The unit vector method describes any arbitrary curve as a sequence of small vectors of unit length in a set of directions. Chain codes need to be independent of the starting boundary pixel. This independence is achieved either by a scheme that defines the characteristics of a starting pixel or by representing the chain code as differences between successive directions. Chain codes used for object recognition and matching are rotation invariant but not scale invariant. Polygonal decomposition methods discussed in [Groskey et al., 1992] break a given boundary into line segments by using polygon vertices as primitives. Feature strings are created with four elements, such as the internal angle, the distance from the next vertex and the coordinates of the vertex. The similarity of shapes is the editing distance between the two feature strings representing the shapes. Mehrotra and Gary [Mehrotra and Gary, 1995] represent shape as a series of interest points taken from the polygonal boundary approximation.
These points are mapped onto a new scale- and rotation-invariant basis to represent the shape in a new coordinate system. Berretti et al. [Berretti et al., 2000] extend [Groskey and Mehrotra, 1990] for shape retrieval by defining tokens as the zero crossings of Gaussian curvature; shape similarity is the Euclidean distance between primitives. Dudek and Tsotsos [Dudek and Tsotsos, 1997] use curvature scale spaces for shape matching. In this approach, shape primitives are first obtained from a curvature-tuned smoothing technique. A segment descriptor, consisting of the segment's length, ordinal position and curvature turning value, is extracted from each of these primitives. A string of segment descriptors is then created to describe the shape. For two shapes A and B represented by their string descriptors, a model-by-model match using dynamic programming is exploited to obtain the similarity score of the two shapes. To increase robustness and to save matching computation, the shape features are put into a curvature scale space so that shapes can be matched even at different scales. However, due to the inclusion of length in the segment descriptors, the descriptors are not scale invariant.

Another interesting approach to the analysis of shape is the syntactic analysis of [Fu, 1974], which attempts to simulate the structural and hierarchical nature of the human vision system. In syntactic methods, shape is represented by a set of predefined primitives. The set of predefined primitives is called the codebook, and the primitives are called code words.
The matching between shapes can use string matching, finding the minimal number of edit operations needed to convert one string into another. However, this is not practical in general applications, due to the fact that it is not possible to infer a pattern grammar that generates only the valid patterns. In addition, the method needs a priori knowledge of the database in order to define the code words or alphabet. Shape invariants make use of simple shape descriptors such as the cross ratio, length and area to derive a multi-valued signature. Kliot and Rivlin [Kliot and Rivlin, 1998] propose a multi-valued matrix that can be used for matching two curves. This method can be improved with a histogram matching stage before the matrix matching. Squire and Caelli [Squire and Caelli, 2000] use the density function of piecewise linear curves as their shape invariant. The histogram of the shape-invariant signature is fed into a neural network for classification.

2.2.3 Region-Based Description

Region-based techniques take into account all the pixels within a shape region to obtain the shape representation, rather than using only boundary information as in contour-based methods. Common region-based methods use moment descriptors to describe shapes. Structural methods include the grid method, shape matrix, convex hull and medial axis. Global methods treat the shape as a whole; the resultant representation is a numeric feature vector that can be used for shape description, while structural methods break the shape down into segments. Similarity between global shape descriptors is simply the metric distance between their feature vectors.
Some of the global descriptors are the geometric moment invariants and the algebraic moment invariants. One of the oldest global methods implemented for region-based description is from Hu [Hu, 1962], who applied the work of nineteenth-century mathematicians on moments to images for pattern recognition:

    m_pq = Σ_x Σ_y x^p y^q f(x, y),   p, q = 0, 1, 2, ...    (2.4)

Lower-order geometric moments from Equation 2.4 are easy to compute and are sufficient for representing simple shapes. Algebraic moments [Taubin and Cooper, 1991], [Taubin and Cooper, 1992], on the other hand, are based on the central moments of predetermined matrices that can be constructed for any order and are invariant to affine transformations. Teague [Teague, 1980] defines orthogonal moments by replacing the x^p y^q term with the Zernike polynomials. Moment shape descriptors are concise, robust, and easy to compute and match. The disadvantage of moment methods is that it is difficult to correlate higher-order moments with a shape's physical features.

Among the many moment shape descriptors, Zernike moments [Jeannin, 2000] are the most desirable for shape description. Due to the incorporation of a sinusoid function into the kernel, they have properties similar to spectral features, which are well understood. Although Zernike moment descriptors have robust performance, they have several shortcomings. First, the kernel of the Zernike moments is complex to compute, and the shape has to be normalized into a unit disk before deriving the moment features. Second, the radial features and circular features captured by Zernike moments are not consistent: one is in the spatial domain and the other is in the spectral domain.
This approach does not allow multi-resolution analysis of a shape in the radial direction. Third, the circular spectral features are not captured evenly at each order, which can result in the loss of significant features that are useful for shape description.

To overcome these shortcomings, a generic Fourier descriptor (GFD) has been proposed by Zhang and Lu [Zhang and Lu, 2002]. The GFD is acquired by applying a 2D Fourier transform to a polar-raster sampled image, as in Equation 2.5:

    PF(ρ, φ) = Σ_r Σ_i f(r, θ_i) exp[ j2π( (r/R)ρ + (2πi/T)φ ) ]    (2.5)

where f(r, θ_i) is the polar-raster sampled image and R and T are the radial and angular resolutions. Zhang and Lu show that the GFD outperforms contour shape descriptors such as curvature scale space, Fourier descriptors and moment-based descriptors.

The grid shape descriptor proposed by [Lu and Sajjanhar, 1999] has been used in [Chakrabarti et al., 2000] and [Safar et al., 2000]. Basically, a grid of cells is overlaid on a shape; the grid is then scanned from left to right and top to bottom, and the result is saved as a bitmap. The cells covered by the shape are assigned one and those not covered by the shape are assigned zero, so the shape can be represented as a binary feature vector. The binary Hamming distance is used to measure the similarity between two shapes. To account for invariance to Euclidean transformations, the shape needs to be normalized. Chakrabarti et al. [Chakrabarti et al., 2000] improve the grid descriptor by using an adaptive resolution (AR) representation acquired by applying a quad-tree decomposition to the bitmap representation of the shape.

Typically, shape methods use rectangular-grid sampling to acquire shape information. The shape representation so derived is usually not translation, rotation and scaling invariant, and extra normalization is therefore required. Goshtasby [Goshtasby, 1985] proposes the use of a shape matrix derived from a circular raster sampling technique. The idea is similar to normal raster sampling; however, rather than overlaying the normal square grid on the shape image, a polar raster of concentric circles and radial lines is overlaid at the center of mass. The binary value of the shape is sampled at the intersections of the circles and radial lines. The shape matrix is formed such that the circles correspond to the matrix columns and the radial lines correspond to the matrix rows. Prior to the sampling, the shape is scale-normalized using the maximum radius of the shape. The resultant matrix representation is invariant to translation, rotation, and scaling.
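The polar-raster construction just described can be sketched as follows; the predicate interface, the ring and ray counts, and the Hamming comparison are illustrative assumptions, not Goshtasby's implementation:

```python
import math

def shape_matrix(inside, max_radius, n_rings=8, n_rays=16):
    """Sample a binary shape (given as a predicate inside(x, y) centred
    on its centre of mass) on a polar raster: rows correspond to radial
    lines, columns to concentric circles."""
    matrix = []
    for j in range(n_rays):                  # one row per radial line
        theta = 2 * math.pi * j / n_rays
        row = []
        for k in range(1, n_rings + 1):      # one column per circle
            r = max_radius * k / n_rings     # scale-normalised radius
            x, y = r * math.cos(theta), r * math.sin(theta)
            row.append(1 if inside(x, y) else 0)
        matrix.append(row)
    return matrix

def hamming(m1, m2):
    """Dissimilarity between two shape matrices of equal size."""
    return sum(a != b for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

def disk(x, y):
    return x * x + y * y <= 1.0              # unit disk centred at origin

m = shape_matrix(disk, max_radius=1.0)
print(hamming(m, m))  # 0: identical shapes
```

Rotating the shape permutes the rows of the matrix and scaling cancels out through max_radius, which is the intuition behind the invariance claims above.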
Since the sampling density is not constant with the polar sampling raster, Taza et al. [Taza et al., 1989] represent shape using a weighted shape matrix, which gives more weight to peripheral samples. However, since a shape matrix is a sparse sampling of the shape, it is easily affected by noise; besides, shape matching using a shape matrix is expensive. Parui et al. [Parui et al., 1986] propose a shape description based on the relative areas of the shape contained in concentric rings located at the shape's center of mass.

Structural methods for region-based shape description usually involve the convex hull and the medial axis, described in [Davies, 1997], [Blum, 1967] and [Morse, 1994]. A region R is convex if and only if for any two points x_1, x_2 ∈ R, the whole line segment x_1x_2 is inside the region. The convex hull of a region is the smallest convex region H that satisfies the condition R ⊆ H. The difference H − R is called the convex deficiency D of the region R. The extraction of the convex hull can be achieved either using the boundary-tracing method of [Sonka et al., 1993] or using the morphological methods of [Gonzalez and Woods, 1992]. Since shape boundaries tend to be irregular because of digitization noise and variations in segmentation, the convex deficiency has small, meaningless components scattered throughout the boundary. Common practice is to first smooth a boundary prior to partitioning.
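The convex hull construction above can be sketched with Andrew's monotone-chain algorithm; the algorithm choice and the point-list interface are assumptions for illustration, not the boundary-tracing or morphological methods cited:

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices of a set of 2D points,
    returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]  # last point starts the other half

    return half(pts) + half(reversed(pts))

# A square with one interior point: the interior point is not a hull vertex.
pts = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(convex_hull(pts))  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

With the hull in hand, the convex deficiency of a digitized region is simply the set of region cells inside the hull but outside the shape, which is the quantity the concavity-tree recursion described below operates on.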
The polygon approximation is particularly attractive because it can reduce the computation time for extracting the convex hull from O(n²) to O(n), n being the number of points in the shape. The extraction of the convex hull can be a single process that finds the significant convex deficiencies along the boundary. A fuller representation of the shape is obtained by a recursive process that results in a concavity tree: the convex hull of the object is first obtained along with its convex deficiencies; then the convex hulls and deficiencies of those convex deficiencies are found, and the recursion continues until all the derived convex deficiencies are convex. The shape can then be represented by a string of concavities (the concavity tree). Each concavity can be described by its area, its bridge length (the bridge is the line that closes off the concavity), its maximum curvature, and the distance from the point of maximum curvature to the bridge. The matching between shapes becomes a string or graph matching.

Like the convex hull, the region skeleton is also employed for shape representation. A skeleton may be defined as a connected set of medial lines along the limbs of a shape. The basic idea of the skeleton is to eliminate redundant information while retaining only the topological information concerning the structure of the object that can help with recognition. Skeleton methods are represented by Blum's medial axis transform (MAT) [Blum, 1967]. The medial axis is the locus of the centers of the maximal disks that fit within the shape; the bold line in the figure is the skeleton of the shaded rectangular shape. The skeleton can then be decomposed into segments and represented as a graph according to certain criteria, and the matching between shapes becomes a graph matching. The computation of the medial axis is a rather challenging problem; in addition, the medial axis tends to be very sensitive to boundary noise and variations. Preprocessing the contour of the shape and finding its polygonal approximation has been suggested as a way of overcoming these problems.
But, as has been pointed out by Pavlidis [Pavlidis, 1982], obtaining such polygonal approximations can be quite sufficient in itself for shape description. Morse [Morse, 1994] computes the core of a shape from the medial axis in scale space.

We conclude this section with the note that shape description from intensity images has to deal with view occlusions and a lack of sufficient information. We now study some important methods used for shape analysis on 3D mesh models in Section 2.3.

2.3 Shape Analysis on 3D Models

In Section 2.2 we reviewed techniques implemented for shape extraction from 2D intensity images. In the following section we present a classification of the methods in the literature on digitized 3D representations, followed by a brief description of some interesting methods.

2.3.1 Classification of Methods

There is a multitude of techniques for assessing similarity among 2D shapes, as discussed in Section 2.2. Most of these techniques do not extend to 3D models because of the difficulty of extending the parameterization of the boundary curve from 2D to 3D. In simple words, given a 2D shape, its parameterization is a straightforward 1D curve; with a 3D real-world object this is difficult because, when the object is projected onto a 2D image plane, one dimension of object information is lost. The 3D domain also requires dealing with objects of different genus, which makes most of the 2D similarity assessment methods impossible to extend to 3D. The challenge in 3D computer vision is more than just the lack of information as in the 2D case; it must also address computational effort and descriptive representation. 3D data are usually represented as meshes or assemblies of simple primitives, a representation scheme suitable for visualization but not for recognition and computer vision tasks. The process of shape assessment hence becomes a two-step process: (1) shape signature extraction and (2) comparison of the shape signatures with distance functions. Based on how shape is extracted from the 3D model, representation techniques can be classified as shown in Figure 2.3.

2.3.2 Feature Extraction

Feature extraction techniques usually attempt to represent the shape of a 3D object by a combination of one-dimensional feature vectors. A common approach to similarity models is based on the paradigm of feature vectors: a feature transform maps a complex object onto a feature vector in a multidimensional space, and the similarity of two objects is then defined by the vicinity of their feature vectors in the feature space.
Geometric parameters and ratios such as surface area, volume ratio, compactness, Euler numbers and crinkliness have been used, with limited discrimination capability.

Figure 2.3: Classification of methods on 3D data, dividing them into feature extraction and descriptive representation approaches (shape histograms, topology description, shape context, shock graphs, alpha shapes, aspect graph, spin images, harmonic shape images, COSMOS, geometry images, shape distributions, spider model, local feature histograms, Reeb graph, skeletal graphs, model signatures).


Chapter 2: Literature Review 18Kortgen et al. [Kortgen et al., 2003] achieves shape matching <strong>by</strong> extending <strong>the</strong> 2Dshape contexts [Belongie, 2003] to 3D. They use <strong>the</strong> shape context at a point on <strong>the</strong>surface as <strong>the</strong> summary of <strong>the</strong> global shape characteristics invariant to rotation,translation and scaling. Vranic and Saupe [Vranic and Saupe, 2001] propose a newmethod for shape similarity search on polygonal meshes. They characterize spatialproperties of 3D objects such that similar objects are mapped as close points in <strong>the</strong>feature space. They <strong>the</strong>n perform a coarse voxelization of <strong>the</strong> object in <strong>the</strong> canonicalcoordinate fr<strong>am</strong>e and compute <strong>the</strong> absolute value of <strong>the</strong> 3D Fourier coefficients as <strong>the</strong>feature vector. Vranic improves it fur<strong>the</strong>r in [Vranic, 2003]. Ohbuchi et al. [Obhuchi etal., 2003] describe a multi-resolution analysis technique for <strong>the</strong> task of shapesimilarity comparison. They use 3D alpha shapes to generate a multi-resolutionhierarchy of shapes of a given query object. They <strong>the</strong>n follow that <strong>by</strong> applying asimple shape descriptor such as <strong>the</strong> D 2 shape function introduced <strong>by</strong> [Osada et al.,2002] on each of <strong>the</strong> multi-resolution representations and call it <strong>the</strong> multi-resolutionshape descriptor.Automated feature recognition has also been attempted <strong>by</strong> extracting instances ofmanufactured features from engineering designs. Henderson [Henderson et al., 1993]is an extensive survey of such methods that make use of a library of machiningfeatures for description. With <strong>the</strong> assumption of primitives, procedural methodsproposed <strong>by</strong> Elinson et al. 
[Elinson et al., 1997] and Mukai et al. [Mukai et al., 2002] have applied constructive solid geometry (CSG) to classify CAD models of mechanical parts. Their methods, however, cannot be extended to a more general class of shapes represented as point sets and meshes. Biermann et al. in [Biermann et al., 2001] propose Boolean operations on primitives for shape description. However, direct assessment of similarity between 3D models using Boolean operations is computationally slow because of the difficulty in aligning the models before performing the operation; with a large database it is not a pragmatic solution. Zhang and Chen [Zhang and Chen, 2001] discuss efficient global feature extraction methods from the mesh representation.

The moments of Duda and Hart [Duda and Hart, 1973] have been extended by Khotanzad et al. [Khotanzad et al., 1980] to a subset of 3D moments that are invariant to rotation, translation and scaling and that can be used as feature vectors for shapes, as shown in Equation 2.6:

m_pqr = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} x^p y^q z^r ρ(x, y, z) dx dy dz        (2.6)

where ρ(x, y, z) represents the point cloud of the model.

Cybenko et al. [Cybenko et al., 1997] use second-order moments, spherical kernel moment invariants, bounding-box dimensions, the object centroid and the surface area along with a correlation metric for shape-similarity measurement. Elad et al. [Elad et al.,


2001] implement support vector machines for adaptively selecting weights for distance measurements between moments for shape similarity. Corney et al. [Corney et al., 2002] compute the Euclidean distance between simple geometric ratios as a shape similarity measure. Cyr and Kimia [Cyr and Kimia, 2001] use a shock graph-based shape similarity metric to assess the similarity between 3D models. Adjacent views are clustered, thus generating the aspect, using a seeded region-growing technique that satisfies the local monotonicity and specific distinctiveness criteria of the aspect view. The comparison of two 3D models is achieved by matching the 2D aspect views.

2.3.3 Descriptive Representation

In this category of methods, shape matching is achieved through an intermediate representation that aids a matching stage. These methods are usually robust but computationally expensive. Usually the 3D information is broken down into a stack of 2D descriptors to which robust 2D shape matching techniques can be applied.

Dorai presents COSMOS (Curvedness-Orientation-Shape Map on Sphere) [Dorai, 1996] as a representation scheme for 3D free-form objects from range data without occlusions. According to this scheme, the object is represented concisely in terms of maximal surface patches of constant shape index. The shape index is a quantitative measure of the shape complexity of the surface and is based on the principal curvatures at a point on the surface. The patches are mapped onto a sphere based on their orientations and aggregated using shape spectral functions. Surface area, curvedness and connectivity are utilized to capture global shape information.
She derives a shape spectrum and experiments on its recognition efficiency.

Johnson and Hebert [Johnson and Hebert, 1999] introduce spin images for a 3D shape-based object recognition system aimed at the simultaneous recognition of multiple objects in scenes containing clutter and occlusion. The spin image is a data-level descriptor used to match surfaces represented as surface meshes. They describe surface shape as a dense collection of points and surface normals and associate a descriptive image with each surface point that encodes global properties of the surface using an object-centered coordinate system. The spin image is created by constructing a local basis at an oriented point on the surface of the object and accumulating geometric parameters in a 2D histogram. In simple words, the spin image can be visualized as a sheet spinning about the normal at that point. The image is descriptive because it accounts for all the points on the surface, and it is invariant to rigid transformations.

Kazhdan et al. [Kazhdan et al., 2003] outline an algorithm for 3D shape matching using a harmonic representation of 3D polygonal meshes. They rasterize the 3D mesh into a 64 x 64 x 64 voxel grid and center the object in the grid so that the


bounding sphere has a radius of 32 voxels. They then treat the object as a function in 3D space and decompose it into 32 spherical functions by considering spheres of radii 1 through 32. They further decompose each of these functions into 16 harmonic components, and the 32 x 16 harmonics constitute the harmonic representation of the 3D model. Two harmonic representations are compared with the Euclidean distance.

Zhang and Hebert propose harmonic shape images as a 2D representation of surface patches [Zhang and Hebert, 1999]. The theory of harmonic maps studies the mapping between different metric manifolds from the energy-minimization point of view. With the application of harmonic maps, a surface representation called harmonic shape images is generated to represent and match 3D free-form surfaces. The basic idea of harmonic shape images is to map a 3D surface patch with disc topology to a 2D domain and encode the shape information of the surface patch into the 2D image. This simplifies the surface-matching problem to a 2D image-matching problem.

Shum et al. address the problem of 3D shape similarity between closed surfaces in [Shum et al., 1996]. They define a shape similarity metric as the L2 distance between the local curvature distributions over the spherical mesh representations of the two objects. The similarity measure is achieved in O(n²) complexity, where n is the number of tessellations in the object mesh.
Their experiments on simple shapes show good shape similarity measurements.

2.3.4 Shape Histograms

Histogram-based methods reduce the cost of complex matching schemes but sacrifice the efficiency and robustness of the methods discussed in Section 2.3.2. These methods compare shapes on the basis of their statistical properties.

Ankerst et al. [Ankerst et al., 1999] introduce 3D shape histograms as an intuitive, powerful approach to the structural classification of proteins. They decompose a 3D object into three models (shell model, sector model and spider web) around the object's centroid and process model similarity queries based on a filter-refinement architecture. A similar search technique for mechanical parts using histograms was proposed in [Kriegel et al., 2003]. The models are normalized into a canonical form and voxelized into axis-parallel equal partitions. Each of these partitions is assigned to one or several bins in a histogram depending on the specific similarity model.

Besl et al. [Besl et al., 1995] consider histograms of crease angles for all edges in a triangle mesh to describe shape. Their method does not match non-manifold surfaces and is not invariant to changes in mesh tessellation. Osada et al. in [Osada et al., 2002] present Shape Distributions for a shape similarity search engine by extending Besl's approach. In this technique, random points from the surface of a model are


extracted, and the shape functions D1, D2, D3, D4 and A3 are computed at each of these random points:

• D1: Distance between a fixed point (the centroid) and a random point.
• D2: Distance between two random points.
• D3: Square root of the area of the triangle formed by three random points.
• D4: Cube root of the volume of the tetrahedron formed by four random points.
• A3: Angle formed by three random points.

They suggest the use of the D2 shape function for computing Shape Distributions because of its robustness and efficiency, along with its invariance to rotation and translation. The D2 distances between random points are normalized by the mean distance. The shape distribution is the histogram that measures the frequency of occurrence of distances within a specified range of distance values. Once the Shape Distributions are generated, the distance between two solid shapes is computed using an LN norm. Usually the L2 norm is used for comparison, though other distances such as the Earth Mover's distance or match distances can also be used.

This technique is robust and efficient for simple objects and gross shape similarity. As the resolution of the 3D model increases, the comparison becomes more robust, but the computational time increases. Furthermore, as objects become more complex, their Shape Distributions tend to assume similar shapes, resulting in inaccurate comparisons of solid models. Shape Distributions have been tried with limited success on mechanical parts and real laser-scanned data. Ohbuchi et al.
in [Ohbuchi et al., 2003] improve the performance of Shape Distributions with 2D histograms of angle-distance and absolute angle-distance that can be computed from the D2 shape distribution. Page et al. [Page et al., 2003b] define shape information as the entropy of the curvature density and use it to describe the complexity of a 3D shape. Hetzel et al. [Hetzel et al., 2001] present an occlusion-robust algorithm for 3D object recognition that accumulates local features such as the shape index, pixel depth and surface normal characteristics in a multidimensional histogram. Histograms of two objects are matched and verified using the chi-squared hypothesis test to achieve shape recognition.

2.3.5 Topology Description

The topology of a 3D model is an important property for measuring similarity between different models. The topology of a model is typically represented in the form of a relational data structure such as a tree or a directed acyclic graph. The similarity estimation problem is thus reduced to a graph or tree comparison problem.
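As an illustration of the Shape Distribution idea described in Section 2.3.4, the D2 construction can be sketched in a few lines. The sketch below is only illustrative; the function names, bin settings and area-weighted sampling scheme are our choices and are not prescribed by [Osada et al., 2002].

```python
import numpy as np

def sample_surface(vertices, faces, n, rng):
    """Draw n random points from a triangle mesh, area-weighted."""
    v0 = vertices[faces[:, 0]]
    v1 = vertices[faces[:, 1]]
    v2 = vertices[faces[:, 2]]
    areas = 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
    t = rng.choice(len(faces), size=n, p=areas / areas.sum())
    r1, r2 = rng.random(n), rng.random(n)
    s = np.sqrt(r1)  # square-root trick gives uniform barycentric sampling
    return ((1 - s)[:, None] * v0[t]
            + (s * (1 - r2))[:, None] * v1[t]
            + (s * r2)[:, None] * v2[t])

def d2_distribution(vertices, faces, n_pairs=20000, bins=64, seed=0):
    """D2 shape distribution: histogram of distances between random
    surface point pairs, normalized by the mean distance."""
    rng = np.random.default_rng(seed)
    v = np.asarray(vertices, float)
    f = np.asarray(faces, int)
    p = sample_surface(v, f, 2 * n_pairs, rng)
    d = np.linalg.norm(p[:n_pairs] - p[n_pairs:], axis=1)
    d = d / d.mean()  # scale normalization by the mean distance
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 3.0))
    return hist / hist.sum()  # probability distribution over the bins

def d2_dissimilarity(h1, h2):
    """L2 norm between two D2 histograms, the usual comparison."""
    return float(np.linalg.norm(h1 - h2))
```

Because the distances are normalized by their mean, uniformly scaled copies of a model produce the same distribution, which matches the scale invariance claimed for the D2 shape function.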


Gotsman et al. describe the fundamentals of spherical parameterization for 3D meshes [Gotsman et al., 2003]. They argue that closed manifold genus-zero meshes are topologically equivalent to a sphere and assign a 3D position on the unit sphere to each of the mesh vertices. They use barycentric coordinates for the planar parameterization. Leibowitz et al. [Leibowitz et al., 1999] share their memory-intensive experience in implementing geometric hashing for the comparison of protein molecules represented as 3D atomic structures.

In [McWherter et al., 2001] model signature graphs have been proposed for the topological comparison of solid models. They extend the attribute adjacency graphs mentioned in [Joshi and Chang, 1998] to consider curved surfaces. Model signature graphs are constructed from the boundary representation of the solid, and this graph forms the shape signature of the solid model. Once a model signature graph is constructed, the solid models are compared using spectral graph theory [Chung, 1997]. The eigenvalues of the Laplacian matrix are used in the comparison. The eigenvalues of the Laplacian are strongly related to other graph properties such as the graph diameter, the largest number of vertices that must be traversed to travel from one vertex to another in the graph.
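The spectral comparison described above can be illustrated for plain adjacency matrices. This is a minimal sketch; the function names and the zero-padding of unequal-length spectra are illustrative choices, not details taken from [McWherter et al., 2001].

```python
import numpy as np

def laplacian_spectrum(adj):
    """Eigenvalues of the graph Laplacian L = D - A, sorted ascending."""
    adj = np.asarray(adj, float)
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.sort(np.linalg.eigvalsh(lap))

def spectral_distance(adj_a, adj_b):
    """Compare two graphs by the L2 distance between their Laplacian
    spectra, zero-padding the shorter spectrum if the sizes differ."""
    sa, sb = laplacian_spectrum(adj_a), laplacian_spectrum(adj_b)
    n = max(len(sa), len(sb))
    sa = np.pad(sa, (0, n - len(sa)))
    sb = np.pad(sb, (0, n - len(sb)))
    return float(np.linalg.norm(sa - sb))
```

Isomorphic graphs always share a Laplacian spectrum, so the distance is zero for relabeled copies of the same graph; the converse does not hold, since non-isomorphic cospectral graphs exist.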
Another technique proposed for comparing the graphs is the use of graph invariance vectors [McWherter et al., 2001]. The vectors are compared using the L2 norm to determine the similarity between the graphs and hence between the solid models. The graph invariants that form the graph invariance vectors include the node and edge counts, the minimum and maximum degrees of the nodes, the median and mode degrees of the nodes, and the diameter of the graph. The use of graph invariance vectors improves the efficiency of the method; however, it decreases the accuracy of the comparison. This technique has been applied to mechanical parts and is applicable to the product design and manufacturing domain. The paper [Cardone et al., 2003] is a comprehensive survey on shape-similarity assessment for product design applications.

Multi-resolution Reeb graphs presented in [Hilaga et al., 2001] have been used for modeling 3D shapes. The Reeb graph is derived from the triangle mesh model by defining a suitable function such as the geodesic distance. The choice of the function depends on the topological properties selected. The range of the function over the object is split into smaller bins; the number of bins is the resolution of the Reeb graph. Each connected region in a bin maps to a node of the Reeb graph, and adjacent nodes are connected by edges. The Reeb graph construction has a time complexity of O(N log N), N being the number of vertices in the mesh.
The Reeb graphs of two objects can be compared by maximizing a similarity function at corresponding nodes. This technique is not invariant to Euclidean transformations.

We have very briefly described some of the key methods for shape analysis on 2D intensity images and 3D mesh models. In the next section we present two tables that


contain a qualitative comparison of the key methods presented in Section 2.3, based on algorithm efficiency and descriptive capability.

2.4 Summary

We summarize the literature review in this section. We have presented 3D shape searching as applied in diverse fields such as computer vision, mechanical engineering, bio-informatics and bio-medical imaging. In Tables 2.1 and 2.2 we compare the description and search efficiency of a few key methods before concluding our summary.

Shape signatures are abstractions of 3D shapes and have limited discrimination capabilities. They are application specific, and hence the complexity involved in matching and computation cannot be compared on the same domain for effectiveness. Therefore, a good strategy for shape analysis is to choose a signature that is computationally efficient and produces fewer false positives, followed by another that spends computational effort to remove those false positives. With the popularity of 3D scanning and CAD models, we emphasize the necessity of a quick, information-preserving shape representation rather than a time-consuming, exact isomorphic representation.

Shape analysis has been pursued by researchers for the tasks of multi-modal data fusion (registration), object recognition, object visualization and compression. Most of the methods developed are bounded by an application-specific heuristic constraint that bridges the user's notion and the computer's notion of shape similarity.
We would liketo conclude <strong>the</strong> literature survey as our knowledge base for fur<strong>the</strong>r research anddevelopment.We discuss range data acquisition and solid modeling of mechanical parts andautomotive scenes in <strong>the</strong> next chapter with illustrative ex<strong>am</strong>ples. We emphasize that itis important to understand and interpret <strong>the</strong> data before analysis and so introduce rangeacquisition systems and <strong>the</strong> process of 3D model creation using a range sensor.


Table 2.1: Qualitative comparison of 3D shape search methods with focus on algorithm efficiency.

• Feature (Global). Computational cost: O(N), where N is the number of voxels under consideration. Comparison cost: O(F), where F is the number of features extracted. Test data: methods span synthetic mesh datasets to complex real datasets. Key methods: [Elad et al., 2001].

• Intermediate Description. Computational cost: O(V log V) in the worst case, where V is the number of vertices. Comparison cost: O(R²), where R is the resolution of the intermediate representation. Test data: range and triangle-mesh real-world datasets of scenes and objects. Key methods: [Zhang and Hebert, 1999], [Johnson and Hebert, 1999], [Dorai, 1996].

• Manufacturing and Product-based Description. Computational cost: O(P), where P is the number of primitives. Comparison cost: O(F²), where F is the number of features extracted. Test data: CAD models of mechanical components. Key methods: [Mukai et al., 2002].

• Histogram-based. Computational cost: O(SB), where S is the number of sample points and B is the number of bins. Comparison cost: O(B), where B is the number of bins. Test data: simple and low-resolution synthetic models. Key methods: [Ankerst et al., 1999], [Osada et al., 2002].

• Topological Graph Methods. Computational cost: O(N), where N is the number of voxels considered. Comparison cost: worst case O(N³), where N is the number of nodes in the graph. Test data: low-resolution synthetic datasets. Key methods: [Hilaga et al., 2001], [Leibowitz et al., 1999], [McWherter et al., 2001].


Table 2.2: Qualitative comparison of 3D shape search methods with focus on effective description.

Feature (Global):
• Moments. Scale invariance: no; local saliency: no. Advantage: computationally fast. Disadvantage: different shapes can have the same moments.
• Spherical Harmonics. Scale invariance: no; local saliency: no. Advantage: used in general shape classification. Disadvantage: low stability.

Intermediate Description:
• COSMOS. Scale invariance: yes; local saliency: yes. Advantage: curvature-based. Disadvantage: assumes ideal data.
• Spin Images. Scale invariance: no; local saliency: yes. Advantage: robust to occlusions. Disadvantage: storage of spin images and 2D image matching.
• Gaussian Images. Scale invariance: no; local saliency: no. Advantage: useful for pruning. Disadvantage: low computational efficiency.

Manufacturing and Product-based Description:
• Feature Graphs. Scale invariance: no; local saliency: no. Advantage: useful for mechanical parts. Disadvantage: shape recovery is difficult with more primitives.
• String Description. Scale invariance: no; local saliency: no. Advantage: useful for mechanical parts. Disadvantage: cannot be automated.

Histogram-based:
• Shape histograms. Scale invariance: yes; local saliency: no. Advantage: simple and easy description. Disadvantage: not very robust.
• Shape Distributions. Scale invariance: yes; local saliency: no. Advantage: good for clustering. Disadvantage: uniqueness of the distribution is not justified.

Topological Graph Methods:
• Skeletal Graph. Scale invariance: yes; local saliency: yes. Advantage: topologically correct with local saliency support. Disadvantage: important local feature extraction stage.
• Reeb Graph. Scale invariance: yes; local saliency: no. Advantage: multi-resolution analysis. Disadvantage: choice of Reeb function.
• Geometric Hashing. Scale invariance: no; local saliency: no. Advantage: exact matching. Disadvantage: high storage requirements.


3 DATA COLLECTION AND MODELING

The computer vision approach to reverse engineering and under-vehicle inspection requires digitized data. We hence require a system that can automatically (or with minimal manual intervention) capture the geometric structure of an object and store the resulting shape and topology information as a digitized model. We make use of 3D range scanners for this task. In this chapter we introduce the process of range data acquisition and solid modeling geared towards generating mesh models using a sheet-of-light laser scanning mechanism, and we share our experience with the IVP range sensor in creating 3D models of automotive parts and automotive scenes.

3.1 Range Data Acquisition

Range images are a special class of digital images. Each pixel of a range image expresses the distance between a known reference frame and a visible point in the scene. A range image therefore captures the 3D structure (though not completely) of a scene and can best be understood as a sampled surface in 3D. Range images (often referred to as depth maps, depth images, xyz maps, surface profiles and 2.5D images) are obtained using range sensors, devices that make use of optical phenomena to measure range. In general, range image acquisition systems are classified into one of the following types based on their principle of operation: triangulation (passive or active), time of flight, focusing, holography and diffraction. We discuss each of these methods very briefly in Section 3.1.1 and document the principle of operation and calibration details of our range sensor in Section 3.1.2.

3.1.1 Range Acquisition Systems

We begin our discussion with triangulation-based techniques.
Passive triangulation (stereo) is the way humans perceive depth. It involves two cameras taking pictures of the same scene from two different locations at the same instant of time. Depth cues are extracted by matching correspondences in the two images and using epipolar geometry. Passive triangulation is, however, challenged by the ill-posed correspondence problem of stereo matching. The correspondence problem is eliminated by


replacing one of the cameras with a moving light source (preferably a laser). This technique is called active triangulation: a pattern of light (energy) is projected onto the scene and detected to obtain range measurements. Time-of-flight range finders determine range by measuring the time required for a signal to travel, reflect and return. Holographic interferometry uses split-beam interference to produce an image which, when processed further, yields the range image. A moiré interference pattern is created when two gratings with regularly spaced patterns are superimposed on each other. Moiré sensors project such gratings onto surfaces and measure the phase differences of the observed interference pattern; distance then becomes a function of these phase differences. Focusing and defocusing have also been used to derive range information. These methods infer range from two or more images of the same scene acquired under varying focus settings. For example, shape-from-focus sensors vary the focus of a motorized lens continuously and measure the amount of blur for each focus value. Once the best-focused image is determined, a model linking focus values and distance is used to approximate distance. The decision model makes use of the law of thin lenses and computes range based on the focal length of the camera and the distance of the image plane from the center of the lens.
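The thin-lens computation just described can be written out directly. A minimal sketch (the function name and units are illustrative):

```python
def depth_from_focus(f, v):
    """Thin-lens law 1/f = 1/u + 1/v solved for the object distance u,
    given the focal length f and the lens-to-image-plane distance v at
    best focus (both in the same length units)."""
    if v <= f:
        raise ValueError("focused image plane must lie beyond the focal length")
    return f * v / (v - f)
```

For example, a 50 mm lens whose best-focused image plane sits at 51 mm places the object at 2550 mm.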
While triangulation methods and time-of-flight methods have been used extensively for computer vision tasks, methods based on holography, focusing and diffraction are sidelined because of their fundamental performance limitations and their inability to meet the real-time imaging requirements of speed and accuracy. We direct the reader to [Besl, 1988] and [Trucco and Verri, 1998] for further reading on range image acquisition and processing.

We concentrate on triangulation-based range sensors. The main reason behind this choice is that such sensors are based on intensity cameras, giving us a chance to exploit the concepts that we know from intensity imaging. They also give accurate 3D coordinate maps and are easy to understand and build for real-time imaging.

3.1.2 Range Sensing Using the IVP Range Scanner

The IVP RANGER system, shown in Figure 3.1, consists of two subsystems: the Smart Camera and the PC interface. Each Smart Camera contains a Smart Vision sensor, a control processor (Intel 386) and an IVP HSSI (high-speed serial interface). The Smart Camera is connected to the system PC via a COM port and an HSSI interface on a PCI board called the SC adapter. The IVP Ranger is implemented on the MAPP2200 (MAPP stands for Matrix Array Picture Processor), MAPP2500 and LAPP1530 (Linear Array Picture Processor) Smart Vision sensors from IVP. The total integration of the sensor, A/D converter and processor on the same parallel architecture allows image processing at very high speed.
The Smart Camera acquires the range profiles autonomously and outputs the profiles to the host via the HSSI interface. The host PC can then manipulate these profiles.


Figure 3.1: IVP Ranger SC-386 range acquisition system.

The IVP Ranger uses an active triangulation scheme in which the scene is illuminated from one direction and viewed from another. The illumination angle, the viewing angle, and the baseline between the illuminator and the viewer (the sensor in this case) are the triangulation parameters.

The Ranger consists of a special 512 x 512 pixel camera and a low-power stripe laser. The design of the Ranger specifically tailors the camera and the supporting electronics to integrate image processing functions onto a single parallel-architecture chip. This chip, contained within the camera housing, has a dedicated range processing function that allows high-speed acquisition of nearly one million points per second. The most common arrangement of the system is to mount the camera and the laser source relative to the proposed target area so that the camera, the laser and the target each form a corner of a triangle. The angle at the laser corner is typically a right angle, such that the laser stripe projects along one side of the triangle. The angle α at the camera corner is typically 30-60 degrees. The baseline distance between the camera and the laser, denoted by B, specifies the right triangle completely (see the dotted line in Figure 3.1).
We summarize our experience with the IVP Ranger as a sensor that outputs range values as a function of illumination, relative motion, temperature and surface reflectance, as shown in Equation 3.1:

r = F(i, j, α, β, B, T, η, χ, µ)        (3.1)


where i and j are the horizontal and vertical pixel positions respectively, β (= 90 degrees) and α are the illumination angle and the camera view angle respectively, and B is the baseline distance between the camera and the laser source. These are the important design parameters that decide the field of view of the scanning mechanism. External parameters such as the temperature (T), the environmental light (η), the surface reflectance and color of the objects (χ) and the trajectory of the relative motion (µ) also influence the quality of range scans. We have characterized the scanner to minimize the effect of such external factors. We found that a warm-up time of 40-50 minutes yields stable and reliable data, and we ignore the effects of environmental temperature. We also observed that the Ranger is sensitive to light and tends to introduce significant error when the ambient illumination is strong; most of the scanning that we do inside the lab is performed with minimal lighting. We have learned that the effect of illumination can be compensated by the use of a powerful laser (> 100 mW, wavelength 685 nm), which we propose to use for scanning under the vehicle. We have also performed a simple experiment to characterize the sensor's behavior with respect to the color and reflectance of the objects.
We imaged wooden and metal rectangular blocks of the same size and compared the range measurements. We concluded that the IVP range sensor is not influenced by surface reflectance, but black objects need special handling because of their laser-beam reflectance characteristics; we simply painted such objects a lighter color to work around this sensitivity. We have simulated the triangulation geometry of the Ranger system in MATLAB to understand the effect of the different sensor parameters that influence the scanning mechanism and the process of calibration.

In Figure 3.2 we demonstrate the principle behind range acquisition using the IVP range scanner. We show the sheet-of-light laser falling on a target object. The laser line that provides cues about the surface shape of the object is called a surface profile. By traversing the entire object, either by moving the sensor setup or the scene, a sequence of surface profiles is accumulated as a range image.

Equation 3.2 is the reduced form of the range r as a function of the geometry, the focal length (b0, the distance between the lens and the sensor, is approximated by the focal length f of the lens) and the sensor offset position s on the 512 x 512 CCD chip, assuming that we have compensated for the external sensor sensitivity parameters:

r(s) = B (b0 tanα − s) cosα / [b0/cosα − (b0 tanα − s) sinα] = B (f tanα − s) / (f + s tanα)        (3.2)
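Equation 3.2 and its differential can be evaluated numerically as follows. This is a minimal sketch with our own function names, taking the angle in radians and approximating b0 by f as in the text.

```python
import math

def range_from_offset(s, B, f, alpha):
    """Equation 3.2: range r for sensor offset s, baseline B,
    focal length f and camera angle alpha (radians)."""
    t = math.tan(alpha)
    return B * (f * t - s) / (f + s * t)

def range_resolution(s, B, f, alpha, ds):
    """Equation 3.3: range increment produced by an offset increment ds."""
    return -B * f * ds / (f * math.cos(alpha) + s * math.sin(alpha)) ** 2
```

At s = 0 the range reduces to B tanα, the intersection of the optical axis with the laser plane; the quadratic denominator in the resolution expression shows that the one-pixel range increment varies with the position on the sensor.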


Figure 3.2: Triangulation and range image acquisition.

The differential of the range equation in Equation 3.2 gives the resolution of the sensor, as shown in Equation 3.3:

    ∆r = −B·b_0·∆s / (b_0·cosα + s·sinα)²    (3.3)

The maximum range R_T that a particular sensor arrangement with baseline B and angle α can measure (Equation 3.4) is obtained by maximizing the range function (Equation 3.2) in terms of the sensor size N and the resolving capability ∆x:

    R_T = 4·B·b_0·N∆x·(1 + tan²α) / (4·b_0² − (N∆x·tanα)²) = 4·B·f·N∆x·(1 + tan²α) / (4·f² − (N∆x·tanα)²)    (3.4)

These equations are important when the tradeoff between field of view and resolution has to be made. We make use of them for the design of our scanning mechanism but not for range measurements. For range measurements we follow a much more robust calibration procedure that models the world-to-sensor coordinate transformation as a combination of a translation (of the world coordinate system to the optical coordinate system), a rotation (to align the optical axis with the real-world axes), and a projection from the world to the sensor coordinate system. Equation 3.5 is the transformation from the world to the sensor coordinate system, where w_p is the sensor coordinate system scale factor proportional to the baseline distance B; (u, v) is the position in sensor coordinates; (X, Y, Z) are the real-world position coordinates; f is the focal length of the optics; (u_0, v_0) is the position where the optical axis meets the sensor; and k_v, k_u


and θ are the skew and tilt compensation factors. The s_ij matrix takes care of the rotation about the three rectangular axes, while (x_0, y_0, z_0) compensates for the translation.

    [w_p·u]   [k_u   k_u·sinθ   u_0]   [f  0  0  0]   [s_11  s_12  s_13  0]   [X − x_0]
    [w_p·v] = [0     k_v/cosθ   v_0] · [0  f  0  0] · [s_21  s_22  s_23  0] · [Y − y_0]
    [w_p  ]   [0     0          1  ]   [0  0  1  0]   [s_31  s_32  s_33  0]   [Z − z_0]
                                                      [0     0     0     1]   [   1   ]    (3.5)

Equation 3.5 can be simplified into Equation 3.6, with 12 unknown parameters that can be determined from at least 6 points positioned in the world coordinate system and projected into the sensor coordinates.

    [w_p·u]   [a_11  a_12  a_13  a_14]   [X]
    [w_p·v] = [a_21  a_22  a_23  a_24] · [Y]
    [w_p  ]   [a_31  a_32  a_33  a_34]   [Z]
                                         [1]    (3.6)

After calibration we know the equation of every ray hitting the sensor plane. However, we still do not know from which point along the ray it started. To find out where our sheet-of-light rays start, we introduce a simple calibration step (Figure 3.3(b)) that uses the sheet of light to calibrate a single profile. By finding the sensor positions where the light sheet hits the calibration target, we can compute the world coordinates of the laser plane. Thus calibration gives us the ray for each sensor coordinate and the laser plane equation, from which we can find the world coordinates of each point. This process of calibration can be better understood with the help of Figure 3.3. Figure 3.3(a) shows the status of the CCD when it is viewing the laser line (the white line on the CCD shown in Figure 3.3(b)).
The sensor position is detected with sub-pixel accuracy (based on the intensity pattern the laser line produces on the CCD) for the range measurement. We solve for the 12 unknown parameters as a system of linear equations. Theoretically, for the system described in Equation 3.6 we need six points (two equations each) to compute the parameters. We increase the reliability and reduce possible error by using 40 points on the calibration target. With the 40 real-world coordinates known, as in Figure 3.3(c), we compute a transformation matrix that maps the sensor coordinates to the real world in 3D. We use this transformation matrix for all subsequent scans without disturbing the geometry of the scanning mechanism.
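The 12-parameter solve can be sketched as a direct linear transform. This is an illustrative reconstruction, not the thesis code: the function names and the synthetic projection matrix below are ours, and the real system uses the 40 measured target points rather than simulated ones.

```python
import numpy as np

def calibrate_dlt(world_pts, sensor_pts):
    """Estimate the 3x4 matrix A of Equation 3.6 from world <-> sensor pairs.

    Each pair contributes two linear equations in the 12 unknowns a_ij, so
    six points suffice in theory; extra points (40 on our target) are
    absorbed in a least-squares sense."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, sensor_pts):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    # Minimize ||M a|| with ||a|| = 1: the right singular vector of the
    # smallest singular value (A is only defined up to scale).
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return Vt[-1].reshape(3, 4)

def project(A, Xw):
    """Apply Equation 3.6 and divide out the scale factor w_p."""
    u, v, w_p = A @ np.append(Xw, 1.0)
    return np.array([u / w_p, v / w_p])

# Synthetic check: recover a known (made-up) matrix from 10 random points.
A_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, 20.0],
                   [0.0, 0.0, 1.0, 2.0]])
rng = np.random.default_rng(0)
world = rng.uniform(-1.0, 1.0, size=(10, 3)) + np.array([0.0, 0.0, 5.0])
sensor = np.array([project(A_true, w) for w in world])
A_est = calibrate_dlt(world, sensor)
assert all(np.allclose(project(A_est, w), s, atol=1e-6)
           for w, s in zip(world, sensor))
```

With noise-free correspondences the recovered matrix reproduces the projections exactly; with measured points the SVD solution averages the error over all 40 observations.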


Figure 3.3: The process of calibration. (a) Single profile calibration. (b) The physical calibration target designed to compute the 12 unknown parameters using 40 points. (c) The transformation from the sensor projection coordinates to the real world.


The IVP range scanner is capable of acquiring 2000 profiles per second. We control the relative motion of the sensor arrangement with a precise smart motor. The collection of profiles spanning a particular view of the object is represented as a range image. We have built a graphical user interface (GUI) for visualizing range images and their corresponding 3D triangle meshes acquired using the scanner. Figure 3.4(a) shows the acquisition and control interface provided by IVP, and Figure 3.4(b) is a snapshot of our visualization GUI in action.

Figure 3.4: Graphical user interface. (a) Snapshot of the GUI for acquisition from the IVP. (b) Snapshot of the GUI for visualization.

3.2 Solid Modeling from Range Images

Range data acquisition is a digitization process and is only the first step towards model generation. We now need to process the range information for better visualization and representation. In Section 3.2.1 we explain the processing pipeline for creating mesh models of objects for the task of reverse engineering, and we extend our implementation to the more challenging task of modeling automotive scenes in Section 3.2.2.

3.2.1 Modeling Automotive Components for Reverse Engineering

Reverse engineering is the ability to create computer-aided design (CAD) models of existing objects. It is often considered a feedback path for inspection and validation in a rapid manufacturing system. Bernardini et al. [Bernardini et al., 1999] stress the promise and impact of computer-aided reverse engineering in the process of system design, while


Thompson et al. [Thompson et al., 1999] apply reverse engineering as a process that enables the recreation of objects that are out of production.

With the emergence of high-speed, accurate laser scanners, reverse engineering is moving away from the traditional tedious but accurate coordinate measuring machines (CMMs). As discussed in the previous section, we use an active range sensor to acquire an ensemble of range images and reconstruct a CAD-like model of the object. We begin reverse engineering with the acquisition of range images using the IVP sensor, which provides both speed and accuracy. We summarize the process of model creation as a block diagram in Figure 3.5.

After data acquisition we have a set of range images representing multiple viewpoints around an object. The task now is to reconstruct the CAD model from these range images. The fundamental challenge in modeling the range images is that of reconstruction, as discussed in [Hoppe et al., 1992]. The challenge lies in aligning multiple views into a global coordinate frame (the process of registration) and in integrating and merging the aligned views into a CAD representation. As discussed earlier, multiple views of an object are necessary to overcome occlusions. As the camera moves to a new view, the resulting data is relative to the new view position. Registration is the process where we align these multiple views and their associated coordinate frames into a single global coordinate frame.
The registration problem is essentially recovering the rigid transformation from the new range data. We define a rigid transformation as

    y = Rx + t    (3.7)

Figure 3.5: Block diagram of a laser-based reverse engineering system.


where R represents the rotation matrix and t is the translation vector. The point y is the same as x but expressed in the global coordinate frame. Registration is the process of finding R and t. The registration process tries to interpret common geometric information from two calibrated range images at two different poses (views).

According to [Horn et al., 1988], given three or more pairs of non-coplanar corresponding 3D points between views, the unknown rigid transformation of rotation and translation has a closed-form solution. The registration problem can hence be approached as a point matching problem. The most popular registration algorithm is the Iterative Closest Point (ICP) algorithm [Besl and McKay, 1992]. We have used the implementation of ICP in Rapidform (a reverse modeling software package) for the task of surface registration. It allows us to initialize the ICP algorithm by manual point picking. The three pairs of corresponding points so picked are iteratively refined up to a particular threshold before the two point clouds are merged.

Having overcome the problem of occlusions by registering multiple views, we now need to integrate these views into a single surface representation. We consider the registered range data as a cloud of points and reconstruct the topology of the object from its range samples. A simple shape may require just a few views while a complicated object may require significantly more. Page et al.
[Page et al., 2003a] document this systematic procedure in the literature as a method of reconstructing mechanical components.

Figure 3.6(a) shows a part that we would like to reverse engineer. We present the results of the multiple-view range image acquisition process in Figure 3.6(b). The point cloud in Figure 3.6(c) is the result of reconstruction, which we triangulate to represent as a CAD model in Figure 3.6(d). We use polygonal meshes to represent the CAD model.

Figure 3.6: Model creation. (a) Photograph of the object. (b) Multiple-view range maps. (c) View-integrated point cloud. (d) Rendered triangle mesh model.
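The thesis relies on Rapidform's ICP implementation; purely for illustration, the iteration ICP performs can be sketched as follows. This is our own minimal version (a brute-force nearest-neighbor search plus the SVD-based closed-form alignment), not Rapidform's code, and it assumes two well-overlapping point clouds.

```python
import numpy as np

def best_rigid_transform(P, Q):
    """Closed-form least-squares R, t mapping points P onto corresponding Q
    (Horn-style solution; needs three or more non-coplanar pairs)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cp).T @ (Q - cq))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def icp(P, Q, iters=20):
    """Iterate: pair every point of P with its nearest neighbor in Q, then
    re-solve for the rigid transform of Equation 3.7 (y = Rx + t)."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = P @ R.T + t
        dists = ((moved[:, None, :] - Q[None, :, :]) ** 2).sum(axis=2)
        R, t = best_rigid_transform(P, Q[np.argmin(dists, axis=1)])
    return R, t

# Synthetic check: a 3x3x3 grid moved by a small known rotation about z and a
# small translation is recovered exactly.
grid = np.array([[x, y, z] for x in range(3) for y in range(3)
                 for z in range(3)], dtype=float)
ang = 0.1
R_true = np.array([[np.cos(ang), -np.sin(ang), 0.0],
                   [np.sin(ang),  np.cos(ang), 0.0],
                   [0.0,          0.0,         1.0]])
t_true = np.array([0.05, -0.02, 0.03])
moved = grid @ R_true.T + t_true
R_est, t_est = icp(grid, moved)
assert np.allclose(grid @ R_est.T + t_est, moved, atol=1e-9)
```

In the actual pipeline the manual point picking supplies the initial correspondences, playing the role of the identity initialization above; ICP only converges to the correct alignment when started close enough to it.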


A polygonal mesh is a piecewise-linear surface that comprises vertices, edges and faces. A vertex is a 3D point on the surface, edges are the connections between two vertices, and a face is a closed sequence of edges. In a polygonal surface mesh, an edge can be shared by at most two adjacent polygons, and a vertex is shared by at least two edges. We use triangle meshes to represent discrete approximations to 3D surfaces. A triangle mesh is a pair T = (ν, τ), where ν = {ν_1, ν_2, ν_3, ..., ν_n} is the set of vertices and τ = {τ_1, τ_2, ..., τ_m} is the set of triangles that approximate the surface.

3.2.2 Modeling Automotive Scenes for Under Vehicle Inspection

Under ideal laboratory conditions, data collection with a range scanner is straightforward. Underneath a vehicle, however, we face several challenges. The most significant of these is the design of the scanning mechanism. The field of view is limited by the ground clearance and by the huge variation in size of the components that make up the scene under the vehicle. The distance (range) is too small for the use of time-of-flight scanners and laser triangulation scanners but too large for photogrammetric measurements.

Real-time 3D data acquisition is a research challenge in computer vision. Before we consider a robot-mountable design for vehicle inspection, we briefly survey 3D data acquisition systems that optimize the digitization of real-world scenes and objects for speed and accuracy.
One such expensive effort is the digitization of statues in the "Digital Michelangelo" project, which involves close scanning of statues for cultural heritage recording. Levoy et al. [Levoy et al., 2000] suggest a configuration for high-speed laser triangulation in which a light projector records video that is later processed to fill holes and registered using the ICP algorithm into a complete 3D model. [Takatsuka et al., 1999] proposes a low-cost interactive active monocular range finder. Davis and Chen [Davis and Chen, 2001] present the design of a laser range scanner engineered for minimum calibration complexity; they specifically state that despite the simple geometry and components, laser scanners must be engineered and calibrated with extremely high precision. Champleboux et al. [Champleboux et al., 1992] examine the registration of multiple 3D data sets obtained with a laser range finder. They propose a new sensor calibration technique based on the conjunction of a mathematical camera model, and they further discuss an algorithm based on octree splines for recovering the rigid transformation (the rotational and translational rectification) between two 3D data sets obtained from the range sensor. Having considered these options, and as a tradeoff between resolution and field of view, we decided to jack the vehicle up by a meter and use the inverted triangle mechanism for scanning. We calibrated the sensor arrangement as discussed in Section 3.1.2 and, without disturbing it, inverted it and moved it on a conveyor belt to reconstruct the 3D scene.


Although a powerful laser was used to counter ambient lighting, we could not compensate for specular reflections, since the metallic surfaces under a vehicle exhibit strong specular reflection properties. A laser further complicates this problem, as internal reflections lead to significant errors in range estimation. A promising approach to this problem is an optical filter tuned to the frequency of the powerful laser. The filter allows the data collection to isolate the initial reflection of the laser and thus improves the range sensing capability. The other noise issue that we would like to discuss involves jerks in the trajectory of the scanning mechanism. In the data that we present, we have assumed a linear and smooth trajectory under the vehicle.

Another significant problem in range scanning underneath a vehicle is that of view occlusions. The obvious occlusion is that the camera can only view one side of a component (the bottom side facing straight down towards the ground). The muffler in Figure 3.7(d), for example, is a one-sided view; without dismantling the car, the range scanner cannot extract the geometry of the other side of the muffler. Such an occlusion illustrates the potential for other occlusions, such as one object partially covering another object from the range sensor. The objects underneath a vehicle have various shapes and scales located at different depths.
For example, in Figure 3.7(b) the bent pipe that connects the muffler and the catalytic converter is occluded by the muffler at the time of scanning. The solution to this problem is to use multiple scans to fill in as much as possible of the areas without information. This solution is laborious because multiple fields of view imply multiple iterations of the calibration procedure, and the different views and scanning angles are extremely restricted by the low ground clearance under a vehicle. Thus an integration and fusion of multiple scans only partially fills the occlusion holes, but it significantly enhances the data. As a result, we scan underneath a vehicle with multiple passes and at different angles. The final challenge that we consider with the data collection is the data redundancy inherent to laser range scanning. A single range image with 512 x 512 pixels yields over 250,000 data points. With additional scans to overcome occlusions and to achieve full coverage under a vehicle, this number quickly grows to several million data points. This large data set allows high-fidelity geometry that other 3D sensors do not offer, but the price is data redundancy and a potential data overload. The data that we present in Figure 3.7(d) is a 40-megabyte VRML model with 10 million vertices and 15 million triangles.

We have presented the procedure and capability of data collection using a 3D range scanner in this chapter. We now have real-world objects in a format that computers can attempt to understand. In Chapter 4, we outline our approach to shape description and discuss the building blocks of our algorithm in detail.


Figure 3.7: Data acquisition for under vehicle inspection. (a) A pre-calibrated scanning mechanism in action. (b) The mosaic of 11 range images as the output from the scanner. (c) Close-up color image of the scene. (d) Snapshot of the registered 3D model.


4 ALGORITHM OVERVIEW

In this chapter, we describe our CVM algorithm as an informational approach to shape description. We first discuss the CVM for 2D in the context of intensity and range images and then extend it, with modifications, to 3D models. We also explain in detail each of the building blocks of the algorithm.

4.1 Algorithm Description

Before we discuss the details of the algorithm, we would like to introduce some of the key papers that have influenced our work. Arman and Aggarwal present a survey of model-based object recognition strategies on dense range images in [Arman and Aggarwal, 1993]. More recently, Campbell and Flynn survey free-form object representation and recognition in [Campbell and Flynn, 2001]. We focus our algorithm development with these surveys as our knowledge base on object representation and recognition.

We are inspired by the COSMOS framework for free-form object representation [Dorai, 1996] in the development of our CVM algorithm. Dorai defines shape index and curvedness as indicators of shape and constructs a shape spectrum for object analysis. She models range images as a combination of maximally sized surface patches of constant shape index to get around segmentation issues and uses a graph representation on her range data. She assumes that there are no occlusions in her images. Her method of computing curvature, however, assumes a uniform grid structure and is not suited for mesh models. With the CVM, we therefore analyze various curvature estimation methods for triangle meshes and propose a graph representation based on curvedness segmentation together with a normalized surface variation measure based on curvature.
Our approach is analogous to the shape index that Dorai uses for segmentation of the range image and to her curvedness map on the sphere for shape analysis. We chose a surface representation because it directly corresponds to the features that will aid recognition even with view occlusions in the sensed data. To illustrate the CVM better, we introduce in Section 4.1.1 the idea of using information theory for shape complexity description on 2D contours. We discuss the algorithm with a block diagram and describe how we extend it to the description of 3D mesh models in Section 4.1.2.
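For reference, the two quantities borrowed from the COSMOS framework can be written down in a few lines. This is a generic sketch of shape index and curvedness computed from the principal curvatures; conventions for the sign and range of the shape index vary across papers, so treat the exact mapping as illustrative rather than as Dorai's definitive formula.

```python
import math

def shape_index(k1, k2):
    """Shape index from principal curvatures k1 >= k2, mapped to [0, 1].
    Symmetric saddles fall at 0.5; the two ends of the scale are the
    spherical cap and cup. Undefined for planar points (k1 = k2 = 0)."""
    return 0.5 - (1.0 / math.pi) * math.atan2(k1 + k2, k1 - k2)

def curvedness(k1, k2):
    """Curvedness: how strongly curved the surface is, independent of the
    shape type. Zero only for planar points; large at sharp edges."""
    return math.sqrt((k1 * k1 + k2 * k2) / 2.0)

assert abs(shape_index(1.0, -1.0) - 0.5) < 1e-12  # symmetric saddle
assert abs(curvedness(1.0, 1.0) - 1.0) < 1e-12    # unit sphere
```

The pair (shape index, curvedness) separates *what kind* of surface a point lies on from *how strongly* it is curved, which is why curvedness is the natural quantity for detecting the sharp patch boundaries used later in the segmentation.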


4.1.1 Informational Approach to Shape Description – Curvature Variation Measure

We formulate our algorithm on the basis that shape information is directly proportional to the variation in curvature (curvature of the boundary for 2D curves and curvature of the surface for 3D models) and inversely proportional to symmetry. We propose to extract shape information from images and to analyze a procedure that discriminates objects based on a single number measuring their visual complexity.

To understand the basis of our algorithm, let us start with a small and simple example. In Figure 4.1, we show a circle and an arbitrary contour. Visually, the more appealing of the two is the circle, while the more complex is the arbitrary contour. We propose that smoothly varying curvature conveys very little information, while sharp variation in curvature increases the complexity of the shape description. In this context, recall that the more likely an event is, the less information it conveys. The circle has constant curvature; there is no uncertainty involved in the variation of its curvature, which means it has the least shape information. Figure 4.2 is the block diagram of our CVM algorithm for 2D silhouettes and segmented boundary contours. Returning to the example of the circle: its curvature is constant along the contour.
The density of curvature is hence a Kronecker delta function of strength one, and the entropy of a Kronecker delta function is zero. This result implies that circular and linear contours convey no significant shape information. We note that circles of different radii will also have zero shape information; we argue that a change in scale (radius) adds no extra shape information. On the other hand, the broader the curvature density, the higher the entropy and the more complex the shape. The most complex shape would hence be the one with randomly varying curvature at every point on the boundary.

Figure 4.1: A circle and an arbitrary object.
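The circle argument can be checked numerically. The sketch below is ours, not the thesis implementation (it uses a simple histogram rather than the kernel density estimator introduced later): it estimates curvature on a uniformly sampled closed contour from its turning angles and computes the entropy of their distribution. The circle's turning angles all land in a single histogram bin, giving zero entropy, while a wavy contour spreads them out.

```python
import math
from collections import Counter

def turning_angles(points):
    """Exterior (turning) angle at each vertex of a closed sampled contour;
    with uniform arc-length sampling this is proportional to curvature."""
    n, angles = len(points), []
    for j in range(n):
        (x0, y0), (x1, y1), (x2, y2) = points[j - 1], points[j], points[(j + 1) % n]
        d = math.atan2(y2 - y1, x2 - x1) - math.atan2(y1 - y0, x1 - x0)
        angles.append(math.atan2(math.sin(d), math.cos(d)))  # wrap to (-pi, pi]
    return angles

def entropy_bits(samples, bins=32):
    """Shannon entropy (bits) of a histogram estimate of the sample density."""
    step = 2.0 * math.pi / bins
    counts = Counter(min(int((s + math.pi) / step), bins - 1) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

circle = [(math.cos(2 * math.pi * k / 100), math.sin(2 * math.pi * k / 100))
          for k in range(100)]
wavy = [((1 + 0.3 * math.sin(7 * t)) * math.cos(t),
         (1 + 0.3 * math.sin(7 * t)) * math.sin(t))
        for t in [2 * math.pi * k / 100 for k in range(100)]]
assert entropy_bits(turning_angles(circle)) == 0.0  # constant curvature
assert entropy_bits(turning_angles(wavy)) > 0.0     # varying curvature
```

Note that the zero-entropy result is independent of the circle's radius, matching the argument above that a change of scale adds no shape information.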


Figure 4.2: Block diagram of our CVM as the informational approach to shape description.

We use curvature because it is an information-preserving feature, it is invariant to rotation, and it possesses an intuitively pleasing correspondence to the perceptive property of simplicity. Curvature completely parameterizes the boundary contour for efficient shape description of 2D curves and boundary contours. We account for the inverse relation to symmetry through information theory: symmetry does not contribute additional shape information (entropy) but rather reduces it.

4.1.2 Curvature-Based Automotive Component Description

Our CVM measure of shape on geometric curves is the entropy of curvature along the boundary contour. Curvature along the boundary provides sufficient detail in the 2D case, but for 3D models and surfaces it is not enough. We extend our idea of shape information to 3D meshes by describing the surface variation of the smooth surface patches that make up the object and storing the list of connected patches. We assume that most objects can reasonably be described as a unique network of smooth patches; the uniqueness of our description then lies in measuring the variation in curvature across each of these patches.

We describe the algorithm pictorially with a block diagram in Figure 4.3. Our description, which could be used for purposes such as reverse engineering and inspection, takes triangle meshes as the input.
We take the example of the disc brake again. We break the triangle mesh down into surface patches based on Dorai's [Dorai, 1997] definition of curvedness. Curvedness identifies sharp edges and abrupt surface changes. We perform a simple region-growing segmentation by identifying a point and collecting the vertices whose face normal deviation is less than a particular angle. This angle is a free parameter; we have used 85 degrees as the maximum threshold angle before we meet an edge in the growing procedure. We save the connectivity information of each of these surface patches. Our segmentation is a crude


Figure 4.3: Block diagram of the curvature-based vehicle component description algorithm, including patch decomposition and CVM computation.


implementation of Guillaume's algorithm [Guillaume, 2004]. He presents a more efficient algorithm for the decomposition of arbitrary 3D triangle meshes into surface patches. The algorithm is based on curvature tensor field analysis and has two distinct complementary steps: a region-based segmentation, which decomposes the object into known and near-constant-curvature patches, and a boundary rectification based on curvature tensor directions, which corrects the boundaries by suppressing their artifacts and discontinuities.

We then analyze each surface patch individually to compute the CVM, which is the entropy of curvature. We compute the Gaussian curvature on each of the surface patches and estimate its kernel density. We optimize the bandwidth of the kernel density estimate using the plug-in method to ensure stability in the resolution-normalized entropy. This log-scale measure from the curvature density is the curvature variation measure (CVM). We then combine the surface connectivity information and the curvature variation measure into a single graph representation.

We call our CVM algorithm a curvature-based approach because both the segmentation and the description require the computation of curvature (curvedness is a function of the principal curvatures). However, the surface variation measure that we describe is not invariant to scale.
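A minimal sketch of the region-growing step described above reads as follows. It is our own crude illustration, not Guillaume's algorithm or the thesis code: the function name, the toy face-adjacency list, and the precomputed unit face normals are all assumptions made for the example.

```python
import math
from collections import deque

def grow_patches(normals, adjacency, max_angle_deg=85.0):
    """Label each face with a patch id by breadth-first region growing.
    A neighboring face joins the patch when its unit normal deviates from
    the seed face's normal by less than max_angle_deg (85 degrees in the
    text); adjacency[i] lists the faces sharing an edge with face i."""
    cos_thresh = math.cos(math.radians(max_angle_deg))
    labels, patch = [-1] * len(normals), 0
    for seed in range(len(normals)):
        if labels[seed] != -1:
            continue
        labels[seed], queue = patch, deque([seed])
        while queue:
            f = queue.popleft()
            for g in adjacency[f]:
                close = sum(a * b for a, b in zip(normals[seed], normals[g])) > cos_thresh
                if labels[g] == -1 and close:
                    labels[g] = patch
                    queue.append(g)
        patch += 1
    return labels

# A strip of six faces that folds by 90 degrees halfway along splits into
# two patches, since 90 degrees exceeds the 85-degree threshold.
normals = [(0, 0, 1)] * 3 + [(1, 0, 0)] * 3
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
assert grow_patches(normals, adjacency) == [0, 0, 0, 1, 1, 1]
```

The per-patch labels returned here are exactly what the connectivity bookkeeping needs: patches that share an edge across a fold become nodes joined by an arc in the graph representation.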
We would like to emphasize that our algorithm can also be used to describe occluded scenes, but at the cost of partial graph matching if we attempt object recognition.

4.2 Building Blocks of the CVM Algorithm

As background, we first present a brief overview of surface curvature in the important context of differential geometry. In particular, we deal with the curvature of a surface in Section 4.2.1 because we have assumed that curvature intrinsically describes the local shape of a surface. The differential geometry section helps us understand curvature estimation on triangle meshes. We present a brief survey of curvature estimation techniques in Section 4.2.2 and then discuss the theory behind the other building blocks of the algorithm in Section 4.2.3 and Section 4.2.4.

4.2.1 Differential Geometry of Curves and Surfaces

First, let us consider the continuous case for 2D curves. Following [Carmo, 1976], we define a planar curve α: I → R² parameterized by arc length s, giving α(s). We choose, without loss of generality, this parameterization such that the vector field T = α′ has unit length. With this construction, the derivative T′ = α″ measures the way the curve is turning in R², and we term T′ the curvature vector field. Since T′ is always orthogonal to T, that is, normal to the curve, we can write T′ = κN


where N is the normal vector field. The real-valued function κ, where κ(s) = ||α″(s)||, is the curvature function of α and completely describes the shape of α in R², up to a translation and rotation. This curvature function is what we would like to exploit to define the shape information of a curve. We would like to formulate the task of curvature estimation on discrete samples of such curves. For a planar curve α, we have samples α_j = α(s_j). We assume uniform sampling across the arc length of the curve, such that Δs = s_j − s_{j−1} is a constant. This approach yields N samples over the curve α. Since we have uniform sampling along the curve, κ_j is directly proportional to the turning angle θ_j formed by the line segments from α_{j−1} to α_j and from α_j to α_{j+1}.

With 2D curves, the definition and hence the computation of curvature is straightforward, while its extension to 3D surfaces requires some concepts in differential geometry.

On a smooth surface S, we can define normal curvature as a starting point. Consider Figure 4.4: the point p lies on a smooth surface S, and we specify the orientation of S at p with the unit-length normal N. We define S as a manifold embedded in R³. We can construct a plane Π_p that contains p and N such that the intersection of Π_p with S forms a contour α. As before, we can parameterize α(s) by arc length s, where α(0) = p and α′(0) = T.
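The discrete turning-angle rule above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation; the name `turning_angle_curvature` is introduced here, and the sketch assumes the curve is uniformly sampled by arc length.

```python
import math

def turning_angle_curvature(points):
    """Approximate curvature at the interior samples of a uniformly
    sampled planar curve: the signed turning angle between successive
    chords, divided by the arc-length step."""
    kappas = []
    for j in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[j - 1], points[j], points[j + 1]
        a = (x1 - x0, y1 - y0)          # incoming chord
        b = (x2 - x1, y2 - y1)          # outgoing chord
        # signed angle between the two chords
        theta = math.atan2(a[0] * b[1] - a[1] * b[0], a[0] * b[0] + a[1] * b[1])
        ds = math.hypot(*a)             # uniform arc-length step
        kappas.append(theta / ds)
    return kappas

# Samples on a circle of radius 2 should give curvature close to 1/2.
r, n = 2.0, 200
pts = [(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
       for k in range(n + 1)]
print(turning_angle_curvature(pts)[0])  # ~0.5
```

For a circle the turning angle equals the angular step, so the estimate converges to the true curvature 1/r as the sampling is refined.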
The normal curvature κ_p(T) in the direction of T is thus given by α″(0) = κ_p(T)N. This single κ_p(T) does not specify the surface curvature of S at p, since Π_p is not a unique plane. If we rotate Π_p around N, we form a new contour on S with its own normal curvature. We can see that we actually have an infinite set of these normal curvatures around p, one in every direction. Fortunately, herein enters the elegance of surface curvature: for this infinite set, we can construct an orthonormal basis {T_1, T_2} that completely describes the set. The natural choice for this basis is the pair of tangent vectors associated with the maximum and minimum normal curvatures at p, since the directions of these curvatures are always orthogonal.

Figure 4.4: Illustration to understand the curvature of a surface.


These maximum and minimum directions {T_1, T_2} are the principal directions. The added benefit of choosing the principal directions as the basis set is that the curvatures κ_1 = κ_p(T_1) and κ_2 = κ_p(T_2) associated with these directions lead to the following relationship for any normal curvature at p:

κ_p(T_θ) = κ_1 cos²(θ) + κ_2 sin²(θ),    (4.1)

where T_θ = cos(θ)T_1 + sin(θ)T_2 and −π ≤ θ ≤ π is the angle to the vector T_1 in the tangent plane. The maximum and minimum curvatures are known as the principal curvatures. The principal directions along with the principal curvatures completely specify the surface curvature of S at p and thus describe the shape of S. Combinations of the principal curvatures lead to other common definitions of surface curvature. The most commonly used is the Gaussian curvature, the product of the principal curvatures as shown in Equation 4.2:

K_p = κ_1 κ_2    (4.2)

This definition highlights that negative Gaussian curvature occurs at hyperbolic points, where exactly one principal curvature is negative. The second definition of curvature is the mean curvature, which we specify as the average of the two principal curvatures (Equation 4.3). Mean curvature gives insight into the degree of flatness of the surface.

H_p = (κ_1 + κ_2) / 2    (4.3)

4.2.2 Curvature Estimation

Curvature estimation is a challenging problem on digitized representations of curves and surfaces.
Consider a 2D function y = f(x). The curvature of the continuous function y is mathematically defined as shown in Equation 4.4:

κ = (d²y/dx²) / (1 + (dy/dx)²)^(3/2)    (4.4)

Equation 4.4 assumes the rectangular coordinate system. If we parameterize y = f(x) in the polar coordinate system, the curvature equation can be rewritten as in Equation 4.5:

κ = (r² + 2r_θ² − r·r_θθ) / (r² + r_θ²)^(3/2),  where r_θ = ∂r/∂θ.    (4.5)
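Equation 4.4 can be checked numerically with central differences; `curvature_fd` is a helper name introduced here for illustration, and the sketch assumes f is smooth near x.

```python
import math

def curvature_fd(f, x, h=1e-4):
    """Curvature of y = f(x) via the formula of Equation 4.4, with the
    derivatives approximated by central differences of step h."""
    dy = (f(x + h) - f(x - h)) / (2 * h)            # first derivative
    d2y = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)  # second derivative
    return abs(d2y) / (1 + dy * dy) ** 1.5

# Parabola y = x^2 at the origin: dy = 0, d2y = 2, so curvature = 2.
print(curvature_fd(lambda x: x * x, 0.0))  # ~2.0
```

A unit circle arc, y = sqrt(1 − x²), gives curvature close to 1 at any interior x, which is a convenient second check.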


These equations for continuous functions can be extended to contours of images without much error by using the difference operator. We identified two key methods of computing curvature for 2D contours, from [Oddo, 1992] and [Abidi, 1995]. Oddo follows the strict definition of curvature in the continuous case and extends it to digitized curves. He argues that the turning angle at every pixel on the boundary, formed with two other points at a fixed pixel distance, is proportional to the curvature at that pixel. Abidi [Abidi, 1995] uses a method based on polar coordinates to estimate curvature. For our implementation, we have used a second-order difference operator on the boundary contour to approximate curvature, which is proportional to the second derivative of the contour function.

Curvature estimation on surfaces is a more challenging research area. After a detailed survey of the literature, we would like to emphasize that most of the research on curvature estimation is in the context of range images, with very little of it suited to the general problem on surface meshes. Flynn and Jain [Flynn and Jain, 1989] and Suk and Bhandarkar [Suk and Bhandarkar, 1992] offer surveys of curvature-from-range methods. These methods give insight into the fundamental problems that we might encounter with surface meshes. One of the major assumptions made with range images that prevents us from extending these methods to triangle meshes is that of a regular grid structure and consistent topology, which is not always the case with polygonal meshes.
In <strong>the</strong> next few paragraphs, we present a brief survey on differentcurvature measures on triangle meshes. Triangle meshes are <strong>the</strong> most common outputof 3D scanners and is assumed as a piecewise approximation to a surface.Surface fitting methods try to apply concepts of differential geometry on surfaceapproximations. An analytic surface is fit to <strong>the</strong> region of interest and curvature iscomputed from that functional approximation. Surface fitting methods do not differmuch from <strong>the</strong> curvature-from-range methods because of <strong>the</strong> planar topology of fittedsurfaces and range images. Surface fitting aside, researchers have tried to estimatecurvature using curve fitting methods as well. A f<strong>am</strong>ily of curves is fit around a pointon <strong>the</strong> surface and <strong>the</strong> ensemble is used to compute principal curvatures. Besl and Jain[Besl and Jain, 1986] construct a local par<strong>am</strong>eterization of <strong>the</strong> surface and estimatecurvature <strong>by</strong> fitting orthogonal polynomials followed <strong>by</strong> a series of convolutionoperations. Stokely and Wu [Stokely and Wu, 1992] present five practical solutions,<strong>the</strong> characterized Sander-Zucker approach, two novel methods based on direct surfacemapping, a piecewise linear manifold technique, and a turtle geometry method. One of<strong>the</strong> new methods, called <strong>the</strong> cross patch (CP) method, is shown to be very fast, robustin <strong>the</strong> presence of noise, and is based on a proper surface par<strong>am</strong>eterization, provided<strong>the</strong> perturbations of <strong>the</strong> surface over <strong>the</strong> patch neighborhood are isotropicallydistributed. Kresk et al. [Kresk et al., 1998] summarize <strong>the</strong>ir experience with circlefitting, paraboloid fitting and <strong>the</strong> Dupin cyclide method. 
These three methods do not assume that the sample points lie on a regular grid. They accept the speed and accuracy of the circle fitting method but doubt its robustness on dense polygonal


meshes. With paraboloid fitting, which is slower than the circle fitting method, they point out a systematic error introduced by the procedure when estimating the curvature of smooth and uniformly varying surfaces such as spheres and cylinders. The Dupin cyclide method turns out to be slower and less accurate than paraboloid fitting.

Another approach to curvature estimation is to use the geometry and topology of the surface approximation itself. These methods compute total curvature as a global feature at each of the vertices of the triangle mesh, even though theoretically each sample point on the mesh is a singularity. Lin and Perry [Lin and Perry, 1982] use the angle excess around a vertex and extend the Gauss-Bonnet theorem of differential geometry to define a total curvature measure, which they relate to the Gaussian curvature of the surface. Desbrun et al. [Desbrun et al., 1999] derive an estimate of mean curvature on a triangle mesh based on the loss-of-angle approach. Delingette [Delingette, 1999] lays out a framework called simplex meshes, a dual of triangle meshes for surface representation, and formulates curvature measures on the surface very similar to the angle excess method on the triangle mesh. Gourley [Gourley, 1998] approximates a curvature metric based on the dispersion of face normals around a vertex, while Mangan and Whitaker [Mangan and Whitaker, 1999] refine this further into a curvature measure defined as the norm of a covariance matrix of the face normals.
Chen and Schmitt [Chen and Schmitt, 1992] formulate a quadratic representation of curvature at each vertex to derive principal curvatures by least-squares minimization. Taubin [Taubin, 1995] enhances Chen's approach into an elegant algorithm that defines a symmetric matrix whose eigenvectors are the principal directions and whose eigenvalues are related to the principal curvatures by a homogeneous linear transformation. Also based on eigenanalysis, [Page, 2001] proposes the idea of normal vector voting, which selects a geodesic neighborhood around each vertex. The triangles in this neighborhood vote to estimate the curvature at the specified vertex. He collects these votes in a covariance matrix and uses eigenanalysis of the matrix to estimate curvature. The relative size of the neighborhood controls the trade-off between algorithm robustness and accuracy.

To summarize, the surface fitting methods require the most computational effort, since they typically employ optimization in the fitting process; they are robust to noise but cannot deal with discontinuities. Curve fitting methods on triangle meshes are very simple yet extremely sensitive to noise. Of the methods discussed in the previous paragraphs, we have decided to analyze which would best help us characterize surfaces and their complexity. We chose to compare Gaussian curvature estimates using the paraboloid fitting method, Taubin's method, angle deficit as curvature, and the Gauss-Bonnet extension to curvature estimation.
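The angle-deficit (Gauss-Bonnet style) estimate can be sketched as follows. This is a minimal illustration with a name of our own, `angle_deficit_curvature`; it assumes the one-ring neighbors of the vertex are given in order and form a closed fan around it.

```python
import math

def angle_deficit_curvature(vertex, ring):
    """Gaussian curvature at a mesh vertex from the angle deficit:
    K ~ (2*pi - sum of incident triangle angles at the vertex)
    divided by one third of the incident triangle area."""
    def sub(a, b): return tuple(ai - bi for ai, bi in zip(a, b))
    def dot(a, b): return sum(ai * bi for ai, bi in zip(a, b))
    def norm(a): return math.sqrt(dot(a, a))
    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    angle_sum, area = 0.0, 0.0
    for i in range(len(ring)):
        u = sub(ring[i], vertex)
        v = sub(ring[(i + 1) % len(ring)], vertex)   # closed fan: wrap around
        angle_sum += math.acos(dot(u, v) / (norm(u) * norm(v)))
        area += 0.5 * norm(cross(u, v))
    return (2 * math.pi - angle_sum) / (area / 3.0)

# Apex of a pyramid over four base vertices: positive Gaussian curvature.
apex = (0.0, 0.0, 1.0)
base = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
print(angle_deficit_curvature(apex, base) > 0)  # True
```

For a vertex in a flat plane the incident angles sum to exactly 2π, so the estimate correctly returns zero.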
Our comparison differs from [Surazhsky, 2003] and [Meyer, 2000] because we focus not only on the absolute error in the estimation of curvature, but also on the effect of resolution on the same surface and on how well each one


of these methods can be exploited as a surface shape complexity descriptor. We present the implementation issues and the results of this analysis in the next chapter and justify the use of the Gauss-Bonnet method of computing curvature.

4.2.3 Density Estimation

Probability density functions are ubiquitous when it comes to intelligent decision making and modeling. In this section of the document, we survey some of the key density estimation techniques. Research in this field dates back to the early 1950s: density estimation was proposed by Fix and Hodges in 1951 [Fix and Hodges, 1951] as a breakthrough that freed discriminant analysis from rigid distributional assumptions. Since then it has undergone an application-oriented metamorphosis. Rosenblatt introduced the concept of non-parametric density estimation as an advanced statistical method [Rosenblatt, 1956]. Parzen followed that up with remarks on a model that aims at non-parametric estimation of a density function [Parzen, 1962].

Density estimation is generally approached in two different ways. One of them is the parametric approach, which assumes that the data has been drawn from one of the established parametric families of distributions, such as the Gaussian or Rayleigh, with a particular mean, variance and other well-defined statistical parameters.
The density f underlying the data can then be estimated by finding estimates of the mean and the variance from the data and substituting these values into the formula of the assumed density. The parametric approach to density estimation is bound by a rigid assumption about the shape of the density function, independent of the observed data. The non-parametric approach, however, is less rigid in its assumptions: the data speak for themselves in determining the estimate of f. Silverman [Silverman, 1986] traces the evolution of density estimation techniques for a uni-variate dataset represented as a sample of n observations X = {x_1, x_2, x_3, …, x_n}. We briefly survey such techniques in the next few paragraphs.

The oldest and probably the most widely used non-parametric density estimate is the histogram. A histogram is constructed by dividing the real line into equally sized intervals, often called bins. The histogram is then a step function whose height is the proportion of the sample contained in a bin divided by the width of that bin.
If h denotes the width of the bins (bin width) and n represents the number of samples in the dataset, then the histogram estimate at a point x is given by

f̂(x) = (number of X_i in the same bin as x) / (n h)    (4.6)

The construction of the histogram depends on the origin and the bin width, with the choice of bin width primarily controlling the inherent smoothing of the density estimate.
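Equation 4.6 translates directly into code. This is a sketch; `histogram_density` is a name introduced here, and the bins are taken as half-open intervals starting at a chosen origin.

```python
def histogram_density(data, origin, h):
    """Histogram density estimate (Equation 4.6): the count of samples
    sharing the bin of x, divided by n*h."""
    n = len(data)
    def f_hat(x):
        b = int((x - origin) // h)  # index of the bin containing x
        count = sum(1 for xi in data if int((xi - origin) // h) == b)
        return count / (n * h)
    return f_hat

data = [0.1, 0.2, 0.25, 0.8]
f = histogram_density(data, origin=0.0, h=0.5)
print(f(0.3))  # 3 of 4 samples share the bin [0, 0.5): 3 / (4 * 0.5) = 1.5
```

Shifting the origin or changing h changes the estimate, which is exactly the sensitivity the text describes.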


Histograms are good representation tools but not efficient density estimates. We discuss the effect of bin width on the histogram with a simple example in Figure 4.5. It is important to note the significant change in the shape and the density of the estimate.

Another method used to estimate the density, an improvement on the histogram, is the naïve estimator. It is based on the fact that if the random variable X has density f, then

f(x) = lim_{h→0} (1/(2h)) P(x − h < X < x + h).    (4.7)

Thus a natural estimator f̂ of the density can be obtained by choosing a small number h, as shown in Equation 4.8:

f̂(x) = [number of x_1, x_2, …, x_n falling in (x − h, x + h)] / (2 h n)    (4.8)

The naïve estimator can also be expressed mathematically as

f̂(x) = (1/n) Σ_{i=1..n} (1/h) w((x − X_i)/h)    (4.9)

where w(x) is a rectangle function of height 0.5 and width 2.

It is easy to generalize the naïve estimator, and overcome the rugged nature of its density estimate, by replacing the weight function w with a kernel function K that satisfies the condition described in Equation 4.10.

Figure 4.5: Illustration of the effect of bin width on density estimation using a histogram (the same data binned with 4 bins and with 7 bins).


∫_{−∞}^{∞} K(x) dx = 1    (4.10)

Analogous to the definition of the naïve estimator, the kernel estimator with kernel K is defined by

f̂(x) = (1/(nh)) Σ_{i=1..n} K((x − X_i)/h)    (4.11)

While the naïve estimator can be considered a sum of boxes centered at the observations, the kernel estimator is a sum of bumps placed at the observations. The kernel function K determines the shape of the bumps, while the window width h determines their width. The kernel estimator suffers inaccuracy with long-tailed distributions because of its fixed bandwidth throughout the process of density estimation.

The nearest neighbor class of estimators represents an attempt to adapt the amount of smoothing to the 'local' density of the data. The degree of smoothing is controlled by an integer k, chosen to be considerably smaller than the sample size; typically k ≈ n^(1/2). Define the distance d(x, y) between two points on the line to be |x − y| in the usual way, and for each t define d_1(t) ≤ d_2(t) ≤ …
≤ d_n(t) to be the distances, arranged in ascending order, from t to the points of the sample. The k-th nearest neighbor density estimate is then defined by

f̂(t) = k / (2 n d_k(t))    (4.12)

While the naïve estimator is based on the number of observations falling in a box of fixed width centered at the point of interest, the nearest neighbor estimate is inversely proportional to the size of the box needed to contain a given number of observations. In the tails of the distribution, the distance d_k(t) will be larger than in the main part of the distribution, so the problem of undersmoothing in the tails is reduced. Like the naïve estimator, to which it is related, the nearest neighbor estimate as defined is not a smooth curve: the function d_k(t) can easily be seen to be continuous, but its derivative has discontinuities. We would like to achieve stability in the information measure, and since most of the surfaces that we are interested in have a smooth analytical parameterization, we are inclined to choose the continuous and smooth kernel density estimate. We show how each of these methods estimates the density of the same dataset in Figure 4.6, which we have reproduced from [Silverman, 1986].
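The kernel estimator of Equation 4.11 with a Gaussian kernel can be sketched as follows (the helper name `kde` is ours):

```python
import math

def kde(data, h):
    """Kernel density estimate with a Gaussian kernel (Equation 4.11)."""
    n = len(data)
    c = 1.0 / math.sqrt(2 * math.pi)
    def K(u):
        return c * math.exp(-0.5 * u * u)
    return lambda x: sum(K((x - xi) / h) for xi in data) / (n * h)

f = kde([0.0, 1.0, 2.0], h=0.5)
# A density estimate should integrate to one; check numerically on a grid.
grid = [i * 0.01 - 5.0 for i in range(1201)]
print(round(sum(f(x) for x in grid) * 0.01, 2))  # ~1.0
```

Because every sample contributes a smooth bump, the resulting estimate is differentiable, which is the property the thesis relies on later for a stable entropy.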


Figure 4.6: Four methods (the histogram, the naïve estimator, the kernel density estimator and the nearest neighbor method) used to estimate the density of the same dataset. Adapted from [Silverman, 1986].
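For comparison, the nearest neighbor estimate of Equation 4.12 in the same style (a sketch; `knn_density` is a name introduced here):

```python
def knn_density(data, k):
    """k-th nearest neighbor density estimate (Equation 4.12)."""
    n = len(data)
    def f_hat(t):
        d_k = sorted(abs(t - xi) for xi in data)[k - 1]  # k-th distance
        return k / (2.0 * n * d_k)
    return f_hat

f = knn_density([0.0, 1.0, 2.0, 3.0], k=2)
print(f(1.5))  # d_2(1.5) = 0.5, so 2 / (2 * 4 * 0.5) = 0.5
```

Note the adaptive behavior: far from the data the k-th distance grows and the estimate shrinks, but the estimate is not smooth, which is why the thesis prefers the kernel estimator.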


There is a plethora of research on automating the process of bandwidth estimation to obtain the best possible estimate of the density. Recalling Equation 4.11, the parameters that influence the density estimate are the kernel function, the density span and the kernel bandwidth. We ignore the effects of the kernel function and the density span under the assumption that we have represented the digitized surface with enough points to approximate a continuous surface; we hence assume that our dataset is too large to react to the effect of the different kernel functions listed in Table 4.1. We have decided to use the Gaussian kernel in our implementation for its continuity, though the Epanechnikov kernel is considered the most efficient of the kernel functions.

We have performed a simple experiment on a normally distributed pseudo-random dataset with zero mean and unit variance at 512 points to study the effect of the bandwidth parameter on density estimation. In Figure 4.7, the red curves represent the ground-truth Gaussian density function and the blue ones represent the estimated density. Figures 4.7(a-f) portray the amount of smoothing that the bandwidth parameter imposes on the estimated density. Figure 4.7 thus illustrates the importance of the bandwidth parameter in density estimation with a simple example.
Though <strong>the</strong> figuresrepresent <strong>the</strong> density of <strong>the</strong> s<strong>am</strong>e dataset, we are able to anticipate <strong>the</strong> instability of itsinformation measure.The paper [Turlach, 1996] is an excellent survey on bandwidth selection in kerneldensity estimation. The books [Silverman, 1986] and [Wand, 1995] are classics in <strong>the</strong>field of kernel estimation and kernel smoothing and have detailed descriptions of kernelTable 4.1: Kernel functions.KernelK(u)Uniform I (| u | ≤ 1 )Triangle ( 1−| u |) I (| u | ≤ 1)Epanechnikov ( 1 − u )I(| u | ≤ 1)4123 215 21635 2( 1−u )3 I(| u | ≤ 1322Triweight ( 1 − u ) I(| u | ≤ 1 )Quartic )− u2Gaussian e ( )12 ππ π uCosinus cos( )I (| u | ≤ 1 )422


Figure 4.7: Effect of the bandwidth parameter on the kernel density estimate. (a) KDE for h = 0.01. (b) KDE for h = 0.1. (c) KDE for h = 0.3. (d) KDE for h = 0.5. (e) KDE for h = 0.328 (optimal). (f) KDE for h = 1.


smoothing as applied to uni-variate and multi-variate datasets. Papers that discuss information-bound bandwidth selection methods are [Wu and Lin, 1996] and [Jones et al., 1996].

Bandwidth selection is important for asserting the accuracy of the density estimate. The choice of bandwidth can, at least theoretically, be derived to minimize the mean integrated square error between the actual density and the computed density. Some methods used for this purpose are distribution scale methods, cross validation methods, plug-in methods, and bootstrap methods.

In the next few paragraphs, we very briefly discuss the rationale behind these objective methods for bandwidth selection. Assume that f is the actual density of the data and f̂ is the estimated density. The process of bandwidth selection aims at minimizing the integrated mean square error between the actual and the estimated density. The mean square error of the estimate at a point x is given by Equation 4.13:

MSE{f̂(x; h)} = n⁻¹ {(K_h² * f)(x) − (K_h * f)²(x)} + {(K_h * f)(x) − f(x)}²    (4.13)
The mean integrated square error (MISE) is the integral of the mean square error and can be simplified as shown below:

MISE{f̂(·; h)} = ∫ MSE{f̂(x; h)} dx = E ∫ {f̂(x; h) − f(x)}² dx
  = n⁻¹ ∫ {(K_h² * f)(x) − (K_h * f)²(x)} dx + ∫ {(K_h * f)(x) − f(x)}² dx    (4.14)

The MISE is the sum of the integrated variance and the integrated square bias, and hence minimizing this error is effectively a trade-off between bias and variance. The closed-form solution for the optimal bandwidth, derived by minimizing the MISE, is the h_opt of Equation 4.15:

h_opt = [ ∫K(z)² dz / ( n (∫z²K(z) dz)² ∫f″(x)² dx ) ]^(1/5)    (4.15)


The problem with using this closed-form solution is the dependence of the optimal bandwidth on the second derivative of the density function f that we are trying to compute. By using the Gaussian kernel in our implementation, we have ensured the differentiability of the estimated density, which also justifies not choosing the naïve estimator or its rugged counterparts.

Two popular, quick and simple bandwidth selectors are based on the normal scale rule and the maximum smoothing principle. For example, an easy approach is to use a standard family of distributions to assign a value to the second-derivative term. In Equation 4.16 we assume a normal density with standard deviation σ and compute the second-derivative term. This method can lead to gross errors when the data is not distributed the way it was assumed.

∫ f″(x)² dx = σ⁻⁵ ∫ φ″(x)² dx ≈ 0.212 σ⁻⁵    (4.16)

The rationale behind cross validation is to split the same dataset into a construction set and a training set. A model is fit assuming the correctness of the training dataset and is tested for accuracy on the construction dataset. The error in the estimate is minimized by defining a cost function of the error. Based on the construction of the cost function, the methods are named least squares cross validation, biased cross validation and likelihood cross validation.
More advanced bandwidth selectors are the plug-in and bootstrap methods, which "plug in" estimates of the unknown quantities that appear in the formulae for the asymptotically optimal bandwidth. Bootstrap methods make use of a pilot bandwidth to initialize the density estimation process and improve the pilot bandwidth based on the data. In Equation 4.17, we show the plug-in method of bandwidth selection; plug-in methods involve the estimation of integrated squared density derivatives, called functionals.

ĥ = [ 243 R(K) / ( 35 μ₂(K)² n ) ]^(1/5) σ̂,  where R(K) = ∫K(t)² dt and μ₂(K) = ∫t²K(t) dt,    (4.17)

and σ̂ = med_j |X_j − med_i(X_i)| is the median absolute deviation.    (4.18)

We discuss implementation issues in the next chapter. Our next building block is the information measure on the accurate density of curvature estimated using the bandwidth-optimized kernel density estimator.
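Equations 4.17 and 4.18 can be sketched as follows. This is a simplified normal-reference rule rather than a full plug-in selector (a true plug-in method would estimate the density functionals from the data); `rule_of_thumb_bandwidth` is our name, and the defaults are the Gaussian kernel's constants R(K) = 1/(2√π) and μ₂(K) = 1.

```python
import math
import statistics

def rule_of_thumb_bandwidth(data, RK=1.0 / (2.0 * math.sqrt(math.pi)), mu2=1.0):
    """Bandwidth of Equation 4.17 with the scale sigma-hat of Equation 4.18
    (the raw median absolute deviation of the samples)."""
    med = statistics.median(data)
    sigma = statistics.median([abs(x - med) for x in data])  # Equation 4.18
    n = len(data)
    return (243.0 * RK / (35.0 * mu2 ** 2 * n)) ** 0.2 * sigma

print(rule_of_thumb_bandwidth([0.0, 1.0, 2.0, 3.0, 4.0]))  # ~0.83
```

Using a robust scale such as the MAD keeps the selected bandwidth from being inflated by outliers in the curvature samples.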


4.2.4 Information Measure

Information theory is a relatively young branch of mathematics that began only in the 1940s. The term "information theory" still does not possess a unique definition, but it broadly deals with the study of systems that involve information processing, information storage, information retrieval and decision making.

The first studies in this direction were undertaken by Nyquist in 1924 and by Hartley in 1928 (Equation 4.19), who recognized the logarithmic nature of the measure of information. In 1948, Shannon published his seminal paper on the properties of information sources and of the communication channels used to transmit the outputs of these sources, with the important definition of entropy as the measure of information (Equation 4.20).

H_Hartley(p_1, p_2, …, p_n) = log |{i : p_i > 0, 1 ≤ i ≤ n}|    (4.19)

H_Shannon = − Σ_{i=1..n} p_i log p_i    (4.20)

In the past fifty years, the literature on information theory has grown quite voluminous, and apart from communication theory it has found deep applications in many social, physical and biological sciences: economics, statistics, accounting, language, psychology, ecology, pattern recognition, computer science and fuzzy sets.

A key feature of Shannon's information theory is that the term "information" can often be given a mathematical meaning as a numerically measurable quantity on the basis of a probabilistic model. This important measure has a very concrete operational interpretation for communication engineers.
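Equations 4.19 and 4.20 in code (using the natural logarithm, so the entropy is in nats; the function names are ours):

```python
import math

def shannon_entropy(p):
    """Shannon entropy (Equation 4.20), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def hartley_entropy(p):
    """Hartley entropy (Equation 4.19): log of the support size."""
    return math.log(sum(1 for pi in p if pi > 0))

print(shannon_entropy([0.25] * 4))  # log(4) ~ 1.386
```

The two measures agree on a uniform distribution, and Shannon entropy drops below the Hartley value whenever the distribution is skewed.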
We summarize the various definitions of entropy in the literature in Table 4.2. The list presented in Table 4.2 is not exhaustive, though it spans a few important definitions involving parameters and weights. Pap [Pap, 2002] briefly reviews the history of information theory and discusses various measures of information, while Reza [Reza, 1994] approaches information theory from the coding aspect of communication theory. We would like to emphasize the difference between using a discrete random variable and a continuous random variable. The analogous definition of Shannon's entropy in the continuous case is called the differential entropy (Equation 4.21).

H_differential = − ∫_{−∞}^{∞} p(x) log( p(x) ) dx   (4.21)


Table 4.2: List of entropy-type measures, tabulating for each measure the generating functions h(x), φ₁(x), φ₂(x) and the weights vᵢ, covering Shannon-type, parametric (e.g., order-r and order-(r, s)) and weighted entropy families.


With the help of Figure 4.8 we explain an issue with Shannon-type entropy measures. As the resolution of the data increases, the number of points in the density also increases and Δ tends toward zero. Using Riemann's definition of the integral, we can rewrite Equation 4.20 as

− ∑ᵢ Δ f(xᵢ) log( Δ f(xᵢ) ) = − ∑ᵢ Δ f(xᵢ) log( f(xᵢ) ) − ∑ᵢ Δ f(xᵢ) log( Δ )   (4.22)

− ∫_{−∞}^{∞} f(x) log f(x) dx = lim_{Δ→0} ( H_Shannon + log Δ )   (4.23)

We see that as the sampled variable approaches a continuous random variable, there is a quantum jump in the amount of information measured. We need a measure that is normalized and stable as resolution improves. The measures presented in Table 4.2 have an upper limit that is directly proportional to the number of characters in a symbol. Since we need the shape information quantified independently of resolution, we studied different divergence measures, such as the KL divergence (Equation 4.24), the Jensen-Shannon divergence (Equation 4.25) and the Chi-squared divergence, before extending Shannon's definition for our CVM.

H_KL = ∫ p(x) log( p(x) / q(x) ) dx   (4.24)

H_JS = H_Shannon( (p + q)/2 ) − ( H_Shannon(p) + H_Shannon(q) ) / 2   (4.25)

where p is the density of the object of interest and q is the density of the reference.

We have now discussed the supporting theory for the proposed CVM algorithm. In the next chapter we discuss implementation decisions for the algorithm and present the experimental results of our algorithm on different datasets.

Figure 4.8: Resolution issue with Shannon-type measures.
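The divergence measures discussed above can be illustrated numerically (our sketch, with discrete toy densities and base-2 logarithms, neither of which the thesis prescribes):

```python
import math

def shannon(p):
    """Discrete Shannon entropy (Eq. 4.20), skipping zero-probability points."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Discrete form of the KL divergence (Eq. 4.24): sum p(x) log(p(x)/q(x))."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence (Eq. 4.25): H((p+q)/2) - (H(p)+H(q))/2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return shannon(m) - (shannon(p) + shannon(q)) / 2

p = [0.5, 0.3, 0.2]  # toy density of the object of interest
q = [0.4, 0.4, 0.2]  # toy density of the reference
# KL is asymmetric and zero only when p = q; JS is symmetric and bounded.
```

The symmetry of the JS divergence is one reason it is attractive when neither density is a privileged "truth", whereas KL naturally fits the object-versus-reference setting used later in this thesis.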


5 ANALYSIS AND RESULTS

We begin this chapter with important implementation decisions on each of the building blocks of the proposed CVM algorithm. We discuss our algorithm and justify our choice of methods before presenting analysis results on intensity images, range images and 3D mesh models.

5.1 Implementation Decisions on the Building Blocks

We have acquired the data and are ready for shape analysis with the CVM. We use triangle mesh datasets as our input. Since ours is a curvature-based algorithm, our first task is to compute curvature at the vertices of the mesh. In Section 5.1.1 we discuss various curvature estimation methods, with analysis results on the effectiveness of these measures for surface description. We use curvedness, a function of the principal curvatures, to perform segmentation. We then perform region growing to identify the regions and create a mapping from each vertex to the region to which it belongs. We use the curvature at the vertices of a particular region to compute the CVM. In short, the CVM algorithm is a three-pass algorithm: the first pass estimates curvature and curvedness, the second maps vertices to smooth patches (segmentation), and the third computes the surface variation measure, which we represent in a region adjacency graph.

5.1.1 Analysis of Curvature Estimation Methods

We recall the mathematical definition of a triangle mesh as a set of vertices and a list of triangles connecting these vertices.
We define a few more specific terms before we discuss the implementation of the curvature estimation methods. A vertex vᵢ is considered an immediate neighbor of a vertex v if the edge v vᵢ belongs to the mesh. We denote the set of neighboring vertices by [vᵢ]_{i=0}^{n−1} and the set of triangles containing the vertex v by [Tᵢ]_{i=0}^{n−1}, where

Tᵢ = Triangle( v, vᵢ, v_{(i+1) mod n} ),  0 ≤ i ≤ n − 1.   (5.1)


We define N_v as the normal of the surface S at a vertex v. We compute the normal at a vertex from the normals of the triangles that contain the vertex. The normal of a triangle is the normal of the plane through its three points and is given by Equation 5.2. We compute the vertex normal as the average of these normals, weighted by the areas of the triangles involved.

N_{Tᵢ} = ( (vᵢ − v) × (v_{(i+1) mod n} − v) ) / || (vᵢ − v) × (v_{(i+1) mod n} − v) ||   (5.2)

N_v = (1/n) ∑_{i=0}^{n−1} N_{Tᵢ} ;  N_v ← N_v / || N_v ||   (5.3)

We show a small section of a triangle mesh in Figure 5.1 to make these definitions concrete. The blue point in the middle is the vertex at which we would like to compute the curvature. The points in red are its neighbors, and the lines connecting the vertex v and its neighbors form the triangles that determine the surface. N_v is the normal at the vertex, as defined in Equation 5.3.

In the paraboloid fitting method [Kresk, 1998], the vertex under consideration is translated to the origin and its neighbors are rotated so that the vertex normal coincides with the z axis. An osculating paraboloid of the form z = ax² + bxy + cy² is assumed to contain these transformed points. The coefficients a, b, c are found by a least-squares fit to v and the neighboring vertices [vᵢ]_{i=0}^{n−1}.
The total (Gaussian) and mean curvatures are then computed using the formulas in Equation 5.4.

κ = 4ac − b² ;  H = a + c   (5.4)

Figure 5.1: Neighborhood of a vertex in a triangle mesh.
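The normal of Equation 5.2 and the least-squares paraboloid fit of Equation 5.4 can be sketched as follows. This is our illustration, not the thesis's C++ implementation, and it assumes the neighbors have already been translated to the origin and rotated so that the vertex normal is the z axis (that transformation step is omitted here):

```python
import numpy as np

def triangle_normal(v, vi, vj):
    """Eq. 5.2: unit normal of the plane through three points."""
    n = np.cross(vi - v, vj - v)
    return n / np.linalg.norm(n)

def paraboloid_curvature(neighbors):
    """Eq. 5.4: fit z = a x^2 + b x y + c y^2 by least squares to the
    neighbors (already in the tangent frame of the vertex), then return
    (K, H) = (4ac - b^2, a + c)."""
    P = np.asarray(neighbors, dtype=float)
    A = np.column_stack([P[:, 0] ** 2, P[:, 0] * P[:, 1], P[:, 1] ** 2])
    (a, b, c), *_ = np.linalg.lstsq(A, P[:, 2], rcond=None)
    return 4 * a * c - b * b, a + c

# Neighbors sampled from the saddle z = x^2 - y^2 should recover K = -4, H = 0.
```

Sampling the analytic saddle gives a quick sanity check of the fit before running it on real meshes, in the spirit of the experiments reported below.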


The Gauss-Bonnet approach [Lin, 1982] makes use of the angle αᵢ formed at v by two successive edges. The reduced form of the Gauss-Bonnet theorem for polygonal meshes is given in terms of the angle deficit as Equation 5.5.

∫∫_A K dA = 2π − ∑_{i=0}^{n−1} αᵢ   (5.5)

Assuming that K is constant in that neighborhood, the Gaussian and mean curvatures are computed as

κ = ( 2π − ∑_{i=0}^{n−1} αᵢ ) / (A/3) ;  H = ( (1/4) ∑_{i=0}^{n−1} ||eᵢ|| βᵢ ) / (A/3)   (5.6)

where A is the accumulated area of all the triangles that contain the vertex v and βᵢ is the angle deviation between the normal at vertex v and that of its neighbor along edge eᵢ. Desbrun et al. [Desbrun et al., 1999] express the normal deviation as a sum of cotangents of the angles formed at the neighboring vertices. Taubin [Taubin, 1995] defines a symmetric matrix using an integral formula involving the normal curvature. Assuming that the normals at each vertex have been computed, a matrix M_v is approximated as a weighted sum over the neighboring vertices vᵢ, where Tᵢ is the unit-length projection of the vector (vᵢ − v) onto the tangent plane at v.

M_v = ∑_{i=0}^{n−1} wᵢ κₙ(Tᵢ) Tᵢ Tᵢᵗ   (5.7)

Tᵢ = [I − N_v N_vᵗ](v − vᵢ) / || [I − N_v N_vᵗ](v − vᵢ) ||   (5.8)

The weights wᵢ in Equation 5.7 are chosen proportional to the sum of the surface areas of the triangles incident to both vertices v and vᵢ. The matrix M_v is restricted to the tangent plane, and its eigenvalues correspond to the principal curvatures.

We compare the different approaches in order to choose one of them for our CVM algorithm. We base our decision on a few experiments.
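The Gaussian part of Equation 5.6 (the angle-deficit estimate) can be sketched in a few lines; this is our illustration under the assumption that the one-ring neighbors are supplied in order around the vertex:

```python
import numpy as np

def gauss_bonnet_K(v, ring):
    """Gaussian curvature per Eq. 5.6: K ~ (2*pi - sum of angles at v) / (A/3),
    where A is the accumulated area of the incident triangles and `ring`
    lists the one-ring neighbors of v in order around the vertex."""
    v = np.asarray(v, dtype=float)
    ring = [np.asarray(p, dtype=float) for p in ring]
    angle_sum, area = 0.0, 0.0
    for i in range(len(ring)):
        e1 = ring[i] - v
        e2 = ring[(i + 1) % len(ring)] - v
        angle_sum += np.arccos(np.dot(e1, e2) / (np.linalg.norm(e1) * np.linalg.norm(e2)))
        area += 0.5 * np.linalg.norm(np.cross(e1, e2))
    return (2 * np.pi - angle_sum) / (area / 3)

# A flat hexagonal fan has zero angle deficit, hence K = 0; lifting the
# center vertex out of the plane produces a positive deficit, hence K > 0.
```

The angle deficit is exactly zero on a flat fan, which is why this estimator behaves well on the planar regions that dominate man-made parts.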
For this we chose a saddle surface for which we can compute the analytical curvature. We show the surface Gaussian curvature as a 3D mesh in Figure 5.2 because the variation of curvature along the surface is easier to visualize that way. We also present a simple multi-resolution experiment in Figure 5.2: we sampled the saddle surface so that the surface mesh models are made up of 161, 961 and 10000 vertices, respectively, and computed curvature with each of the methods discussed in the previous paragraphs. We observe that the Gauss-Bonnet approach and


Figure 5.2: Curvature analysis – multi-resolution error analysis of four approaches to curvature estimation on triangle meshes (analytic surface curvature, Gauss-Bonnet approach, paraboloid fitting, loss of angle as curvature, and Taubin's method) on the saddle surface at N = 121, 961 and 10000 vertices.


loss-of-angle approach give good estimates of the Gaussian curvature in comparison with the paraboloid and Taubin methods. We also observe that as the resolution of the data increases, Taubin's method improves drastically. The paraboloid fitting method appears to produce a sampled version of the analytical curvature; since an analytical surface is fit at each vertex to compute the curvature around it, the error of this method seems to accumulate throughout the mesh.

We next performed the curvature analysis on the unit sphere, whose Gaussian curvature estimate should equal the reciprocal of the squared radius. Figure 5.3 shows how each of these methods behaves on the sphere at different resolutions. We would like to reiterate the large error of the paraboloid fitting method at low resolutions. Since we are interested in a scheme that is consistent at all resolutions, we need to choose among the Gauss-Bonnet, loss-of-angle and Taubin methods.

We created synthetic surfaces such as the spherical cup, the saddle and the monkey saddle. Visually and analytically, the monkey saddle has the maximum variation in curvature, so we decided to choose the method that most categorically shows this variation. We call this variation the span of curvature and plot it for each surface and each of the four methods in Figure 5.4. We conclude that Taubin's and the Gauss-Bonnet approaches to curvature estimation yield accurate results.
We have used Taubin’s method to computeprincipal curvatures and <strong>the</strong> Gauss-Bonnet approach for <strong>the</strong> Gaussian curvature for ourimplementation.We have combined <strong>the</strong> simplicity of <strong>the</strong> Harvard mesh library (<strong>written</strong> <strong>by</strong> X. Gu) andspeed of Triangle mesh library (<strong>written</strong> <strong>by</strong> Michael Roy) for our triangle meshprocessing. Both <strong>the</strong> libraries are open source implementations of <strong>the</strong> half-edge datastructure in C++. We have used <strong>the</strong> Microsoft Developer Environment (MicrosoftVisual C++7.0) as our progr<strong>am</strong>ming platform. For graphs and plots however we haveused MATLAB.5.1.2 Density Estimation for Information MeasureWe would like to document our experience with <strong>the</strong> bandwidth optimization methods.Before incorporating it into our algorithm we have used <strong>the</strong> MATLAB(implementation of Christian Beardah’s) toolbox on kernel density estimation. Withground truth normal density, we have concluded that cross validation methods give usaccurate results. We have compared least squares cross validation, smoo<strong>the</strong>d crossvalidation, likelihood cross validation, biased cross validation, distribution scalemethods and <strong>the</strong> plug-in method. With large data cross validation though accurate was<strong>the</strong> most time consuming. Cross validation is a O (N 2 ) complex algorithm in <strong>the</strong> worstcase and had convergence problems with our real data. Sometimes cross validationmethods result in monotonic cost functions that output <strong>the</strong> lower limit as <strong>the</strong> optimalbandwidth. We use <strong>the</strong> plug-in method. The plug-in method is a multi pass paradigm


Figure 5.3: Curvature analysis – error in the curvature of a sphere at multiple resolutions.


Figure 5.4: Curvature analysis – variation in curvature for surface description.


that makes use of an equation involving quartiles to output a single number as the optimal bandwidth. We observed that it sometimes gives under-smoothed densities compared to the cross-validation methods, but we decided to use the plug-in method for bandwidth optimization because we want our algorithm to be fully automatic, without manual intervention. Another important parameter of the density estimate that decides its accuracy is the number of points at which we calculate the density.

Another small but significant implementation issue that we would like to throw light upon is the difference between continuous and discrete random variables. The discrete density function is not a sampled form of the continuous density function: the density at each point of a discrete random variable is less than or equal to one, and the densities sum to unity.

Since some estimated values of the density function are possibly zero, and since we are using a logarithmic information measure, we must work around the zero points of the density function. We simply do not compute entropy at the zero points.

5.2 State-of-the-Art Shape Descriptors

The analysis in this section is the backbone of our CVM algorithm. We implemented a few state-of-the-art algorithms to better understand the process of shape extraction from triangle meshes and to survey the existing curvature-based metrics.
Now that we have accurate measures of the principal curvatures and the Gaussian curvature, we can compute the curvedness and the shape index used by Dorai in her COSMOS framework for shape recognition on range images.

In Figure 5.5 we show curvedness, shape index and Gaussian curvature color-coded on models of the fan disk (model source: Hugues Hoppe, Microsoft Research). By color coding we mean that we have attributed a color in the RGB spectrum to each vertex of the model; the cosine color coding is proportional to the value of the parameter computed at that vertex. For example, in Figure 5.5(d) each vertex is color coded by Gaussian curvature. We chose the fan disk model because it has a combination of flat and curved surfaces and is neither too simple nor too complex.
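These curvature-based descriptors are simple functions of the principal curvatures κ₁ ≥ κ₂. As an illustration (our sketch; sign conventions for the shape index vary in the literature, and the form below follows the Dorai-style definition shown in Figure 5.5):

```python
import math

def descriptors(k1, k2):
    """Curvedness, Dorai-style shape index and Gaussian curvature from the
    principal curvatures k1 >= k2 (as in Figure 5.5). The shape index is
    undefined for planar points (k1 = k2 = 0)."""
    curvedness = math.sqrt((k1 * k1 + k2 * k2) / 2)
    if k1 == 0 and k2 == 0:
        shape_index = None  # planar point: shape index undefined
    else:
        shape_index = 0.5 - math.atan2(k1 + k2, k1 - k2) / math.pi
    gaussian = k1 * k2
    return curvedness, shape_index, gaussian

# Symmetric saddle (k1 = 1, k2 = -1): curvedness 1, shape index 0.5, K = -1.
```

Note that curvedness measures "how much" the surface bends while the shape index measures "in what way", which is exactly why Figure 5.5 shows them responding differently to the same geometry.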


Figure 5.5: Curvature-based descriptors. (a) Fan disk model. (b) Curvedness, C = √( (κ₁² + κ₂²)/2 ). (c) Shape index, η = 1/2 − (1/π) tan⁻¹( (κ₁ + κ₂)/(κ₁ − κ₂) ). (d) Gaussian curvature, Κ = κ₁ κ₂.

We would like to draw the following conclusions from Figure 5.5. Curvedness proves to be a good descriptor that detects abrupt changes in curvature. It is consistent at low resolutions, but with bad triangulation it produces erroneous results; we attribute this to the curvature estimation method, which assumes good, uniform triangulation. The shape index, on the other hand, is colorful even along the flat surface facing us in the diagram, indicating spurious surface variation. The definition of the shape index assumes the uniform mesh topology found in range images; that is why a spherical cap and a spherical cup, which look the same visually, possess different shape indices. In Figure 5.5(d) we see that the Gaussian curvature clearly shows the variation in curvature in each of the curved surface patches and little or no variation on the flat surfaces.

We also implemented a recent method for shape classification and description called Shape Distributions [Osada, 2002]. The approach represents shapes as histograms: points randomly sampled on the surface of a triangle mesh are used to extract features such as the centroidal profile, the distance between two points, or the angle formed by three random points, and these features are binned into a histogram that is used for object detection and classification. Results show similar shapes having similar feature histograms.
The Shape Distributions algorithm was originally intended for shape search and retrieval on the web. We tested it on our automotive parts and came to realize that several 1D features cannot completely represent the 3D information in an object. We show our implementation of Shape Distributions in Figure 5.6 and our experience representing automotive components in Figure 5.7. We demonstrate the lack of uniqueness of the description with the fan disk, disc brake and muffler models. These models have the same bounding box, but we see that the disc brake and the fan disk, though extremely different in shape, have similar histograms, while the two mufflers, though similar in shape, show a noticeable amount of variation.
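The distance-between-two-points feature (the D2 distribution of [Osada, 2002]) can be sketched as follows. This is our illustration of the published sampling scheme (area-weighted triangle selection with uniform barycentric coordinates), not the thesis's own implementation, whose details are not reproduced here:

```python
import numpy as np

def d2_distribution(vertices, triangles, n_pairs=10000, bins=32, rng=None):
    """D2 shape distribution in the spirit of [Osada, 2002]: a histogram of
    distances between random point pairs sampled uniformly on the surface."""
    rng = np.random.default_rng(0) if rng is None else rng
    V, T = np.asarray(vertices, dtype=float), np.asarray(triangles, dtype=int)
    a, b, c = V[T[:, 0]], V[T[:, 1]], V[T[:, 2]]
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)

    def sample(n):
        t = rng.choice(len(T), size=n, p=areas / areas.sum())  # area-weighted pick
        r1, r2 = rng.random(n), rng.random(n)
        s = np.sqrt(r1)  # sqrt gives uniform barycentric sampling of a triangle
        return (1 - s)[:, None] * a[t] + (s * (1 - r2))[:, None] * b[t] + (s * r2)[:, None] * c[t]

    d = np.linalg.norm(sample(n_pairs) - sample(n_pairs), axis=1)
    hist, edges = np.histogram(d, bins=bins, density=True)
    return hist, edges
```

Because the histogram discards all joint geometric structure beyond pairwise distances, two quite different solids can produce nearly identical D2 curves, which is precisely the non-uniqueness observed with the fan disk and the disc brake.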


Figure 5.6: Implementation of Shape Distributions. (a) Wire-frame model of a cube. (b) Shape Distribution result for a cube, adapted from [Osada, 2002]. (c) Result of our implementation on the cube. (d) Wire-frame model of a sphere. (e) Shape Distribution result for a sphere, adapted from [Osada, 2002]. (f) Result of our implementation on the sphere.


Figure 5.7: Shape Distributions and the uniqueness of description. (a) Model of a fan disk. (b) Model of a disc brake. (c) Model of a Toyota muffler. (d) Model of a Volvo muffler. (e)–(h) Shape Distributions of the models in (a)–(d), respectively.


5.3 Results of our Informational Approach

In Chapter 4 we discussed the CVM algorithm, which quantifies surface shape complexity. We compute curvature based on the method suggested by [Abidi, 1995] and measure boundary complexity as the Shannon entropy of curvature on 2D contours. We presented these results in [Page et al., 2003b]. Here we discuss some important results on X-ray and range images. We also analyze some limitations of using Shannon entropy and the need for a normalized information measure before discussing the results of our graph representation on automotive components.

5.3.1 Intensity and Range Images

In Figure 5.8(a) we show results on simple curves. We make a few important assumptions about these curves: they are of the same resolution and are uniformly sampled. We computed the Shannon entropy of the turning angle at each point on the boundary as the shape complexity measure (SCM). Note that SCM and CVM are similar measures but are not equivalent: SCM inspired the development of CVM, and CVM represents the evolution of SCM from lessons learned about scaling and resolution. We would like to emphasize in these results how shape information behaves under symmetry and how important the assumptions on size and resolution turn out to be. Shape information from Shannon's measure cannot be compared if the two images are not at the same resolution and of comparable size.
Hence for <strong>the</strong> real data we have normalized <strong>the</strong>segmented region of interest for size and resolution and <strong>the</strong>n computed <strong>the</strong> curvaturebasedmeasure on <strong>the</strong> normalized boundary contour. We would like to recall fromChapter 2 and note that our method falls under <strong>the</strong> boundary-based descriptionmethods. In Figure 5.8(b) we show an ex<strong>am</strong>ple with an X-ray image of a baggage. Thebag contains a few objects that we have segmented manually. We take each of <strong>the</strong>segmented objects and <strong>the</strong>n compute <strong>the</strong> shape information on each of <strong>the</strong>se contours.Our measure categorizes complex objects and simple ones with satisfactory ease.Next, we show some results on range images in Figure 5.8(c).We believe that we willbe able to distinguish between <strong>the</strong> man-made structures that have flat and nice edgeslike <strong>the</strong> building in Figure 5.8 (c) and natural vegetation that has rugged boundaries.5.3.2 Surface RuggednessIn terms of resolution we would like to present some results on syn<strong>the</strong>tic DEMs(Digital Elevation Maps) of <strong>the</strong> s<strong>am</strong>e resolution. The Shannon’s entropy of curvaturegives a consistent ruggedness measure of <strong>the</strong> surface. But we still face inconsistencywith resolution. We formulate our algorithm on <strong>the</strong> heuristic that <strong>the</strong> variation in <strong>the</strong>shape characteristics of surfaces is ma<strong>the</strong>matically <strong>the</strong> variation of curvature. Wedefine shape information as <strong>the</strong> entropy of <strong>the</strong> curvature density of <strong>the</strong> surface underconsideration.


Figure 5.8: Shape complexity measure using Shannon's definition of information. (a) Results on simple curves. (b) Results on objects segmented from the X-ray image of a piece of baggage (SCM = 1.227, 2.3458, 3.4050, 0.891). (c) Results on contours segmented from a range image (SCM = 0.5, 1.7).


Figure 5.9: Shape information and surface ruggedness. (a) Shape information measured on a DEM of a plain terrain (SCM = 0.6). (b) Shape information on a plateau terrain (SCM = 1.276). (c) Shape information on a mountainous terrain (SCM = 2.2).

In Figure 5.9 we show three surfaces. Figure 5.9(a) can be considered to represent a plain, while Figures 5.9(b) and 5.9(c) represent a plateau and a mountainous region, respectively. We have color coded each of these surfaces by the scale shown in the picture. In agreement with our perception, we observe that the information that the total curvature conveys about each of these surfaces is well quantified by the SCM.

5.3.3 3D Mesh Models

We see that the CVM algorithm behaves as expected in Figures 5.9(a)–(c), but it is still not robust because of the assumptions on resolution and sampling. We view the problem of resolution as a lack of information, and we compensate with a reference of least shape information: for a contour, its circumscribing circle at the same resolution; for a surface, a plane at the same resolution; and for a 3D model, its circumscribing sphere. We use a circumscribing reference because it is easy to determine from the characteristics of the model, and we might lose an important length dimension with an inscribed reference: the inscribed sphere of a cylinder, for instance, could turn out to be too small compared to the size of the cylinder and would not be a good reference.
We measured shape complexity as the shape information distance between the two datasets, using the KL divergence (Equation 4.24) on superquadrics of varying shape factors. We present those results in Figure 5.10.


( (x/a₁)^{2/e₂} + (y/a₂)^{2/e₂} )^{e₂/e₁} + (z/a₃)^{2/e₁} = 1

Figure 5.10: Shape information divergence from the sphere – experimental results on superquadrics.


The results on the superquadrics are interesting. We chose superquadrics for our experiments because they provide a family of slowly varying shapes (controlled by a parameter) with smoothly varying curvature. Though we are unable to cluster or classify shapes based on a single number, we would like to point out the success with superquadrics. We did, however, have to deal with another major problem: the asymptotic behavior of the divergence measure as the resolution tends to infinity. In the continuous case, the curvature density of the sphere becomes an impulse function, which then serves as the reference for the curvature density of another 3D model. Though the divergence measures are defined for continuous random variables, our shape complexity measure becomes unstable as the resolution approaches the continuous case. We are also unable to justify what it means for two completely different shapes to have the same measure. Although the magnitude of the measure can be understood as the number of bits required to describe the shape complexity of the object, it is not very convincing for the application of shape classification or clustering.

Our focus is to make the CVM independent of resolution and to support it with theoretical consistency. We therefore decided to change the reference from the sphere, which represents the object with the least information, to the abstract most complex object at a given resolution.
We have extended <strong>the</strong> Shannon’s definition to a resolution normalizedentropy form as shown in Equation 5.9.H∆( X ) p( x )logR p( x )(5.9)= −where R is <strong>the</strong> resolution of <strong>the</strong> datasets under consideration and p(x) is <strong>the</strong> probabilitydensity of <strong>the</strong> curvature. We achieve two things with this measure of information. Themeasure is normalized. It has a minimum value of zero and a maximum value of one.The measure is in a logarithmic scale and is resolution independent. In Figure 5.11 weshow how curvature on surfaces acts a descriptor with <strong>the</strong> spherical cup, saddle and<strong>the</strong> monkey saddle surfaces. We would like to point out that <strong>the</strong> broader <strong>the</strong>probability density function <strong>the</strong> higher <strong>the</strong> complexity. In Figure 5.12 we perform <strong>am</strong>ulti-resolution experiment with our CVM shape signature. We res<strong>am</strong>ple <strong>the</strong> monkeysaddle without obvious change in shape to show that our measure is now independentof resolution. N refers to <strong>the</strong> number of vertices in that surface and F is <strong>the</strong> number offaces.We recollect <strong>the</strong> experience with <strong>the</strong> curvature-based descriptors. Curvature alone is nota sufficient feature for shape description because we have lost more than twodimensions of description in trying to represent 3D into a 1D function of curvature.However, now that we have verified <strong>the</strong> surface description capabilities of our measurewe propose to describe objects that can be broken down into surface patches. We makeuse of curvedness for this task. We identify <strong>the</strong> sharp edges and creases and use it forsegmentation of <strong>the</strong> triangle meshes.


Figure 5.11: Surface description results – the surface, curvature at each vertex and density of curvature of (a) spherical cap, (b) saddle and (c) monkey saddle.


Figure 5.12: Multi-resolution experiment on the monkey saddle – the surface, its curvature density and the measure of shape information. (The CVM values range from 0.548 to 0.644 across resamplings from N = 50, F = 84 up to N = 882, F = 1640.)


We present the results of the shape description proposed in this thesis in Figures 5.13 and 5.14. We start with the description of the simple cube in Figure 5.13(a). We show how the six faces of the cube are interconnected in the graph; since each of these faces is planar, they convey no shape information. We would like to emphasize that all cuboids will have the same description, which can be distinguished only with scale information alongside the graph. With the fan disk example, we show the graph complexity we will face with increasingly complex parts. Figures 5.13(c) and 5.14(a)–(c) are our experimental results on automotive components. Because our assumptions about man-made components align well with the informational signature we have proposed, our results are good. Before concluding this section, we show in Figure 5.15 the result of applying our measure to the real scene that we acquired. We show the scene, the muffler segmented from the scene and its description, which looks very similar to the muffler results in Figure 5.13(c). We consider this our first step towards object detection. However, for the algorithm to be fully automated for object detection, we need an implementation of partial graph matching. We also have to address occlusion problems and representation issues with increasingly complex components. This section concludes our experimental results for the CVM algorithm.
We have presented the evolution of the algorithm in this chapter, with results and analysis at each stage of its development. We now move to the final section of this thesis, where we draw conclusions from these results and then discuss future directions for our research.


Figure 5.13: CVM graph results on simple mesh models – the triangle mesh model, curvedness-based sharp edge detection, smooth patch decomposition and graph representation of (a) cube, (b) fan disk and (c) disc brake.
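The patch-to-graph construction shown in these figures can be sketched in plain Python. This is a toy illustration of ours, not the thesis implementation; the data layout, names and example mesh are invented. Given triangle faces and a patch label per face, two patches become graph neighbors when faces from each share a mesh edge, i.e. when a crease separates them.

```python
from collections import defaultdict

def patch_graph(faces, patch_of):
    """Adjacency graph over smooth patches of a triangle mesh.

    faces:    list of (a, b, c) vertex-index triples
    patch_of: patch label for each face (from crease-based segmentation)
    """
    edge_faces = defaultdict(list)   # undirected edge -> incident faces
    for f, (a, b, c) in enumerate(faces):
        for e in ((a, b), (b, c), (c, a)):
            edge_faces[frozenset(e)].append(f)
    graph = defaultdict(set)
    for fs in edge_faces.values():
        if len(fs) == 2:             # interior (manifold) edge
            p, q = patch_of[fs[0]], patch_of[fs[1]]
            if p != q:               # the edge crosses a crease between patches
                graph[p].add(q)
                graph[q].add(p)
    return dict(graph)

# Toy mesh: four triangles segmented into two smooth patches (0 and 1).
faces = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4)]
graph = patch_graph(faces, patch_of=[0, 0, 1, 1])
print(graph)
```

On a cube segmented into its six planar faces, this construction would give each patch node four neighbors, consistent with the cube description of Figure 5.13(a).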


Figure 5.14: CVM graph results on automotive parts – the triangle mesh model, curvedness-based sharp edge detection, smooth patch decomposition and graph representation of (a) catalytic converter, (b) Volvo muffler and (c) Toyota muffler.


Figure 5.15: CVM graph results on an under-vehicle scene. (a) Under-vehicle scene. (b) Segmented muffler model. (c) Shape description of the muffler.


6 CONCLUSIONS

In this thesis, we have described a pipeline for real-time imaging and 3D modeling of automotive parts, together with a representation scheme that simplifies the task of threat detection for vehicle inspection. This research relies heavily on a heuristic, which we call the CVM, based on the curvature of surfaces and its contribution to describing surface complexity. In the previous chapters, we reviewed research in the computer vision literature similar to our algorithm as context for our contributions and presented the supporting theory along with experimental results. We also discussed certain implementation issues of the algorithm. We now conclude with a brief summary of the contributions and a short discussion of future directions.

6.1 Contributions

Our research efforts were focused on the construction of a scanning mechanism able to create 3D models of automotive components. We used the sheet-of-light active range imaging technique for the data acquisition task and extended its capability to extract the geometry of an automotive scene. We outlined our design efforts towards data collection and followed them with results on 3D model creation and analysis of objects. We also presented experimental results of an information theory-based surface shape description algorithm on the laser-scanned 3D models.
The 3D data acquisition process that generates a dense point cloud for a particular view of an object, the fusion of multiple views, and the surface graph representation of the models (comparable to CAD) together form our implementation of a pipeline that aids reverse engineering and inspection.

Based on our survey and implementation of state-of-the-art algorithms for curvature estimation on triangle meshes, we have presented a rigorous analysis of the key methods. Our comparison sheds light on the errors in the magnitude of curvature, the effect of factors such as resolution, and each method's effectiveness in describing visual complexity.

We would hence summarize the quintessence of this thesis as the definition of the CVM as an informational approach to shape description. We used curvature estimates at each vertex to generate probability distribution curves, and with these curves we formulated an information-theoretic measure based on entropy to define surface shape complexity. In the spirit of Claude Shannon's definition of information, this measure reflects the amount of shape information that an object surface possesses. Objects and scenes with nearly constant curvature yield relatively low values of shape information, while objects and scenes with significant variation in curvature exhibit fairly large values. Though our idea of using curvature as a feature for description is not new, our attempt to quantify the perceptual complexity of a surface using information theory is. Since our approach describes surfaces, occluded scenes such as the one obtained in real time can also be represented with a good degree of confidence towards object detection.

6.2 Directions for the Future

We feel that the process of creating 3D models has scope for improvement. Using our system design, it takes nearly six hours to scan, fuse and integrate multiple views into a complete 3D triangle mesh model. We did not consider optimizing views to minimize scan time; approaching view planning as a sensor placement problem would improve efficiency, and the solution to that problem would also enhance under-vehicle scene modeling. Our system design likewise has considerable scope for improvement towards vehicle inspection. Instead of using a conveyor belt that houses the range sensor, it would be better to have a calibrated setup to control the relative motion.
We would also like to redesign the scanning mechanism to be robot mountable, to automate the scanning process.

In the typical context of aligning range scans of an object in order to create a complete model of that object, we would like to point out the possibility of applying our algorithm to surface registration. Surface registration is a feature-dependent process: more features improve registration, and by features we mean unique geometric information here. We believe that representing a range scan as a cluster of shape measures around a neighborhood would help us recover the rigid transformation from another view of the same object that shares some common information (overlap). A multi-scale hierarchical informational approach should be a good start for this process.

Object recognition is an extremely difficult task, with most current solutions limited to very constrained and restricted problem domains. Although we do not claim contributions in terms of recognition as yet, we are encouraged by the results in this thesis that it might serve as a first step in the recognition pipeline. Shapiro and Stockman [Shapiro, 2001] suggest commonly used paradigms for object recognition, where the method chosen depends heavily on the application. They discuss two paradigms that use part (region) relationships to move away from a geometric definition of an object towards a more symbolic one. Our algorithm benefits the creation of such a symbolic graph representation from a mesh representation. We would like to perform rigorous experiments on partial graph matching for threat detection and model-based object matching before we claim confidence and robustness. We would also like to experiment with the effect of segmentation on our algorithm. More robust segmentation methods, based on the minima rule and boundary refinement, could substantially enhance the performance when our algorithm is used for object detection and recognition.

6.3 Closing Remarks

In the first chapter of this document, we proposed to use the part-based human perception model for shape analysis. Though our implementation does not completely capture the perceptual power of the human mind or its coordination with the eye, the concepts presented in this thesis are a first step, though a very small one, towards extending the state of the art in 3D computer vision.


BIBLIOGRAPHY


[Abidi, 1995] B. R. Abidi, "Automatic Sensor Placement for Volumetric Object Characterization," PhD Thesis, University of Tennessee, Knoxville, 1995.
[Ankerst et al., 1999] M. Ankerst, G. Kastenmüller, H. P. Kriegel and T. Seidl, "3D Shape Histograms for Similarity Search and Classification in Spatial Databases," Lecture Notes in Computer Science, Volume 1651, Springer, 1999, pp. 207-226.
[Arman and Aggarwal, 1993] F. Arman and J. Aggarwal, "Model-based object recognition in dense-range images—A review," ACM Computing Surveys, 1993, Volume 25, Issue 1, pp. 5-43.
[Bernardini et al., 1999] F. Bernardini, C. L. Bajaj, J. Chen and D. R. Schikore, "Automatic reconstruction of 3D CAD models from digital scans," International Journal on Computational Geometry and Applications, 1999, Volume 9, Issue 4/5, pp. 327-369.
[Besl and Jain, 1986] P. J. Besl and R. C. Jain, "Invariant surface characteristics for 3D object recognition in range images," Journal of Computer Vision, Graphics and Image Processing, 1986, Volume 33, pp. 33-80.
[Besl, 1988] P. J. Besl, "Surfaces in Range Image Understanding," Springer-Verlag, New York, NY, 1988.
[Besl and McKay, 1992] P. J. Besl and N. D. McKay, "A method for registration of 3D shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, Volume 14, Issue 2, pp. 239-256.
[Besl, 1995] P. Besl, "Triangles as a primary representation: Object Recognition in Computer Vision," Lecture Notes in Computer Science, 1995, pp. 191-206.


[Biermann, 2001] H. Biermann, D. Kristjansson and D. Zorin, "Approximate Boolean Operations on Free-form Solids," In the Proceedings of SIGGRAPH 2001, Los Angeles, California, August 2001, pp. 185-194.
[Belongie et al., 2002] S. Belongie, J. Malik and J. Puzicha, "Shape Matching and Object Recognition Using Shape Contexts," IEEE Transactions on Pattern Analysis and Machine Intelligence, April 2002, Volume 24, Issue 4, pp. 509-522.
[Beretti et al., 2000] S. Berretti, A. D. Bimbo and P. Pala, "Retrieval by shape similarity with perceptual distance and effective indexing," IEEE Transactions on Multimedia, 2000, Volume 2, Issue 4, pp. 225-239.
[Bimbo and Pala, 1997] A. D. Bimbo and P. Pala, "Visual image retrieval by elastic matching of user sketches," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, Volume 19, Issue 2, pp. 121-132.
[Blum, 1967] H. Blum, "A transformation for extracting new descriptors of shape," Models for the Perception of Speech and Visual Forms, MIT Press, Cambridge, MA, 1967, pp. 362-380.
[Campbell and Flynn, 2001] R. J. Campbell and P. J. Flynn, "A Survey of Free-Form Object Representation and Recognition Techniques," Journal on Computer Vision and Image Understanding, February 2001, Volume 81, Issue 2, pp. 166-210.
[Cardone et al., 2003] A. Cardone, S. K. Gupta and M. Karnik, "A Survey of Shape Similarity Assessment Algorithms for Product Design and Manufacturing Applications," ASME Journal of Computing and Information Science in Engineering, 2003, Volume 3, Issue 2, pp. 109-118.


[Carmo, 1976] M. P. do Carmo, "Differential Geometry of Curves and Surfaces," Prentice Hall Inc., Englewood Cliffs, NJ, 1976.
[Champleboux et al., 1992] G. Champleboux, S. Lavallee, P. Sautot and P. Cinquin, "Accurate calibration of cameras and range imaging sensor: the NPBS method," In the Proceedings of the IEEE International Conference on Robotics and Automation, California, 1992, Volume 2, pp. 1552-1557.
[Chen and Schmitt, 1992] X. Chen and F. Schmitt, "Intrinsic surface properties from surface triangulation," In the Proceedings of the European Conference on Computer Vision, Italy, 1992, pp. 739-743.
[Chakrabarti et al., 2000] K. Chakrabarti, M. O. Binderberger, K. Porkaew and S. Mehrotra, "Similar shape retrieval in MARS," In the Proceedings of the IEEE International Conference on Multimedia and Expo, New York, USA, 2000, Volume 2, pp. 709-712.
[Chellappa and Bagdazian, 1984] R. Chellappa and R. Bagdazian, "Fourier coding of image boundaries," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1984, Volume 6, Issue 1, pp. 102-105.
[Chung, 1997] F. R. Chung, "Spectral Graph Theory," American Mathematical Society, 1997.
[Corney et al., 2002] J. Corney, H. Rea, J. Clark, J. Pritchard, M. Breaks and R. MacLeod, "Coarse Filters for Shape Matching," IEEE Transactions on Computer Graphics and Applications, 2002, Volume 22, Issue 3, pp. 65-74.
[Cybenko et al., 1997] G. Cybenko, A. Bhasin and K. Cohen, "Pattern Recognition of 3D CAD Objects," Smart Engineering Systems Design, 1997, Volume 1, pp. 1-13.


[Cyr and Kimia, 2001] C. M. Cyr and B. B. Kimia, "3D object recognition using shape similarity-based aspect graph," In the Proceedings of the International Conference on Computer Vision, 2001, pp. 254-261.
[Davies, 1997] E. R. Davies, "Machine Vision: Theory, Algorithms, Practicalities," Academic Press, New York, 1997, pp. 171-191.
[Davis and Chen, 2001] J. Davis and X. Chen, "A Laser Range Scanner Designed for Minimum Calibration Complexity," In the Proceedings of the Third International Conference on 3D Digital Imaging and Modeling, 2001.
[Desbrun et al., 1999] M. Desbrun, M. Meyer, P. Schröder and A. H. Barr, "Implicit fairing of irregular meshes using diffusion and curvature flow," In Computer Graphics Proceedings (SIGGRAPH '99), 1999, pp. 317-324.
[Delingette, 1999] H. Delingette, "General object reconstruction based on simplex meshes," International Journal of Computer Vision, 1999, Volume 32, Issue 2, pp. 111-146.
[Dorai, 1996] C. Dorai, "COSMOS: A framework for the representation and recognition of free form objects," PhD Thesis, Michigan State University, 1996.
[Duda and Hart, 1973] R. M. Duda and P. E. Hart, "Pattern Classification and Scene Analysis," John Wiley and Sons, New York, 1973.
[Dudek and Tsotsos, 1997] G. Dudek and J. K. Tsotsos, "Shape representation and recognition from multi-scale curvature," Journal of Computer Vision and Image Understanding, 1997, Volume 68, Issue 2, pp. 170-189.


[Elad et al., 2001] M. Elad, A. Tal and S. Ar, "Content-Based Retrieval of VRML Objects – an iterative and interactive approach," Eurographics Multimedia Workshop, 2001, pp. 97-108.
[Elinson et al., 1997] A. Elinson, D. Nau and W. C. Regli, "Feature-based Similarity Assessment of Solid Models," In the Proceedings of the 4th ACM/SIGGRAPH Symposium on Solid Modeling and Applications, Atlanta, 1997, pp. 297-310.
[Fix and Hodges, 1951] E. Fix and J. L. Hodges, "Discriminatory analysis, nonparametric discrimination: consistency properties," Technical Report 4, Randolph Field, Texas, US Air Force, 1951.
[Flynn and Jain, 1989] P. J. Flynn and A. K. Jain, "On reliable curvature estimation," In the Proceedings of the International Conference on Computer Vision and Pattern Recognition, 1989, pp. 110-116.
[Freeman and Saghri, 1978] H. Freeman and A. Saghri, "Generalized chain codes for planar curves," In the Proceedings of the Fourth International Joint Conference on Pattern Recognition, Kyoto, Japan, November 1978, pp. 701-703.
[Fu, 1974] K. S. Fu, "Syntactic Methods in Pattern Recognition," Academic Press, New York, 1974.
[Gonzalez and Woods, 1992] R. C. Gonzalez and R. E. Woods, "Digital Image Processing," Addison-Wesley, Reading, MA, 1992, pp. 502-503.
[Gotsman et al., 2003] C. Gotsman, X. Gu and A. Sheffer, "Fundamentals of spherical parameterization for 3D meshes," In the Proceedings of ACM SIGGRAPH, 2003, pp. 358-363.


[Gourley, 1998] C. S. Gourley, "Pattern vector based reduction of large multimodal data sets for fixed rate interactivity during visualization of multiresolution models," PhD Thesis, University of Tennessee, Knoxville, TN, 1998.
[Goshtasby, 1985] A. Goshtasby, "Description and discrimination of planar shapes using shape matrices," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1985, Volume 7, pp. 738-743.
[Groskey and Mehrotra, 1990] W. I. Groskey and R. Mehrotra, "Index-based object recognition in pictorial data management," Journal of Computer Vision, Graphics and Image Processing, 1990, Volume 52, pp. 416-436.
[Groskey et al., 1992] W. I. Groskey, P. Neo and R. Mehrotra, "A pictorial index mechanism for model-based matching," Data Knowledge Engineering, 1992, Volume 8, pp. 309-327.
[Guillaume et al., 2004] L. Guillaume, D. Florent and B. Atilla, "Curvature Tensor Based Triangle Mesh Segmentation with Boundary Rectification," In the Proceedings of Computer Graphics International, Crete, Greece, June 2004, pp. 10-17.
[Hetzel et al., 2001] G. Hetzel, B. Leibe, P. Levi and B. Schiele, "3D Object Recognition from Range Images using Local Feature Histograms," In the Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2001, pp. 394-399.
[Henderson et al., 1993] M. R. Henderson, G. Srinath, R. Stage, K. Walker and W. Regli, "Boundary Representation-based Feature Identification," In Advances in Feature Based Manufacturing, Elsevier-North Holland Publishers, Amsterdam, 1993.


[Hilaga et al., 2001] M. Hilaga, Y. Shinagawa, T. Kohmura and T. L. Kunii, "Topology Matching for Fully Automatic Similarity Estimation of 3D Shapes," In the Proceedings of SIGGRAPH, ACM Press, 2001, pp. 203-212.
[Hoppe et al., 1992] H. Hoppe, T. DeRose, T. Duchamp, J. McDonald and W. Stuetzle, "Surface reconstruction from unorganized points," In the Proceedings of ACM SIGGRAPH, 1992, pp. 71-78.
[Horn et al., 1998] B. Horn, H. Hilden and S. Negahdaripour, "Closed-form solution of absolute orientation using orthonormal matrices," Journal of the Optical Society of America (Optics and Image Science), 1998, Volume 5, Issue 7, pp. 1127-1135.
[Hu, 1962] M. K. Hu, "Visual pattern recognition by moment invariants," IRE Transactions on Information Theory, 1962, Volume 8, pp. 179-187.
[Huttenlocher, 1992] D. P. Huttenlocher and W. J. Rucklidge, "A multi-resolution technique for comparing images using the Hausdorff distance," Technical Report TR-92-1321, Department of Computer Science, Cornell University, 1992.
[Jeannin, 2000] S. Jeannin (Editor), "MPEG-7 Visual part of experimentation model version 5.0," ISO/IEC JTC1/SC29/WG11/N3321, Noordwijkerhout, March 2000.
[Johnson and Hebert, 1999] A. Johnson and M. Hebert, "Using spin images for efficient object recognition in cluttered 3D scenes," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, Volume 21, Issue 5, pp. 433-449.
[Jones et al., 1996] M. C. Jones, J. S. Marron and S. J. Sheather, "A brief survey of bandwidth selection for density estimation," Journal of the American Statistical Association, 1996, Volume 91, Issue 433, pp. 401-407.


[Joshi and Chang, 1988] S. Joshi and T. C. Chang, "Graph-based Heuristics for Recognition of Machined Features from a 3D Solid Model," Computer-Aided Design Journal, 1988, Volume 20, Issue 2, pp. 58-66.
[Kazhdan et al., 2003] M. Kazhdan, T. Funkhouser and S. Rusinkiewicz, "Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors," In the Proceedings of the ACM/Eurographics Symposium on Geometry Processing, 2003, pp. 167-175.
[Khotanzad and Hong, 1990] A. Khotanzad and Y. H. Hong, "Invariant Image Recognition by Zernike Moments," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1990, Volume 12, Issue 5, pp. 489-497.
[Kliot and Rivlin, 1998] M. Kliot and E. Rivlin, "Invariant-based shape retrieval in pictorial databases," Journal of Computer Vision and Image Understanding, 1998, Volume 71, Issue 2, pp. 182-197.
[Kortgen et al., 2003] M. Kortgen, G. J. Park, M. Novotni and R. Klein, "3D Shape Matching with 3D Shape Contexts," In the Proceedings of the 7th Central European Seminar on Computer Graphics, April 2003.
[Kresk et al., 1998] P. Krsek, C. Lukacs and R. R. Martin, "Algorithms for computing curvatures from range data," In The Mathematics of Surfaces VIII, Information Geometers, 1998, pp. 1-16.
[Kriegel et al., 2003] H. P. Kriegel, P. Kröger, Z. Mashael, M. Pfeifle, M. Pötke and S. Seidl, "Effective Similarity Search on Voxelized CAD Objects," In the Proceedings of the 8th International Conference on Database Systems for Advanced Applications, Kyoto, Japan, 2003, pp. 27-36.


[Leibowitz et al., 1999] N. Leibowitz, Z. Y. Fligelman, R. Nussinov and H. J. Wolfson, "Multiple Structural Alignment and Core Detection by Geometric Hashing," In the Proceedings of the 7th International Conference on Intelligent Systems in Molecular Biology, Heidelberg, Germany, 1999, pp. 169-177.
[Levoy et al., 2000] M. Levoy, K. Pulli, B. Curless, S. Rusinkiewicz, D. Koller, L. Pereira, M. Ginzton, S. Anderson, J. Davis, J. Ginsberg, J. Shade and D. Fulk, "The Digital Michelangelo Project: 3D Scanning of Large Statues," In the Proceedings of ACM SIGGRAPH, 2000, pp. 131-144.
[Lin and Perry, 1982] C. Lin and M. J. Perry, "Shape description using surface triangulation," In the Proceedings of the IEEE Workshop on Computer Vision: Representation and Control, 1982, pp. 38-43.
[Lu and Sajjanhar, 1999] G. J. Lu and A. Sajjanhar, "Region-based shape representation and similarity measure suitable for content-based image retrieval," Journal of Multimedia Systems, 1999, Volume 7, Issue 2, pp. 165-174.
[Mangan and Whitaker, 1999] A. P. Mangan and R. T. Whitaker, "Partitioning 3D surface meshes using watershed segmentation," IEEE Transactions on Visualization and Computer Graphics, 1999, Volume 5, Issue 4, pp. 308-321.
[McWherter et al., 2001] D. McWherter, M. Peabody, W. C. Regli and A. Shoukofandeh, "Solid Model Databases: Techniques and Empirical Results," ASME Journal of Computing and Information Science in Engineering, 2001, Volume 1, Issue 4, pp. 300-310.
[Mehrotra and Gary, 1995] R. Mehrotra and J. E. Gary, "Similar-shape retrieval in shape data management," IEEE Transactions on Computing, 1995, Volume 28, Issue 9, pp. 57-62.


[Meyer, 2002] M. Meyer, M. Desbrun and P. Alliez, "Intrinsic Parameterizations of Surface Meshes," Eurographics 2002, Volume 21, Issue 2, 2002.
[Morse, 1994] B. S. Morse, "Computation of object cores from grey-level images," PhD Thesis, University of North Carolina at Chapel Hill, 1994.
[Mukai et al., 2002] S. Mukai, S. Furukawa and M. Kuroda, "An Algorithm for Deciding Similarities of 3D Objects," In the Proceedings of the ACM Symposium on Solid Modelling and Applications, Saarbrücken, Germany, June 2002.
[Oddo, 1992] L. A. Oddo, "Global shape entropy: A mathematically tractable approach to building extraction in aerial imagery," In the Proceedings of the 20th SPIE AIPR Workshop, 1992, Volume 1623, pp. 91-101.
[Ohbuchi et al., 2003] R. Ohbuchi, T. Minamitani and T. Takei, "Shape Similarity Search of 3D Models by using Enhanced Shape Functions," In the Proceedings of Theory and Practice in Computer Graphics, Birmingham, U.K., June 2003.
[Osada et al., 2002] R. Osada, T. Funkhouser, B. Chazelle and D. Dobkin, "Shape Distributions," ACM Transactions on Graphics, October 2002, Volume 21, Issue 4, pp. 807-832.
[Pap, 2002] E. Pap, "A Handbook on Measure Theory," Elsevier North Holland Press, 2002.
[Page et al., 2001] D. L. Page, Y. Sun, A. F. Koschan, J. Paik and M. A. Abidi, "Robust crease detection and curvature estimation of piecewise smooth surfaces from triangle mesh approximations using normal voting," In the Proceedings of the International Conference on Computer Vision and Pattern Recognition, 2001, Volume 1, pp. 162-167.


[Page et al., 2003a] D. L. Page, A. F. Koschan, Y. Sun and M. A. Abidi, "Laser-based Imaging for Reverse Engineering," Sensor Review, Special issue on Machine Vision and Laser Scanners, July 2003, Volume 23, Issue 3, pp. 223-229.
[Page et al., 2003b] D. L. Page, A. F. Koschan, S. R. Sukumar, B. Abidi and M. A. Abidi, "Shape analysis algorithm based on information theory," In the Proceedings of the International Conference on Image Processing, Barcelona, Spain, September 2003, Volume 1, pp. 229-232.
[Parui et al., 1986] S. Parui, E. Sarma and D. Majumder, "How to discriminate shapes using the shape vector," Pattern Recognition Letters, 1986, Volume 4, pp. 201-204.
[Parzen, 1962] E. Parzen, "On estimation of a probability density function and mode," Annals of Mathematical Statistics, 1962, pp. 1065-1076.
[Pavlidis, 1982] T. Pavlidis, "Algorithms for Graphics and Image Processing," Computer Science Press, Rockville, MD, 1982.
[Peura and Ivarinen, 1997] M. Peura and J. Ivarinen, "Efficiency of simple shape descriptors," In the Proceedings of the Third International Workshop on Visual Form, Capri, Italy, May 1997, pp. 443-451.
[Reza, 1961] F. M. Reza, "An Introduction to Information Theory," McGraw-Hill, 1961.
[Rosenblatt, 1956] M. Rosenblatt, "Remarks on some non-parametric estimates of a density function," Annals of Mathematical Statistics, 1956, pp. 642-669.
[Rucklidge, 1997] W. J. Rucklidge, "Efficiently locating objects using the Hausdorff distance," International Journal of Computer Vision, 1997, Volume 24, Issue 3, pp. 251-270.


[Safar et al., 2000] M. Safar, C. Shahabi and X. Sun, "Image retrieval by shape: a comparative study," In the Proceedings of the IEEE International Conference on Multimedia and Expo, New York, USA, 2000, Volume 1, pp. 141-144.
[Shannon, 1948] C. E. Shannon, "A mathematical theory of communication," The Bell System Technical Journal, 1948, Volume 27, pp. 379-423.
[Shum et al., 1996] H. Shum, M. Hebert and K. Ikeuchi, "On 3D shape similarity," In the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1996, pp. 526-531.
[Silverman, 1986] B. W. Silverman, "Density Estimation for Statistics and Data Analysis," Chapman and Hall, London, 1986.
[Sonka et al., 1993] M. Sonka, V. Hlavac and R. Boyle, "Image Processing, Analysis and Machine Vision," Chapman & Hall, London, UK, 1993, pp. 193-242.
[Squire and Caelli, 2000] D. M. Squire and T. M. Caelli, "Invariance signature: characterizing contours by their departures from invariance," Journal of Computer Vision and Image Understanding, 2000, Volume 77, pp. 284-316.
[Surazhsky et al., 2003] T. Surazhsky, E. Magid, O. Soldea, G. Elber and E. Rivlin, "A Comparison of Gaussian and Mean Curvatures Estimation Methods on Triangular Meshes," In the Proceedings of the International Conference on Robotics and Automation, Taiwan, September 2003, pp. 1021-1026.
[Stankiewicz, 2002] B. J. Stankiewicz, "Models of the Perceptual System," In the Encyclopedia of Cognitive Science, Macmillan Press, 2002.
[Stokely and Wu, 1992] E. M. Stokely and S. Y. Wu, "Surface Parameterization and Curvature Measurement of Arbitrary 3D Objects: Five Practical Methods," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, Volume 14, Issue 8, pp. 833-840.
[Suk and Bhandarkar, 1992] M. Suk and S. M. Bhandarkar, "Three-Dimensional Object Recognition from Range Images," Springer-Verlag, Tokyo, 1992.
[Takatsuka et al., 1999] M. Takatsuka, A. W. Geoff, S. Venkatesh and T. M. Caelli, "Low cost interactive monocular range finder," In the Proceedings of Computer Vision and Pattern Recognition, Colorado, June 1999, Volume 1, pp. 1444-1451.
[Taza and Suen, 1989] A. Taza and C. Suen, "Discrimination of planar shapes using shape matrices," IEEE Transactions on Systems, Man and Cybernetics, 1989, Volume 19, pp. 1281-1289.
[Taubin and Cooper, 1991] G. Taubin and D. B. Cooper, "Recognition and positioning of rigid objects using algebraic moment invariants," SPIE Conference on Geometric Methods in Computer Vision, Volume 1570, University of Florida, Florida, USA, 1991, pp. 175-186.
[Taubin and Cooper, 1992] G. Taubin and D. B. Cooper, "Object recognition based on moment," Geometric Invariance in Computer Vision, MIT Press, Cambridge, MA, 1992, pp. 375-397.
[Taubin, 1995] G. Taubin, "Estimating the tensor of curvature of a surface from a polyhedral approximation," In the Proceedings of the Fifth International Conference on Computer Vision, 1995, pp. 902-907.
[Teague, 1980] M. R. Teague, "Image analysis via the general theory of moments," Journal of the Optical Society of America, 1980, Volume 70, Issue 8, pp. 920-930.
[Thompson et al., 1999] W. B. Thompson, J. C. Owen and H. J. Germain, "Feature-based reverse engineering of mechanical parts," IEEE Transactions on Robotics and Automation, 1999, Volume 15, pp. 57-66.
[Turlach, 1996] B. A. Turlach, "Bandwidth Selection in Kernel Density Estimation: A Review," C.O.R.E. and Institut de Statistique, Université catholique de Louvain, Belgium, 1996.
[Trucco and Verri, 1998] E. Trucco and A. Verri, "Introductory Techniques for 3D Computer Vision," Prentice Hall, 1998.
[Vranic and Saupe, 2001] D. V. Vranic and D. Saupe, "3D Shape Descriptor Based on 3D Fourier Transform," In the Proceedings of the EURASIP Conference on Digital Signal Processing for Multimedia Communications and Services, Hungary, September 2001, pp. 271-274.
[Vranic, 2003] D. V. Vranic, "An Improvement of Rotation Invariant 3D Shape Descriptor Based on Functions on Concentric Spheres," In the Proceedings of the IEEE International Conference on Image Processing, Barcelona, Spain, September 2003, Volume 3, pp. 757-760.
[Wu and Lin, 1996] Wu and Lin, "Information bound for bandwidth selection in kernel density estimators," Statistica Sinica, 1996, Volume 6, pp. 129-145.
[Wand, 1995] M. P. Wand and M. C. Jones, "Kernel Smoothing," Chapman and Hall, London, 1995.
[Yang et al., 1998] H. S. Yang, S. U. Lee and K. M. Lee, "Recognition of 2D object contours using starting-point-independent wavelet coefficient matching," Journal of Visual Communication and Image Representation, 1998, Volume 9, Issue 2, pp. 171-181.
[Zhang and Hebert, 1999] D. Zhang and M. Hebert, "Harmonic Maps and Their Applications in Surface Matching," In the Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1999.


[Zhang and Chen, 2001] C. Zhang and T. Chen, "Efficient Feature Extraction for 2D/3D Objects in Mesh Representation," In the Proceedings of the IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001.

[Zhang, 2002] D. S. Zhang, "Image retrieval based on shape," Ph.D. Thesis, Monash University, Australia, March 2002.

[Zhang and Lu, 2002] D. S. Zhang and G. Lu, "Generic Fourier descriptor for shape-based image retrieval," In the Proceedings of the IEEE International Conference on Multimedia and Expo, Lausanne, Switzerland, August 2002, Volume 1, pp. 425-428.

[Zhang and Lu, 2004] D. Zhang and G. Lu, "Review of shape representation and description techniques," Pattern Recognition, January 2004, Volume 37, Issue 1, pp. 1-19.


VITA

Sreenivas Rangan Sukumar was born in Chennai, India, on the 16th of May, 1981. He graduated at the top of his department with a Bachelor's degree in Electronics and Communication Engineering from the University of Madras, India, in 2002. Programming experience at the National Institute of Information Technology, India, paved his way into Pentamedia Graphics Limited, India, where he was part of the research team that implemented a data compression framework for archiving multimedia on the web. That work kindled his interest in image processing and information theory and led him to pursue his Master's degree at the Imaging, Robotics and Intelligent Systems Lab at the University of Tennessee, Knoxville, U.S.A. He intends to continue his academic career with a Ph.D. before contributing to society. He spends his leisure time listening to Carnatic music and wishes for more time with the "veena" (a South Indian musical instrument).
