Session Chair: Yannis Pavlidis (University of Houston)
Visual Analytics for Shrinking Checkout Shrink
Speaker: Sharathchandra Pankanti
Affiliation: Manager of Exploratory Computer Vision, IBM T. J. Watson Research Center
Abstract: The enormous magnitude of retail shrink and the relentless pursuit of competitive performance are the two factors driving retailers to focus increasingly on reducing losses from retail fraud and operational errors. A large proportion of retail shrink has been found to occur in and around checkout operations. We will describe our visual compliance-based approach to four categories of checkout shrink: empty cart, non-scan, ticket switch, and cashier fraud. Checkout shrink may occur because the shopper, intentionally or unintentionally, does not empty the shopping cart or basket at the checkout. The point-of-sale (POS) device may not have registered all items apparently “presented” for scanning; the resulting non-scan is another source of shrink. The barcode label on a shopping item may have been tampered with (typically by the shopper) so that the price registered at the POS is lower than the correct item price; the resulting shrink is referred to as ticket (barcode) switching. Finally, the cashier may intentionally manipulate transactions at the cash register to under-report the total sale revenue and steal the difference. We will present experimental results and observations based on real data from multiple stores over several years.
Brief Biography: Sharath Pankanti is manager of the Exploratory Computer Vision Group at the IBM T. J. Watson Research Center. He leads a number of safety-, productivity-, and security-focused projects involving biometrics, multi-sensor surveillance, and driver-assistance technologies that entail object/event modeling, detection, and recognition from information provided by static and moving sensors and cameras. Many of these works have been integrated into systems that have been rigorously evaluated in real-world applications. His research interests include performance metrics and evaluation, and computer vision system design for effective privacy, safety, security, productivity, and convenience. He has published about 70 papers in peer-reviewed conference/workshop proceedings and journals and has contributed to 20 inventions spanning biometrics, object detection, and recognition. He co-edited the first comprehensive book on biometrics, “Biometrics: Personal Identification” (Kluwer, 1999), and co-authored “A Guide to Biometrics” (Springer, 2004), which is used in many undergraduate and graduate biometrics curricula.
GE's Computer Vision: from Prisons to Healthcare
Speaker: Peter Tu
Affiliation: Manager, GE Global Research Center
Abstract: This talk will cover GE's ongoing intelligent-video research agenda. Various aspects of our site-wide tracking capabilities, as showcased at the DHS STIDP trials, will be discussed, including detection of people from both static and moving platforms. We then consider the blurring of the lines between computer vision and artificial intelligence as we present methods by which track-based behaviors of interest to the user community can be detected automatically. Going from tracking to articulated action, we show how advances in action analysis can now be deployed in a variety of real-world contexts. We next present various methods for biometrics at a distance, including face recognition and reacquisition based on appearance. Our work on facial analysis will then be presented, which includes facial alignment, facial recognition, facial expression analysis, and face reconstruction from skeletal remains. A brief review of our aerial analysis methods and medical imaging work will also be given. We will conclude with thoughts on the steps needed to achieve the critical mass necessary to deploy robust systems that fail gracefully when confronted with unforeseen circumstances.
Brief Biography: In 1990 Dr. Tu joined Sony Research in Tokyo, Japan, where he developed a number of computer vision algorithms for man-machine interfaces. In 1995 he received a Ph.D. from the Engineering Science Department of Oxford University, where his research was devoted to computer vision methods for the automatic analysis of seismic imagery. In 1997 Dr. Tu became a senior research scientist at General Electric’s Global Research Center. In partnership with Lockheed Martin, he has developed a set of latent-fingerprint matching algorithms for the FBI Automated Fingerprint Identification System (AFIS). Dr. Tu has also developed optical methods for the precise measurement of 3D parts in a manufacturing setting. He is the principal investigator for the FBI ReFace project, which is tasked with developing an automatic system for face reconstruction from skeletal remains. In 2006, he was the principal investigator for the National Institute of Justice’s 3D Face Enhancer program, which focused on improving face recognition from poor-quality surveillance video. In 2008, Dr. Tu led the GE video analytics team that participated in the DHS STIDP demonstration program, whose goal is to establish an effective defense against suicide-bomber attacks. He currently leads a group of 15 researchers in multi-view video analysis, with the aim of achieving reliable behavior recognition in complex environments. He has helped develop a large number of analytic capabilities, including person detection from fixed and moving platforms, crowd segmentation, multi-view tracking, person reacquisition, face modeling, facial expression analysis, face recognition at a distance, face verification from photo IDs, and articulated motion analysis. Dr. Tu has over 25 peer-reviewed publications and has filed more than 20 U.S. patents.
Visual Cognition, Semantic & Quantitative Imaging
Speaker: Visvanathan Ramesh
Affiliation: Global Technology Field Leader, Siemens Corporation, Princeton, NJ
Abstract: Affordable sensing, increased resolution, and growing diversity in sensing modalities are causing an “information overload”. Managing this overload requires human-machine systems that can automatically summarize information content, both for real-time decision making and for offline search in advanced forensics. This trend impacts all of Siemens’ Industry, Energy, and Healthcare sectors. In Healthcare, such systems will help improve diagnostics and quality of life. In Industry, they will improve public safety and security through advanced video surveillance systems. In Energy, they will assist in improving turbine efficiency and reducing maintenance costs. This presentation will highlight how we leverage the latest advances in computer vision, machine learning, and artificial intelligence to deliver systems that perform visual cognition, semantic understanding, and quantitative imaging. We will conclude with our view of the open research challenges.
Brief Biography: Dr. Visvanathan Ramesh heads the Real-Time Vision and Industrial Imaging global technology field at Siemens Corporate Research (SCR). His research has produced real-time vision systems for video surveillance and monitoring, augmented reality, and computer vision software systems, as well as statistical modeling techniques for computer vision algorithm design, analysis, and performance characterization. He has extensive experience in the design, analysis, implementation, and performance evaluation of real-time vision systems. Dr. Ramesh was co-recipient of the best paper award for his work on real-time tracking at the IEEE Conference on Computer Vision and Pattern Recognition (2000), and most recently co-received the IEEE Longuet-Higgins Award (2010), which recognizes fundamental contributions to computer vision, for the same paper. He was also awarded the Siemens Inventor of the Year award in 2008 for his and his team’s outstanding contributions to Siemens in the area of real-time vision. He has served as a member of the DARPA Image Understanding Environment committee (1991-95) and on various IEEE conference committees for computer vision and video surveillance.
15:30-16:00 Coffee Break
Session Chair: Nikos Paragios (Ecole Centrale de Paris)
Applications of Computer Vision in Airborne Surveillance
Speaker: Michael E. Bazakos
Affiliation: Lockheed Martin Maritime Systems and Sensors (MS2), Aviation Systems, Eagan MN
Abstract: The current state-of-the-art functionality of deployed large airborne surveillance platforms depends heavily on manpower, both on board the aircraft and on the ground. Future systems will need increasingly automated functionality in their mission-execution and sensor-exploitation systems. This talk will address the related needs of industrial applications and some of the research and advances toward these goals, which are being pursued jointly by industry in collaboration with academia under the National Science Foundation (NSF) Industry University Cooperative Research Centers (IUCRC) initiative.
Brief Biography: Michael E. Bazakos received his B.S. in Mathematics from the National University of Athens, Greece, and an M.S. in Applied Mathematics from the University of Minnesota. He is currently the Technology Growth Lead for Manned Surveillance (fixed-wing) Platforms for ISR at Lockheed Martin (LM) Aviation Systems. His responsibilities include IRAD definition and management for airborne surveillance, collaboration with other LM research centers and universities, and technology transition. Prior to this, Mike worked at Honeywell’s Advanced Technology Laboratories, where he rose through the engineering ranks to Section Chief of Technology and finally Senior Research Fellow. His main work and interests are in computer vision for real-world applications, including automated surveillance, high-end security systems, and automatic target recognition (ATR). In 2003 he received Honeywell’s highest technical achievement award for his innovative work under the Secure Gate Access Technology Evaluation (S-GATE) program, funded by NRL; S-GATE was a system that detected the driver of a moving car and matched the driver against a database. He has published over 50 papers and scientific reports, has co-chaired six national/international/government/industry conferences, has been awarded 10 patents, and has an additional 17 pending.
From PostScript to face detectors: How computer vision is transforming Adobe
Speaker: Lubomir Bourdev
Affiliation: Adobe Systems, Inc. and U.C. Berkeley
Abstract: Ten years ago it was hard to imagine that computer vision would have much in common with the makers of Acrobat, Photoshop, and Flash. The set of practical applications was limited, computers were slow, and there was little need for, or understanding of, what vision could offer. Today computer vision forms the basis of numerous features in Photoshop, Premiere, After Effects, Acrobat, Photoshop Elements, Lightroom, and a constantly growing list of Adobe products. I will describe how advances in the field have impacted our products, as well as the expectations of our customers, designers, and engineers.
Brief Biography: Lubomir Bourdev is a Senior Research Scientist at the Creative Technologies Lab at Adobe. Over the past twelve years in Adobe's R&D organization, his research has resulted in seven major features in Illustrator, Photoshop Elements, Acrobat and InDesign. Mr. Bourdev was the lead architect of the Face Tagging feature in Photoshop Elements 4, the People Recognition module in Elements 8 and several non-vision related projects. His research interests include computer vision, computer graphics and machine learning. He has 27 issued and 15 pending patents.
Mr. Bourdev received his Bachelor's and Master's degrees in Computer Graphics from Brown University in 1998 and is currently pursuing his Ph.D. in Computer Vision part-time at U.C. Berkeley.
Information capacity: a measure of potential image quality of a digital camera
Speaker: Frederic Guichard
Affiliation: Chief Scientist & CTO, DxO Labs
Abstract: Shannon’s notion of information revolutionized electrical telecommunications. He defined the entropy of a random signal and proved that the amount of information carried through a communication channel can be precisely quantified. Considering that a camera (composed of a lens and a sensor array) can be understood as an information channel, we seek a similar characterization for an optical imaging system, describing the potential of the camera to produce good images. This means comparing defects of different natures, such as pixel and optical flaws, and taking into account that digital processing can correct some of the defects (but never recover lost information). We show that our definition of the information capacity of a camera naturally accounts for all the main defects that can be observed in digital images and boils down to a relation between the pixel characteristics and the lens performance, which allows a new understanding of how information is distributed through a camera. We show that it can be used in practice for camera ranking, for camera system (pixel, optics, digital processing) design and optimization, and for understanding upcoming imaging challenges.
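As background to the analogy in the abstract (this formula is classical Shannon theory, not the camera-specific measure defined in the talk), the capacity of a band-limited channel with additive white Gaussian noise is:

\[
% Shannon capacity of a band-limited Gaussian channel:
% B is the channel bandwidth, S/N the signal-to-noise ratio.
C \;=\; B \,\log_2\!\left(1 + \frac{S}{N}\right) \quad \text{[bits/s]}
\]

A camera analogue would presumably replace bandwidth with the system's spatial-frequency response (lens MTF and pixel sampling) and S/N with the sensor's photon and read-noise characteristics; the precise definition is the subject of the talk.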
Brief Biography: Dr. Frederic Guichard graduated in mathematics from the École Normale Supérieure in Paris and is also a member of the "Corps des Ponts & Chaussées". He defended his Ph.D., in applied mathematics and image processing, at the University of Paris IX. He is currently Chief Scientist & CTO at DxO Labs, which he co-founded in 2003. Prior to that, he held several academic and industrial research positions. His research interests include digital photography, image processing, computer vision, and applied mathematics. He has published over a hundred papers in international journals and conferences, as well as patents. During his career he has received several awards, including the "Science and Defence" award in 1996 and a Grand Prize of the French Academy of Sciences in 2006.
17:30-18:00 Panel Discussion