R&D Activities

  • High-fidelity MOtion CAPture (MOCAP) of humans from commodity cameras.
    We envision and develop methods and technical systems that permit low-cost, unobtrusive, and compact quantification of human motion in 3D, as well as semantic interpretation of human activities, based on markerless visual input. We work on high-fidelity MOCAP from commodity cameras, a task that requires estimating the 3D state of one or more humans.
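    To illustrate the geometry underlying multi-camera markerless MOCAP, the following minimal sketch triangulates the 3D position of a single body joint from its 2D detections in two calibrated views, using linear (DLT) triangulation. The cameras, point, and names are synthetic and illustrative; this is not our actual pipeline, which involves many views, detection, and robust estimation.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X through a 3x4 camera matrix P to pixel coordinates."""
    xh = P @ np.append(X, 1.0)
    return xh[:2] / xh[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point (e.g., a body joint)
    observed at pixels x1, x2 in two calibrated cameras P1, P2."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector of A
    # associated with the smallest singular value.
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]

# Two hypothetical cameras one unit apart, viewing a joint at (0, 0, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), [[-1.0], [0.0], [0.0]]])
X_true = np.array([0.0, 0.0, 5.0])
x1, x2 = project(P1, X_true), project(P2, X_true)
```

    Here `triangulate(P1, P2, x1, x2)` recovers `X_true` up to numerical precision; with noisy detections, more views and a robust solver are used in practice.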
  • 3D modelling of human bodies, hands, faces and skulls.
    The bodies, hands, faces, and skulls of humans exhibit great variability in size, shape, and appearance, with significant differences across individuals and demographic groups. We are developing novel methodologies for constructing next-generation, large-scale datasets and models of the shape and appearance of human hands, faces, and skulls, aiming to achieve unprecedented realism and to express the full variability encountered in the human population.
  • Human activity recognition and forecasting.
    Capitalizing on accurate and robust information about human geometry and motion, as well as rich contextual information, we aim to develop methods for multiple aspects of the analysis and interpretation of human-centric activities. Our research efforts pertain to human activity recognition, interpretation of human-object interaction, identification of repetitive actions in videos, action co-segmentation in videos, action quality assessment, prediction of human intentions, and forecasting of future actions.
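    As a concrete illustration of one of the listed tasks, identifying repetitive actions, a classical baseline estimates the dominant period of a 1D motion signal (e.g., a tracked joint coordinate over time) from its autocorrelation, and derives a repetition count from it. The sketch below applies this simplified baseline to a synthetic signal; it is illustrative, not our method.

```python
import numpy as np

def dominant_period(signal, min_lag=2):
    """Estimate the period (in frames) of a repetitive 1D motion signal
    as the first strong local maximum of its autocorrelation."""
    x = signal - signal.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags 0 .. N-1
    ac /= ac[0]                                        # normalize: ac[0] == 1
    for lag in range(min_lag, len(ac) - 1):
        if ac[lag] >= ac[lag - 1] and ac[lag] >= ac[lag + 1] and ac[lag] > 0.5:
            return lag
    return None

# Synthetic "exercise" signal repeating every 30 frames over 300 frames.
t = np.arange(300)
signal = np.sin(2 * np.pi * t / 30)
period = dominant_period(signal)
count = len(t) // period        # approximate number of repetitions
```

    On this clean signal the estimated period is 30 frames, giving 10 repetitions; real videos additionally require tracking, smoothing, and handling of non-stationary periods.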
  • Gesture and facial expression recognition, sign language translation. 
    Building upon our expertise in vision-based perception of humans at the geometry, motion, and semantic levels, we are developing novel systems for gesture recognition. We are also applying our state-of-the-art 3D face modelling and reconstruction methods to the analysis of facial expressions from videos. Facial expression analysis is one of the most important non-intrusive ways to infer human emotions and plays a central role in the field of Affective Computing. All of the above are essential components of systems that aim to interpret sign language.
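    A classical baseline for recognizing isolated gestures from tracked trajectories is nearest-template matching under dynamic time warping (DTW), which tolerates variation in execution speed. The sketch below illustrates the idea on hypothetical 1D templates; it is a simplified example, not our system.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1D trajectories
    (e.g., a hand coordinate over time), tolerant to speed variation."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify(query, templates):
    """Return the label of the nearest template gesture under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))

# Hypothetical gesture templates: "wave" oscillates, "raise" ramps up.
templates = {
    "wave": np.sin(np.linspace(0, 4 * np.pi, 40)),
    "raise": np.linspace(0, 1, 40),
}
# A faster wave (30 frames instead of 40) should still match "wave".
query = np.sin(np.linspace(0, 4 * np.pi, 30))
```

    Here `classify(query, templates)` returns `"wave"` despite the different duration; practical systems operate on multi-dimensional hand and body trajectories with learned features.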
  • Vision for autonomous systems that interact and collaborate with humans.
    We develop algorithms that enable autonomous systems, such as robots, to operate in diverse environments and to carry out tasks in collaboration with humans. We tackle scenarios where robots and humans interact with each other or with one or more objects that vary in complexity (rigid, articulated, or deformable objects; objects of various sizes; previously unseen objects).
  • Vision-based coastal seafloor mapping.
    We are developing systems for automated shallow-water bathymetry retrieval and detailed mapping of coastal benthic cover from optical imagery acquired by drones (Unmanned Aerial Vehicles). We apply state-of-the-art image analysis and machine learning techniques, including neural networks, to train systems for automated, high-resolution 3D reconstruction of the coastal seafloor.
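    A widely used starting point for shallow bathymetry from optical imagery is the log band-ratio model of Stumpf et al., which exploits the fact that green light attenuates faster with depth than blue light and calibrates a linear mapping from the log-ratio of the two bands to known depths. The sketch below demonstrates the idea on synthetic reflectances; the attenuation coefficients and all names are illustrative assumptions, not our actual processing chain.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic scene: true depths and water-attenuated blue/green reflectances.
depth_true = rng.uniform(0.5, 10.0, size=200)   # metres
blue = 0.3 * np.exp(-0.08 * depth_true)         # blue attenuates slowly
green = 0.3 * np.exp(-0.12 * depth_true)        # green attenuates faster

# Band-ratio predictor: ln(n*B) / ln(n*G), with n a fixed scaler
# chosen so that both logarithms stay positive.
n = 1000.0
ratio = np.log(n * blue) / np.log(n * green)

# Calibrate the linear model depth = m1 * ratio + m0 against known depths
# (in practice obtained from sonar soundings or lidar).
m1, m0 = np.polyfit(ratio, depth_true, 1)
depth_pred = m1 * ratio + m0
```

    The fitted slope is positive (the ratio grows with depth), and on this idealized data the calibrated model predicts depth to within a fraction of a metre; real imagery additionally requires sun-glint removal, atmospheric correction, and handling of variable bottom types.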