Invited Speakers

Prof. Alberto Del Bimbo, University of Florence, Italy

About faces and 3D data

Abstract

Human target recognition has been an active area of research in recent years, with major emphasis on automatic detection and matching of faces in still images and videos for the purposes of verification and identification. The performance of 2D face matching systems depends on their insensitivity to critical factors such as facial expressions, makeup and aging, but hinges mainly upon extrinsic factors such as illumination differences, camera viewpoint and scene geometry. However, the inherent limitations of 2D face matching have supported the belief that effective recognition of identity should be obtained through multi-biometric technologies. In particular, exploiting the geometry of the anatomical structure of the face rather than its appearance, and therefore developing solutions for 3D facial information modeling and matching, has become a growing field of research. We will discuss solutions for modeling 3D facial information for the purpose of retrieval, along with critical issues and perspectives.

 

About the speaker

Alberto Del Bimbo is Full Professor of Computer Engineering at the University of Florence, Italy. He was the Director of the Department of Sistemi e Informatica from 1997 to 2000 and the Deputy Rector for Research and Innovation Transfer of the University of Florence from 2000 to 2006. He is currently the President of the Foundation for Research and Innovation, which was established by the University of Florence together with the local institutions of the Metropolitan Area. He is also the Director of the Media Integration and Communication Center of Excellence and the Director of the Master in Multimedia of the University of Florence.

His scientific interests are Pattern Recognition, Image and Video Analysis, Multimedia Information Retrieval and Natural Human Computer Interaction. He has authored over 250 publications in some of the most distinguished scientific journals and international conferences. From 1996 to 2000, he was the President of the IAPR Italian Chapter and, from 1998 to 2000, Member at Large of the IEEE Publication Board. He was the General Chair of IAPR ICIAP'97, the International Conference on Image Analysis and Processing; IEEE ICMCS'99, the International Conference on Multimedia Computing and Systems; AVIVDiLib'05, the International Workshop on Audio-Visual Content and Information Visualization; VMDL07, the International Workshop on Visual and Multimedia Digital Libraries; and IEEE ISM2008, the International Symposium on Multimedia. He was also Program Co-Chair of ACM Multimedia 2008. He is the General Co-Chair of ACM Multimedia 2010 and of ECCV 2012, the European Conference on Computer Vision.

He is an IAPR Fellow and Associate Editor of Multimedia Tools and Applications, Pattern Analysis and Applications, Journal of Visual Languages and Computing, and International Journal of Image and Video Processing, and was Associate Editor of Pattern Recognition, IEEE Transactions on Multimedia, and IEEE Transactions on Pattern Analysis and Machine Intelligence.

 

Prof. David Taubman, The University of New South Wales, Sydney, Australia

Efficient Interactive Access to Large Volume Media, Keynote presentation

Abstract

As image and video frame sizes expand, the need for efficient interactive access to multimedia content is becoming increasingly pronounced. Video sensors with Giga-pixel spatial resolution at multiple frames per second are already under development for aerial surveillance applications; consumer devices already provide content with much more resolution than most users can conveniently email or push to a photo server; moreover, we can expect higher dimensional media to proliferate in the future, e.g., 3D scenes and free view-point video. In this talk, the speaker will present some recent and emerging technologies for compression and communication of multimedia content, which emphasise interactive accessibility.

The speaker will begin by providing an overview and demonstration of the features found in the JPEG2000 and JPIP standards which facilitate highly efficient interactive browsing of large images and intra-frame coded video. The talk will then consider the benefits of intelligent client/server interaction in interactive applications, exploring opportunities which present themselves for a media server to perform on-line rate-distortion optimisation in an interactive context. The speaker will introduce some recent approaches to the interactive communication of video and 3D scenes, which rely upon such intelligent client/server interaction. Finally, the talk will focus on the interactive communication of metadata, including textual and region of interest metadata. In this last part of the talk, the speaker will demonstrate existing tools which can be used to interact with remotely hosted media content via metadata and imagery cues in a tightly coupled manner. He will also show how metadata can be used to improve the efficiency with which compressed media content is transferred by an intelligent server.

 

About the speaker

Professor Taubman is with the School of Electrical Engineering and Telecommunications at the University of New South Wales, where he heads the Telecommunications Research Group. Before joining UNSW at the end of 1998, he spent 4 years at Hewlett-Packard's research laboratories in Palo Alto, California. He received the B.S. and B.E. (Electrical) degrees in 1986 and 1988 from the University of Sydney, Australia, and the M.S. and Ph.D. degrees in 1992 and 1994 from the University of California at Berkeley. He contributed extensively to the JPEG2000 standard for image compression and the JPIP standard for interactive image communication. He is the author, with Michael Marcellin, of the book "JPEG2000: Image compression fundamentals, standards and practice" and the author of the popular Kakadu software for JPEG2000 developers. He is the recipient of two IEEE Best Paper awards: for the 1996 paper "A Common Framework for Rate and Distortion Based Scaling of Highly Scalable Compressed Video" and for the 2000 paper "High Performance Scalable Image Compression with EBCOT". He is also co-author, with J. Thie, of a paper that received a 2004 ICIP best student paper award for work on hybrid ARQ with LR-PET, and was a featured Plenary speaker at ICIP 2006. Professor Taubman's research interests include scalable image and video compression, robust and efficient communication of multimedia content, perceptual modelling and statistical inverse problems.

Tutorial presentations

Prof. Marco Tagliasacchi, Politecnico di Milano, Italy

Compressive Sensing: basic principles and applications in image and video processing, Tutorial presentation

Abstract

Compressive sensing is a recent paradigm that enables the reconstruction of a discrete signal from a limited number of non-adaptive measurements, provided that the underlying signal has a sparse representation in some basis or dictionary expansion. Compressive sensing enables sub-Nyquist sampling, i.e., the number of measurements can be significantly smaller than the original number of samples.
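As a rough illustration of the idea, the following sketch recovers a sparse signal from fewer measurements than samples. It is not taken from the tutorial: the choice of a random Gaussian measurement matrix, the use of orthogonal matching pursuit as the reconstruction algorithm, and all function names (`omp`, `measure`, `random_matrix`, `solve`, `dot`) are assumptions made for this example. Recovery by this greedy method is likely but not guaranteed for the sizes used in the demo.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(G, b):
    """Solve the small square system G c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(G, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (M[i][n] - sum(M[i][j] * c[j] for j in range(i + 1, n))) / M[i][i]
    return c


def random_matrix(m, n, seed=0):
    """Return the n columns (each of length m) of a random Gaussian measurement matrix."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]


def measure(columns, x):
    """Compute the m non-adaptive measurements y = A x, with A given by its columns."""
    m = len(columns[0])
    return [sum(columns[j][i] * x[j] for j in range(len(x))) for i in range(m)]


def omp(columns, y, k):
    """Orthogonal matching pursuit: greedily estimate a k-sparse x from y = A x."""
    n = len(columns)
    norms = [dot(c, c) ** 0.5 for c in columns]
    residual = list(y)
    support, coeffs = [], []
    for _ in range(k):
        # Pick the unused column most correlated with the current residual.
        candidates = [j for j in range(n) if j not in support]
        best = max(candidates, key=lambda j: abs(dot(columns[j], residual)) / norms[j])
        support.append(best)
        # Least-squares fit on the selected columns via the normal equations.
        chosen = [columns[j] for j in support]
        G = [[dot(u, v) for v in chosen] for u in chosen]
        b = [dot(u, y) for u in chosen]
        coeffs = solve(G, b)
        residual = [yi - sum(c * col[i] for c, col in zip(coeffs, chosen))
                    for i, yi in enumerate(y)]
    x = [0.0] * n
    for j, c in zip(support, coeffs):
        x[j] = c
    return x


if __name__ == "__main__":
    n, m, k = 64, 24, 3  # 64 samples, 24 measurements, 3 non-zero entries
    cols = random_matrix(m, n, seed=0)
    x = [0.0] * n
    for j in random.Random(1).sample(range(n), k):
        x[j] = 1.0
    x_hat = omp(cols, measure(cols, x), k)
    print("true support:     ", sorted(j for j, v in enumerate(x) if v != 0.0))
    print("recovered support:", sorted(j for j, v in enumerate(x_hat) if v != 0.0))
```

Here the signal is sparse in the canonical basis; for a signal that is sparse in another basis or dictionary, the columns passed to `omp` would be those of the measurement matrix multiplied by that dictionary.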

This tutorial is intended to be an introduction to the basic principles of compressive sensing, providing the theoretical foundations and the geometrical interpretation of the reconstruction problem, and explaining the requirements that need to be fulfilled in order to guarantee signal recovery. The most relevant reconstruction algorithms will be reviewed, including the case of noisy measurements.

Compressive sensing has been applied in many fields, including medical imaging, communications and data compression, to name a few. This tutorial will focus on recent applications of compressive sensing in image and video analysis and processing.

 

About the speaker

Marco Tagliasacchi, born in 1978, received the "Laurea" degree (2002, cum laude) in Computer Engineering and the Ph.D. in Electrical Engineering and Computer Science (2006) from the Politecnico di Milano, Italy. He is currently Assistant Professor at the "Dipartimento di Elettronica e Informazione - Politecnico di Milano". In 2004 he worked at the University of California - Berkeley as a visiting scholar. Marco Tagliasacchi has authored more than 50 scientific papers in international journals and conferences and has actively participated in several EU-funded research projects on audiovisual media technologies. His research interests span the range of multimedia signal processing, including image and video processing (video quality assessment, tampering identification, distributed video coding, scalable video coding) and audio processing (source localization and tracking, audio classification, environment-aware audio processing).

 

Prof. Nicu Sebe, University of Trento, Italy

Human-centered computing: Challenges and Perspectives

Abstract

This tutorial will address the problem of sensing and understanding users' interactive actions and intentions for achieving multimodal human-computer interaction in natural settings. A critical issue here is that the human face and body exhibit complex and rich dynamic behavior that is non-linear, time-varying, and context-dependent (person, task, mood/affect dependent). Thus, the main focus will be on multimodal human-computer interaction models derived from multi-sensory observations.

The presentation will focus on the analysis of the user's behavior (e.g., facial expressions, body and head pose, eye tracking) in their personal environment (e.g., home or office), as well as on bimodal emotion recognition from facial expressions and audio information. Another important aspect is the analysis of multimedia information retrieval techniques for extracting affective information from multimedia data (e.g., movies).

 

About the speaker

Nicu Sebe is an Associate Professor in the Department of Information Engineering and Computer Science, University of Trento, where he leads research in the areas of multimedia information retrieval and human-computer interaction in computer vision applications.

Until spring 2009, he was with the University of Amsterdam, The Netherlands. He is the author of two monographs and has been involved in the organization of the major conferences and workshops addressing the computer vision and human-centered aspects of multimedia information retrieval, including serving as a General Co-Chair of the IEEE Automatic Face and Gesture Recognition Conference, FG 2008, and of the ACM International Conference on Image and Video Retrieval (CIVR) 2007, and as one of the initiators and a Program Co-Chair of the Human-Centered Multimedia track of the ACM Multimedia 2007 conference. He is the general chair of WIAMIS 2009 and ACM CIVR 2010, a program coordinator of ICMR 2011, and a track chair of WWW 2009 and ICPR 2010. He has served as a guest editor for several special issues in IEEE Computer, Computer Vision and Image Understanding, Image and Vision Computing, Multimedia Systems, ACM TOMCCAP, and IEEE Transactions on Multimedia. He has been a visiting professor at the Beckman Institute, University of Illinois at Urbana-Champaign, and in the Electrical Engineering Department, Darmstadt University of Technology, Germany. He was the recipient of a British Telecom Fellowship. He is the co-chair of the IEEE Computer Society Task Force on Human-centered Computing and is an associate editor of IEEE Transactions on Multimedia, Machine Vision and Applications, Image and Vision Computing, Electronic Imaging, and Journal of Multimedia.

Overview Talk - Telecom Italia

Giovanni Cordara, Telecom Italia, Italy

Visual Search: A Telecom Operator Perspective

Abstract

In recent years, digital imagery has expanded its horizon in many directions, with a resulting explosion in the volume of data acquired by different devices and made available through heterogeneous networks. Such visual information needs to be efficiently searched, accessed and distributed. As a consequence, much research is being done on content-based image/video retrieval and object recognition. A telecom operator, providing different multimedia services over heterogeneous distribution channels, can benefit from the adoption of innovative visual search solutions.

This talk will present an overview of the techniques that are deemed most promising for use in real services. Adopting visual search technology in such services means coping with huge amounts of data, real-time requirements and efficiency constraints: some of the existing techniques provide robust solutions to these problems, whereas other topics are still open to research. The presentation will focus on the description of available architectural and algorithmic solutions and on the identification of challenges still to be addressed to pave the way for next-generation services.

 

About the speaker

Giovanni Cordara is a senior researcher at Telecom Italia Lab, the corporate research centre of Telecom Italia.

In recent years, his research activity has concentrated on the image/video processing field, with special emphasis on video coding and content analysis. His current research interests relate to the theoretical and application-oriented aspects of content indexing and retrieval: he is involved in the design of efficient and scalable solutions for visual search and in the study of innovative techniques for object recognition/classification.

He has been involved in several projects supported by the Italian government and the European Commission, and he coordinates technical collaborations with universities for carrying out joint research activities. He is active in MPEG specification work, acting as the Head of Delegation of the Italian National Body.