We present the first generative adversarial network (GAN) for natural image matting. Our novel generator network is trained to predict visually appealing alphas, with the addition of an adversarial loss from a discriminator that is trained to classify well-composited images. Further, we improve existing encoder-decoder architectures to better deal with the spatial localization issues inherent in convolutional neural networks (CNNs) by using dilated convolutions to capture global context information without downscaling feature maps and losing spatial information. We present state-of-the-art results on the alphamatting online benchmark for the gradient error and give comparable results in others. Our method is particularly well suited for fine structures like hair, which is of great importance in practical matting applications, e.g. in film/TV production.
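The point about dilated convolutions can be illustrated with a minimal pure-Python sketch. It is not the network described above: the function names, the 1D setting and the zero padding are assumptions chosen for brevity. It only shows that a stride-1 dilated convolution keeps the output the same length as the input while the receptive field grows with the dilation rate.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded ('same') 1D dilated convolution (cross-correlation, as in CNNs)."""
    k = len(kernel)
    # Assumption of this sketch: an odd kernel, so the output stays centred.
    assert k % 2 == 1, "this sketch assumes an odd kernel size"
    half = (k // 2) * dilation
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(k):
            # Kernel taps are spaced `dilation` samples apart.
            acc += kernel[j] * padded[i + j * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


signal = [1, 2, 3, 4, 5]
smoothed = dilated_conv1d(signal, [1, 1, 1], dilation=2)
```

Here `smoothed` has the same length as `signal`, and `receptive_field(3, [1, 2, 4, 8])` shows how four stacked layers with growing dilation cover 31 input samples without any downscaling.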
We present a scalable pipeline for Free-Viewpoint Video (FVV) content creation, considering also visualisation in Augmented Reality (AR) and Virtual Reality (VR). We support a range of scenarios where there may be a limited number of handheld consumer cameras, but also demonstrate how our method can be applied in professional multi-camera setups. Our novel pipeline extends many state-of-the-art techniques (such as structure-from-motion, shape-from-silhouette and multi-view stereo) and incorporates bio-mechanical constraints through 3D skeletal information as well as efficient camera pose estimation algorithms. We introduce multi-source shape-from-silhouette (MS-SfS) combined with fusion of different geometry data as crucial components for accurate reconstruction in sparse camera settings. Our approach is highly flexible and our results indicate suitability either for affordable content creation for VR/AR or for interactive FVV visualisation where a user can choose an arbitrary viewpoint or sweep between known views using view synthesis.
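For readers unfamiliar with shape-from-silhouette, the following is a minimal sketch of the classic single-source voxel carving idea, not the multi-source variant (MS-SfS) introduced above. The toy orthographic cameras, the grid and all function names are hypothetical and serve only to illustrate the principle: a voxel is kept only if it projects into the foreground silhouette of every view.

```python
def carve_visual_hull(voxels, views):
    """Keep voxels whose projection falls inside every silhouette.

    voxels: iterable of (x, y, z) integer voxel centres.
    views:  list of (project, silhouette) pairs, where project maps a voxel
            to a 2D pixel and silhouette is a set of foreground pixels.
    """
    hull = []
    for v in voxels:
        if all(project(v) in silhouette for project, silhouette in views):
            hull.append(v)
    return hull


# Hypothetical toy setup: three orthographic cameras looking along the axes.
def along_z(v):
    return (v[0], v[1])


def along_y(v):
    return (v[0], v[2])


def along_x(v):
    return (v[1], v[2])


grid = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
square = {(a, b) for a in (1, 2) for b in (1, 2)}  # same silhouette in each view
views = [(along_z, square), (along_y, square), (along_x, square)]
hull = carve_visual_hull(grid, views)
```

With few views the carved hull overestimates the true shape, which is one reason sparse camera settings benefit from fusing additional geometry data.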
Project Reference Link: https://v-sense.scss.tcd.ie/?p=2024
The Long Room holds a very special place in the hearts of students, staff and alumni of Trinity College Dublin, and of the wider Dublin community. There is no substitute for the genuine experience of being immersed in the great space within the Long Room. However, we were inspired to create an anecdotal visitor narrative and concept that would augment the visitor experience. The user is guided through the Long Room by a friendly vologram, a volumetric 3D video representation of the great Jonathan Swift. The innovative Virtual Reality (VR) and Augmented Reality (AR) prototypes we created enable an interactive narrative in which a visitor can take an augmented tour of the Long Room using this interactive digital media. They highlight the following distinctive features:
The interactive Virtual Reality (VR) prototype visualizes the Long Room's content and history, immersing the user in a world through VR simulation. The user can be located anywhere in the world and, by putting on a Head Mounted Display (HMD), enter the magical virtual simulation of the Long Room. The user can explore this on PC (browser), iPad, iPhone or Android phone/tablet. The Augmented Reality (AR) version, developed in the second phase, takes place within the actual Library environment using a HoloLens or a handheld device. It uses only dynamic content and does not ‘virtually’ reconstruct the internal spaces of the Long Room. This project is an example of our creative experiments, which showcase our original technologies in real creative productions. In this case we demonstrate our novel tools and pipeline for AR/VR content creation.
Since the early years of the twenty-first century, the performing arts have been party to an increasing number of digital media projects that bring renewed attention to questions about, on one hand, new working processes involving capture and distribution techniques, and on the other hand, how particular works, with bespoke hardware and software, can exert an efficacy over how work is created by the artist/producer or received by the audience. The evolution of author/audience criteria demands that digital arts practice modify aesthetic and storytelling strategies towards types that are more appropriate to communicating ideas over interactive digital networks, wherein AR/VR technologies are rapidly becoming the dominant interface. This project explores these redefined criteria through a reimagining of Samuel Beckett's Play (1963) for digital culture. This paper offers an account of the working processes, the aesthetic and technical considerations that guide artistic decisions, and how we attempt to place the overall work in the state of the art.
Together with Prof. Tyyne Claudia Pollmann of the “Kunsthochschule Berlin Weißensee”, we implemented a person-tracking project. She had won the call for an exhibit within the new Charité building “Charité Cross Over”. The team around Ralf Reulke was charged with the development of the system. A high-quality 2-megapixel camera was used with a 4-core computing unit and a very fast graphics card. The tracking software from a former project was adapted so that heavy image computations run on the graphics processing unit (GPU). The results showed a high processing rate of 12 to 13 frames per second.
The aim of the project was to develop an optical system for monitoring functions based on the analysis of three-dimensional point clouds and derived quantities. The application is rail-based public transport, where the entire passenger compartment is to be recorded and analyzed. Methods for data fusion, noise removal and pattern recognition are to be developed. To create the data basis, two different camera systems are used: a stereo camera (Hella People Counter) and an RGB-D camera (Microsoft Kinect). People are modeled as 3D ellipsoids and their movement patterns are analyzed. The resulting multi-camera system poses particular challenges of synchronization and stability.
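One simple way to represent a person as a 3D ellipsoid, given a cluster of points from the point cloud, is sketched below. This is an illustrative assumption rather than the project's actual method: the ellipsoid is axis-aligned, its centre is the point mean, and its semi-axes are a hypothetical `scale` factor times the per-axis standard deviation.

```python
import math


def fit_axis_aligned_ellipsoid(points, scale=2.0):
    """Approximate a point cluster by an axis-aligned ellipsoid.

    Returns (center, radii). The `scale` factor is an assumption of this
    sketch; radii are `scale` times the per-axis standard deviation.
    """
    n = len(points)
    center = tuple(sum(p[i] for p in points) / n for i in range(3))
    radii = tuple(
        scale * math.sqrt(sum((p[i] - center[i]) ** 2 for p in points) / n)
        for i in range(3)
    )
    return center, radii


def inside(point, center, radii):
    """True if the point lies inside or on the ellipsoid (radii must be non-zero)."""
    return sum(((point[i] - center[i]) / radii[i]) ** 2 for i in range(3)) <= 1.0


# Hypothetical cluster: corners of a person-sized box, in metres.
cluster = [
    (1.0 + dx, 2.0 + dy, 0.9 + dz)
    for dx in (-0.2, 0.2)
    for dy in (-0.2, 0.2)
    for dz in (-0.9, 0.9)
]
center, radii = fit_axis_aligned_ellipsoid(cluster)
```

Tracking the `center` of such an ellipsoid over successive frames yields a trajectory from which movement patterns could be derived.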
|Michele Adduci, Konstantinos Amplianitis, Martin Misgaiski-Hass, Christian Kaptur, Sourabh Bodas, Silvio Tristram|
The company Hella Aglaia Mobile Vision GmbH develops and distributes intelligent visual sensor systems. Its products also include the software Cassandra, a tool for fast development of (prototype) image processing algorithms. Cassandra is implemented in C++ and specializes in the automotive area. Cassandra has changed significantly for version 11: improved multi-core support, better synchronization of data sources and sinks, support for real-time requirements (since the system is no longer cluttered by too much data), and many more features. Cassandra will also be released in a community edition, which allows academic associates and students to rapidly develop and test their own image processing algorithms. The outstanding feature of Cassandra is that it requires no programming skills. Still, if you know C++, you can extend Cassandra with your own stations.
With the freely available third-party library OpenCV, users have a very powerful image processing tool. This library will be integrated into Cassandra 11. The task of the CV group is to map OpenCV functions and classes to Cassandra stations in a sensible way. The difficulty lies in the different concepts behind OpenCV and Cassandra: the latter makes use of data flow graphs, whereas OpenCV requires much more user input and control over parameters and the use of its classes.
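The difference between a data flow graph and direct library calls can be sketched as follows. This is not Cassandra's API (Cassandra is a C++ tool): the `Station` class, its methods and the two toy processing functions are hypothetical stand-ins. The sketch shows a function with fixed parameters wrapped as a node, and nodes wired together so that data flows through them without further user code.

```python
class Station:
    """Hypothetical data-flow node: wraps a function and its fixed parameters."""

    def __init__(self, name, func, **params):
        self.name = name
        self.func = func
        self.params = params
        self.sinks = []

    def connect(self, other):
        """Route this station's output to another station's input."""
        self.sinks.append(other)
        return other

    def push(self, data, results):
        """Process data, record the output, and forward it downstream."""
        out = self.func(data, **self.params)
        results[self.name] = out
        for sink in self.sinks:
            sink.push(out, results)


def threshold(pixels, level):
    """Stand-in for a library function that needs an explicit parameter."""
    return [255 if p >= level else 0 for p in pixels]


def count_foreground(pixels):
    """Count pixels marked as foreground (value 255)."""
    return sum(1 for p in pixels if p == 255)


source = Station("source", lambda pixels: list(pixels))
binarize = Station("binarize", threshold, level=128)
counter = Station("count", count_foreground)
source.connect(binarize).connect(counter)

results = {}
source.push([10, 200, 130, 90], results)
```

In this style the parameter `level` is set once when the graph is configured, whereas calling the library function directly requires the user to pass it, and to manage intermediate results, at every call site.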
The OpenCV module core has already been ported to Cassandra by Hella Aglaia. The CV group has implemented the modules calib3d, features2d, video analysis and object detection. With the experience and knowledge of our group, we were able to add meaningful application scenarios and end-user tutorials.