Click on any session in the calendar for more information about the speakers and descriptions of the talks.
Invited speakers will give technical talks about their research in computer vision. Each invited talk will last 25 minutes (a 20-minute presentation followed by 5 minutes of Q/A).
Title: A future with affordable self-driving vehicles
Panelists will answer questions and discuss increasing diversity in computer vision.
Feel free to ask your anonymous questions here.
Authors of a few accepted papers are invited to give oral presentations. Each presentation will last 7.5 minutes (a 5-minute spotlight followed by 2.5 minutes of Q/A).
Abstract: Partial person re-identification involves matching pedestrian views in which only part of the body is visible in the corresponding images. This reflects a practical CCTV surveillance scenario, where full person views are often unavailable. Missing body parts make the comparison very challenging due to significant misalignment and the varying scale of the views. We propose the Partial Matching Net (PMN), which detects body joints, aligns partial views, and hallucinates the missing parts based on the information present in the frame and a learned model of a person. The aligned and reconstructed views are then combined into a joint representation and used for matching images. We evaluate our approach and compare it to other methods on three different datasets, demonstrating significant improvements.
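The snippet below is only a schematic wiring of the stages named in the abstract (joint detection, alignment, hallucination of missing parts, matching on a combined representation) using placeholder modules; it is not the authors' PMN, and the alignment step is omitted.

```python
# Schematic only: placeholder modules standing in for the pipeline stages described above.
import torch
import torch.nn as nn

class PartialMatchingSketch(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.joint_detector = nn.Conv2d(3, 17, 3, padding=1)   # stand-in: one heatmap per body joint
        self.hallucinator = nn.Conv2d(3, 3, 3, padding=1)      # stand-in: fills in unobserved regions
        self.encoder = nn.Sequential(nn.AdaptiveAvgPool2d(8), nn.Flatten(), nn.Linear(3 * 64, feat_dim))

    def forward(self, partial_view):
        joints = self.joint_detector(partial_view)              # locate visible body joints
        aligned = partial_view                                  # joint-based alignment omitted in this sketch
        completed = self.hallucinator(aligned)                  # reconstruct the missing body parts
        combined = torch.cat([aligned, completed], 0).mean(0, keepdim=True)
        return self.encoder(combined)                           # joint representation used for matching

net = PartialMatchingSketch()
a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)      # two partial views
print(torch.cosine_similarity(net(a), net(b)))                 # match by embedding similarity
```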
Abstract: We present the minimalist camera (mincam), a design framework for capturing scene information with minimal resources and without constructing an image. The basic sensing unit of a mincam is a ‘mixel’, an optical photo-detector that aggregates light from the entire scene, linearly modulated by a static mask. We precompute a set of masks for a configuration of a few mixels such that they retain the minimal information relevant to a task. We show how tasks such as tracking moving objects or determining a vehicle’s speed can be accomplished with a handful of mixels, as opposed to the more than a million pixels used in traditional photography. Since mincams are passive, compact, low-powered, and inexpensive, they can potentially find applications in a broad range of scenarios.
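As a toy illustration of the mixel idea (random masks here, whereas the paper precomputes task-specific ones), each mixel reports a single number per frame: the mask-weighted sum of scene intensity. A few such numbers can already track a moving object.

```python
# Toy simulation of a handful of "mixels", each viewing the scene through its own static mask.
import numpy as np

rng = np.random.default_rng(0)

def mixel_readings(scene, masks):
    """Each mixel outputs one number: the mask-weighted sum of scene intensity."""
    return np.array([np.sum(scene * m) for m in masks])

H, W, n_mixels = 64, 64, 4
masks = rng.random((n_mixels, H, W))          # placeholder masks; the paper optimizes these for the task

# A bright object sweeping horizontally across an otherwise dark scene.
readings = []
for x in range(0, W - 8):
    scene = np.zeros((H, W))
    scene[28:36, x:x + 8] = 1.0
    readings.append(mixel_readings(scene, masks))

readings = np.array(readings)                 # (positions, n_mixels): 4 numbers per frame
print(readings.shape)                         # the object's position can be regressed from these few readings
```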
Abstract: Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention. Generally, the annotation burden is mitigated by labeling datasets with weaker forms of supervision, e.g., image-level labels or bounding boxes. Another option is the semi-supervised setting, which commonly leverages a few strong annotations and a huge number of unlabeled or weakly-labeled images. In this paper, we revisit semi-supervised segmentation schemes and significantly narrow down the annotation budget (in terms of the total labeling time of the training set) compared to previous approaches. With a very simple pipeline, we demonstrate that at low annotation budgets, semi-supervised methods outperform weakly-supervised ones by a wide margin for both semantic and instance segmentation. Our approach also outperforms previous semi-supervised works at a much reduced labeling cost. We present results for the Pascal VOC benchmark and unify weakly- and semi-supervised approaches by considering the total annotation budget, thus allowing a fairer comparison between methods.
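To make the "total annotation budget" comparison concrete, here is a small accounting sketch with purely illustrative per-image labeling times (not figures from the paper): the same annotator-time budget buys either many weak tags or a small pool of dense masks plus unlabeled data.

```python
# Toy budget accounting with assumed labeling times; only the bookkeeping is illustrated.
PIXEL_LABEL_MIN = 4.0    # minutes to densely label one image (assumed)
IMAGE_LABEL_MIN = 0.05   # minutes for one image-level tag (assumed)

budget_min = 8 * 60      # 8 hours of annotator time

# Weakly-supervised option: spend the whole budget on image-level tags.
n_weak = int(budget_min / IMAGE_LABEL_MIN)

# Semi-supervised option: spend half the budget on dense masks, keep the rest of the data unlabeled.
n_strong = int(0.5 * budget_min / PIXEL_LABEL_MIN)

print(f"weak-only: {n_weak} tagged images | semi: {n_strong} dense masks + unlabeled pool")
```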
Abstract: Image classification serves as a fundamental part of many computer vision tasks, such as detection, segmentation, and captioning, in which deep neural networks excel. However, the vulnerability of these models to imperceptible, carefully crafted noise has raised questions about their robustness. In this work, we open these neural-network-based black boxes for adversarial examples by leveraging side information (attributes). We predict attributes for clean as well as adversarial images and analyze how the attributes change when the input is slightly perturbed. We present comprehensive experiments on attribute prediction, adversarial example generation, and adversarially robust learning, together with their qualitative and quantitative analysis using predicted attributes on the Caltech-UCSD Birds dataset.
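A minimal sketch of the "compare attributes on clean vs. perturbed inputs" idea is shown below. The `AttributeNet` module and the single FGSM step are stand-ins chosen for brevity; they are not the paper's model or its attack setup.

```python
# Sketch: predict attributes for a clean image and an adversarially perturbed copy, then inspect the drift.
import torch
import torch.nn as nn

class AttributeNet(nn.Module):
    """Stand-in network with a class head and a per-attribute sigmoid head."""
    def __init__(self, n_classes=200, n_attrs=312):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU())
        self.cls_head = nn.Linear(256, n_classes)
        self.attr_head = nn.Linear(256, n_attrs)

    def forward(self, x):
        h = self.backbone(x)
        return self.cls_head(h), torch.sigmoid(self.attr_head(h))

model = AttributeNet().eval()
x = torch.rand(1, 3, 64, 64, requires_grad=True)   # placeholder image
y = torch.tensor([0])

logits, attrs_clean = model(x)
nn.functional.cross_entropy(logits, y).backward()

x_adv = (x + 0.03 * x.grad.sign()).clamp(0, 1)     # one FGSM step; just one common attack choice
_, attrs_adv = model(x_adv)

drift = (attrs_adv - attrs_clean).abs().squeeze()
print(drift.topk(5).indices)                       # attributes that change most under the perturbation
```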
Abstract: We propose a novel procedure that adds "content-addressability" to any given unconditional implicit model, e.g., a generative adversarial network (GAN). The procedure allows users to control the generative process by specifying a set (of arbitrary size) of desired examples, based on which similar samples are generated from the model. The proposed approach, based on kernel mean matching, is applicable to any generative model that transforms latent vectors into samples and does not require retraining of the model. Experiments on various high-dimensional image generation problems (CelebA-HQ, LSUN bedroom, bridge, tower) show that our approach is able to generate images that are consistent with the input set while retaining the image quality of the original model. To our knowledge, this is the first work that attempts to construct, at test time, a content-addressable generative model from a trained marginal model.
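The core mechanism can be illustrated in a few lines: with the generator frozen, optimize only the latent codes so that the generated samples match the user-provided set under a kernel-based discrepancy (MMD). The toy generator, Gaussian kernel, and 2-D "samples" below are assumptions for the sake of a self-contained example; the paper works with pretrained GANs and matches in a feature space.

```python
# Minimal kernel-mean-matching sketch: adjust latents of a frozen toy generator toward a reference set.
import torch

torch.manual_seed(0)
G = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2))  # stand-in generator
for p in G.parameters():
    p.requires_grad_(False)                                      # the generator is never retrained

def gaussian_kernel(a, b, sigma=1.0):
    return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))

def mmd2(x, y):
    return gaussian_kernel(x, x).mean() + gaussian_kernel(y, y).mean() - 2 * gaussian_kernel(x, y).mean()

reference = torch.randn(16, 2) + torch.tensor([3.0, -1.0])       # the user-provided "content" set
z = torch.randn(16, 8, requires_grad=True)                       # only the latent vectors are optimized
opt = torch.optim.Adam([z], lr=0.05)

for step in range(200):
    opt.zero_grad()
    loss = mmd2(G(z), reference)
    loss.backward()
    opt.step()

print(float(mmd2(G(z), reference)))                              # samples G(z) now resemble the reference set
```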
Abstract: Single image super-resolution (SISR) is a challenging ill-posed problem that aims to restore or infer a high-resolution image from a low-resolution one. Powerful deep learning-based techniques have achieved state-of-the-art performance in SISR; however, they can underperform when handling images with non-stationary degradations, such as in the application of projector resolution enhancement. In this paper, we propose a new UNet architecture that learns the relationship between a set of degraded low-resolution images and their corresponding original high-resolution images. We apply a degradation model to the training images in a non-stationary way, allowing the construction of a robust UNet (RUNet) for image super-resolution (SR). Experimental results show that the proposed RUNet improves the visual quality of the obtained super-resolution images while maintaining a low reconstruction error.
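One way to picture a "non-stationary" degradation model is a blur whose strength changes across the image before downsampling, so that low/high-resolution training pairs are not related by a single global kernel. The tile-wise Gaussian blur below is an illustrative stand-in, not the paper's actual degradation pipeline.

```python
# Illustrative construction of (LR, HR) training pairs with a spatially varying (non-stationary) blur.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

def nonstationary_degrade(hr, max_sigma=3.0, tiles=4):
    """Blur each tile of the image with its own randomly drawn sigma, then downsample 2x."""
    h, w = hr.shape
    out = hr.copy()
    th, tw = h // tiles, w // tiles
    for i in range(tiles):
        for j in range(tiles):
            sl = (slice(i * th, (i + 1) * th), slice(j * tw, (j + 1) * tw))
            out[sl] = gaussian_filter(hr[sl], rng.uniform(0.1, max_sigma))
    return out[::2, ::2]                      # low-resolution, non-uniformly degraded input

hr = rng.random((128, 128))                   # placeholder high-resolution image
lr = nonstationary_degrade(hr)
print(hr.shape, lr.shape)                     # (128, 128) (64, 64) -> one (lr, hr) training pair
```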
Authors of all accepted papers will present their work in a poster session. All posters should be put up within the first 10 minutes of the poster session.
Use your Poster ID to find your poster board at the conference venue.
The physical dimensions of the poster stands are 8 feet wide by 4 feet high.
Poster presenters can optionally consult the CVPR19 poster template for details on how to prepare their posters.
The list of accepted papers along with their Poster IDs can be found here.
7:00 - 10:00 pm Dinner sponsored by Google
The dinner event is an opportunity to meet other female computer vision researchers. Authors will be matched with senior computer vision researchers to share experiences and career advice. Invitees will receive an e-mail and will be asked to confirm their attendance.