The paper aims at two points: 1. Synthetic Depth-of-Field
Shallow depth-of-field: when the depth of field is small (shallow), the background and foreground of the image are blurred and only a narrow range of depths is in focus.
Dual-pixel technology effectively divides every pixel into two separate photosites: each pixel consists of two photodiodes that sit side by side under a single microlens.
A matte is a layer (or any of its channels) that defines the transparent areas of that layer or another layer.
A cell phone camera typically captures all-in-focus images.
We present a system to computationally synthesize shallow depth-of-field images with a single mobile camera and a single button press.
Our system can process a 5.4 megapixel image in 4 seconds on a mobile phone, is fully automatic, and is robust enough to be used by non-experts.
b) time-of-flight or structured-light direct depth sensor
Our system combines two different technologies and is able to function with only one of them. The first is a neural network trained to segment out people and their accessories. Second, if available, we use a sensor with dual-pixel (DP) auto-focus hardware, which effectively gives us a 2-sample light field with a narrow ∼1 millimeter baseline.
---Samuel W Hasinoff, Dillon Sharlet, Ryan Geiss, Andrew Adams, Jonathan T Barron, Florian Kainz, Jiawen Chen, and Marc Levoy. 2016. Burst photography for high dynamic range and low-light imaging on mobile cameras. ACM TOG (2016).
---Robert Anderson, David Gallup, Jonathan T Barron, Janne Kontkanen, Noah Snavely, Carlos Hernández, Sameer Agarwal, and Steven M Seitz. 2016. Jump: Virtual Reality Video. SIGGRAPH Asia (2016).
---Jonathan T Barron, Andrew Adams, YiChang Shih, and Carlos Hernández. 2015. Fast bilateral-space stereo for synthetic defocus. CVPR (2015).
---Johannes Kopf, Michael F Cohen, Dani Lischinski, and Matt Uyttendaele. 2007. Joint bilateral upsampling. ACM TOG (2007).
We present a calibration procedure.
Our rendering technique divides the scene into several layers at different disparities, splats pixels to translucent disks according to disparity and then composites the different layers weighted by the actual disparity.
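A minimal sketch of the compositing step described above: given per-layer premultiplied RGBA images already ordered back to front, each layer is combined with the accumulated result using the standard "over" operator. The function name `composite_layers` and the premultiplied-alpha convention are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def composite_layers(layers):
    """Alpha-composite premultiplied RGBA layers, ordered back to front.

    Each layer is an (H, W, 4) float array whose RGB channels are
    premultiplied by its alpha channel (layer[..., 3]).
    """
    out = np.zeros_like(layers[0])
    for layer in layers:
        alpha = layer[..., 3:4]
        out = layer + (1.0 - alpha) * out  # "over" operator
    return out
```

Because the layers are premultiplied, the "over" operator is a single multiply-add per layer, which keeps per-layer compositing cheap on mobile hardware.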
The wide field-of-view of a typical mobile camera is ill-suited for portraiture: it causes a photographer to stand near subjects, leading to unflattering perspective distortion of their faces.
---Carlos Hernández. 2014. Lens Blur in the new Google Camera app. http://research.googleblog.com/2014/04/lens-blur-in-new-google-camera-app.html.
Our contributions include: (a) training and data collection methodologies to train a fast and accurate segmentation model capable of running on a mobile device, and (b) edge-aware filtering to upsample the mask predicted by the neural network.
Data curation included choosing a wide enough variety of poses, discarding poor training images, cleaning up inaccurate polygon masks, etc.
With each improvement we made to our training data over a 9-month period, we observed the quality of our defocused portraits improve commensurately.
The network takes as input a 4 channel 256 × 256 image, where 3 of the channels correspond to the RGB image resized and padded to 256 × 256 resolution preserving the aspect ratio. The fourth channel encodes the location of the face as a posterior distribution of an isotropic Gaussian centered on the face detection box with a standard deviation of 21 pixels and scaled to be 1 at the mean location.
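The fourth input channel described above can be sketched directly from its definition: an isotropic Gaussian centered on the face detection box, with standard deviation 21 pixels, scaled to 1 at the mean. The function name `face_prior_channel` and the `(x0, y0, x1, y1)` box convention are assumptions for illustration.

```python
import numpy as np

def face_prior_channel(face_box, size=256, sigma=21.0):
    """Fourth network input channel: isotropic Gaussian face prior.

    face_box is (x0, y0, x1, y1) in pixel coordinates of the resized
    256 x 256 input. The Gaussian is centered on the box center and
    scaled so its value at the mean is exactly 1.
    """
    x0, y0, x1, y1 = face_box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

Scaling the peak to 1 (rather than normalizing to unit mass) keeps the channel in the same [0, 1] range as the RGB channels.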
At inference time, we are provided with an RGB image and face rectangles output by a face detector. Our model is trained to predict the segmentation mask corresponding to the face location in the input.
3.4 Edge-Aware Filtering of a Segmentation Mask
Using the prior that mask boundaries are often aligned with image edges, we use
an edge-aware filtering approach to upsample the low resolution mask M(x) predicted by the network.
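A naive sketch of this kind of edge-aware upsampling, in the spirit of the joint bilateral upsampling reference cited earlier [Kopf et al. 2007]: each high-resolution output pixel averages nearby low-resolution mask values, weighted both by spatial distance and by similarity of the high-resolution guide image. The function name, the crude strided downsampling of the guide, and the parameter values are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def joint_bilateral_upsample(mask_lo, guide_hi, sigma_s=2.0, sigma_r=0.1, radius=2):
    """Naive joint bilateral upsampling of a low-resolution soft mask.

    mask_lo:  (h, w) low-resolution mask in [0, 1].
    guide_hi: (H, W) grayscale high-resolution guide image in [0, 1],
              with H, W integer multiples of h, w.
    """
    H, W = guide_hi.shape
    h, w = mask_lo.shape
    sy, sx = h / H, w / W                              # high-res -> low-res scale
    guide_lo = guide_hi[::H // h, ::W // w][:h, :w]    # crude guide downsample
    out = np.zeros((H, W))
    for Y in range(H):
        for X in range(W):
            y, x = Y * sy, X * sx
            acc = norm = 0.0
            for j in range(int(y) - radius, int(y) + radius + 1):
                for i in range(int(x) - radius, int(x) + radius + 1):
                    if 0 <= j < h and 0 <= i < w:
                        ws = np.exp(-((j - y) ** 2 + (i - x) ** 2) / (2 * sigma_s ** 2))
                        wr = np.exp(-(guide_lo[j, i] - guide_hi[Y, X]) ** 2 / (2 * sigma_r ** 2))
                        acc += ws * wr * mask_lo[j, i]
                        norm += ws * wr
            out[Y, X] = acc / max(norm, 1e-12)
    return out
```

The range term `wr` is what makes the filter edge-aware: low-resolution samples whose guide intensity differs from the output pixel's are down-weighted, so the upsampled mask boundary snaps to image edges instead of the coarse mask grid.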
Our model requires only 3.07 gigaflops, compared to 607 for PortraitFCN+ and 3160 for Mask R-CNN, as measured using the TensorFlow Model Benchmark Tool.
4. Depth from a Dual-Pixel Camera
Dual-pixel (DP) auto-focus systems work by splitting pixels in half, such that the left half integrates light over the right half of the aperture and vice versa.
This system is normally used for autofocus, where it is sometimes called phase-detection auto-focus.
Some techniques can compute depth but require more than two views.
---Edward H Adelson and John YA Wang. 1992. Single lens stereo with a plenoptic camera.
We build upon the stereo work of Barron et al.
We therefore build upon the stereo work of Barron et al. [2015] and the edge-aware flow work of Anderson et al. [2016] to construct a stereo algorithm that is both tractable at high resolution and well-suited to the defocus task by virtue of following the edges in the input image.
To get multiple frames for denoising, we keep a circular buffer of the last nine raw and DP frames captured by the camera.
To compute disparity, we take each non-overlapping 8 × 8 tile in the first view and search a range of −3 pixels to 3 pixels in the second view at DP resolution.
We use several heuristics to assess confidence: the value of the SSD loss, the magnitude of the horizontal gradients in the tile, the presence of a close second minimum, and the agreement of disparities in neighboring tiles.
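The tile search described above can be sketched as a brute-force SSD matcher: for each non-overlapping tile in one DP view, try every integer shift in the search range against the other view and keep the shift with the smallest sum of squared differences. The function name `tile_disparity` is an assumption; the real pipeline works on noisy DP data, uses subpixel refinement, and applies the confidence heuristics listed above, none of which are shown here.

```python
import numpy as np

def tile_disparity(left, right, tile=8, search=3):
    """Brute-force integer disparity per non-overlapping tile x tile block.

    For every tile in `left`, tries horizontal shifts d in
    [-search, search] in `right` and keeps the shift with the smallest
    sum of squared differences (SSD). Returns an (H//tile, W//tile) map.
    """
    H, W = left.shape
    disp = np.zeros((H // tile, W // tile), dtype=int)
    for ty in range(H // tile):
        for tx in range(W // tile):
            y0, x0 = ty * tile, tx * tile
            patch = left[y0:y0 + tile, x0:x0 + tile]
            best, best_d = np.inf, 0
            for d in range(-search, search + 1):
                xs = x0 + d
                if xs < 0 or xs + tile > W:
                    continue  # shifted tile falls outside the image
                ssd = np.sum((patch - right[y0:y0 + tile, xs:xs + tile]) ** 2)
                if ssd < best:
                    best, best_d = ssd, d
            disp[ty, tx] = best_d
    return disp
```

The small ±3 pixel range reflects the narrow DP baseline: disparities between the two half-aperture views are only a few pixels even at DP resolution.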
4.2 Imaging Model and Calibration
This equation has two notable consequences. First, disparity depends on focus distance (z) and is zero when depth is equal to focus distance (D = z). Second, there is a linear relationship between inverse depth and disparity that does not vary spatially.
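The equation referred to above is not reproduced in these notes. A form consistent with both stated consequences (zero disparity when D = z, and a spatially invariant linear relationship between disparity and inverse depth) would be the following, where z is the focus distance, D is the object depth, and κ is a proportionality constant folding in aperture, focal length, and pixel pitch; the symbol κ and this exact grouping are an assumption, not a quote from the paper:

```latex
% d = 0 when D = z; d is linear in inverse depth 1/D.
d \;=\; \kappa \left( \frac{1}{z} - \frac{1}{D} \right)
```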
4.3 Combining Disparity and Segmentation
4.4 Edge-Aware Filtering of Disparity
We use the bilateral solver [Barron and Poole 2016] to turn the noisy disparities into a smooth edge-aware disparity map suitable for shallow depth-of-field rendering.
5.1 Precomputing the blur parameters
One obvious solution is to simply reexpress the scatter as a gather.
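A sketch of the scatter-as-gather idea: the scatter view splats each input pixel onto a disk whose radius is that pixel's blur radius; the equivalent gather view has each output pixel collect every input pixel whose disk covers it, with weights normalized so energy is conserved. The function name `gather_blur` and the `1/(r^2 + 1)` disk-energy weight are illustrative assumptions, not the paper's renderer.

```python
import numpy as np

def gather_blur(image, radius):
    """Spatially varying defocus blur expressed as a gather.

    Scatter view: input pixel (j, i) splats a disk of radius radius[j, i].
    Gather view: output pixel (y, x) averages all input pixels (j, i)
    with (j - y)^2 + (i - x)^2 <= radius[j, i]^2, weighted so a larger
    disk spreads (roughly) the same total energy.
    """
    H, W = image.shape
    rmax = int(np.ceil(radius.max()))
    out = np.zeros_like(image, dtype=float)
    for y in range(H):
        for x in range(W):
            acc, norm = 0.0, 0.0
            for j in range(max(0, y - rmax), min(H, y + rmax + 1)):
                for i in range(max(0, x - rmax), min(W, x + rmax + 1)):
                    r = radius[j, i]
                    if (j - y) ** 2 + (i - x) ** 2 <= r * r:
                        w = 1.0 / (r * r + 1.0)  # approximate disk-energy weight
                        acc += w * image[j, i]
                        norm += w
            out[y, x] = acc / max(norm, 1e-12)
    return out
```

The gather form is attractive on a GPU because each output pixel is computed independently with no write contention, unlike a scatter of overlapping disks.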
5.3 Producing the final image
Final image with synthetic noise