
[IEEE 2011 Annual IEEE India Conference (INDICON), Hyderabad, India, 16-18 December 2011] HDR Scene Detection and Capturing Strategy

HDR Scene Detection and Capturing Strategy

V. ValliKumari Department of Computer Science

Andhra University Visakhapatnam, India

B. RaviKiran and K.V.S.V.N. Raju

Department of Computer Science Anil Neerukonda Institute of Technology

Visakhapatnam, India

Abstract— Existing high dynamic range (HDR) imaging acquisition techniques are limited by the skill of the end user and operate offline. We propose an algorithm that detects HDR scenes and suggests an optimal set of exposures for a given scene.

Keywords- HDR, Exposure.

I. INTRODUCTION

If a scene is normally illuminated, it is not difficult to capture with available digital cameras. However, if the scene contains excessive backlight or high-contrast lighting, the output images may lose significant detail. If we adjust the exposure for the dark objects, the bright objects may appear too bright; similarly, if we adjust the exposure for the bright objects, the dark objects may appear too dark. The human eye can see full daylight detail and indoor lighting detail at the same time, without any loss in either the highlights or the shadows [1]. The human visual system is capable of adapting to lighting conditions that vary by nearly ten orders of magnitude [2], and within a single scene it functions over a range of about five orders of magnitude simultaneously [2]. This dynamic range capability far exceeds that of any imaging device available today, which is why the human eye can see much more detail. Capture the same scene with a digital camera, however, and you can end up with saturated highlights and lost shadow information. Because of size and cost limitations and limited hardware resources, capturing high dynamic range scenes is especially difficult with cell-phone cameras [3]. The dynamic range of many everyday scenes is far greater than that of current imaging devices. The dream of electronic vision and image processing is to mimic the capabilities of the human eye, and possibly to go beyond it in certain aspects [2].

HDR images contain all the light present in a scene, from dark shadow detail to saturated highlights. Because of the limited dynamic range of imaging devices, we cannot capture the full dynamic range of a scene in a single shot. By capturing multiple exposures of the same scene, however, we can preserve all of its detail in a single image. Such images are called high dynamic range images. Images that store a depiction of the scene in a range of intensities commensurate with the scene are what we call HDR images, or radiance maps [2].

High dynamic range CMOS sensors have been developed for high-speed machine vision, driver assistance, and endoscopy in order to capture the overall dynamic range of a scene [2], but such cameras are not widely available and are more expensive than consumer digital cameras. The Pentax K7 offers in-camera high dynamic range imaging: the camera captures three differently exposed images, combines them, and delivers the final image to the user. The main problem with this approach is that three images are sometimes not enough to capture the entire dynamic range of the scene, in which case the needed extra exposures are missing. The three captured exposures may also not be the correct ones. Conversely, in some cases two images, or even a single image, suffice to capture the full dynamic range of a scene; enabling the HDR mode of the Pentax K7 then wastes exposures.

The alternative, widely adopted approach to capturing the entire scene dynamic range with an available camera is for the end user to take multiple images of the same scene and combine them with special software. Tools such as Photomatix, Picturenaut, and HDR Darkroom produce high dynamic range images from a series of differently exposed input images of the same scene. The main problem with these tools is that the end user must know whether the scene is an HDR scene. If it is not, taking multiple exposures adds needless computation cost and time. An experienced photographer can judge the scene characteristics and decide whether it is HDR; an ordinary user may not be able to tell at all. If he or she does not capture multiple pictures, detail will be lost, and once lost it cannot be recovered by any means. A further problem is that even a user who decides the scene is HDR may not know the minimal number of exposures required, or at which exposure values to capture them. The total dynamic range of some real-world scenes can be captured with just two different exposures, while others require more than three.


II. RELATED WORK

To determine the degree of the lighting conditions, the authors of [4-6] proposed a method that uses the relation between the mean and median values of an image.

By computing the difference between the mean and median values, we can infer something about the scene's lighting conditions. Under normal lighting, pixel brightness follows a steady distribution across the colour and brightness ranges of the image, so the difference between the mean and median is very small [4-6]. Under high-contrast or back lighting, however, at under- or appropriately exposed settings the median brightness tends to lie in the low-value region and therefore differs considerably from the mean over all pixels. Based on this observation, the authors of [4-6] concluded that for normal lighting the difference between the mean and the median stays below a threshold (they used 20), whereas for under- or over-exposed scenes the difference exceeds the threshold.
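The mean-median test of [4-6] can be sketched in a few lines. The function name, array shapes, and the 8-bit grayscale assumption are ours; only the threshold of 20 comes from the cited papers.

```python
import numpy as np

def is_high_dynamic_range(gray, threshold=20):
    """Flag a possible high-contrast / backlit scene when the mean and
    median brightness diverge by more than `threshold` (20 in [4-6])."""
    gray = np.asarray(gray, dtype=np.float64)
    return abs(gray.mean() - np.median(gray)) > threshold

# Uniformly distributed brightness: mean and median nearly coincide.
normal = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

# Backlit scene: mostly dark pixels plus a small saturated strip, so the
# mean is pulled well above the (near-zero) median.
backlit = np.zeros((64, 256), dtype=np.uint8)
backlit[:, :32] = 255
```

On these synthetic images the test behaves as the papers describe: the evenly lit gradient falls below the threshold, while the mostly dark frame with a small bright strip exceeds it.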

Limitation: We verified this algorithm on Fig. 1, where the difference between the mean and median values is 6. According to the authors' algorithm the scene is therefore under normal lighting, but this is not true. Because a large portion of the image is under-exposed, both the mean and median values are small, and because of the small bright area the mean is only slightly larger than the median. Another drawback of the method is that the multiple-exposure procedure always increases the exposure value by a constant amount; this step should instead depend on the scene.

To find the different exposure values, Nayar et al. [7] fixed the number of exposures and minimized an objective function. To find the number of exposures, they started with two and continued until the minimum of the objective function fell below a given tolerance. Since the weighting function is discontinuous, the objective is difficult to minimize; the authors have yet to find the best way to do so and currently rely on exhaustive search. This search is performed offline to produce a list of exposures to be combined. The main limitation of this method is that its targeted end users are scientists and engineers, not common users.

Barakat et al. [8] introduced three different algorithms to identify bracketing sets for capturing HDR scenes. The first is the blind HDR system. With no prior knowledge of the target scene's range of irradiance values, the simple technique for capturing a high dynamic range image is to capture the scene at every available exposure. To avoid this, the authors implemented a simple greedy algorithm. It starts at the lowest exposure setting and, on each iteration, adds to the solution set an exposure whose detectable range overlaps that of the previously added exposure and covers as much of the remaining system-detectable range as possible. The algorithm terminates after adding the highest available exposure setting. The resulting set is called the minimum system bracketing set.
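Under the simplifying assumption that each exposure setting has a known detectable irradiance interval, the greedy construction of the minimum system bracketing set might be sketched as follows. `minimal_bracketing_set` and the interval representation are our illustration, not code from [8].

```python
def minimal_bracketing_set(ranges):
    """Greedy sketch of the blind bracketing algorithm in [8].

    `ranges` is a list of (lo, hi) detectable-irradiance intervals, one
    per exposure setting, ordered so the intervals shift monotonically
    from the first setting to the last; returns the indices of the
    chosen exposures.
    """
    chosen = [0]                       # start at the first exposure setting
    last = len(ranges) - 1
    while chosen[-1] != last:
        _, hi_prev = ranges[chosen[-1]]
        # candidates whose detectable range overlaps the last chosen one
        overlap = [i for i, (lo, hi) in enumerate(ranges)
                   if lo < hi_prev < hi]
        # pick the candidate covering most of the remaining range
        chosen.append(max(overlap, key=lambda i: ranges[i][1]))
    return chosen
```

For four overlapping intervals `[(0, 4), (2, 6), (3, 10), (8, 14)]` the sketch selects settings 0, 2, and 3 and skips setting 1, whose interval adds nothing beyond what settings 0 and 2 already cover.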

Figure 1. The difference between the mean and median values of this image is 6; according to [4-6] this indicates a normal lighting condition, but the scene is not normally lit.

The drawback of this approach is that some of these exposures may contain little or no useful information, depending on the scene. Some devices operate where the target scene's dynamic range is already available; we can refer to these as 'clairvoyant' HDR-imaging systems. For example, in an HDR video surveillance system the target scene is largely static (e.g., an outdoor building entrance). If the scene's minimum and maximum irradiance values are available, the blind system can be converted into a clairvoyant HDR-imaging system by modifying the initialization and termination conditions to reflect the scene's irradiance range; the procedure then traverses the available exposures in the upward direction. For situations where the scene's dynamic range is not known in advance, a feedback process can discover it. The procedure starts by acquiring an image at a central exposure setting. Lower and upper thresholds are then applied to the pixel values to determine whether the resulting LDR image has at least one pixel in darkness or in saturation. The procedure then traverses the set of available exposures in the upward (and then downward) direction, capturing LDR images and applying the dual threshold to each until an LDR image with no dark (respectively, saturated) pixels is obtained, i.e., until the entire scene is captured [8].
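The feedback procedure can be sketched as a pair of threshold-driven walks over the available exposures. `feedback_bracketing`, the `capture` callback, and the threshold values are our assumptions, not an implementation from [8].

```python
import numpy as np

def feedback_bracketing(capture, exposures, dark=5, sat=250):
    """Sketch of the dual-threshold feedback procedure described in [8]:
    start at a central exposure, walk upwards while the image still has
    dark pixels, then walk downwards while it still has saturated ones.
    `capture(ev)` is assumed to return an 8-bit image as a NumPy array.
    """
    centre = len(exposures) // 2
    used = [exposures[centre]]
    i = centre
    while (capture(exposures[i]) <= dark).any() and i + 1 < len(exposures):
        i += 1                          # longer exposure to recover shadows
        used.append(exposures[i])
    j = centre
    while (capture(exposures[j]) >= sat).any() and j - 1 >= 0:
        j -= 1                          # shorter exposure to recover highlights
        used.insert(0, exposures[j])
    return used
```

For a simulated scene containing one deep-shadow pixel and one bright pixel, the walk keeps adding longer exposures until the shadow registers, then backs off to a shorter exposure until the highlight is no longer clipped, skipping any settings that add nothing.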

III. HDR SCENE DETECTION & CAPTURING STRATEGY

A. HDR Detection

According to the basic definition, if an image contains a region that is over-exposed (pixel values near 255) or under-exposed (pixel values near 0), the scene can be called HDR. However, this broad definition does not apply to all images, because an image may contain reflective objects with uniform reflectance: naturally black or white patches. It is therefore important to distinguish such areas from genuine HDR regions.

Reflective objects with uniform reflectance can be verified by changing the exposure across multiple images of the same patch and observing the intensity. When the exposure time is doubled, the intensity of such a patch is also observed to roughly double, so these patches form a linear graph. Applying the same procedure to an HDR scene yields a non-linear graph. From Figure 2 to Figure 3 the exposure is doubled and the average pixel value also doubles: the average for Fig. 2 is 187, whereas for Fig. 3 it is 374.

Figure 2. Mean of the black color patch is 187.

Figure 3. Mean of the black color patch is 374.

Figure 4. Mean value of the small rectangle in the sky is 4090.
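The linearity test described above can be sketched as a ratio check on patch means across two exposures. The function name and the 10% tolerance are our choices; the paper specifies only the doubling behaviour.

```python
import numpy as np

def is_uniform_reflectance(patch_short, patch_long, exposure_ratio=2.0, tol=0.1):
    """Linearity test from Section III-A: a naturally black or white
    patch scales linearly with exposure time, so doubling the exposure
    should roughly double the patch mean; a clipped HDR region does not.
    """
    ratio = patch_long.mean() / patch_short.mean()
    return abs(ratio - exposure_ratio) <= tol * exposure_ratio
```

A patch whose mean goes from 187 to 374 when the exposure doubles (as in Figs. 2 and 3) passes the test, whereas a patch stuck near saturation fails it.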

B. Proposed Algorithm for HDR Capturing Strategy

This section outlines a proposed HDR detection algorithm that detects a high dynamic range scene and finds the set of exposure values required to capture its full dynamic range.

Step 1: Obtain the exposure value suggested by the camera's auto-exposure algorithm.

Step 2: Identify the regions where the pixel values are either close to 0 or close to 255.

Step 3: Pixel values can be zero either because of reflective objects with uniform reflectance (a black object) or because of under-exposed regions (shadow detail) in the image. Similarly, pixel values can be 255 either because of uniform-reflectance reflective objects (a white object) or because of saturated highlight detail. Our intention here is to identify the shadow detail and the saturated highlight detail, not the uniform-reflectance reflective objects.
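Steps 2 and 3 amount to thresholding the base exposure into candidate shadow and highlight masks, which the linearity test of Section III-A then filters; the cutoff values below are our illustration.

```python
import numpy as np

def exposure_masks(gray, low=5, high=250):
    """Step 2: flag pixels near 0 (candidate shadow detail or black
    objects) and near 255 (candidate saturated detail or white objects).
    Step 3 then separates the two cases with the patch-linearity test."""
    return gray <= low, gray >= high
```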

Step 4: Identify the under-exposed regions in the exposure obtained in Step 1 and search for exposure values that maximize contrast in the shadow regions. If the scene contains only one shadow region, a single exposure value is enough to maximize the contrast in the shadow detail. If the scene contains more than one shadow region, we must check whether a single exposure value yields the maximum contrast in all of them or whether separate exposure values are required.

Step 5: Similarly, we now need to find the exposure value(s) for the saturated details.

Step 6: The exposure value suggested by the camera's auto-exposure algorithm is the base exposure of our model. To decide whether this exposure must be included in the exposure sequence, we follow the simple procedure below.

Step 6.1: First, identify the high-contrast, correctly exposed regions in the base exposure (Step 1).

Step 6.2: Second, check whether the regions from Step 6.1 are properly exposed, with high contrast, in the exposures obtained in Steps 4 and 5. Depending on the scene characteristics, these regions may have the same contrast and be properly exposed in all the obtained exposures (Steps 1, 4, and 5); in that case the base exposure need not be included in the sequence used to construct the HDRI. If, however, the regions are not properly exposed or the contrast differs across the exposures from Steps 4 and 5, the base exposure must also be included in the sequence; otherwise the output image will show colour inconsistencies.

Adding an extra exposure to the HDR sequence requires more processing power and can create ghosting artifacts. If adding the base exposure does not reveal any additional detail, it is not worth adding to the sequence.

Step 7: The final step is to fuse the multiple images together into an HDRI. Once the correct exposure values for the shadow detail and highlight detail have been found, we take a weighted average of all the obtained exposures to produce the HDRI.
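Step 7's weighted average might be sketched as follows. The paper does not specify a weighting function, so a simple hat weight that favours well-exposed mid-range pixels is assumed here.

```python
import numpy as np

def fuse_exposures(images, exposure_times):
    """Step 7 sketch: convert each 8-bit exposure to a radiance estimate
    (pixel value / exposure time) and average the estimates, weighting
    mid-range pixels most heavily (hat weighting is our assumption)."""
    acc = np.zeros(np.shape(images[0]), dtype=np.float64)
    wsum = np.zeros_like(acc)
    for img, t in zip(images, exposure_times):
        img = np.asarray(img, dtype=np.float64)
        w = 1.0 - 2.0 * np.abs(img / 255.0 - 0.5)   # hat weight, peaks at mid-gray
        w = np.maximum(w, 1e-6)                      # keep weights strictly positive
        acc += w * (img / t)                         # per-image radiance estimate
        wsum += w
    return acc / wsum                                # fused radiance map (HDRI)
```

Dividing each image by its exposure time brings all exposures into a common radiance scale before averaging, so a pixel recorded as 128 at exposure time 1 and 128 at exposure time 2 fuses to the average of the two radiance estimates, 128 and 64.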

CONCLUSION & FUTURE WORK

We have presented a method for HDR scene detection and a capturing strategy, tested on around 250 different images (50 HDR sets). One set of HDR images can be found at http://picasaweb.google.com/ravi0024.ntu/Scene1# . Future work may include enhancing the output results.

ACKNOWLEDGMENT

We thank Prof. Ramakrishna Kakarala, NTU, Singapore, for his suggestions and inputs.

REFERENCES

[1] A. Rizzi, C. Gatto, B. Piacentini, M. Fierro, and D. Marini. "Human visual system inspired tone mapping algorithms for HDR images." Proc. SPIE Electronic Imaging Conf., vol. 5292, pp. 57-68, 2004.

[2] E. Reinhard, G. Ward, S. Pattanaik, and P. Debevec. High Dynamic Range Imaging. Elsevier, 2005.

[3] R.C. Bilcu, A. Burian, A. Knuutila, and M. Vehvilainen. "High dynamic range imaging on mobile devices." 15th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2008, pp. 1312-1315, Aug. 31-Sept. 3, 2008.

[4] Quoc Kien Vuong, Se-Hwan Yun, and Suki Kim. "A New Auto Exposure System to Detect High Dynamic Range Conditions Using CMOS Technology." Third International Conference on Convergence and Hybrid Information Technology, ICCIT '08, vol. 1, pp. 577-580, 11-13 Nov. 2008.

[5] JiaYi Liang, YaJie Qin, and ZhiLiang Hong. "An Auto-exposure algorithm for detecting high contrast lighting conditions." 7th International Conference on ASIC, ASICON '07, pp.725-728, 22-25 Oct. 2007.

[6] Quoc Kien Vuong, Se-Hwan Yun, and Suki Kim. "A New Auto Exposure and Auto White-Balance Algorithm to Detect High Dynamic Range Conditions Using CMOS Technology." Proceedings of the World Congress on Engineering and Computer Science, October 22-24, 2008.

[7] M. Grossberg, and S. Nayar. "High Dynamic Range from Multiple Images: Which Exposures to Combine?" In ICCV Workshop on Colour and Photometric Methods in Computer Vision (CPMCV) 2003.

[8] N. Barakat, A.N. Hone, and T.E. Darcie. "Minimal-Bracketing Sets for High-Dynamic-Range Image Capture." IEEE Transactions on Image Processing, vol.17, no.10, pp.1864-1875, Oct 2008.