digital watermarking - information systems and internet security


DIGITAL WATERMARKING

R. CHANDRAMOULI

Stevens Institute of Technology, Hoboken, NJ

NASIR MEMON

Polytechnic University, Brooklyn, NY

MAJID RABBANI

Eastman Kodak Company, Rochester, NY

INTRODUCTION

The advent of the Internet has resulted in many new opportunities for creating and delivering content in digital form. Applications include electronic advertising, real-time video and audio delivery, digital repositories and libraries, and Web publishing. An important issue that arises in these applications is protection of the rights of content owners. It has been recognized for quite some time that current copyright laws are inadequate for dealing with digital data. This has led to an interest in developing new copy deterrence and protective mechanisms. One approach that has been attracting increasing interest is based on digital watermarking techniques. Digital watermarking is the process of embedding information into digital multimedia content such that the information (which we call the watermark) can later be extracted or detected for a variety of purposes, including copy prevention and control. Digital watermarking has become an active and important area of research, and the development and commercialization of watermarking techniques are deemed essential to help address some of the challenges posed by the rapid proliferation of digital content.

In the rest of this article, we assume that the content being watermarked is a still image, though most digital watermarking techniques are, in principle, equally applicable to audio and video data. A digital watermark can be visible or invisible. A visible watermark typically consists of a conspicuously visible message or a company logo indicating the ownership of the image, as shown in Fig. 1. On the other hand, an invisibly watermarked image appears very similar to the original. The existence of an invisible watermark can be determined only by using an appropriate watermark extraction or detection algorithm. In this article, we restrict our attention to invisible watermarks.

An invisible watermarking technique generally consists of an encoding process and a decoding process. A generic watermark encoding process is shown in Fig. 2. The watermark insertion step is represented as

$$X' = E_K(X, W), \qquad (1)$$

where $X$ is the original image, $X'$ is the watermarked image, $W$ is the watermark information being embedded, $K$ is the user's insertion key, and $E$ represents the watermark insertion function. Depending on the way the watermark is inserted and on the nature of the watermarking algorithm, the detection or extraction method can take very different forms. One major difference between watermarking techniques is whether the watermark detection or extraction step requires the original image. Watermarking techniques that do not require the original image during the extraction process are called oblivious (or public or blind) watermarking techniques. For oblivious watermarking techniques, watermark extraction is represented as

$$W' = D_{K'}(X'), \qquad (2)$$

where $X'$ is a possibly corrupted watermarked image, $K'$ is the extraction key, $D$ represents the watermark extraction/detection function, and $W'$ is the extracted watermark information (see Fig. 3). Oblivious schemes are attractive for many applications where it is not feasible to require the original image to decode a watermark.

Figure 1. An image that has a visible watermark.

Figure 2. Watermark encoding process.

Invisible watermarking schemes can also be classified as either robust or fragile. Robust watermarks are often used to prove ownership claims and so are generally designed to withstand common image processing tasks such as compression, cropping, scaling, filtering, contrast enhancement, and printing/scanning, in addition to malicious attacks aimed at removing or forging the watermark. In contrast, fragile watermarks are designed to detect and localize small changes in the image data.

Figure 3. Watermark decoding process.

Applications

Digital watermarks are potentially useful in many applications, including the following.

Ownership Assertion. To assert ownership of an image, Alice can generate a watermarking signal using a secret private key and embed it in the original image. She can then make the watermarked image publicly available. Later, when Bob contests the ownership of an image derived from this public image, Alice can produce the unmarked original image and also demonstrate the presence of her watermark in Bob's image. Because Alice's original image is unavailable to Bob, he cannot do the same. For such a scheme to work, the watermark has to survive image processing operations aimed at malicious removal. In addition, the watermark should be inserted so that it cannot be forged because Alice would not want to be held accountable for an image that she does not own.

Fingerprinting. In applications where multimedia content is electronically distributed across a network, the content owner would like to discourage unauthorized duplication and distribution by embedding a distinct watermark (or a fingerprint) in each copy of the data. If unauthorized copies of the data are found at a later time, the origin of the copy can be determined by retrieving the fingerprint. In this application, the watermark needs to be invisible and must also be invulnerable to deliberate attempts to forge, remove, or invalidate it. The watermark should also be resistant to collusion; that is, a group of users who hold copies of the same image containing different fingerprints should not be able to collude and invalidate any fingerprint or create a copy without any fingerprint.

Another example is in digital cinema, where information can be embedded as a watermark in every frame or sequence of frames to help investigators locate the scene of the piracy more quickly and point out security weaknesses in the movie's distribution. The information could include data such as the name of the theater and the date and time of the screening. The technology would be most useful in fighting a surprisingly common form of piracy, in which someone uses a camcorder to record the movie as it is shown in a theater and then duplicates it onto optical disks or VHS tapes for distribution.

Copy Prevention or Control. Watermarks can also be used for copy prevention and control. For example, in a closed system where the multimedia content needs special hardware for copying and/or viewing, a digital watermark can be inserted indicating the number of copies that are permitted. Every time a copy is made, the watermark can be modified by the hardware, and after a certain number of copies, the hardware would not create further copies of the data. An example of such a system is the digital versatile disk (DVD). In fact, a copy protection mechanism that includes digital watermarking at its core is currently being considered for standardization, and second-generation DVD players may well include the ability to read watermarks and act based on their presence or absence (1).

Fraud and Tampering Detection. When multimedia content is used for legal purposes, medical applications, news reporting, and commercial transactions, it is important to ensure that the content originated from a specific source and that it was not changed, manipulated, or falsified. This can be achieved by embedding a watermark in the data. Subsequently, when the photo is checked, the watermark is extracted using a unique key associated with the source, and the integrity of the data is verified through the integrity of the extracted watermark. The watermark can also include information from the original image that can aid in undoing any modification and recovering the original. Clearly, a watermark used for authentication should not affect the quality of an image and should be resistant to forgeries. Robustness is not critical because removing the watermark renders the content inauthentic and hence valueless.

ID Card Security. Information in a passport or ID (e.g., passport number or person's name) can also be included in the person's photo that appears on the ID. The ID card can be verified by extracting the embedded information and comparing it to the written text. The inclusion of the watermark provides an additional level of security in this application. For example, if the ID card is stolen and the picture is replaced by a forged copy, failure in extracting the watermark will invalidate the ID card.

These are a few examples of applications where digital watermarks could be of use. In addition, there are many other applications in digital rights management (DRM) and protection that can benefit from watermarking technology. Examples include tracking use of content, binding content to specific players, automatic billing for viewing content, and broadcast monitoring. From the variety of potential applications exemplified, it is clear that a digital watermarking technique needs to satisfy a number of requirements. The specific requirements vary with the application, so watermarking techniques need to be designed within the context of the entire system in which they are to be employed. Each application imposes different requirements and requires different types of invisible or visible watermarking schemes or a combination thereof. In the remaining sections of this article, we describe some general principles and techniques for invisible watermarking. Our aim is to give the reader a better understanding of the basic principles, inherent trade-offs, strengths, and weaknesses of digital watermarking. We will focus on image watermarking in our discussions and examples. However, as we mentioned earlier, the concepts involved are general and can be applied to other forms of content such as video and audio.

Relationship to Information Hiding and Steganography

In addition to digital watermarking, the general idea of hiding some information in digital content has a wider class of applications that go beyond copyright protection and authentication. The techniques involved in such applications are collectively referred to as information hiding. For example, an image printed on a document could be annotated by information that could lead a user to its high-resolution version, as shown in Fig. 4. Metadata provide additional information about an image. Although metadata can also be stored in the file header of a digital image, this approach has many limitations. Usually, when a file is transformed to another format (e.g., from TIFF to JPEG or to BMP), the metadata are lost. Similarly, cropping or any other form of image manipulation typically destroys the metadata. Finally, the metadata can be attached to an image only as long as the image exists in digital form and are lost once the image is printed. Information hiding allows the metadata to travel with the image regardless of the file format and image state (digital or analog). Metadata information embedded in an image can serve many purposes. For example, a business can embed the website URL for a specific product in a picture that shows an advertisement of that product. The user holds the magazine photo in front of a low-cost CMOS camera that is integrated into a personal computer, cell phone, or Palm Pilot. The data are extracted from the low-quality picture and are used to take the browser to the designated website. Another example is embedding GPS data (about 56 bits) about the capture location of a picture. The key difference between this application and many other watermarking applications is the absence of an active adversary. In watermarking applications, such as copyright protection and authentication, there is an active adversary that would attempt to remove, invalidate, or forge watermarks. In information hiding, there is no such active adversary because there is no value in removing the information hidden in the content. Nevertheless, information hiding techniques need to be robust to accidental distortions. For example, in the application shown in Fig. 4, the information embedded in the document image needs to be extracted despite distortions from the print and scan process. However, these distortions are just a part of a process and are not caused by an active adversary.

Another topic that is related to watermarking is steganography (meaning covered writing in Greek), which is the science and art of secret communication. Although steganography has been studied as part of cryptography for many decades, the focus of steganography is secret communication. In fact, the modern formulation of the problem goes by the name of the prisoner's problem. Here, Alice and Bob are trying to hatch an escape plan while in prison. The problem is that all communication between them is examined by a warden, Wendy, who will place both of them in solitary confinement at the first hint of any suspicious communication. Hence, Alice and Bob must trade seemingly inconspicuous messages that actually contain hidden messages involving the escape plan. There are two versions of the problem that are usually discussed: one where the warden is passive and only observes messages, and the other where the warden is active and modifies messages in a limited manner to guard against hidden messages. Clearly, the most important issue here is that the very presence of a hidden message must be concealed, whereas in digital watermarking, it is not always necessary that a good watermarking technique also be steganographic.

Watermarking Issues

The following important issues arise in studying digital watermarking techniques:

• Capacity: What is the optimum amount of data that can be embedded in a given signal? What is the optimum way to embed and then later extract this information?

• Robustness: How do we embed and retrieve data so that it survives malicious or accidental attempts at removal?

• Transparency: How do we embed data so that it does not perceptually degrade the underlying content?

• Security: How do we determine that the information embedded has not been tampered with, forged, or even removed?

Figure 4. Metadata tagging using information hiding. (Each sticker contains 20 bytes of hidden data; a scanner extracts the 20-byte ID, which is looked up in a database to retrieve the high-resolution original.)

These questions have been the focus of intense study in the past few years, and some remarkable progress has already been made. However, there are still more questions than answers in this rapidly evolving research area. Perhaps a key reason for this is that digital watermarking is inherently a multidisciplinary topic that builds on developments in diverse subjects. The areas that contribute to the development of digital watermarking include at least the following:

• Information and communication theory

• Decision and detection theory

• Signal processing

• Cryptography and cryptographic protocols

Each of these areas deals with a particular aspect of the digital watermarking problem. Generally speaking, information and communication theoretic methods deal with the data embedding (encoder) side of the problem. For example, information theoretic methods are useful in computing the amount of data that can be embedded in a given signal subject to various constraints such as the peak power (square of the amplitude) of the embedded data or the embedding-induced distortion. The host signal can be treated as a communication channel, and various operations such as compression/decompression and filtering can be treated as noise. Using this framework, many results from classical information theory can be successfully applied to compute the data-embedding capacity of a signal.
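As a simple illustration of this channel view, consider a deliberately idealized model that is an assumption made here, not one taken from this article: the embedded data have average power $P$ per sample, and the combined effect of processing is additive white Gaussian noise of variance $\sigma^2$. Shannon's capacity formula for this channel then bounds the embedding rate:

$$C = \frac{1}{2}\log_2\!\left(1 + \frac{P}{\sigma^2}\right) \text{ bits per sample}.$$

For example, if the noise power equals the embedding power ($P/\sigma^2 = 1$), then $C = \tfrac{1}{2}\log_2 2 = 0.5$ bit per sample. Real images and real attacks are neither Gaussian nor white, so this figure should be read only as an indication of how embedding power and distortion trade off against capacity.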

Decision theory is used to analyze data-embedding procedures from the receiver (decoder) side. Given a data-embedding procedure, how do we extract the hidden data from the host signal, which may have been subjected to intentional or unintentional attacks? The data extraction procedure must guarantee a certain amount of reliability. What are the chances that the extracted data are indeed the original embedded data? Even if the data-embedding algorithm is not intelligent or sophisticated, a good data extraction algorithm can offset this effect. In watermarking applications where the embedded data are used for copyright protection, decision theory is used to detect the presence of embedded data. In applications such as media bridging, detection theoretic methods are needed to extract the embedded information. Therefore, decision theory plays a very important role in digital watermarking for data extraction and detection. In fact, it has been shown that when using invisible watermarks for resolving rightful ownership, uniqueness problems arise due to the data detection process, irrespective of the data-embedding process. Therefore, there is a real and immediate need to develop reliable, efficient, and robust detectors for digital watermarking applications.

A variety of signal processing tools and algorithms can be applied to digital watermarking. Such algorithms are based on aspects of the human visual system, properties of signal transforms [e.g., Fourier and discrete cosine transform (DCT)], noise characteristics, properties of various signal processing attacks, etc. Depending on the nature of the application and the context, these methods can be implemented at the encoder, at the decoder, or both. The user has the flexibility to mix and match different techniques, depending on the algorithmic and computational constraints. Although issues such as visual quality, robustness, and real-time constraints can be accommodated, it is still not clear if all of the properties desirable for digital watermarking discussed earlier can be achieved by any single algorithm. In most cases, these properties have an inherent trade-off. Therefore, developing signal processing methods to strike an optimal balance between the competing properties of a digital watermarking algorithm is necessary.

Cryptographic issues lie at the core of many applications of information hiding but have unfortunately received little attention. Perhaps this is because most work in digital watermarking has been done in the signal processing and communications community, whereas cryptographers have focused more on issues like secret communication (covert channels, subliminal channels) and collusion-resistant fingerprinting. It is often assumed that simply using appropriate cryptographic primitives like encryption, time-stamps, digital signatures, and hash functions would result in secure information hiding applications. We believe that this is far from the truth. In fact, designing secure digital watermarking techniques requires an intricate blend of cryptography along with information theory and signal processing.

The rest of this article is organized as follows. In the next section, we describe fragile and semifragile watermarking; the following section deals with robust watermarks. Communication and information theoretic approaches to watermarking are discussed in the subsequent section, and concluding remarks are provided in the last section.

FRAGILE AND SEMIFRAGILE WATERMARKS

In the analog world, an image (a photograph) has generally been accepted as a proof of occurrence of the event depicted. The advent of digital images and the relative ease with which they can be manipulated has changed this situation dramatically. Given an image in digital or analog form, one can no longer be assured of its authenticity. This has led to the need for image authentication techniques.

Data authentication techniques have been studied in cryptography for the past few decades. They provide a means of ensuring the integrity of a message. At first, the need for image authentication techniques may not seem to pose a problem because efficient and effective authentication techniques are found in the field of cryptography. However, authentication applications for images present some unique problems that are not addressed by conventional cryptographic authentication techniques. Some of these issues are listed here:

• It is desirable in many applications to authenticate the image content, rather than the representation of the content. For example, converting an image from JPEG to GIF is a change in representation. One would like the authenticator to remain valid across different representations, as long as the perceptual content has not been changed. Conventional authentication techniques based on cryptographic hash functions, message digests, and digital signatures authenticate only the representation.

• When authenticating image content, it is often desirable to embed the authenticator in the image itself. This has the advantage that authentication will not require any modifications to the large number of existing representational formats for image content that do not provide any explicit mechanism for including an authentication tag (like the GIF format). More importantly, the authentication tag embedded in the image would survive transcoding of the data across different formats, including analog-to-digital and digital-to-analog conversions, in a completely transparent manner.

• In addition to detecting any tampering with the original content, it is also desirable to detect the exact location of the tampering.

• Given the highly data-intensive nature of image content, any authentication technique has to be computationally efficient to the extent that a simple real-time implementation should be possible in both hardware and software.

These issues can be addressed by designing image authentication techniques based on digital watermarks. Two kinds of watermarking techniques have been developed for authentication applications: fragile watermarking techniques and semifragile watermarking techniques. In the rest of this section, we describe the general approach taken by each and give some illustrative examples.

Fragile Watermarks

A fragile watermark is designed to indicate and even pinpoint any modification made to an image. To illustrate the basic workings of fragile watermarking, we describe a technique recently proposed by Wong and Memon (2). This technique inserts an invisible watermark $W$ into an $m \times n$ image, $X$. The original image $X$ and the binary watermark $W$ are partitioned into $k \times l$ blocks, where the $r$th image block and the watermark block are denoted by $X_r$ and $W_r$, respectively. For each image block $X_r$, a corresponding block $\tilde{X}_r$ is formed, identical to $X_r$, except that the least significant bit of every element in $\tilde{X}_r$ is set to zero.

For each block $\tilde{X}_r$, a cryptographic hash $H(K, m, n, \tilde{X}_r)$ (such as MD5) is computed, where $K$ is the user's key. The first $kl$ bits of the hash output, treated as a $k \times l$ rectangular array, are XORed with the current watermark block $W_r$ to form a new binary block $C_r$. Each element of $C_r$ is inserted into the least significant bit of the corresponding element in $\tilde{X}_r$, generating the output $X'_r$.

Image authentication is performed by extracting $C_r$ from each block $X'_r$ of the watermarked image and by XORing that array with the cryptographic hash, recomputed as before from the block with its least significant bits set to zero, to produce the extracted watermark block. Changes in the watermarked image result in changes in the corresponding binary watermark region, so the technique can be used to localize unauthorized alterations of an image.
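The secret-key insertion and extraction steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: pixels are 8-bit gray values stored as lists of rows, the image dimensions are multiples of the block size, $kl \le 128$ so that one MD5 digest supplies enough bits, and the way $K$, $m$, $n$, and the block are serialized into the hash input (as well as all function names) is a choice made here for illustration.

```python
import hashlib

def _hash_bits(key, m, n, block, nbits):
    """First nbits of MD5(key || m || n || block) as a list of 0/1 values.
    The byte layout of the hash input is an illustrative assumption."""
    if nbits > 128:
        raise ValueError("MD5 yields only 128 bits; use a smaller block")
    h = hashlib.md5()
    h.update(key)
    h.update(m.to_bytes(4, "big") + n.to_bytes(4, "big"))
    h.update(bytes(p for row in block for p in row))
    digest = h.digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(nbits)]

def _block_origins(m, n, k, l):
    """Top-left corners of the k x l blocks that tile an m x n image."""
    for r0 in range(0, m, k):
        for c0 in range(0, n, l):
            yield r0, c0

def embed(image, watermark, key, k=8, l=8):
    """Return a watermarked copy of image (lists of 8-bit ints)."""
    m, n = len(image), len(image[0])
    out = [row[:] for row in image]
    for r0, c0 in _block_origins(m, n, k, l):
        # Block with every least significant bit set to zero.
        cleared = [[image[r0 + i][c0 + j] & 0xFE for j in range(l)]
                   for i in range(k)]
        hbits = _hash_bits(key, m, n, cleared, k * l)
        for i in range(k):
            for j in range(l):
                c = hbits[i * l + j] ^ watermark[r0 + i][c0 + j]
                out[r0 + i][c0 + j] = cleared[i][j] | c
    return out

def extract(marked, key, k=8, l=8):
    """Recover the binary watermark from a (possibly altered) image."""
    m, n = len(marked), len(marked[0])
    wm = [[0] * n for _ in range(m)]
    for r0, c0 in _block_origins(m, n, k, l):
        cleared = [[marked[r0 + i][c0 + j] & 0xFE for j in range(l)]
                   for i in range(k)]
        hbits = _hash_bits(key, m, n, cleared, k * l)
        for i in range(k):
            for j in range(l):
                lsb = marked[r0 + i][c0 + j] & 1
                wm[r0 + i][c0 + j] = hbits[i * l + j] ^ lsb
    return wm
```

With an untouched image, `extract` returns the embedded bitmap exactly. If any of the upper seven bits of a pixel is changed, the hash of its block changes and the extracted bitmap for that block turns into noise, while all other blocks still decode correctly; this is the block-level localization the text describes.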

The watermarking algorithm can also be extended to a public key version, where the private key $K'_A$ of a public key algorithm is required to insert the watermark. However, the extraction requires only the public key of user A. More specifically, in the public key version of the algorithm, the MSBs of an image data block $X_r$ and the image size parameters are hashed, and then the result is encrypted using the private key of a public key algorithm. The resulting encrypted block is then XORed with the corresponding binary watermark block $W_r$ before the combined results are embedded in the LSB of the block. In the extraction step, the same MSB data and the image size parameters are hashed. The LSB of the data block (cipher text) is decrypted using the public key and then XORed with the hash output to produce the watermark block. Refer to Figs. 5 and 6 for public key verification watermark insertion and extraction processes, respectively.

This technique is one example of a fragile watermarking technique. Many other techniques are proposed in the literature. Following are the main issues that need to be addressed in designing fragile watermarking techniques:

• Locality: How well does the technique identify the exact pixels that have been modified? The Wong and Memon technique described before, for example, can localize only changes in image blocks (at least 12 × 12), even if only one pixel has been changed in the block. Any region smaller than this cannot be pinpointed as modified.

• Transparency: How much degradation in image quality is suffered by inserting a watermark?

• Security: How difficult is it for someone without knowledge of the secret key used in the watermarking process (the user key $K$ in the first scenario or the private key $K'_A$ in the second scenario) to modify an image without modifying the watermark or to insert a new but valid watermark?

Figure 5. Public key verification watermark insertion procedure.

Figure 6. Public key verification watermark extraction procedure.

Figure 7. SARI image authentication system: verification procedure.

Semifragile Watermarks

The methods described in the previous subsection authenticate the data that form the multimedia content; the authentication process does not treat the data as distinct from any other data stream. Only the process of inserting the signature into the multimedia content treats the data stream as an object that is to be viewed by a human observer. For example, a watermarking scheme may maintain the overall average image color, or it may insert the watermark in the least significant bit, thus discarding the least significant bits of the original data stream and treating them as perceptually irrelevant.

All multimedia content in current representations has a fair amount of built-in redundancy; that is to say, the data representing the content can be changed without effecting a perceptual change. Further, even perceptual changes in the data may not affect the content. For example, when dealing with images, one can brighten an image, compress it in a lossy fashion, or change contrast settings. The changes caused by these operations could well be perceptible, even desirable, but the image content is not considered changed. Objects in the image are in the same positions and are still recognizable. It is highly desirable that authentication of multimedia documents takes this into account; that is, there is a set of allowed operations that can be applied to the image content without affecting the authenticity of the image.

There have been a number of recent attempts at techniques that address authentication of "image content", not just the image data representation. One approach is to define image content in terms of feature points that are robust to image compression. Cryptographic schemes such as digital signatures can then be used to authenticate these feature points. Typical feature points include, for example, edge maps (3), local maxima and minima, and low-pass wavelet coefficients (4). The problem with these methods is that it is hard to define image content in terms of a few features; for example, edge maps do not sufficiently define image content because it may be possible for two images to have fairly different content (the face of one person replaced by that of another) but identical edge maps. Image content remains an ill-defined attribute that defies quantification despite the many attempts by the image processing and vision communities.

Another interesting approach to authenticating image content is to compute an image digest (or hash or fingerprint) of the image and encrypt the digest using a secret key. For public key verification of the image, the secret key is the user's private key, and hence the verification can be done by anyone who has the user's public key, much like digital signatures. Note that the image digest that is computed is much smaller than the image itself and can be embedded in the image by using a robust watermarking technique. Furthermore, the image digest has the property that, as long as the image content has not changed, the digest that is computed from the image remains the same. Clearly, constructing such an image digest function is a difficult problem. Nevertheless, there have been a few such functions proposed in the literature, and image authentication schemes based on them have been devised. Perhaps the most widely cited image digest function/authentication scheme is SARI, proposed by Lin and Chang (5). The SARI authentication scheme contains an image digest function that generates hash bits that are invariant to JPEG compression; that is, the hash bits do not change if the image is JPEG compressed but do change for any other significant or malicious operation.


The image digest component of SARI is based on the invariance of the relationship between selected DCT coefficients in two given image blocks. It can be proven that this relationship is maintained even after JPEG compression by using the same quantization matrix for the whole image. Because the image digest is based on this feature, SARI can distinguish between JPEG compression and other malicious operations that modify image content. More specifically, in SARI, the image to be authenticated is first transformed to the DCT domain. The DCT blocks are grouped into nonoverlapping sets $P_p$ and $P_q$ as defined here:

$$P_p = \{P_1, P_2, P_3, \ldots, P_{N/2}\}, \qquad P_q = \{Q_1, Q_2, Q_3, \ldots, Q_{N/2}\},$$

where $N$ is the total number of DCT blocks in the input image. An arbitrary mapping function $Z$ is defined between these two sets that satisfies the criteria $P_p = Z(K, P_q)$, $P_p \cap P_q = \emptyset$, and $P_p \cup P_q = P$, where $P$ is the set of all DCT blocks of the input image. The mapping function $Z$ is central to the security of SARI and is based on a secret key $K$. The mapping effectively partitions image blocks into pairs. Then, for each block pair, a number of DCT coefficients are selected. Feature code or hash bits are then generated by comparing the corresponding coefficients in the paired blocks. For example, if the DCT coefficient in block $P_m$ is greater than the DCT coefficient in block $P_n$ in the block pair $(P_m, P_n)$, then the hash bit generated is "1". Otherwise, a "0" is generated.
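The pairing and comparison steps can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the SARI implementation: the DCT has already been computed and each block is given as a flat list of coefficients, the key-seeded shuffle is only a stand-in for the mapping function $Z$, and the helper names `pair_blocks`, `feature_bits`, and `quantize` are hypothetical.

```python
import random

def pair_blocks(num_blocks, key):
    """Key-dependent partition of block indices into disjoint pairs.
    A seeded shuffle stands in for the secret mapping Z (assumption)."""
    idx = list(range(num_blocks))
    random.Random(key).shuffle(idx)
    half = num_blocks // 2
    return list(zip(idx[:half], idx[half:2 * half]))

def feature_bits(dct_blocks, key, positions):
    """One bit per (block pair, coefficient position): 1 if the coefficient
    in the first block of the pair is greater than the one in the second."""
    bits = []
    for p, q in pair_blocks(len(dct_blocks), key):
        for pos in positions:
            bits.append(1 if dct_blocks[p][pos] > dct_blocks[q][pos] else 0)
    return bits

def quantize(dct_blocks, table):
    """JPEG-style quantize/dequantize using one table for every block."""
    return [[table[i] * round(c / table[i]) for i, c in enumerate(block)]
            for block in dct_blocks]
```

Because every block is quantized with the same table and rounding is monotonic, a coefficient that was larger than its partner can never become smaller after `quantize`; the order of each pair is either preserved or collapses to a tie. The sketch does not handle such ties, which a practical scheme has to account for when comparing digests.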

It is clear that a hash bit preserves the relationshipbetween the selected DCT coefficients in a given block pair.The hash bits generated for each block are concatenatedto form the digest of the input image. This digest canthen either be embedded in the image itself or appendedas a tag. The authentication procedure at the receivingend involves extracting the embedded digest. The digestfor the received image is generated in the same manneras at the encoder and is compared with the extractedand decrypted digest. Because relationships betweenselected DCT coefficients is maintained even after JPEGcompression, this authentication system can distinguishJPEG compression from other malicious manipulations ofthe authenticated image. However, it was recently shownthat if a system uses the same secret key K and hencethe same mapping function Z to form block pairs for allof the images authenticated by it, an attacker who hasaccess to a sufficient number of images authenticated bythis system can produce arbitrary fake images (6).

SARI is limited to authentication that is invariant only to JPEG compression. Although JPEG compression is one of the most common operations performed on an image, certain applications may require authentication that is invariant to other simple image processing operations such as contrast enhancement or sharpening. As a representative of the published literature to achieve this purpose, we review a promising technique proposed by Fridrich (7). In this technique, $N$ random matrices whose entries are uniformly distributed in [0, 1] are generated using a secret key. Then, a low-pass filter is applied to each of these random matrices to obtain $N$ random smooth patterns, as shown in Fig. 8. These are then made DC free by subtracting their respective means to obtain $P_i$, where $i = 1, \ldots, N$. Then an image block $B$ is projected onto each of these random smooth patterns. If a projection is greater than zero, then the hash bit generated is “1”; otherwise, a “0” is generated. In this way, $N$ hash bits are generated for image authentication.

Figure 8. Random patterns and their smoothed versions used in Fridrich's semifragile watermarking technique.

Because the patterns $P_i$ have zero mean, the projections do not depend on the mean gray value of the block but depend only on the variations within the block itself. The robustness of this bit extraction technique was tested on real imagery, and it was shown that it can reliably extract more than 48 correct bits (out of 50 bits) from a small 64 × 64 image for the following image processing operations: 15% quality JPEG compression (as in PaintShop Pro); additive uniform noise that has an amplitude of 30 gray levels; 50% contrast adjustment; 25% brightness adjustment; dithering to 8 colors; multiple applications of sharpening, blurring, median, and mosaic filtering; histogram equalization and stretching; edge enhancement; and gamma correction in the range 0.7–1.5. However, operations such as embossing, and geometric modifications such as rotation, shift, and change of scale, lead to a failure to extract the correct bits.
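The bit extraction just described can be illustrated with the following sketch. It is not Fridrich's implementation: the box filter with wrap-around borders, its radius, and the function names are assumptions made for this example; only the overall recipe (key-seeded uniform matrices, low-pass filtering, mean removal, sign of the projection) follows the text.

```python
import numpy as np

def smooth_patterns(key, n_bits=50, size=64, radius=3):
    """Key-dependent random smooth patterns with zero mean."""
    rng = np.random.default_rng(key)
    patterns = rng.uniform(0.0, 1.0, size=(n_bits, size, size))
    # Assumed low-pass filter: a (2*radius+1)^2 box average with cyclic borders.
    smoothed = np.zeros_like(patterns)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            smoothed += np.roll(patterns, shift=(dy, dx), axis=(1, 2))
    smoothed /= (2 * radius + 1) ** 2
    # Make every pattern DC free by subtracting its own mean.
    smoothed -= smoothed.mean(axis=(1, 2), keepdims=True)
    return smoothed

def hash_bits(block, patterns):
    """Project the block onto each pattern; bit is 1 if the projection is > 0."""
    projections = np.tensordot(patterns, block.astype(float),
                               axes=([1, 2], [0, 1]))
    return (projections > 0).astype(np.uint8)
```

Because each pattern sums to zero, adding a constant to every pixel of the block (a pure change in mean gray value) leaves the projections, and hence the bits, unchanged up to rounding error.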

In summary, image content authentication using a visual hash function and then embedding this hash by using a robust watermark is a promising area and will see many developments in the coming years. This is a difficult problem, and there may never be a completely satisfactory solution because there is no clear definition of image content and relatively small changes in image representation could lead to large variations in image content.

ROBUST WATERMARKS

Unlike fragile watermarks, robust watermarks are resilient to intentional or unintentional attacks or signal processing operations. Ideally, a robust watermark must withstand attempts to destroy or remove it. Some of the desirable properties of a good, robust watermark include the following:

• Perceptual transparency: Robustness must not be achieved at the expense of perceptible degradation of the watermarked data. For example, a high-energy watermark can withstand many signal processing attacks; however, even in the absence of any attacks, it can cause significant loss in the visual quality of the watermarked image.

• High payload: A robust watermark must be able to carry a large number of information bits reliably, even in the presence of attacks.

• Resilience to common signal processing operations such as compression, linear and nonlinear filtering, additive random noise, and digital-to-analog conversion.

• Resilience to geometric attacks such as translation, rotation, cropping, and scaling.

• Robustness to collusion attacks in which multiple copies of the watermarked data can be used to create or remove a valid watermark.

• Computational simplicity: Consideration for computational complexity is important when designing robust watermarks. If a watermarking algorithm is robust but computationally very intensive during encoding or decoding, then its usefulness in real life may be limited.

In general, most of these properties conflict with one another, so a number of trade-offs are needed. Three major trade-offs in robust watermarking, and the applications that are affected by each of these trade-off factors, are shown in Fig. 9.

It is easily understood that placing a watermark in perceptually insignificant components of an image imperceptibly distorts the watermarked image. However, such watermarking techniques are generally not robust to intentional or unintentional attacks. For example, if the watermarked image is lossy compressed, then the perceptually insignificant components are discarded by the compression algorithm. Therefore, for a watermark to be robust, it must be placed in the perceptually significant components of an image, even though we run a risk of causing perceptible distortions. This gives rise to two important questions: (a) what are the perceptually significant components of a signal, and (b) how can the perceptual degradation due to robust watermarking be minimized? The answer to the first question depends on the type of medium (audio, image, or video). For example, certain spatial frequencies and some spatial characteristics such as edges in an image are perceptually significant. Therefore, choosing these components as carriers of a watermark will add robustness to operations such as lossy compression.

Figure 9. Trade-offs in robust watermarking. [Diagram with three axes. Capacity: 1 bit (detect watermark presence), 128 bits (unique identifier, UUID, GUID), 100's of bits (metadata storage). Robustness: scaling, rotation, perspective, noise, dithering, enhancement, compression, print/scan, intentional attacks. Image quality: Web applications, digital cinema, professional photography.]

There are many ways in which a watermark can be inserted into perceptually significant components. But care must be taken to shape the watermark to match the characteristics of the carrier components. A common technique that is used in most robust watermarking algorithms is adaptation of the watermark energy to suit the characteristics of the carrier. This is usually based on certain local statistics of the original image so that the watermark is not visually perceptible.

A number of robust watermarking techniques have been developed during the past few years. Some of them apply the watermark in the spatial domain and some in the frequency domain. Some are additive watermarks, and some use a quantize-and-replace strategy. Some are linear and some are nonlinear. The earliest robust spatial-domain techniques were the MIT patchwork algorithm (8) and another one by Digimarc (9). One of the first and still the most cited frequency-domain techniques was proposed by Cox et al. (10). Some early perceptual watermarking techniques using linear transforms in the transform domain were proposed in (11). Finally, a recent spatial-domain algorithm that is remarkably robust was proposed by Kodak (12–16). Instead of describing these different algorithms independently, we chose to describe Kodak's technique in detail because it clearly identifies the different elements that are needed in a robust watermarking technique.

Kodak’s Watermarking Technique

A spatial watermarking technique based on phase dispersion was developed by Kodak (12–16). The Kodak method is noteworthy for several reasons. First, it can be used to embed either a gray-scale iconic image or binary data. Iconic images include trademarks, corporate logos, or other arbitrary small images; an example is shown in Fig. 10a. Second, the technique can determine cropping coordinates without the need for a separate calibration signal. Furthermore, the strategy that is used to detect rotation and scale can be applied to other watermarking methods in which the watermark is inserted as a periodic pattern in the image domain. Finally, the Kodak algorithm reportedly scored 0.98 using StirMark 3.0 (15). The following is a brief description of the technique. For brevity, only the embedding of binary data is considered.

The binary digits of the message are represented by positive and negative delta functions (corresponding to ones and zeros) that are placed in unique locations within a message image $M$. These locations are specified by a predefined message template $T$, an example of which is shown in Fig. 10b. The size of the message template is typically only a portion of the original image size (e.g., 64 × 64 or 128 × 128). Next, a carrier image $C_K$, which is the same size as the message image, is generated by using a secret key. The carrier image is usually constructed in the Fourier domain by assigning a uniform amplitude and a random phase (produced by a random number generator initialized by the secret key) to each spatial-frequency location. The carrier image is convolved with the message image to produce a dispersed message image, which is then added to the original image. Because the message image is typically smaller than the original image, the original image is partitioned into contiguous nonoverlapping rectangular blocks $X_r$, which are the same size as the message image. The message embedding process creates a block of the watermarked image, $X'_r(x, y)$, according to the following relationship:

$$X'_r(x, y) = \alpha [M(x, y) * C_K(x, y)] + X_r(x, y), \qquad (3)$$

where the symbol $*$ represents cyclic convolution and $\alpha$ is an arbitrary constant chosen to make the embedded message simultaneously invisible and robust to common processing. This process is repeated for every block in the original image, as depicted in Fig. 11. It is clear from Eq. (3) that there are no restrictions on the message image and its pixel values can either be binary or multilevel.

Figure 10. Example of (a) a binary iconic message and (b) a message template.

Figure 11. Schematic of the watermark insertion process. [Diagram: the message (an icon or binary data such as 100101) is convolved with a random-phase carrier generated from a secure key, scaled to a small amplitude, and added to the original image to give the data-embedded image.]

The basic extraction process is straightforward and consists of correlating a watermarked image block with the same carrier image used to embed the message. The extracted message image $M'(x, y)$ is given by

$$M'(x, y) = X'_r(x, y) \otimes C_K(x, y) = \alpha [M(x, y) * C_K(x, y)] \otimes C_K(x, y) + X_r(x, y) \otimes C_K(x, y), \qquad (4)$$

where the symbol $\otimes$ represents cyclic correlation. The correlation of the carrier with itself can be represented by a point-spread function $p(x, y) = C_K(x, y) \otimes C_K(x, y)$, and because the operations of convolution and correlation commute, Eq. (4) reduces to

$$M'(x, y) = \alpha M(x, y) * p(x, y) + X_r(x, y) \otimes C_K(x, y). \qquad (5)$$


The extracted message is a linearly degraded version of the original message plus a low-amplitude noise term resulting from the cross correlation of the original image with the carrier. The original message can be recovered by using any conventional restoration (deblurring) technique such as Wiener filtering. However, for an ideal carrier, $p(x, y)$ is a delta function, and the watermark extraction process results in a scaled version of the message image plus low-amplitude noise. To improve the signal-to-noise ratio of the extraction process, the watermarked image blocks are aligned and summed before the extraction process, as shown in Fig. 12. The summation of the blocks reinforces the watermark component (because it is the same in each block), and the noise component is reduced because the image content typically varies from block to block. To create a system that is robust to cropping, rotation, scaling, and other common image processing tasks such as sharpening, blurring, and compression, many factors need to be considered in designing the carrier and the message template.
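The embedding and extraction relationships in Eqs. (3) to (5) can be tried out with a short NumPy sketch. It is an illustration under assumptions rather than Kodak's implementation: the carrier phase is taken from the FFT of key-seeded white noise (which keeps the carrier real valued), the DC term of the carrier is set to zero so that the mean gray level does not leak into the correlation, and all function names are invented here.

```python
import numpy as np

def make_carrier(key, size):
    """Real carrier with uniform spectral amplitude and key-dependent phase."""
    rng = np.random.default_rng(key)
    spectrum = np.fft.fft2(rng.standard_normal((size, size)))
    spectrum /= np.abs(spectrum)   # uniform amplitude, random phase
    spectrum[0, 0] = 0.0           # assumption: DC-free carrier
    return np.real(np.fft.ifft2(spectrum))

def cyclic_convolve(a, b):
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b)))

def cyclic_correlate(a, b):
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))))

def message_image(bits, template, size):
    """Positive delta for a one and negative delta for a zero."""
    m = np.zeros((size, size))
    for bit, (r, c) in zip(bits, template):
        m[r, c] = 1.0 if bit else -1.0
    return m

def embed(image, bits, template, carrier, alpha):
    """Eq. (3): add alpha * (M * C_K) to every tile of the image."""
    size = carrier.shape[0]
    dispersed = alpha * cyclic_convolve(
        message_image(bits, template, size), carrier)
    ty, tx = image.shape[0] // size, image.shape[1] // size
    marked = image.astype(float).copy()
    marked[:ty * size, :tx * size] += np.tile(dispersed, (ty, tx))
    return marked

def extract(marked, template, carrier):
    """Sum the aligned tiles, correlate with the carrier, read the signs."""
    size = carrier.shape[0]
    ty, tx = marked.shape[0] // size, marked.shape[1] // size
    summed = (marked[:ty * size, :tx * size]
              .reshape(ty, size, tx, size).sum(axis=(0, 2)))
    recovered = cyclic_correlate(summed, carrier)
    return [int(recovered[r, c] > 0) for r, c in template]
```

With this carrier, `cyclic_correlate(carrier, carrier)` is very close to a delta function, so the recovered message is approximately the number of tiles times $\alpha M$ plus the correlation noise from the image. Summing many tiles is what lets a small $\alpha$ survive that noise in this sketch.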

In general, designing the carrier requires considering the visual transparency of the embedded message, the extracted signal quality, and the robustness to image processing operations. For visual transparency, most of the carrier energy should be concentrated in the higher spatial frequencies because the contrast sensitivity function (CSF) of the human visual system falls off rapidly at higher frequencies. However, to improve the extracted signal quality, the autocorrelation function of the carrier, $p(x, y)$, should be as close as possible to a delta function, which implies a flat spectrum. In addition, it is desirable to spread out the carrier energy across all frequencies to improve robustness to both friendly and malicious attacks because the power spectrum of typical imagery falls off with spatial frequency and concentration of the carrier energy in high frequencies would create little frequency overlap between the image and the embedded watermark. This would render the watermark vulnerable to removal by simple low-pass filtering. The actual design of the carrier is a balancing act between these concerns.

The design of an optimal message template is guided by two requirements. The first is to maximize the quality of the extracted signal, which is achieved by placing the message locations maximally apart. The second is that the embedded message must be recoverable from a cropped version of the watermarked image. Consider a case where the watermarked image has been cropped so that the watermark tiles in the cropped image are displaced with respect to the tiles in the original image. It can be shown that the message extracted from the cropped image is a cyclically shifted version of the message extracted from the uncropped image. Because the message template is known, the amount of the shift can be unambiguously determined by ensuring that all of the cyclic shifts of the message template are unique. This can be accomplished by creating a message template that has an autocorrelation equal to a delta function. Although in practice it is impossible for the autocorrelation of the message template to be an ideal delta function, optimization techniques such as simulated annealing can be used to design a message template that has maximum separation and minimum sidelobes.
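To make the cropping argument concrete, the sketch below recovers an unknown cyclic shift by correlating the magnitude of an extracted message with the known template. It is only an illustration: the template here is a random set of locations rather than one optimized by simulated annealing, and the function names are assumptions made for this example.

```python
import numpy as np

def template_mask(template, size):
    """Binary image with a 1 at every message location of the template."""
    mask = np.zeros((size, size))
    for r, c in template:
        mask[r, c] = 1.0
    return mask

def find_cyclic_shift(extracted, template):
    """Return the (rows, cols) cyclic shift that best aligns the template."""
    size = extracted.shape[0]
    mask = template_mask(template, size)
    # Cyclic cross-correlation of |extracted| with the template mask.
    corr = np.real(np.fft.ifft2(
        np.fft.fft2(np.abs(extracted)) * np.conj(np.fft.fft2(mask))))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)

def undo_shift(extracted, shift):
    """Roll the extracted message back to the template's coordinates."""
    return np.roll(extracted, (-shift[0], -shift[1]), axis=(0, 1))
```

The peak of the correlation is unambiguous only when no nonzero cyclic shift of the template overlaps itself completely, which is the uniqueness condition stated above; lower sidelobes leave more margin when the extracted message is noisy.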

The ability to handle rotation and scaling is a fundamental requirement of robust data embedding techniques. Almost all applications that involve printing and scanning result in some degree of scaling and rotation. Many algorithms rely on an additional calibration signal to correct for rotation and scaling, which taxes the information capacity of the embedding system. Instead, the Kodak approach uses the autocorrelation of the watermarked image to determine the rotation and scale parameters, which does not require a separate calibration signal. This method can also be applied to any embedding technique where the embedded image is periodically repeated in tiles. It can also be implemented across local regions to correct for low-order geometric warps.

Figure 12. Schematic of the watermark extraction process. [Diagram: sections of the data-embedded image are summed and correlated with the random-phase carrier generated from the secure key; the extracted message (an icon or binary data such as 100101) is then displayed.]

Figure 13. (a) Example of a watermarked image without rotation and scale transformation and its corresponding autocorrelation. (b) Image in top row after scale and rotational transformation and its corresponding autocorrelation.

To see how this method is applied, consider the autocorrelation function of a watermarked image that has not been rotated or scaled. At zero displacement, there is a large peak due to the image correlation with itself. However, because the embedded message pattern is repeated at each tile, lower magnitude correlation peaks are also expected at regularly spaced horizontal and vertical intervals equal to the tile dimension. Rotation and scaling affect the relative position of these secondary peaks in exactly the same way that they affect the image. By properly detecting these peaks, the exact amount of the rotation and scale can be determined. An example is shown in Fig. 13. Not surprisingly, the energy of the original image is much larger than that of the embedded message, and the autocorrelation of the original image can mask the detection of the periodic peaks. To minimize this problem, the watermarked image needs to be processed before computing the autocorrelation function. Examples of such preprocessing include removing the local mean by a spatially adaptive technique or simple high-pass filtering. In addition, the resulting autocorrelation function is high-pass filtered to amplify the peak values.
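The role of the secondary autocorrelation peaks can be illustrated for the simplest case, a pure change of scale along the horizontal axis. The sketch below is an assumption-laden toy, not the patented detector: local-mean removal uses a cyclic box filter, the peak search is restricted to one row of the autocorrelation, and the `min_lag` and `threshold` parameters are hypothetical tuning values. Detecting rotation would require a two-dimensional peak search, which is not shown.

```python
import numpy as np

def remove_local_mean(image, radius=2):
    """Simple high-pass step: subtract a cyclic box-filtered local mean."""
    local = np.zeros_like(image, dtype=float)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            local += np.roll(image, (dy, dx), axis=(0, 1))
    return image - local / (2 * radius + 1) ** 2

def autocorrelation(image):
    """Cyclic autocorrelation computed through the power spectrum."""
    spectrum = np.fft.fft2(image - image.mean())
    return np.real(np.fft.ifft2(np.abs(spectrum) ** 2))

def horizontal_tile_period(image, min_lag=8, threshold=0.8):
    """Smallest horizontal lag whose peak is close to the strongest one."""
    acf = autocorrelation(remove_local_mean(image.astype(float)))
    row = acf[0, min_lag:image.shape[1] // 2]
    strong = np.flatnonzero(row >= threshold * row.max())
    return int(strong[0]) + min_lag
```

If the embedder is known to use tiles of width 32, an estimated period of 64 indicates that the image has been enlarged by a factor of two, and the image can be rescaled before the message is extracted.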

COMMUNICATION AND INFORMATION THEORETIC ASPECTS

Communication and information theoretic approaches focus mainly on the theoretical analysis of watermarking systems. They deal with abstract mathematical models for watermark encoding, attacks, and decoding. These models enable studying watermarks at a high level without resorting to any specific application (such as image authentication). Therefore, the results obtained by using these techniques are potentially useful in a wide variety of applications by suitably mapping the application to a communication or information theoretic model. The rich set of mathematical models based primarily on the theory of probability and stochastic processes allows rigorous study of watermarking techniques; however, a common complaint from practitioners suggests that some of these popular mathematical theories are not completely valid in practice. Therefore, studying watermarks based on communication and information theory is an ongoing process where theories are proposed and refined based on feedback from engineering applications of watermarks.

In this section, we describe some communication and information theoretic aspects of digital watermarking. First, we describe the similarities and differences between classical communication and current watermarking systems. Once this is established, it becomes easier to adapt the theory of communications to watermarking and make theoretical predictions about the performance of a watermarking system. Following this discussion, we describe some information theoretic models applied to watermarking.

Watermarking as Communication

Standard techniques from communication theory can be adapted to study and improve the performance of watermarking algorithms (17). Figure 14 shows an example of a communication system where the information bits are first encoded to suit the modulation type, error control, etc., followed by the modulation of a carrier signal to transmit this information across a noisy channel. At the decoder, the carrier is demodulated, and the information bits (possibly corrupted due to channel noise) are decoded. Figure 15 shows the counterpart system for digital watermarking. The modulator in Fig. 14 has been replaced by the watermark embedder that places the watermark in the media content. The channel noise has been replaced by the distortions of the watermarked media induced by either malicious attacks or by signal processing operations such as compression/decompression, cropping, filtering, and scaling. The embedded watermark is extracted by the watermark decoder or detector. However, note that a major difference between the two models exists on the encoder side. In communication systems, encoding is done to protect the information bits from channel distortion, but in watermarking, emphasis is usually placed on techniques that minimize perceptual distortions of the watermarked content.

Some analogies between the traditional communication system and the watermarking system are summarized in Table 1. We note from this table that the theory and algorithms developed for studying digital communication systems may be directly applicable to studying some aspects of watermarking. Note that though these two systems have common requirements, such as power and reliability constraints, these requirements may be motivated by different factors. For example, the power constraint in a communication channel is imposed from a cost perspective, whereas in watermarking it is motivated by perceptual issues.

Figure 14. Communication system model. [Diagram: information, encoder, modulator, addition of transmission noise, demodulator, decoder, output information.]

Figure 15. Watermarking as a communication system. [Diagram: watermark, encoder, watermark embedder (which also takes the media to be watermarked), addition of induced distortion, watermark extractor, decoder, output watermark.]

Table 1. Analogies Between Communication and Watermarking System

| Communication System | Watermarking System |
| Information | Watermark |
| Communication channel | Host signal (such as image, video) |
| Power constraint on transmitted signal due to physical limitations | Power constraint on watermark due to audio/visual quality limitations |
| Interference | Host signal and watermark attacks |
| Side information at transmitter and/or receiver | Knowledge of host signal, watermarking parameters such as key at the encoder and/or decoder |
| Channel capacity | Watermarking capacity |

Information Theoretic Analysis

Information theoretic methods have been successfully applied to information storage and transmission (18). Here, messages and channels are modeled probabilistically, and their properties are studied analytically. A great amount of effort during the past five decades has produced many interesting results regarding the capacity of various channels, that is, the maximum amount of information that can be transmitted through a channel so that decoding this information with an arbitrarily small probability of error is possible. Using the analogy between communication and watermarking channels, it is possible to compute fundamental information-carrying capacity limits of watermarking channels using information theoretic analysis. In this context, the following two important questions arise:

• What is the maximum length (in bits) of a watermark message that can be embedded and distinguished reliably in a host signal?

• How do we design watermarking algorithms that can effectively achieve this maximum?

Answers to these questions can be found based on certain assumptions (19–28). We usually begin by assuming probability models for the watermark signal, host signal, and the random watermark key. A distortion constraint is then placed on the watermark encoder. This constraint is used to model and control the perceptual distortion induced by watermark insertion. For example, in image or video watermarking, the distortion metric could be based on human visual perceptual criteria. Based on the application, the watermark encoder can use a suitable distortion metric and a value for this metric that must be met during encoding. A watermark attacker has a similar distortion constraint, so that the attack does not result in a completely corrupted watermarked signal that is useless to all parties concerned. The information that is known to the encoder, attacker, and the decoder is incorporated into the mathematical model through joint probability distributions. Then, the watermarking capacity is given by the maximum rate of reliably embedding the watermark over any possible watermarking strategy and any attack that satisfies the specified constraints. This problem can also be formulated as a stochastic game where the players are the watermark encoder and the attacker (29). The common payoff function of this game is the mutual information between the random variables representing the input and the received watermark.

Now, we discuss the details of the mathematical formulation described before. Let a watermark (or message) $W \in \mathcal{W}$ be communicated to the decoder. This watermark is embedded in a length-$N$ sequence $X^N = (X_1, X_2, \ldots, X_N)$ representing the host signal. Let the watermark key known both to the encoder and the decoder be $K^N = (K_1, K_2, \ldots, K_N)$. Then, using $W$, $X^N$, and $K^N$, a watermarked signal ${X'}^N = (X'_1, X'_2, \ldots, X'_N)$ is obtained by the encoder. For instance, in transform-based image watermarking, each $X_i$ could represent a block of 8 × 8 discrete cosine transform coefficients, $W^N$ could be the spread spectrum watermark (10), and $K^N$ could be the locations of the transform coefficients where the watermark is embedded. Therefore, $N = 4096$ for a 512 × 512 image, because the image contains $(512/8)^2 = 4096$ such blocks. Usually, it is assumed that the elements of $X^N$ are independent and identically distributed (i.i.d.) random variables whose probability mass function is $p(x)$, $x \in \mathcal{X}$. Similarly, the elements of $K^N$ are i.i.d., and their probability mass function is $p(k)$, $k \in \mathcal{K}$. If $X$ and $K$ denote generic random variables in the random vectors $X^N$ and $K^N$, respectively, then any dependence between $X$ and $K$ is modeled by the joint probability mass function $p(x, k)$. Usually, it is assumed that $W$ is independent of $(X, K)$. Then, a length-$N$ watermarking code that has distortion $D_1$ is a triple $(\mathcal{W}, f_N, \phi_N)$, where $\mathcal{W}$ is a set of messages whose elements are uniformly distributed, $f_N$ is the encoder mapping, and $\phi_N$ is the decoder mapping, which satisfy the following (25):

• The encoder mapping ${x'}^N = f_N(x^N, w, k^N) \in \mathcal{X}^N$ is such that the expected value of the distortion satisfies $E[d_N(X^N, {X'}^N)] \le D_1$.

• The decoder mapping is given by $\hat{w} = \phi_N(y^N, k^N) \in \mathcal{W}$, where $y^N$ is the received watermarked signal.

The attack channel is modeled as a sequence of conditional probability mass functions $A_N(y^N \mid x^N)$ such that $E[d_N(X^N, Y^N)] \le D_2$. Throughout, it is assumed that

$$d_N(x^N, y^N) = \frac{1}{N} \sum_{j=1}^{N} d(x_j, y_j),$$

where $d$ is a bounded, nonnegative, real-valued distortion function. A watermarking rate $R = \frac{1}{N} \log |\mathcal{W}|$ is said to be achievable for $(D_1, D_2)$ if there exists a sequence of watermarking codes $(\mathcal{W}, f_N, \phi_N)$ subject to distortion $D_1$ that have respective rates $R_N > R$ such that the probability of error

$$P_e = \frac{1}{|\mathcal{W}|} \sum_{w \in \mathcal{W}} \Pr(\hat{w} \neq w \mid W = w) \to 0 \quad \text{as } N \to \infty$$

for any attack subject to $D_2$. Then, the watermarking capacity $C(D_1, D_2)$ is defined as the maximum (or supremum, in general) of all achievable rates for given $D_1$ and $D_2$. This information theoretic framework has been successfully used to compute the watermarking capacity of a wide variety of channels. We discuss a few of them next.

When $N = 1$ in the information theoretic model, we obtain a single letter channel. Consider the single letter, discrete-time, additive channel model shown in Fig. 16. In this model, the message $W$ is corrupted by additive noise $J$. Suppose that $E(W) = E(J) = 0$; then, the watermark power is given by $E(W^2) = \sigma_W^2$, and the channel noise power is $E(J^2) = \sigma_J^2$. If $W$ and $J$ are Gaussian distributed, then it can be shown that the watermarking capacity is given by $\frac{1}{2} \ln(1 + \sigma_W^2 / \sigma_J^2)$ (28). For the Gaussian channel, a surprising result has also been found recently (25). Let $\mathcal{W} = \mathbb{R}$ be the space of the watermark signal and $d(w, y) = (w - y)^2$ be the squared-error distortion measure. If $X \sim \text{Gaussian}(0, \sigma_x^2)$, then the capacities of the blind and nonblind watermarking systems are equal! This means that, irrespective of whether or not the original signal is available at the decoder, the watermarking rate remains the same.
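As a small worked example of the Gaussian capacity expression above, the following sketch evaluates $\frac{1}{2} \ln(1 + \sigma_W^2 / \sigma_J^2)$ for given powers. The function names are chosen for this illustration; the result is in nats per sample because the natural logarithm is used, and dividing by $\ln 2$ converts it to bits.

```python
import math

def gaussian_watermark_capacity(watermark_power, noise_power):
    """Capacity in nats per sample: 0.5 * ln(1 + sigma_W^2 / sigma_J^2)."""
    if watermark_power < 0 or noise_power <= 0:
        raise ValueError("powers must satisfy sigma_W^2 >= 0 and sigma_J^2 > 0")
    return 0.5 * math.log1p(watermark_power / noise_power)

def nats_to_bits(nats):
    """Convert a rate expressed in nats to bits."""
    return nats / math.log(2)
```

For equal watermark and noise power, the capacity is $\frac{1}{2} \ln 2$ nats, that is, half a bit per sample; doubling the watermark power from there adds less than another half bit, which reflects the logarithmic growth of the expression.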

Watermark capacity has received considerable attention for the case where the host signal undergoes specific processing/attacks that can be modeled using well-known probability distributions. It is also a popular assumption that the type of attack which the watermark signal undergoes is completely known at the receiver and is usually modeled as additive noise. But, in reality, it is not guaranteed that an attack is known at the receiver, and it need not be only additive; for example, scaling and rotational attacks are not additive. Therefore, a more general mathematical model, as shown in Fig. 17, is required to improve the capacity estimates for many nonadditive attack scenarios (20). In Fig. 17, we see that a random multiplicative component is also introduced to model an attack.

Figure 16. Discrete-time additive channel noise model. [Diagram: the noise J is added to the watermark W to give the output Y.]

Using the model in Fig. 17, where $G_d$ and $G_r$, respectively, denote the deterministic and random components of the multiplicative channel noise attack, it has been shown (20) that a traditional additive channel model such as that shown in Fig. 16 tends either to over- or underestimate the watermarking capacity, depending on the type of attack. A precise estimate for the loss in capacity due to the uncertainty about the channel attack at the decoder can be computed by using this model. Extensions of this result to multiple watermarks in a host signal show that, to improve capacity, a specific watermark decoder has to cancel the effect of the interfering watermarks rather than treating them as known or unknown interference. It has also been observed (20) that an unbounded increase in watermark energy does not necessarily produce unbounded capacity. These results give us intuitive ideas for optimizing the capacity of watermarking systems.

Computations of information theoretic watermarking capacity do not tell us how to approach this capacity effectively. To address this important problem, a new set of techniques is required. Approaches such as quantization index modulation (QIM) (23) address some of these issues. QIM deals with characterizing the inherent trade-offs among embedding rate, embedding-induced degradation, and robustness of embedding methods. Here, the watermark embedding function is viewed as an ensemble of functions indexed by $w$ that satisfies the following property:

$$x \approx x' \quad \forall w. \qquad (6)$$

It is clear that robustness can be achieved if the ranges of these functions are sufficiently separated from each other. If not, identifying the embedded message uniquely even in the absence of any attacks will not be possible. Equation (6) and the nonoverlapping ranges of the embedding functions suggest that the range of the embedding functions must cover the range space of $x'$ and the functions must be discontinuous. QIM embeds information by first modulating an index or a sequence of indexes by using the embedding information and then quantizing the host signal by using an associated quantizer or a sequence of quantizers. We explain this by an example. Consider the case where one bit is to be embedded, that is, $w \in \{0, 1\}$. Thus two quantizers are required, and their corresponding reconstruction points in $\mathbb{R}^N$ must be well separated to inherit robustness to attacks. If $w = 1$, the host signal is quantized by the first quantizer. Otherwise, it is quantized by the second quantizer. Therefore, we see that the quantizer reconstruction points also act as constellation points that carry information. Thus, QIM design can be interpreted as the joint design of an ensemble of source codes and channel codes. The number of quantizers determines the embedding rate. It is observed that QIM structures are optimal for memoryless watermark channels when energy constraints are placed on the encoder. As we can see, a fundamental principle behind QIM is the attempt to trade off embedding rate optimally for robustness.

Figure 17. Multiplicative and additive watermarking channel noise model. [Diagram relating the watermark W, the multiplicative components Gd and Gr, the additive noise J, and the output Y.]
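The one-bit QIM example can be sketched with two uniform scalar quantizers whose reconstruction points are offset by half a step. This is a minimal illustration under assumptions, not the construction analyzed in (23): the choice of uniform quantizers, the step size, and the function names are made up for this example.

```python
import numpy as np

def qim_embed(host, bit, step):
    """Quantize every host sample with the quantizer selected by the bit.

    Bit 1 uses reconstruction points at multiples of step (first quantizer);
    bit 0 uses the same lattice shifted by step / 2 (second quantizer).
    """
    offset = 0.0 if bit == 1 else step / 2.0
    return step * np.round((host - offset) / step) + offset

def qim_decode(received, step):
    """Minimum-distance decoding: pick the quantizer closest to the signal."""
    d1 = np.sum((received - qim_embed(received, 1, step)) ** 2)
    d0 = np.sum((received - qim_embed(received, 0, step)) ** 2)
    return 1 if d1 < d0 else 0
```

In this sketch the embedding distortion per sample is at most `step / 2`, and decoding is correct whenever the perturbation of every sample stays below `step / 4`. A larger step therefore buys robustness at the cost of more embedding-induced degradation, which is the trade-off discussed above.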

As discussed in previous sections, many popular watermarking schemes are based on signal transforms such as the discrete cosine transform and wavelet transform. The transform coefficients play the role of carriers of watermarks. Naturally, different transforms possess widely varying characteristics. Therefore, a natural question to ask is, what is the effect of the choice of transforms on the watermarking capacity? Note that good energy compacting transforms such as the discrete cosine transform produce transform coefficients that have unbalanced statistical variances. This property, it is observed, enhances watermarking capacity in some cases (26). Results such as these could help us in designing high-capacity watermarking techniques that are compatible with transform-based image compression standards such as JPEG2000 and MPEG-4.

To summarize, communication and information theoretic approaches provide valuable mathematical tools for analyzing watermarking techniques. They make it possible to predict or estimate the theoretical performance of a watermarking algorithm independently of the underlying application. But the practical utility of these models and analyses has been questioned by application engineers. Therefore, it is important that watermarking theoreticians and practitioners interact with each other through a constructive feedback mechanism to improve the development and implementation of state-of-the-art digital watermarking systems.

CONCLUSIONS

Digital watermarking is a rapidly evolving area of research and development. We discussed only the key problems in this area and presented some known solutions. One key research problem that we still face today is the development of truly robust, transparent, and secure watermarking techniques for different digital media, including images, video, and audio. Another key problem is the development of semifragile authentication techniques. The solution to these problems will require applying known results and developing new results in the fields of information and coding theory, adaptive signal processing, game theory, statistical decision theory, and cryptography.

Although significant progress has already been made, there still remain many open issues that need attention before this area becomes mature. This chapter has provided only a snapshot of the current state of the art. For details, the reader is referred to the survey articles (30–43) that deal with various important topics and techniques in digital watermarking. We hope that these references will be of use to both novices and experts in the field.

BIBLIOGRAPHY

1. M. Maes et al., IEEE Signal Process. Mag. 17(5), 47–57 (2000).

2. P. Wong and N. Memon, IEEE Trans. Image Process. (in press).

3. S. Bhattacharjee, Proc. Int. Conf. Image Process., Chicago, Oct. 1998.

4. D. Kundur and D. Hatzinakos, Proc. IEEE, Special Issue Identification Protection Multimedia Inf. 87(7), 1,167–1,180 (1999).

5. C. Y. Lin and S. F. Chang, SPIE Storage and Retrieval of Image/Video Databases, San Jose, January 1998.

6. R. Radhakrishnan and N. Memon, Proc. Int. Conf. Image Process., 971–974, Thessaloniki, Greece, Oct. 2001.

7. J. Fridrich, Proc. Int. Conf. Image Process., Chicago, Oct. 1998.

8. W. Bender, D. Gruhl, N. Morimoto, and A. Lu, IBM Syst. J. 35(3–4), 313–336 (1996).

9. Digimarc Corporation. http://www.digimarc.com.

10. I. J. Cox, J. Kilian, T. Leighton, and T. Shamoon, IEEE Trans. Image Process. 6(12), 1,673–1,687 (1997).

11. R. B. Wolfgang, C. I. Podilchuk, and E. J. Delp, Proc. IEEE 87(7), 1,108–1,126 (1999).

12. Method and apparatus for hiding one image or pattern within another, US Pat. 5,905,819, 1999, S. J. Daly.

13. Method for embedding digital information in an image, US Pat. 5,859,920, 1999, S. J. Daly et al.

14. Method for detecting rotation and magnification in images, US Pat. 5,835,639, 1998, C. W. Honsinger and S. J. Daly.

15. C. Honsinger, IS&T PICS 2000, Portland, March 2000, pp. 264–268; C. W. Honsinger and M. Rabbani, Int. Conf. Inf. Tech.: Coding Comput., March 2000.

16. Method for generating an improved carrier for the data embedding problem, US Pat. 6,044,156, 2000, C. W. Honsinger and M. Rabbani.

17. I. J. Cox, M. L. Miller, and A. L. McKellips, Proc. IEEE 87, 1,127–1,141 (1999).

18. C. E. Shannon, Bell Syst. Tech. J. 27, 379–423 (1948).

19. C. Cachin, Proc. 2nd Workshop Inf. Hiding, 1998.

20. R. Chandramouli, Proc. SPIE Security and Watermarking of Multimedia Contents III, 2001.

21. R. Chandramouli, Proc. SPIE Multimedia Syst. Appl. IV, Aug. 2001, p. 4,518.

22. B. Chen and G. W. Wornell, IEEE 2nd Workshop Multimedia Signal Process., 1998, pp. 273–278.

23. B. Chen and G. W. Wornell, IEEE Int. Conf. Multimedia Comput. Syst. 1, 13–18 (1999).

24. B. Chen and G. W. Wornell, IEEE Int. Conf. Acoust. Speech Signal Process. 4, 2,061–2,064 (1999).

25. P. Moulin and M. K. Mihcak, http://www.ifp.uiuc.edu/~moulin/paper.html, June 2001.

26. M. Ramkumar and A. N. Akansu, IEEE 2nd Workshop Multimedia Signal Process., Dec. 1998, pp. 267–272.

27. M. Ramkumar and A. N. Akansu, SPIE Multimedia Syst. Appl. 3528, 482–492 (1998).

28. S. D. Servetto, C. I. Podilchuk, and K. Ramachandran, IEEE Int. Conf. Image Process. 1, 445–448 (1998).

29. A. Cohen and A. Lapidoth, Proc. Int. Symp. Inf. Theory, June 2000, p. 48.

30. P. Jessop, Int. Conf. Acoust. Speech Signal Process. 80, 2,077–2,080.

31. F. Mintzer and G. W. Braudaway, Int. Conf. Acoust. Speech Signal Process. 80, 2,067–2,070.

32. M. Holliman, N. Memon, and M. M. Yeung, SPIE Security and Watermarking of Multimedia Contents, Jan. 1999, pp. 134–146.

33. F. Hartung, J. K. Su, and B. Girod, SPIE Security and Watermarking of Multimedia Contents, 1999, pp. 147–158.

34. J. Dittmann et al., SPIE Security and Watermarking of Multimedia Contents, 1999, pp. 171–182.

35. J. Fridrich and M. Goljan, SPIE Security and Watermarking of Multimedia Contents, 1999, pp. 214–225.

36. M. Kutter and F. A. P. Petitcolas, SPIE Security and Watermarking of Multimedia Contents, 1999, pp. 226–239.

37. Special Issue, Proc. IEEE 87(7) (1999).

38. W. Zhu, Z. Xiong, and Y. Q. Zhang, IEEE Trans. Circuits Syst. Video Technol. 9(4), 545–550 (1999).

39. Proc. Int. Workshop Inf. Hiding.

40. Special Issue, IEEE J. Selected Areas Commun. (May 1998).

41. M. D. Swanson, M. Kobayashi, and A. H. Tewfik, Proc. IEEE 86(6), 1,064–1,087 (1998).

42. G. C. Langelaar et al., IEEE Signal Process. Mag. 17(5), 20–46 (2000).

43. C. Podilchuk and E. Delp, IEEE Signal Process. Mag. 18(4), 33–46 (2001).

DISPLAY CHARACTERIZATION

DAVID H. BRAINARD

University of Pennsylvania, Philadelphia, PA

DENIS G. PELLI

New York University, New York, NY

TOM ROBSON

Cambridge Research Systems, Ltd., Rochester, Kent, UK

INTRODUCTION

This article describes the characterization and use of computer-controlled displays.1 Most imaging devices are now computer controlled, and this makes it possible for the computer to take into account the properties of the imaging device to achieve the intended image. We emphasize CRT (cathode ray tube) monitors and begin with the standard model of CRT imaging. We then show how this model may be used to render the desired visual image accurately from its numerical representation. We discuss the domain of validity of the standard CRT model. The model makes several assumptions about monitor performance that are usually valid but can fail for certain images and CRTs. We explain how to detect such failures and how to cope with them.

1 The display literature often distinguishes between calibration and characterization (e.g., 1–3): calibration refers to the process of adjusting a device to a desired configuration, and characterization refers to modeling the device and measuring its properties to allow accurate rendering. We adopt this nomenclature here.

Here we address primarily users who will be doing accurate imaging on a CRT. Inexpensive color CRT monitors can provide spatial and temporal resolutions of at least 1024 × 768 pixels and 85 Hz, and the emitted intensity is almost perfectly independent of viewing angle. CRTs are very well suited for accurate rendering. Our treatment of LCDs (liquid crystal displays) is brief, in part because this technology is changing very rapidly and in part because the strong dependence of emitted light on viewing angle in current LCD displays is a great obstacle to accurate rendering. Plasma displays seem more promising in this regard.

We present all the steps of a basic characterization that will suffice for most readers and cite the literature for the fancier wrinkles that some readers may need, so that all readers may render their images accurately. The treatment emphasizes accuracy both in color and in space. Our standard of accuracy is visual equivalence: substituting the desired for the actual stimulus would not affect the observer.2 We review the display characteristics that need to be taken into account to present an arbitrary spatiotemporal image accurately, that is, luminance and chromaticity as a function of space and time. We also treat a number of topics of interest to the vision scientist who requires precise control of a displayed stimulus.

The International Color Consortium (ICC, http://www.color.org/) has published a standard file format (4) for storing “profile” information about any imaging device.3 It is becoming routine to use such profiles to achieve accurate imaging (e.g., by using the popular Photoshop program).4 The widespread support for profiles allows most users to achieve characterization and correction without needing to understand the underlying characteristics of the imaging device. ICC monitor profiles use the standard CRT model presented in this article. For applications where the standard CRT model and instrumentation designed for the mass market are sufficiently accurate, users can simply buy a characterization package consisting of a program

2 The International Color Consortium (4) calls this “absolute colorimetric” rendering intent, which they distinguish from their default “perceptual” rendering intent. Their “perceptual” intent specifies that “the full gamut of the image is compressed or expanded to fill the gamut of the destination device. Gray balance is preserved but colorimetric accuracy might not be preserved.”

3 On Apple computers, ICC profiles are called “ColorSync” profiles because the ICC standard was based on ColorSync. Free C source code is available to read and write ICC profiles (5,6).

4 Photoshop is a trademark of Adobe Systems Inc.
