val oreste tico zip f

Upload: ken-matsuda

Post on 03-Jun-2018


  • 8/12/2019 Val Oreste Tico Zip f


    The Aesthetic Value of

    Zipfian Visual Art

    Chris Emmery

    ANR: 609661

    Thesis submitted in partial fulfillment

    of the requirements for the degree of

    Bachelor of Arts in Communication and Information Sciences,

    Bachelor Track Human Aspects of Information Technology,

    at the School of Humanities

    of Tilburg University

    Thesis committee:

    dr. J.J. Paijmans
    dr. M.M. van Zaanen

    Tilburg University
    School of Humanities
    Department of Communication and Information Sciences
    Tilburg center for Cognition and Communication (TiCC)

    Tilburg, The Netherlands

    January 30, 2013


    Acknowledgements

    There are several people I would like to thank who have made this thesis possible the way it has turned out. First and foremost is Paai (or Dr. Hans Paijmans), who has been a source of great ideas, critical remarks and urging glances down the hallway and the HAIT lab's doorway on the 3rd floor of the Dante building. His everlasting patience, will to keep me motivated, and quick responses, as well as availability for feedback whenever I entered his room, have proven invaluable in finishing this thesis. Second is of course my lovely girlfriend, no need for explanation there. Special thanks go out to Lieke Walet for helping me with the statistical analysis and Marnix de Gier for thoroughly proofreading my thesis. I would also like to express my thanks to my former fellow student, now Ph.D., Nanne van Noord, who has given critical remarks on different approaches for this thesis, and Dr. Menno van Zaanen for reading my thesis and completing the thesis committee. Last but not least, Vincent Lichtenberg has accompanied me for a full year in the lab, making it the best workplace I could have wished for. Again, thank you all.


    Contents

    1 Introduction

    2 Explaining Zipf's Law
        2.1 G.K. Zipf
        2.2 Zipf's Law
        2.3 Modifications
            2.3.1 Yule-Simon Distribution
            2.3.2 Zipf-Mandelbrot Law
            2.3.3 Considering Zipf
        2.4 Critical Views

    3 Previous research
        3.1 Zipf & Language
            3.1.1 Natural Language
            3.1.2 Artificial Text
        3.2 Zipf & Music

    4 Aesthetics of Visual Art & Zipf's Law
        4.1 Aesthetics of Zipf's Law
        4.2 Measuring Aesthetics
        4.3 Golden Ratio
        4.4 Research question
        4.5 Experimental setup
            4.5.1 Data
            4.5.2 Experiment

    5 Results

    6 Discussion & Conclusion

    A Pseudo-code

    B Figures


    List of Figures

    2.1 Logscale distribution of Ulysses and The Art of War
    2.2 Frequency-rank distribution of Ulysses and The Art of War
    2.3 Probabilities of splitting leaves within Yule's distribution

    3.1 Frequency-rank distribution of Logscale Random Language coded with respectively PHP and Python
    3.2 Logplot of the Letter Frequency Based Random Language
    3.3 The drunkard's walk
    3.4 Logplot of the Markov-generated languages
    3.5 Zipf's analysis of Mozart's Bassoon Concerto
    3.6 Frequency-rank distribution of Beethoven's Piano Sonata No. 29, Mvmt. 1 and Chopin's Etude Op. 10, No. 12

    4.1 Fractal set Sierpinski Sieve increasing in fractal dimension
    4.2 Golden Ratio
    4.3 Images from the dataset using the parameter values TZC (left) and TRC (right)
    4.4 Images from the dataset using the parameter values SZC (left) and SRC (right)
    4.5 The artificial Zipf distribution generated with Python
    4.6 The screen shown at the introduction of the experiment
    4.7 The screen shown during the ranking method
    4.8 The screen shown during the side-by-side comparison
    4.9 The screen shown during the Likert-based evaluation

    5.1 Canvasses taken from folder 1
    5.2 Canvasses taken from folder 17
    5.3 Graph of total positive judgements for each folder sorted on different P Distribution

    B.1 Canvasses taken from folder 26
    B.2 Canvasses taken from folder 26


    List of Tables

    2.1 Word Frequencies of Ulysses by James Joyce
    2.2 Word Frequencies of the Art of War

    5.1 Demonstration of the case transformations from the original ranges
    5.2 Chi-squared test scores for the IGBOD set
    5.3 Chi-squared test scores for the Amount, State and Form sets
    5.4 Three folders comparing chaos against different methods of radii calibration
    5.5 Total occurrences of the Zipfian method in different categories in relation with experience and gender
    5.6 Descriptives of the 7 point Likert scales judging the Zipfian canvasses in different variations
    5.7 Comparison between the scores for the experimental chi-square test and that of the Likert scales


    Chapter 1

    Introduction

    Zipf & Art

    Art and mathematics have a long historical relationship and have been an important part of our society throughout the centuries. A development in the vein of this relationship is generative art: an art form which is characterized by works that are partially, or entirely, created by a non-human, autonomous system. This refers predominantly, but not exclusively, to algorithmically determined computer-generated artwork. Examples of this art form include Wolfgang Amadeus Mozart's Musikalisches Würfelspiel and fractal art such as the Mandelbrot set. However, only a limited amount of research has been conducted as to how we, as humans, perceive different methods or algorithms within this art form in an aesthetic sense. This connection between mathematics and aesthetics dates back to times where quantitative expressions of proportions and beauty such as the golden ratio were studied by, inter alia, Aristotle, Euclid, Plato and Pythagoras. An illustration of this connection is given by Galen: "Beauty does not consist in the elements, but in the harmonious proportion of the parts" (as cited in Manaris et al., 2005). Schroeder (1992, p. 122) observes that "For a work of art to be pleasing and interesting, it should neither be too regular and predictable nor pack too many surprises." Many man-made and naturally occurring phenomena have been shown to exhibit Zipf's law, an experimental law that describes the occurrence of a power law distribution within these phenomena. Due to its natural occurrence, induced by entropy [1], this regularity appears to be an appropriate method to be utilized in this thesis research.

    [1] A very extensive explanation of entropy can be found on Scholarpedia: http://www.scholarpedia.org/article/Entropy

    The experiment

    Previous research, primarily in music and fractals, gave rise to the idea that the Zipfian distribution might have a special aesthetic value due to its occurrence in nature. Utilizing the relationship between mathematics and visual art was then a logical decision for combining aesthetics and Zipf's law. Mainly due to the absence of any previous research with this approach, the experiment allowed for an insightful and promising first step in this direction. By discussing and using different aspects from the important research fields where Zipf's law is analysed and combining this with previous research in aesthetics, the experimental setup was designed. Accordingly, computer-generated art was used as an effective method to test different distributions. The experiment aims to answer the question of whether the Zipfian distribution can be regarded as having a relatively higher aesthetic value than the other distributions that will be tested in the experiment. Further research might allow for more complex methods of research, hopefully based on the choices and implications considered in this thesis.

    Methodology

    The experiment for this thesis used random, golden ratio and Zipfian distributions in combination with a program that generates computer art based on these distributions. The dataset consisted of 108 canvasses that were used in different combinations in a test of subjective aesthetic measure. To allow this, a suitable method for determining preference between the images needed to be found, which the results of the literature research determined. Once this was done, a web environment was set up for participants to conduct the experiment. Afterwards, the appropriate statistical test was used to determine if there were any significant results.

    Thesis outline

    In Chapter two, Zipf's law will be extensively discussed by explaining the law itself and the adaptations of this law by Simon, Mandelbrot and Yule; the chapter concludes by considering existing discussion. Moreover, it will become evident why Zipf's Law is preferred for the research conducted here. Chapter three will then discuss findings in the field of language, one of the more prominent fields in research regarding Zipf's law. In this chapter the focus lies on the algorithms behind different methods of generating random language and there will be a brief discussion of the most relevant field of research for this thesis: Zipf and music. Chapter four links aesthetics of visual art to the Zipf research and will consider the golden ratio in terms of aesthetic value as well as application in this thesis. Wrapping up the theoretical framework, the research question will be formulated and the experimental setup will be explained. Chapter five will discuss the results of this experiment, followed by the discussion and conclusion in Chapter six.


    Terms

    Each time the word "program", "code", or "snippet" is used, this refers to a Python [2] program, written specifically for this thesis. There is also an important difference between the terms "figure" and "image". When "Figure" is written with a capital letter, it refers to an example; when written without one, "figure" refers to simple geometric forms. The term "image" will always refer to the image in which these figures can be found. Whether or not the images used for this experiment can be seen as art will be left for the experts to judge. However, throughout this thesis "art" will refer to the images in general, and "canvasses" to them specifically.

    [2] An explanation of the high-level programming language Python can be found on Python.org: http://www.python.org/

    Chapter 2

    Explaining Zipf's Law

    In this chapter the focus lies on both Zipf's law and the modifications the law has brought forth. Zipf's law has been used as an important tool for statistical analysis within a variety of research fields and thus will form an important framework throughout the thesis. As the law has only scarcely been applied to visual art, this thesis will attempt to link it to aesthetic measure. Understanding Zipf's law requires background information about its discovery, application and place within this thesis, which this chapter will provide.

    2.1 G.K. Zipf

    George Kingsley Zipf (1902-1950) was an American linguist and philologist working mainly with Chinese languages and demographics, studying statistical occurrences, and is the eponym of Zipf's Law (Zipf, 1935, 1949). With this focus on demographics Zipf hoped to find a fundamental principle that could explain different sorts of human behaviour. And that he did: by focusing on language and manually analysing the frequencies of the 29,899 different word types associated with 260,430 word tokens in Ulysses by James Joyce, Zipf found and discussed a pattern in the distribution of linguistic units, such as words and phonemes. He illustrated the existence of the rank-frequency law by indicating that, when the frequency of a word within a given text is multiplied by its rank based on this frequency, the result is approximately the same number for every word. This distribution, the Zipfian distribution, "refers to a distribution of probabilities of occurrence that follows Zipf's law" (Tullo and Hurford, 2010, p. 1), where the term "law" refers to the strength of the empirical observation that has been tested in various languages (Balasubrahmanyan & Naranan, 1996). The law was, aside from language, also found by Zipf in city sizes and salaries. These observations made by Zipf are in accordance with his Least Effort Principle of Human Behaviour (Zipf, 1949), which states that humans, animals and even machines will predominantly opt for the path of least effort.
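The rank-frequency observation described above can be checked on any text with a few lines of Python. The sketch below is not the code used for this thesis; the function name `rank_frequency` and the simple letters-only tokenisation are assumptions made for illustration.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, frequency) tuples, most frequent word first."""
    # Lowercase the text and treat runs of letters/apostrophes as word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # most_common() sorts by descending frequency; rank 1 is the top word.
    return [(rank, word, freq)
            for rank, (word, freq) in enumerate(counts.most_common(), start=1)]


sample = "the cat and the dog and the bird"
for rank, word, freq in rank_frequency(sample):
    # For a long natural text, rank * freq should stay roughly constant.
    print(rank, word, freq, rank * freq)
```

A sample this short cannot show the regularity itself; it only illustrates how the ranks and frequencies are obtained before the product is inspected on a full text.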


    2.2 Zipf's Law

    Zipf's law is a power law, or scaling law: a mathematical relationship between two quantities in which the frequency of an event varies as a power of some attribute of that event. This implies that small occurrences are tremendously common and large instances extremely rare. Zipf's law can be written as follows:

    \[ f(r) \approx \frac{C}{r^{\alpha}} \qquad (2.1) \]

    where f is the frequency of occurrence of an event, r is its statistical rank (position in an ordered list), C is a constant and α is close to 1. Or, plainly said, the number of occurrences of a word is approximately inversely proportional to its rank. In a Zipfian distribution the most frequent word is indicated by r = 1, the second most frequent word by r = 2, etcetera. When every r is drawn against every f on logarithmic scales, a very clear Zipf curve can be obtained in the form of a straight line with a slope of -1.

    This might seem to be an abstract concept. However, there are concrete examples where the law takes a more noticeable form. Zipf's law can perhaps best be illustrated by looking at language, one of the leading research fields where the law is applied, which will be discussed in detail further into this thesis. Our languages use, for economic reasons, predominantly short words. One can imagine that in absence of this system we would spend a tremendous amount of our time putting simple phrases into words. Not only would this be time-consuming, it would also mean that we are able to retain fewer words in our working memory (Baddeley, 1999), as we remember shorter words more easily (Sigurd, Eeg-Olofsson, & Van Weijer, 2004). The law suggests that in a given natural text the 10th most frequent word is expected to occur approximately 10 times more frequently than the 100th most frequent word (Zanette, 2004).
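Two consequences of the law stated above follow directly from the formula. Assuming Equation 2.1 has the form f(r) ≈ C / r^α, taking logarithms gives the straight line seen on a log-log plot, and dividing the expected frequencies at ranks 10 and 100 gives the factor of ten:

```latex
\begin{align*}
\log f(r) &\approx \log C - \alpha \log r
  && \text{(a line with slope } -\alpha \text{ in log-log coordinates)} \\
\frac{f(10)}{f(100)} &\approx \frac{C / 10^{\alpha}}{C / 100^{\alpha}}
  = 10^{\alpha} \approx 10
  && \text{(for } \alpha \approx 1\text{)}
\end{align*}
```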

    Table 2.1: Word Frequencies of Ulysses by James Joyce

    Rank (r)   Frequency (f)   Word
           1           15106   the
           5            5042   to
          15            1962   for
          25            1134   she
          50             562   if
         100             276   father
         200             137   joe
         500              52   shillings
        1000              27   tall

    Table 2.2: Word Frequencies of the Art of War

    Rank (r)   Frequency (f)   Word
           1           11884   the
           5            8928   in
          15             948   at
          25             721   are
          50             361   more
         100             178   being
         200              84   maneuver
         500              36   bridges
        1000              17   portugal


    Figure 2.1: Logscale distribution of Ulysses and The Art of War

    Figure 2.2: Frequency-rank distribution of Ulysses and The Art of War

    This can be demonstrated with simple lines of code. In the example above, two different texts are compared at equal ranks. On the one hand we have Ulysses, the book used by Zipf for his analyses; on the other we have The Art of War. If we compare the words and the frequencies with which these words occur within these texts, we see striking similarities, as well as the clear Zipfian curve. When log(f) is drawn against log(r) in a graph, we can see a clear straight line with a slope of -1.
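As a minimal sketch of such a demonstration (not the thesis's own code), the slope of the log-log line can be estimated with an ordinary least-squares fit. The rank and frequency values are the nine rows listed in Table 2.1; the function name `loglog_slope` is hypothetical.

```python
import math

# Rank and frequency values copied from Table 2.1 (Ulysses).
RANKS = [1, 5, 15, 25, 50, 100, 200, 500, 1000]
FREQS = [15106, 5042, 1962, 1134, 562, 276, 137, 52, 27]


def loglog_slope(ranks, freqs):
    """Least-squares slope of log10(f) against log10(r)."""
    xs = [math.log10(r) for r in ranks]
    ys = [math.log10(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope = covariance(x, y) / variance(x).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


slope = loglog_slope(RANKS, FREQS)
print(f"estimated slope: {slope:.2f}")
```

With only nine sampled ranks the estimate is rough, but it should land close to the slope of -1 that Zipf's law predicts.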


    If we consider the figure above, the leaves can be seen as words, implying that new words are introduced with a small, though constant probability. This means that older words are reused with a probability equal to the one they had earlier in the creation process (Kornai, 2002).

    2.3.2 Zipf-Mandelbrot Law

    Mandelbrot (1953) suggested an optimization-based derivation of Zipf's law for heavy-tailed distributions (Mitzenmacher, 2004). Mandelbrot's optimization principle assumes that ideally we would like to process a lot of information with small words. The main issue with Zipf's law was that, when plotting the log frequencies against the log ranks, the middle-ranged ones showed the strongest linear relationship. Mandelbrot therefore proposed his own modification:

    \[ f(r) \approx \frac{C}{(r + q)^{\alpha}} \qquad (2.3) \]

    where, in addition to the original law, q is added. When q = 0 and α ≈ 1, the formula equals Zipf's law; changing the parameters can be used to fit the curve to the data. Despite the apparent flexibility and therefore increased utility of Mandelbrot's derivation, Ferrer-i Cancho and Solé (2001b) point out that it does not effectively handle the higher ranks in the distribution.
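The effect of the extra parameter can be explored numerically. The sketch below assumes that Equation 2.3 has the usual Zipf-Mandelbrot form f(r) ≈ C / (r + q)^α; the function names and the example values C = 1000 and q = 2.7 are illustrative only.

```python
def zipf(rank, c=1.0, alpha=1.0):
    """Zipf's law: expected frequency C / r**alpha."""
    return c / rank ** alpha


def zipf_mandelbrot(rank, c=1.0, q=0.0, alpha=1.0):
    """Zipf-Mandelbrot law: expected frequency C / (r + q)**alpha."""
    return c / (rank + q) ** alpha


# With q = 0 the two functions agree for every rank.
for r in (1, 10, 100):
    print(r, zipf(r, c=1000), zipf_mandelbrot(r, c=1000, q=0.0))

# A positive q lowers the expected frequency of the top ranks far more
# than that of the later ranks, which flattens the head of the curve.
print(zipf_mandelbrot(1, c=1000, q=2.7), zipf_mandelbrot(100, c=1000, q=2.7))
```

Fitting q and α to real counts would require an optimisation step that is left out here.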

    2.3.3 Considering Zipf

    In this thesis Zipf's Law will be used to generate visual art. The reason for choosing Zipf above the existing derivations and alternatives lies in the applications of said versions. The Yule-Simon Distribution has predominantly attracted interest in economics (Mitzenmacher, 2004), whereas the Zipf-Mandelbrot Law is used, as stated, for heavy-tailed distributions. As we are dealing with a small amount of generated numbers, this law will not benefit the outcome. Moreover, as will become apparent later in this thesis, in analysing aesthetics and art, Zipf's Law is most often the law of choice.

    2.4 Critical Views

    Many different empirical relationships in complex systems have a similar frequency-rank relationship. This includes urban systems, business firms, wealth distributions, river networks, earthquake energy cascades, animals, fractals and routes from bifurcation to chaos, which are all hierarchies that follow a Zipfian distribution (Chen, 2011). However, simple unstructured systems have also been found to display Zipf-law-like distributions (Gunther et al., 1996). Zipf's law is therefore not a theoretical law; as it describes an occurrence rather than making a prediction, it is an experimental law, or empirical observation (Tullo & Hurford, 2010). Some researchers accordingly claim that Zipf's law is an inevitability (Kello et al., 2010) and therefore uninteresting (Li, 1992; Mandelbrot, 1967). Most often, it is stated that when a purely random sequence of letters or symbols is generated, Zipf's law is satisfied with an exponent close to 1 (Cohen, Mantegna, & Havlin, 1996). Because of the observation that a purely random sequence of letters gives a Zipf plot, it is stated that Zipf's law does not say anything meaningful about the linguistic nature of the text. Or rather, it does not prove that the text has a coherent structure. This might be the most discussed topic within the research dedicated to the law, and has seen both proponents (Li, 1992) and opponents (Ferrer-i Cancho & Solé, 2002). While discussing previous research in the next chapter, illustrations and remarks on this topic will be given.

  • 8/12/2019 Val Oreste Tico Zip f

    15/53

    Chapter 3

    Previous research

    In this chapter the most important research fields where Zipf's law is found will be discussed. The chapter aims to familiarize the reader with said fields and to give a clear view of the history behind the law, leading up to the main subject of this thesis. First and foremost, the field of language will be discussed, where extensive research has been conducted spanning a variety of authors and scopes. After discussing both natural language and artificial text, we will look into an application perhaps closer to visual art: that of music. Throughout the chapter, examples of code and illustrations of the distributions for different applications will be given.

    3.1 Zipf & Language

    The language we have built over the centuries allows for an infinite set of words (Sigurd, 2004) from a limited set of elements. For one to understand this fundamental product of our prominent position in evolution (Smith & Szathmáry, 1997), Ferrer-i Cancho and Solé (2001a, p. 1) propose that "a complete theory of language requires a theoretical understanding of its implicit statistical regularities". As organisation has been found to play a key role, the Principle of Least Effort and Zipf's Law have been the best-known methods of research within the field of language, predominantly Quantitative Linguistics (Ferrer-i Cancho & Solé, 2003). The law has been found to hold true for English, French, Italian, Spanish, etc., as well as extinct languages and the hoax-seeming Voynich manuscript (Landini, 2001), but not for characters in Chinese, Japanese or Korean (Lu, Zhang, & Zhou, 2012). Moreover, analyses of the bee dance (Paijmans, 2004) and dolphin whistles (Ferrer-i Cancho & McCowan, 2012) suggest that it can also be found in animal communication. As research in the field of language turns out to be quite voluminous, this section will be limited to prominent research in the field of natural language and artificial text.


    3.1.1 Natural Language

    In natural language, Zipf's principle is based on the interaction between speaker and hearer, where the speaker benefits the most from a simplistic lexicon and the hearer from a single word for each meaning. For illustration, the simplest utterance of a random vowel would be most efficient effort-wise to a speaker. However, a receiver benefits most from hearing as much as possible. This constant balancing of efforts is what defines the Principle of Least Effort and is why we tend to use short words more often. Our natural, written language is constructed from symbols, which makes it susceptible to statistical analysis. Considering this, it is no surprise that there has been extensive research in this field. The individual characters as well as the combinations of these symbols forming morphemes, words or phrases have all been tested for their frequency distributions in written texts. Moreover, the properties of these combinations, such as word length and meaning, have also been subject to these tests (Zipf, 1945, 1949). An illustration of these properties will be given in the random language part.

    3.1.2 Artificial Text

    A distinct branch of Zipf-oriented language research is that of artificial text. Here, artificial text should be separated from artificial language, which is defined as being deliberately designed by one person or a small group in a short period of time. Hence, the term artificial text refers to text that has been designed for purposes other than communication. Examples of these include, in descending order of randomness: monkeys-and-typewriters language, frequency based language and the Markov chain (Cohen, 1996; Ferrer-i Cancho & Solé, 2002; Huelsenbeck, Ronquist, et al., 2001; Li, 1992; Miller, Freund, & Johnson, 1965). As stated in Section 2.4, there is an enduring disagreement about whether or not Zipf's Law should be seen as a null hypothesis for language, as it has allegedly been found that the law shows in a random combination of characters forming a language. A description and consideration of the previously mentioned variants in regard to this dispute will be given below:

    Monkeys and Typewriters Language

    Suppose we take a team of live monkeys and place them in a room with a couple of old-fashioned typewriters. The monkeys, being oblivious to the workings of our language, let alone the typewriters, will over time start bashing on the keys. Provided they do not destroy the equipment first, they will generate a purely random string of characters [1].

    [1] "We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true." Stephen Fry

    As experiments with live monkeys might not be the most effective way to produce such a text, we turn to programming for an algorithm. Taking each letter in the alphabet, together with a whitespace, as the choices for each position in a text, and randomly [2] choosing from this list to fill that position will generate random language (Example 1), although "language" is a misnomer here.

    Example 1. ondbfiblgiswekrpp ebfbe jqwww hxpdppmdwqzyyfl wyhatt
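A minimal Python sketch of this procedure is given here. It is not Algorithm 1 from the appendix; the 27-symbol alphabet, the function names and the text length of 100,000 characters are assumptions for illustration.

```python
import random
import string
from collections import Counter

# Assumed alphabet: 26 lowercase letters plus one space as word separator.
ALPHABET = string.ascii_lowercase + " "


def monkey_text(length, seed=None):
    """Draw `length` characters uniformly at random from ALPHABET."""
    rng = random.Random(seed)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def word_frequencies(text):
    """Count the whitespace-separated 'words' of a generated text."""
    return Counter(text.split())


text = monkey_text(100_000, seed=1)
# Print the five most frequent 'words' together with their ranks.
for rank, (word, freq) in enumerate(word_frequencies(text).most_common(5), 1):
    print(rank, repr(word), freq)
```

Plotting the logarithm of each frequency against the logarithm of its rank then gives a curve that can be compared with the one in Figure 3.1.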

    The algorithm for such a text will look similar to the snippet in the appendix (Algorithm 1). For illustration, two different high-level languages were chosen: on the one hand we have Python, a system-based programming language, and on the other we have PHP, which runs from a server. After counting the frequency of each generated word, a graph comparing both languages could be plotted (Figure 3.1):

    Figure 3.1: Frequency-rank distribution of Logscale Random Language codedwith respectively PHP and Python

    After observing the graph it can be stated that this random language does not exhibit Zipf's Law. However, if we look at the validity of the language, it could be put forth that, in terms of word length and character frequency, it does represent our own language as well (Li, 1992). A closer look at the graph shows that there are a couple of dents within the line. These exist because the program is prone to generating a lot of one- and two-letter words, but subsequently generates very long and random strings of characters. Our own natural languages do not use long words as frequently, or words as long, as random language does. Hence, Ferrer-i Cancho and Solé (2002) state that monkey language might be too simplistic and therefore not realistic. They therefore propose the letter frequency based random language as an alternative.

    [2] Please note that the method used for random choice here is pseudo-random due to the nature of programming languages. It uses statistical randomness that is generated by an entirely deterministic causal process. It therefore appears to be random; however, it is not truly random.
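The deterministic nature of pseudo-randomness is easy to observe in Python's standard `random` module: re-seeding the generator with the same value reproduces the same "random" choices. The seed value 2013 and the range 0 to 26 (one index per letter plus the space) are arbitrary choices for this sketch.

```python
import random

random.seed(2013)          # fix the generator's internal state
first = [random.randint(0, 26) for _ in range(5)]

random.seed(2013)          # same seed, same internal state
second = [random.randint(0, 26) for _ in range(5)]

# Both lists hold the same five numbers: the sequence is fully determined
# by the seed, even though each list looks random on its own.
print(first == second)
```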

    Letter Frequency Based Random Language

    Generating a random text that looks similar to our natural texts requires pieces of information distilled from our own languages. This might be, for example, combinations of letters, length probabilities, grammatical rules, etcetera. However, each piece will result in the language becoming less random [3] and therefore partly morphing into a distribution that will be more likely to satisfy Zipf's Law. An example of this operation can be given by generating a language that is based on letter frequencies within an existing natural text, as suggested by Ferrer-i Cancho and Solé (2002). Taking The Adventures of Tom Sawyer by Mark Twain, all character frequencies will have to be counted and their totals turned into a probability of occurrence. Based on these probabilities it is then possible to make a weighted choice when generating a random text as described above, giving a more realistic outcome in the process (Example 2).

    Example 2. uer irtar ylhie f h hi t cse cro msohig eae fesdi t
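The weighted choice described above could be sketched as follows. This is an illustration rather than Algorithm 2 from the appendix: the function names are hypothetical, and a single short sentence stands in for the full novel as the source of the letter counts.

```python
import random
from collections import Counter


def letter_model(source_text):
    """Return (symbols, weights) counted from letters and spaces in a text."""
    cleaned = [ch for ch in source_text.lower() if ch.isalpha() or ch == " "]
    counts = Counter(cleaned)
    symbols = sorted(counts)
    weights = [counts[s] for s in symbols]
    return symbols, weights


def weighted_text(source_text, length, seed=None):
    """Generate text whose characters follow the source's letter frequencies."""
    symbols, weights = letter_model(source_text)
    rng = random.Random(seed)
    # random.choices draws with replacement, proportionally to the weights.
    return "".join(rng.choices(symbols, weights=weights, k=length))


# Placeholder source text; a full book would be read from a file instead.
source = "the quick brown fox jumps over the lazy dog"
print(weighted_text(source, 60, seed=7))
```

Because the space character is counted like any other symbol, the generated word lengths follow the source text as well, which is what moves the output away from the pure monkey text.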

    The algorithm for this piece of code is included in the appendix (Algorithm 2). This text will subsequently yield the graph below (Figure 3.2). It can be observed that part of this distribution satisfies the Zipfian curve; however, there is an offset in the top part of the curve. More noteworthy is that just by using the feature of the letter frequencies, the curve clearly differs from what we have seen in the random language (Figure 3.1).

    Figure 3.2: Logplot of the Letter Frequency Based Random Language

    [3] A random language is based on a completely random combination of letters, whereas artificial, or constructed, languages are consciously devised by an individual or group, instead of having evolved naturally.


    Markov Chain

    Andrey Markov (1856-1922) is most famous for his work on stochastic, or random, processes. The most widely known system from his research is the Markov chain, owing to its many applications to real-world processes. Examples of these applications include the drunkard's walk, where for each step the walk position changes by +1 or -1 (Figure 3.3), and n-grams in natural language processing.

    Figure 3.3: The drunkard's walk
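The drunkard's walk is simple enough to simulate directly. In this sketch (the function name and the seed are assumptions), each step depends only on the current position, not on the path that led to it.

```python
import random


def drunkards_walk(steps, seed=None):
    """Return the positions of a +1/-1 random walk starting at 0."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(steps):
        # Move one unit left or right with equal probability.
        position += rng.choice((-1, 1))
        path.append(position)
    return path


print(drunkards_walk(10, seed=0))
```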

    The Markov chain can be described as a collection of random variables {X_n} (where the index n runs through 0, 1, ...) having the property that, given the present, the future is conditionally independent of the past (Papoulis 1984, p. 532). This memoryless property is called the Markov property. Formally:

    \[ \Pr(X_{n+1} = x \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \Pr(X_{n+1} = x \mid X_n = x_n) \qquad (3.1) \]

    The Markov chain can be used to build an artificial, pseudo-random text, consisting of the words analysed by the algorithm. The algorithm is described in a short piece of pseudo-code included in the appendix (Algorithm 3). For each step, the algorithm runs over a text, taking the two most recently scanned words as the current state. The code then determines the next word based on this present state only: the next word is randomly chosen from a statistical model of potential suffixes from the available corpora. The output will look similar to the following (Example 3):

    Example 3. You'll sit on the rocks, and leaves you there to be told in words of his life, like you, it is so noble-minded and full of curiosity or gratitude, but of a faded walnut tinge enveloped him;
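A word-level chain of this kind could be sketched in Python as follows. This is not Algorithm 3 from the appendix: the function names, the tiny stand-in corpus and the handling of dead ends are assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_model(words, order=2):
    """Map each run of `order` consecutive words to the words that follow it."""
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model


def generate(model, length, seed=None):
    """Walk the chain: pick each next word from the current state's followers."""
    rng = random.Random(seed)
    state = rng.choice(sorted(model))      # random starting state
    output = list(state)
    while len(output) < length:
        followers = model.get(state)
        if not followers:                  # dead end: state only ends the corpus
            break
        next_word = rng.choice(followers)
        output.append(next_word)
        state = state[1:] + (next_word,)   # slide the window by one word
    return " ".join(output)


corpus = "the cat sat on the mat and the cat ran to the mat".split()
print(generate(build_model(corpus), 12, seed=3))
```

Storing every follower, duplicates included, means that a plain uniform choice from the list already reproduces the follower frequencies of the corpus. Passing a list of characters instead of words gives the character-level variant.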


    Alternatively, the exact same algorithm can be used to work on characters, or groups of characters. Taking a scope of a given number of characters as the state, the code chooses the group of characters that follows, creating random gibberish (Example 4):

    Example 4. quly inlowe m howsho thocoter she wout wof we ded se re mocksdo she haboup tone rlimarereeded ano t hectesto won ony hout out d t knotoftaime lo deno onofto butowsedouit dy

    If we analyse the log plot of this randomly generated language, we can clearly see that it does satisfy a Zipfian curve (Figure 3.4).

    Figure 3.4: Logplot of the Markov-generated languages
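Whether a generated text satisfies a Zipfian curve can be checked numerically by fitting a straight line to the log plot. The sketch below is one possible way to do so; the helper names and the toy token list are assumptions, and an ordinary least-squares fit is used for simplicity rather than any method named in the thesis.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, frequency) pairs, with rank 1 for the most frequent token."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))

def loglog_slope(pairs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Toy data: counts 60, 30, 20, 15, 12 are exactly proportional to 1/rank.
tokens = []
for word, count in zip("abcde", (60, 30, 20, 15, 12)):
    tokens.extend([word] * count)
print(round(loglog_slope(rank_frequency(tokens)), 3))
```

A slope close to -1 corresponds to the ideal Zipf-line; the further the fitted slope is from -1, the less Zipfian the distribution.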

    In conclusion, different methods of generating a random distribution have been discussed. These are, from more to less random: Monkeys and Typewriters language, Frequency based language, Markov letters and Markov words. As randomness decreases, the different methods increasingly satisfy a Zipfian curve. It was therefore determined that the Monkeys and Typewriters method would be most suitable for simulating randomness.


    3.2 Zipf & Music

    From language, a logical link can be made to music, as these forms of structured human communication are believed to share some neural mechanisms (Maess et al., 2001). Zipf already speculated about this link when first presenting his theory. As he believed the notes of music were, similar to words, part of the structure of mind, he expected them to have a distribution similar to that of a text. Zipf (1949) then initiated music research linked to his law with his evaluation of the intervals between the repetition of notes from, amongst others, Mozart's Bassoon Concerto in B-flat, Chopin's Etude in F minor, Op. 25, No. 2, Irving Berlin's Doing What Comes Naturally, and Jerome Kern's Who. He found a Zipf-line that was mostly in accordance with a slope of -1 and concluded that, similar to language, long units were more scarce than short ones. Following this research, Voss and Clarke (1975, 1978), Boroda and Polikarpov (1988) and Manaris et al. (2003) tried to identify the possible entities of music. A closer look at Zipf (1949) gives us the following graph (Figure 3.5):

    Figure 3.5: Zipf's analysis of Mozart's Bassoon Concerto

    As can be observed, the frequency of the intervals generally follows the slope. However, if this method of analysing, for example, note or interval frequencies is used on other musical pieces, it can be observed more clearly that the result does not represent an ideal Zipf-line. To show this, the intervals for Beethoven's Piano Sonata No. 29, Mvmt. 1 and Chopin's Etude Op. 10, No. 12 (as used by Zanette (2006)) were extracted and plotted in a frequency-rank distribution. The results can be seen in Figure 3.6. It must be noted that the methods used by Zipf are not described in detail and that the pieces he used could not be found in the MusicXML format used for analysis. For both pieces, only low interval steps were taken, as was suggested by Zipf (1949). We can see that Beethoven's piece represents the ideal Zipf-line more than Chopin's. However, for both pieces the top as well as the tail deviate too much from the slope of -1 to call it a true Zipfian distribution.


    Figure 3.6: Frequency-rank distribution of Beethoven's Piano Sonata No. 29, Mvmt. 1 and Chopin's Etude Op. 10, No. 12

    Music cannot be interpreted the way language can. It does not, for example, contain the same functional semantics as language does; it cannot tell anything about concepts that are not musically related. For example, music cannot describe a room or the weather. Context, similar to linguistic context formed by interacting words, was therefore believed to be absent within parts of music. Moreover, words can easily be identified by their collection of characters, separated by spaces and punctuation marks. Hence, the link between music and language is considered useless by some researchers (Davies, 1994).

    There are, nevertheless, methods of distilling and interpreting the information that music carries as a whole. Zanette (2006, p. 11) proposes that context can be conceptually related to a quantitative property of literary corpora, enunciated by Zipf's law, whose validity in a musical corpus can be investigated by objective means. In this paper, proof of compatibility of note usage with Simon's model is given, along with the underlying mechanism within the context that both music and language create. In his research, he took compositions from Bach, Debussy, Mozart and Schönberg and characterized individual notes based on their pitch and duration. The pieces by the first three composers were found to fit Simon's model, from which Zanette concluded that context is present in both music and language.


    Chapter 4

    Aesthetics of Visual Art & Zipf's Law

    This chapter will familiarize the reader with the link between art and mathematics, with the focus on aesthetics. Different methods of measuring aesthetics will be given, followed by the method of choice for the conducted research in this thesis. As Zipf's Law has been found in visual as well as auditory art, the limited but important amount of research in this field will be discussed and connected to motivations for this thesis. An important mathematical concept within art is the golden ratio, which is known for its aesthetic pleasantness and is therefore discussed here for comparison with a Zipfian distribution. Finally, both the research question and the experimental set-up will be stated.

    4.1 Aesthetics of Zipf's Law

    The link between art and mathematics is based on studies from long before our time. Aristotle, Euclid, Plato and Pythagoras, to name a few, were already interested in quantitative expressions of proportions and beauty, such as the golden ratio. In this vein, Arnheim (1974) linked entropy and art by elucidating the fact that artists subconsciously have a tendency for balancing chaos and randomness, where higher entropy increases disorganization. Schroeder (1992) proposes that the power spectrum of the function should therefore be balanced between brown (1/f^2) and white (1/f^0) noise, where f is frequency.

    Fractals Benoît Mandelbrot (1982) was the first to use the term fractal to describe geometric patterns in nature. Fractals are infinitely self-similar, meaning that a fractal image as a whole is made up of smaller, identical fractals, which are in turn constructed from identical fractals, and so forth. So, if one were to zoom in on fractal art, the structure would not change, as the pattern repeats infinitely in an equally detailed manner. The fractal dimension, or parameter D, can be used to quantify the fractal character of such an image. As the parameter ranges from 1.0, a smooth line, to 2.0, where the area is completely filled, the extremes contain no fractal structure; the complexity of the image increases with D (Figure 4.1).

    Figure 4.1: Fractal set Sierpinski Sieve increasing in fractal dimension
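For exactly self-similar shapes, one common way to obtain D is the similarity dimension, D = log(N) / log(s), where a shape consists of N copies of itself, each scaled down by a factor s. The sketch below assumes this definition (the thesis does not state which definition it uses) and applies it to the two extremes and to the Sierpinski sieve.

```python
import math

def similarity_dimension(copies, scale):
    """D = log(copies) / log(scale) for a shape built from `copies` pieces, each 1/scale in size."""
    return math.log(copies) / math.log(scale)

print(similarity_dimension(2, 2))  # line segment: 2 copies at half size
print(similarity_dimension(3, 2))  # Sierpinski sieve: 3 copies at half size
print(similarity_dimension(4, 2))  # filled square: 4 copies at half size
```

The line and the filled square land on the extremes 1.0 and 2.0, while the Sierpinski sieve falls in between.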

    Fractals can be found in both natural forms such as clouds, waves, snowflakes and trees, as well as human-made works such as those of Jackson Pollock (Taylor, Micolich, & Jonas, 1999).

    Considering the subject of this thesis, research has, as stated before, been limited. However, Spehar et al. (2003) provide a particularly relevant result. Their paper is based on the idea that, if it were possible to determine the fractal character with the highest aesthetic preference, this could be useful in understanding how our perception and preference are shaped by the things we see every day. Accordingly, their research showed an aesthetic preference for fractals with a distribution between 1/f^1.3 and 1/f^1.5. Moreover, Spehar et al. tested the preference for computer generated images built from small dots, varying in density for each image, albeit without a noticeable outcome. Surprisingly, the golden ratio (which will be discussed later in this thesis) has also been shown to have a connection with the Zipf-Mandelbrot distribution of 1/f^1.4 (Livio, 2003).

    Music When analysing music, there is a broad range of entities that can be taken as attributes, including pitch, duration and harmonic as well as melodic intervals. Manaris, Purewal, and McCormick (2002) used these attributes in order to determine which of them make for an aesthetically pleasing piece of music (Manaris et al., 2003). The results suggest the possibility of algorithmically identifying and classifying these aspects. Moreover, Manaris postulated the generation of socially sanctioned, or aesthetically pleasant, music through a genetic algorithm.

    Manaris et al. (2007) continued this research by creating a tool for said music generation. Using the music features based on Zipf's law from their 2003 paper, popularity for 992 pieces was predicted and incorporated into a genetic-programming system called NEvMuse. The fitness function was formed by aesthetic judgement from Artificial Art Critics (AACs), a method for objectively determining aesthetic value using a genetic algorithm (Machado et al., 2003). Then, Bach's Invention #13 in A minor was used to generate aesthetically pleasing variations. Most importantly, these judgements were successfully compared to emotional responses from 23 human subjects, resulting in high correlations. Their research bodes well for the research in this thesis. Not only does it show that Zipf's Law can be used to generate an aesthetically pleasant art form, it may also offer a framework on which the method for research of visual art can be built.

    4.2 Measuring Aesthetics

    Aesthetics, from the Greek aisthesis, refers to the philosophical study of art and beauty or, scientifically, the study of sensory and sensori-emotional values. Cawthon and Moere (2007, p. 1) note that an anaesthetic is used to dull or deaden, causing sleepiness and numbness. In contrast, aesthetic is seen as something that enlivens or invigorates both body and mind, awakening the senses. The existence of an objective method to define beauty is still a topic of debate; however, fundamental improvements have been made since Plato's objectivist view (Di Dio, Macaluso, & Rizzolatti, 2007). Plato believed that proportion, harmony and unity within parts of an object determined its perceived beauty, whereas Aristotle distinguished the elements order, symmetry and definiteness.

    Objective Aesthetics It was the mathematician George D. Birkhoff (1884-1944) who proposed a model where perception was combined with order and complexity to objectively determine the aesthetic effect of objects such as tiles and vases. The model proposes that complexity (C) forms the effort of focusing the viewer's attention on the object, from which the aesthetic measure (M) is formed by a feeling of value. Harmony, symmetry and order (O) influence M (Staudek et al., 1999), as high complexity without order deflects contemplation. Therefore:

    \[ M = \frac{O}{C} \tag{4.1} \]

    Staudek et al. found that Birkhoff's method of aesthetic measure is applicable in (computer-aided) design and could be improved by extending the existing method with partial order increment values. However, Birkhoff's measure is focused on single objects and will most likely pose a problem in dealing with the complex intertwining of multiple figures on one canvas.

    A recent partial solution to this problem was proposed by Klinger and Salingaros (2011), who developed a technique for autonomously judging simple patterns such as symbols or pixels. Taking L, equal to (C) in Birkhoff's model, and adding C based on the randomness of the sequence of symbols used in the patterns, symbol variety and symmetry calculations were combined. This allows for distinguishing between organized and disordered complexity and thus determining if one would generally see an array of simple patterns as exciting (high aesthetic value) or distressing (low aesthetic value). Although not perfect, measures like these can be improved and developed into an Artificial Art Critic (which will be discussed in the next section) and could pose an important next step in ongoing research.


    Subjective Aesthetics Aside from objective aesthetic measures such as Birkhoff's measure, a more frequently used way to determine aesthetic judgement is simply by asking humans. Although humans have a subjective view on aesthetics based on influences from their culture, personality and social background, several approaches have been developed in order to determine this subjective aesthetic measure. Spehar et al. (2003) use Cohn's method of paired comparison (Cohn, 1894) to estimate the value judgement of fractals. In the vein of this remarkably early model, another method for determining subjective measure is proposed by Cattell, Glascock, and Washburn (1918). The approach of the paper was to rank thirty-six pictures based on preference and to combine rank-difference correlations for all individual pictures, which resulted in a universal ranking for the set. The combination of these measures will form the method for determining aesthetic value in this thesis.

    4.3 Golden Ratio

    The golden ratio, which is also referred to as the golden or divine section, has fascinated human minds throughout the centuries and has been observed in many natural as well as artificial ratios in, amongst others, anatomy, architecture, music and visual art (Doczi, 2005; Hagenmaier, 1963). These ratios are most often found when measuring the distances within simple geometric figures. The golden section splits one segment into two parts, where the ratio of the sum of these two quantities to the major part equals the ratio of the major to the minor part (Figure 4.2).

    Figure 4.2: Golden Ratio

    The proportion of these quantities can be defined as x/y = y/(x+y) or x/y = (y-x)/x, where x = 1 and y ≈ 1.618. In this formula y, or the golden number φ (phi), can be defined as follows:

    \[ \varphi = \frac{1 + \sqrt{5}}{2} = 1.61803398874989\ldots \tag{4.2} \]

    It must be noted that in any number system, φ will have an infinite number of digits (Sen & Agarwal, 2008).
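The two proportions can be verified numerically. The following sketch is illustrative only; the names `PHI` and `golden_split` are chosen here and do not appear in the thesis.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden number of equation 4.2

def golden_split(length):
    """Split a segment into a major and a minor part whose ratio is PHI."""
    major = length / PHI
    minor = length - major
    return major, minor

# Whole segment x + y with x = 1 (minor part) and y = PHI (major part).
y, x = golden_split(1 + PHI)
print(x, y)
print(x / y, y / (x + y), (y - x) / x)  # the three ratios should coincide
```

All three printed ratios agree up to floating-point rounding, which is the defining property of the golden section.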


    The golden ratio has been related to aesthetics, albeit without consensus. Ever since the initial study on the golden rectangle by Fechner (1897), research has been divided between the interpretation that it holds special aesthetic significance on the one hand, and the conclusion that it is not a special proportion on the other (Boselie, 1984). Examples of recent studies using the golden rectangle show that population preferences are weak (McManus, Cook, & Hunt, 2010) and that the compactness of figures seems to determine their attractiveness, not the golden ratio itself (Friedenberg, 2012). However, a recent study by Green (2012, p. 121) has also resulted in refuting the most powerful mathematical arguments against interpreting Golden Section findings by Fischler (1981) & Fowler (1993).

    Thus, it must be concluded that whether or not the golden ratio holds any special aesthetic significance is still open for debate. It is however proven to play an important part in measuring aesthetic judgement for simple, geometric figures in different forms and is therefore of value to the research of this thesis. By comparing quantities within the golden ratio with quantities that follow a Zipfian distribution, the golden ratio can function as a naturally occurring counterpart. Combining this with a completely random distribution as described in 3.1.2, complete chaos is also accounted for.

    Having determined which distributions can be used for the experiment, the research question, dataset and experimental set-up can now be designed.

    4.4 Research question

    Seeing the small amount of research conducted within the field of visual art connected to Zipf's law, it is an interesting step to connect aesthetic judgement to simple geometric figures generated within the Zipf, golden ratio and random distribution. The experimental question can therefore be formulated in a straightforward manner: To what extent can it be determined if Zipf's Law has a higher aesthetic value than the random or golden ratio distributions, within the posed method of measurement? To answer this question, Zipf's law will be compared to the random distribution, representing complete chaos, and the golden ratio, which is generally regarded as having a high aesthetic value. However, due to the research on the golden ratio being inconclusive, it can also be seen as an orderly distribution. Limiting this research to just random and Zipfian distributions would not prove of much value, as it is highly probable that participants would prefer order over chaos. Using the golden ratio distribution, it is possible to hypothesize that the Zipfian distribution will be regarded as having a higher aesthetic value than both the random and the golden ratio distributions. The argument for this is that, if the Zipfian distribution is preferred over random, aesthetic value is the most probable cause. Subsequently, if the golden ratio is preferred over random, we can assume that the method used is able to recognize aesthetic preferences.


    4.5 Experimental setup

    4.5.1 Data

    Computer generated artwork formed the dataset for this experiment. The main challenge in preparing this was that the constraints as well as the variations are set by hand. This implies that the parameters put into this set can influence the participants and should therefore remain as simple as possible. The program used in this generation process, which will be referred to as IGBOD (Image Generator Based On Distributions), can therefore be described in quite a straightforward manner. What follows is a brief description of the parameters, methods for generating the images and the distributions used by IGBOD, as well as the choices made during its development and their possible implications for the experiment.

    Parameters To generate an image, IGBOD requires a value for each of these parameters:

    • Amount: [25], [50] or [75] figures
    • State: [T]ransparent or [S]olid (Figures 4.3 and 4.4)
    • Distribution: [R]andom (3.1.2), [Z]ipfian (2.2) or [G]olden ratio (4.3)
    • Form: [C]ircles or s[Q]uares

    To simplify reading, the abbreviations noted between square brackets will hereafter be used to refer to the parameter values (and combinations of these) used in this program. For example, using a Zipfian distribution will become using Z, and a canvas with 50 transparent, random circles will simply be referred to as a 50TRC canvas. The order of the combinations will always be in accordance with the list above. Finally, when specifically referring to a parameter, the word will be accompanied by a P (e.g. P Form and P State).

    Dimension Methods The dimensions for the different figures were determined by the quantities within three different P Distributions: R, Z and G. When IGBOD produced a value for R, the random integer function was used, which returns a pseudo-random value as described in Example 3.1. The Z dimensions were created with the numpy.random.zipf function¹, which generates values that follow an artificial Zipf distribution (Figure 4.5)². Dimensions in G were also generated with a function using pseudo-random numbers. The base total length was determined randomly and then split up into the longest (length / 1.62) and smallest part (length - longest). This implies that every C or Q in this distribution has a sibling that is generated somewhere else on the canvas, while keeping the P Amount of figures equal to that of the other variations.
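The three dimension methods might be sketched as below. This is not IGBOD's source code: the function names, the maximum size of 60 and the exponent are assumptions. To keep the sketch free of dependencies, the Z method uses a bounded stand-in built on the standard library instead of numpy.random.zipf, which samples the unbounded distribution.

```python
import random

def random_dimensions(n, max_size, rng):
    """R: uniformly distributed pseudo-random integer sizes."""
    return [rng.randint(1, max_size) for _ in range(n)]

def zipf_dimensions(n, max_size, rng, exponent=2.0):
    """Z: sizes k drawn with probability proportional to 1 / k**exponent.

    Bounded stand-in for numpy.random.zipf (assumption, see lead-in).
    """
    sizes = list(range(1, max_size + 1))
    weights = [1 / k ** exponent for k in sizes]
    return rng.choices(sizes, weights=weights, k=n)

def golden_dimensions(n, max_size, rng):
    """G: random base lengths split into a longest and a smallest part."""
    dims = []
    while len(dims) < n:
        length = rng.randint(2, max_size)
        longest = length / 1.62           # divisor as given in the text
        dims.extend([longest, length - longest])
    return dims[:n]                       # for odd n the last sibling is dropped

rng = random.Random(42)
for name, make in (("R", random_dimensions), ("Z", zipf_dimensions), ("G", golden_dimensions)):
    print(name, make(6, 60, rng))
```

How IGBOD itself handled an odd P Amount in the G distribution is not described; dropping the last sibling is merely one way to keep the number of figures equal across distributions.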

    1 For specifications of this function, please refer to the NumPy documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.zipf.html

    2 Please note that this does not represent the Zipfian distribution as well as the examples discussed before, due to the limited number of figures.

    Images An empty, white image (500×600 pixels) was used by IGBOD for the ImageDraw³ module to populate with a given P Amount of figures. For each figure, this module generates a bounding box (a box limiting the dimensions of the figure) using the formula:

    \[ (x_0 - r,\; y_0 - r),\; (x_1 + r,\; y_1 + r) \tag{4.3} \]

    where the x and y coordinates were always chosen randomly and r, the dimension, was given by the different P Distributions. The figure could then be further manipulated by stating both its P Form and P State. Subsequently, IGBOD was used to generate a total of 36 sets (which, in total, will be referred to as the IGBOD set), each containing R, Z and G. Each possible combination of parameters was tested three times ((2 P Forms) × (2 P States) × (3 P Amounts) × 3 = 36).
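The bounding-box formula and the count of 36 sets can be illustrated as follows. The sketch assumes that both corners are derived from a single random centre point (x0 = x1 and y0 = y1), which the text does not state explicitly; the function names are hypothetical and no drawing is performed.

```python
import itertools
import random

WIDTH, HEIGHT = 500, 600  # canvas size in pixels

def bounding_box(x, y, r):
    """Box around a figure centred at (x, y) with dimension r (equation 4.3)."""
    return (x - r, y - r), (x + r, y + r)

def place_figures(dimensions, rng):
    """Give every dimension a random centre on the canvas and return the boxes."""
    return [
        bounding_box(rng.randrange(WIDTH), rng.randrange(HEIGHT), r)
        for r in dimensions
    ]

forms = ("C", "Q")
states = ("T", "S")
amounts = (25, 50, 75)
repeats = range(3)

# One label such as "50TC" per set; every combination occurs three times.
sets = [
    f"{amount}{state}{form}"
    for form, state, amount, _ in itertools.product(forms, states, amounts, repeats)
]
print(len(sets))
print(place_figures([10, 20], random.Random(0)))
```

In PIL, a box of this shape could then be handed to the `ellipse` or `rectangle` method of an `ImageDraw.Draw` object to render a circle or a square.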

    Figure 4.3: Images from the dataset using the parameter values TZC (left) and TRC (right)

    Figure 4.4: Images from the dataset using the parameter values SZC (left) and SRC (right)

    3 For more information on the ImageDraw module, please refer to http://www.pythonware.com/library/pil/handbook/imagedraw.htm

    Choices and implications As the participants' attention should be particularly focused on the differences between the distributions, simple geometric figures were used to fill the images. Choosing a more complex method of generating the images, such as known evolutionary artwork (Corne & Bentley, 2001), would divert the attention from the essential differences that characterize these distributions. Furthermore, the dimensions were chosen over the coordinates x and y as the variable used for processing the different P Distributions, since the latter would result in the Zipfian distribution being generally focused on one corner of the canvas, and the golden ratio being divided amongst two general areas. Taking into account the results of the aforementioned research by Friedenberg (2012), it could be suggested that variations with predominantly small figures could affect the outcome. However, as the canvasses are kept quite small, this effect should be kept to a minimum. In turn, the different P Amounts of figures are used to determine whether or not this affects overall aesthetic value within, and between, different variations.

    Dataset The dataset spans a total of 72 different subsets; from the aforementioned IGBOD set, another 36 subsets were formed by selecting images for different methods of evaluation. With a total of 12 new sets (four for each P Distribution) all containing 25, 50 and 75, the Amount set was used to rank the P Amounts based on preference with the same method as the IGBOD set. The State set used 9 new sets (three for each P Distribution) in a forced side-by-side comparison as used in Spehar et al. (2003), comparing between T and S⁴. The three new sets forming the Form set, which were created for comparing 50TZC with 50TZQ, also used this method. Finally, the Likert set spanned 12 folders for a Likert-scale based evaluation, containing one from each tested combination with a P Amount of 50. The Likert set was hoped to give an absolute indication of aesthetic value, rather than just a relative one.

    Figure 4.5: The artificial Zipf distribution generated with Python

    4 Transparent and Solid, as noted on page 24.


    4.5.2 Experiment

    The experiment was conducted using the dataset described in the previous paragraph in combination with a web environment programmed in PHP. It was hosted on a public server, meaning that anyone with the link could access and fill out the questions in the comfort of their own living room. All the answers were posted and saved in a database after completing the experiment, preventing incomplete data and making it easier to import this data into a statistical program like SPSS.

    Introduction and general set-up After visiting the web page, the subjects were told in a small introduction (Figure 4.6) that the experiment would consist of computer generated artworks that were said to have been judged by an Art Expert from Tilburg University. The pretended goal of this experiment was to see if people would generally judge these works in the same fashion. They were then informed about the function to drag the pictures according to their own preference and the different steps in which the dataset would be presented. Moreover, the subjects were reminded that there were no wrong answers and that they should judge the works at first sight. Rounding off the introduction, they were told that the experiment would last about ten minutes in total. Additionally, the participants had to state their age, gender and highest education level (BO, LBO, MAVO, VMBO, HAVO, MBO, VWO, HBO, WO). Moreover, they were asked about their level of experience (inexperienced, hobbyist or professional) and they had to indicate whether they preferred chaos or order in a work of art on a 7-point Likert scale (Dawes, 2008; Malhotra, 2006). These demographics were used in the statistical process to compare to the results of the main experiment.

    Figure 4.6: The screen shown at the introduction of the experiment


    The main experiment The dataset was presented in three different formats. Firstly, the IGBOD set was presented with three pictures for each step, as can also be seen in Figure 4.7. In this part, the subjects were asked to drag the canvasses in the order of their preference: least preferred at the left hand side, indicated by a -, and most preferred at the right hand side, indicated by a +. This method of ranking and ordering art to deal with aesthetic value was taken from the experiment conducted by Cattell et al. (1918).

    Figure 4.7: The screen shown during the ranking method

    Secondly, the Amount set was presented in a similar fashion, dealing with three canvasses at the same time. For both the State and Form set, a forced side-by-side comparison was utilized (Figure 4.8), taken from research by Spehar et al. (2003). In each of the sets discussed so far, the order in which the images were initially presented was randomized.


    Figure 4.8: The screen shown during the side-by-side comparison

    Finally, the Likert set was shown on one page (Figure 4.9), each with a 7-point Likert scale to indicate whether they judged these variations in the range from very ugly (1) to very beautiful (7). For this part, it was made sure that the respondents judged this based on the canvasses they saw before, not on their general aesthetic value.

    Figure 4.9: The screen shown during the Likert-based evaluation


    Considerations The choice for an online environment allowed for the experiment to be conducted without being restricted to a fixed workplace; however, this also means that the conditions in which the subjects participate are not always ideal. It was made sure that the experiment could only be conducted on a desktop computer, disregarding tablets and smart phones as they facilitate on-the-go situations which may lower the concentration of the participant. Moreover, the possible number of items, due to the combining of variables, was too high for the experiment to be conducted within a time-span that would still keep the subject focused. This is why it was determined that, considering these limitations, the experiment would not last longer than an average of ten minutes. Further attempts to stimulate the subjects' focus were incorporated into the dragging mechanism. This was a conscious decision, as dragging the canvasses would force the participant to think about ordering all presented canvasses, rather than just making a decision on which one they preferred the most out of three. This allowed for drawing stronger conclusions using one case (for example, the majority preferred Z over G, and G over R), as opposed to having to do this using several cases. Furthermore, the colours used in the web environment were low in saturation, making them as neutral as possible (Ou et al., 2004). The reference to the art expert was made with the intention of justifying the use of extremely simple art. It was assumed that when the participants were convinced that an art expert would be able to rank these canvasses, it would improve their opinion of the images and the earnestness of this experiment. Finally, the orientation of the - and the + was determined by the positioning of negative and positive ranges within Likert scales.


    Chapter 5

    Results

    Demographics The experiment was completed by 37 participants, with ages ranging from 16 to 58 and a mean of 24.22. A majority of the respondents was male (67.6%) and higher educated (91.8% had VWO, HBO or WO level), mainly due to the large share of students among the participants (48.6%). Apart from these basic demographics, the respondents were asked which level of art experience they had, which turned out to be amateur for the majority (75.7%), with no participants seeing themselves as a professional. Finally, the participants leaned more towards order than chaos in their preferences, with a mean of 4.73 on a 7-point Likert scale.

    Preparing the data The results of the main experiment were recorded in small arrays ranging from - to +, wherein the different P Distributions were sorted: 0 for R, 1 for Z and 2 for G. As SPSS¹ was used for analysing the data, some preparation was needed. The yielded arrays were split up into three different columns or categories: one for the lowest preference (min), one for the average preference (eqa) and one for the highest preference (plu). Thus, an array res_1 that looked like 0, 2, 1 (R, G, Z) was split into res_1_min = 0, res_1_eqa = 2, and res_1_plu = 1. Using this division, it was possible to individually test matches on these different categories. For example, it was possible to transform cases and generate total values per case for the total number of participants who ranked Z as a +. The IGBOD set was transformed in this manner, as is shown in Table 5.1.
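The splitting and counting step can be illustrated with a short sketch. The thesis performed this preparation in SPSS; the Python below is only an illustration, with hypothetical function names and two made-up rankings.

```python
from collections import Counter

LABELS = {0: "R", 1: "Z", 2: "G"}
POSITIONS = ("min", "eqa", "plu")  # lowest, average and highest preference

def split_ranking(case, ranking):
    """Turn one ranking, e.g. [0, 2, 1], into res_<case>_min/eqa/plu columns."""
    return {f"res_{case}_{pos}": value for pos, value in zip(POSITIONS, ranking)}

def totals(rows, position):
    """Count how often each distribution was placed at `position` across rows."""
    counts = Counter(
        LABELS[value]
        for row in rows
        for name, value in row.items()
        if name.endswith(position)
    )
    return {f"tot_{label}_{position}": counts[label] for label in LABELS.values()}

# Two made-up rankings of one participant, for illustration only.
rows = [split_ranking(1, [0, 2, 1]), split_ranking(2, [0, 1, 2])]
print(rows[0])
print(totals(rows, "plu"))
```

The first printed dictionary matches the worked example in the text: the ranking 0, 2, 1 becomes res_1_min = 0, res_1_eqa = 2 and res_1_plu = 1.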

    The data for the split sets (e.g. res_1_min) contained the position of canvasses. In this characteristic of the dataset lies the crux that the statistical analysis faces: it is nominal data. This implies that the data does not have any numerical meaning; it cannot interact in the way that, for example, test scores can. This means that for statistical analysis of the dataset, this experiment is limited to descriptive methods. Aside from looking at frequencies and descriptives, the data of different sets will be analysed by using the χ²-test and the t-test.

    1 For more information on SPSS, please refer to its Wikipedia entry: http://en.wikipedia.org/wiki/SPSS

    Table 5.1: Demonstration of the case transformations from the original ranges

    original cases                  value   transformed
    res 1 plu through res 36 plu    0       tot R plu
                                    1       tot Z plu
                                    2       tot G plu
    res 1 eqa through res 36 eqa    0       tot R eqa
                                    1       tot Z eqa
                                    2       tot G eqa
    res 1 min through res 36 min    0       tot R min
                                    1       tot Z min
                                    2       tot G min
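The split and the totals described above can be sketched in a few lines of Python. This is only an illustration of the transformation, not the SPSS procedure used in the thesis; the function names and the four example rankings are hypothetical.

```python
from collections import Counter

# Codes as used in the thesis: 0 = R, 1 = Z, 2 = G.
LABELS = {0: "R", 1: "Z", 2: "G"}
POSITIONS = ("min", "eqa", "plu")  # least, middle and most preferred


def split_ranking(ranking):
    """Split one recorded array, e.g. [0, 2, 1], into its three positions."""
    return dict(zip(POSITIONS, ranking))


def position_totals(rankings, position):
    """Count how often each distribution was placed at `position`."""
    counts = Counter(LABELS[split_ranking(r)[position]] for r in rankings)
    return {label: counts.get(label, 0) for label in LABELS.values()}


# Hypothetical rankings (illustration only, not thesis data).
example_rankings = [[0, 2, 1], [0, 2, 1], [2, 0, 1], [0, 1, 2]]
print(split_ranking(example_rankings[0]))
print(position_totals(example_rankings, "plu"))
```

The same tally works in both directions: over all participants for one folder, or over all 36 folders for one participant, which is the per-case total shown in Table 5.1.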

    Statistical analysis Seeing as the dataset consists of nominal data, determining preference for certain images requires a test that is designed to handle this. The χ²-test offers such a solution. It assumes that, if there were no preference whatsoever, the chance that a certain canvas is placed as most preferred would be equal to the chance of it being placed in either of the other two positions. Significant deviation from this assumption indicates that there does exist a general preference for the variation yielding this deviation.
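As a worked illustration of this test, the sketch below computes the goodness-of-fit statistic against equal expected frequencies in plain Python. It is not the SPSS routine used for the thesis; the function names are hypothetical, and the p-value helper only covers the one and two degrees of freedom that occur in the result tables. The counts are the most-preferred counts for folder 14 taken from Table 5.4; with 37 participants the expected frequency is 37/3, roughly 12.3, the minimum expected cell frequency noted beneath Table 5.2.

```python
import math


def chi_square_uniform(observed):
    """Goodness-of-fit statistic against equal expected frequencies."""
    total = sum(observed)
    expected = total / len(observed)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, len(observed) - 1


def p_value(statistic, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Most-preferred counts (rand, zipf, gora) for folder 14, from Table 5.4.
stat, df = chi_square_uniform([3, 30, 4])
print(round(stat, 3), df, round(p_value(stat, df), 3))
```

For these counts the statistic comes out at 38.0 with two degrees of freedom, in line with the value listed for folder 14 in the + column of Table 5.2.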

    For the χ²-test, cases 1 through 60 were tested with the assumption that N = 12.5 (for a total of N = 37, equal to a probability P(A B C) = 33.3%) for each image within these cases. As reporting each test in text would take up a considerable amount of space and would not benefit readability, the results for the test are displayed in Table 5.2 and Table 5.3. The results for the χ² (chi-sq), degrees of freedom (df) and asymptotic significance (As.Sig.) values are noted along with the P Distribution (dist) deviating from the N = 12.5 assumption. Each significant result for


    Table 5.2: Chi-squared test scores for the IGBOD set

                 -                         =                         +
    fl#  var     chi-sq df  As.Sig dist    chi-sq df  As.Sig dist    chi-sq df  As.Sig dist
      1  25TC    5.568a  2  .062           2.649a  2  .266           5.568a  2  .062
      2  25TC   16.270a  2  .000   R       5.243a  2  .073           3.946a  2  .139
      3  25TC   14.000a  2  .001   R       6.054a  2  .048   G      14.649a  2  .001   Z
      4  25SC    3.459a  2  .177           2.000a  2  .368            .703a  2  .704
      5  25SC     .676b  1  .411            .378a  2  .828          14.486a  2  .001   R
      6  25SC    6.541a  2  .038   R/Z     4.919a  2  .085           1.351a  2  .509
      7  50TC   42.541a  2  .000   R      23.081a  2  .000   G      24.703a  2  .000   Z
      8  50TC   29.892a  2  .000   R      29.892a  2  .000   G      33.784a  2  .000   Z
      9  50TC   33.784a  2  .000   R      29.892a  2  .000   G      33.946a  2  .000   Z
     10  50SC     .378a  2  .828           5.568a  2  .062           8.811a  2  .012   R
     11  50SC    7.514a  2  .023   R       1.027a  2  .598           3.459a  2  .177
     12  50SC    5.243a  2  .073           4.919a  2  .085           2.649a  2  .266
     13  75TC   51.946a  2  .000   R      31.838a  2  .000   G      26.811a  2  .000   Z
     14  75TC   52.108a  2  .000   R      47.405a  2  .000   G      38.000a  2  .000   Z
     15  75TC   25.973b  1  .000   R      47.405a  2  .000   G      38.324a  2  .000   Z
     16  75SC    7.189a  2  .027   G        .216a  2  .898           5.568a  2  .062
     17  75SC     .216a  2  .898           1.514a  2  .469           2.811a  2  .245
     18  75SC    7.189a  2  .027   G       4.919a  2  .085            .378a  2  .828
     19  25TQ   16.270a  2  .000   Z       1.027a  2  .598           9.135a  2  .010   G
     20  25TQ    9.135a  2  .010   Z       5.892a  2  .053           9.459a  2  .009   R
     21  25TQ   10.757a  2  .005   Z       2.811a  2  .245           2.649a  2  .266
     22  25SQ     .216a  2  .898           7.514a  2  .023   Z       5.243a  2  .073
     23  25SQ     .703a  2  .704           3.622a  2  .164           1.676a  2  .433
     24  25SQ    6.541a  2  .038   R/Z    16.432a  2  .000   G       3.946a  2  .139
     25  50TQ   23.730a  2  .000   R      16.919a  2  .000   G       9.784a  2  .008   Z
     26  50TQ   16.595a  2  .000   R       9.784a  2  .008   G       9.297a  2  .010   Z
     27  50TQ   16.270a  2  .000   R       8.811a  2  .012   G       2.811a  2  .245
     28  50SQ    6.865a  2  .032   Z       4.108a  2  .128            .378a  2  .828
     29  50SQ    5.892a  2  .053           1.514a  2  .469           4.919a  2  .085
     30  50SQ    2.000a  2  .368           1.676a  2  .433            .378a  2  .828
     31  75TQ   27.622a  2  .000   R      22.757a  2  .000   G      14.486a  2  .001   Z
     32  75TQ   26.000a  2  .000   R      23.730a  2  .000   G      18.541a  2  .000   Z
     33  75TQ   19.838a  2  .000   R       9.297a  2  .010   G       9.459a  2  .009   Z
     34  75SQ    1.027a  2  .598           5.892a  2  .053           2.000a  2  .368
     35  75SQ    1.676a  2  .433           6.054a  2  .048   G       3.622a  2  .164
     36  75SQ    1.351a  2  .509            .216a  2  .898           1.027a  2  .598

    a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 12.3.
    b. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 18.5.


    whereas R fills up most of the space. Dealing with such a limited number of figures, the relationship between the different figures might become less important, thus obscuring the effect of different P Distributions.

    Figure 5.1: Canvasses taken from folder 1

    This could also explain the absence of any preference for the S figures³. It can be observed that R, in this combination of variables, looks more orderly than both Z and G. This can in turn be attributed to the manner in which the S figures were generated. R is more likely to spawn large figures, overlapping the smaller ones and therefore losing its chaotic character.

    Figure 5.2: Canvasses taken from folder 17

    ³ Solid; please refer to page 24 for an overview of the abbreviations.


    Furthermore, a look at the Amount, State and Form sets indicated a preference for certain manipulations. Overall, 75 was least preferred, deviating 7 times. 50 deviated 5 times in the middle section and once as most preferred, whereas 25 was indicated as being most preferred 5 times. In the forced comparison section between T⁴ and S, for both RC and GC, the S variant was preferred throughout the folders. However, for ZC the S variant was only preferred for 75. Finally, the face-off between C and Q indicated that C was preferred for all three folders.

    Table 5.3: Chi-squared test scores for the Amount, State and Form sets

                 -                         =                         +
    fl#  var     chi-sq df  As.Sig dist    chi-sq df  As.Sig dist    chi-sq df  As.Sig dist
     37  RC     33.784a  2  .000   75     26.324a  2  .000   50     34.432a  2  .000   25
     38  RC     42.541a  2  .000   75     15.297a  2  .000   50     17.892a  2  .000   25
     39  RC      3.459a  2  .177            .216a  2  .898           5.405a  2  .067
     40  RC      1.027a  2  .598           1.351a  2  .509           4.270a  2  .118
     41  ZC     17.568a  2  .000   75      3.622a  2  .164           5.243a  2  .073
     42  ZC       .027b  1  .869           3.622a  2  .164          16.432a  2  .000   50
     43  ZC     13.351a  2  .001   75     22.757a  2  .000   50      2.000a  2  .368
     44  ZC      4.108a  2  .128           2.324a  2  .313           2.000a  2  .368
     45  GC     33.946a  2  .000   75      5.892a  2  .053          13.351a  2  .001   25
     46  GC     38.324a  2  .000   75     11.730a  2  .003   50     19.676a  2  .000   25
     47  GC      5.568a  2  .062           3.946a  2  .139           4.270a  2  .118
     48  GC     14.000a  2  .001   75      9.135a  2  .010   50      9.784a  2  .008   25
     49  RC     19.703a  1  .000   RT25                             19.703a  1  .000   RS25
     50  RC     16.892a  1  .000   RT50                             16.892a  1  .000   RS50
     51  RC     33.108a  1  .000   RT75                             33.108a  1  .000   RS75
     52  ZC      2.189a  1  .139                                     2.189a  1  .139
     53  ZC       .027a  1  .869                                      .027a  1  .869
     54  ZC     19.703a  1  .000   ZT75                             19.703a  1  .000   ZS75
     55  GC      9.757a  1  .002   GT25                              9.757a  1  .002   GS25
     56  GC     11.919a  1  .001   GT50                             11.919a  1  .001   GS50
     57  GC     14.297a  1  .000   GT75                             14.297a  1  .000   GS75
     58  ZSC     9.757a  1  .002   SQUA                              9.757a  1  .002   CIRC
     59  ZSC     4.568a  1  .033   SQUA                              4.568a  1  .033   CIRC
     60  ZSC    14.297a  1  .000   SQUA                             14.297a  1  .000   CIRC

    a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 12.3.
    b. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 18.5.

    ⁴ Transparent; please refer to page 24 for an overview of the abbreviations.


    As stated in the previous paragraph, totals were calculated for each position, for each method. The results of this transformation were displayed in a graph, allowing a better overview of the total data (Figure 5.3). The graph illustrates the high peaks for Z judged positively, as well as outliers in the dataset. For example, it shows some high R peaks in the middle.

    Figure 5.3: Graph of total positive judgements for each folder, sorted by P Distribution

    This collection allows for verification of the questions that were asked in the introduction part of the experiment. Table 5.4 notes the frequencies of the different points on the Likert scale for the question regarding a preference for either chaos (1) or order (7). The samples res 13 plu, res 14 plu and res 15 plu were used because of the chaotic character of 75TC. Examining these frequencies, it can be observed that the majority chose Z as most preferred. However, the table also indicates that participants who indicated prior to the experiment that they preferred chaos do not necessarily prefer RC. Note that the results of these cases were shown by the χ²-test to deviate significantly.

    Table 5.4: Three folders comparing chaos against different methods of radii calibration

             res 13 plu                  res 14 plu                  res 15 plu
    chaos    rand  zipf  gora  Total     rand  zipf  gora  Total     rand  zipf  gora  Total
    2           0     0     1      1        0     1     0      1        0     1     0      1
    3           1     4     0      5        1     4     0      5        1     4     0      5
    4           2     4     0      6        2     4     0      6        0     5     1      6
    5           0    13     5     18        0    14     4     18        1    13     4     18
    6           0     5     0      5        0     5     0      5        0     5     0      5
    7           0     1     1      2        0     2     0      2        0     2     0      2
    Total       3    27     7     37        3    30     4     37        2    30     5     37


    Using the t-test (Table 5.5) it was determined whether or not part of the tested demographics influenced the sample's mean. The sample was determined by counting all the cases where participants ranked a canvas that was generated by Z in either the min, eqa or plu category. No difference in preference was found between respondents with art experience in the groups amateur and hobbyist: the total values for each selected Z in the plu (t(37) = 1.46, p = .15), eqa (t(37) = .17, p = .87), or min (t(37) = 1.48, p = .15) categories were not significant. Furthermore, gender did not prove to have an influence on preference either, with the plu (t(37) = 1.64, p = .11), eqa (t(37) = .48, p = .63), and min (t(37) = 1.44, p = .16) categories again not yielding any significant results. It should be noted that the t-test was also done for every total other than Z. No significance was found there either.

    Table 5.5: Total occurrences of Z in different categories in relation to experience and gender (score minimum was 1, maximum 36; the standard deviation is noted between brackets)

               experience                      gender
               amateur        hobbyist         male           female
    plu tot    14.14 (6.12)   18.22 (9.95)     13.80 (6.56)   17.92 (8.30)
    eqa tot    12.29 (6.34)    8.44 (8.38)     12.48 (6.83)    9.00 (6.94)
    min tot     9.57 (3.81)    9.33 (3.64)      9.72 (3.27)    9.08 (4.66)
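To make the comparison in Table 5.5 concrete, a two-sample t statistic can be computed from the reported means and standard deviations alone. The sketch below assumes an equal-variance (Student) t-test, which may differ from the exact SPSS option used for the thesis, and it assumes group sizes of 25 male and 12 female respondents, derived from the reported 67.6% of 37 participants. The function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / std_error


# plu totals by gender from Table 5.5; the group sizes 25 and 12 are
# assumptions derived from the reported share of male respondents.
t = pooled_t(13.80, 6.56, 25, 17.92, 8.30, 12)
print(round(t, 2))
```

Under these assumptions the statistic lands close to the value of 1.64 reported for gender in the plu category.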

    Finally, the descriptive information in Table 5.6 shows the scores that the participants allocated to the different canvasses that were tested in the experiment. The table shows, along with the fact that some values were missing, that overall the pictures received average scores in terms of general value. Moreover, values were more or less centred around the scale midpoint of 4.

    Table 5.6: Descriptives of the 7 point Likert scales judging the Zipfian canvasses in different variations

                 lik1   lik2   lik3   lik4   lik5   lik6   lik7   lik8   lik9   lik10  lik11  lik12
    N Valid      37     37     37     37     37     37     37     36     37     36     36     37
    N Missing    0      0      0      0      0      0      0      1      0      1      1      0
    Mean         2.68   4.43   3.57   4.49   4.38   4.35   3.30   3.64   3.32   4.14   3.83   4.11
    Median       3.00   5.00   4.00   5.00   4.00   4.00   3.00   3.00   3.00   4.50   4.00   4.00
    Mode         3      5a     4      5      4a     4      1      3      2      5      5      3a
    Std. Dev.    1.203  1.625  1.281  1.261  1.441  1.338  1.714  1.533  1.492  1.376  1.444  1.410
    Variance     1.447  2.641  1.641  1.590  2.075  1.790  2.937  2.352  2.225  1.894  2.086  1.988
    Minimum      1      1      1      2      2      2      1      1      1      1      1      1
    Maximum      5      7      6      7      7      7      6      7      7      7      6      7
    Sum          99     164    132    166    162    161    122    131    123    149    138    152


    Table 5.7 gives some additional explanation of the figures that were picked for the Likert test. By comparing the mean of the Likert test with the asymptotic significance for the experimental cases, the relevance of these results can be determined. It can be observed that the S cases yielded a higher preference for both R and G, whereas Z on average scores highest in the T cases.

    Table 5.7: Comparison between the scores for the experimental chi-square test and that of the Likert scales

    Likert   Folder   Method   Mean   exp as.sig
    1        7        50TRC    2.68   .000
    2        7        50TZC    4.43   .000
    3        7        50TGC    3.57   .000
    4        10       50SRC    4.49   .828
    5        10       50SZC    4.38   .062
    6        10       50SGC    4.35   .012
    7        25       50TRQ    3.30   .000
    8        25       50TZQ    3.64   .000
    9        25       50TGQ    3.32   .008
    10       28       50SRQ    4.14   .032
    11       28       50SZQ    3.83   .128
    12       28       50SRQ    4.11   .828


    Chapter 6

    Discussion & Conclusion

    Zipf & Art All in all, the research showed some significant results as well as implications for future research. As previous research in this field was almost non-existent, major risks had to be taken in terms of experimental approach. Relying on just a few previously tested methods for testing aesthetic value, a mix of different approaches formed the total experiment. Moreover, because every choice among the tested variations carries implications that are inherent to aesthetics, the experiment had to be constructed as simply as possible. However, despite being in an academic blind spot, previous research and theories in the area of Zipf's law were used as a foundation and framework in setting up the experiment. Research in language allowed for an introduction into the most important field where Zipf's law is found, as well as for choosing and justifying complete randomness as a counterpart to a Zipfian distribution. Especially the field of music proved to be resourceful in combining an art form with Zipf's law. Finally, two important studies by Spehar et al. (2003) and Manaris et al. (2007) were found in the field of aesthetics. These studies, in fractals and music respectively, both found that Zipf could allow for the generation of aesthetically pleasing art. From these studies, the experimental set-up was formed, combining the previously discussed theory with two methods for subjective aesthetic judgement (Cattell, 1918; Spehar et al., 2003).

    Analysing the participants As said, the experiment allows for a lot of implications and interpretation due to divergent results and views in the discussed literature. First and foremost, however, comes the interpretation of the results discussed in the previous chapter, which will be structured accordingly. The demographics of the participant group for this experiment were skewed on an educational level, as a considerable majority was highly educated. However, the total group was too small (and lower education unrepresented) to split and test for the effect of education on the overall judgements. Despite the fact that most of the respondents were students and would likely have had some education in art history, a majority rated themselves as amateurs. Although their self-rated experience with art proved not to have had any effect on this, the chance exists that most were actually ineffective in rating themselves. Nevertheless, the general tendency was that the mean for hobbyists rating Z positively was slightly higher, and this could therefore change in a larger experiment where all the groups are equally represented.

    The participants turned out to be quite inconsistent in terms of indicating preference for either chaos or order and subsequently judging the extreme cases based on this preference. Table 5.4 showed that participants preferring chaos avoided judging the most chaotic, random variant as most positive. However, the largest group of participants (18 of 37) slightly preferred order (score 5) and did actually choose Z in the most chaotic cases. Testing an orderly case was avoided here, due to the fact that both G and Z induce structure. The inconsistency might be explained by the fact that the participants associate art in a general sense with, for example, paintings they are familiar with. One can imagine that respondents preferring chaos anticipated paintings with lively colours and many objects, in contrast to numerous lines absent of colour.

    Ranking the computer generated art Proceeding to the main experiment, this paragraph will discuss the most important results for the thesis. The χ²-test proved to allow an effective measurement of the individual cases yielding significant results in terms of preference. Note that this part determined aesthetic value in terms of ranking different methods, in contrast to judging the canvasses individually. This implies that results from this test give a relative indication of aesthetic value, not an absolute one. This method was chosen due to the fact that the experiment was limited in terms of image complexity. The approach of keeping the number of different variables as low as possible allowed the participants to mainly concentrate on the difference between P Distributions. Looking at the research question posed in 4.4, this was the main focus of the experiment.

    Table 5.2 initially shows that it cannot be said that, throughout the complete ranking part of the experiment, Z was most preferred. From the total of the IGBOD set, Z was only chosen 12 times as most preferred (R 3 times and G only once). However, a closer look at the different P Distributions makes this result a lot more promising. Four sets of the same three P Distributions were consistently judged in the following order, from least to most preferred: R, G, Z. These sets were all T and spanned both C and Q in 50 and 75. In this sense, it is safe to state that for the cases that were consistently judged by the participants, the Zipfian distribution was most preferred.

    In the end, the canvasses containing S allowed for a divergent overall image per canvas, which resulted in varying preferences. However, they have proven to represent the nature of each distribution ineffectively and can therefore be regarded as less important in the final conclusion (Figures 5.1 and 5.2). It must be noted, however, that both R and G were consistently ranked higher in their solid variant. For Z, this was only true for 75. This might be explained by the chaotic feel that T conveys. Seeing as the respondents judged R as being least preferred, this is only a logical conclusion. Accordingly, both R and G showed the highest preference for 25 and the least for 75.


    Likert based measure of aesthetic value It can be stated that this measure did not prove very successful. The choice to use a fixed P Amount of figures, with a single pick out of every tested variation, turned out to include 4 out of 12 pictures that did not have an overall preference. However, this measure does show that when individual judgements of preference have to be made, the scores are consistent with those in the ranking experiment. For example, Table 5.7 shows the same ranking pattern as seen before in the T cases, from least to most preferred: R, G, Z. It also shows that R images in their solid state are valued higher than either G or Z. On a scale from 1 to 7, despite the explicit instruction that the images should be rated on the basis of the other cases, the images did not exceed an average score, with only a few participants allocating 7 points. Some participants even rated all the images between 1 and 3. However, the small variances in scores still give a good indication of absolute value, albeit for generally unattractive pictures.

    Conclusion Taking all the results into consideration, it can be stated that participants allocated a higher aesthetic value to canvasses that have either an orderly distribution, or an overall orderly appearance. In this sense, the Zipfian distribution was consistently chosen as being most preferred, and therefore as having the highest aesthetic value, for cases where the character of the distributions was most noticeable. For this experiment, the answer to the research question posed in 4.4 is: the Zipfian distribution has been proven to allow for a higher aesthetic value within simple, computer generated art. This result is in accordance with previous research that already proved that the Zipfian distribution carries a high aesthetic value in music and fractal art. In conclusion, the Zipfian distribution continues to grant new insights and surprising results.

    Future research This experiment faced major restrictions on complexity due to time and methodological constraints, and the dataset was therefore kept as simple as possible. It could be used as a basis for further research, testing more variables, methods and more complex figures altogether. Moreover, the method for measuring aesthetic value was experimental and risky, as subjective aesthetic value in this format is hard to test and does not, at all, have a broad range of established methods. However, the use of participants could be avoided by using an AAC as proposed by Manaris et al. (2007) or an aesthetic analysis method such as that of Klinger and Salingaros (2011). There is still much room left open for advanced research, and, hopefully, this thesis could be used to initiate this research.


    Appendix A

    Pseudo-code

    Data: Alphabet plus whitespace
    Result: Random text
    number of characters = 10000000;
    while number of characters > 0 do
        choose random letter from alphabet;
        write random pick to file;
        number of characters - 1;
    end
    return text;

    Algorithm 1: Pseudo-code for the Random Language Algorithm
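Algorithm 1 could be rendered in Python roughly as follows. This is a minimal sketch rather than the thesis implementation: the alphabet (lowercase a to z plus a space), the function name and the seed parameter are assumptions, and the text is returned as a string instead of being written to a file.

```python
import random
import string

# Assumed alphabet: lowercase a-z plus a single space as whitespace.
ALPHABET = string.ascii_lowercase + " "


def random_text(length, seed=None):
    """Draw `length` characters uniformly at random from ALPHABET."""
    rng = random.Random(seed)  # a seed makes the sketch reproducible
    return "".join(rng.choice(ALPHABET) for _ in range(length))


print(random_text(60, seed=1))
```

Passing the 10000000 characters from the pseudo-code as `length` works the same way; a smaller value is used here to keep the output readable.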

    Data: Extracted letter frequencies
    Result: Letter Frequency Based Random Language
    for x in range of total character amount do
        uniform probability based on frequencies;
        take a random pick from the new range of frequencies;
        write pick to text;
    end
    return text;

    Algorithm 2: Pseudo-code for the Letter Frequency Based Random Language
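A possible Python reading of Algorithm 2 is given below, as a sketch under stated assumptions. It assumes the letter frequencies are extracted from a reference text by simple counting; the helper names and the short example text are hypothetical, and weighted sampling is done with the standard library's `random.choices`.

```python
import random
from collections import Counter


def letter_frequencies(text):
    """Relative frequency of every character in a reference text."""
    counts = Counter(text)
    total = sum(counts.values())
    return {char: n / total for char, n in counts.items()}


def frequency_based_text(frequencies, length, seed=None):
    """Draw characters at random, weighted by their relative frequency."""
    rng = random.Random(seed)
    chars = list(frequencies)
    weights = [frequencies[c] for c in chars]
    return "".join(rng.choices(chars, weights=weights, k=length))


# Hypothetical reference text, for illustration only.
freqs = letter_frequencies("the quick brown fox jumps over the lazy dog")
print(frequency_based_text(freqs, 60, seed=1))
```

Compared with Algorithm 1, frequent characters such as the space now appear more often, so the output starts to show word-like chunks.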


    Data: Corpus of different books
    Result: Pseudo-random text
    for line in text do
        for current word in line do
            markov chain = word[1], word[2] + current word;
            word[1] = word[2];
            word[2] = current word;
        end
        for i in range of the word count do
            from chain pick random word1 & word2;
            append new word;
            w1 = w2;
            w2 = newword;
        end
    end
    return text;

    Algorithm 3: Pseudo-code for the Markov chain Algorithm
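The two stages of Algorithm 3, building the chain and generating from it, might look like this in Python. This is a sketch and not the thesis code: the function names and the one-sentence corpus are hypothetical, the chain is built in one pass over the whole word list rather than per line, and restarting from a random word pair at a dead end is an added assumption that the pseudo-code does not specify.

```python
import random
from collections import defaultdict


def build_chain(words):
    """Map each pair of consecutive words to the words that followed it."""
    chain = defaultdict(list)
    for w1, w2, nxt in zip(words, words[1:], words[2:]):
        chain[(w1, w2)].append(nxt)
    return dict(chain)


def generate(chain, word_count, seed=None):
    """Generate `word_count` words with a second-order Markov chain."""
    rng = random.Random(seed)
    keys = list(chain)
    w1, w2 = rng.choice(keys)
    output = []
    for _ in range(word_count):
        if (w1, w2) not in chain:  # dead end: restart from a random pair
            w1, w2 = rng.choice(keys)
        new_word = rng.choice(chain[(w1, w2)])
        output.append(new_word)
        w1, w2 = w2, new_word
    return " ".join(output)


# Hypothetical mini-corpus, for illustration only.
corpus = "the cat sat on the mat and the cat sat on the hat".split()
print(generate(build_chain(corpus), 8, seed=4))
```

Because every new word is drawn from the words that actually followed the current pair in the corpus, the output keeps local word order while remaining random at larger scales.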


    Appendix B

    Figures

    Figure B.1: Canvasses taken from folder 26

    Figure B.2: Canvasses taken from folder 26


    Bibliography

    Arnheim, R. (1974). Entropy and art: an essay on disorder and order. University of California Press.

    Baayen, H. (1992). Statistical models for word frequency distributions: a linguistic evaluation. Computers and the Humanities, 26(5), 347–363.

    Baddeley, A. (1999). Essentials of human memory. Cognitive Psychology. Taylor & Francis.

    Balasubrahmanyan, V., & Naranan, S. (1996). Quantitative linguistics and complex system studies. Journal of Quantitative Linguistics, 3(3), 177–228.

    Boroda, M., & Polikarpov, A. (1988). The zipf-mandelbrot law and units of different text levels. In Musikometrika (Chap. 1, pp. 127–158).

    Boselie, F. (1984). The aesthetic attractivity of the golden section. Psychological Research, 45(4), 36737