Calculus: A Modern Approach

Horst R. Beyer
Louisiana State University (LSU)
Center for Computation and Technology (CCT)
328 Johnston Hall
Baton Rouge, LA 70803, USA

Dedicated to the Holy Spirit

Contents

Contents 3

1 Introduction 5
  1.1 Short Introduction 5
  1.2 Background 5
  1.3 The General Approach of the Text 7
    1.3.1 Motivational Parts 8
    1.3.2 Core Theoretical Parts 9
    1.3.3 Parts Containing Examples and Problems 10
  1.4 Miscellaneous Aspects of the Approach 12
  1.5 Requirements of Applications 13
  1.6 Remarks on the Role of Abstraction in Natural Sciences 14

2 Calculus I 17
  2.1 A Sketch of the Development of Rigor in Calculus and Analysis 17
  2.2 Basics 20
    2.2.1 Elementary Mathematical Logic 20
    2.2.2 Sets 31
    2.2.3 Maps 42
  2.3 Limits and Continuous Functions 60
    2.3.1 Limits of Sequences of Real Numbers 60
    2.3.2 Continuous Functions 88
  2.4 Differentiation 121
  2.5 Applications of Differentiation 144
  2.6 Riemann Integration 211

3 Calculus II 249
  3.1 Techniques of Integration 249
    3.1.1 Change of Variables 249
    3.1.2 Integration by Parts 266
    3.1.3 Partial Fractions 281
    3.1.4 Approximate Numerical Calculation of Integrals 297
  3.2 Improper Integrals 308
  3.3 Series of Real Numbers 338
  3.4 Series of Functions 378
  3.5 Analytical Geometry and Elementary Vector Calculus 439
    3.5.1 Metric Spaces 440
    3.5.2 Vector Spaces 450
    3.5.3 Conic Sections 478
    3.5.4 Polar Coordinates 492
    3.5.5 Quadric Surfaces 500
    3.5.6 Cylindrical and Spherical Coordinates 509
    3.5.7 Limits in R^n 516
    3.5.8 Paths in R^n 520

4 Calculus III 542
  4.1 Vector-valued Functions of Several Variables 542
  4.2 Derivatives of Vector-valued Functions of Several Variables 566
  4.3 Applications of Differentiation 597
  4.4 Integration of Functions of Several Variables 627
  4.5 Vector Calculus 678
  4.6 Generalizations of the Fundamental Theorem of Calculus 694
    4.6.1 Green’s Theorem 701
    4.6.2 Stokes’ Theorem 719
    4.6.3 Gauss’ Theorem 732

5 Appendix 750
  5.1 Construction of the Real Number System 750
  5.2 Lebesgue’s Criterion for Riemann-integrability 762
  5.3 Properties of the Determinant 767
  5.4 The Inverse Mapping Theorem 783

References 791

Index of Notation 797

Index of Terminology 799

1 Introduction

1.1 Short Introduction

This text is an enlargement of lecture notes written for Calculus I, II and III courses given at the Department of Mathematics of Louisiana State University in Baton Rouge. It follows the syllabi for these courses at LSU. It is devised mainly for teaching standard entry-level university calculus courses, but can also be used for teaching courses in advanced calculus or undergraduate analysis oriented towards calculations and applications, and for self-study. The reasons for devising a text of such a threefold nature are explained in Section 1.3. This text is also unique in its special attention to the needs of applications and in its unusually elaborate motivations coming from the history of mathematics and from applications. As a result, the text introduces early on basic material that is needed in applied sciences, in particular from the area of differential equations. Its motivations follow Otto Toeplitz’ famous ‘genetic’ method [96].

1.2 Background

Currently, the content coverage and approach in standard calculus texts appear static. Indeed, such courses teach, to a large extent, views of the 18th century. On the other hand, the demand in applications for analysis skills of increasing sophistication and abstraction remains undiminished.

As pointed out in Section 1.6, the need for a higher level of mathematical sophistication in physics, the discipline most fundamental for applications, was a byproduct, in particular, of the study of atomic systems. Hence the mathematics education of physicists needs to go beyond calculus. A study of functional analysis, especially that of the spectral theorems of self-adjoint linear operators in Hilbert spaces, considerably enhances the understanding of quantum theory beyond that given in standard quantum mechanics texts. Such knowledge is extremely helpful in the study of the more advanced quantum theory of fields and, very likely, also for the formulation and understanding of more advanced unified quantum field theories that are still to come.

Also in the engineering sciences, the need for higher mathematical sophistication is visible, in particular in connection with the solution of partial differential equations (PDE). PDE dominate current applications, and functional analysis also provides the basis for their treatment. A good example of the application of functional analytic methods is the method of finite elements, which is widely used in the engineering sciences for the solution of boundary value problems of elliptic differential equations. Also, questions about the relation of approximate solutions, provided by numerical methods, to the solutions of the original PDE gain importance and hence lead into the area of functional analysis.

The mathematical thinking taught in current standard calculus courses provides no proper basis for more advanced courses in the area of analysis, in particular, courses in advanced calculus or undergraduate analysis.¹ As a consequence, the latter do not build on any previous knowledge of calculus, but start completely anew.² Frequently, students from the natural sciences and engineering, who form a major part of classes, do not attend such advanced courses, mostly for reasons of time. As a consequence, standard calculus courses frequently lead these students into a dead end. In today’s time, where the speed of development of all parts of society rapidly increases, such a procedure no longer appears appropriate.

¹ This is not surprising since precisely that thinking led calculus into a serious crisis in the beginning of the 19th century. Only after that crisis was overcome was the development of more advanced mathematical fields possible.

² Of course, this is not very efficient. Also, significantly, students of mathematics often face substantial problems in the first decisive parts of such courses, which demand a considerably higher level of abstraction. Usually, this problem cannot be avoided by offering honors calculus courses, since most often there are too few students to fill such courses. Also, such courses are not always taught on a significantly higher level than standard calculus courses.

Since a major raise of the mathematical level of standard calculus courses does not appear feasible without losing the bulk of students, the result is a dilemma. The goal of providing a basic calculus education to a large mass of students that is at the same time suitable as a basis for more advanced analysis courses and for the increased demand for analysis skills of higher mathematical sophistication in applications seems unreachable. Visibly, current standard calculus courses pursue only the first part of this goal.

1.3 The General Approach of the Text

The text tries to reach the whole goal instead. As is suitable for calculus courses, it has a strong orientation towards calculations, but consistently uses mathematical methods of the 20th century, in particular the basic concepts of sets and maps, for the development of calculus. It is mainly the use of these efficient concepts that distinguishes 20th century mathematics from older mathematics. In addition, special care was taken to include material that is needed early on in applied sciences, in particular from the area of differential equations. Details on the latter are given in Section 1.5.

As a consequence, the text rests on Chapter 2, Basics, of Calculus I, which introduces the concepts of sets and maps. Due to their inherent simplicity, these concepts can be understood by the majority of students. This introduction is preceded by a short subsection on elementary mathematical logic that explains the meaning of the notion of a proof. This takes into account the experience that a large number of students have difficulties in understanding that meaning. It is also hoped that this subsection convinces some students, if still necessary, that they are capable of understanding proofs.

Therefore, Chapter 2 should be covered in detail in class. Its thorough study will provide the student with the basic tools that are essential for the understanding of modern mathematics. A student who has mastered this chapter will realize in the following that a main step in the solution of a problem is its reformulation in terms of the ‘language’ provided in Chapter 2. After that, the solution of a large number of problems is obvious. As a consequence, he or she will gradually realize that the seemingly ‘challenging’ nature of many standard calculus problems is due to an inadequate formulation. In this way, the student will learn to appreciate the power of the provided ‘language’, which will guide him or her through the rest of the course.

Mostly, chapters consist of three parts: an introductory motivational part, a core theoretical part, and a part containing examples and problems.

1.3.1 Motivational Parts

Those parts consider historical mathematical problems or problems from applications that lead to the development of the mathematics in the theoretical part of the chapter. Such problems often have a certain ‘directness’ which is suitable to catch students’ attention and should help every student to get an idea of ‘why’ certain mathematics was developed and ‘what mathematics is good for’. In the author’s experience, practically all students have a high interest in such parts and, if these are given, are more inclined to follow subsequent, more theoretical investigations. Also, motivations of this type are largely missing in standard calculus texts known to the author.

In this, the text follows Otto Toeplitz’ ‘genetic’ method, suggested in 1926 and realized in his ‘Die Entwicklung der Infinitesimalrechnung, Bd. I.’ from 1949 [96]. To the knowledge of the author, the present text is the first that implements Toeplitz’ method to a large extent and at the same time is capable of covering a three-semester course in calculus. On the other hand, unlike Toeplitz, the text does not follow the historical order of the mathematical development because, from today’s perspective, that development was not very efficient. Also, the formal approach to mathematics, with Hilbert as its main proponent, made clear that ‘understanding’ in mathematics is ‘structural understanding’. This insight is an achievement of the 20th century. Presenting the material in the historical order would obstruct the path towards such understanding and be contrary to the intentions of the text.

Also, wherever possible, motivation is taken from applications. This is suitable, in particular, for students from the natural sciences and engineering. This includes introductions to sections such as the one on improper integrals, which uses motivation from the mechanics of periodic motion, where improper integrals occur naturally in the analysis. Also, a large number of examples and problems consider basic problems related to theoretical mechanics, general relativity and quantum mechanics. In this, it pays off that the author is a mathematical physicist who has first-hand research knowledge of these areas. As a consequence, those problems are realistic.

In cases where prototypical problems seemed unavailable, purely historical sketches of the development were used for the purpose of motivation. For instance, such an approach was used in the introduction to the section on set theory. That introduction points out that the original object of study of set theory was the concept of the infinite and that the initial resistance against the theory had its roots in ancient Greek philosophical views of the infinite that were still not completely overcome at the time.

The motivational introductions should be accessible to every student and be gone through in detail in class.

1.3.2 Core Theoretical Parts

Those parts give a rigorous development of essential parts of the machinery of analysis. Essentially, they are on the level of a standard undergraduate analysis or advanced calculus text, like Lang’s ‘Undergraduate Analysis’ [63], but proofs are intentionally more detailed and have been simplified as far as possible. For this purpose, the current mathematical literature, in particular the American Mathematical Monthly and the Mathematics Magazine, has also been systematically searched. For instance, this led to the adoption of E. J. McShane’s proof of Lagrange’s multiplier rule [76], which does not use the implicit mapping theorem. Simplifications suggested by [4], [25], [26], [32], [33], [40], [44], [61], [90] and [97] have also been used. As a consequence, the text can also be used to teach undergraduate analysis or advanced calculus courses oriented towards calculations and applications.

In class, the statements of the most important theorems should appear on the blackboard to teach students to work with these statements, even if the corresponding proofs are not fully understood or are skipped. On the other hand, for reasons of time, it is to be expected that a number of proofs have to be omitted or can only be indicated in class.

On the other hand, students from mathematics, and also from the natural sciences and engineering, are advised to go through proofs that have been omitted or only indicated in class in self-study. To facilitate such deeper study, this text gives students the chance to look up the full proofs without the necessity for a time-consuming study of a large number of other sources.¹

¹ In particular, such study is complicated by different choices of notation. Of course, the author would not discourage students from such study if there is sufficient time, but, generally, a dense undergraduate curriculum should not leave much time for that.

Such a study of other sources is no easy task for a beginner and usually lacks efficiency. For this reason, the text is also devised for unguided self-study and is very explicit. In particular, it also tries to give elementary steps in calculations to such an extent that they become evident. As a consequence, large parts of the text should not even need paper and pencil.

1.3.3 Parts Containing Examples and Problems

The majority of problems and examples are of a type and level occurring in standard university calculus texts in the US, but consistently reformulated in modern terms.

The problems are mostly calculational in nature, as is appropriate for calculus courses that are also suitable for students of applied sciences. According to experience, mastering the study of applied sciences needs, at the minimum, technical mathematical skills. Sometimes, the opinion is uttered that the advent of mathematical software tools like Mathematica, Maple and Matlab has made such skills redundant. In fact, this is not the case, since the use of such software has led to the consideration of problems whose complexity would have prevented an attack in the past. For instance, viewed from the perspective of the algebraic manipulation associated with such problems, this complexity is reflected in the output of such programs. Simplification algorithms cannot possibly know what the user’s intentions are. Hence the user has to guide the software to a useful answer without knowledge of that answer. This process needs a lot of mathematical experience and skill. As a consequence, efficient use of such programs presupposes technical mathematical skills and experience and even a form of structural understanding of mathematical manipulations. In addition, it is well-known that such programs are not completely free of errors. Particular examples are given on pages 263 and 292 of the text. Therefore, users need to perform routine checks of the results of such programs, which also requires mathematical skills.¹

¹ Compared to these requirements, the effort for learning the correct syntax of such programs is relatively low.

The examples appear throughout in the form of fully worked problems. As a consequence, they not only exemplify the theory, but at the same time teach problem solving and prepare for exams. This procedure is particularly helpful for beginners. Wherever possible, the results of examples have been checked with Mathematica 5.1.

Also, a large number of examples and problems consider basic problems from applications, in particular from theoretical mechanics, general relativity and quantum mechanics. In this, it pays off that the author is a mathematical physicist who has first-hand research knowledge of these areas. As a consequence, those problems are realistic.

Every calculus student needs to solve those problems and be able to understand those examples. In particular, the examples should be covered in detail in class.

1.4 Miscellaneous Aspects of the Approach

(i) The text tries to introduce only essential mathematical structures and terminology, and only in places where they are of direct subsequent use. In particular, mathematical notions are developed only to the level needed in the sequel of the text, thereby stressing their tool character.

(ii) Material which is used in the text, but whose development would cause a major disruption of the course, like the proof of Lebesgue’s characterization of Riemann integrability, is deferred to the appendix to make it accessible to interested students. In addition, the appendix contains a complete version of Cantor’s construction of the real numbers as equivalence classes of Cauchy sequences of rational numbers. Today, it is well-known that the whole of analysis and calculus rests on a construction of the real number system. Therefore, mainly for students of mathematics, such a construction has been included. The frequently used introduction of the real number system by a complicated set of axioms, for example as in [63], has been avoided, since such an introduction should appear implausible, in particular, to such students.

(iii) The basic limit notion of the text is that of limits of sequences. Continuous limits are introduced as a derived concept, but their use is usually avoided. In particular, the definition of the continuity of functions proceeds by means of the conceptually simpler notion of ‘sequential continuity’, instead of the equivalent classical ε, δ-approach. Generally, the latter approach is often problematic for beginners.

(iv) The text contains 210 diagrams whose role is to assist intuition, but not to create the illusion of being able to replace any argument inside a proof. Mistakenly, the latter is sometimes assumed by students. For this reason, it is explained in the introduction of the section on the development of rigor in calculus and analysis why geometric intuition is no longer regarded as a valid tool in mathematical proofs. Still, good diagrams can be useful for the formulation of conjectures.

(v) In general, theorems contain their full set of assumptions, so that a study of their environment is not necessary for their understanding. For the same reason, shorter definitions occasionally appear as part of theorems, and theorems as well as definitions also contain material that would normally appear only in subsequent remarks.
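To make the contrast in item (iii) concrete, the two equivalent characterizations of continuity can be stated side by side. The following is a sketch in standard notation and not necessarily the wording of the definitions given later in the text; the symbols $f$, $D$, $x$, $y$ and $x_1, x_2, \dots$ are chosen here for illustration only.

```latex
% Requires amsmath for \text.
% Assumption: f : D -> R with D a subset of R, and x a point of D.
% Sequential continuity of f at x:
\[
\lim_{n \to \infty} f(x_n) = f(x)
\quad \text{for every sequence } x_1, x_2, \dots \text{ in } D
\text{ with } \lim_{n \to \infty} x_n = x .
\]
% Classical epsilon-delta continuity of f at x:
\[
\text{for every } \varepsilon > 0 \text{ there is } \delta > 0
\text{ such that } |f(y) - f(x)| < \varepsilon
\text{ for all } y \in D \text{ with } |y - x| < \delta .
\]
```

In the first form, continuity is reduced to the limit notion for sequences, which is the basic limit notion of the text.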

1.5 Requirements of Applications

The bulk of the material needed early on in applied sciences is from the area of differential equations. In the case of physics, this has been the case since the advent of Newtonian mechanics in the 17th century. The advent of quantum theory made it necessary, in particular, to go beyond differential equations on to abstract evolution equations, see, e.g., [8]. Of course, the treatment of differential equations cannot be comprehensive in calculus courses, but a number of important cases can already be treated with methods from calculus. Such cases have been included in this text as examples of calculus applications and in problem sections. For instance, second order differential equations with constant coefficients are already treated in the section on applications of differentiation in Calculus I. The uniqueness of the solutions of such an equation can be proved with the help of an energy inequality. The solutions are found with the help of a simple transformation that eliminates the first order derivative of the unknown function. A two-parameter family of solutions of the resulting equation is easily found. Within the sections on Riemann integration and its applications, separable first order differential equations are solved with the help of integration. The solutions of the equation of motion for a simple pendulum are considered in the introduction to the section on improper integrals in Calculus II. Solutions of Bessel’s differential equation are derived by the method of power series in the section on series of functions. The derivation of solutions of the hypergeometric and the confluent hypergeometric differential equations is part of the subsequent problem section. Connected to differential equations are special functions, in particular the Gamma and the Beta function. These are defined and studied within the section on improper Riemann integrals. That section also derives well-known values of certain exponential integrals used in quantum theory and probability theory, and a standard integral representation for Riemann’s zeta function.
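The elimination of the first order derivative mentioned above for second order equations with constant coefficients can be illustrated as follows. This is a sketch of one standard substitution; whether it coincides with the transformation actually used in the section on applications of differentiation is an assumption, and the symbols $a$, $b$, $u$, $c_1$, $c_2$ and $\omega$ are introduced here for illustration only.

```latex
% Assumed form of the equation, with real constants a and b:
\[
y'' + a\,y' + b\,y = 0 .
\]
% Substitution (an assumption, not taken from the text): y(x) = e^{-a x/2} u(x).
% Differentiating twice gives
\[
y' = e^{-a x/2}\Big(u' - \tfrac{a}{2}\,u\Big), \qquad
y'' = e^{-a x/2}\Big(u'' - a\,u' + \tfrac{a^2}{4}\,u\Big),
\]
% and inserting both into the equation cancels the u' terms:
\[
u'' + \Big(b - \tfrac{a^2}{4}\Big)\,u = 0 .
\]
% For b - a^2/4 = \omega^2 > 0, a two-parameter family of solutions,
% with real constants c_1 and c_2, is
\[
u(x) = c_1 \cos(\omega x) + c_2 \sin(\omega x) .
\]
```

For $b - a^2/4 = 0$ and $b - a^2/4 = -\omega^2 < 0$, the analogous families are $c_1 + c_2 x$ and $c_1 e^{\omega x} + c_2 e^{-\omega x}$, respectively.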

In addition, in applications the need often arises to integrate discontinuous functions as well as functions over unbounded domains. Usually, those needs are due to idealizations that make problems accessible to direct analytical calculation. Such ‘model systems’ are still the main source for the development of an intuitive understanding of natural phenomena.¹ For this reason, applications need an integration theory which is capable of integrating a large class of functions. Lebesgue’s integration theory is well suited for this purpose. Still, for reasons of practicability, the text develops Riemann’s integration theory, though close to its limits. In particular, Lebesgue’s characterization of Riemann integrability is given inside the text, but its proof is deferred to the appendix. For the integration of functions of several variables, we use Serge Lang’s approach to Riemann integration from [63]. This approach is capable of integrating bounded functions, defined on closed bounded intervals, that are continuous except at the points of a ‘negligible’ set. Negligible sets can be covered by a finite number of intervals with an associated sum of volumes which can be made smaller than every preassigned real number > 0. Hence negligible sets are particular bounded sets of Lebesgue measure zero.
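The description of negligible sets in the preceding paragraph can be written out as follows. This is a sketch based only on that description; the precise definition in [63] and in the later chapters may differ in detail, and the symbols $A$, $I_k$, $m$ and $\mathrm{vol}$ are chosen here for illustration.

```latex
% Requires amsmath for \text.
% Assumption: A is a subset of R^n, and vol(I) denotes the volume of a
% closed bounded interval I = [a_1, b_1] x ... x [a_n, b_n].
% A is negligible if
\[
\text{for every } \varepsilon > 0 \text{ there are finitely many intervals }
I_1, \dots, I_m \text{ with }
A \subset \bigcup_{k=1}^{m} I_k
\quad \text{and} \quad
\sum_{k=1}^{m} \mathrm{vol}(I_k) < \varepsilon .
\]
% Example: the segment A = [0,1] x {0} in R^2 is covered by one interval,
\[
A \subset [0,1] \times \Big[-\tfrac{\varepsilon}{4}, \tfrac{\varepsilon}{4}\Big],
\qquad
\mathrm{vol}\Big([0,1] \times \Big[-\tfrac{\varepsilon}{4}, \tfrac{\varepsilon}{4}\Big]\Big)
= \tfrac{\varepsilon}{2} < \varepsilon .
\]
```

In this example, a function on a rectangle in the plane that is bounded and continuous away from the segment would fall within the class of functions described above.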

¹ The rising importance of numerical investigations has not changed, and likely cannot change, that.

1.6 Remarks on the Role of Abstraction in Natural Sciences

Examples for the fact that the most fundamental of the natural sciences, physics, has always operated on a level of abstraction similar to that of mathematics are easy to find. A first example comes from Newtonian mechanics, whose development was intertwined with that of calculus. That theory describes strict point particles, that is, particles without any spatial extension. Of course, such point particles have never been observed experimentally and therefore constitute an abstraction that has its roots in ancient Greek geometry. They have always been regarded as an idealization of a much more complicated reality. Still, the assumption of Newtonian point particles led to predictions that were in excellent agreement with observations and measurements until the advent of quantum theory in the first quarter of the 20th century. Einstein’s theory of special relativity caused another abstraction to enter physics, namely the unification of time and space into a four-dimensional space-time. This unification led to a remarkable simplification of that theory. Since it is the belief of most physicists that the ‘simplicity’ of a description that is consistent with the experimental facts and that predicts new phenomena that are subsequently observed reflects, at least partially, an objective reality, this unification is nowadays a commonly used abstraction. A further abstraction is due to Einstein’s theory of general relativity, which absorbed the gravitational field into the geometry of the four-dimensional space-time. Subsequently, quantum theory led to the description of matter by elements of abstract Hilbert spaces, with the corresponding physical observables being spectral measures of self-adjoint operators in this space. In the algebraic quantum theory of fields, observables are elements of a von Neumann algebra, and physical states of the field are positive linear forms on the algebra.

The above indicates that the development of physics towards the understanding of deeper aspects of nature was paralleled by the application of mathematical methods of increasing sophistication. In order to avoid the occurrence of errors, this also necessitated an increasing stress on mathematical rigor in physics. Current physics is as abstract as mathematics, since it studies almost exclusively phenomena that cannot be perceived by the human senses, but only indirectly with the help of highly sophisticated experimental equipment. Hence, similar to mathematics, visual intuition is no longer of much help in the analysis of phenomena in physics. On the contrary, the development of physics supports the view that theories based on direct human perception inevitably contain extrapolations on the nature of things which ultimately turn out to be seriously flawed. Finally, in current speculative physical theories, i.e., theories without experimental evidence, there is nothing else available than mathematical consistency and rigor to give such theories credibility. Those can only try to ‘replace’ experiment, temporarily, by mathematical consistency and rigor, although ultimately only the outcome of experiments decides on the ‘truth’ of a physical theory. Viewed from this perspective, it is quite obvious that calculus courses need to go into the direction of increased mathematical sophistication in order to narrow a widening gap to contemporary applications. In this connection, it needs to be remembered that after the advent of quantum theory, it has been recognized that the laws of quantum theory also provide the basis for the laws of chemistry. Therefore, it is to be expected that the other natural sciences and the engineering sciences follow the development of physics towards the use of more subtle mathematical methods. Such a trend is already obvious.

Acknowledgments

I am indebted to Kostas Kokkotas, Tübingen, for suggesting the inclusion of a number of valuable examples in the text.

2 Calculus I

2.1 A Sketch of the Development of Rigor in Calculus and Analysis

It is evident that a science that leads to contradictory statements loses its value. Therefore, the occurrence of such an event sends a shock wave through the scientific community. The immediate response is an analysis of the validity of the reasoning that leads to the contradiction. In case that reasoning appears to be ‘valid’, i.e., if the contradiction can be derived by generally accepted rules of inference (‘logic’) from assumptions that are generally believed to be true (‘axioms’), the field is in a crisis because those assumptions and/or rules need to be revised until the contradiction is resolved. If this succeeds, it has to be determined whether all previously obtained results of the science are derivable from the revised basis. Potentially, a large number of results could be lost in this way.

Probably the first example of a serious crisis in mathematics is the discovery in ancient Greece around 450 B.C. that the length, $\sqrt{2}$, of a diagonal of a square with sides of length 1 is not a rational number, a fact that will be proved in Example 2.2.15 below. Tradition attributes this discovery to a member of the Pythagorean school of thought. The fundamental assumption of that school was that the essence of everything is expressible in terms of whole numbers and their ratios, i.e., of quantities which are discrete in character. As a consequence of the discovery, that line of thought lost its basis. As a result, Plato’s school of thought completely reorganized the mathematical knowledge of the time by giving it an exclusively geometric basis. In this, the product of two lengths is not another length, but an area, for instance, that of a rectangle. Hence the equation

\[ x^2 = 2 \]

can be solved geometrically, for instance, by constructing a square with edge $x$ whose area is equal to the area of a rectangle with sides 2 and 1. As a consequence, algebraic equations were solved in terms of geometric quantities. On the other hand, viewed from today’s perspective, that approach bypassed the problem of irrational quantities rather than solving it, and it can be seen as a prime reason for a major delay in the development of mathematical calculus / analysis. The latter was developed as late as the 17th century in Western Europe.

The crisis gave important reasons for the development of the axiomatic method in mathematics in ancient Greece, i.e., proof by deduction from explicitly stated postulates. Without doubt, this method is the single most important contribution of ancient Greece to mathematics, and it has remained the basis of mathematics until today. In style, modern mathematics texts, including the present text, mirror that of the epoch-making thirteen books of Euclid’s Elements, written around 300 B.C. [37]. Previous Egyptian and Babylonian mathematics made no distinction between exact and approximate results, nor were there indications of logical proofs or derivations. On the other hand, the Egyptians and Babylonians already had quite accurate approximations for π and square roots that were needed in land surveying. For instance, the Egyptians determined the value of π within an error of $2 \cdot 10^{-2}$ and the value of $\sqrt{2}$ within an error of $10^{-4}$. The Babylonians were already familiar with the so-called Pythagorean theorem and determined the value of π within an error of $10^{-7}$ and the value of $\sqrt{2}$ within an error of $10^{-6}$.

In order to be considered as properly established in ancient Greece, a theorem had to be given a geometric meaning. This tradition continued in the Middle Ages and the Renaissance in the West. Geometric intuition was more trusted than insight into the nature of numbers. In the early phases of the development of calculus / analysis in the 17th and 18th centuries, and also in the views of its founding fathers Isaac Newton and Gottfried Wilhelm Leibniz, geometric intuition was of major importance, but it was subsequently gradually replaced by arithmetic.

A major factor in this process was the construction of non-euclidean geometries by Nicolai Lobachevsky (1829) [72], Janos Bolyai (1831) [11] and earlier, but unpublished, by Gauss. In his ‘Elements’, Euclid bases geometry on five postulates that are assumed to be valid. Generally, only the first four of them were considered geometrically intuitive, whereas the fifth, the so-called parallel postulate, was expected to be a consequence of the other postulates. For about 2000 years, an enormous effort went into the investigation of this question. The construction of non-euclidean geometries which satisfy the first four, but not the fifth, of Euclid’s postulates proved the independence of the parallel postulate from the other postulates. This result stripped Euclidean geometry of the central role it had retained for about 2000 years.

The final removal of geometric intuition as a means of mathematical proof was caused by a number of geometrically non-intuitive results of calculus / analysis, in particular, the demonstration of the existence of a continuous nowhere differentiable function by Karl Weierstrass in 1872 [99], see Example 3.4.13, and the construction of a plane-filling continuous curve by Giuseppe Peano in 1890 [84], see Example 3.4.14. Weierstrass conceived and in large part carried out a program known as the arithmetization of analysis, under which analysis is based on a rigorous development of the real number system. This is the common approach until today. For this reason, Weierstrass is often considered the father of modern analysis. A common rigorous development of the real number system by use of Cauchy sequences is given in Appendix 5.1.

Today, reference to geometric intuition is not considered a valid argument in the proof of a theorem. Of course, such intuition might give hints on how to carry out such a proof, but the means of the proof itself are purely formal. This situation is similar to that of blindfold chess, i.e., the playing of a game of chess without seeing the board. That formal approach has been suggested by David Hilbert for the foundation of mathematics and has become the standard of most working mathematicians. It culminated in the collective works of a group of mathematicians publishing under the pseudonym ‘Bourbaki’. The series comprises 40 monographs that became a standard reference on the fundamental aspects of modern mathematics.


2.2 Basics

2.2.1 Elementary Mathematical Logic

In the 17th century, Leibniz suggested the construction of a universal language for the whole of mathematics that allows the formalization of proofs. In 1671, he constructed a mechanical calculator, the step reckoner, that was capable of performing multiplication, division and the calculation of square roots. Also in view of his involvement in the construction of other mechanical devices, like pumps, hydraulic presses, windmills, lamps, submarines and clocks, it is likely that he envisioned machines that ultimately could perform proofs. The first scientific works on the algebraization of Aristotelian logic appeared in 1847 [10] and 1858 [81], by George Boole and Augustus De Morgan, respectively. The formation of mathematical logic as an independent mathematical discipline is linked with Hilbert’s program, mentioned in Section 2.1, on formal axiomatic systems that resulted from the recognition of the unreliability of geometrical intuition. That program called for a formalization of all of mathematics in axiomatic form, together with a proof that it is free from contradictions, i.e., that it is what is called ‘consistent’. The consistency proof itself was to be carried out using only what Hilbert called ‘finitary’ methods. In the sequel, neither Leibniz’s nor Hilbert’s vision has been achieved.

However, what has been achieved is sufficient for most working mathematicians today. In the following, we present only the very basics of symbolic logic and display some basic types of methods of proof in simple cases. Despite its brevity, this section is very important because the given logical rules for correct mathematical reasoning will be in constant use throughout the book (as well as throughout the whole of mathematics) without explicit mention. Therefore, its careful study is advised to the reader. The reader should also fill in additional steps in proofs whenever he/she feels the need for this. This should become a routine operation also for the rest of the book. In the experience of the author, this is necessary to fathom the material.


Definition 2.2.1. (Statements) A statement (or proposition) is an assertion that can be determined to be true or false.

Often abstract letters like A,B,C, . . . are used for their representation.

Example 2.2.2. The following are statements:

(i) The president George Washington was the first president of the United States ,

(ii) 2 + 2 = 27 ,

(iii) There are no positive integers $a$, $b$, $c$ and $n$ with $n > 2$ such that $a^n + b^n = c^n$ . (Fermat’s conjecture)

The following are not statements:

(iv) Which way to the Union Station? ,

(v) Go jump into the lake!

Definition 2.2.3. (Truth values) The truth value of a statement is denoted by ‘T’ if it is true and by ‘F’ if it is false.

Example 2.2.4. For example, the statement

$9 + 16 = 25$ (2.2.1)

is true and therefore has truth value ‘T’, whereas the statement

$9 + 16 = 26$

is false and therefore has truth value ‘F’. Also, the statement Example 2.2.2 (i) is true, the statement Example 2.2.2 (ii) is false, and the statement Example 2.2.2 (iii) is true, although its truth value remained unknown for more than 350 years until the proof by Andrew Wiles was published in 1995.


Definition 2.2.5. (Connectives) Connectives like ‘and’, ‘or’, ‘not’, . . . stand for operations on statements.

Connective                      Symbol               Name
‘not’                           $\neg$               Negation
‘and’                           $\wedge$             Conjunction
‘or’                            $\vee$               Disjunction
‘if . . . then’                 $\Rightarrow$        Conditional
‘. . . if and only if . . . ’   $\Leftrightarrow$    Bi-conditional

Example 2.2.6. For example, the statement

‘It is not the case that $9 + 16 = 25$’

is the negation of (2.2.1). It can be stated more simply as

$9 + 16 \neq 25$ .

Other examples are compounds like the following

Example 2.2.7.

(i) Tigers are cats and alligators are reptiles ,

(ii) Tigers are cats or (tigers are) reptiles ,

(iii) If some tigers are cats, and some cats are black, then some tigers are black ,

(iv) $9 + 16 = 25$ if and only if $8 + 15 = 23$ .

Definition 2.2.8. (Truth tables) A truth table is a pictorial representation of all possible outcomes of the truth value of a compound sentence. The connectives are defined by the following truth tables for all statements $A$ and $B$.

$A$   $B$   $\neg A$   $A \wedge B$   $A \vee B$   $A \Rightarrow B$   $A \Leftrightarrow B$
T     T     F          T              T            T                   T
T     F     F          F              T            F                   F
F     T     T          F              T            T                   F
F     F     T          F              F            T                   T
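The truth tables of Definition 2.2.8 can also be tabulated mechanically. The following is a minimal Python sketch; the helper names `implies`, `iff` and `connective_table` are illustrative choices and not notation of the text.

```python
from itertools import product


def implies(a, b):
    """Conditional A => B: false only if A is true and B is false."""
    return (not a) or b


def iff(a, b):
    """Bi-conditional A <=> B: true if A and B have the same truth value."""
    return a == b


def connective_table():
    """Return one row (A, B, not A, A and B, A or B, A => B, A <=> B)
    for each of the four combinations of truth values of A and B."""
    rows = []
    for a, b in product([True, False], repeat=2):
        rows.append((a, b, not a, a and b, a or b, implies(a, b), iff(a, b)))
    return rows


for row in connective_table():
    # Print 'T' for true and 'F' for false, as in Definition 2.2.3.
    print(" ".join("T" if value else "F" for value in row))
```

The rows are produced in the same order as in the table of Definition 2.2.8, so the printed output can be compared with it line by line.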


Note that the compound $A \vee B$ is true if at least one of the statements $A$ and $B$ is true. This is different from the normal usage of ‘or’ in English. It can be described as ‘and/or’. Therefore, the statement 2.2.7 (ii) is true. Also, the statements 2.2.7 (i) and 2.2.7 (iv) are true.

Also, note that from a true statement $A$ there cannot follow a false statement $B$, i.e., in that case the truth value of $A \Rightarrow B$ is false. This can be used to identify invalid arguments and also provides the logical basis for so-called indirect proofs.

Note that valid rules of inference do not only come from logic, but also from the field (Arithmetic, Number Theory, Set Theory, ...) the statement is associated with. For instance, the equivalence 2.2.7 (iv) is concluded by arithmetic rules, not by logic. Such rules could turn out to be inconsistent with logic in that they allow one to conclude a false statement from a true statement. Such rules would have to be abandoned. An example of this is given by the statement 2.2.7 (iii). Although the first two statements are true, the whole statement is false because there are no black tigers. In the following, the occurrence of such a contradiction is indicated by the symbol . Note that the rule of inference in 2.2.7 (iii) is invalid even if there were black tigers.

Example 2.2.9. (Inconsistent rules) Assume that the real numbers are part of a larger collection of ‘ideal numbers’ for which there is a multiplication which reduces to the usual multiplication if the factors are real. Further, assume that for every ideal number $z$ there is a square root $\sqrt{z}$, i.e., such that

$\left( \sqrt{z} \right)^2 = z$ ,

which is identical to the positive square root if $z$ is real and positive. Finally, assume that for all ideal numbers $z_1$, $z_2$, it holds that

$\sqrt{z_1 z_2} = \sqrt{z_1} \cdot \sqrt{z_2}$ .

Note that the last rule is correct if $z_1$ and $z_2$ are both real and positive. Then we arrive at the following contradiction:

$-1 = \left( \sqrt{-1} \right)^2 = \sqrt{-1} \cdot \sqrt{-1} = \sqrt{(-1)(-1)} = \sqrt{1} = 1$ .

Hence an extension of the real numbers with all these properties does not exist.
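The failure of the product rule can be observed numerically in one concrete extension of the real numbers. The following sketch assumes, as an illustration that is not part of the example itself, that the ‘ideal numbers’ are taken to be the complex numbers and that the square root is the principal square root provided by Python’s `cmath` module.

```python
import cmath

# Illustrative assumption: the complex numbers play the role of the
# 'ideal numbers', and cmath.sqrt (principal square root) plays the
# role of the square root of the example.
z1 = complex(-1, 0)
z2 = complex(-1, 0)

# Right hand side of the assumed rule: sqrt(z1) * sqrt(z2) = i * i = -1.
product_of_roots = cmath.sqrt(z1) * cmath.sqrt(z2)

# Left hand side of the assumed rule: sqrt(z1 * z2) = sqrt(1) = 1.
root_of_product = cmath.sqrt(z1 * z2)

print(product_of_roots, root_of_product)
```

In this setting the first two assumptions of the example hold, but the rule for the square root of a product fails for $z_1 = z_2 = -1$, in agreement with the conclusion of the example.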

A simple example of an indirect proof is the following.

Example 2.2.10. (Indirect proof) Prove that there are no integers $m$ and $n$ such that

$2m + 4n = 45$ . (2.2.2)

Proof. The proof is indirect. Assume the opposite, i.e., that there are integers $m$ and $n$ such that (2.2.2) is true. Then the left hand side of the equation is divisible by 2 without remainder, whereas the right hand side is not. Hence the opposite of the assumption is true. This is what we wanted to prove.

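Example 2.2.11. Calculate the truth table of the statements

$\left( (A \Rightarrow B) \wedge (B \Rightarrow C) \right) \Rightarrow (A \Rightarrow C)$ (Transitivity) ,

$\left( (A \vee B) \wedge \left( (A \Rightarrow C) \wedge (B \Rightarrow C) \right) \right) \Rightarrow C$ (Proof by cases) ,

$(\neg B \Rightarrow \neg A) \Leftrightarrow (A \Rightarrow B)$ (Contraposition) . (2.2.3)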

Solution:

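$A$   $B$   $C$   $A \Rightarrow B$   $B \Rightarrow C$   $(A \Rightarrow B) \wedge (B \Rightarrow C)$   $A \Rightarrow C$
T     T     T     T                   T                   T                                              T
T     T     F     T                   F                   F                                              F
T     F     T     F                   T                   F                                              T
T     F     F     F                   T                   F                                              F
F     T     T     T                   T                   T                                              T
F     T     F     T                   F                   F                                              T
F     F     T     T                   T                   T                                              T
F     F     F     T                   T                   T                                              T

$A$   $B$   $C$   $((A \Rightarrow B) \wedge (B \Rightarrow C)) \Rightarrow (A \Rightarrow C)$
T     T     T     T
T     T     F     T
T     F     T     T
T     F     F     T
F     T     T     T
F     T     F     T
F     F     T     T
F     F     F     T

$A$   $B$   $C$   $A \vee B$   $A \Rightarrow C$   $B \Rightarrow C$   $(A \Rightarrow C) \wedge (B \Rightarrow C)$
T     T     T     T            T                   T                   T
T     T     F     T            F                   F                   F
T     F     T     T            T                   T                   T
T     F     F     T            F                   T                   F
F     T     T     T            T                   T                   T
F     T     F     T            T                   F                   F
F     F     T     F            T                   T                   T
F     F     F     F            T                   T                   T

$A$   $B$   $C$   $(A \vee B) \wedge ((A \Rightarrow C) \wedge (B \Rightarrow C))$   $((A \vee B) \wedge ((A \Rightarrow C) \wedge (B \Rightarrow C))) \Rightarrow C$
T     T     T     T                                                                  T
T     T     F     F                                                                  T
T     F     T     T                                                                  T
T     F     F     F                                                                  T
F     T     T     T                                                                  T
F     T     F     F                                                                  T
F     F     T     F                                                                  T
F     F     F     F                                                                  T

$A$   $\neg A$   $B$   $\neg B$   $\neg B \Rightarrow \neg A$   $A \Rightarrow B$   $(\neg B \Rightarrow \neg A) \Leftrightarrow (A \Rightarrow B)$
T     F          T     F          T                               T                   T
T     F          F     T          F                               F                   T
F     T          T     F          T                               T                   T
F     T          F     T          T                               T                   T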


The members of (2.2.3) are so-called tautologies, i.e., statements that are true independent of the truth values of their variables. At the same time, they are frequently used rules of inference in mathematics, i.e., for all statements $A$, $B$ and $C$, the truth of the right hand side of each relation can be concluded from the truth of the corresponding left hand side (in large brackets).
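That the three members of (2.2.3) are tautologies can also be confirmed by enumerating all truth assignments. The following is a minimal Python sketch; the function names are illustrative and not notation of the text.

```python
from itertools import product


def implies(a, b):
    """Conditional A => B: false only if A is true and B is false."""
    return (not a) or b


def is_tautology(formula, number_of_variables):
    """Check that `formula` is true for every assignment of truth values."""
    assignments = product([True, False], repeat=number_of_variables)
    return all(formula(*values) for values in assignments)


def transitivity(a, b, c):
    return implies(implies(a, b) and implies(b, c), implies(a, c))


def proof_by_cases(a, b, c):
    return implies((a or b) and (implies(a, c) and implies(b, c)), c)


def contraposition(a, b):
    # The bi-conditional is equality of truth values.
    return implies(not b, not a) == implies(a, b)


print(is_tautology(transitivity, 3))
print(is_tautology(proof_by_cases, 3))
print(is_tautology(contraposition, 2))
```

For comparison, the plain conditional $A \Rightarrow B$ is not a tautology, since it is false for $A$ true and $B$ false; `is_tautology(implies, 2)` reflects this.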

Example 2.2.12. (Transitivity) Consider the statements

(i) If Mike is a tiger, then he is a cat,

(ii) If Mike is a cat, then he is a mammal,

(iii) If Mike is a tiger, then he is a mammal.

Statements (i), (ii) are both true. Hence the truth of (iii) follows by the transitivity of $\Rightarrow$ (and since ‘Mike’, the tiger of LSU, is indeed a tiger, he is also a mammal).

Example 2.2.13. (Proof by cases) Prove that

$n + |n - 1| \geq 1$ (2.2.4)

for all integers $n$.

Proof. For this, let $n$ be some integer. We consider the cases $n \leq 1$ and $n \geq 1$. If $n$ is an integer such that $n \leq 1$, then $n - 1 \leq 0$ and therefore

$n + |n - 1| = n + 1 - n = 1 \geq 1$ .

If $n$ is an integer such that $n \geq 1$, then $n - 1 \geq 0$ and therefore

$n + |n - 1| = n + n - 1 = 2n - 1 \geq 2 - 1 = 1$ .

Hence in both cases (2.2.4) is true. The statement follows since any integer is $\leq 1$ and/or $\geq 1$.
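A numerical spot check cannot replace the proof, but it can catch slips in the case analysis. The sketch below compares both sides of the two case formulas on a finite range of integers; the range and the function names are arbitrary illustrative choices.

```python
def left_hand_side(n):
    """The expression n + |n - 1| from (2.2.4)."""
    return n + abs(n - 1)


def value_by_cases(n):
    """The closed forms obtained in the two cases of the proof."""
    if n <= 1:
        return 1          # case n <= 1: n + (1 - n) = 1
    return 2 * n - 1      # case n >= 1: n + (n - 1) = 2n - 1


sample = range(-1000, 1001)
print(all(left_hand_side(n) == value_by_cases(n) for n in sample))
print(all(left_hand_side(n) >= 1 for n in sample))
```

For $n = 1$ both cases apply and give the same value, which matches the ‘and/or’ in the last sentence of the proof.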

Example 2.2.14. (Contraposition) Prove that if the square of an integer is even, then the integer itself is even.

Proof. We define statements $A$, $B$ as

‘The square of the integer (in question) is even’

and

‘The integer (in question) is even’ ,

respectively. Hence $\neg B$ corresponds to the statement

‘The integer (in question) is odd’ ,

and $\neg A$ corresponds to the statement

‘The square of the integer (in question) is odd’ .

Hence the statement follows by contraposition if we can prove that the square of any odd integer is odd. For this, let $n$ be some odd integer. Then there is an integer $m$ such that $n = 2m + 1$. Hence

$n^2 = (2m + 1)^2 = 4m^2 + 4m + 1 = 2 (2m^2 + 2m) + 1$

is an odd integer and the statement follows.

Based on the result in the previous example, we can now prove the result mentioned in Section 2.1 that there is no rational number whose square is equal to 2.

Example 2.2.15. (Indirect proof) Prove that there is no rational number whose square is 2.

Proof. The proof is indirect. Assume on the contrary that there is such a number $r$. Without restriction, we can assume that $r = p/q$ where $p$, $q$ are integers without common divisor different from 1 and that $q \neq 0$. By definition,

$r^2 = \left( \frac{p}{q} \right)^2 = \frac{p^2}{q^2} = 2$ .

Hence it follows that

$p^2 = 2q^2$

and therefore by the previous example that 2 is a divisor of $p$. Hence there is an integer $\bar{p}$ such that $p = 2\bar{p}$. Substitution of this identity into the previous equation gives

$2\bar{p}^2 = q^2$ .

Hence it follows again by the previous example that 2 is also a divisor of $q$. As a consequence, $p$, $q$ have 2 as a common divisor, which is in contradiction to the assumption. Hence there is no rational number whose square is equal to 2.
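The statement just proved can be illustrated, though of course not proved, by a finite search. The sketch below looks for positive integers $p$, $q$ with $p^2 = 2q^2$ up to a bound on $q$; the bound and the function name are arbitrary illustrative choices.

```python
from math import isqrt


def solutions_of_p2_equals_2q2(max_q):
    """Return all pairs (p, q) with 1 <= q <= max_q and p*p == 2*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        target = 2 * q * q
        p = isqrt(target)        # largest integer p with p*p <= target
        if p * p == target:      # exact integer arithmetic, no rounding
            hits.append((p, q))
    return hits


print(solutions_of_p2_equals_2q2(10000))
```

Integer arithmetic is used on purpose: a floating point comparison of `(p / q) ** 2` with 2 would be subject to rounding errors and could not decide the question.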

Problems

1) Decide which of the following are statements.

a) Did you solve the problem?
b) Solve the problem!
c) The solution is correct.
d) Maria has green eyes.
e) Soccer is the national sport in many countries.
f) Soccer is the national sport in Germany.
g) During the last year, soccer had the most spectators among all sports in Germany.
h) Explain your solution!
i) Can you explain your solution?
j) Indeed, the solution is correct, but can you explain it?
k) The solution is correct; please, demonstrate it on the blackboard.

2) Translate the following composite sentences into symbolic notation using letters for basic statements which contain no connectives.

a) Either John is taller than Henry, or I am subject to an optical illusion.
b) If John’s car breaks down, then he either has to come by bus or by taxi.
c) Fred will stay in Europe, and he or George will visit Rome.
d) Fred will stay in Europe and visit Rome, or George will visit Rome.
e) I will travel by train or by plane.
f) Neither Newton nor Einstein created quantum theory.
g) If and only if the sun is shining, I will go swimming today; in case I go swimming, I will have an ice cream.
h) If students are tired or distracted, then they don’t study well.
i) If students focus on learning, their knowledge will increase; and if they don’t focus on learning, their knowledge will remain unchanged.

3) Denote by $M$, $T$, $W$ the statements ‘Today is Monday’, ‘Today is Tuesday’ and ‘Today is Wednesday’, respectively. Further, denote by $S$ the statement ‘Yesterday was Sunday’. Translate the following statements into proper English.

a) $M \rightarrow (T \vee W)$ ,

b) $S \leftrightarrow M$ ,

c) $S \wedge (M \vee T)$ ,

d) $(S \rightarrow T) \vee M$ ,

e) $M \leftrightarrow (T \wedge (\neg W)) \vee S$ ,

f) $(M \leftrightarrow T) \wedge ((\neg W) \vee S)$ .

4) By use of truth tables, prove that

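a) $\neg (\neg A) \Leftrightarrow A$ ,

b) $(A \wedge B) \Leftrightarrow (B \wedge A)$ ,

c) $(A \vee B) \Leftrightarrow (B \vee A)$ ,

d) $(A \Leftrightarrow B) \Leftrightarrow (B \Leftrightarrow A)$ ,

e) $\neg (A \wedge B) \Leftrightarrow (\neg A) \vee (\neg B)$ ,

f) $\neg (A \vee B) \Leftrightarrow (\neg A) \wedge (\neg B)$ ,

g) $(A \rightarrow B) \Leftrightarrow (\neg A) \vee B$ ,

h) $A \wedge (B \wedge C) \Leftrightarrow (A \wedge B) \wedge C$ ,

i) $A \vee (B \vee C) \Leftrightarrow (A \vee B) \vee C$ ,

k) $A \vee (B \wedge C) \Leftrightarrow (A \vee B) \wedge (A \vee C)$ ,

l) $A \wedge (B \vee C) \Leftrightarrow (A \wedge B) \vee (A \wedge C)$ ,

for arbitrary statements $A$, $B$ and $C$.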

5) Assume that

$a \cdot (b + c) = a \cdot b + c$

for all real $a$, $b$ and $c$ is a valid arithmetic rule of inference. Derive from this a contradiction to the valid arithmetic statement that $0 \neq 1$. Therefore, conclude that the enlargement of the field of arithmetic by addition of the above rule would lead to an inconsistent field.

6) Prove indirectly that $3n + 2$ is odd if $n$ is an odd integer.

7) Prove indirectly that there are no integers $m > 0$ and $n > 0$ such that

$m^2 - n^2 = 1$ .

8) If $a$, $b$ and $c$ are odd integers, then there is no rational number $x$ such that $ax^2 + bx + c = 0$. [Hint: Assume that there is such a rational number $x = r/s$ where $r$, $s \neq 0$ are integers without common divisor. Show that this implies the equation $r(ar + bs) = -cs^2$ which is contradictory.]

9) Prove that there is an infinite number of prime numbers, i.e., of natural numbers $\geq 2$ that are divisible without remainder only by 1 and by that number itself. [Hint: Assume the opposite and construct a number which is larger than the largest prime number, but not divisible without remainder by any of the prime numbers.]

10) Prove by cases that

$|x - 1| - |x + 2| \leq 3$

for all real $x$.

11) Prove by cases that

$|x - 1| + |x + 2| \geq 3$

for all real $x$.

12) Prove by cases that

$\left| \frac{a}{b} \right| = \frac{|a|}{|b|}$

for all real numbers $a$, $b$ such that $b \neq 0$.

13) Prove by cases that if $n$ is an integer, then $n^3$ is of the form $9k + r$ where $k$ is some integer and $r$ is equal to $-1$, $0$ or $1$.

14) Prove that if $n$ is an integer, then $n^5 - n$ is divisible by 5. [Hint: Factor the polynomial $n^5 - n$ as far as possible. Then consider the cases that $n$ is of the form $n = 5q + r$ where $q$ is an integer and $r$ is equal to 0, 1, 2, 3 or 4.]


2.2.2 Sets

Set theory was created by Georg Cantor between the years 1874 and 1897. Its development was triggered by the general effort to develop a rigorous basis for calculus / analysis in the 19th century. As we shall see later, for this it is necessary to treat infinite collections of real numbers. Since antiquity, most mathematicians did not consider collections of infinitely many objects as valid objects of thinking. This is likely due to the influence of ancient Greek philosophy, in particular that of Aristotle (384-322 B.C.), that dominated the thinking in the West up to the 18th century. According to Aristotle, the infinite is imperfect, unfinished and therefore unthinkable; it is formless and confused. Hence it had to be excluded from consideration. Precisely such consideration is done by set theory. For this reason, Cantor’s work initially received much criticism and was accused of dealing with fictions. Once its use for calculus / analysis was understood, attitudes began to change, and by the beginning of the 20th century, set theory was recognized as a distinct branch of mathematics. Finally, it even provided the basis for the whole of mathematics in the work of Bourbaki mentioned in Section 2.1. Today, the notions of set theory seem so natural that the in part fierce debates at the time of its creation are hard to understand.

In the following, only the very basics of Cantor’s original formulation of set theory are given, which is sufficient for the purposes of the book. Today, that approach is called ‘naive’ set theory because it uses a definition of sets which is too broad and leads to contradictions if its full generality is exploited. One such contradiction, the so-called Zermelo-Russell paradox, is described at the end of this section. So a more restrictive definition of sets is needed to avoid such contradictions. For this, we refer to books on axiomatic set theory. In the following, such paradoxes will not play a role because calculus / analysis naturally deals with a far reduced class of sets which satisfy the more restrictive definition of axiomatic set theory.

Like the previous section, this section is very important because the given notions of set theory will be in consistent use throughout the book as an efficient unifying language, but without going as far as Bourbaki’s work. Therefore, its careful study is advised to the reader. Like the material of the previous section, its apparent simplicity should not lead to an underestimation of its importance. Precisely the achievement of such simplicity is the ultimate goal of the whole of mathematics because it signals a full understanding of the studied object. Complexity just signals a deficient understanding. In addition, from a practical point of view, such simplicity drastically reduces the chance of the occurrence of errors.

In the following we adopt the naive definition of sets given by Cantor.

Definition 2.2.16. (Sets) A set is an aggregation of definite, different objects of our intuition or of our thinking, to be conceived as a whole. Those objects are called the elements of the set.

This implies that for a given set $A$ and any given object $a$ it follows that either $a$ is an element of $A$ or it is not. The first is denoted by $a \in A$, and the second is denoted by $a \notin A$. The set without any elements, the so-called ‘empty set’, is denoted by $\phi$.

Example 2.2.17. Examples of sets are

the set of all cats ,

the set of the lowercase letters of the Latin alphabet ,

the set of odd integers .

Definition 2.2.18. (Elements) For a set $A$, the following statements have the same meaning

$a$ is in $A$ ,

$a$ is an element of $A$ ,

$a$ is a member of $A$ ,

$a \in A$ .

Given some not necessarily different objects $x_1, x_2, \dots$, the set containing these objects is denoted by

$\{x_1, x_2, \dots\}$ .

In particular, we define the set of natural numbers $\mathbb{N}$, the set of natural numbers $\mathbb{N}^*$ without 0, the set of integers $\mathbb{Z}$ and the set of integers $\mathbb{Z}^*$ without 0 by

Definition 2.2.19. (Natural numbers, integers)

$\mathbb{N} := \{0, 1, 2, 3, \dots\}$ ,

$\mathbb{N}^* := \{1, 2, 3, \dots\}$ ,

$\mathbb{Z} := \{0, 1, -1, 2, -2, 3, -3, \dots\}$ ,

$\mathbb{Z}^* := \{1, -1, 2, -2, 3, -3, \dots\}$ .

Another way of defining a set is by a property characterizing its elements, i.e., by a property which is shared by all its elements, but not by any other object:

$\{x : x \text{ has the property } P(x)\}$ .

It is read as: ‘The set of all $x$ such that $P(x)$’. In this, the symbol ‘:’ is read as ‘such that’. In particular, we define the set of rational numbers $\mathbb{Q}$, the set of rational numbers $\mathbb{Q}^*$ without 0, the set of real numbers $\mathbb{R}$ and the set of real numbers $\mathbb{R}^*$ without 0 by

Definition 2.2.20. (Rational and real numbers)

$\mathbb{Q} := \{p/q : p \in \mathbb{Z} \wedge q \in \mathbb{N} \wedge q \neq 0\}$ ,

$\mathbb{Q}^* := \{p/q : p \in \mathbb{Z}^* \wedge q \in \mathbb{N} \wedge q \neq 0\}$ ,

$\mathbb{R} := \{x : x \text{ is a real number}\}$ ,

$\mathbb{R}^* := \{x : x \text{ is a non-zero real number}\}$ .

Definition 2.2.21. (Subsets, equality of sets) For all sets $A$ and $B$, we define

$A \subset B$ $:\Leftrightarrow$ Every element of $A$ is also an element of $B$

and say that ‘$A$ is a subset of $B$’, ‘$A$ is contained in $B$’, ‘$A$ is included in $B$’ or ‘$A$ is part of $B$’. Finally, we define

$A = B$ $:\Leftrightarrow$ $A \subset B \wedge B \subset A$

$\Leftrightarrow$ $A$ and $B$ contain the same elements .

Here and in the following, wherever meaningful, the symbol ‘:’ in front of other symbols means and is read as ‘per definition’.

Example 2.2.22. For instance,

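$\{1, 1, 2, 3, 5\} \subset \{1, 1, 2, 3, 5, 8, 13\}$ ,

$\{1, 1, 2, 3, 5\} \subset \{ [(1 + \sqrt{5})^n - (1 - \sqrt{5})^n] / (2^n \sqrt{5}) : n \in \mathbb{N}^* \}$ ,

$\{1, 2, 3, 3, 5, 1\} = \{1, 2, 3, 5\}$ ,

$\{1, 1, 2, 3, 5, \dots\} = \{ [(1 + \sqrt{5})^n - (1 - \sqrt{5})^n] / (2^n \sqrt{5}) : n \in \mathbb{N}^* \}$ .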
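The relations of Example 2.2.22 can be tried out with Python’s built-in `set` type, which, like Definition 2.2.16, ignores repetitions and the order of elements. The sketch below is only an illustration: the function name `binet` is a hypothetical choice, the closed formula is evaluated in floating point arithmetic and rounded, and only finitely many values of $n$ are considered.

```python
from math import sqrt


def binet(n):
    """Evaluate [(1 + sqrt 5)^n - (1 - sqrt 5)^n] / (2^n sqrt 5)."""
    s5 = sqrt(5.0)
    return ((1 + s5) ** n - (1 - s5) ** n) / (2 ** n * s5)


# Values of the formula for n = 1, ..., 7, rounded to the nearest integer
# to remove floating point rounding errors.
values = [round(binet(n)) for n in range(1, 8)]
print(values)

# Repeated elements and the order of listing do not matter for sets.
print({1, 2, 3, 3, 5, 1} == {1, 2, 3, 5})

# '<=' is Python's subset test for sets.
print({1, 1, 2, 3, 5} <= {1, 1, 2, 3, 5, 8, 13})
print({1, 1, 2, 3, 5} <= set(values))
```

The rounding step is safe only for moderate $n$; for large $n$ the floating point error grows and exact integer arithmetic should be used instead.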

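In particular, we define subsets of $\mathbb{R}$, so-called intervals, by

Definition 2.2.23.

$[a, b] := \{x \in \mathbb{R} : a \leq x \leq b\}$ , $(a, b) := \{x \in \mathbb{R} : a < x < b\}$

$[a, b) := \{x \in \mathbb{R} : a \leq x < b\}$ , $(a, b] := \{x \in \mathbb{R} : a < x \leq b\}$

$[c, \infty) := \{x \in \mathbb{R} : x \geq c\}$ , $(c, \infty) := \{x \in \mathbb{R} : x > c\}$

$(-\infty, d) := \{x \in \mathbb{R} : x < d\}$ , $(-\infty, d] := \{x \in \mathbb{R} : x \leq d\}$

for all $a, b \in \mathbb{R}$ such that $a \leq b$ and $c, d \in \mathbb{R}$.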

We define the following operations on sets.

Definition 2.2.24. (Operations on sets, I) For all sets $A$ and $B$, we define

(i) their union $A \cup B$, read: ‘$A$ union $B$’, by

$A \cup B := \{x : x \in A \vee x \in B\}$

Fig. 1: Two subsets $A$ and $B$ of the plane.

Fig. 2: Union $A \cup B$ and intersection $A \cap B$ of $A$ and $B$. The last is given by the blue domain.

Fig. 3: The relative complement $A \setminus B$ of $B$ in $A$.

(ii) and their intersection $A \cap B$, read: ‘$A$ intersection $B$’, by

$A \cap B := \{x : x \in A \wedge x \in B\}$ .

If $A \cap B = \phi$, we say that $A$ and $B$ are disjoint.

(iii) the relative complement of $B$ in $A$, $A \setminus B$, read: ‘$A$ without $B$’ or ‘$A$ minus $B$’, by

$A \setminus B := \{x : x \in A \wedge x \notin B\}$ .

(iv) their cross (or Cartesian / direct) product $A \times B$, read: ‘$A$ cross $B$’, by

$A \times B := \{(x, y) : x \in A \wedge y \in B\}$

where ordered pairs $(x_1, y_1)$, $(x_2, y_2)$ are defined equal,

$(x_1, y_1) = (x_2, y_2)$ ,

if and only if $x_1 = x_2$ and $y_1 = y_2$. We also use the notation $A^2$ for $A \times A$. More generally, we define for $n \in \mathbb{N}$ such that $n \geq 3$ and sets $A_1, \dots, A_n$ the corresponding Cartesian product

$A_1 \times \cdots \times A_n$ (2.2.5)

to consist of all ordered $n$-tuples $(x_1, \dots, x_n)$ of elements $x_1 \in A_1$, . . . , $x_n \in A_n$. Also in this case, we define such ordered $n$-tuples $(x_1, \dots, x_n)$ and $(y_1, \dots, y_n)$ to be equal if and only if all their components are equal, i.e., if and only if $x_1 = y_1$, . . . , $x_n = y_n$. We also use the notation

$\times_{i=1}^{n} A_i$

for (2.2.5) and, in the case that $A_1, \dots, A_n$ are all equal to some set $A$, the notation $A^n$. Finally, we define $\mathbb{R}^1 := \mathbb{R}$.

Fig. 4: Subsets $A$ of the real line and $B$ of the plane and their cross product $A \times B$.

Example 2.2.25.

$\{1, 2, 3, 5, 8, 13\} \cup \{1, 3, 4, 7, 11, 18\} = \{1, 2, 3, 4, 5, 7, 8, 11, 13, 18\}$ ,

$\{1, 2, 3, 5, 8, 13\} \cap \{1, 3, 4, 7, 11, 18\} = \{1, 3\}$ ,

$\{1, 2, 3, 5, 8, 13\} \setminus \{1, 2, 3, 5\} = \{8, 13\}$ ,

$\{1, 2, 3, 5, 1\} \setminus \{1\} = \{2, 3, 5\}$ ,

$\{1, 2\} \times \{1, 3, 4\} = \{(1, 1), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4)\}$ .
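For finite sets, the operations of Definition 2.2.24 correspond directly to operations on Python’s built-in `set` type. The following sketch recomputes the identities of Example 2.2.25; the variable names are illustrative, and the cross product is formed with `itertools.product`, which yields ordered pairs as tuples.

```python
from itertools import product

first = {1, 2, 3, 5, 8, 13}
second = {1, 3, 4, 7, 11, 18}

union = first | second                    # elements in first or second
intersection = first & second             # elements in first and second
difference = first - {1, 2, 3, 5}         # relative complement
cross = set(product({1, 2}, {1, 3, 4}))   # set of ordered pairs (x, y)

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(cross))
```

Tuples compare component by component, which matches the definition of equality of ordered pairs in Definition 2.2.24 (iv).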

We also define unions and intersections of arbitrary families of sets.

Definition 2.2.26. (Operations on sets, II) Let $I$ be some non-empty set and, for every $i \in I$, let $A_i$ be an associated set. Then we define

$\bigcup_{i \in I} A_i := \{x : x \in A_i \text{ for some } i \in I\}$ ,

$\bigcap_{i \in I} A_i := \{x : x \in A_i \text{ for all } i \in I\}$ .

Example 2.2.27. Determine

$\bigcup_{n \in \mathbb{N}^*} [1/n, 1]$ , $\bigcap_{n \in \mathbb{N}^*} [0, 1/n]$ .

Solution: By definition

$S_1 := \bigcup_{n \in \mathbb{N}^*} [1/n, 1] = \{x : x \in [1/n, 1] \text{ for some } n \in \mathbb{N}^*\}$ .

Any $x \in \mathbb{R}$ such that $x > 1$ or $x \leq 0$ is not contained in any of the sets $[1/n, 1]$, $n \in \mathbb{N}^*$, and hence also not contained in their union $S_1$. On the other hand, if $x \in \mathbb{R}$ is such that $0 < x \leq 1$, then

$\frac{1}{n} \leq x \leq 1$

if $n \in \mathbb{N}^*$ is such that $n \geq 1/x$. Hence for such $n$, $x \in [1/n, 1]$ and hence $x \in S_1$. As a consequence,

$\bigcup_{n \in \mathbb{N}^*} [1/n, 1] = (0, 1]$ .

Further, by definition

$S_2 := \bigcap_{n \in \mathbb{N}^*} [0, 1/n] = \{x : x \in [0, 1/n] \text{ for all } n \in \mathbb{N}^*\}$ .

No $x \in \mathbb{R}$ such that $x < 0$ is contained in any of the $[0, 1/n]$, $n \in \mathbb{N}^*$, and hence also not contained in $S_2$. 0 is contained in all of these sets and hence also contained in $S_2$. If $x \in \mathbb{R}$ is such that $x > 0$, then

$\frac{1}{n} < x$

for $n \in \mathbb{N}^*$ such that $n > 1/x$. Hence for such $n$, $x \notin [0, 1/n]$ and therefore $x \notin S_2$. As a consequence,

$\bigcap_{n \in \mathbb{N}^*} [0, 1/n] = \{0\}$ .
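The argument of Example 2.2.27 can be traced numerically for finitely many of the intervals. The sketch below uses exact rational arithmetic from the `fractions` module; the function names and the cut-off `n_max` are illustrative assumptions, and a finite cut-off can only approximate the infinite union and intersection.

```python
from fractions import Fraction


def in_finite_union(x, n_max):
    """True if x lies in [1/n, 1] for some n in {1, ..., n_max}."""
    return any(Fraction(1, n) <= x <= 1 for n in range(1, n_max + 1))


def in_finite_intersection(x, n_max):
    """True if x lies in [0, 1/n] for all n in {1, ..., n_max}."""
    return all(0 <= x <= Fraction(1, n) for n in range(1, n_max + 1))


x = Fraction(1, 1000)

# x enters the union as soon as n >= 1/x = 1000, as in the solution.
print(in_finite_union(x, 999), in_finite_union(x, 1000))

# x drops out of the intersection as soon as n > 1/x = 1000.
print(in_finite_intersection(x, 1000), in_finite_intersection(x, 1001))

# 0 never enters the union, but stays in every finite intersection.
print(in_finite_union(Fraction(0), 10000), in_finite_intersection(Fraction(0), 10000))
```

The thresholds observed here are exactly the conditions $n \geq 1/x$ and $n > 1/x$ used in the solution.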

The naive Definition 2.2.16 of sets leads to paradoxes like the one of Zermelo-Russell (1903):

Assume that there is a set of all sets that do not contain themselves as an element:

$S := \{x : x \text{ is a set} \wedge x \notin x\}$ .

Since $S$ is assumed to be a set, either $S \in S$ or $S \notin S$. From the assumption that $S \in S$, it follows by the definition of $S$ that $S \notin S$ . Hence it follows that $S \notin S$. From $S \notin S$, it follows by the definition of $S$ that $S \in S$ . Hence there is no such set.

Bertrand Russell also used a statement about a barber to illustrate this principle. If a barber cuts the hair of exactly those who do not cut their own hair, does the barber cut his own hair?

So a more restrictive definition of sets is needed to avoid such contradictions. For this, we refer to books on axiomatic set theory. In the following, such paradoxes will not play a role because we do not use the full generality of Definition 2.2.16. Calculus / analysis naturally deals with a far reduced class of sets which satisfy the more restrictive definition of axiomatic set theory.

Problems

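1) For each pair of the following sets, decide whether or not they are equal:

$A := \{-2, 3\}$, $B := \{3, -2\} \cup \phi$, $C := \{-2, 3\} \cup \{\phi\}$, $D := \{x \in \mathbb{R} : x^2 - x - 6 = 0\}$, $E := \{\phi, -2, 3\}$, $F := \{-2, 3, -2\}$, $G := \{-2, \phi, \phi, 3\}$ .

2) Simplify

$\{-2, 3\} \cup \{\{-2\}, \{3\}\} \cup \{-2, \{3\}\} \cup \{\{-2\}, 3\}$ .

3) Decide whether

$\{1, 3\} \in \{1, 3, \{1, 7\}, \{1, 3, 7\}\}$ .

Justify your answer.

4) Let $A := \{\phi, \{1\}, \{1, 3\}, \{3, 4\}\}$. Determine for each of the following statements whether it is true or false.

a) $1 \in A$ ,
b) $\{1\} \subset A$ ,
c) $\{1\} \in A$ ,
d) $\{1, 3\} \subset A$ ,
e) $\{\{1, 3\}\} \in A$ ,
f) $\phi \in A$ ,
g) $\phi \subset A$ ,
h) $\{\phi\} \subset A$ .

5) Give an example of sets $A$, $B$, $C$ such that $A \in B$ and $B \in C$, but $A \notin C$.

6) Sketch the following sets

$A := \{(x, y) \in \mathbb{R}^2 : x + y + 1 = 0\}$ ,

$B := \{(x, y) \in \mathbb{R}^2 : 2x + 3y + 5 = 0\}$ ,

$C := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$, $D := \{(0, 1)\}$ ,

$E := \{(-1, -1)\}$, $F := \{(0, -1)\}$, $G := \{(-1, 0)\}$, $H := \{(2, -3)\}$ ,

$I := \{(-4, 1)\}$, $J := \{x \in \mathbb{R} : -4 \leq x \leq 2\}$, $K := \{0\}$, $L := \{1\}$

in an $xy$-diagram and calculate $A \cap B$, $A \cap C$, $(A \cap B) \cap C$, $A \cap (B \cap C)$, $B \cap (J \times L)$, $C \cap (J \times K)$, $A \setminus B$, $B \setminus A$, $B \cup E$, $(C \cup F) \cup G$.

7) Let $A$, $B$ and $C$ be sets. Show that

a) If $A \subset B$ and $B \subset C$, then $A \subset C$ ,
b) $A \cup B = B \cup A$ ,
c) $A \cap B = B \cap A$ ,
d) $A \cup (B \cup C) = (A \cup B) \cup C$ ,
e) $A \cap (B \cap C) = (A \cap B) \cap C$ ,
f) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ ,
g) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$ ,
h) $C \setminus (A \cup B) = (C \setminus A) \cap (C \setminus B)$ ,
i) $C \setminus (A \cap B) = (C \setminus A) \cup (C \setminus B)$ .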


2.2.3 Maps

The development of the concept of a function and its generalization, i.e., the concept of a map (or ‘mapping’), are further major achievements of Western culture that have no counterpart in ancient Greek mathematics. The first concept underwent considerable changes until it reached its current meaning.

The principal objects of study of the calculus in the 17th century were geometric objects, in particular curves, but not functions in their current meaning. Also the variables associated with those objects had a geometrical meaning, like abscissas, ordinates and tangents. The term function appeared first in the works of Leibniz. In particular, he asserts that a tangent is a function of a curve. This only very roughly matches the modern notion of a function. Newton’s method of ‘fluxions’ applies to ‘fluents’, not to functions. For Newton, a curve is generated by a continuous motion of a point he called ‘fluent’ because he thought of it as a flowing quantity. The ‘fluxion’, or rate at which it flowed, was the point’s velocity.

Under the influence of analytic geometry, in the first half of the 18th century, the geometric concept of variables was replaced by the concept of a function as an equation or analytic expression composed of variables and numbers. Admissible analytic expressions were those that involved the four algebraic operations, roots, exponentials, logarithms, trigonometric functions, derivatives and integrals. In the sequel, as a consequence of the study of the solutions of the wave equation in one space dimension (‘the Vibrating-String Problem’), the concept of a function was enlarged to include functions that are piecewise defined on intervals by several analytic expressions and functions (in the sense of curves) drawn by ‘free-hand’ and possibly not expressible by any combination of analytic expressions.

The final step in the evolution of the function concept was made by Gustav Lejeune Dirichlet in 1829 [30] in a paper which gave a precise meaning to Fourier’s work from 1822 [41] on heat conduction. In that work, Fourier claimed that ‘any’ function defined over an interval $(-l, l)$ can be represented by his series over this interval. Not only by modern standards, Fourier’s statement and proof were insufficient, but a proof or disproof of that statement presupposed a clear definition of the concept of a function. For Dirichlet, $y$ is a function of a variable $x$, defined on the interval $a < x < b$, if to every value of the variable $x$ in this interval there corresponds a definite value of the variable $y$. Also, it is irrelevant in what way this correspondence is established. As early as 1887 [31], Richard Dedekind generalized the concept of a function to that of a mapping:

‘By a mapping of a system $S$ a law is understood, in accordance with which to each determinate element $s$ of $S$ there is associated a determinate object, which is called the image of $s$ and is denoted by $\varphi(s)$; we say too, that $\varphi(s)$ corresponds to the element $s$, that $\varphi(s)$ is caused or generated by the mapping $\varphi$ out of $s$, that $s$ is transformed by the mapping $\varphi$ into $\varphi(s)$.’

This definition practically coincides with the modern definition of maps given below.

Fourier’s claim pinpointed a major weakness in the mathematics of the 18th century. On the one hand, the insufficiency of Fourier’s ‘proof’ was obvious to the mathematical community at the time. On the other hand, the notion of a function was too nebulously defined for it to be convincingly claimed that his result was false. This clearly signaled that those mathematical notions (or the ‘mathematical language’) were too imprecise to deal with such questions and that more precise notions had to be developed. This makes clear the size of Dirichlet’s achievement. He had to solve simultaneously two intertwined problems, namely the giving of a precise mathematical meaning to Fourier’s result and the development of a mathematical framework where this is possible. In particular, it was not clear whether such a thing was possible at all. Until today, such problems are common in mathematics related to applications.

A careful study of this section, too, is advised. It introduces the current notion of maps and gives efficient means for their description which will be used throughout the book. If reference is made to a function or to a map in the following, imagining a picture similar to Fig. 5 should be helpful to the reader. Mathematically, it is possible to identify a map with a set, namely its graph, see Definition 2.2.33. Doing so exclusively is not advisable since the graph often does not provide any visual help, in particular when it is a subset of a space of more than 3 dimensions. The latter is frequently the case in applications. In addition, it often hinders intuition since maps are frequently used to describe transformations. It is more advisable to consider the graph of a map as one of the options to describe or visualize the latter. Indeed, this option will frequently be used in Calculus I and II. Other such options, becoming relevant in Calculus III, are contour and density maps. With increasing complexity of the considered problems, also in applications, the options for a meaningful visualization of the involved maps rapidly decrease and an abstract view of maps becomes essential.

Fig. 5: Points in the set $A$ and their images in the set $B$ under the map $f$ are connected by arrows. Compare Definition 2.2.28.


Definition 2.2.28. (Maps) Let $A$ and $B$ be non-empty sets.

(i) A map (or mapping) $f$ from $A$ into $B$, denoted by $f : A \to B$, is an association which associates to every element of $A$ a corresponding element of $B$. If $B$ is a subset of the real numbers, we call $f$ a function. We call $A$ the domain of $f$. If $f$ is given, we also use the short notation $D(f)$ for the domain of $f$.

(ii) For every $x \in A$, we call $f(x)$ the value of $f$ at $x$ or the image of $x$ under $f$.

(iii) For any subset $A'$ of $A$, we call the set $f(A')$ containing all the images of its elements under $f$,

$$f(A') := \{ f(x) : x \in A' \} , \tag{2.2.6}$$

the image of $A'$ under $f$. In particular, we call $f(A)$ the range or image of $f$. If $f$ is given, we also use the short notation

$$\mathrm{Ran}(f) := f(A) = \{ f(x) : x \in A \}$$

for the range or image of $f$.

(iv) For any subset $B'$ of $B$, we call the subset $f^{-1}(B')$ of $A$ containing all those elements which are mapped into $B'$,

$$f^{-1}(B') := \{ x \in A : f(x) \in B' \} ,$$

the inverse image of $B'$ under $f$. In particular, if $f$ is a function, we call

$$f^{-1}(\{0\}) = \{ x \in A : f(x) = 0 \}$$

the set of zeros of $f$ or the zero set of $f$.

(v) For any subset $A'$ of $A$, we define the restriction of $f$ to $A'$ as the map $f|_{A'} : A' \to B$ defined by

$$f|_{A'}(x) := f(x)$$

for all $x \in A'$.


Remark 2.2.29. (Variables) We will not introduce a precise notion of ‘variables’ in the following because it would be redundant. Still, a residue of this historic notion is present in the commonly used characterization of functions as functions of one variable, several variables or $n$ variables where $n \in \mathbb{N}$ is such that $n \geq 2$. Also in this text, we will refer to a function whose domain is a subset of $\mathbb{R}$ as a function of one variable and to a function whose domain is a subset of $\mathbb{R}^n$, where $n \in \mathbb{N}$ is such that $n \geq 2$, as a function of several variables or a function of $n$ variables.

Remark 2.2.30. In the following, we make the general assumption of basic knowledge of integer powers and $n$-th roots, where $n \in \mathbb{N}^*$, as well as of the functions

$$\sin : \mathbb{R} \to \mathbb{R} , \quad \arcsin : [-1, 1] \to [-\pi/2, \pi/2] ,$$

$$\cos : \mathbb{R} \to \mathbb{R} , \quad \arccos : [-1, 1] \to [0, \pi] ,$$

$$\tan : (-\pi/2, \pi/2) \to \mathbb{R} , \quad \arctan : \mathbb{R} \to (-\pi/2, \pi/2) ,$$

$$\exp : \mathbb{R} \to \mathbb{R} , \quad \ln : (0, \infty) \to \mathbb{R}$$

as provided by high school mathematics. Still, we give definitions of some of these functions later on to exemplify methods of calculus.

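Example 2.2.31. Define $f : \mathbb{Z} \to \mathbb{Z}$ by

$$f(n) := n^2$$

for all $n \in \mathbb{Z}$. Moreover, let $g$ be the restriction of $f$ to $\mathbb{N}$. Calculate

$$f(\mathbb{Z}) , \quad f(\{-2, -1, 0, 1, 2\}) , \quad f^{-1}(\{-1, 0, 1\}) , \quad f^{-1}(\{6\}) , \quad g^{-1}(\{-1, 0, 1\}) .$$

Solution:

$$f(\mathbb{Z}) = \{ n^2 : n \in \mathbb{N} \} , \quad f(\{-2, -1, 0, 1, 2\}) = \{0, 1, 4\} ,$$

$$f^{-1}(\{-1, 0, 1\}) = \{-1, 0, 1\} , \quad f^{-1}(\{6\}) = \emptyset , \quad g^{-1}(\{-1, 0, 1\}) = \{0, 1\} .$$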
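The calculations of Example 2.2.31 can be checked mechanically on a finite part of $\mathbb{Z}$. The following Python sketch is only an illustration: the helper names `image` and `inverse_image` and the search window `WINDOW` are assumptions introduced here, and restricting to a window is legitimate in this case only because $n^2$ exceeds all targets outside it.

```python
def f(n):
    """The map f : Z -> Z, f(n) = n^2, from Example 2.2.31."""
    return n * n


def image(func, subset):
    """Image func(subset) = {func(x) : x in subset} of a finite set."""
    return {func(x) for x in subset}


def inverse_image(func, domain, targets):
    """Inverse image {x in domain : func(x) in targets} for a finite domain."""
    return {x for x in domain if func(x) in targets}


# Hypothetical finite window of Z; sufficient here since n^2 > 6 for |n| > 2.
WINDOW = range(-10, 11)
# The part of N inside the window (N contains 0 in this text).
NATURALS = range(0, 11)

print(image(f, {-2, -1, 0, 1, 2}))
print(inverse_image(f, WINDOW, {-1, 0, 1}))
print(inverse_image(f, WINDOW, {6}))
print(inverse_image(f, NATURALS, {-1, 0, 1}))
```

The last call restricts the domain to natural numbers, which mirrors passing from $f$ to its restriction $g$.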

Example 2.2.32. Define $f : D_f \to \mathbb{R}$ and $g : D_g \to \mathbb{R}$ such that

(a) $f(x) = \sqrt{x + 2}$ for all $x \in D_f$,

(b) $g(x) = 1/(x^2 - x)$ for all $x \in D_g$,

and such that $D_f$ and $D_g$ are maximal. Find the domains $D_f$ and $D_g$. Give explanations. Solution: In case (a) the inequality

$$x + 2 \geq 0 \quad (\Leftrightarrow x \geq -2)$$

has to be satisfied in order that the square root is defined. Hence

$$D_f := \{ x \in \mathbb{R} : x \geq -2 \}$$

and $f : D_f \to \mathbb{R}$ is defined by $f(x) := \sqrt{x + 2}$ for all $x \in D_f$. In case (b) the denominator has to be different from zero in order that the quotient is defined. Because of

$$x^2 - x = x(x - 1) = 0 \Leftrightarrow x \in \{0, 1\} ,$$

we conclude that

$$D_g := \{ x \in \mathbb{R} : x \neq 0 \wedge x \neq 1 \}$$

and that $g : D_g \to \mathbb{R}$ is defined by $g(x) := 1/(x^2 - x)$ for all $x \in D_g$.

Definition 2.2.33. (Graph of a map) Let $A$ and $B$ be some sets and $f : A \to B$ be some map. Then we define the graph of $f$ by:

$$G(f) := \{ (x, f(x)) \in A \times B : x \in A \} .$$

Example 2.2.34. Sketch the graphs of the functions $f$ and $g$ from Example 2.2.32. Solution: See Fig. 6 and Fig. 7.

Example 2.2.35. Find the ranges of the functions in Example 2.2.32. Solution: Since the square root assumes only non-negative values, we conclude that

$$f(D_f) \subset \{ y : y \geq 0 \} .$$

Further, for every $y \in [0, \infty)$, it follows that

$$f(y^2 - 2) = \sqrt{y^2 - 2 + 2} = y$$

and hence that

$$\{ y : y \geq 0 \} \subset f(D_f)$$

and, finally, that $f(D_f) = \{ y : y \geq 0 \}$. Further, for $x < 0$ or $x > 1$, it follows that $x(x - 1) > 0$ and hence that $g(x) > 0$. For $0 < x < 1$, it follows that

$$-\frac{1}{4} \leq \left( x - \frac{1}{2} \right)^2 - \frac{1}{4} = x(x - 1) < 0$$

and hence that $g(x) \leq -4$. Hence it follows that

$$g(D_g) \subset \{ y : y > 0 \} \cup \{ y : y \leq -4 \} .$$

Finally, for any real $y$ such that $(y > 0) \vee (y \leq -4)$, it follows that

$$g\left( \frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}} \right) = y$$

and hence that $\{ y : y > 0 \} \cup \{ y : y \leq -4 \} \subset g(D_g)$. Therefore $g(D_g) = \{ y : y > 0 \} \cup \{ y : y \leq -4 \}$.

A map is called injective (or one-to-one) if no two points from its domain are mapped onto the same point. A map into a set $B$ is called surjective (or onto) if every element from $B$ is the image of some element from its domain. Finally, a map is called bijective (or one-to-one and onto) if it is injective and surjective.

Definition 2.2.36. (Injectivity, surjectivity, bijectivity) Let $A$ and $B$ be some sets and $f : A \to B$ be some map. We define

(i) $f$ is injective (or one-to-one) if different elements of $A$ are mapped into different elements of $B$, or equivalently if

$$f(x) = f(y) \Rightarrow x = y$$

for all $x, y \in A$. In this case, we define the inverse map $f^{-1}$ as the map from $f(A)$ into $A$ which associates to every $y \in f(A)$ the element $x \in A$ such that $f(x) = y$.

(ii) $f$ is surjective (or onto) if every element of $B$ is the image of some element(s) of $A$:

$$f(A) = B .$$

(iii) $f$ is bijective (or one-to-one and onto) if it is both injective and surjective. In this case, the domain of the inverse map is the whole of $B$.

Fig. 6: $G(f)$ from Example 2.2.32.

Fig. 7: $G(g)$ from Example 2.2.32.

Example 2.2.37. Let $f$ and $g$ be as in Example 2.2.31. In addition, define $h : \mathbb{Z} \to \mathbb{Z}$ by $h(n) := n + 1$ for all $n \in \mathbb{Z}$. Decide whether $f$, $g$ and $h$ are injective, surjective or bijective. If existent, give the corresponding inverse function(s).

Solution: $f$ is neither injective (and hence also not bijective) nor surjective, for instance, because of

$$f(-1) = f(1) = 1 , \quad 2 \notin f(\mathbb{Z}) .$$

$g$ is injective because if $m$ and $n$ are some natural numbers such that $g(m) = g(n)$, then it follows that

$$0 = m^2 - n^2 = (m - n)(m + n)$$

and hence that

$$m = n \vee m = -n$$

and therefore, since $g$ has as its domain the natural numbers, that $m = n$. The inverse $g^{-1} : g(\mathbb{N}) \to \mathbb{N}$ is given by $g^{-1}(l) = \sqrt{l}$ for all $l \in g(\mathbb{N})$. $g$ is not surjective (and hence also not bijective), for instance, since $2 \notin g(\mathbb{N})$. $h$ is injective because if $m$ and $n$ are some integers such that $h(m) = h(n)$, then it follows that

$$0 = m + 1 - (n + 1) = m - n$$

and hence that $m = n$. $h$ is surjective (and hence as a whole bijective) because for any integer $n$ we have $h(n - 1) = n$. The inverse function $h^{-1} : \mathbb{Z} \to \mathbb{Z}$ is given by $h^{-1}(n) = n - 1$ for all $n \in \mathbb{Z}$.
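For maps between finite sets, injectivity and surjectivity can be tested by direct enumeration. The sketch below is an illustration under that finiteness assumption; the function names `is_injective` and `is_surjective` are introduced here, and the finite sets are windows chosen for the demonstration, not the infinite sets $\mathbb{Z}$ and $\mathbb{N}$ of the example.

```python
def is_injective(func, domain):
    """True if no two elements of the finite domain share an image."""
    values = [func(x) for x in domain]
    return len(values) == len(set(values))


def is_surjective(func, domain, codomain):
    """True if every element of the finite codomain is an image.

    Assumes that func maps the domain into the codomain.
    """
    return {func(x) for x in domain} == set(codomain)


def square(n):
    return n * n


def shift(n):
    return n + 1


A = range(-3, 4)  # the finite set {-3, ..., 3}

print(is_injective(square, A))                # f(-1) = f(1), so not injective
print(is_injective(square, range(0, 4)))      # restriction to natural numbers
print(is_surjective(shift, A, range(-2, 5)))  # shift maps A onto {-2, ..., 4}
```

Such a finite check can refute injectivity or surjectivity of a map on an infinite set by exhibiting a counterexample, but it cannot prove either property there.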


The following theorem characterizes the injectivity, surjectivity and bijectivity of a map in terms of its graph. In the special case of functions defined on subsets of the real numbers, the theorem can be stated as follows. Such a function is injective if and only if every parallel to the $x$-axis intersects its graph in at most one point. If such a function maps into the set $B$, then it is surjective, or bijective, if and only if every parallel to the $x$-axis through a point $(0, y)$ with $y \in B$ intersects its graph in at least one point, or in precisely one point, respectively.

Theorem 2.2.38. Let $A$ and $B$ be sets and $f : A \to B$ be a map. Further, define for every $y \in B$ the corresponding intersection $G_{fy}$ by

$$G_{fy} := G(f) \cap \{ (x, y) : x \in A \} .$$

Then

(i) $f$ is injective if and only if $G_{fy}$ contains at most one point for all $y \in B$.

(ii) $f$ is surjective if and only if $G_{fy}$ is non-empty for all $y \in B$.

(iii) $f$ is bijective if and only if $G_{fy}$ contains exactly one point for all $y \in B$.

Proof. (i) The proof is indirect. Assume that there is $y \in B$ such that $G_{fy}$ contains two different points $(x_1, y)$ and $(x_2, y)$. Then, since $G_{fy}$ is part of $G(f)$, it follows that $y = f(x_1) = f(x_2)$ and hence, since by assumption $x_1 \neq x_2$, that $f$ is not injective. Conversely, assume that $f$ is not injective. Then there are different $x_1, x_2 \in A$ such that $f(x_1) = f(x_2)$. Hence $G_{f f(x_1)}$ contains two different points $(x_1, f(x_1))$ and $(x_2, f(x_1))$. (ii) If $f$ is surjective, then for any $y \in B$ there is some $x \in A$ such that $y = f(x)$ and hence $(x, y) \in G_{fy}$. On the other hand, if $G_{fy}$ is non-empty for all $y \in B$, then for every $y \in B$ there is some $x \in A$ such that $(x, y) \in G_{fy}$ and hence, since $G_{fy}$ is part of $G(f)$, such that $y = f(x)$. Hence $f$ is surjective. (iii) is an obvious consequence of (i) and (ii).

Fig. 8: $G(f)$ from Example 2.2.32 and parallels to the $x$-axis.

Fig. 9: $G(g)$ from Example 2.2.32 and parallels to the $x$-axis.

Example 2.2.39. Apply Theorem 2.2.38 to investigate the injectivity of $f$ and $g$ from Example 2.2.32. Solution: Fig. 8 and Fig. 9 suggest that $f$ is injective, but not surjective, and that $g$ is neither injective nor surjective.

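Example 2.2.40. Show that $f$ and the restriction of $g$ to

$$\{ x : x \geq 1/2 \wedge x \neq 1 \} ,$$

where $f$ and $g$ are from Example 2.2.32, are injective and calculate their inverses. Solution: If $x_1, x_2$ are any real numbers $\geq -2$ and such that $f(x_1) = f(x_2)$, then

$$\sqrt{x_1 + 2} = \sqrt{x_2 + 2}$$

and hence

$$x_1 + 2 = x_2 + 2$$

and $x_1 = x_2$. Hence $f$ is injective. Further, for every $y$ in the range of $f$ there is $x \geq -2$ such that

$$y = \sqrt{x + 2}$$

and hence

$$x = y^2 - 2 .$$

Therefore

$$f^{-1}(y) = y^2 - 2$$

for all $y$ from the range of $f$. Further, if $x_1$ and $x_2$ are some real numbers $\geq 1/2$ different from 1 and such that

$$\frac{1}{x_1^2 - x_1} = \frac{1}{x_2^2 - x_2} ,$$

then

$$(x_1 - x_2)(x_1 + x_2 - 1) = 0 .$$

Since $x_1, x_2 \geq 1/2$, the second factor vanishes only if $x_1 = x_2 = 1/2$, and hence $x_1 = x_2$ in every case. Hence the restriction of $g$ to $\{ x : x \geq 1/2 \wedge x \neq 1 \}$ is injective. Finally, if $y$ is some real number in the range of this restriction, then $y$ is in particular different from zero and

$$y = \frac{1}{x^2 - x} ,$$

hence

$$\left( x - \frac{1}{2} \right)^2 = \frac{1}{y} + \frac{1}{4}$$

and, because of $x \geq 1/2$,

$$x = \frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}} .$$

Therefore the inverse of the restriction of $g$ maps $y$ to

$$\frac{1}{2} + \sqrt{\frac{1}{y} + \frac{1}{4}}$$

for all $y$ from the range of that restriction of $g$.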

The next definition introduces the composition of maps, which corresponds to the application of maps in sequence.

Definition 2.2.41. (Composition) Let $A$, $B$, $C$ and $D$ be sets. Further, let $f : A \to B$ and $g : C \to D$ be maps. We define the composition $g \circ f : f^{-1}(B \cap C) \to D$ (read: ‘$g$ after $f$’) by

$$(g \circ f)(x) := g(f(x))$$

for all $x \in f^{-1}(B \cap C)$. Note that $g \circ f$ is trivial, i.e., has an empty domain, for instance, if $B \cap C = \emptyset$. Also note that $f^{-1}(B \cap C) = A$ if $B \subset C$.

Example 2.2.42. Calculate $f \circ f$, $h \circ h$, $f \circ h$ and $h \circ f$ where $f$, $h$ are defined as in Example 2.2.31, Example 2.2.37, respectively.

Solution: Obviously, all these maps map $\mathbb{Z}$ into itself. Moreover, for every $n \in \mathbb{Z}$:

$$(f \circ f)(n) = f(f(n)) = f(n^2) = (n^2)^2 = n^4 ,$$

$$(h \circ h)(n) = h(h(n)) = h(n + 1) = (n + 1) + 1 = n + 2 ,$$

$$(h \circ f)(n) = h(f(n)) = h(n^2) = n^2 + 1 ,$$

$$(f \circ h)(n) = f(h(n)) = f(n + 1) = (n + 1)^2 = n^2 + 2n + 1 .$$

Note in particular that $h \circ f \neq f \circ h$.
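Composition translates directly into code as a function that returns another function. The following Python sketch checks the formulas of Example 2.2.42 on a few integers; the helper name `compose` is an assumption introduced here for illustration.

```python
def compose(g, f):
    """Return the map 'g after f', i.e. x -> g(f(x))."""
    return lambda x: g(f(x))


def f(n):
    return n * n  # f(n) = n^2


def h(n):
    return n + 1  # h(n) = n + 1


h_after_f = compose(h, f)
f_after_h = compose(f, h)

for n in range(-3, 4):
    assert h_after_f(n) == n * n + 1
    assert f_after_h(n) == n * n + 2 * n + 1

# The two compositions differ, e.g. at n = 2.
print(h_after_f(2), f_after_h(2))
```

The order of the arguments of `compose` follows the reading ‘$g$ after $f$’: the second argument is applied first.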


Example 2.2.43. Let $A$ and $B$ be sets. Moreover, let $f : A \to B$ be some injective map. Calculate $f^{-1} \circ f$. Assume that $f$ is also surjective (and hence as a whole bijective) and calculate also $f \circ f^{-1}$ for this case.

Solution: To every $y \in f(A)$, the map $f^{-1}$ associates the corresponding $x \in A$ which satisfies $f(x) = y$. In particular, it associates to $f(x)$ the element $x$ for all $x \in A$. Hence

$$f^{-1} \circ f = \mathrm{id}_A , \quad f \circ f^{-1} = \mathrm{id}_{f(A)}$$

where for every set $C$ the corresponding map $\mathrm{id}_C : C \to C$ is defined by

$$\mathrm{id}_C(x) := x$$

for all $x \in C$. Further, if $f$ is bijective, $f(A) = B$ and hence

$$f \circ f^{-1} = \mathrm{id}_B .$$

The following theorem gives a relation between the graph of an injective map and the graph of its inverse. In the special case of functions defined on subsets of the real numbers, the theorem characterizes the graph of the inverse of such a function as the reflection of the graph of that function about the line $\{ (x, x) \in \mathbb{R}^2 : x \in \mathbb{R} \}$.

Theorem 2.2.44. (Graphs of inverses of maps) Let $A$ and $B$ be sets and $f : A \to B$ be an injective map. Moreover, define $R : A \times B \to B \times A$ by

$$R(x, y) := (y, x)$$

for all $x \in A$ and $y \in B$. Then the graph of the inverse map is given by

$$G(f^{-1}) = R(G(f)) .$$

Proof. ‘$\subset$’: Let $(y, f^{-1}(y))$ be an element of $G(f^{-1})$. Then $y \in f(A)$ and $f^{-1}(y) \in A$ is such that $f(f^{-1}(y)) = y$. Therefore $(f^{-1}(y), y) \in G(f)$ and

$$(y, f^{-1}(y)) = R(f^{-1}(y), y) \in R(G(f)) .$$

‘$\supset$’: Let $(f(x), x)$ be some element of $R(G(f))$. Then $f^{-1}(f(x)) = x$ and hence

$$(f(x), x) = (f(x), f^{-1}(f(x))) \in G(f^{-1}) .$$

Fig. 10: $G(f)$, $G(f^{-1})$ from Example 2.2.32 and the reflection axis.

Example 2.2.45. Apply Theorem 2.2.44 to the graph of the function $f$ from Example 2.2.32 to draw the graph of its inverse. (See Example 2.2.40.) Solution: See Fig. 10.
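The statement of Theorem 2.2.44 can be checked on a finite example by representing graphs as sets of pairs. The sketch below uses the injective map $h(n) = n + 1$ from Example 2.2.37, restricted to a finite set; the helper names `graph` and `reflect` and the chosen finite set are assumptions made for illustration.

```python
def graph(func, domain):
    """Graph G(f) = {(x, f(x)) : x in domain} of a map on a finite domain."""
    return {(x, func(x)) for x in domain}


def reflect(pairs):
    """Apply R(x, y) = (y, x) to every pair of a set of pairs."""
    return {(y, x) for (x, y) in pairs}


def h(n):
    return n + 1


def h_inverse(n):
    return n - 1


A = range(-3, 4)               # finite domain {-3, ..., 3}
image_of_A = [h(x) for x in A]  # domain of the inverse map

# G(h^{-1}) coincides with the reflected graph R(G(h)).
print(graph(h_inverse, image_of_A) == reflect(graph(h, A)))
```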

Problems

1) Find

$$f([0, \pi/2]) , \quad f^{-1}(\{1\}) , \quad f^{-1}(\{3\}) , \quad f^{-1}([0, 2]) .$$

In addition, find the maximal domain $D \subset \mathbb{R}$ that contains the point $\pi/8$ and is such that $f|_D$ is injective. Finally, calculate the inverse of the map $h : D \to f(D)$ defined by $h(x) := f(x)$ for all $x \in D$.

a) $f(x) := 2 \sin(3x)$, $x \in \mathbb{R}$,
b) $f(x) := 3 \cos(2x)$, $x \in \mathbb{R}$,
c) $f(x) := \tan(x/2)/3$, $x \in \{ x \in ((2k - 1)\pi, (2k + 1)\pi) : k \in \mathbb{Z} \}$.

Fig. 11: Subsets of $\mathbb{R}^2$. Which is the graph of a function?

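2) Define $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \setminus \{-1\} \to \mathbb{R}$ by $f(x) := x + 1$ for $x \in \mathbb{R}$ and $g(x) := (x + 1)^2/(x + 1)$ for $x \in \mathbb{R} \setminus \{-1\}$. Is $f = g$?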

3) Let $f : D_f \to \mathbb{R}$ be defined such that the given equation below is satisfied for all $x \in D_f$ and such that $D_f \subset \mathbb{R}$ is maximal. In each of the cases, find the corresponding $D_f$, the range of $f$, and draw the graph of $f$:

a) $f(x) = x^2 - 3$,
b) $f(x) = 1/\sqrt{x}$,
c) $f(x) = 1/(1 - x)$,
d) $f(x) = x^2/|x|$,
e) $f(x) = x/|x|$,
f) $f(x) = |x|^{1/3}$,
g) $f(x) = |x^2 - 1|$,
h) $f(x) = \sqrt{\sin(x)}$.

4) Which of the subsets of $\mathbb{R}^2$ in Fig. 11 is the graph of a function? Give reasons.

5) Find the function whose graph is given by

a) $\{ (x, y) \in \mathbb{R}^2 : x^2 y + x + 1 = 0 \}$,
b) $\{ (x, y) \in \mathbb{R}^2 : x = y/(y + 1) \}$,
c) $\{ (x, y) \in \mathbb{R}^2 : y^2 + 6xy + 9x^2 = 0 \}$.

6) In each of the following cases, find a bijective function that has domain $D$ and range $R$ and calculate its inverse.

a) $D = \{ x \in \mathbb{R} : 1 \leq x \leq 2 \}$, $R = \{ x \in \mathbb{R} : 3 \leq x \leq 7 \}$,
b) $D = \{ x \in \mathbb{R} : -1 \leq x \leq 1 \}$, $R = \{ x \in \mathbb{R} : x \geq 3 \}$.

7) a) Define $f : D_f \to \mathbb{R}$ and $g : D_g \to \mathbb{R}$ such that

$$f(x) := \frac{x - 1}{x - 3} , \quad g(x) := 2 + \sqrt{x^2 - 9}$$

for all $x \in D_f$, $x \in D_g$, respectively, and such that $D_f$ and $D_g$ are maximal. Find the domains and ranges of the functions $f$ and $g$. Give explanations.

b) If possible, calculate $(f \circ g)(5)$ and $(g \circ f)(5)$. Give explanations.

c) $f$ is injective (= ‘one to one’). Calculate its inverse.

8) Is there a function which is identical to its inverse? Is there more than one such function?

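9) Define $f : \mathbb{R} \to \mathbb{R}$, $g : \mathbb{R} \to \mathbb{R}$ and $h : \mathbb{R} \to \mathbb{R}$ by

$$f(x) := 1 + x , \quad g(x) := 1 + x + x^2 , \quad h(x) := 1 - x$$

for every $x \in \mathbb{R}$. Calculate

$$(f \circ f)(x) , \quad (f \circ g)(x) , \quad (g \circ f)(x) , \quad (g \circ g)(x) ,$$

$$(f \circ h)(x) , \quad (h \circ f)(x) , \quad (g \circ h)(x) , \quad (h \circ g)(x) ,$$

$$(h \circ h)(x) , \quad [f \circ (g \circ h)](x) , \quad [(f \circ g) \circ h](x)$$

for every $x \in \mathbb{R}$.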

10) Define $f : \mathbb{R} \to \mathbb{R}$, $g : \mathbb{R} \to \mathbb{R}$ and $h : \{ x \in \mathbb{R} : x > 0 \} \to \mathbb{R}$ by

$$f(x) := x + a , \quad g(x) := ax , \quad h(x) := x^a$$

for every $x$ in the corresponding domain where $a \in \mathbb{R}$. For each of these functions and every $n \in \mathbb{N}^*$, determine the $n$-fold composition with itself.

11) Define $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ by

$$f(x) := [\, 1 + (2 - x)^{1/3} \,]^{1/7} , \quad g(x) := \cos(2x)$$

for every $x \in \mathbb{R}$. Express $f$ and $g$ as a composition of four functions, none of which is the identity function. In addition, in the case of $g$, the sine function should be among those functions.


12) Let $A$ and $B$ be sets, $f : A \to B$ and $B_1$, $B_2$ be subsets of $B$. Show that

$$f^{-1}(B_1 \cup B_2) = f^{-1}(B_1) \cup f^{-1}(B_2) ,$$

$$f^{-1}(B_1 \cap B_2) = f^{-1}(B_1) \cap f^{-1}(B_2) .$$

13) Express the area of an equilateral triangle as a function of the length of a side.

14) Express the surface area of a sphere of radius $r > 0$ as a function of its volume.

15) Consider a circle $S^1_r$ of radius $r > 0$ around the origin of an $xy$-diagram. Express the length of its intersections with parallels to the $y$-axis as a function of their distance from the $y$-axis. Determine the domain and range of that function.

16) From each corner of a rectangular cardboard of side lengths $a > 0$ and $b > 0$, a square of side length $x \geq 0$ is removed, and the edges are turned up to form an open box. Express the volume of the box as a function of $x$ and determine the domain of that function.

17) Consider a body in the earth’s gravitational field which is at rest at time $t = 0$ and at height $s_0 > 0$ above the surface. Its height $s$ and speed $v$ as functions of time $t$ are given by

$$s(t) = s_0 - \frac{1}{2} g t^2 , \quad v(t) = -g t$$

where $g$ is approximately $9.81\,\mathrm{m/s^2}$. Determine the domain and range of the functions $s$ and $v$. In addition, express $s$ as a function of the speed and determine domain and range.


Fig. 12: Hexagons inscribed in and circumscribed about the unit circle.

2.3 Limits and Continuous Functions

2.3.1 Limits of Sequences of Real Numbers

To motivate infinite processes, we consider one of their early examples, namely Archimedes’ measurement of the circle. Archimedes considered regular polygons of 6, 12, 24, … sides inscribed in and circumscribed about the unit circle in order to achieve rational estimates of its circumference of increasing accuracy. Since trigonometric functions were not known at his time, he used, unlike the reasoning below, elementary geometric methods to derive the relation (2.3.1) below. Such a derivation is given as an exercise. See Problem 6 below.

For every $n = 6, 12, 24, \dots$, we define a corresponding $s_n$ as the circumference of the inscribed regular polygon of $n$ sides. Since geometric intuition suggests that the shortest connection of two points in the plane is a straight line, we expect $s_n$ to give a lower bound of the circumference of the unit circle, i.e., of $2\pi$. For the same reason, we expect, see Fig. 13, that the sequence $s_6, s_{12}, s_{24}, \dots$ is increasing. The proof of this is given as an exercise. See Problem 7 below.

Fig. 13: Depiction of Archimedes’ measurement of the circle. The dots in the corners C and D indicate right angles.

In particular,

$$s_n = n \cdot l_n$$

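where $l_n$ is the length of the side of the polygon. From Fig. 13, we conclude that

$$\frac{l_n}{2} = \sin\left( \frac{\pi}{n} \right) , \quad \frac{l_{2n}}{2} = \sin\left( \frac{\pi}{2n} \right) .$$

Further, it follows that

$$\sin\left( \frac{\pi}{n} \right) = \sin\left( 2 \cdot \frac{\pi}{2n} \right) = 2 \sin\left( \frac{\pi}{2n} \right) \cos\left( \frac{\pi}{2n} \right) = 2 \sin\left( \frac{\pi}{2n} \right) \sqrt{1 - \sin^2\left( \frac{\pi}{2n} \right)}$$

and hence that

$$\left[ \sin^2\left( \frac{\pi}{2n} \right) \right]^2 - \sin^2\left( \frac{\pi}{2n} \right) + \frac{1}{4} \sin^2\left( \frac{\pi}{n} \right) = 0 .$$

The last implies that

$$\sin^2\left( \frac{\pi}{2n} \right) = \frac{1}{2} \left[ 1 - \sqrt{1 - \sin^2\left( \frac{\pi}{n} \right)} \right]$$

and hence that

$$l_{2n}^2 = 4 \sin^2\left( \frac{\pi}{2n} \right) = 2 \left[ 1 - \sqrt{1 - \frac{l_n^2}{4}} \right] = \frac{l_n^2 / 2}{1 + \sqrt{1 - l_n^2 / 4}} .$$

Finally, we arrive at the recursion relation

$$l_{2n}^2 = \frac{l_n^2}{2 + \sqrt{4 - l_n^2}} \tag{2.3.1}$$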

which Archimedes used to obtain the length of the sides of the $2n$-gon from that of the $n$-gon. He started from $l_6 = 1$ to obtain

$$l_{12}^2 = \frac{1}{2 + \sqrt{3}} = 2 - \sqrt{3} .$$

In the next step, he used the approximation

$$\sqrt{3} \approx \frac{1351}{780}$$

to obtain a lower bound for $s_{12}$. Continuing in this fashion up to the 96-gon, he arrived at the approximation

$$s_{96} \approx 6 \tfrac{20}{71}$$

which gives the circumference of the circle, i.e., $2\pi$, within an error of $2 \cdot 10^{-3}$. Note that far better approximations to $2\pi$ were already known to the ancient Babylonians. More important is the fact that this method could be used to calculate $2\pi$ to arbitrary precision, i.e., within an error less than an arbitrarily small preassigned error bound $\varepsilon > 0$.
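The recursion (2.3.1) is easy to iterate numerically. The following Python sketch is an illustration only: it works in floating-point arithmetic rather than with Archimedes’ rational bounds, and the function name `polygon_circumferences` is an assumption introduced here.

```python
import math


def polygon_circumferences(doublings):
    """Yield (n, s_n) for inscribed regular n-gons, n = 6, 12, 24, ...

    Uses l_{2n}^2 = l_n^2 / (2 + sqrt(4 - l_n^2)) starting from l_6 = 1.
    """
    n, side_sq = 6, 1.0
    yield n, n * math.sqrt(side_sq)
    for _ in range(doublings):
        side_sq = side_sq / (2.0 + math.sqrt(4.0 - side_sq))
        n *= 2
        yield n, n * math.sqrt(side_sq)


# Four doublings lead from the hexagon to the 96-gon.
for n, s in polygon_circumferences(4):
    print(n, s, 2 * math.pi - s)
```

The third printed column is the remaining gap $2\pi - s_n$, which shrinks with every doubling while $s_n$ stays below $2\pi$.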

Given such an error bound $\varepsilon > 0$, and taking into account that the sequence $s_6, s_{12}, s_{24}, \dots$ is increasing, we expect that there is some corresponding natural number $N$ such that

$$2\pi - s_{2n} < \varepsilon$$

for all natural numbers $n$ such that $n \geq N$. Indeed, this expectation will turn out to be correct later. Since

$$2\pi - s_{2n} = |s_{2n} - 2\pi|$$

for all $n \in \mathbb{N}$, $n \geq 6$, we note that our expectation is equivalent to the statement that for every preassigned error bound $\varepsilon > 0$, there is some corresponding natural number $N$ such that

$$|s_{2n} - 2\pi| < \varepsilon$$

for all natural numbers $n$ such that $n \geq N$. The last is also used to define the limit of a sequence of real numbers in general.

Fig. 14: Dodecagon inscribed in a unit circle.

Definition 2.3.1. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}$ and $x \in \mathbb{R}$. Then we define

$$\lim_{n \to \infty} x_n = x$$

if for every $\varepsilon > 0$, there is a corresponding $n_0$ such that for all $n \geq n_0$

$$|x_n - x| < \varepsilon ,$$

i.e., from the $n_0$-th member on, all remaining members of the sequence are within a distance from $x$ which is less than $\varepsilon$. (Footnote 1: As a consequence, only finitely many members have distance $\geq \varepsilon$ from $x$.) In this case, we say that the sequence $x_1, x_2, \dots$ is convergent to $x$. Note that this implies that for every $\varepsilon > 0$

$$|x_n| = |x_n - x + x| \leq |x_n - x| + |x| \leq \varepsilon + |x|$$

for all $n \in \mathbb{N}^*$, apart from finitely many members of the sequence, and hence that $x_1, x_2, \dots$ is bounded, i.e., that there is $M \geq 0$ such that $|x_n| \leq M$ for all $n \in \mathbb{N}^*$. If the sequence is not convergent to any real number, we call the sequence divergent.

Fig. 15: $(n, (n + 1)/n)$ for $n = 1$ to $n = 50$ and asymptotes.

Example 2.3.2. Let $a$ be some real number and $x_n := a$ for all $n \in \mathbb{N}^*$. Then

$$\lim_{n \to \infty} x_n = a .$$

Indeed, if $\varepsilon > 0$ is given, then

$$|x_n - a| = |a - a| = 0$$

for all $n \in \mathbb{N}^*$. Hence we can choose $n_0 = 1$. Note that in this simple case, the chosen $n_0$ works for every $\varepsilon > 0$. In general this will be impossible.

Fig. 16: $(n, (-1)^n (n + 1)/n)$ for $n = 1$ to $n = 50$ and asymptotes.

Fig. 17: $(n, (n^2 + 1)/n)$ for $n = 1$ to $n = 50$ and an asymptote.

Example 2.3.3. Investigate whether the following limits exist.

(i)

$$\lim_{n \to \infty} \frac{n + 1}{n} , \tag{2.3.2}$$

(ii)

$$\lim_{n \to \infty} (-1)^n \cdot \frac{n + 1}{n} , \tag{2.3.3}$$

(iii)

$$\lim_{n \to \infty} \frac{n^2 + 1}{n} . \tag{2.3.4}$$

Solution: Fig. 15, Fig. 16 and Fig. 17 suggest that the limit (2.3.2) is 1, whereas the limits (2.3.3), (2.3.4) do not exist. Indeed

$$\lim_{n \to \infty} \frac{n + 1}{n} = 1 . \tag{2.3.5}$$

For the proof, let $\varepsilon$ be some real number $> 0$. Further, let $n_0$ be some natural number $> 1/\varepsilon$. Then it follows for every $n \in \mathbb{N}$ such that $n \geq n_0$:

$$\left| \frac{n + 1}{n} - 1 \right| = \frac{1}{n} \leq \frac{1}{n_0} < \varepsilon$$

and hence the statement (2.3.5). The proof that (2.3.3) does not exist proceeds indirectly. Assume on the contrary that there is some $x \in \mathbb{R}$ such that

$$\lim_{n \to \infty} (-1)^n \cdot \frac{n + 1}{n} = x .$$

Then there is some $n_0 \in \mathbb{N}$ such that

$$\left| (-1)^n \cdot \frac{n + 1}{n} - x \right| < \frac{1}{4}$$

for all $n \in \mathbb{N}$ such that $n \geq n_0$. Without restriction of generality, we can assume that $n_0 \geq 4$. Then it follows for any even $n \in \mathbb{N}$ such that $n \geq n_0$:

$$|x - 1| = \left| \frac{n + 1}{n} - x - \frac{1}{n} \right| \leq \left| \frac{n + 1}{n} - x \right| + \frac{1}{n} \leq \frac{1}{4} + \frac{1}{n_0} \leq \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$$

and for any odd $n \in \mathbb{N}$ such that $n \geq n_0$:

$$|x + 1| = \left| -\frac{n + 1}{n} - x + \frac{1}{n} \right| \leq \left| -\frac{n + 1}{n} - x \right| + \frac{1}{n} \leq \frac{1}{4} + \frac{1}{n_0} \leq \frac{1}{4} + \frac{1}{4} = \frac{1}{2} ,$$

and hence we arrive at the contradiction that

$$2 = |x - 1 - (x + 1)| \leq |x - 1| + |x + 1| \leq \frac{1}{2} + \frac{1}{2} = 1 .$$

Hence our assumption that (2.3.3) exists is false. The proof that (2.3.4) does not exist proceeds indirectly, too. Assume on the contrary that there is some $x \in \mathbb{R}$ such that

$$\lim_{n \to \infty} \frac{n^2 + 1}{n} = x .$$

Further, let $\varepsilon$ be some real number $> 0$. Finally, let $n_0$ be some natural number $\geq |x| + \varepsilon$. Then it follows for $n \geq n_0$ that

$$\left| \frac{n^2 + 1}{n} - x \right| = \left| n - x + \frac{1}{n} \right| = n - x + \frac{1}{n} > n - x \geq |x| + \varepsilon - x \geq \varepsilon .$$

Hence there is an infinite number of members of the sequence that have a distance from $x$ which is greater than $\varepsilon$. This contradicts the existence of a limit of (2.3.4). Hence such a limit does not exist.
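The choice of $n_0$ in the proof of (2.3.5) can be tried out numerically. The sketch below is a hedged illustration, not a proof: it checks the inequality only for finitely many $n$, uses floating-point arithmetic, and the names `n0_for` and `within_eps` are assumptions introduced here.

```python
import math


def n0_for(eps):
    """A natural number n0 > 1/eps, as chosen in the proof of (2.3.5)."""
    return math.floor(1.0 / eps) + 1


def within_eps(eps, how_many=1000):
    """Check |(n + 1)/n - 1| < eps for n = n0, ..., n0 + how_many - 1."""
    n0 = n0_for(eps)
    return all(abs((n + 1) / n - 1) < eps for n in range(n0, n0 + how_many))


for eps in (0.5, 0.1, 0.003):
    print(eps, n0_for(eps), within_eps(eps))
```

Smaller error bounds require larger $n_0$, in line with the remark after Example 2.3.2 that a single index usually cannot serve every $\varepsilon$.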

The alert reader might have noticed that Definition 2.3.1 would be inconsistent, and would have to be abandoned, if some sequence had more than one limit. Part (i) of the following Theorem 2.3.4 says that this is impossible.

In particular, this theorem says that a sequence in $\mathbb{R}$ can have at most one limit (in part (i)), that the sequence consisting of the sums of the members of convergent sequences in $\mathbb{R}$ converges to the sum of their limits (in part (ii)), that the sequence consisting of the products of the members of convergent sequences in $\mathbb{R}$ converges to the product of their limits (in part (iii)) and that the sequence consisting of the inverses of the members of a sequence convergent to a non-zero real number converges to the inverse of that number (in part (iv)).

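Theorem 2.3.4. (Limit Laws) Let $x_1, x_2, \dots$; $y_1, y_2, \dots$ be sequences of elements of $\mathbb{R}$ and $x, \bar{x}, y \in \mathbb{R}$.

(i) If

$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad \lim_{n \to \infty} x_n = \bar{x} ,$$

then $x = \bar{x}$.

(ii) If

$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad \lim_{n \to \infty} y_n = y ,$$

then

$$\lim_{n \to \infty} (x_n + y_n) = x + y .$$

(iii) If

$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad \lim_{n \to \infty} y_n = y ,$$

then

$$\lim_{n \to \infty} x_n \cdot y_n = x \cdot y .$$

(iv) If

$$\lim_{n \to \infty} x_n = x \quad \text{and} \quad x \neq 0 ,$$

then

$$\lim_{n \to \infty} \frac{1}{x_n} = \frac{1}{x} .$$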

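Proof. ‘(i)’: The proof is indirect. Assume that the assumption in (i) is true and that $x \neq \bar{x}$. Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ satisfying $n \geq n_0$:

$$|x_n - x| < \frac{1}{2} |x - \bar{x}| \quad \text{and} \quad |x_n - \bar{x}| < \frac{1}{2} |x - \bar{x}| .$$

Hence we arrive at the contradiction that

$$|x - \bar{x}| = |x - x_n + x_n - \bar{x}| \leq |x - x_n| + |x_n - \bar{x}| < |x - \bar{x}| .$$

Hence it follows that $x = \bar{x}$. ‘(ii)’: Assume that the assumption in (ii) is true. Further, let $\varepsilon > 0$. Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ with $n \geq n_0$:

$$|x_n - x| < \frac{\varepsilon}{2} \quad \text{and} \quad |y_n - y| < \frac{\varepsilon}{2}$$

and hence

$$|x_n + y_n - (x + y)| \leq |x_n - x| + |y_n - y| < \varepsilon .$$

‘(iii)’: Assume that the assumption in (iii) is true. Further, let $\varepsilon > 0$ and $\delta > 0$ such that $\delta (\delta + |x| + |y|) < \varepsilon$. (Obviously, such a $\delta$ exists.) Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ with $n \geq n_0$:

$$|x_n - x| < \frac{\delta}{2} \quad \text{and} \quad |y_n - y| < \frac{\delta}{2} .$$

Then

$$|x_n \cdot y_n - x \cdot y| = |x_n \cdot y_n - x_n \cdot y + x_n \cdot y - x \cdot y| \leq |x_n| \cdot |y_n - y| + |x_n - x| \cdot |y|$$

$$\leq |x_n - x| \cdot |y_n - y| + |x| \cdot |y_n - y| + |x_n - x| \cdot |y| < \varepsilon .$$

‘(iv)’: Assume that the assumption in (iv) is true. Further, let $\varepsilon > 0$ and $\delta > 0$ such that $\delta < |x|$ and $\delta / (|x| (|x| - \delta)) < \varepsilon$. (Obviously, such a $\delta$ exists.) Then there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ satisfying $n \geq n_0$:

$$|\, |x_n| - |x| \,| \leq |x_n - x| < \delta ,$$

and hence also

$$|x_n| > |x| - \delta > 0$$

and

$$\left| \frac{1}{x_n} - \frac{1}{x} \right| = \frac{|x_n - x|}{|x_n| \cdot |x|} < \frac{|x_n - x|}{(|x| - \delta) \cdot |x|} < \frac{\delta}{(|x| - \delta) \cdot |x|} < \varepsilon .$$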


Remark 2.3.5. The previous theorem is of fundamental importance in the investigation of sequences. Usually, it is applied as follows. First, a given sequence of real numbers is decomposed into combinations of sums, products and quotients of sequences whose convergence is already known. Then the application of the theorem proves the convergence of the sequence and allows the calculation of its limit if the limits of those constituents are known.

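Example 2.3.6. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where

$$x_n := \frac{1}{n}$$

for all $n \in \mathbb{N}^*$. Solution: In Example 2.3.3, we proved that

$$\lim_{n \to \infty} \frac{n + 1}{n} = 1 .$$

Since

$$\frac{1}{n} = \frac{n + 1}{n} + (-1)$$

for every $n \in \mathbb{N}^*$, it follows by Theorem 2.3.4 and Example 2.3.2 that

$$\lim_{n \to \infty} \frac{1}{n}$$

exists and that

$$\lim_{n \to \infty} \frac{1}{n} = \lim_{n \to \infty} \left( \frac{n + 1}{n} + (-1) \right) = \lim_{n \to \infty} \frac{n + 1}{n} + \lim_{n \to \infty} (-1) = 1 + (-1) = 0 .$$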


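Example 2.3.7. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where

$$x_n := \frac{1}{n + a}$$

for all $n \in \mathbb{N}^*$ and $a \geq 0$. Solution: First, we notice that

$$x_n = \frac{1}{n + a} = \frac{1}{1 + \frac{a}{n}} \cdot \frac{1}{n} \tag{2.3.6}$$

for every $n \in \mathbb{N}^*$. Further, by Theorem 2.3.4, Example 2.3.2 and Example 2.3.6, it follows that

$$\lim_{n \to \infty} \left( 1 + \frac{a}{n} \right)$$

exists and that

$$\lim_{n \to \infty} \left( 1 + \frac{a}{n} \right) = \lim_{n \to \infty} 1 + \left( \lim_{n \to \infty} a \right) \left( \lim_{n \to \infty} \frac{1}{n} \right) = 1 + a \cdot 0 = 1 .$$

Since the last is different from 0, it follows by Theorem 2.3.4 that

$$\lim_{n \to \infty} \frac{1}{1 + \frac{a}{n}} = \frac{1}{\lim_{n \to \infty} \left( 1 + \frac{a}{n} \right)} = \frac{1}{1} = 1 .$$

Finally, again by application of Theorem 2.3.4, it follows from this and Example 2.3.6 that $x_1, x_2, \dots$ converges and that

$$\lim_{n \to \infty} x_n = \left( \lim_{n \to \infty} \frac{1}{1 + \frac{a}{n}} \right) \left( \lim_{n \to \infty} \frac{1}{n} \right) = 1 \cdot 0 = 0 .$$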

Remark 2.3.8. Note that the result in the last example is unchanged if $a$ is some arbitrary real number. Only if $a$ is some integer $< 0$, the term $x_{-a}$ has to be excluded from the sequence because it is undefined.


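Example 2.3.9. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where

$$x_n := \frac{3n + 2}{2n + 1}$$

for all $n \in \mathbb{N}^*$. Solution: First, we notice that

$$x_n = \frac{3n + 2}{2n + 1} = \frac{\frac{3}{2} \cdot 2n + 2}{2n + 1} = \frac{\frac{3}{2} \cdot (2n + 1) + \frac{1}{2}}{2n + 1} = \frac{3}{2} + \frac{1}{4} \cdot \frac{1}{n + \frac{1}{2}} .$$

Hence it follows by Theorem 2.3.4, Example 2.3.2 and Example 2.3.7 that $x_1, x_2, \dots$ converges and that

$$\lim_{n \to \infty} x_n = \left( \lim_{n \to \infty} \frac{3}{2} \right) + \left( \lim_{n \to \infty} \frac{1}{4} \right) \left( \lim_{n \to \infty} \frac{1}{n + \frac{1}{2}} \right) = \frac{3}{2} + \frac{1}{4} \cdot 0 = \frac{3}{2} .$$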
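The algebraic decomposition of $(3n + 2)/(2n + 1)$ used above can be verified with exact rational arithmetic. The following Python sketch relies on the standard `fractions` module; the function name `x` is an assumption chosen to match the notation of the example.

```python
from fractions import Fraction


def x(n):
    """x_n = (3n + 2) / (2n + 1) as an exact fraction."""
    return Fraction(3 * n + 2, 2 * n + 1)


for n in (1, 10, 100, 1000):
    # Decomposition used in the example: x_n = 3/2 + (1/4) * 1/(n + 1/2).
    assert x(n) == Fraction(3, 2) + Fraction(1, 4) / (n + Fraction(1, 2))
    # Print n, x_n and the remaining distance to the limit 3/2.
    print(n, float(x(n)), float(x(n) - Fraction(3, 2)))
```

The printed distances equal $1/(4n + 2)$ and decrease towards 0, which is consistent with the limit $3/2$.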


The following is a comparison theorem that allows us to conclude the convergence of one sequence from the convergence of another sequence that dominates it.

Theorem 2.3.10. Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of real numbers such that

$$|x_n| \leq y_n$$

for all $n \in \mathbb{N}^*$. Further, let

$$\lim_{n \to \infty} y_n = 0 .$$

Then

$$\lim_{n \to \infty} x_n = 0 .$$

Proof. Let $\varepsilon > 0$. Since $y_1, y_2, \dots$ is convergent to 0, there is $n_0 \in \mathbb{N}$ such that

$$|x_n| \leq y_n = |y_n| < \varepsilon$$

for all $n \geq n_0$. Hence it follows that $x_1, x_2, \dots$ is convergent to 0.

Example 2.3.11. Prove the convergence of the sequence $x_1, x_2, \dots$ and calculate its limit where

$$x_n := \frac{1}{n^2 + a^2}$$

for all $n \in \mathbb{N}^*$ and $a \in \mathbb{R}$. Solution: We note that for every $n \in \mathbb{N}^*$

$$\frac{1}{n^2 + a^2} \leq \frac{1}{n} .$$

Hence it follows by Theorem 2.3.10 and Example 2.3.6 that

$$\lim_{n \to \infty} x_n = 0 .$$

The following theorem is often used in the analysis of convergent sequences whose limits cannot readily be determined. In this way, by approximation of the members of the sequence, estimates of its limit can frequently be derived.


Theorem 2.3.12. (Limits preserve inequalities) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of elements of $\mathbb{R}$ converging to $x, y \in \mathbb{R}$, respectively. Further, let $x_n \leq y_n$ for all $n \in \mathbb{N}^*$. Then also $x \leq y$.

Proof. The proof is indirect. Assume on the contrary that $x > y$. Then it follows the existence of an $n \in \mathbb{N}^*$ such that both

$$x - x_n \leq |x_n - x| < \frac{1}{2} (x - y) , \quad y_n - y \leq |y_n - y| < \frac{1}{2} (x - y)$$

and hence the contradiction

$$x - y \leq x - y + y_n - x_n < x - y .$$

Hence $x \leq y$.

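Example 2.3.13. Define the sequence $x_1, x_2, \dots$ recursively by

$$x_{n+1} := \frac{1}{2} \left( x_n + \frac{a}{x_n} \right)$$

for all $n \in \mathbb{N}^*$ where $x_1 > 0$ and $a \geq 0$. Show that

$$\lim_{n \to \infty} x_n \geq \sqrt{a} \tag{2.3.7}$$

if $x_1, x_2, \dots$ converges. Solution: For every $x > 0$, it follows that

$$0 \leq \left( x - \sqrt{a} \right)^2 = x^2 - 2 \sqrt{a}\, x + a$$

and hence that

$$\frac{1}{2} \left( x + \frac{a}{x} \right) = \frac{1}{2x} (x^2 + a) \geq \frac{2 \sqrt{a}\, x}{2x} = \sqrt{a} .$$

Therefore, since $x_1 > 0$, it follows inductively that $x_n > 0$ for all $n \in \mathbb{N}^*$ and hence that

$$x_n \geq \sqrt{a}$$

for all $n \in \mathbb{N}^* \setminus \{1\}$. Hence if $x_1, x_2, \dots$ is convergent, it follows by Theorem 2.3.12 the validity of (2.3.7).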
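The recursion of Example 2.3.13 can be run numerically to watch the lower bound $\sqrt{a}$ take effect from the second member on. The Python sketch below is an illustration in floating-point arithmetic; the function name `iterate` and the sample values $a = 2$, $x_1 = 0.1$ are assumptions chosen for the demonstration.

```python
import math


def iterate(a, x1, steps):
    """Return [x_1, ..., x_{steps+1}] for x_{n+1} = (x_n + a / x_n) / 2."""
    xs = [x1]
    for _ in range(steps):
        xs.append(0.5 * (xs[-1] + a / xs[-1]))
    return xs


xs = iterate(a=2.0, x1=0.1, steps=8)
for n, value in enumerate(xs, start=1):
    # The difference to sqrt(a) is negative only for the start value x_1,
    # up to rounding errors in the last digits.
    print(n, value, value - math.sqrt(2.0))
```

Here the start value lies below $\sqrt{2}$, yet all later members stay at or above it, as the example predicts. Because of rounding, the comparison should allow a small tolerance once the members are close to $\sqrt{2}$.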


In many cases, in particular those related to applications where sequences are often defined recursively, it is not obvious how to decide whether a given sequence is convergent or divergent. Then one usually tries first to establish the existence of a limit by application of a very general theorem, i.e., a theorem that is applicable to a very large class of sequences that have only few specific properties. If the sequence is found to be convergent, the determination of its limit or the derivation of estimates of that limit is performed in subsequent steps. The derivation of such general theorems is the goal in the following.

For this, we notice that Definition 2.3.1 is not of much use for deciding the convergence of a given sequence if there is no obvious candidate for its limit. Therefore it is natural to ask whether there is a general way to decide that convergence without reference to a limit. Indeed, this is possible by means of the so-called Cauchy criterion. For its formulation, we need the notion of Cauchy sequences. Roughly speaking, a sequence $x_1, x_2, \dots$ of real numbers is called a Cauchy sequence if for every preassigned error bound $\varepsilon > 0$, after omission of finitely many terms of the sequence, the distance between every two members of the remaining sequence is smaller than $\varepsilon$.

Definition 2.3.14. (Cauchy sequences) We call a sequence $x_1, x_2, \dots$ of real numbers a Cauchy sequence if for every $\varepsilon > 0$ there is a corresponding $n_0 \in \mathbb{N}^*$ such that

\[
|x_m - x_n| < \varepsilon
\]

for all $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$.

Example 2.3.15. Define $x_1 := 0$, $x_2 := 1$ and

\[
x_{n+2} := \frac{1}{2}\,(x_n + x_{n+1})
\]

for all $n \in \mathbb{N}^*$. Show that $x_1, x_2, \dots$ is a Cauchy sequence. Solution: First, it follows for every $n \in \mathbb{N}^*$ that $x_{n+2}$ is the midpoint of the interval $I_n$ between $x_n$ and $x_{n+1}$, given by $I_n = [x_n, x_{n+1}]$ if $x_n \leq x_{n+1}$ and $I_n = [x_{n+1}, x_n]$ if $x_n > x_{n+1}$. Further,

\[
x_{n+2} - x_{n+1} = \frac{1}{2}\,(x_n + x_{n+1}) - x_{n+1} = -\frac{1}{2}\,(x_{n+1} - x_n) .
\]

Hence it follows by induction that $I_1 \supset I_2 \supset I_3 \supset \dots$ and that

\[
x_{n+1} - x_n = \frac{(-1)^{n-1}}{2^{n-1}} .
\]

As a consequence, if $\varepsilon > 0$ and $n_0 \in \mathbb{N}^*$ is such that $2^{1-n_0} < \varepsilon$, then it follows for $m, n \in \mathbb{N}^*$ satisfying $m \geq n_0$ and $n \geq n_0$ that $x_m, x_n \in I_{n_0}$ and therefore that

\[
|x_m - x_n| \leq \frac{1}{2^{n_0-1}} < \varepsilon .
\]

Hence $x_1, x_2, \dots$ is a Cauchy sequence. See Fig. 18.

Fig. 18: $(n, x_n)$ from Example 2.3.15 for $n = 1$ to $n = 50$.
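The two facts used in this solution, the closed form for consecutive differences and the bound on the distance of any two late terms, can be checked on finitely many terms. The sketch below is only an illustration under that restriction; the helper names `averaging_terms` and `tail_spread` are not from the text.

```python
from fractions import Fraction

def averaging_terms(count):
    """Return [x_1, ..., x_count] with x_1 = 0, x_2 = 1, x_{n+2} = (x_n + x_{n+1}) / 2."""
    terms = [Fraction(0), Fraction(1)]
    while len(terms) < count:
        terms.append((terms[-2] + terms[-1]) / 2)
    return terms[:count]

def tail_spread(terms, n0):
    """Largest distance |x_m - x_n| among the computed terms with m, n >= n0."""
    tail = terms[n0 - 1:]  # terms[0] holds x_1
    return max(tail) - min(tail)

terms = averaging_terms(50)
for n0 in (1, 5, 10, 20):
    # Second column: observed spread; third column: the bound 1 / 2^(n0 - 1).
    print(n0, tail_spread(terms, n0), Fraction(1, 2 ** (n0 - 1)))
```

A finite computation of this kind cannot prove the Cauchy property; it only shows the bound at work for the computed terms.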

The following is easy to show.


Theorem 2.3.16. Every convergent sequence of real numbers is a Cauchy sequence.

Proof. For this, let $x_1, x_2, \dots$ be a sequence of real numbers converging to some $x \in \mathbb{R}$ and $\varepsilon > 0$. Then there is $n_0 \in \mathbb{N}^*$ such that

\[
|x_n - x| < \frac{\varepsilon}{2}
\]

for all $n \in \mathbb{N}^*$ satisfying $n \geq n_0$. The last implies that

\[
|x_m - x_n| = |x_m - x - (x_n - x)| \leq |x_m - x| + |x_n - x| < \varepsilon
\]

for all $n, m \in \mathbb{N}^*$ satisfying $n \geq n_0$ and $m \geq n_0$. Hence $x_1, x_2, \dots$ is a Cauchy sequence.

The converse statement, that every Cauchy sequence of real numbers is convergent, is not obvious, but a deep property of the real number system. This is proved in the Appendix, see the proof of Theorem 5.1.11, in the framework of Cantor’s construction of the real number system by completion of the rational numbers using Cauchy sequences. The most important parts of calculus / analysis are based on the following theorem or, equivalently, on the Bolzano-Weierstrass theorem below.

Theorem 2.3.17. (Completeness of the real numbers) Every Cauchy sequence of real numbers is convergent.

Proof. See the proof of Theorem 5.1.11 in the Appendix.

In the following, we derive far-reaching consequences of the completeness of the real numbers.

Theorem 2.3.18. (Bolzano-Weierstrass) For every bounded sequence $x_1, x_2, \dots$ of real numbers there is a convergent subsequence, i.e., a convergent sequence $x_{n_1}, x_{n_2}, \dots$ that corresponds to a strictly increasing sequence $n_1, n_2, \dots$ of non-zero natural numbers.


Proof. For this let $x_1, x_2, \dots$ be a bounded sequence of real numbers. Then we define

\[
S := \{x_1, x_2, \dots\} .
\]

In case that $S$ is finite, there is a subsequence $x_{n_1}, x_{n_2}, \dots$ which is constant and hence convergent. In case that $S$ is infinite, we choose some element $x_{n_1}$ of the sequence. Since $S$ is bounded, there is $a > 0$ such that $S \subset I_1 := [-a/4, a/4]$. At least one of the intervals $[-a/4, 0]$, $[0, a/4]$ contains infinitely many elements of $S$. We choose such an interval $I_2$ and $x_{n_2} \in I_2$ such that $n_2 > n_1$. In particular $I_2 \subset I_1$. Bisecting $I_2$ into two intervals, we can choose a subinterval $I_3 \subset I_2$ containing infinitely many elements of $S$ and $x_{n_3} \in I_3$ such that $n_3 > n_2$. Continuing this process, we arrive at a sequence of intervals $I_1, I_2, \dots$ such that $I_1 \supset I_2 \supset \dots$ and such that the length of $I_k$ is $a/2^k$ for every $k \in \mathbb{N}^*$. Also, we arrive at a subsequence $x_{n_1}, x_{n_2}, \dots$ of $x_1, x_2, \dots$ such that $x_{n_k} \in I_k$ for every $k \in \mathbb{N}^*$. For given $\varepsilon > 0$, there is $k_0 \in \mathbb{N}^*$ such that $a/2^{k_0} < \varepsilon$. Further, let $k, l \in \mathbb{N}^*$ be such that $k \geq k_0$ and $l \geq k_0$. Then it follows that $x_{n_k} \in I_{k_0}$, $x_{n_l} \in I_{k_0}$ and therefore that

\[
|x_{n_k} - x_{n_l}| \leq \frac{a}{2^{k_0}} < \varepsilon .
\]

Hence $x_{n_1}, x_{n_2}, \dots$ is a Cauchy sequence and therefore convergent according to Theorem 2.3.17.

For the following, the Bolzano-Weierstrass theorem will be fundamental. It will be applied in the proofs of a number of important theorems, for instance, Theorem 2.3.33, Theorem 2.3.44 and Theorem 3.5.59. Also the following theorem is an important and frequently applied consequence of the Bolzano-Weierstrass theorem. Until the beginning of the 19th century its statement must have been considered as geometrically obvious, because it was used without mention. For instance, in Augustin-Louis Cauchy’s textbook ‘Cours d’analyse’ from 1821 [22], it is implicitly used in the proof of the intermediate value theorem, see Theorem 2.3.37 below, but without proof. From today’s perspective, it is clear that such geometric intuition was based on an illusion.

Theorem 2.3.19. Let $x_1, x_2, \dots$ be an increasing sequence of real numbers, i.e., such that $x_n \leq x_{n+1}$ for all $n \in \mathbb{N}^*$, which is also bounded from above, i.e., for which there is $M \geq 0$ such that $x_n \leq M$ for all $n \in \mathbb{N}^*$. Then $x_1, x_2, \dots$ is convergent.

Proof. Since $x_1, x_2, \dots$ is increasing and bounded from above, it follows that this sequence is also bounded. Hence, according to the previous theorem, there is a convergent subsequence $x_{n_1}, x_{n_2}, \dots$, where $n_1, n_2, \dots$ is a strictly increasing sequence of non-zero natural numbers. We denote the limit of this subsequence by $x$. Then

\[
x_n \leq x
\]

for all $n \in \mathbb{N}^*$. Otherwise, there is $m \in \mathbb{N}^*$ such that $x_m > x$. If $k_0 \in \mathbb{N}^*$ is such that $n_{k_0} \geq m$, then

\[
x_{n_k} \geq x_m > x
\]

for all $k \in \mathbb{N}^*$ such that $k \geq k_0$. This implies that

\[
\lim_{k\to\infty} x_{n_k} \geq x_m > x ,
\]

which contradicts the definition of $x$. Further, for $\varepsilon > 0$, there is $k_0 \in \mathbb{N}^*$ such that

\[
|x_{n_k} - x| < \varepsilon
\]

for all $k \in \mathbb{N}^*$ such that $k \geq k_0$. Hence it follows for all $n \in \mathbb{N}^*$ satisfying $n \geq n_{k_0}$ that

\[
|x_n - x| = x - x_n \leq x - x_{n_{k_0}} = |x_{n_{k_0}} - x| < \varepsilon .
\]

Therefore, $x_1, x_2, \dots$ is convergent to $x$.

Corollary 2.3.20. Let $x_1, x_2, \dots$ be a decreasing sequence of real numbers, i.e., such that $x_{n+1} \leq x_n$ for all $n \in \mathbb{N}^*$, which is also bounded from below, i.e., for which there is a real $M \leq 0$ such that $x_n \geq M$ for all $n \in \mathbb{N}^*$. Then $x_1, x_2, \dots$ is convergent.

Proof. The sequence $-x_1, -x_2, \dots$ is increasing, bounded from above and therefore convergent to a real number $x$ by the previous theorem. Hence $x_1, x_2, \dots$ is convergent to $-x$.

Fig. 19: $(n, x_n)$ from Example 2.3.21 for $n = 1$ to $n = 50$.

Example 2.3.21. Show that the sequence $x_1, x_2, \dots$ defined by $x_1 := 1/2$ and

\[
x_n := \frac{1 \cdot 3 \cdots (2n-1)}{2 \cdot 4 \cdots (2n)}
\]

for all $n \in \mathbb{N}^* \setminus \{1\}$ is convergent. Solution: The sequence $x_1, x_2, \dots$ is bounded from below by $0$. In addition,

\[
x_{n+1} = \frac{2n+1}{2(n+1)}\, x_n \leq x_n
\]

for all $n \in \mathbb{N}^*$ and hence $x_1, x_2, \dots$ is decreasing. Hence $x_1, x_2, \dots$ is convergent according to Corollary 2.3.20. See Fig. 19.
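The two hypotheses of Corollary 2.3.20 can be checked directly on the first terms of this sequence. The following is a small sketch for illustration only; the function name `odd_over_even_terms` is an assumption of the sketch.

```python
from fractions import Fraction

def odd_over_even_terms(count):
    """Return [x_1, ..., x_count] with x_n = (1*3*...*(2n-1)) / (2*4*...*(2n))."""
    terms = []
    x = Fraction(1)
    for n in range(1, count + 1):
        x *= Fraction(2 * n - 1, 2 * n)  # x_n = x_{n-1} * (2n - 1) / (2n)
        terms.append(x)
    return terms

terms = odd_over_even_terms(50)
print([float(x) for x in terms[:5]])
# The computed terms are positive and decreasing, as used in the solution.
print(all(0 < terms[k + 1] <= terms[k] for k in range(len(terms) - 1)))
```

The corollary guarantees only the existence of the limit; it does not by itself give the limit's value.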


Definition 2.3.22. Let $S$ be a non-empty subset of $\mathbb{R}$. We say that $S$ is bounded from above (bounded from below) if there is $M \in \mathbb{R}$ such that $x \leq M$ ($x \geq M$) for all $x \in S$.

The following theorem can be considered as a variation of Theorem 2.3.19 which is also in frequent use. Its power will be demonstrated in the subsequent example.

Theorem 2.3.23. Let $S$ be a non-empty subset of $\mathbb{R}$ which is bounded from above (bounded from below). Then there is a least upper bound (largest lower bound) of $S$, which will be called the supremum of $S$ (infimum of $S$) and denoted by $\sup S$ ($\inf S$).

Proof. First, we consider the case that $S$ is bounded from above. For this, we define the subsets $A, B$ of $\mathbb{R}$ consisting of all real numbers that are not upper bounds of $S$ and of all upper bounds of $S$, respectively,

\[
A := \{a \in \mathbb{R} : \text{there is } x \in S \text{ such that } x > a\} ,
\]

\[
B := \{b \in \mathbb{R} : x \leq b \text{ for all } x \in S\} .
\]

Since $S$ is non-empty and bounded from above, these sets are non-empty. In addition, for every $a \in A$ and every $b \in B$, it follows that $a < b$. Let $a_1 \in A$ and $b_1 \in B$. Recursively, we construct an increasing sequence $a_1, a_2, \dots$ in $A$ and a decreasing sequence $b_1, b_2, \dots$ in $B$ by

\[
a_{n+1} :=
\begin{cases}
(a_n + b_n)/2 & \text{if } (a_n + b_n)/2 \in A \\
a_n & \text{if } (a_n + b_n)/2 \in B ,
\end{cases}
\qquad
b_{n+1} :=
\begin{cases}
b_n & \text{if } (a_n + b_n)/2 \in A \\
(a_n + b_n)/2 & \text{if } (a_n + b_n)/2 \in B
\end{cases}
\]

for every $n \in \mathbb{N}^*$. According to Theorem 2.3.19 and Corollary 2.3.20, both sequences are convergent to real numbers $a$ and $b$, respectively. Since

\[
b_n - a_n = \frac{b_1 - a_1}{2^{n-1}}
\]

for all $n \in \mathbb{N}^*$, it follows that $a = b$. In the following, we show that $b = \sup S$. For every $x \in S$, it follows that $x \leq b_n$ for all $n \in \mathbb{N}^*$ and hence that $x \leq b$. Hence $b$ is an upper bound of $S$. Let $\bar{b}$ be a real number such that $\bar{b} < b$. Then there is $n \in \mathbb{N}^*$ such that $\bar{b} < a_n$. Since $a_n$ is no upper bound for $S$, the same is also true for $\bar{b}$. Therefore, $b$ is the smallest upper bound of $S$, i.e., $b = \sup S$. Finally, we consider the case that $S$ is bounded from below. Then $-S := \{-x : x \in S\}$ is bounded from above. Obviously, a real number $a$ is a lower bound of $S$ if and only if $-a$ is an upper bound of $-S$. Hence $-\sup(-S)$ is the largest lower bound of $S$, i.e., $\inf S$ exists and equals $-\sup(-S)$.

Example 2.3.24. Prove that there is a real number $x$ such that $x^2 = 2$. Solution: For this, we define

\[
S := \{y \in \mathbb{R} : 0 \leq y^2 \leq 2\} .
\]

Since $0 \in S$, $S$ is non-empty. Further, $S$ does not contain real numbers $y \geq 2$, since the last inequality implies that

\[
y^2 - 2 = (y - 2)(y + 2) + 2 \geq 2 .
\]

Hence $S$ is bounded from above. We define $x := \sup S$. In the following, we prove that $x^2 = 2$ by excluding that $x^2 < 2$ and that $x^2 > 2$. First, we assume that $x^2 < 2$. Then it follows for $n \in \mathbb{N}^*$ that

\[
\left(x + \frac{1}{n}\right)^2 - 2 = x^2 - 2 + \frac{2x}{n} + \frac{1}{n^2} \leq x^2 - 2 + \frac{2x}{n} + \frac{1}{n} = x^2 - 2 + \frac{2x + 1}{n} .
\]

Hence if $n \geq (2x + 1)/(2 - x^2)$, it follows that

\[
\left(x + \frac{1}{n}\right)^2 \leq 2
\]

and therefore that $x + (1/n) \in S$. As a consequence, $x$ is no upper bound for $S$. Second, we assume that $x^2 > 2$. Then it follows for $\varepsilon > 0$ that

\[
(x - \varepsilon)^2 - 2 = x^2 - 2 - 2\varepsilon x + \varepsilon^2 \geq x^2 - 2 - 2\varepsilon x .
\]

Hence if $\varepsilon < (x^2 - 2)/(2x)$, it follows that

\[
(x - \varepsilon)^2 > 2 .
\]

As a consequence, $x$ is not the smallest upper bound for $S$. Finally, it follows that $x^2 = 2$. Note that according to Example 2.2.15, $x$ is no rational number.
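The bisection construction from the proof of Theorem 2.3.23 can be carried out for the set $S$ of this example. The sketch below is an illustration under stated assumptions: the names `bracket_supremum` and `is_upper_bound_of_S` are hypothetical, the starting values $a_1 = 0$ and $b_1 = 2$ are chosen here, and the test for being an upper bound ($b \geq 0$ and $b^2 \geq 2$) is specific to this particular $S$.

```python
from fractions import Fraction

def bracket_supremum(is_upper_bound, a1, b1, steps):
    """Bisect [a1, b1], keeping a_n a non-upper-bound and b_n an upper bound of S."""
    a, b = Fraction(a1), Fraction(b1)
    for _ in range(steps):
        mid = (a + b) / 2
        if is_upper_bound(mid):
            b = mid  # the midpoint is an upper bound of S
        else:
            a = mid  # the midpoint is not an upper bound of S
    return a, b

def is_upper_bound_of_S(b):
    # For S = {y : y^2 <= 2}: b is an upper bound exactly if b >= 0 and b^2 >= 2.
    return b >= 0 and b * b >= 2

a, b = bracket_supremum(is_upper_bound_of_S, a1=0, b1=2, steps=40)
print(float(a), float(b))  # both values approximate sup S
```

After $k$ steps the two brackets differ by $(b_1 - a_1)/2^k$, so both approach the supremum, whose square is $2$ by the example.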

Below, we define the exponential function as a limit of sequences. This function is of fundamental importance for applications. It appears in a natural way in the description of physical systems throughout the whole of physics. One prominent example is the description of radioactive decay. Its discovery is often attributed to Jacob Bernoulli, who became familiar with calculus through a correspondence with Leibniz, resulting from his study of the problem of continuous compound interest. For motivation, we briefly sketch the problem in the following.

For this, we assume that a bank account contains $a > 0$ dollars and pays $100x$ percent interest per year, where $x$ is some real number. Of course, in practice $x \geq 0$. If the interest is paid once at the end of the year, the account contains

\[
a_1 := a + x\,a = a\,(1 + x)
\]

dollars at the end of the year. If the interest is paid semiannually, after half a year the account contains

\[
a + \frac{x}{2}\,a = a\left(1 + \frac{x}{2}\right)
\]

dollars and after one year

\[
a_2 := a\left(1 + \frac{x}{2}\right) + \frac{x}{2}\,a\left(1 + \frac{x}{2}\right) = a\left(1 + \frac{x}{2}\right)^2 \geq a_1
\]

dollars. Analogously, if the interest is paid $n$ times per year, where $n \in \mathbb{N}^*$, the account contains

\[
a_n := a\left(1 + \frac{x}{n}\right)^n
\]

dollars after one year. Bernoulli investigated the question whether this amount would grow indefinitely with the increase of $n$ or whether it would stay bounded. Indeed, as we shall see below, the sequence $a_1, a_2, \dots$ is converging to a real number which is denoted by $a e^x$ or $a \exp(x)$. For simplicity, below we restrict $n$ to powers of $2$. This is an approach of Otto Dunkel, 1917 [33], which avoids the use of Bernoulli’s inequality. This restriction can be removed later, for instance, with the help of L’Hospital’s theorem, Theorem 2.5.38.

Fig. 20: $(n, x_n)$, $(n, y_n)$ from Lemma 2.3.25 and $(n, e)$ for $n = 1$ to $n = 15$.
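Bernoulli's question can be explored numerically. The following sketch tabulates the balance for $n = 1, 2, 4, \dots, 1024$ payments per year; the values $a = 1000$ and $x = 0.05$ and the function name `balance_after_one_year` are assumptions made only for this illustration, and floating-point arithmetic is used.

```python
import math

def balance_after_one_year(a, x, n):
    """Balance when 100*x percent yearly interest is paid in n equal installments."""
    return a * (1 + x / n) ** n

a, x = 1000.0, 0.05  # illustrative values, not taken from the text
balances = [balance_after_one_year(a, x, 2 ** k) for k in range(11)]
for k, value in enumerate(balances):
    print(2 ** k, value)
print("a * exp(x) =", a * math.exp(x))
```

In this table the balances increase with $n$ but stay below $a \exp(x)$, which is the behavior established in general below.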

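Lemma 2.3.25. Let $x \in \mathbb{R}$. Define

\[
x_n := \left(1 + \frac{x}{2^n}\right)^{(2^n)} , \qquad y_n := \left(1 - \frac{x}{2^n}\right)^{-(2^n)}
\]

for all $n \in \mathbb{Z}$. Then for all $n \in \mathbb{N}^*$ such that $2^{(n-1)} > |x|$:

\[
0 < x_{n-1} \leq x_n \leq y_n \leq y_{n-1} \tag{2.3.8}
\]

and

\[
\left| \frac{x_n}{y_n} - 1 \right| \leq \frac{x^2}{2^n} . \tag{2.3.9}
\]

Proof. For this let $n \in \mathbb{N}^*$ be such that $m := 2^{(n-1)} > |x|$. Then

\[
\left(1 + \frac{x}{2m}\right)^2 = 1 + \frac{x}{m} + \frac{x^2}{4m^2} \geq 1 + \frac{x}{m} > 0 ,
\]

\[
\left(1 - \frac{x}{2m}\right)^2 = 1 - \frac{x}{m} + \frac{x^2}{4m^2} \geq 1 - \frac{x}{m} > 0
\]

and hence, by raising both inequalities to the $m$-th power, $0 < x_{n-1} \leq x_n$ and $0 < y_n \leq y_{n-1}$. Finally, it follows that

\begin{align*}
y_n - x_n &= \left(1 - \frac{x}{2m}\right)^{-2m} - \left(1 + \frac{x}{2m}\right)^{2m}
= \left(1 - \frac{x}{2m}\right)^{-2m} \left\{ 1 - \left[ 1 - \left(\frac{x}{2m}\right)^2 \right]^{2m} \right\} \\
&= \left(1 - \frac{x}{2m}\right)^{-2m} \cdot \left[ 1 - \left(1 - \frac{x^2}{4m^2}\right) \right] \cdot \left[ \left(1 - \frac{x^2}{4m^2}\right)^0 + \left(1 - \frac{x^2}{4m^2}\right)^1 + \dots + \left(1 - \frac{x^2}{4m^2}\right)^{2m-1} \right] .
\end{align*}

Since $0 < 1 - x^2/(4m^2) \leq 1$, each of the $2m$ summands in the last bracket lies in $(0, 1]$. Hence $x_n \leq y_n$ and

\[
1 - \frac{x_n}{y_n} \leq 2m \cdot \frac{x^2}{4m^2} = \frac{x^2}{2m} = \frac{x^2}{2^n} ,
\]

which is (2.3.9).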

Note that, for all sufficiently large $n$, the sequence $y_1, y_2, \dots$ in Lemma 2.3.25 is decreasing and bounded from below by $0$, and hence convergent according to Corollary 2.3.20. Hence we can define the following:

Definition 2.3.26. We define the exponential function $\exp : \mathbb{R} \to \mathbb{R}$ by

\[
\exp(x) := e^x := \lim_{n\to\infty} \left(1 - \frac{x}{2^n}\right)^{-(2^n)}
\]

for all $x \in \mathbb{R}$.

Then we conclude

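Theorem 2.3.27.

(i)
\[
e^x = \lim_{n\to\infty} \left(1 + \frac{x}{2^n}\right)^{(2^n)}
\]
and $e^x > 0$ for all $x \in \mathbb{R}$.

(ii)
\[
1 + x \leq \left(1 + \frac{x}{2^n}\right)^{(2^n)} \leq e^x \leq \left(1 - \frac{x}{2^n}\right)^{-(2^n)} \leq \frac{1}{1 - x} \tag{2.3.10}
\]
for all $x \in \mathbb{R}$ such that $|x| < 1$ and all $n \in \mathbb{N}$.

(iii)
\[
e^{x+y} = e^x e^y
\]
for all $x, y \in \mathbb{R}$.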

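Proof. From (2.3.9), it follows for every $x \in \mathbb{R}$:

\[
\lim_{n\to\infty} \frac{x_n}{y_n} = 1
\]

and hence by the limit laws, Theorem 2.3.4, that

\[
\lim_{n\to\infty} y_n \cdot \lim_{n\to\infty} \frac{x_n}{y_n} = \lim_{n\to\infty} x_n
\]

and by (2.3.8) and Theorem 2.3.12 that $e^x > 0$ for all $x \in \mathbb{R}$. Further, if $|x| < 1$, the estimates (2.3.10) follow from (2.3.8) and Theorem 2.3.12. Finally, if $y \in \mathbb{R}$ and $n \in \mathbb{N}$ is such that $m := 2^n > \max\{4|x|, 4|y|, 2|x||y|\}$, then

\[
\frac{\left(1 + \frac{x}{m}\right)^m \left(1 + \frac{y}{m}\right)^m}{\left(1 + \frac{x+y}{m}\right)^m} = \left(1 + \frac{h_m}{m}\right)^m
\]

where

\[
h_m := \frac{xy}{m + x + y}
\]

is such that $|h_m| < 1$. Hence by (2.3.10)

\[
1 + h_m \leq \left(1 + \frac{h_m}{m}\right)^m \leq \frac{1}{1 - h_m} ,
\]

and it follows by Theorem 2.3.4 and Theorem 2.3.12 that

\[
\frac{e^x e^y}{e^{x+y}} = \lim_{n\to\infty} \left(1 + \frac{h_m}{m}\right)^m = 1 .
\]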
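The chain of inequalities (2.3.8) and the bracket (2.3.10) can be checked for a sample argument. The sketch below uses exact rational arithmetic for the sequences and the standard library value `math.exp` for comparison; the choice $x = 1/2$ and the function names `lower_term` and `upper_term` are assumptions of this illustration.

```python
import math
from fractions import Fraction

def lower_term(x, n):
    """x_n = (1 + x / 2^n)^(2^n), computed exactly."""
    return (1 + Fraction(x) / 2 ** n) ** (2 ** n)

def upper_term(x, n):
    """y_n = (1 - x / 2^n)^(-(2^n)), computed exactly."""
    return (1 - Fraction(x) / 2 ** n) ** (-(2 ** n))

x = Fraction(1, 2)  # illustrative argument with |x| < 1
for n in range(1, 9):
    print(n, float(lower_term(x, n)), float(upper_term(x, n)))
print("exp(1/2) =", math.exp(0.5))
```

The lower terms increase and the upper terms decrease as $n$ grows, and the printed value of `math.exp(0.5)` lies between the last pair.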

Problems

1) Below are given the first 8 terms of a sequence $x_1, x_2, \dots$. For each find a representation $x_n = f(n)$, $n = 1, \dots, 8$, where $f$ is an appropriate function.

a) 2, 4, 6, 8, 10, 12, 14, 16,
b) 2, 4, 8, 16, 32, 64, 128, 256,
c) −1, 1, −1, 1, −1, 1, −1, 1,
d) 1, 3, 6, 10, 15, 21, 28, 36,
e) −1, 3/4, −5/7, 7/10, −9/13, 11/16, −13/19, 15/22,
f) 2, 0, 2, 0, 2, 0, 2, 0,
g) 5/7, 0, 7/9, 0, 9/11, 0, 11/13, 0,
h) 1, 1, 4/6, 8/24, 16/120, 32/720, 64/5040, 128/40320,
i) 0, 1, 0, −1, 0, 1, 0, −1,
j) 0, 1, 0, 1, 0, −1, 0, −1,
k) 0, 1, 0, 0, 0, −1, 0, 1, [0, 0, 0, −1,]
l) 0, 1, 0, 0, 0, 1, 0, 1, [0, 0, 0, 1].

2) Prove the convergence of the sequence and calculate its limit. For this use only the limit laws, the fact that a constant sequence converges to that respective constant and the fact that

\[
\lim_{n\to\infty} (1/n) = 0 .
\]

Give details.

a) $x_n := 1 + (1/n)$, $n \in \mathbb{N}^*$,
b) $x_n := 5 + (-2)\,(1/n) + 3\,(1/n)^2$, $n \in \mathbb{N}^*$,
c) $x_n := [1 + (-4)\,(1/n)] / [2 + 3\,(1/n)^2]$, $n \in \mathbb{N}^*$,
d) $x_n := 3/n^2$, $n \in \mathbb{N}^*$,
e) $x_n := (2n - 1)/(n + 3)$, $n \in \mathbb{N}^*$,
f) $x_n := (3n^2 - 6n - 10)/(7n^2 + 3n - 5)$, $n \in \mathbb{N}^*$,
g) $x_n := (3n^2 - 6n - 10)/(7n^3 + 3n - 5)$, $n \in \mathbb{N}^*$.

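3) Determine in each case whether the given sequence is convergent or divergent. Give reasons. If it is convergent, calculate the limit.

a) $x_n := \frac{n+1}{n}$,
b) $x_n := \frac{(-1)^n}{n}$,
c) $x_n := (-1)^n \left(1 - \frac{1}{n}\right)$,
d) $x_n := \frac{1 + (-1)^n}{n}$,
e) $x_n := \sin(n\pi)$,
f) $x_n := \sin\left(\frac{n\pi}{2}\right) + \cos(n\pi)$,
g) $x_n := \frac{n}{n^2+1}$,
h) $x_n := \frac{n^2}{n^2+1}$,
i) $x_n := \frac{n^3}{n^2+1}$,
j) $x_n := \frac{n^2 - n}{n^3+1}$

for every $n \in \mathbb{N}^*$.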

4) The table displays pairs $(n, s_n)$, $n = 1, \dots, 10$, where $s_n$ is the measured height in meters of a free falling body over the ground after $n/10$ seconds and at rest at initial height 4 m.

(1, 3.951) (2, 3.804) (3, 3.559) (4, 3.216)
(5, 2.775) (6, 2.236) (7, 1.599) (8, 0.864)

Draw these points into an xy-diagram where the values of $n$ appear on the x-axis and the values of $s_n$ on the y-axis. Find a representation $s_n = f(n/10)$, $n = 1, \dots, 10$, where $f$ is an appropriate function, and predict the time when the body hits the ground.

5) The table displays pairs $(2n/10, L_n)$, $n = 1, \dots, 8$, where $2n/10$ is the pressure in atmospheres (atm) of an ideal gas (at a constant temperature of 20 degrees Celsius) confined to a volume which is proportional to the length $L_n$. The latter is measured in millimeters (mm).

(0.2, 672) (0.4, 336) (0.6, 224) (0.8, 168)
(1.0, 134.4) (1.2, 112) (1.4, 96) (1.6, 84)

Draw these points into an xy-diagram where the values of $n$ appear on the x-axis and the values of $L_n$ on the y-axis. Find a representation $L_n = f(2n/10)$, $n = 1, \dots, 8$, where $f$ is an appropriate function, and predict $L_{10}$.

6) Like Archimedes, derive the recursion relation (2.3.1) by elementary geometric reasoning without the use of trigonometric functions.

7) Reconsider Archimedes’ measurement of the circle and calculate the recursion relation for the sequence of circumferences $s_6, s_{12}, s_{24}, \dots$ that corresponds to (2.3.1). In addition, prove that this sequence is increasing as well as bounded from above and hence convergent.

2.3.2 Continuous Functions

This section starts the investigation of properties of functions defined on subsets of the real numbers.

Alongside the notion of a function, the notion of the continuity of a function underwent considerable changes until it reached its current meaning. In his textbook ‘Introductio ad analysin infinitorum’ from 1748 [38], Leonhard Euler defines a function as an equation or analytic expression composed of variables and numbers. Admissible analytic expressions were those that involved the four algebraic operations, roots, exponentials, logarithms, trigonometric functions, derivatives and integrals. This common property of functions was also called ‘continuity in form’. The study of the solutions of the wave equation in one space dimension (‘the Vibrating-String Problem’) made necessary the consideration of compounds of such functions. Such were called ‘discontinuous’ functions by Euler. This included functions (in the sense of curves) that are traced by the free motion of the hand and therefore not subject to any law of continuity in form. Unlike modern definitions of continuity of a function, continuity in the sense of Euler included the differentiability of the function in the modern sense. The last concept will be defined in Section 2.4. Hence the term continuous was used to indicate a kind of regularity of the function. The same is true today.

The modern definition of continuity goes back to a publication of Bernhard Bolzano from 1817 [12]. The literal translation of the (German) title is

‘Purely analytical proof of the theorem, that between each two values which guarantee an opposing result, at least one real root of the equation lies.’

The phrase ‘opposing result’ means an opposite sign, and the theorem in question is the intermediate value theorem, see Theorem 2.3.37 below. In this paper, he criticizes that the known proofs of that theorem still make reference to geometric intuition, although such arguments were already considered inadequate in pure mathematics at the time. He argues that the concept of continuity should be understood in the following sense. A function $f(x)$ varies according to the law of continuity for all values of $x$ which lie inside or outside certain limits if for every such $x$ the value of the difference

\[
f(x + \omega) - f(x)
\]

can be made smaller than any given quantity if $\omega$ can be assumed as small as one wishes. Essentially the same formulation can also be found in Cauchy’s textbook ‘Cours d’analyse’ from 1821 [22]. This formulation practically coincides with a modern definition.

It is important to note that, at first sight and unlike Bolzano’s, Cauchy’s definition makes reference to infinitesimal quantities. The use of such quantities, which have their roots in ancient Greek philosophy, was quite common at that time. Among others, Johannes Kepler, Newton, Leibniz, Jacob Bernoulli, Euler and Cauchy, prior to the writing of his ‘Cours d’analyse’, made use of them. Jean le Rond d’Alembert, Joseph Louis Lagrange, Bolzano and others distrusted that concept and tried to avoid it. On the other hand, Cauchy replaces the concept of fixed infinitesimally small quantities by a definition of infinitesimals in terms of an essentially modern concept of limits. In this way, he ‘reconciles rigor with infinitesimals’ and became an important and influential promoter of rigor in calculus / analysis.

In modern calculus / analysis, infinitesimals are not part of the real number system. Following Cauchy, their role has been replaced by the rigorous concept of limits.

The assumption of continuity of the involved function is sufficient to prove the intermediate value theorem, although neither Bolzano nor Cauchy could give a completely satisfactory proof according to modern standards because a rigorous foundation of the real number system was still missing. An additional important property of continuous functions, defined on closed intervals of $\mathbb{R}$, is that they assume a maximum and also a minimum value. See Theorem 2.3.33 below.

Below, we define the continuity of a function as the property to ‘preserve limits’. This form of the definition goes back to Heinrich Eduard Heine and is called ‘sequential continuity’ in more general situations (than functions defined on subsets of the real numbers).

Definition 2.3.28. (Continuity) Let $f : D \to \mathbb{R}$ be a function and $x \in D$. Then we say $f$ is continuous in $x$ if for every sequence $x_1, x_2, \dots$ of elements in $D$ from

\[
\lim_{\nu\to\infty} x_\nu = x
\]

it follows that

\[
\lim_{\nu\to\infty} f(x_\nu) = f\left( \lim_{\nu\to\infty} x_\nu \right) \quad [\, = f(x) \,] .
\]

If $f$ is not continuous in $x$, we say $f$ is discontinuous in $x$. Also we say $f$ is continuous if $f$ is continuous in all points of its domain $D$.

Example 2.3.29. (Basic examples for continuous functions.) Let $a, b$ be real numbers and $f : \mathbb{R} \to \mathbb{R}$ be defined by

\[
f(x) := ax + b
\]

for all $x \in \mathbb{R}$. Then $f$ is continuous.

Proof. Let $x$ be some real number and $x_1, x_2, \dots$ be a sequence of real numbers converging to $x$. Then for any given $\varepsilon > 0$, there is $n_0 \in \mathbb{N}$ such that for $n \in \mathbb{N}$ with $n \geq n_0$:

\[
|a| \cdot |x_n - x| < \varepsilon
\]

and hence also that

\[
|f(x_n) - f(x)| = |a x_n + b - (a x + b)| = |a x_n - a x| = |a| \cdot |x_n - x| < \varepsilon
\]

and

\[
\lim_{n\to\infty} f(x_n) = f(x) .
\]

An example for a function which is discontinuous in one point.

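Example 2.3.30. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by

\[
f(x) := \frac{x}{|x|}
\]

for $x \neq 0$ and $f(0) := 1$. Then

\[
\lim_{n\to\infty} \frac{1}{n} = 0 \quad \text{and} \quad \lim_{n\to\infty} \left( -\frac{1}{n} \right) = 0 ,
\]

but

\[
\lim_{n\to\infty} f\left( \frac{1}{n} \right) = 1 \quad \text{and} \quad \lim_{n\to\infty} f\left( -\frac{1}{n} \right) = -1 .
\]

Hence $f$ is discontinuous at the point $0$. See Fig. 21. Such a discontinuity is called a ‘jump discontinuity’.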

The following gives an example of a function that is discontinuous in every point of its domain and is known as Dirichlet’s function. It was given in Dirichlet’s 1829 paper [30], which gave a precise meaning to Fourier’s work from 1822 [41] on heat conduction. As described in the beginning of Section 2.2.3, that paper also gave the first modern definition of functions. His example clearly demonstrates that he moved considerably past his time with his concept of functions, since such a type of function had not been considered before.

Example 2.3.31. (Dirichlet’s function, a function which is nowhere continuous) Define $f : \mathbb{R} \to \mathbb{R}$ by

\[
f(x) :=
\begin{cases}
1 & \text{if } x \text{ is rational} \\
0 & \text{if } x \text{ is irrational}
\end{cases}
\]

for every $x \in \mathbb{R}$. For the proof that $f$ is everywhere discontinuous, let $x \in \mathbb{R}$. Then $x$ is either rational or irrational. If $x$ is rational, then $x_n := x + \sqrt{2}/n$ for every $n \in \mathbb{N}^*$ is irrational. (Otherwise, $\sqrt{2} = n (x_n - x)$ is a rational number.) Hence

\[
\lim_{n\to\infty} f(x_n) = 0 \neq 1 = f(x) ,
\]

and $f$ is discontinuous in $x$. If $x$ is irrational, by construction of the real number system, see Theorem 5.1.11 (i) in the Appendix, there is a sequence of rational numbers $x_1, x_2, \dots$ that is convergent to $x$. Hence

\[
\lim_{n\to\infty} f(x_n) = 1 \neq 0 = f(x) ,
\]

and $f$ is discontinuous in $x$ also in this case.

In the following, we define ‘continuous’ limits of the form

\[
\lim_{x\to a} f(x)
\]

where $f$ is some function and $a$ some real number or $\infty$, $-\infty$. In classical (= ‘pre-modern’) understanding, the symbol was understood as the variable $x$ approaching $a$ in a ‘continuous’ way, an understanding that was heavily dependent on geometric intuition. Nowadays, there are good reasons to distrust such an intuition, resulting from Cantor’s classification of infinite sets. That classification separates infinite sets into those that are countable and those that are not. The last are called ‘uncountable’. A countable set is a set which is the image of an injective map with domain $\mathbb{N}$. It can be shown that the sets $\mathbb{Z}$ and $\mathbb{Q}$ are countable, but that $\mathbb{R}$ and also any interval of $\mathbb{R}$ containing more than one point is uncountable. Therefore, the geometric intuition of the variable $x$ approaching $a$ in a continuous way would involve the visualization of an uncountable set, which can be considered humanly impossible. For this reason, it can very well be said that a large part of classical calculus / analysis used arguments that were based on illusions, even if one excludes its frequent use of infinitesimal quantities from the consideration.

The following definition introduces notation which is in frequent use in other textbooks of calculus / analysis. We will use it only occasionally.

Definition 2.3.32. (Continuous limits) Let $f$ be a function defined on a subset of $\mathbb{R}$, $a \in \mathbb{R} \cup \{-\infty\} \cup \{\infty\}$ and $b \in \mathbb{R}$.

(i) We say that a sequence $x_1, x_2, \dots$ of real numbers converges to $\infty$ or $-\infty$ if for every $n \in \mathbb{N}$ there are only finitely many members that are $\leq n$ or $\geq -n$, respectively.

(ii) If there is a sequence $x_1, x_2, x_3, \dots$ in the domain of $f$ that converges to $a$, we define

\[
\lim_{x\to a} f(x) = b ,
\]

if for every such sequence it follows that

\[
\lim_{n\to\infty} f(x_n) = b .
\]

An important property of continuous functions, defined on closed intervals of $\mathbb{R}$, is that they assume a maximum value and a minimum value. The corresponding theorem is a direct consequence of the Bolzano-Weierstrass theorem, Theorem 2.3.18.

Theorem 2.3.33. (Existence of maxima and minima of continuous functions on compact intervals) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Then there is $x_0 \in [a, b]$ such that

\[
f(x_0) \geq f(x) \quad (\, f(x_0) \leq f(x) \,)
\]

for all $x \in [a, b]$.

Fig. 21: Graph of $f$ from Example 2.3.30.

Proof. For this, in a first step, we show that $f$ is bounded and hence that $\sup f([a, b])$ exists. In the final step, we show that there is $c \in [a, b]$ such that $f(c) = \sup f([a, b])$. For this, we use the Bolzano-Weierstrass theorem. The proof that $f$ is bounded is indirect. Assume on the contrary that $f$ is unbounded. Then there is a sequence $x_1, x_2, \dots$ in $[a, b]$ such that

\[
f(x_n) > n \tag{2.3.11}
\]

for all $n \in \mathbb{N}^*$. Hence according to Theorem 2.3.18, there is a subsequence $x_{k_1}, x_{k_2}, \dots$ of $x_1, x_2, \dots$ converging to some element $c \in [a, b]$. Note that the corresponding sequence $f(x_{k_1}), f(x_{k_2}), \dots$ is not converging as a consequence of (2.3.11). But, since $f$ is continuous, it follows that

\[
f(c) = \lim_{j\to\infty} f(x_{k_j}) ,
\]

which is a contradiction. Hence $f$ is bounded. Therefore let $M := \sup f([a, b])$. Then for every $n \in \mathbb{N}^*$ there is a corresponding $c_n \in [a, b]$ such that

\[
|f(c_n) - M| < \frac{1}{n} . \tag{2.3.12}
\]

Again, according to Theorem 2.3.18, there is a subsequence $c_{k_1}, c_{k_2}, \dots$ of $c_1, c_2, \dots$ converging to some element $c \in [a, b]$. Also, as a consequence of (2.3.12), the corresponding sequence $f(c_{k_1}), f(c_{k_2}), \dots$ is converging to $M$ and, by continuity of $f$, to $f(c)$. Hence $f(c) = M$ and by the definition of $M$:

\[
f(c) = M \geq f(x)
\]

for all $x \in [a, b]$. By applying the previous reasoning to the continuous function $-f$, it follows the existence of a $c' \in [a, b]$ such that

\[
-f(c') \geq -f(x)
\]

and hence also

\[
f(c') \leq f(x)
\]

for all $x \in [a, b]$.

As a by-product of the proof of the previous theorem, we proved that every continuous function defined on a bounded closed interval of $\mathbb{R}$ is bounded in the following sense.

Definition 2.3.34. (Boundedness of functions) We call a function $f$ bounded if there is $M > 0$ such that

\[
|f(x)| \leq M
\]

for all $x$ from its domain.

An example for an unbounded function defined on a bounded closed interval of $\mathbb{R}$ is given by the function $f$ from Example 2.3.36 below.

Corollary 2.3.35. Every continuous function defined on a bounded closed interval of $\mathbb{R}$ is bounded.

A simple example of a function which is discontinuous in one point and does not assume a maximal value is:

Example 2.3.36. Define $f : [0, 1] \to \mathbb{R}$ by

\[
f(x) :=
\begin{cases}
1 + x^2 & \text{if } 0 \leq x < 1/2 \\
(x - 1)^2 & \text{if } 1/2 \leq x \leq 1 .
\end{cases}
\]

See Fig. 22.

Another important property of continuous functions, defined on closed intervals of $\mathbb{R}$, is that they assume all values between those at the interval ends.

Theorem 2.3.37. (Intermediate value theorem) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Further, let $f(a) < f(b)$ and $\gamma \in (f(a), f(b))$. Then there is $x \in (a, b)$ such that

\[
f(x) = \gamma .
\]

Proof. Define

\[
S := \{x \in [a, b] : f(x) \leq \gamma\} .
\]

Then $S$ is non-empty, since $a \in S$, and bounded from above by $b$. Hence $c := \sup S$ exists and is contained in $[a, b]$. Further, there is a sequence $x_1, x_2, \dots$ in $S$ such that

\[
|x_n - c| \leq \frac{1}{n} \tag{2.3.13}
\]

for all $n \in \mathbb{N}^*$. Hence $x_1, x_2, \dots$ is converging to $c$, and it follows by the continuity of $f$ that

\[
\lim_{n\to\infty} f(x_n) = f(c) .
\]

Moreover, since $f(x_n) \leq \gamma$ for all $n \in \mathbb{N}^*$, it follows that $f(c) \leq \gamma$. As a consequence, $c \neq b$. Now for every $x \in (c, b]$, it follows that $f(x) > \gamma$, because otherwise $c$ is not an upper bound of $S$. Further, there exists a sequence $y_1, y_2, \dots$ in $(c, b]$ which is converging to $c$, and because of the continuity of $f$

\[
\lim_{n\to\infty} f(y_n) = f(c)
\]

and hence $f(c) \geq \gamma$. Finally, it follows that $f(c) = \gamma$ and therefore also that $c \neq a$ and $c \neq b$.

The following corollary displays a main application of the intermediate value theorem: If $f$ is a continuous function defined on a closed interval of $\mathbb{R}$ whose values at the interval ends have a different relative sign, i.e., one of those is $< 0$ and the other one is $> 0$, then there is $x$ in the domain of $f$ such that

\[
f(x) = 0 .
\]

Corollary 2.3.38. Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Moreover, let $f(a) < 0$ and $f(b) > 0$. Then there is $x \in (a, b)$ such that $f(x) = 0$.

Example 2.3.39. Define $f : \mathbb{R} \to \mathbb{R}$ by

\[
f(x) := x^3 + x + 1
\]

for all $x \in \mathbb{R}$. Then by Theorems 2.3.46, 2.3.48 below, $f$ is continuous. Also, it follows that

\[
f(-1) = -1 < 0 \quad \text{and} \quad f(0) = 1 > 0
\]

and hence by Corollary 2.3.38 that $f$ has a zero in $(-1, 0)$. See Fig. 23.

Remark 2.3.40. Note in the previous example that the value ($0.375$) of $f$ in the midpoint $-0.5$ of $[-1, 0]$ is $> 0$. Hence it follows by Corollary 2.3.38 that there is a zero in the interval $[-1, -0.5]$. The iteration of this process is called the ‘bisection method’. It is used to approximate zeros of continuous functions.
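The bisection method described in this remark can be sketched in a few lines. The following is a minimal floating-point illustration, not a robust root finder; the function name `bisect_zero` and the stopping tolerance are assumptions of the sketch, and the sign condition $f(a) < 0 < f(b)$ is the one from Corollary 2.3.38.

```python
def bisect_zero(f, a, b, tolerance=1e-10):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tolerance:
        mid = (a + b) / 2
        value = f(mid)
        if value == 0:
            return mid
        if value > 0:
            b = mid  # a sign change remains in [a, mid]
        else:
            a = mid  # a sign change remains in [mid, b]
    return (a + b) / 2

def f(x):
    return x ** 3 + x + 1

root = bisect_zero(f, -1.0, 0.0)
print(root, f(root))
```

Each step halves the interval that is known to contain a zero, so the number of steps needed grows only logarithmically with the required accuracy.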

Polynomial functions of odd order, defined on the whole of $\mathbb{R}$, necessarily assume the value $0$, since they assume values of different relative sign for large negative and large positive arguments. That the same is not true in general for polynomial functions of even order can be seen from the fact that, for instance, the polynomial function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) := 1 + x^2$ for all $x \in \mathbb{R}$ does not assume the value zero.

Theorem 2.3.41. Let $n$ be a natural number and $a_0, a_1, \dots, a_{2n}$ be real numbers. Define the polynomial $p : \mathbb{R} \to \mathbb{R}$ by

\[
p(x) := a_0 + a_1 x + \dots + a_{2n} x^{2n} + x^{2n+1}
\]

for all $x \in \mathbb{R}$. Then there is some $x \in \mathbb{R}$ such that $p(x) = 0$.

Proof. Below in Example 2.3.49, it is proved that $p$ is continuous. Further, define

\[
x_0 := 1 + \max\{|a_0|, |a_1|, \dots, |a_{2n}|\} .
\]

Then

\begin{align*}
-\left( a_0 + a_1 x_0 + \dots + a_{2n} x_0^{2n} \right) &\leq |a_0| + |a_1| \cdot |x_0| + \dots + |a_{2n}| \cdot |x_0|^{2n} \\
&\leq (x_0 - 1) \cdot (1 + x_0 + \dots + x_0^{2n}) = x_0^{2n+1} - 1 < x_0^{2n+1}
\end{align*}

and hence $p(x_0) > 0$. Also

\begin{align*}
a_0 + a_1 (-x_0) + \dots + a_{2n} (-x_0)^{2n} &\leq |a_0| + |a_1| \cdot |x_0| + \dots + |a_{2n}| \cdot |x_0|^{2n} \\
&\leq (x_0 - 1) \cdot (1 + x_0 + \dots + x_0^{2n}) = x_0^{2n+1} - 1 < -(-x_0)^{2n+1}
\end{align*}

and hence $p(-x_0) < 0$. Hence according to Theorem 2.3.37, there is $x \in [-x_0, x_0]$ such that $p(x) = 0$.

The ‘converse’ of Theorem 2.3.37 is not true, i.e., a function that assumes all values between those at its interval ends is not necessarily continuous on that interval. This can be seen, for instance, from the following example.

Example 2.3.42. Define $f : [0, 2/\pi] \to \mathbb{R}$ by

\[
f(x) := \sin(1/x)
\]

for $0 < x \leq 2/\pi$ and $f(0) := 0$. Then $f$ is not continuous (in $0$), but assumes all values in the interval $[f(0), f(2/\pi)] = [0, 1]$. Note also that $f$ has an infinite number of zeros, located at $1/(n\pi)$ for $n \in \mathbb{N}^*$.

A useful property of continuous functions for theoretical investigations such as Theorem 2.3.44 below is that they map intervals of $\mathbb{R}$ that are contained in their domain onto intervals of $\mathbb{R}$.


Theorem 2.3.43. Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a$ and $b$ are real numbers such that $a < b$. Then the range of $f$ is given by

\[
f([a, b]) = [\alpha, \beta] \tag{2.3.14}
\]

for some $\alpha, \beta \in \mathbb{R}$ such that $\alpha \leq \beta$.

Proof. Denote by $\alpha, \beta$ the minimum value and the maximum value of $f$, respectively, which exist according to Theorem 2.3.33. Then for every $x \in [a, b]$

\[
\alpha \leq f(x) \leq \beta .
\]

Further, let $x_m, x_M \in [a, b]$ be such that $f(x_m) = \alpha$ and $f(x_M) = \beta$, respectively. Finally, denote by $I$ the interval $[x_m, x_M]$ if $x_m \leq x_M$ and $[x_M, x_m]$ if $x_M < x_m$. Then the restriction $f|_I$ of $f$ to $I$ is continuous and, according to Theorem 2.3.37 (applied to the function $-f|_I$ if $x_M < x_m$), every value of $[\alpha, \beta]$ is in its range.


Intuitively, for instance as a consequence of Theorem 2.2.44, it is to be expected that the inverse of an injective continuous function is itself continuous. Indeed, this is true.

Theorem 2.3.44. Let $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, be continuous and strictly increasing, i.e., for all $x_1, x_2 \in [a, b]$ such that $x_1 < x_2$ it follows that $f(x_1) < f(x_2)$. Then the inverse function $f^{-1}$ is continuous, too.

Proof. From the property that $f$ is strictly increasing, it follows that $f$ is also injective. Further, from Theorem 2.3.43 follows the existence of $\alpha, \beta \in \mathbb{R}$ such that the range of $f$ is given by $[\alpha, \beta]$ and hence that

\[ f^{-1} : [\alpha, \beta] \to [a, b] . \]

Now let $y$ be some element of $[\alpha, \beta]$ and $y_1, y_2, \dots$ be some sequence of elements of $[\alpha, \beta]$ that is converging to $y$, but such that $f^{-1}(y_1), f^{-1}(y_2), \dots$ is not converging to $f^{-1}(y)$. Then there is an $\varepsilon > 0$ along with a subsequence $y_{n_1}, y_{n_2}, \dots$ of $y_1, y_2, \dots$ such that

\[ \left| f^{-1}(y_{n_k}) - f^{-1}(y) \right| \ge \varepsilon \tag{2.3.15} \]

for all $k \in \mathbb{N}^*$. According to the Bolzano-Weierstrass Theorem 2.3.18, there is a subsequence $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ of $y_{n_1}, y_{n_2}, \dots$ such that

\[ \lim_{l \to \infty} f^{-1}(y_{n_{k_l}}) = x \tag{2.3.16} \]

for some $x \in [a, b]$. Hence it follows by the continuity of $f$ that

\[ \lim_{l \to \infty} y_{n_{k_l}} = f(x) \]

and $y = f(x)$, since $y_{n_{k_1}}, y_{n_{k_2}}, \dots$ is also convergent to $y$. But from (2.3.15) it follows by (2.3.16) that

\[ x \ne f^{-1}(y) \]

which, since $f$ is injective, leads to the contradiction that

\[ y \ne f(x) . \]

Hence such $y$ and sequence $y_1, y_2, \dots$ don’t exist and $f^{-1}$ is continuous.


In the case of sequences, the limit laws, see Theorem 2.3.4, stated that sums, products and quotients (if defined) of convergent sequences are convergent to the corresponding sum, product, quotient (if defined) of their limits. A typical application of these limit laws consisted in the decomposition of a given sequence into sums, products, quotients of sequences whose convergence is already known. Then the application of the limit laws proved the convergence of the sequence and allowed the calculation of its limit if the limits of those constituents are known. Theorems similar in structure to that of the limit laws for sequences hold for continuous functions and are given below. Sums, products, quotients (wherever defined) and compositions of continuous functions are continuous. Indeed, this is a simple consequence of the limit laws, Theorem 2.3.4, and the definition of continuity. According to Theorem 2.3.44 the same is true for the inverse of an injective continuous function. A typical application of the thus obtained theorems consists in the decomposition of a given function into sums, products, quotients, compositions and inverses of functions whose continuity is already known. Then the application of those theorems proves the continuity of that function. In this way, the proof of continuity of a given function is greatly simplified and, usually, obvious. Therefore, in such obvious cases in future, the continuity of the function will be just stated, but not explicitly proved.

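Definition 2.3.45. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$. Moreover, let $a \in \mathbb{R}$. Then we define $(f_1 + f_2) : D_1 \cap D_2 \to \mathbb{R}$ (read: ‘$f_1$ plus $f_2$’) and $a \cdot f_1 : D_1 \to \mathbb{R}$ (read: ‘$a$ times $f_1$’) by

\[ (f_1 + f_2)(x) := f_1(x) + f_2(x) \]

for all $x \in D_1 \cap D_2$ and

\[ (a \cdot f_1)(x) := a \cdot f_1(x) \]

for all $x \in D_1$.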

Theorem 2.3.46. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$. Moreover let $a \in \mathbb{R}$. Then it follows by Theorem 2.3.4 that

(i) if $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 + f_2$ is continuous in $x$, too,

(ii) if $f_1$ is continuous in $x \in D_1$, then $a \cdot f_1$ is continuous in $x$, too.

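Definition 2.3.47. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$. Then we define $f_1 \cdot f_2 : D_1 \cap D_2 \to \mathbb{R}$ (read: ‘$f_1$ times $f_2$’) by

\[ (f_1 \cdot f_2)(x) := f_1(x) \cdot f_2(x) \]

for all $x \in D_1 \cap D_2$. If moreover $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$, then we define $1/f_1 : D_1 \to \mathbb{R}$ (read: ‘1 over $f_1$’) by

\[ (1/f_1)(x) := 1/f_1(x) \]

for all $x \in D_1$.

Theorem 2.3.48. Let $f_1 : D_1 \to \mathbb{R}$, $f_2 : D_2 \to \mathbb{R}$ be functions such that $D_1 \cap D_2 \ne \emptyset$.

(i) If $f_1$ and $f_2$ are both continuous in $x \in D_1 \cap D_2$, then $f_1 \cdot f_2$ is continuous in $x$, too.

(ii) If $f_1$ is such that $\mathrm{Ran}(f_1) \subset \mathbb{R}^*$ as well as continuous in $x \in D_1$, then $1/f_1$ is continuous in $x$, too.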

Proof. For the proof of (i), let $x_1, x_2, \dots$ be some sequence in $D_1 \cap D_2$ which converges to $x$. Then for any $\nu \in \mathbb{N}^*$

\[ |(f_1 \cdot f_2)(x_\nu) - (f_1 \cdot f_2)(x)| = |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x)| \]
\[ = |f_1(x_\nu) f_2(x_\nu) - f_1(x) f_2(x_\nu) + f_1(x) f_2(x_\nu) - f_1(x) f_2(x)| \]
\[ \le |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu)| + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| \]
\[ \le |f_1(x_\nu) - f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| + |f_1(x_\nu) - f_1(x)| \cdot |f_2(x)| + |f_1(x)| \cdot |f_2(x_\nu) - f_2(x)| \]

and hence, obviously,

\[ \lim_{\nu \to \infty} (f_1 \cdot f_2)(x_\nu) = (f_1 \cdot f_2)(x) . \]

For the proof of (ii), let $x_1, x_2, \dots$ be some sequence in $D_1$ which converges to $x$. Then for any $\nu \in \mathbb{N}^*$

\[ |(1/f_1)(x_\nu) - (1/f_1)(x)| = |1/f_1(x_\nu) - 1/f_1(x)| = \frac{|f_1(x_\nu) - f_1(x)|}{|f_1(x_\nu)| \cdot |f_1(x)|} \]

and hence

\[ \lim_{\nu \to \infty} (1/f_1)(x_\nu) = (1/f_1)(x) . \]

In the following, we give two examples for the application of Theorem 2.3.46 and Theorem 2.3.48.

Example 2.3.49. Let $n \in \mathbb{N}$ and $a_0, a_1, \dots, a_n$ be real numbers. Then the corresponding polynomial of $n$-th order $p : \mathbb{R} \to \mathbb{R}$ defined by

\[ p(x) := a_0 + a_1 x + \dots + a_n x^n \]

for all $x \in \mathbb{R}$, is continuous.

Proof. The proof is a simple consequence of Example 2.3.29, Theorem 2.3.46 and Theorem 2.3.48.

Example 2.3.50. Explain why the function

\[ f(x) := \frac{x^3 + 2x^2 + x + 1}{x^2 - 3x + 2} \tag{2.3.17} \]

is continuous at every number in its domain. State that domain. Solution: The domain $D$ is given by those real numbers for which the denominator of the expression (2.3.17) is different from 0. Hence it is given by

\[ D = \mathbb{R} \setminus \{1, 2\} . \]

Further, as a consequence of Example 2.3.49, the polynomials $p_1 : \mathbb{R} \to \mathbb{R}$, $p_2 : D \to \mathbb{R}$ defined by

\[ p_1(x) := x^3 + 2x^2 + x + 1 , \quad p_2(x) := x^2 - 3x + 2 \]

for all $x \in \mathbb{R}$ and $x \in D$, respectively, are continuous. Since $p_2(D) \subset \mathbb{R}^*$, it follows by Theorem 2.3.48 that the function $1/p_2$ is continuous. Finally from this, it follows by Theorem 2.3.48 that $p_1/p_2$ is continuous.

Theorem 2.3.51. Let $f : D_f \to \mathbb{R}$, $g : D_g \to \mathbb{R}$ be functions and $D_g$ be a subset of $\mathbb{R}$. Moreover let $x \in D_f$, $f(x) \in D_g$, $f$ be continuous in $x$ and $g$ be continuous in $f(x)$. Then $g \circ f$ is continuous in $x$.

Proof. For this, let $x_1, x_2, \dots$ be a sequence in $D(g \circ f)$ converging to $x$. Then $f(x_1), f(x_2), \dots$ is a sequence in $D_g$. Moreover since $f$ is continuous in $x$, it follows that

\[ \lim_{\nu \to \infty} f(x_\nu) = f(x) . \]

Finally, since $g$ is continuous in $f(x)$ it follows that

\[ \lim_{\nu \to \infty} (g \circ f)(x_\nu) = \lim_{\nu \to \infty} g(f(x_\nu)) = g(f(x)) = (g \circ f)(x) . \]

Example 2.3.52. Show that $f : \mathbb{R} \to \mathbb{R}$ defined by

\[ f(x) := |x| \]

for all $x \in \mathbb{R}$, is continuous. Solution: Define the polynomial $p_2 : \mathbb{R} \to \mathbb{R}$ by $p_2(x) := x^2$ for every $x \in \mathbb{R}$. According to Example 2.3.49, $p_2$ is continuous. Then $f = s_2 \circ p_2$, where $s_2$ denotes the square-root function on $[0, \infty)$, which, by Theorem 2.3.44, is continuous as inverse of the strictly increasing restriction of $p_2$ to $[0, \infty)$. Hence $f$ is continuous by Theorem 2.3.51.

Example 2.3.53. The functions $\sin : \mathbb{R} \to \mathbb{R}$ and $\exp : \mathbb{R} \to \mathbb{R}$ are continuous. Show that $\arcsin : [-1, 1] \to [-\pi/2, \pi/2]$, $\cos : \mathbb{R} \to \mathbb{R}$, $\arccos : [-1, 1] \to [0, \pi]$, $\tan : (-\pi/2, \pi/2) \to \mathbb{R}$, $\arctan : \mathbb{R} \to (-\pi/2, \pi/2)$ and the natural logarithm function $\ln : (0, \infty) \to \mathbb{R}$ are continuous. Solution: Since the restriction of $\sin$ to $[-\pi/2, \pi/2]$ and $\exp$ are in particular increasing, their inverses $\arcsin$ and $\ln$ are continuous according to Theorem 2.3.44. Further, since

\[ \cos(x) = \sin\left( x + \frac{\pi}{2} \right) \]

for all $x \in \mathbb{R}$, the cosine function is continuous as composition of continuous functions according to Theorem 2.3.51. Further, the restriction of $\cos$ to $[0, \pi]$ is in particular strictly decreasing, and hence its inverse $\arccos$ is continuous according to Theorem 2.3.44 (applied to $-\cos$). Also, $\tan : \mathbb{R} \setminus \{k \cdot \pi + (\pi/2) : k \in \mathbb{Z}\} \to \mathbb{R}$ defined by

\[ \tan(x) := \frac{\sin(x)}{\cos(x)} \]

for every $x \in \mathbb{R} \setminus \{k \cdot \pi + (\pi/2) : k \in \mathbb{Z}\}$ is continuous according to Theorem 2.3.48 as quotient of continuous functions. Finally, the restriction of $\tan$ to $(-\pi/2, \pi/2)$ is in particular increasing and hence its inverse $\arctan$ is continuous according to Theorem 2.3.44.

Fig. 25: Graph of sin, arcsin and asymptotes.

Fig. 26: Graph of cos, arccos and asymptotes.

Fig. 27: Graph of tan, arctan and asymptotes.

Fig. 28: Graph of exp, ln.

Fig. 29: Sketch for Example 2.3.54. The dots in the corners $B$ and $F$ indicate right angles.

It is not uncommon that, in a first step, in the definition of a continuous function $f$ certain real numbers have to be excluded from the domain since the expression used for the definition is not defined in those points. Such points are called singularities of $f$, although not part of the domain of $f$. Most frequent is the case that the definition in a point would involve division by 0. Since this division is not defined, that point has to be excluded from the domain of $f$. In particular in applications, singularities of functions are points of interest. For instance, in physics they often signal the breakdown of theories at such locations. In case that there is a continuous function $\bar f$ whose restriction to the domain of $f$ coincides with $f$ and whose domain, in addition, contains a singularity of $f$, then that singularity is called removable and $\bar f$ a continuous extension of $f$. If $x_s \in \mathbb{R}$ is a singularity of $f$ and if there is a sequence $x_1, x_2, \dots$ in the domain of $f$ that is convergent to $x_s$, then it follows by the assumed continuity of $\bar f$ that

\[ \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} \bar f(x_n) = \bar f(x_s) \]

and hence that every continuous extension of $f$ containing $x_s$ in its domain assumes the same value in $x_s$. Continuous functions with singularities that are not removable are easy to construct. For instance, $f : \mathbb{R}^* \to \mathbb{R}$ defined by $f(x) := 1/x$ has a singularity at $x = 0$ and the sequence

\[ f(1/1), f(1/2), f(1/3), \dots \]

diverges. Since

\[ \lim_{n \to \infty} \frac{1}{n} = 0 , \]

it follows that there is no continuous extension of $f$. The following is an often appearing case of a removable singularity.

Fig. 30: Graphs of $f$ (red) and $h$ (blue) from Example 2.3.54.

Example 2.3.54. (Removable singularities) Define $f : \mathbb{R} \to \mathbb{R}$ by

\[ f(x) := \frac{\sin(x)}{x} \]

for every $x \in \mathbb{R}^*$ and $f(0) := 1$. Then $f$ is continuous. Proof: By Theorem 2.3.48, the continuity of $\sin$ and of the linear function $p : \mathbb{R} \to \mathbb{R}$, defined by $p(x) := x$, $x \in \mathbb{R}$, see Example 2.3.29, the continuity of $f$ in all points of $\mathbb{R}^*$ follows. The proof that $f$ is also continuous in $x = 0$ follows from the following inequality (compare Fig. 30):

\[ \left| \frac{\sin(x)}{x} - 1 \right| \le \frac{1}{\cos(x)} - 1 , \tag{2.3.18} \]

for all $x \in (-\pi/2, \pi/2) \setminus \{0\}$. For its derivation and in a first step, we assume that $0 < x < \pi/2$ and consider the triangle $ADF$ in Fig. 29, in particular the areas $A(ABF)$, $A(ACF)$ and $A(ADF)$ of the triangle $ABF$, the circular sector $ACF$ and the triangle $ADF$, respectively. Then we have the following relation:

\[ A(ABF) \le A(ACF) \le A(ADF) \]

and hence

\[ \frac{1}{2} \sin(x) \cos(x) \le \frac{x}{2} \le \frac{\tan(x)}{2} \]

and

\[ \cos(x) \le \frac{\sin(x)}{x} \le \frac{1}{\cos(x)} . \]

From this follows, by the symmetries of $\sin$, $\cos$ under sign change of the argument, the same inequality for $-\pi/2 < x < 0$. Hence for $x \in (-\pi/2, \pi/2) \setminus \{0\}$:

\[ \frac{\sin(x)}{x} - 1 \le \frac{1}{\cos(x)} - 1 \]

and

\[ 1 - \frac{\sin(x)}{x} \le 1 - \cos(x) \le \frac{1}{\cos(x)} - 1 \]

and hence finally (2.3.18). Now since $h : (-\pi/2, \pi/2) \to \mathbb{R}$ defined by

\[ h(x) := \frac{1}{\cos(x)} - 1 , \]

for all $x \in (-\pi/2, \pi/2)$ is continuous, it follows by (2.3.18) and Theorem 2.3.10 the continuity of $f$ also in $x = 0$.
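The inequality (2.3.18) can be spot-checked numerically. The following Python sketch is only a plausibility check at a handful of assumed sample points and proves nothing. The helper names `lhs` and `rhs` are hypothetical.

```python
import math


def lhs(x):
    """Left-hand side of (2.3.18): |sin(x)/x - 1|."""
    return abs(math.sin(x) / x - 1.0)


def rhs(x):
    """Right-hand side of (2.3.18): 1/cos(x) - 1."""
    return 1.0 / math.cos(x) - 1.0


# Sample points +-1, +-0.1, ..., +-1e-5, all inside (-pi/2, pi/2) without 0.
samples = [sign * 10.0 ** (-k) for k in range(6) for sign in (1.0, -1.0)]
inequality_holds = all(lhs(x) <= rhs(x) for x in samples)
```

Both sides shrink to 0 as the sample points approach 0, which is the squeeze used in the continuity argument.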

Remark 2.3.55. The alert reader might have noticed that geometric intuition was used in the derivation of the inequality (2.3.18) that is also used further on, although such intuition is no longer admitted in proofs. Indeed, this could be avoided by introducing the sine and cosine functions by their power series expansions, see Example 3.4.27 from Calculus II, but this would take us too far off course.

Often, in particular in applications, functions occur that are defined on unbounded intervals of the real numbers. For instance, such appear in the description of the frequently occurring physical systems of infinite extension, like the motion of planets and comets around the sun. In such cases the behavior of the function near $+\infty$ and/or $-\infty$ is of interest. Such study would be much simplified if $+\infty$ and $-\infty$ were part of the real numbers, which is not the case. But there is a simple method to reduce the discussion of the behavior of a function near $+\infty$ and/or $-\infty$ to that of a related function near 0, which is based on the fact that the auxiliary function $h := (\mathbb{R}^* \to \mathbb{R}, x \mapsto 1/x)$ maps large positive real numbers to small positive numbers and large negative real numbers to small negative numbers. Hence the behavior of a function $f$ near $\pm\infty$ is completely determined by the behavior of the function $\bar f := f \circ h$ near 0. This fact provides a simple method for the calculation of limits at infinity.

Theorem 2.3.56. (Limits at infinity) Let $a > 0$ and $L$ be some real number.

(i) If $f : [a, \infty) \to \mathbb{R}$ is continuous, then

\[ \lim_{x \to \infty} f(x) = L \]

if and only if the transformed function $\bar f : [0, 1/a] \to \mathbb{R}$ defined by

\[ \bar f(x) := f(1/x) \]

for all $x \in (0, 1/a]$ and $\bar f(0) := L$ is continuous in 0. In this case, we call the parallel to the $x$-axis through $(0, L)$ a ‘horizontal asymptote of $G(f)$ for large positive $x$’.

(ii) If $f : (-\infty, -a] \to \mathbb{R}$ is continuous, then

\[ \lim_{x \to -\infty} f(x) = L \]

if and only if the transformed function $\bar f : [-1/a, 0] \to \mathbb{R}$ defined by

\[ \bar f(x) := f(1/x) \]

for all $x \in [-1/a, 0)$ and $\bar f(0) := L$ is continuous in 0. In this case, we call the parallel to the $x$-axis through $(0, L)$ a ‘horizontal asymptote of $G(f)$ for large negative $x$’.

Proof. “(i)”: If

\[ \lim_{x \to \infty} f(x) = L , \tag{2.3.19} \]

we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $(0, 1/a]$ that is convergent to 0. As a consequence, for $m \in \mathbb{N}$, there is $N \in \mathbb{N}$ such that

\[ x_n = |x_n| \le \frac{1}{m + 1} \]

for all $n \in \mathbb{N}$ such that $n \ge N$. This implies that

\[ \frac{1}{x_n} \ge m + 1 \ge m \]

for all $n \in \mathbb{N}$ such that $n \ge N$. Hence it follows from (2.3.19) that

\[ L = \lim_{n \to \infty} f(1/x_n) = \lim_{n \to \infty} \bar f(x_n) . \]

Obviously, this also implies that

\[ \lim_{n \to \infty} \bar f(x_n) = L \]

for sequences $x_1, x_2, \dots$ in $[0, 1/a]$ that are convergent to 0 and hence that $\bar f$ is continuous in 0. On the other hand, if $\bar f$ is continuous in 0, we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $[a, \infty)$ which contains only finitely many members that are $\le m$ for every $m \in \mathbb{N}$. Then for such $m$, there is $N \in \mathbb{N}$ such that $x_n > m + 1$ for all $n \in \mathbb{N}$ satisfying $n \ge N$. This also implies that

\[ \left| \frac{1}{x_n} \right| = \frac{1}{x_n} < \frac{1}{m + 1} \]

for such $n$. Since this is true for every $m \in \mathbb{N}$, we conclude that

\[ \lim_{n \to \infty} \frac{1}{x_n} = 0 \]

and hence by the continuity of $\bar f$ in 0 that

\[ L = \lim_{n \to \infty} \bar f\left( \frac{1}{x_n} \right) = \lim_{n \to \infty} f(x_n) . \]

Finally, since this is true for every such sequence $x_1, x_2, \dots$, (2.3.19) follows.

“(ii)”: The proof is analogous to that of (i). If

\[ \lim_{x \to -\infty} f(x) = L , \tag{2.3.20} \]

we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $[-1/a, 0)$ that is convergent to 0. As a consequence, for $m \in \mathbb{N}$, there is $N \in \mathbb{N}$ such that

\[ x_n = -|x_n| \ge -\frac{1}{m + 1} \]

for all $n \in \mathbb{N}$ such that $n \ge N$. This implies that

\[ \frac{1}{x_n} \le -(m + 1) \le -m \]

for all $n \in \mathbb{N}$ such that $n \ge N$. Hence it follows from (2.3.20) that

\[ L = \lim_{n \to \infty} f(1/x_n) = \lim_{n \to \infty} \bar f(x_n) . \]

Obviously, this also implies that

\[ \lim_{n \to \infty} \bar f(x_n) = L \]

for sequences $x_1, x_2, \dots$ in $[-1/a, 0]$ that are convergent to 0 and hence that $\bar f$ is continuous in 0. On the other hand, if $\bar f$ is continuous in 0, we conclude as follows. For this, let $x_1, x_2, \dots$ be a sequence in $(-\infty, -a]$ which contains only finitely many members that are $\ge -m$ for every $m \in \mathbb{N}$. Then for such $m$, there is $N \in \mathbb{N}$ such that $x_n < -(m + 1)$ for all $n \in \mathbb{N}$ satisfying $n \ge N$. This also implies that

\[ \left| \frac{1}{x_n} \right| = -\frac{1}{x_n} < \frac{1}{m + 1} \]

for such $n$. Since this is true for every $m \in \mathbb{N}$, we conclude that

\[ \lim_{n \to \infty} \frac{1}{x_n} = 0 \]

and hence by the continuity of $\bar f$ in 0 that

\[ L = \lim_{n \to \infty} \bar f\left( \frac{1}{x_n} \right) = \lim_{n \to \infty} f(x_n) . \]

Finally, since this is true for every such sequence $x_1, x_2, \dots$, (2.3.20) follows.

Fig. 31: $G(f)$ and asymptote for Example 2.3.57.

Example 2.3.57. Consider the function $f : [1, \infty) \to \mathbb{R}$ defined by

\[ f(x) := \frac{x^2 - 1}{x^2 + 1} \]

for all $x \in [1, \infty)$. See Fig. 31. Then the transformed function $\bar f : [0, 1] \to \mathbb{R}$, defined by

\[ \bar f(x) := \frac{1 - x^2}{1 + x^2} \]

for all $x \in [0, 1]$, is continuous and hence since

\[ \bar f(x) = f(1/x) \]

for all $x \in (0, 1]$, it follows that

\[ \lim_{x \to \infty} f(x) = \bar f(0) = 1 . \]

Hence $y = 1$ is a horizontal asymptote of $G(f)$ for large positive $x$. See Fig. 31.
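The substitution $x \mapsto 1/x$ of Theorem 2.3.56 can be mirrored in a few lines of Python. This is a hedged numerical illustration of Example 2.3.57, with `f_bar` as an assumed name for the transformed function. Evaluating at finitely many points does not replace the continuity argument.

```python
def f(x):
    """f(x) = (x^2 - 1) / (x^2 + 1) on [1, infinity)."""
    return (x * x - 1.0) / (x * x + 1.0)


def f_bar(x):
    """Transformed function (1 - x^2) / (1 + x^2) on [0, 1]; f_bar(x) = f(1/x) for x > 0."""
    return (1.0 - x * x) / (1.0 + x * x)


limit_candidate = f_bar(0.0)   # value of the transformed function in 0
far_value = f(1.0e6)           # f evaluated at a large argument
```

Here `far_value` should be close to `limit_candidate`, in line with the horizontal asymptote $y = 1$.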

Fig. 32: $G(f)$ and asymptotes for Example 2.3.58.

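Example 2.3.58. Find the limits

\[ \lim_{x \to \infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} , \quad \lim_{x \to -\infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} . \]

Solution: Define $f : \{x \in \mathbb{R} : x \ne 5/3\} \to \mathbb{R}$ by

\[ f(x) := \frac{\sqrt{2x^2 + 1}}{3x - 5} \]

for all $x \in \mathbb{R}$ such that $x \ne 5/3$. Then the transformed functions $\bar f$ corresponding to the restrictions of $f$ to $[2, \infty)$ and $(-\infty, -1]$ are given by

\[ \bar f(x) := \frac{x}{|x|} \cdot \frac{\sqrt{2 + x^2}}{3 - 5x} \]

for all $x \in (0, 1/2]$ together with $\bar f(0) := \sqrt{2}/3$, and by

\[ \bar f(x) := \frac{x}{|x|} \cdot \frac{\sqrt{2 + x^2}}{3 - 5x} \]

for all $x \in [-1, 0)$ together with $\bar f(0) := -\sqrt{2}/3$, respectively. Both functions are continuous, and hence

\[ \lim_{x \to \infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} = \frac{\sqrt{2}}{3} , \quad \lim_{x \to -\infty} \frac{\sqrt{2x^2 + 1}}{3x - 5} = -\frac{\sqrt{2}}{3} . \]

See Fig. 32.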
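The two limits of Example 2.3.58 can be cross-checked numerically. The Python sketch below is an illustration under the assumption that the transformed function is $f(1/x)$ written out for $x \ne 0$. The name `f_bar` is hypothetical, and large finite arguments only approximate the limits.

```python
import math


def f(x):
    """f(x) = sqrt(2 x^2 + 1) / (3 x - 5), defined for x != 5/3."""
    return math.sqrt(2.0 * x * x + 1.0) / (3.0 * x - 5.0)


def f_bar(x):
    """f(1/x) for x != 0 and x != 3/5: sign(x) * sqrt(2 + x^2) / (3 - 5 x)."""
    return (x / abs(x)) * math.sqrt(2.0 + x * x) / (3.0 - 5.0 * x)


expected = math.sqrt(2.0) / 3.0
right_value = f(1.0e8)    # approximates the limit for x -> +infinity
left_value = f(-1.0e8)    # approximates the limit for x -> -infinity
```

The values `right_value` and `left_value` should be close to `expected` and `-expected`, respectively.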

Problems

1) Show the continuity of the function $f$. For this, use only Theorems 2.3.46, 2.3.48, 2.3.51 on sums, products/quotients, compositions of continuous functions, and the continuity of constant functions/the identity function $\mathrm{id}_{\mathbb{R}}$ on $\mathbb{R}$.

a) $f(x) := x + 7$, $x \in \mathbb{R}$,
b) $f(x) := x^2$, $x \in \mathbb{R}$,
c) $f(x) := 3/x$, $x \in \mathbb{R}^*$,
d) $f(x) := (x + 3)/(x - 8)$, $x \in \mathbb{R} \setminus \{8\}$,
e) $f(x) := (x^2 + 3x + 2)/(x^2 + 2x + 2)$, $x \in \mathbb{R}$.

2) Assume that $f$ and $g$ are continuous functions in $x = 0$ such that $f(0) = 2$ and

\[ \lim_{x \to 0} [2 f(x) - 3 g(x)] = 1 . \]

Calculate $g(0)$.

In the following, it can be assumed that rational functions, i.e., quotients of polynomial functions, are continuous on their domain of definition. In addition, it can be assumed that the exponential function, the natural logarithm function, the general power function, the sine and cosine function and the tangent function are continuous.

3) For arbitrary $c, d \in \mathbb{R}$, define $f_{c,d} : \mathbb{R} \to \mathbb{R}$ by

\[ f_{c,d}(x) := \begin{cases} 1/(x^2 + 1) & \text{if } x \in (-\infty, -1) \\ cx + d & \text{if } x \in [-1, 1] \\ \sqrt{4x + 5} & \text{if } x \in (1, \infty) \end{cases} \]

for all $x \in \mathbb{R}$. Determine $c, d$ such that the corresponding $f_{c,d}$ is everywhere continuous. Give reasons.

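4) For arbitrary $c \in \mathbb{R}$, define $f_c : [0, \infty) \to \mathbb{R}$ by

\[ f_c(x) := \begin{cases} x \sin(1/x) & \text{if } x \in (0, \infty) \\ c & \text{if } x = 0 \end{cases} \]

for all $x \in [0, \infty)$. Determine $c$ such that the corresponding $f_c$ is everywhere continuous. Explain your answer.

5) For every $k \in \mathbb{R}$, define $f_k : [-1/3, \infty) \to \mathbb{R}$ by

\[ f_k(x) := \begin{cases} \dfrac{\sqrt{3x + 1} - \sqrt{2x + 2}}{x - 1} & \text{if } x \in [-1/3, \infty) \setminus \{1\} \\ k & \text{if } x = 1 . \end{cases} \]

For what value of $k$ is $f_k$ continuous? Give explanations.

6) Define the function $f : \mathbb{R} \to \mathbb{R}$ by $f(x) := x^4 + 10x - 15$ for all real $x$. Use your calculator to find an interval of length $1/100$ which contains a zero of $f$ (i.e., some real $x$ such that $f(x) = 0$). Give explanations.

7) Determine in each case whether the given sequence has a limit. If there is one, calculate that limit. Otherwise, give arguments why there is no limit.

a) $x_n := \sqrt{n + 1} - \sqrt{n}$ , b) $x_n := \dfrac{\sqrt{n}}{\sqrt{n + 1}}$

for all $n \in \mathbb{N}$.

8) Find the limits.

a) $\lim_{n \to \infty} e^{1/n}$ ,
b) $\lim_{n \to \infty} \cos\left( \dfrac{n + 1}{n} \cdot \pi \right)$ ,
c) $\lim_{n \to \infty} \dfrac{\cos(n)}{n}$ ,
d) $\lim_{n \to \infty} \dfrac{\ln(n)}{n}$ , Hint: Use that $\ln(n) \le 2\sqrt{n}$ for $n \ge 1$ .
e) $\lim_{n \to \infty} n^{1/n}$ , Hint: Use d)
f) $\lim_{n \to \infty} \left( \sqrt{n^2 + 6n} - n \right)$ ,
g) $\lim_{n \to \infty} \left( n^{1/3} - (n + 1)^{1/3} \right)$ ,
Hint: Use that $a - b = \dfrac{a^3 - b^3}{a^2 + ab + b^2}$ for all $a \ne b$ ,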

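h) $\lim_{h \to 0} \dfrac{(1 + h)^{2/3} - 1}{h}$ , Hint: Use the hint in g) .

9) Calculate the limits.

a) $\lim_{n \to \infty} \sin\left( \dfrac{17n + 4}{n + 5} \right)$ ,
b) $\lim_{x \to 2} \tan\left( \dfrac{3x + 2}{5x + 7} \right)$ ,
c) $\lim_{x \to 5} \dfrac{x^2 - 8x + 15}{x - 5}$ .

In each case, give explanations.

10) Find the limits

a) $\lim_{x \to \infty} [x/(x + 1)]$ ,
b) $\lim_{x \to -\infty} [x/(x + 1)]$ ,
c) $\lim_{x \to \infty} [(\sin x)/x]$ ,
d) $\lim_{x \to \infty} [(3x^3 + 2x^2 + 5x + 4)/(2x^3 + x^2 + x + 5)]$ ,
e) $\lim_{x \to \infty} (x/\sqrt{1 + x^2})$ ,
f) $\lim_{x \to -\infty} (x/\sqrt{1 + x^2})$ ,
g) $\lim_{x \to \infty} (\sqrt{3x^2 + 2x}/\sqrt{2x^2 + 5})$ ,
h) $\lim_{x \to \infty} (\sqrt{x^2 + 3} - \sqrt{x^2 + 1})$ ,
i) $\lim_{x \to -\infty} (\sqrt{x^2 + 4x + 5} - \sqrt{x^2 + 2})$ .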


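11) Define $f : \mathbb{R} \to \mathbb{R}$ by

\[ f(x) := \begin{cases} x & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]

for every $x \in \mathbb{R}$. Find the points of discontinuity of $f$.

12) Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) := 0$ if $x = 0$ or if $x$ is irrational and $f(m/n) := 1/n$ if $m \in \mathbb{Z}^*$ and $n \in \mathbb{N}^*$ have no common divisor greater than 1. Find the points of discontinuity of $f$.

13) Let $f$ and $g$ be continuous functions from $\mathbb{R}$ to $\mathbb{R}$ whose restrictions to $\mathbb{Q}$ coincide. Show that $f = g$.

14) Let $D \subset \mathbb{R}$, $f : D \to \mathbb{R}$ be continuous in some $x \in D$ and $f(x) > 0$. By an indirect proof, show that there is $\varepsilon > 0$ such that $f(y) > 0$ for all $y \in D \cap (x - \varepsilon, x + \varepsilon)$.

15) Let $a, b \in \mathbb{R}$ such that $a < b$ and $f : [a, b] \to [a, b]$ be continuous. By use of the intermediate value theorem, show that $f$ has a fixed point, i.e., that there is $x \in [a, b]$ such that $f(x) = x$.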

16) Use the intermediate value theorem to prove that for every $a \ge 0$ there is a uniquely determined $x \ge 0$ such that $x^2 = a$. That $x$ is denoted by $\sqrt{a}$.

Fig. 33: Graph of $A$ and its point with maximum ordinate.

2.4 Differentiation

Possibly, the first mathematician to use the derivative concept in some implicit form is Pierre de Fermat in his calculation of maximum / minimum ordinate values of curves in Cartesian coordinate systems and in his way of determination of tangents at the points of curves.

The first may be due to the observation that the ordinate values of a curve near a maximum (or a minimum) change very little near the abscissa of its location, differently to other points of the curve. It is not clear whether this was his real motivation because he never published his method, but only described it in communications to other mathematicians from 1637 onwards. Also in these instances, he did not explain its logical basis so that its general validity was quickly questioned. On the other hand, his procedure suggests that observation as the basis of the method. For display of the method, he considers the problem of finding the maximal area of a rectangle with perimeter $2b$ where $b > 0$. If $x \ge 0$ denotes the width of such a rectangle, the corresponding area is given by

\[ A(x) := x \cdot (b - x) , \]

see Fig. 33. If $(x_0, A(x_0))$ is the point of $G(A)$ with maximal ordinate, then

\[ A(x_0 + h) = (x_0 + h) \cdot (b - x_0 - h) = x_0 \cdot (b - x_0) + h \cdot (b - 2x_0 - h) = A(x_0) + h \cdot (b - 2x_0 - h) \approx A(x_0) \tag{2.4.1} \]

for $h$ such that $x_0 + h \in D(A) = [0, b]$ and of small absolute value, where $\approx$ means ‘approximately’. Hence if $h \ne 0$,

\[ b - 2x_0 - h \approx 0 . \]

By neglecting the term $-h$ on the left hand side of the last relation, he arrives at the equation

\[ b - 2x_0 = 0 \tag{2.4.2} \]

and hence at $x_0 = b/2$ which gives $A(x_0) = b^2/4$. Indeed, the rectangle with perimeter $2b$ of maximal area is given by a square with sides $b/2$. We note that from (2.4.1) it follows that

\[ \frac{A(x_0 + h) - A(x_0)}{h} = b - 2x_0 - h \]

if $h \ne 0$. Hence the equation (2.4.2) is equivalent to the demand that

\[ \lim_{h \to 0, h \ne 0} \frac{A(x_0 + h) - A(x_0)}{h} = 0 \]

where the addition of $h \ne 0$ in the limit symbol indicates that only sequences with non-vanishing members are admitted. In modern calculus / analysis, the limit on the left of the last equation is called the derivative of $A$ in $x_0$ and is denoted by $A'(x_0)$. Hence in modern terms, Fermat demands that $A'(x_0) = 0$. Indeed, the vanishing of the derivative in a point is necessary, but not sufficient, for a (differentiable) function to assume an extremum, i.e., a minimum or maximum value, in that point, see Theorem 2.5.1.
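Fermat's computation for the rectangle area can be replayed numerically. The following Python sketch is a hedged illustration with assumed values $b = 2$ and hypothetical function names. It compares the difference quotient of $A$ with the closed form $b - 2x_0 - h$ obtained from (2.4.1).

```python
def area(x, b):
    """A(x) = x * (b - x): area of a rectangle with width x and perimeter 2b."""
    return x * (b - x)


def difference_quotient(x0, h, b):
    """(A(x0 + h) - A(x0)) / h for h != 0."""
    return (area(x0 + h, b) - area(x0, b)) / h


b = 2.0                                               # assumed half perimeter
generic = difference_quotient(0.5, 1.0e-3, b)         # equals b - 2*x0 - h up to rounding
at_maximum = difference_quotient(b / 2.0, 1.0e-3, b)  # equals -h up to rounding
```

At $x_0 = b/2$ the quotient reduces to $-h$, which vanishes as $h$ tends to 0. This is the condition (2.4.2).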

Fermat uses a similar method for the determination of tangent lines to curves. To a greater extent, such were not studied until the middle of the 17th century. Apart from Archimedes’ construction of tangent lines to his spiral, in ancient Greece, tangents were constructed only in few simple cases, namely for ellipses, parabolas and hyperbolas where they were defined as lines that touch the curve in only one point. In general, this definition is too imprecise. In particular, the concept of differentiation also gives a precise meaning to tangent lines to curves. For the description of Fermat’s method, we consider Fig. 34 which displays the graph of a function $f$ together with its tangent at the point $(a, f(a))$ and the normal to the tangent in this point. By definition, the tangent goes through the point $(a, f(a))$ and hence is determined once we know the location of its intersection $(a - c, 0)$ with the $x$-axis where $c$ is the unknown. For the determination of $c$, Fermat considers the triangles with corners $(a - c, 0)$, $(a, 0)$, $(a, f(a))$ and $(a - c, 0)$, $(a + h, 0)$, $(a + h, f(a + h))$ to be approximately similar in the case of a tangent. These triangles are similar only if the point $(a + h, f(a + h))$ lies on the tangent. In general, the error of the approximation is becoming smaller with smaller $h$. The approximation gives the relation

\[ \frac{f(a)}{c} \approx \frac{f(a + h)}{c + h} \]

or

\[ (c + h) f(a) \approx c f(a + h) . \]

The last gives

\[ c \approx \frac{h f(a)}{f(a + h) - f(a)} . \]

Fig. 34: Depiction to Fermat’s method of determination of tangents.

If $f$ is explicitly given, Fermat proceeds further by performing the division and neglecting $h$ as in his previous method. For instance if $f(x) = x^2$ for all $x \in \mathbb{R}$, then

\[ \frac{h f(a)}{f(a + h) - f(a)} = \frac{h a^2}{(a + h)^2 - a^2} = \frac{h a^2}{2ah + h^2} = \frac{a^2}{2a + h} \approx \frac{a}{2} \]

which leads to $c = a/2$. Indeed, this is the correct result. Also, using modern notation and assuming that

\[ f'(a) := \lim_{h \to 0, h \ne 0} \frac{f(a + h) - f(a)}{h} \ne 0 , \]

the following

\[ c = \lim_{h \to 0, h \ne 0} \frac{h f(a)}{f(a + h) - f(a)} = \frac{f(a)}{f'(a)} \]

gives the correct result. As a side remark, in older literature, the directed line segment $(a, f(a))$, $(a - c, 0)$ is called the tangent line in $(a, f(a))$ and its projection onto the $x$-axis the corresponding subtangent. In addition, the directed line segment $(a, f(a))$, $(a + d, 0)$ is called the normal in $(a, f(a))$ and its projection onto the $x$-axis the corresponding subnormal. When Fermat’s method was reported to René Descartes by Marin Mersenne in 1638, Descartes attacked it as not generally valid. He proposed as a challenge the curve

\[ C := \{(x, y) \in \mathbb{R}^2 : x^3 + y^3 = 3axy\} , \tag{2.4.3} \]

$a \in \mathbb{R}$, which since then is known as ‘Folium of Descartes’. Indeed, Fermat’s ‘method’ produced the right results, and ultimately Descartes conceded its validity.
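The subtangent approximation $c \approx h f(a) / (f(a + h) - f(a))$ from Fermat's tangent method above can be evaluated for shrinking $h$. The Python sketch below is an illustration only, using the example $f(x) = x^2$ from the text and an assumed point $a = 3$. The function names are hypothetical.

```python
def subtangent_estimate(f, a, h):
    """Fermat's approximation h * f(a) / (f(a + h) - f(a)) of the subtangent length c."""
    return h * f(a) / (f(a + h) - f(a))


def square(x):
    """f(x) = x^2, the example used in the text."""
    return x * x


a = 3.0  # assumed point of tangency
estimates = [subtangent_estimate(square, a, 10.0 ** (-k)) for k in range(1, 6)]
```

For $f(x) = x^2$ each estimate equals $a^2 / (2a + h)$ up to rounding, so the list approaches $a/2 = 1.5$.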

A further candidate for the first mathematician to use the derivative concept in some implicit form is Galileo Galilei. In 1589, using inclined planes, Galileo discovered experimentally that in vacuum all bodies, regardless of their weight, shape, or composition, are uniformly accelerated in exactly the same way, and that the fallen distance $s$ is proportional to the square of the elapsed time $t$:

\[ s(t) = \frac{1}{2} g t^2 \tag{2.4.4} \]

for all $t \in \mathbb{R}$ where $g = 9.8 \, \mathrm{m/sec^2}$ is the gravitational acceleration. This result was in contradiction to the generally accepted traditional theory of Aristotle that assumed that heavier objects fall faster than lighter ones. On the Third Day of his ‘Discorsi’ from 1638 [42], he discusses uniform and naturally accelerated motion. The idea that the velocity is the same as a derivative can be read between the lines. Even a recognition of the fundamental theorem of calculus, see Theorem 2.6.19, is visible in this special case. A modern way of deduction would proceed, for instance, as follows.

Fig. 35: Folium of Descartes for the case $a = 1$, compare (2.4.3).

For this, we consider the average speed of a falling body described by (2.4.4), i.e., the traveled distance divided by the elapsed time, during the time interval $[t, t + h]$, if $h > 0$, and $[t + h, t]$, if $h < 0$, respectively, for some $t, h \in \mathbb{R}$. Then

\[ \frac{s(t + h) - s(t)}{t + h - t} = \frac{g}{2h} [(t + h)^2 - t^2] = \frac{g}{2h} (2ht + h^2) = g \left( t + \frac{h}{2} \right) . \]

Hence it follows by Example 2.3.29 that

\[ \lim_{h \to 0, h \ne 0} \frac{s(t + h) - s(t)}{t + h - t} = gt , \]

which suggests itself as (and indeed is the) definition of the instantaneous speed $v(t)$ of the body at time $t$:

\[ v(t) := s'(t) := \lim_{h \to 0, h \ne 0} \frac{s(t + h) - s(t)}{t + h - t} = gt . \]

For a geometrical interpretation of the limit

\[ \lim_{h \to 0, h \ne 0} \frac{s(t + h) - s(t)}{t + h - t} , \]

also in more general situations where $s$ is not necessarily given by (2.4.4), note that the quotient

\[ \frac{s(t + h) - s(t)}{t + h - t} \]

gives the slope of the line segment (‘secant’) between the points $(t, s(t))$ and $(t + h, s(t + h))$ on the graph of $s$ for every $h \ne 0$. In the limit $h \to 0$ that slope approaches the slope of the tangent to $G(s)$ in the point $(t, s(t))$. Hence in particular, a geometrical interpretation of $s'(t) = v(t)$ is the slope of the tangent to $G(s)$ at the point $(t, s(t))$, see Fig. 36.
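The average speed over a short time interval can be computed directly from (2.4.4). The following Python sketch is a hedged illustration with assumed sample values $t = 0.4$ and $g = 9.8$. It compares the secant slope with the closed form $g (t + h/2)$ derived above.

```python
G = 9.8  # gravitational acceleration in m/sec^2, as in the text


def s(t):
    """Fallen distance s(t) = g t^2 / 2 from (2.4.4)."""
    return 0.5 * G * t * t


def average_speed(t, h):
    """Secant slope (s(t + h) - s(t)) / h for h != 0."""
    return (s(t + h) - s(t)) / h


t = 0.4  # assumed sample time in seconds
secant_slopes = [average_speed(t, 10.0 ** (-k)) for k in range(1, 6)]
instantaneous_speed = G * t  # the limit g t
```

The entries of `secant_slopes` approach `instantaneous_speed` as $h$ shrinks, with the gap equal to $g h / 2$ up to rounding.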

Fig. 36: $G(s)$, secant line and tangent at $(0.4, s(0.4))$.

As for the definition of the continuity of functions, Cauchy, in his textbook ‘Cours d’analyse’ from 1821 [22] and by using Lagrange’s notation, terminology and Lagrange’s characterization of the derivative in terms of inequalities, was the first to give a definition of the derivative of a function based on limits which is very near to the modern definition. Still, his understanding of limits was different from the modern understanding. This was not without consequences. During the early 19th century, it resulted in the general belief that every continuous function is everywhere differentiable, except perhaps at finitely many points. Even several ‘proofs’ of this ‘fact’ appeared during that time. Therefore, it came as a shock when in 1872 [99] Weierstrass proved the existence of a continuous function which is nowhere differentiable, see Example 3.4.13. For the first time, this result signaled the complete mastery of the concepts of derivative and limit which is characteristic for modern calculus / analysis.

Definition 2.4.1. Let $f : (a, b) \to \mathbb{R}$ be a function where $a, b \in \mathbb{R}$ such that $a < b$. Further, let $x \in (a, b)$ and $c \in \mathbb{R}$. We say $f$ is differentiable in $x$ with derivative $c$ if for all sequences $x_0, x_1, \dots$ in $(a, b) \setminus \{x\}$ which are convergent to $x$ it follows that

\[ \lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x} = c . \]

In this case, we define the derivative $f'(x)$ of $f$ in $x$ by

\[ f'(x) := c . \]

Further, we say $f$ is differentiable if $f$ is differentiable in all points of its domain $(a, b)$. In that case, we call the function $f' : (a, b) \to \mathbb{R}$ associating to every $x \in (a, b)$ the corresponding $f'(x)$ the derivative of $f$. Higher order derivatives of $f$ are defined recursively. If $f^{(k)}$ is differentiable for $k \in \mathbb{N}^*$, we define the derivative $f^{(k+1)}$ of order $k + 1$ of $f$ by $f^{(k+1)} := (f^{(k)})'$, where we set $f^{(1)} := f'$. In that case, $f$ will be referred to as $(k + 1)$-times differentiable. Frequently, we also use the notation $f'' := f^{(2)}$ and $f''' := f^{(3)}$.
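The sequence-based definition can be explored numerically for one particular sequence. The Python sketch below is only an illustration: the definition quantifies over all admissible sequences, while the code follows the single assumed sequence $x_n = x + (-1)^n / n$, and the function name is hypothetical.

```python
def difference_quotients(f, x, count):
    """Quotients (f(x_n) - f(x)) / (x_n - x) along x_n = x + (-1)^n / n, n = 1..count."""
    quotients = []
    for n in range(1, count + 1):
        x_n = x + (-1) ** n / n  # x_n != x and x_n -> x, approaching from both sides
        quotients.append((f(x_n) - f(x)) / (x_n - x))
    return quotients


# Assumed example: f(t) = t^2 at x = 1, where the quotient equals x_n + x.
square_quotients = difference_quotients(lambda t: t * t, 1.0, 1000)
```

For this example the quotients equal $2 + (-1)^n / n$ up to rounding and hence approach 2.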

The differentiability of a function in a point of its domain implies also its continuity in that point. This is a simple consequence of the definition of differentiability and the limit laws, Theorem 2.3.4. That the opposite is not true in general can be seen from Example 2.4.6 or Example 2.4.7. Moreover in Calculus II, we give an example of a continuous function which is not differentiable in any point of its domain, see Example 3.4.13.

Theorem 2.4.2. Let $f : (a, b) \to \mathbb{R}$ be a function where $a, b \in \mathbb{R}$ such that $a < b$. Further, let $f$ be differentiable in $x \in (a, b)$. Then $f$ is also continuous in $x$.

Proof. Let $x_0, x_1, \dots$ be a sequence in $(a, b)$ which is convergent to $x$. Obviously, it is sufficient to assume that $x_0, x_1, \dots$ is a sequence in $(a, b) \setminus \{x\}$. Then it follows by the limit laws, Theorem 2.3.4, that

\[ \lim_{n \to \infty} (f(x_n) - f(x)) = \lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x} \cdot \lim_{n \to \infty} (x_n - x) = 0 \]

and hence that

\[ \lim_{n \to \infty} f(x_n) = f(x) . \]


Similar to the case of continuous functions, we shall see later on, see Theorems 2.4.8, 2.4.10, that sums, products, quotients (wherever defined) and compositions of differentiable functions are differentiable. Indeed, this is another simple consequence of the limit laws, Theorem 2.3.4, and the definition of differentiability. As usual, a typical application of those theorems consists in the decomposition of a given function into sums, products, quotients and compositions of functions whose differentiability is already known. Then the application of those theorems proves the differentiability of that function and allows the calculation of its derivative. To provide a basis for the application of those theorems, in the following, we prove the differentiability of some elementary functions, powers, the exponential function and the sine function, from the definition of differentiability and by use of their special properties. In this process, we also explicitly calculate the derivatives.

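Example 2.4.3. Let $c \in \mathbb{R}$, $n \in \mathbb{N}^*$ and $f, g : \mathbb{R} \to \mathbb{R}$ be defined by

\[ f(x) := c , \quad g(x) := x^n \]

for all $x \in \mathbb{R}$. Then $f, g$ are differentiable and

\[ f'(x) = 0 , \quad g'(x) = n x^{n-1} \]

for all $x \in \mathbb{R}$.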

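Proof. Let $x \in \mathbb{R}$ and $x_0, x_1, \dots$ be a sequence of numbers in $\mathbb{R} \setminus \{x\}$ which is convergent to $x$. Then:

\[ \lim_{\nu \to \infty} \frac{f(x_\nu) - f(x)}{x_\nu - x} = \lim_{\nu \to \infty} 0 = 0 . \]

Further, for any $\nu \in \mathbb{N}$:

\[ \frac{g(x_\nu) - g(x)}{x_\nu - x} = \frac{(x_\nu)^n - x^n}{x_\nu - x} = (x_\nu)^{n-1} + (x_\nu)^{n-2} x + \dots + x_\nu x^{n-2} + x^{n-1} \]

and hence by Example 2.3.49:

\[ \lim_{\nu \to \infty} \frac{g(x_\nu) - g(x)}{x_\nu - x} = x^{n-1} + x^{n-2} x + \dots + x x^{n-2} + x^{n-1} = n x^{n-1} . \]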
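The convergence of the difference quotients in this proof can be observed numerically. The following is a minimal sketch, not part of the original text; the exponent $n = 5$, the point $x = 1.5$ and the sequence $x_\nu = x + 2^{-\nu}$ are arbitrary choices made only for illustration.

```python
def difference_quotient(g, x, x_nu):
    """Slope of the secant of g through x and x_nu (requires x_nu != x)."""
    return (g(x_nu) - g(x)) / (x_nu - x)

n, x = 5, 1.5                      # illustrative choices, not from the text
g = lambda t: t ** n
exact = n * x ** (n - 1)           # the claimed derivative n * x^(n-1)

# Sequence x_nu = x + 2^(-nu): converges to x and never equals x.
quotients = [difference_quotient(g, x, x + 2.0 ** -nu) for nu in range(1, 21)]
errors = [abs(q - exact) for q in quotients]

for nu in (1, 5, 10, 20):
    print(nu, quotients[nu - 1], errors[nu - 1])
```

The printed errors shrink roughly by a factor of two per step, as expected for a one-sided difference quotient.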

In the next example, we show that the derivative of the exponential function is given by that function itself. As we shall see later, this fact along with the fact that $\exp(0) = 1$ can be used to characterize the exponential function, see Example 2.5.8.

Example 2.4.4. The exponential function is differentiable with

\[ \exp'(x) = \exp(x) \]

for all $x \in \mathbb{R}$.

Proof. First, we prove that $\exp$ is differentiable in $0$ with derivative $e^0 = 1$. For this, let $h_1, h_2, \dots$ be some sequence in $\mathbb{R} \setminus \{0\}$ which is convergent to $0$. Moreover, let $n_0 \in \mathbb{N}$ be such that $|h_n| < 1$ for $n \geq n_0$. Then for any such $n$:

\[ \frac{e^{h_n} - e^0}{h_n - 0} - e^0 = \frac{e^{h_n} - (1 + h_n)}{h_n} . \]

We consider the cases $h_n > 0$ and $h_n < 0$. In the first case, it follows by (2.3.10) and some calculation that

\[ 0 \leq \frac{e^{h_n} - (1 + h_n)}{h_n} \leq \frac{h_n}{4} \cdot \frac{3 - h_n}{\left(1 - \frac{h_n}{2}\right)^2} \leq \frac{12}{4} h_n = 3 h_n . \]

Analogously, it follows in the second case that

\[ h_n \leq \frac{h_n}{1 - h_n} \leq \frac{e^{h_n} - (1 + h_n)}{h_n} \leq \frac{h_n}{4} . \]

Hence it follows in both cases that

\[ \left| \frac{e^{h_n} - (1 + h_n)}{h_n} \right| \leq 3 |h_n| \]

and therefore by Theorem 2.3.10 that

\[ \lim_{n \to \infty} \frac{e^{h_n} - (1 + h_n)}{h_n} = 0 . \]

Now let $x \in \mathbb{R}$ and $x_1, x_2, \dots$ be some sequence in $\mathbb{R} \setminus \{x\}$ which is convergent to $x$. Then

\[ \frac{e^{x_n} - e^x}{x_n - x} - e^x = e^x \cdot \frac{e^{x_n - x} - [1 + (x_n - x)]}{x_n - x} , \]

and hence it follows by Theorem 2.3.4 and the previous result that

\[ \lim_{n \to \infty} \frac{e^{x_n} - e^x}{x_n - x} = e^x \]

and therefore the statement of this example.
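The key estimate of this proof, $|(e^{h} - (1 + h))/h| \leq 3|h|$ for $0 < |h| < 1$, can be spot-checked on sample values. The sketch below is an illustration only; the sample points $h = \pm 2^{-k}$ are an arbitrary choice, and `math.expm1` is used merely to limit cancellation errors in floating-point arithmetic.

```python
import math

def remainder_quotient(h):
    """(e^h - (1 + h)) / h for h != 0; expm1(h) = e^h - 1 limits cancellation."""
    return (math.expm1(h) - h) / h

# Sample points h = +/- 2^(-k), k = 1, ..., 15, all with 0 < |h| < 1.
hs = [s * 2.0 ** -k for k in range(1, 16) for s in (1.0, -1.0)]

bound_holds = all(abs(remainder_quotient(h)) <= 3 * abs(h) for h in hs)
print(bound_holds)

# The quotient itself tends to 0 as h tends to 0.
print([remainder_quotient(2.0 ** -k) for k in (1, 5, 10, 15)])
```

A finite check of this kind proves nothing, of course; it only makes the size of the remainder visible.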

Example 2.4.5. The sine function is differentiable with

\[ \sin'(x) = \cos(x) \]

for all $x \in \mathbb{R}$.

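Proof. Let $x \in \mathbb{R}$ and $x_1, x_2, \dots$ be some sequence in $\mathbb{R} \setminus \{x\}$ which is convergent to $x$. Further, define $h_n := x_n - x$, $n \in \mathbb{N}^*$. Then it follows by the addition theorems for the trigonometric functions that

\[ \frac{\sin(x_n) - \sin(x)}{x_n - x} = \frac{\sin(x + h_n) - \sin(x)}{h_n} = \sin(x) \cdot \frac{\cos(h_n) - 1}{h_n} + \cos(x) \cdot \frac{\sin(h_n)}{h_n} \]

\[ = - \sin(x) \cdot \frac{h_n}{2} \left[ \frac{\sin(h_n/2)}{h_n/2} \right]^2 + \cos(x) \cdot \frac{\sin(h_n)}{h_n} \]

and hence by Example 2.3.54 and Theorem 2.3.4 that

\[ \lim_{n \to \infty} \frac{\sin(x_n) - \sin(x)}{x_n - x} = \cos(x) . \]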

Fig. 37: Graph of the modulus function. See Example 2.4.6.

We give two examples of continuous functions that are not differentiable in points of their domains. In the first case, this is due to the presence of a ‘corner’ in the graph of the function. In such a point no tangent to the graph exists and hence the function is not differentiable in the corresponding point of its domain. In the second case, the non-differentiability is due to the fact that there is a vertical tangent to the graph. Since the derivative of a function $f$ in a point $p$ of its domain gives the slope of the tangent to its graph at the point $(p, f(p))$, the derivative in $p$ would have to be infinite in order to account for a vertical tangent, but infinity is not a real number. Therefore, a function is not differentiable in such a point $p$.

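Example 2.4.6. The function $f : \mathbb{R} \to \mathbb{R}$ defined by

\[ f(x) := |x| \]

for all $x \in \mathbb{R}$, is not differentiable in $0$, because

\[ \lim_{n \to \infty} \frac{\left| -\frac{1}{n} \right| - 0}{-\frac{1}{n} - 0} = -1 \neq \lim_{n \to \infty} \frac{\left| \frac{1}{n} \right| - 0}{\frac{1}{n} - 0} = 1 . \]

See Fig. 37.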

Example 2.4.7. The function $f : \mathbb{R} \to \mathbb{R}$ defined by

\[ f(x) := x^{1/3} \]

for all $x \in \mathbb{R}$, is not differentiable in $0$, because the sequence

\[ \frac{\left(\frac{1}{n}\right)^{1/3} - 0^{1/3}}{\frac{1}{n} - 0} = n^{2/3} \]

has no limit for $n \to \infty$. See Fig. 38.
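The two failures of differentiability at $0$ can be made visible by evaluating the difference quotients along the sequences $\pm 1/n$. This is a small illustrative sketch; the helper names `quotient_at_zero` and `cube_root` are introduced here and do not appear in the text.

```python
import math

def quotient_at_zero(f, h):
    """Difference quotient (f(h) - f(0)) / (h - 0) for h != 0."""
    return (f(h) - f(0.0)) / h

def cube_root(x):
    """Real cube root, defined for negative arguments as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

ns = (10, 100, 1000)
left = [quotient_at_zero(abs, -1.0 / n) for n in ns]    # corner: slopes from the left
right = [quotient_at_zero(abs, 1.0 / n) for n in ns]    # corner: slopes from the right
root = [quotient_at_zero(cube_root, 1.0 / n) for n in ns]  # vertical tangent

print(left, right)   # two different constant values
print(root)          # grows like n^(2/3), hence no limit
```

For the modulus function the quotients stay at two different constants; for the cube root they grow without bound.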

As mentioned above, similar to the case of continuous functions, sums, products, quotients (wherever defined) and compositions of differentiable functions are differentiable. This is a simple consequence of the limit laws, Theorem 2.3.4, and the definition of differentiability. A typical application of the thus obtained theorems consists in the decomposition of a given function into sums, products, quotients and compositions of functions whose differentiability is already known. Then the application of those theorems proves the differentiability of that function and allows the calculation of its derivative from the derivatives of the constituents of the decomposition. In this way, the proof of differentiability of a given function is greatly simplified and, usually, obvious. Also, the calculation of its derivative is reduced to a simple mechanical procedure if the derivatives of the constituents of the decomposition are known. Therefore, in such obvious cases in future, the differentiability of the function will be just stated and its derivative will be given without explicit proof.

Theorem 2.4.8. (Sum rule, product rules and quotient rule) Let $f, g$ be two differentiable functions from some open interval $I$ into $\mathbb{R}$ and $a \in \mathbb{R}$.

(i) Then $f + g$, $a \cdot f$ and $f \cdot g$ are differentiable with

\[ (f + g)'(x) = f'(x) + g'(x) , \quad (a \cdot f)'(x) = a \cdot f'(x) , \]

\[ (f \cdot g)'(x) = f(x) \cdot g'(x) + g(x) \cdot f'(x) \]

for all $x \in I$.

(ii) If $f(x) \neq 0$ for all $x \in I$, then $1/f$ is differentiable and

\[ \left( \frac{1}{f} \right)'(x) = - \frac{f'(x)}{[f(x)]^2} \]

for all $x \in I$.

Proof. For this let $x \in I$ and $x_1, x_2, \dots$ be some sequence in $I \setminus \{x\}$ which is convergent to $x$. Then:

\[ \frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} \leq \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|} \]

and

\[ \frac{|(a \cdot f)(x_\nu) - (a \cdot f)(x) - [a \cdot f'(x)](x_\nu - x)|}{|x_\nu - x|} = |a| \cdot \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} \]

and hence

\[ \lim_{\nu \to \infty} \frac{|(f + g)(x_\nu) - (f + g)(x) - (f'(x) + g'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 \]

and

\[ \lim_{\nu \to \infty} \frac{|(a \cdot f)(x_\nu) - (a \cdot f)(x) - [a \cdot f'(x)](x_\nu - x)|}{|x_\nu - x|} = 0 . \]

Further, it follows that

\[ \frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x) \cdot g'(x) + g(x) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} \]

\[ \leq \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} \cdot |g(x)| + |f(x)| \cdot \frac{|g(x_\nu) - g(x) - g'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|f(x_\nu) - f(x)|}{|x_\nu - x|} \cdot |g(x_\nu) - g(x)| \]

and hence that

\[ \lim_{\nu \to \infty} \frac{|(f \cdot g)(x_\nu) - (f \cdot g)(x) - (f(x) \cdot g'(x) + g(x) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 . \]

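If $f$ does not vanish in any point of its domain $I$, it follows that

\[ \frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} \cdot f'(x)(x_\nu - x) \right|}{|x_\nu - x|} \leq \frac{1}{|f(x)|^2} \cdot \frac{|f(x_\nu) - f(x) - f'(x)(x_\nu - x)|}{|x_\nu - x|} + \frac{|f(x_\nu) - f(x)|^2}{|f(x_\nu)| \cdot |f(x)|^2 \cdot |x_\nu - x|} \]

and hence that

\[ \lim_{\nu \to \infty} \frac{\left| \frac{1}{f(x_\nu)} - \frac{1}{f(x)} + \frac{1}{[f(x)]^2} \cdot f'(x)(x_\nu - x) \right|}{|x_\nu - x|} = 0 . \]

Finally, since $x_1, x_2, \dots$ and $x \in I$ were otherwise arbitrary, the theorem follows.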

As a simple application of Theorem 2.4.8, we prove the differentiability of polynomial functions and calculate their derivatives.

Example 2.4.9. Let $n \in \mathbb{N}$ and $a_0, a_1, \dots, a_n$ be real numbers. Then the corresponding polynomial of $n$-th order $p : \mathbb{R} \to \mathbb{R}$, defined by

\[ p(x) := a_0 + a_1 x + \dots + a_n x^n \]

for all $x \in \mathbb{R}$, is differentiable and

\[ p'(x) = a_1 + \dots + n a_n x^{n-1} \]

for all $x \in \mathbb{R}$.

Proof. The proof is a simple consequence of Example 2.4.3 and Theorem 2.4.8.
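The formula of Example 2.4.9 turns differentiation of a polynomial into an operation on its list of coefficients. The following sketch assumes the polynomial is stored as the list $[a_0, a_1, \dots, a_n]$; the function names and the sample polynomial $1 - 3x + 2x^3$ are chosen here for illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x^n by Horner's scheme."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def poly_derivative(coeffs):
    """Coefficients a1, 2*a2, ..., n*an of the derivative polynomial."""
    return [k * a for k, a in enumerate(coeffs)][1:]

p = [1.0, -3.0, 0.0, 2.0]          # p(x) = 1 - 3x + 2x^3
dp = poly_derivative(p)            # p'(x) = -3 + 6x^2

print(dp, poly_eval(dp, 2.0))
```

A constant polynomial yields the empty coefficient list, which `poly_eval` evaluates to $0$, in agreement with Example 2.4.3.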

Theorem 2.4.10. (Chain rule) Let $f : I \to \mathbb{R}$, $g : J \to \mathbb{R}$ be differentiable functions defined on some open intervals $I, J$ of $\mathbb{R}$ and such that the domain of the composition $g \circ f$ is not empty. Then $g \circ f$ is differentiable with

\[ (g \circ f)'(x) = g'(f(x)) \cdot f'(x) \]

for all $x \in D(g \circ f)$.

Proof. For this let $x \in D(g \circ f)$ and $x_1, x_2, \dots$ be some sequence in $D(g \circ f) \setminus \{x\}$ which is convergent to $x$. Then:

\[ \frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} \leq \frac{|g(f(x_\nu)) - g(f(x)) - g'(f(x))(f(x_\nu) - f(x))|}{|x_\nu - x|} + \frac{|g'(f(x))(f(x_\nu) - f(x) - f'(x)(x_\nu - x))|}{|x_\nu - x|} \]

and hence

\[ \lim_{\nu \to \infty} \frac{|(g \circ f)(x_\nu) - (g \circ f)(x) - (g'(f(x)) \cdot f'(x))(x_\nu - x)|}{|x_\nu - x|} = 0 . \]

Finally, since $x_1, x_2, \dots$ and $x \in D(g \circ f)$ were otherwise arbitrary, the theorem follows.

A typical application of the chain rule is given in the following example. The cosine function is equal to the composition of the sine function and the translation $(\mathbb{R} \to \mathbb{R}, x \mapsto x + (\pi/2))$. Since both of these functions are differentiable, by Theorem 2.4.10, the same is true for their composition. In addition, by knowledge of the derivatives of these functions, the derivative of their composition, i.e., the cosine function, can be calculated by use of the same theorem. In preparation of the calculation of the derivative of the inverse tangent function, we also show the differentiability of the tangent function and calculate its derivative from the derivatives of the sine and the cosine with the help of Theorem 2.4.8.

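Example 2.4.11. The cosine and the tangent function are differentiable with

\[ \cos'(x) = - \sin(x) \]

for all $x \in \mathbb{R}$ and

\[ \tan'(x) = \frac{1}{\cos^2(x)} = 1 + \tan^2(x) \]

for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$.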

Proof. Since

\[ \cos(x) = \sin\left( x + \frac{\pi}{2} \right) \]

for all $x \in \mathbb{R}$, it follows by Examples 2.4.5, 2.4.3 and Theorem 2.4.8 (i.e., the ‘sum rule’) and Theorem 2.4.10 (i.e., the ‘chain rule’) that $\cos$ is differentiable with derivative

\[ \cos'(x) = \cos\left( x + \frac{\pi}{2} \right) = - \sin(x) \]

for all $x \in \mathbb{R}$. Further, because of

\[ \tan(x) = \frac{\sin(x)}{\cos(x)} \]

for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$, it follows by Example 2.4.5 and Theorem 2.4.8 (i.e., the ‘quotient rule’) that $\tan$ is differentiable with derivative

\[ \tan'(x) = \frac{\cos(x) \cdot \cos(x) - \sin(x) \cdot (-\sin(x))}{\cos^2(x)} = \frac{1}{\cos^2(x)} = 1 + \tan^2(x) \]

for all $x \in \mathbb{R} \setminus \left\{ \frac{\pi}{2} + k\pi : k \in \mathbb{Z} \right\}$.
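The derivative formulas of Example 2.4.11 can be compared with numerical estimates. The sketch below uses a symmetric (central) difference quotient, which is a numerical convenience and not the definition used in the text; the point $x = 0.7$ and the step $h = 10^{-6}$ are arbitrary choices.

```python
import math

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7                                    # illustrative point, cos(x) != 0
cos_numeric = central_difference(math.cos, x)
tan_numeric = central_difference(math.tan, x)

tan_formula_1 = 1 / math.cos(x) ** 2       # 1 / cos^2(x)
tan_formula_2 = 1 + math.tan(x) ** 2       # 1 + tan^2(x)

print(cos_numeric, -math.sin(x))
print(tan_numeric, tan_formula_1, tan_formula_2)
```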

Functions from applications frequently depend on several variables, i.e., are defined on subsets of $\mathbb{R}^n$ for some $n \in \mathbb{N}$ such that $n \geq 2$. For such functions, the concept of differentiation will be formulated in Calculus III. The calculation of the corresponding derivatives can be reduced to the calculation of derivatives of functions in one variable by help of the concept of partial derivatives. The latter was developed soon after that of differentiation because of applications. The historic view of the partial derivative was that of treating all variables of an analytic expression as constant, apart from one. In this way, one obtains an analytic expression in one variable that can be differentiated in the usual way. The result was called a partial derivative of the original expression. The modern definition of partial derivatives is very similar. To define the partial derivative of a function $f$ in several variables, we consider an auxiliary partial function which results from $f$ by restricting its domain to those points whose components are all given constants, apart from one of the components. The result is a function defined on a subset of $\mathbb{R}$. In general, this function depends on the above constants. The derivative of the auxiliary function in some point $p$ of its domain, provided it exists, is called the partial derivative of $f$ in the point whose components are the given constants apart from the remaining component which is given by $p$.

Definition 2.4.12. Let $f : U \to \mathbb{R}$ be a function of several variables where $U$ is a subset of $\mathbb{R}^n$, $n \in \mathbb{N} \setminus \{0, 1\}$. In particular, let $i \in \{1, \dots, n\}$, $x \in U$ be such that the corresponding function

\[ f(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n) \]

is differentiable at $x_i$. In this case, we say that $f$ is partially differentiable at $x$ in the $i$-th coordinate direction, and we define:

\[ \frac{\partial f}{\partial x_i}(x) := [f(x_1, \dots, x_{i-1}, \cdot, x_{i+1}, \dots, x_n)]'(x_i) . \]

If $f$ is partially differentiable in the $i$-th coordinate direction at every point of its domain, we call $f$ partially differentiable in the $i$-th coordinate direction and denote by $\partial f / \partial x_i$ the map which associates to every $x \in U$ the corresponding $(\partial f / \partial x_i)(x)$. Partial derivatives of $f$ of higher order are defined recursively. If $\partial f / \partial x_i$ is partially differentiable in the $j$-th coordinate direction, where $j \in \{1, \dots, n\}$, we denote the partial derivative of $\partial f / \partial x_i$ in the $j$-th coordinate direction by

\[ \frac{\partial^2 f}{\partial x_j \partial x_i} . \]

Such is called a partial derivative of $f$ of second order. In the case $j = i$, we set

\[ \frac{\partial^2 f}{\partial x_i^2} := \frac{\partial^2 f}{\partial x_i \partial x_i} . \]

Partial derivatives of $f$ of higher order than $2$ are defined accordingly.

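Example 2.4.13. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by

\[ f(x, y) := x^3 + x^2 y^3 - 2 y^2 \]

for all $x, y \in \mathbb{R}$. Find

\[ \frac{\partial f}{\partial x}(2, 1) \quad \text{and} \quad \frac{\partial f}{\partial y}(2, 1) . \]

Solution: We have

\[ f(x, 1) = x^3 + x^2 - 2 \quad \text{and} \quad f(2, y) = 8 + 4 y^3 - 2 y^2 \]

for all $x, y \in \mathbb{R}$. Hence it follows that

\[ \frac{\partial f}{\partial x}(x, 1) = 3 x^2 + 2 x , \quad x \in \mathbb{R} , \qquad \frac{\partial f}{\partial y}(2, y) = 12 y^2 - 4 y , \quad y \in \mathbb{R} , \]

and, finally, that

\[ \frac{\partial f}{\partial x}(2, 1) = 16 \quad \text{and} \quad \frac{\partial f}{\partial y}(2, 1) = 8 . \]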
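The values found in Example 2.4.13 can be cross-checked by freezing all variables but one and differentiating the resulting one-variable function numerically. The helper `partial` below is a hypothetical name introduced for this sketch; it uses a symmetric difference quotient with an arbitrarily chosen step.

```python
def f(x, y):
    return x ** 3 + x ** 2 * y ** 3 - 2 * y ** 2

def partial(func, point, i, h=1e-6):
    """Estimate the i-th partial derivative of func at point.

    All components except the i-th are held constant, as in Definition 2.4.12;
    the remaining one-variable function is differentiated numerically.
    """
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

df_dx = partial(f, (2.0, 1.0), 0)   # compare with the value 16 from the example
df_dy = partial(f, (2.0, 1.0), 1)   # compare with the value 8 from the example
print(df_dx, df_dy)
```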

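Example 2.4.14. Define $f : \mathbb{R}^3 \to \mathbb{R}$ by

\[ f(x, y, z) := x^2 y^3 z + 3x + 4y + 6z + 5 \]

for all $x, y, z \in \mathbb{R}$. Find

\[ \frac{\partial f}{\partial x}(x, y, z) , \quad \frac{\partial f}{\partial y}(x, y, z) \quad \text{and} \quad \frac{\partial f}{\partial z}(x, y, z) \]

for all $x, y, z \in \mathbb{R}$. Solution: Since in partial differentiation with respect to one variable all other variables are held constant, we conclude that

\[ \frac{\partial f}{\partial x}(x, y, z) = 2 x y^3 z + 3 , \quad \frac{\partial f}{\partial y}(x, y, z) = 3 x^2 y^2 z + 4 , \quad \frac{\partial f}{\partial z}(x, y, z) = x^2 y^3 + 6 , \]

for all $x, y, z \in \mathbb{R}$.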

Problems

1) By the basic definition of derivatives, calculate the derivative of the function $f$.

a) $f(x) := 1/x$, $x \in (0, \infty)$,
b) $f(x) := (x - 1)/(x + 1)$, $x \in \mathbb{R} \setminus \{-1\}$,
c) $f(x) := \sqrt{x}$, $x \in (0, \infty)$.

2) Calculate the slope of the tangent to $G(f)$ at the point $(1, f(1))$ and its intersection with the $x$-axis.

a) $f(x) := x^2 - 3x + 1$, $x \in \mathbb{R}$,
b) $f(x) := (3x - 2)/(4x + 5)$, $x \in \mathbb{R} \setminus \{-5/4\}$,
c) $f(x) := e^{3x}$, $x \in \mathbb{R}$.

3) Calculate the derivatives of the functions $f_1, \dots, f_8$ with maximal domains in $\mathbb{R}$ defined by

a) $f_1(x) := 5x^8 - 2x^5 + 6$, $f_2(\theta) := 3 \sin(\theta) + 4 \cos(\theta)$,
b) $f_3(t) := (1 + 3t^2 + 5t^4)(t^2 + 8)$, $f_4(x) := 3e^x [\sin(x) + 6 \cos(x)]$,
c) $f_5(t) := \dfrac{3t^4 - 2t + 5}{t^3 + 8}$, $f_6(\varphi) := \dfrac{5 \cos(\varphi)}{\tan(\varphi)}$,
d) $f_7(x) := \sin(3x^2)$, $f_8(t) := e^{4 \sin(7t)}$.

4) A differentiable function $f$ satisfies the given equation for all $x$ from its domain. Calculate the slope of the tangent to $G(f)$ in the specified point $P$ without solving the equations for $f(x)$.

a) $x^2 + (f(x))^2 = 1$, $P = (-1/\sqrt{2}, 1/\sqrt{2})$,
b) $(x - 1)^2 [x^2 + (f(x))^2] - 4x^2 = 0$, $P = (1 + \sqrt{2}, 1 + \sqrt{2})$,
c) $x + \sqrt{f(x)[2 - f(x)]} = \arccos(1 - f(x))$, $P = ((\pi/4) - (1/\sqrt{2}), 1 - (1/\sqrt{2}))$.

Remark: The curve in c) is a cycloid which is the trajectory of a point of a circle rolling along a straight line. The curve in b) is named after Nicomedes (3rd century B.C.), who used it to solve the problem of trisecting an angle.

5) Give a function $f : \mathbb{R} \to \mathbb{R}$ such that

a) $f'(t) = 1$ for all $t \in \mathbb{R}$ and such that $f(0) = 2$,
b) $f'(t) = -2f(t)$ for all $t \in \mathbb{R}$ and such that $f(0) = 1$,
c) $f'(t) = -2f(t) + 3$ for all $t \in \mathbb{R}$ and such that $f(0) = 1$.

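6) Let $I$ be a non-empty open interval in $\mathbb{R}$ and $p, q \in \mathbb{R}$. Further, let $f : I \to \mathbb{R}$ be twice differentiable. Show that

\[ f''(x) + p f'(x) + q f(x) = 0 \]

for all $x \in I$ if and only if

\[ \bar{f}''(x) = \left( \frac{p^2}{4} - q \right) \bar{f}(x) \]

for all $x \in I$ where $\bar{f} : I \to \mathbb{R}$ is defined by

\[ \bar{f}(x) := e^{px/2} f(x) \]

for all $x \in I$.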

7) Newton’s equation of motion for a point particle of mass $m \geq 0$ moving on a straight line is given by

\[ m f''(t) = F(f(t)) \tag{2.4.5} \]

for all $t$ from some time interval $I \subset \mathbb{R}$, where $f(t)$ is the position of the particle at time $t$, and $F(x)$ is the external force at the point $x$. For the specified force, give a solution function $f : \mathbb{R} \to \mathbb{R}$ of (2.4.5) that contains 2 free real parameters.

a) $F(x) = F_0$, $x \in \mathbb{R}$, where $F_0$ is some real parameter,
b) $F(x) = -kx$, $x \in \mathbb{R}$, where $k$ is some real parameter.

8) Newton’s equation of motion for a point particle of mass $m \geq 0$ moving on a straight line under the influence of a viscous friction is given by

\[ m f''(t) = -\lambda f'(t) \tag{2.4.6} \]

for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$, and $\lambda \in [0, \infty)$ is a parameter describing the strength of the friction. Give a solution function $f$ of (2.4.6) that contains 2 free real parameters.

9) For all $(x, y)$ from the domain, calculate the partial derivatives $(\partial f / \partial x)(x, y)$, $(\partial f / \partial y)(x, y)$ of the given function $f$.

a) $f(x, y) := x^4 - 2x^2 y^2 + 3x - 4y + 1$, $(x, y) \in \mathbb{R}^2$,
b) $f(x, y) := 3x^2 - 2x + 1$, $(x, y) \in \mathbb{R}^2$,
c) $f(x, y) := \sin(xy)$, $(x, y) \in \mathbb{R}^2$.

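10) Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be twice differentiable functions. Define $u(t, x) := f(x - t) + g(x + t)$ for all $(t, x) \in \mathbb{R}^2$. Calculate

\[ \frac{\partial u}{\partial t}(t, x) , \quad \frac{\partial u}{\partial x}(t, x) , \quad \frac{\partial^2 u}{\partial t^2}(t, x) , \quad \frac{\partial^2 u}{\partial x^2}(t, x) \]

for all $(t, x) \in \mathbb{R}^2$. Conclude that $u$ satisfies

\[ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 \]

which is called the wave equation in one space dimension (for a function $u$ which is to be determined).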

2.5 Applications of Differentiation

The applications of differentiation are manifold. We start with the application to the finding of maxima and minima of functions. For motivation, we consider a continuous function $f$ defined on a closed interval $[a, b]$ where $a, b \in \mathbb{R}$ are such that $a \leq b$. According to Theorem 2.3.33, $f$ assumes a maximum and minimum value, i.e., there are $x_M, x_m \in [a, b]$ such that

\[ f(x_M) \geq f(x) , \quad f(x_m) \leq f(x) \]

for all $x \in [a, b]$. The values $f(x_M)$, $f(x_m)$ are called the maximum and minimum value of $f$, respectively. These values are uniquely determined because if $\bar{x}_M, \bar{x}_m \in [a, b]$ are such that

\[ f(\bar{x}_M) \geq f(x) , \quad f(\bar{x}_m) \leq f(x) \]

for all $x \in [a, b]$, it follows by definition of $x_M, \bar{x}_M, x_m, \bar{x}_m$ that

\[ f(x_M) \geq f(\bar{x}_M) , \quad f(x_m) \leq f(\bar{x}_m) \]

as well as that

\[ f(\bar{x}_M) \geq f(x_M) , \quad f(\bar{x}_m) \leq f(x_m) \]

and hence that

\[ f(x_M) = f(\bar{x}_M) , \quad f(x_m) = f(\bar{x}_m) . \]

On the other hand, a function can assume its maximum value and/or its minimum value in more than one point. For instance, the function $([0, 4\pi] \to \mathbb{R}, x \mapsto 2 + \sin x)$ assumes its maximum value $3$ and its minimum value $1$ in the points $\pi/2$, $5\pi/2$ and $3\pi/2$, $7\pi/2$, respectively, see Fig. 39.

Fig. 39: Graph and segments of tangents of a function, $([0, 4\pi] \to \mathbb{R}, x \mapsto 2 + \sin x)$, that assumes both its maximum and its minimum value in several points of its domain. Note that the tangents in those points are horizontal corresponding to a vanishing derivative in those points.

After this digression, we continue with the discussion of the maximum and minimum values of $f$. Each of them can be assumed either at a boundary point $a$ or $b$ of the interval or in a point of the open interval $(a, b)$. In the latter case, if the function is differentiable on $(a, b)$, differentiation can be used to determine the position(s) where they are assumed. We remember that the function $A$ from Fermat’s example at the beginning of Section 2.4 assumed its maximum value in the midpoint of its domain and that his way of finding its position was equivalent to the demand of a vanishing derivative at the position of a maximum value. Indeed, this is also true for a minimum value. With precise definitions of limits and derivatives at hand, both follow from very simple observations. By definition of $x_M$, it follows that

\[ f(x) - f(x_M) \leq 0 \]

for all $x \in D(f)$. As a consequence, we conclude that

\[ \frac{f(x) - f(x_M)}{x - x_M} \leq 0 \]

if $b > x > x_M$ and

\[ \frac{f(x) - f(x_M)}{x - x_M} \geq 0 \]

if $a < x < x_M$. By choosing a sequence $x_1, x_2, \dots$ of elements of $(x_M, b)$, $(a, x_M)$ that converges to $x_M$ in Definition 2.4.1, it follows from this and Theorem 2.3.12 that $f'(x_M) \leq 0$ and $f'(x_M) \geq 0$, respectively, and hence that $f'(x_M) = 0$. Also, by definition of $x_m$, it follows that

\[ f(x) - f(x_m) \geq 0 \]

for all $x \in D(f)$. As a consequence, we conclude that

\[ \frac{f(x) - f(x_m)}{x - x_m} \geq 0 \]

if $b > x > x_m$ and

\[ \frac{f(x) - f(x_m)}{x - x_m} \leq 0 \]

if $a < x < x_m$. By choosing a sequence $x_1, x_2, \dots$ of elements of $(x_m, b)$, $(a, x_m)$ that converges to $x_m$ in Definition 2.4.1, it follows from this and Theorem 2.3.12 that $f'(x_m) \geq 0$ and $f'(x_m) \leq 0$, respectively, and hence that $f'(x_m) = 0$.

Hence in case that the restriction of $f$ to $(a, b)$ is differentiable, the standard procedure of finding the maximum and minimum values of $f$ proceeds by finding the zeros of the derivative of the restriction, subsequent calculation of the corresponding function values of $f$ in those zeros and comparison of the obtained values with the function values of $f$ at $a$ and $b$. The largest and the smallest of these function values are the maximum and minimum value of $f$, respectively.

Theorem 2.5.1. (Necessary condition for the existence of a local minimum/maximum) Let $f$ be a differentiable real-valued function on some open interval $I$ of $\mathbb{R}$. Further, let $f$ have a local minimum / maximum at some $x_0 \in I$, i.e., let

\[ f(x_0) \leq f(x) \quad / \quad f(x_0) \geq f(x) \]

for all $x$ such that $x_0 - \varepsilon < x < x_0 + \varepsilon$, for some $\varepsilon > 0$. Then

\[ f'(x_0) = 0 , \]

i.e., $x_0$ is a so called ‘critical point’ for $f$.



Proof. If $f$ has a local minimum/maximum at $x_0 \in I$, then it follows for sufficiently small $h \in \mathbb{R}^*$ that

\[ \frac{1}{h} [f(x_0 + h) - f(x_0)] \]

is $\geq (\leq)\ 0$ and $\leq (\geq)\ 0$, for $h > 0$ and $h < 0$, respectively. Therefore, it follows by Theorem 2.3.12 that $f'(x_0)$ is at the same time $\geq 0$ and $\leq 0$ and hence, finally, equal to $0$.

Example 2.5.2. Find the critical points of $f : \mathbb{R} \to \mathbb{R}$ defined by

\[ f(x) := x^4 + x^3 + 1 \]

for all $x \in \mathbb{R}$. Solution: The critical points of $f$ are the solutions of the equation

\[ 0 = f'(x) = 4x^3 + 3x^2 = x^2 (4x + 3) \]

and hence given by $x = 0$ and $x = -3/4$. See Fig. 40. Note that $f$ has a local extremum at $x = -3/4$, but not at $x = 0$. Hence the condition in Theorem 2.5.1 is necessary, but not sufficient for the existence of a local extremum.



Example 2.5.3. Find the maximum and minimum values of $f : [-\pi, \pi] \to \mathbb{R}$ defined by

\[ f(x) := x - 2 \cos(x) \]

for all $x \in [-\pi, \pi]$. Solution: Since $f$ is continuous, such values exist according to Theorem 2.3.33. Those points, where these values are assumed, can be either on the boundary of the domain, i.e., in the points $-\pi$ or $\pi$, where $f$ assumes the values $2 - \pi$ and $2 + \pi$, respectively, or inside the interval, i.e., in the open interval $(-\pi, \pi)$. In the latter case, according to Theorem 2.5.1 those are critical points of the restriction of $f$ to this interval. These are given by

\[ x = -\frac{\pi}{6} , \ -\frac{5\pi}{6} \]

since

\[ f'(x) = 1 + 2 \sin(x) \]

for all $x \in (-\pi, \pi)$. Now

\[ f\left( -\frac{\pi}{6} \right) = - \left( \frac{\pi}{6} + \sqrt{3} \right) , \quad f\left( -\frac{5\pi}{6} \right) = \sqrt{3} - \frac{5\pi}{6} \]

and hence the minimum value of $f$ is $-(\pi/6) - \sqrt{3}$ (assumed inside the interval) and its maximum value is $\pi + 2$ (assumed at the right boundary of the interval). See Fig. 41.
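The procedure used in Example 2.5.3 (compare the function values at the boundary points with those at the critical points) is mechanical enough to be written down as a short program. The sketch below takes the critical points from the example as given rather than computing them; the variable names are chosen here for illustration.

```python
import math

def f(x):
    return x - 2 * math.cos(x)

a, b = -math.pi, math.pi
# Zeros of f'(x) = 1 + 2 sin(x) in the open interval (a, b), taken from the example.
critical_points = [-math.pi / 6, -5 * math.pi / 6]

# Candidates: the two boundary points and all critical points.
candidates = [a, b] + critical_points
values = {x: f(x) for x in candidates}

x_min = min(values, key=values.get)
x_max = max(values, key=values.get)
print("minimum", values[x_min], "at", x_min)
print("maximum", values[x_max], "at", x_max)
```

The comparison step is exactly the one described before Theorem 2.5.1: a critical point need not be an extremum, so all candidate values have to be compared.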

The following is a theorem of Michel Rolle, published in 1691, which he used in his method of cascades devised to find intervals around zeros of polynomial functions that contain no other roots. In this connection, the subsequent theorem gives that the open interval $I$ that is contained in the domain of a continuous function and that has two consecutive roots of that function as end points, contains at least one zero of the derivative of the restriction of that function to $I$ if that restriction is differentiable.

Theorem 2.5.4. (Rolle’s theorem) Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and $f(a) = f(b)$. Then there is $c \in (a, b)$ such that $f'(c) = 0$.

Proof. Since $f$ is continuous, according to Theorem 2.3.33 $f$ assumes its minimum and maximum value in some points $x_0 \in [a, b]$ and $x_1 \in [a, b]$, respectively. Now if one of these points is contained in the open interval $(a, b)$, the derivative of $f$ in that point vanishes by Theorem 2.5.1. Otherwise, if both of those points are at the interval ends $a, b$ it follows that

\[ f(a) \leq f(x) \leq f(b) = f(a) \]

for all $x \in [a, b]$. Hence in this case, $f$ is a constant function, and it follows by Example 2.4.3 that $f'(c) = 0$ for every $c \in (a, b)$. Hence in both cases the statement of the theorem follows.

The following example provides a typical application of Rolle’s theorem.

Example 2.5.5. Show that $f : \mathbb{R} \to \mathbb{R}$ defined by

\[ f(x) := x^3 + x + 1 \]

for all $x \in \mathbb{R}$, has exactly one zero. (Compare Example 2.3.39.)

Fig. 42: Illustration of the statement of the mean value theorem 2.5.6.

Proof. $f$ is continuous and because of

\[ f(-1) = -1 < 0 \quad \text{and} \quad f(0) = 1 > 0 \]

and Corollary 2.3.38 has a zero $x_0$ in $(-1, 0)$. See Fig. 23. Further, $f$ is differentiable with

\[ f'(x) = 3x^2 + 1 > 0 \]

for all $x \in \mathbb{R}$. Now assume that there is another zero $x_1$. Then Theorem 2.5.4 implies the existence of a zero of $f'$ in the interval with endpoints $x_0$ and $x_1$, which contradicts $f'(x) > 0$ for all $x \in \mathbb{R}$. Hence $f$ has exactly one zero.
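The proof shows that the zero is unique and lies in $(-1, 0)$, but it does not compute it. A simple bisection, sketched below, approximates it; bisection is a technique brought in here for illustration and is not the method of the text, and the tolerance is an arbitrary choice.

```python
def f(x):
    return x ** 3 + x + 1

def bisect(func, lo, hi, tol=1e-12):
    """Approximate a zero of func in [lo, hi], assuming func(lo) < 0 < func(hi).

    The bracket is halved until it is shorter than tol.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = bisect(f, -1.0, 0.0)   # f(-1) = -1 < 0 and f(0) = 1 > 0, as in the proof
print(x0, f(x0))
```

Since $f$ is strictly increasing, the sign of $f$ at the midpoint decides unambiguously which half contains the zero.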

The mean value theorem is a simple generalization of Rolle’s theorem which will be frequently used in the following. Its use as a central theoretical tool in calculus / analysis was initiated by Cauchy. Its proof proceeds by construction of an appropriate auxiliary function which allows the application of Rolle’s theorem. For a simple geometrical interpretation of the statement of the mean value theorem, we consider a continuous function $f$ defined on a closed interval of $\mathbb{R}$ with left end point $a$ and right end point $b$, where $a < b$, which is differentiable on $(a, b)$. Then according to the theorem, there is a tangent to the graph of the restriction of $f$ to $(a, b)$ with slope identical to the slope of the line segment (‘secant’) from $(a, f(a))$ to $(b, f(b))$, see Fig. 42.

Theorem 2.5.6. (Mean value theorem) Let $f : [a, b] \to \mathbb{R}$ be a continuous function where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$. Then there is $c \in (a, b)$ such that

\[ \frac{f(b) - f(a)}{b - a} = f'(c) . \]

Proof. Define the auxiliary function $h : [a, b] \to \mathbb{R}$ by

\[ h(x) := f(x) - \frac{f(b) - f(a)}{b - a} \cdot (x - a) - f(a) \]

for all $x \in [a, b]$. Then $h$ is continuous as well as differentiable on $(a, b)$ with

\[ h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a} \]

for all $x \in (a, b)$ and $h(a) = h(b) = 0$. Hence by Theorem 2.5.4 there is $c \in (a, b)$ such that

\[ h'(c) = f'(c) - \frac{f(b) - f(a)}{b - a} = 0 . \]
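For a concrete function, a point $c$ as in the theorem can be located by searching for a zero of $h'$. The sketch below does this for $f(x) = x^3$ on $[0, 2]$, an example chosen here and not taken from the text. It relies on the additional assumption that $h'$ changes sign between the end points, which the theorem itself does not guarantee in general.

```python
import math

def mean_value_point(df, slope, lo, hi, tol=1e-12):
    """Bisection for a zero of h'(x) = df(x) - slope on [lo, hi].

    Assumes that h' has opposite signs at lo and hi (an extra assumption
    of this sketch, not part of the mean value theorem).
    """
    h_prime = lambda x: df(x) - slope
    if h_prime(lo) * h_prime(hi) > 0:
        raise ValueError("no sign change of h' at the end points")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h_prime(lo) * h_prime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

f = lambda x: x ** 3
df = lambda x: 3 * x ** 2
a, b = 0.0, 2.0

slope = (f(b) - f(a)) / (b - a)        # slope of the secant
c = mean_value_point(df, slope, a, b)  # solves 3 c^2 = slope
print(slope, c, df(c))
```

Here the equation $3c^2 = 4$ can also be solved by hand, giving $c = 2/\sqrt{3}$, which serves as a check on the numerical result.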

Intuitively, it should be expected that every function which is defined on an open interval of $\mathbb{R}$ and has a vanishing derivative is a constant function. Indeed, this can be seen as a first important consequence of the mean value theorem.

Theorem 2.5.7. Let $f : (a, b) \to \mathbb{R}$ be differentiable, where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f'(x) = 0$ for all $x \in (a, b)$. Then $f$ is a constant function.

Proof. The proof is indirect. Assume that $f$ is not a constant function. Then there are $x_1, x_2 \in (a, b)$ satisfying $x_1 < x_2$ and $f(x_1) \neq f(x_2)$. Hence Theorem 2.5.6 implies the existence of $c \in (x_1, x_2)$ such that

\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) = 0 \]

and hence that $f(x_1) = f(x_2)$, which contradicts the choice of $x_1, x_2$. Hence $f$ is a constant function.

Typically, the previous theorem is applied in proofs of uniqueness of solutions of differential equations and in the derivation of so called ‘conserved quantities’ of physical systems as in the subsequent examples.

Example 2.5.8. (A characterization of the exponential function) Let $a, b \in \mathbb{R}$ be such that $a < 0$ and $b > 0$. Find all solutions $f : (a, b) \to \mathbb{R}$ of the differential equation

\[ f'(x) = f(x) \]

for all $x \in (a, b)$ that satisfy $f(0) = 1$. Solution: We know that

\[ f(x) := \exp(x) \]

for every $x \in (a, b)$ satisfies all these demands. Indeed, it follows by help of the previous theorem, Theorem 2.5.7, that there is no other solution. This can be seen as follows. For this, let $f$ be some function that satisfies these requirements. Then we define the auxiliary function $h : (a, b) \to \mathbb{R}$ by

\[ h(x) := \exp(-x) f(x) \]

for all $x \in (a, b)$. As a consequence, $h$ is differentiable with a derivative $h'$ satisfying

\[ h'(x) = - \exp(-x) f(x) + \exp(-x) f'(x) = - \exp(-x) f(x) + \exp(-x) f(x) = 0 \]

for all $x \in (a, b)$. Hence it follows by Theorem 2.5.7 that $h$ is a constant function of value $h(0) = f(0) = 1$ which has the consequence that $f(x) = \exp(x)$ for all $x \in (a, b)$.

Example 2.5.9. (Energy conservation) Newton’s equation of motion for a point particle of mass $m \geq 0$ moving on a straight line is given by

\[ m f''(t) = F(f(t)) \tag{2.5.1} \]

for all $t$ from some non-empty open time interval $I \subset \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$ and $F(x)$ is the external force at the point $x$. Assume that $F = -V'$ where $V$ is a differentiable function defined on an open interval $J \supset \mathrm{Ran}(f)$. Show that $E : I \to \mathbb{R}$ defined by

\[ E(t) := \frac{m}{2} (f'(t))^2 + V(f(t)) \tag{2.5.2} \]

for all $t \in I$ is a constant function. Solution: It follows by Theorem 2.4.8, Theorem 2.4.10 and (2.5.1) that $E$ is differentiable with derivative

\[ E'(t) = m f'(t) f''(t) + V'(f(t)) \cdot f'(t) = f'(t) \, [m f''(t) - F(f(t))] = 0 \]

for all $t \in I$. Hence according to Theorem 2.5.7, $E$ is a constant function. In physics, its value is called the total energy of the particle. As a consequence, finding the solutions of (2.5.1), which is second order in the derivatives, is reduced to the solution of (2.5.2), which is only first order in the derivatives, for an assumed value of the total energy.
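As an illustration of the conserved quantity, consider the special case $m = 2$, $V(x) = x^2$, i.e., $F(x) = -2x$, for which $f(t) = A \cos(t) + B \sin(t)$ solves (2.5.1). The sketch below evaluates $E$ along such a solution; the amplitudes $A$, $B$ and the sample times are arbitrary choices made for this example.

```python
import math

m, A, B = 2.0, 0.3, -1.2                       # illustrative parameter values

V = lambda x: x ** 2                           # potential, F = -V' = -2x
f = lambda t: A * math.cos(t) + B * math.sin(t)        # solves m f'' = F(f)
df = lambda t: -A * math.sin(t) + B * math.cos(t)      # f'

def E(t):
    """Total energy (m/2) f'(t)^2 + V(f(t)) as in (2.5.2)."""
    return 0.5 * m * df(t) ** 2 + V(f(t))

energies = [E(0.1 * k) for k in range(100)]
spread = max(energies) - min(energies)
print(energies[0], spread)
```

Up to rounding errors, all sampled values agree with $A^2 + B^2$, the value of $E$ at $t = 0$.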

Utilizing the interpretation of the values of the derivative of a function as providing the slopes of tangents at its graph, it is to be expected that a differentiable function is increasing (decreasing) on intervals where its derivative assumes values that are $\geq 0$ ($\leq 0$). That this intuition is correct is displayed by the following theorem. Its statement can be regarded as another important consequence of the mean value theorem.

Theorem 2.5.10. Let $f : [a, b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a, b)$ and such that $f'(x) > 0$ ( $f'(x) \geq 0$ ) for every $x \in (a, b)$. Then $f$ is strictly increasing ( increasing ) on $[a, b]$, i.e.,

\[ f(x) < f(y) \quad ( \, f(x) \leq f(y) \, ) \]

for all $x, y \in [a, b]$ that satisfy $x < y$.

Proof. Let $x$ and $y$ be some elements of $[a, b]$ such that $x < y$. Then the restriction of $f$ to the interval $[x, y]$ satisfies the assumptions of Theorem 2.5.6, and hence there is $c \in (x, y)$ such that

\[ f(y) = f(x) + f'(c)(y - x) > f(x) \quad ( \, \geq f(x) \, ) . \]

Fig. 43: Graphs of exp and approximations. See Example 2.5.12.

Typically, the previous theorem is used in the derivation of lower and upper bounds for the values of functions or more generally in the comparison of functions and, in particular, in the proof of injectivity of functions. The subsequent examples provide such applications.

Example 2.5.11. Show that the exponential function $\exp : \mathbb{R} \to \mathbb{R}$ is strictly increasing. Solution: By Example 2.4.4 and Theorem 2.3.27 it follows that $\exp'(x) = \exp(x) > 0$ for all $x \in \mathbb{R}$. Hence it follows by Theorem 2.5.10 that $\exp$ is strictly increasing. Hence there is an inverse function to $\exp$ which is called the natural logarithm and is denoted by $\ln$. See Fig. 28.

Example 2.5.12. Show that

(i)
\[ e^x > 1 \tag{2.5.3} \]
for all $x \in (0, \infty)$.

(ii)
\[ e^x > x + 1 \tag{2.5.4} \]
for all $x \in (0, \infty)$.

(Compare Theorem 2.3.27.)

Proof. Define the continuous function $f : [0, \infty) \to \mathbb{R}$ by $f(x) := e^x - 1$ for all $x \in [0, \infty)$. Then $f$ is differentiable on $(0, \infty)$ with $f'(x) = e^x > 0$ for all $x \in (0, \infty)$. Hence $f$ is strictly increasing according to Theorem 2.5.10, and (2.5.3) follows since $f(0) = e^0 - 1 = 0$. Further, define the continuous function $g : [0, \infty) \to \mathbb{R}$ by $g(x) := e^x - 1 - x$ for all $x \in [0, \infty)$. Then $g$ is differentiable on $(0, \infty)$ with $g'(x) = e^x - 1 > 0$ for all $x \in (0, \infty)$ where (2.5.3) has been applied. Hence $g$ is strictly increasing according to Theorem 2.5.10, and (2.5.4) follows since $g(0) = e^0 - 1 - 0 = 0$.
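The two auxiliary functions of this proof can be tabulated to see the inequalities (2.5.3) and (2.5.4) at work. The following sketch samples them on $(0, 5]$; the sample range and step are arbitrary choices, and a finite sample only illustrates the statement and does not replace the proof.

```python
import math

xs = [0.001 * j for j in range(1, 5001)]        # sample points in (0, 5]
f_values = [math.exp(x) - 1 for x in xs]        # f(x) = e^x - 1
g_values = [math.exp(x) - 1 - x for x in xs]    # g(x) = e^x - 1 - x

# (2.5.3) and (2.5.4) on the sample: both functions are positive there.
both_positive = all(v > 0 for v in f_values) and all(v > 0 for v in g_values)

# g is strictly increasing on the sample, as Theorem 2.5.10 predicts.
g_increasing = all(u < v for u, v in zip(g_values, g_values[1:]))

print(both_positive, g_increasing)
```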

From Example 2.5.11 and (2.5.4), it follows by the intermediate value theorem, Theorem 2.3.37, that

\[ \exp( \, [0, \infty) \, ) = [1, \infty) \]

and hence by part (iii) of Theorem 2.3.27 that the range of $\exp$ is given by $(0, \infty)$ which therefore is also the domain of its inverse function $\ln$. As a consequence, $\exp$ is a strictly increasing bijective map from $\mathbb{R}$ onto $(0, \infty)$. See Fig. 28.


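Example 2.5.13. Show that

\[ \ln(a \cdot b) = \ln(a) + \ln(b) \]

for all $a, b > 0$. Solution: For $a, b > 0$, it follows by Theorem 2.3.27 that

\[ \ln(a \cdot b) = \ln\left( e^{\ln(a)} \cdot e^{\ln(b)} \right) = \ln\left( e^{\ln(a) + \ln(b)} \right) = \ln(a) + \ln(b) . \]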

In Example 2.5.9, we derived a conserved quantity for the solutions of a differential equation, a special case of Newton’s equation of motion. Ignoring the physical dimensions of the involved quantities in that example, in the special case that $m = 2$, $F(x) = -2x$ for all $x \in \mathbb{R}$, the function $E : I \to \mathbb{R}$, defined by

\[ E(t) = (f(t))^2 + (f'(t))^2 \]

for all $t \in I$ and a solution $f$ of the differential equation

\[ f''(t) + f(t) = 0 \]

for all $t \in I$, was found to be a constant function. The value of the corresponding constant is called the total energy that is associated to $f$. An important feature of that quantity is its positivity. In the subsequent theorem, we show that estimates on the growth of the same function $E$ defined for solutions of the related differential equation (2.5.5) can be used to show the uniqueness of the solutions of that differential equation. The key for this is the following lemma whose proof provides a further application of Theorem 2.5.10. Differential equations of the form (2.5.5) appear frequently in applications, for instance, in the description of the amplitudes of oscillations of damped harmonic oscillators in mechanics and in the description of the current as a function of time in simple electric circuits in electrodynamics.

Lemma 2.5.14. (An ‘energy’ inequality for solutions of a differential equation) Let $p, q \in \mathbb{R}$. Further, let $I$ be some open interval of $\mathbb{R}$, $x_0 \in I$ and $f : I \to \mathbb{R}$ satisfy the differential equation

$$f''(x) + p\,f'(x) + q\,f(x) = 0 \tag{2.5.5}$$

for all $x \in I$. Finally, define

$$E(x) := (f(x))^2 + (f'(x))^2$$

for all $x \in I$. Then for all $x \in I$

$$0 \leq E(x) \leq E(x_0)\, e^{k|x - x_0|}$$

where

$$k := 1 + 2|p| + |q| .$$

Proof. Since $f$ is twice differentiable, $E$ is differentiable such that

$$\begin{aligned}
E'(x) &= 2 f(x) f'(x) + 2 f'(x) f''(x) \\
&= 2 f(x) f'(x) - 2\,[\,p\,f'(x) + q\,f(x)\,]\,f'(x) \\
&= 2\,(1-q)\,f(x) f'(x) - 2p\,(f'(x))^2
\end{aligned}$$

for all $x \in I$. Hence $E'$ is continuous and satisfies

$$\begin{aligned}
|E'(x)| &\leq 2\,(1+|q|)\,|f'(x)|\,|f(x)| + 2\,|p|\,(f'(x))^2 \\
&\leq (1+|q|)\left[(f(x))^2 + (f'(x))^2\right] + 2\,|p|\,(f'(x))^2 \leq k\,E(x)
\end{aligned}$$

for all $x \in I$ where it has been used that

$$2\,|f'(x)|\,|f(x)| \leq (f(x))^2 + (f'(x))^2 .$$

As a consequence,

$$-k\,E(x) \leq E'(x) \leq k\,E(x)$$

for all $x \in I$. We continue analyzing the consequences of these inequalities. For this, we define auxiliary functions $E_r$, $E_l$ by

$$E_r(x) := e^{-kx} E(x) , \quad E_l(x) := e^{kx} E(x)$$

for all $x \in I$. Then

$$E_r'(x) = e^{-kx}\,(E'(x) - k\,E(x)) \leq 0 , \quad E_l'(x) = e^{kx}\,(E'(x) + k\,E(x)) \geq 0$$

for all $x \in I$. Hence $E_r$ is decreasing, which is equivalent to the increasing of $-E_r$, and $E_l$ is increasing. Hence it follows by Theorem 2.5.10 that

$$E(x) \leq E(x_0)\, e^{k(x - x_0)} = E(x_0)\, e^{k|x - x_0|}$$

for $x \geq x_0$ and that

$$E(x) \leq E(x_0)\, e^{k(x_0 - x)} = E(x_0)\, e^{k|x - x_0|}$$

for $x \leq x_0$.
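
The estimate of Lemma 2.5.14 can be checked numerically on one explicit solution. The following is a minimal sketch, not part of the original text: the values $p = 2$, $q = 5$, $x_0 = 0$ and the solution $f(x) = e^{-x}\cos(2x)$ are illustrative assumptions, as are all function names. A grid check of this kind is an illustration, not a proof.

```python
import math

# Illustrative choice (an assumption, not from the text): p = 2, q = 5, x0 = 0.
# f(x) = exp(-x) * cos(2x) solves f'' + 2 f' + 5 f = 0.
P, Q, X0 = 2.0, 5.0, 0.0
K = 1.0 + 2.0 * abs(P) + abs(Q)  # k := 1 + 2|p| + |q|


def f(x):
    return math.exp(-x) * math.cos(2.0 * x)


def f_prime(x):
    return -math.exp(-x) * (math.cos(2.0 * x) + 2.0 * math.sin(2.0 * x))


def energy(x):
    """E(x) := f(x)^2 + f'(x)^2 as in Lemma 2.5.14."""
    return f(x) ** 2 + f_prime(x) ** 2


def energy_bound(x):
    """Right-hand side E(x0) * exp(k |x - x0|) of the lemma."""
    return energy(X0) * math.exp(K * abs(x - X0))


def inequality_holds(a=-2.0, b=2.0, steps=400):
    """Check 0 <= E(x) <= E(x0) e^{k|x - x0|} on a uniform grid of [a, b]."""
    for i in range(steps + 1):
        x = a + (b - a) * i / steps
        # A tiny relative slack absorbs floating-point rounding at x = x0.
        if not (0.0 <= energy(x) <= energy_bound(x) * (1.0 + 1e-12)):
            return False
    return True


print(inequality_holds())
```

The bound is far from sharp for this example, which is consistent with its role in the text: it is only needed to conclude that $E$ vanishes identically when $E(x_0) = 0$.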

The unique dependence of the solutions of (2.5.6) on ‘initial data’, $f(x_0)$ and $f'(x_0)$ given at some $x_0 \in \mathbb{R}$, is a simple consequence of the preceding lemma.

Theorem 2.5.15. Let $p, q \in \mathbb{R}$. Further, let $I$ be some open interval of $\mathbb{R}$, $x_0 \in I$ and $y_0, y_0' \in \mathbb{R}$. Then there is at most one function $f : I \to \mathbb{R}$ such that

$$f''(x) + p\,f'(x) + q\,f(x) = 0 \tag{2.5.6}$$

for all $x \in I$ and at the same time such that

$$f(x_0) = y_0 , \quad f'(x_0) = y_0' .$$

Proof. For this, let $f, \bar{f} : I \to \mathbb{R}$ be such that

$$f''(x) + p\,f'(x) + q\,f(x) = \bar{f}''(x) + p\,\bar{f}'(x) + q\,\bar{f}(x) = 0$$

for all $x \in I$ and

$$f(x_0) = \bar{f}(x_0) = y_0 , \quad f'(x_0) = \bar{f}'(x_0) = y_0' .$$

Then $u := f - \bar{f}$ satisfies

$$u''(x) + p\,u'(x) + q\,u(x) = 0$$

for all $x \in I$ and

$$u(x_0) = u'(x_0) = 0 .$$

Hence the function $E$ from Lemma 2.5.14 that corresponds to $u$ satisfies $E(x_0) = 0$, and it follows by that lemma that $0 \leq (u(x))^2 + (u'(x))^2 \leq 0$, i.e., that $u(x) = 0$ for all $x \in I$, and hence that $f = \bar{f}$.

Of course, of main interest for applications are the solutions of (2.5.6). These are obtained by reducing the solution of this equation to the solution of the special cases corresponding to $p = 0$. The solutions of the last are obvious. Their representation is simplified by use of hyperbolic functions which are introduced next.

Definition 2.5.16. We define the hyperbolic sine function $\sinh$, the hyperbolic cosine function $\cosh$ and the hyperbolic tangent function $\tanh$ by

$$\sinh(x) := \frac{1}{2}\left(e^x - e^{-x}\right) , \quad \cosh(x) := \frac{1}{2}\left(e^x + e^{-x}\right) , \quad \tanh(x) := \frac{\sinh(x)}{\cosh(x)} ,$$

for all $x \in \mathbb{R}$.

Fig. 44: Graphs of the hyperbolic sine and cosine function.

Fig. 45: Graphs of the hyperbolic tangent function and asymptotes given by the graphs of the constant functions on $\mathbb{R}$ of values $1$ and $-1$.

Obviously, $\sinh$, $\tanh$ are antisymmetric and $\cosh$ is symmetric, i.e.,

$$\sinh(-x) = -\sinh(x) , \quad \cosh(-x) = \cosh(x) , \quad \tanh(-x) = -\tanh(x)$$

for all $x \in \mathbb{R}$. Also these functions are differentiable and, in particular,

$$\sinh' = \cosh , \quad \cosh' = \sinh$$

similarly to the sine and cosine functions. Another resemblance to these functions is the relation

$$\begin{aligned}
\cosh^2(x) - \sinh^2(x) &= \frac{1}{4}\left[\left(e^x + e^{-x}\right)^2 - \left(e^x - e^{-x}\right)^2\right] \\
&= \frac{1}{4}\left[\left(e^x + e^{-x} - e^x + e^{-x}\right)\left(e^x + e^{-x} + e^x - e^{-x}\right)\right] \\
&= \frac{1}{4} \cdot 2\,e^{-x} \cdot 2\,e^{x} = 1
\end{aligned}$$

for all $x \in \mathbb{R}$. In particular, this implies that

$$\tanh'(x) = \frac{\cosh^2(x) - \sinh^2(x)}{\cosh^2(x)} = 1 - \tanh^2(x) = \frac{1}{\cosh^2(x)}$$

for all $x \in \mathbb{R}$.

The solution of (2.5.6) corresponding to ‘initial data’, $f(x_0)$ and $f'(x_0)$ given at some $x_0 \in \mathbb{R}$, is obtained in the proof of the following theorem by considering a function that is related to $f$. As a consequence of (2.5.6), that function is a solution of a differential equation of the form (2.5.6) with $p = 0$. The solutions of these special equations are obvious.

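Theorem 2.5.17. Let $p, q \in \mathbb{R}$, $D := (p^2/4) - q$ and $x_0, y_0, y_0' \in \mathbb{R}$. Then the unique solution to

$$f''(x) + p\,f'(x) + q\,f(x) = 0$$

satisfying $f(x_0) = y_0$ and $f'(x_0) = y_0'$ is given by

$$f(x) = e^{-p(x - x_0)/2}\left[\, y_0 \cosh(D^{1/2}(x - x_0)) + D^{-1/2}\left(\frac{p\,y_0}{2} + y_0'\right)\sinh(D^{1/2}(x - x_0)) \right]$$

for $x \in \mathbb{R}$ if $D > 0$,

$$f(x) = e^{-p(x - x_0)/2}\left[\, y_0 + \left(\frac{p\,y_0}{2} + y_0'\right)(x - x_0) \right]$$

for $x \in \mathbb{R}$ if $D = 0$ and

$$f(x) = e^{-p(x - x_0)/2}\left[\, y_0 \cos(|D|^{1/2}(x - x_0)) + |D|^{-1/2}\left(\frac{p\,y_0}{2} + y_0'\right)\sin(|D|^{1/2}(x - x_0)) \right]$$

for $x \in \mathbb{R}$ if $D < 0$.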

Proof. For this, we first notice that a function $h : \mathbb{R} \to \mathbb{R}$ satisfies

$$h''(x) + p\,h'(x) + q\,h(x) = 0 \tag{2.5.7}$$

for all $x \in \mathbb{R}$ if and only if

$$\bar{h}''(x) + \left(q - \frac{p^2}{4}\right)\bar{h}(x) = 0 \tag{2.5.8}$$

for all $x \in \mathbb{R}$ where $\bar{h} : \mathbb{R} \to \mathbb{R}$ is defined by

$$\bar{h}(x) := e^{px/2}\,h(x) \tag{2.5.9}$$

for all $x \in \mathbb{R}$. Indeed, it follows by Theorem 2.4.8 that $\bar{h}$ is twice differentiable if and only if $h$ is twice differentiable and in this case that

$$\bar{h}'(x) = e^{px/2}\left(h'(x) + \frac{p}{2}\,h(x)\right) , \quad \bar{h}''(x) = e^{px/2}\left(h''(x) + p\,h'(x) + \frac{p^2}{4}\,h(x)\right)$$

for all $x \in \mathbb{R}$. The last implies that

$$\begin{aligned}
\bar{h}''(x) + \left(q - \frac{p^2}{4}\right)\bar{h}(x) &= e^{px/2}\left(h''(x) + p\,h'(x) + \frac{p^2}{4}\,h(x)\right) + e^{px/2}\left(q - \frac{p^2}{4}\right)h(x) \\
&= e^{px/2}\left(h''(x) + p\,h'(x) + q\,h(x)\right) = 0
\end{aligned}$$

for all $x \in \mathbb{R}$ if and only if (2.5.7) is satisfied for all $x \in \mathbb{R}$. In addition, $h(x_0) = y_0$ and $h'(x_0) = y_0'$ if and only if

$$\bar{h}(x_0) = y_0\,e^{p x_0/2} , \quad \bar{h}'(x_0) = \left(\frac{p\,y_0}{2} + y_0'\right)e^{p x_0/2} . \tag{2.5.10}$$

For the solution of (2.5.8) and (2.5.10), we consider three cases. If $D := (p^2/4) - q > 0$, then a solution to (2.5.8) and (2.5.10) is given by

$$\bar{h}(x) = y_0\,e^{p x_0/2}\cosh(D^{1/2}(x - x_0)) + D^{-1/2}\left(\frac{p\,y_0}{2} + y_0'\right)e^{p x_0/2}\sinh(D^{1/2}(x - x_0))$$

for $x \in \mathbb{R}$. If $D = 0$, then a solution to (2.5.8) and (2.5.10) is given by

$$\bar{h}(x) = y_0\,e^{p x_0/2} + \left(\frac{p\,y_0}{2} + y_0'\right)e^{p x_0/2}\,(x - x_0)$$

for $x \in \mathbb{R}$. If $D < 0$, then a solution to (2.5.8) and (2.5.10) is given by

$$\bar{h}(x) = y_0\,e^{p x_0/2}\cos(|D|^{1/2}(x - x_0)) + |D|^{-1/2}\left(\frac{p\,y_0}{2} + y_0'\right)e^{p x_0/2}\sin(|D|^{1/2}(x - x_0))$$

for $x \in \mathbb{R}$. Hence, finally, the statement of this theorem follows by (2.5.9) and by Theorem 2.5.15.
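
The three cases of Theorem 2.5.17 translate directly into a small routine. The following is a sketch under stated assumptions: the names `solve_ivp_formula` and `residual` are hypothetical, the step size `h` is an arbitrary choice, and the finite-difference residual is only a plausibility check of the formulas, not a proof.

```python
import math


def solve_ivp_formula(p, q, x0, y0, y0_prime):
    """Return f with f'' + p f' + q f = 0, f(x0) = y0, f'(x0) = y0_prime.

    Uses the three cases D > 0, D = 0, D < 0 with D := p^2/4 - q.
    """
    d = p * p / 4.0 - q
    c = p * y0 / 2.0 + y0_prime  # the recurring factor (p y0 / 2 + y0')

    def f(x):
        t = x - x0
        damping = math.exp(-p * t / 2.0)
        if d > 0:
            w = math.sqrt(d)
            return damping * (y0 * math.cosh(w * t) + c / w * math.sinh(w * t))
        if d == 0:
            return damping * (y0 + c * t)
        w = math.sqrt(-d)
        return damping * (y0 * math.cos(w * t) + c / w * math.sin(w * t))

    return f


def residual(f, p, q, x, h=1e-4):
    """Central-difference estimate of f'' + p f' + q f at x."""
    d1 = (f(x + h) - f(x - h)) / (2.0 * h)
    d2 = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return d2 + p * d1 + q * f(x)


# One sample per case: D > 0, D = 0, D < 0 (illustrative parameter values).
for p, q in ((3.0, 2.0), (2.0, 1.0), (2.0, 5.0)):
    f = solve_ivp_formula(p, q, 0.5, 1.5, -0.7)
    print(p, q, f(0.5), residual(f, p, q, 1.0))
```

Exact floating-point comparison `d == 0` is acceptable here only because the sample parameters are chosen so that $D$ is computed exactly; for general input a tolerance would be the safer design.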

According to Theorem 2.3.44, the inverse of a strictly increasing continuous function $f$ defined on a closed interval $[a,b]$ of $\mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, is continuous, too. If the restriction of $f$ to $(a,b)$ is in addition differentiable with strictly positive derivative, then the restriction of $f^{-1}$ to $(f(a), f(b))$ is also differentiable. Moreover, the following theorem gives an often used representation of the derivative of the last in terms of the derivative of $f$.

Theorem 2.5.18. (Derivatives of inverse functions) Let $f : [a,b] \to \mathbb{R}$ be continuous where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f$ be differentiable on $(a,b)$ and such that $f'(x) > 0$ for every $x \in (a,b)$. Then the inverse function $f^{-1}$ is defined on $[f(a), f(b)]$ as well as differentiable on $(f(a), f(b))$ with

$$\left(f^{-1}\right)'(y) = \frac{1}{f'(f^{-1}(y))} \tag{2.5.11}$$

for all $y \in (f(a), f(b))$.

Proof. By Theorem 2.5.10, it follows that $f$ is strictly increasing and hence that there is an inverse function $f^{-1}$ for $f$. Further, by Theorem 2.3.44 $f^{-1}$ is continuous, and by Theorem 2.3.43 it follows that $f([a,b]) = [f(a), f(b)]$ and hence that $f^{-1}$ is defined on $[f(a), f(b)]$. Now let $y \in (f(a), f(b))$ and $y_1, y_2, \dots$ be a sequence in $(f(a), f(b)) \setminus \{y\}$ which is convergent to $y$. Then $f^{-1}(y_1), f^{-1}(y_2), \dots$ is a sequence in $(a,b) \setminus \{f^{-1}(y)\}$ which, by the continuity of $f^{-1}$, converges to $f^{-1}(y)$. Hence it follows for $n \in \mathbb{N}^*$ that

$$\frac{f^{-1}(y_n) - f^{-1}(y)}{y_n - y} = \left[\frac{f(f^{-1}(y_n)) - f(f^{-1}(y))}{f^{-1}(y_n) - f^{-1}(y)}\right]^{-1}$$

and hence, by the differentiability of $f$ in $f^{-1}(y)$, by $f'(f^{-1}(y)) > 0$ and by Theorem 2.3.4, the statement (2.5.11) follows.

The following examples give two applications of the previous theorem. The second example is from the field of General Relativity.

Example 2.5.19. Calculate the derivative of $\ln$, $\arcsin$, $\arccos$ and $\arctan$. Solution: By Theorem 2.5.18, it follows that

$$\ln'(x) = \frac{1}{\exp'(\ln(x))} = \frac{1}{\exp(\ln(x))} = \frac{1}{x}$$

for every $x \in (0,\infty)$,

$$\arcsin'(x) = \frac{1}{\sin'(\arcsin(x))} = \frac{1}{\cos(\arcsin(x))} = \frac{1}{\sqrt{1 - \sin^2(\arcsin(x))}} = \frac{1}{\sqrt{1 - x^2}} ,$$

$$\arccos'(x) = \frac{1}{\cos'(\arccos(x))} = -\frac{1}{\sin(\arccos(x))} = -\frac{1}{\sqrt{1 - \cos^2(\arccos(x))}} = -\frac{1}{\sqrt{1 - x^2}}$$

for all $x \in (-1, 1)$ and

$$\arctan'(x) = \frac{1}{\tan'(\arctan(x))} = \frac{1}{(1 + \tan^2)(\arctan(x))} = \frac{1}{1 + x^2}$$

for every $x \in \mathbb{R}$.

Fig. 46: Graph of the auxiliary function $h$ from Example 2.5.20.
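
The four derivatives can be cross-checked numerically. The sketch below is an illustration only: the helper names, the sample points and the step size are assumptions made for this check. It evaluates formula (2.5.11) directly and compares it with the closed forms above and with a central difference of the inverse function.

```python
import math


def inverse_derivative(f_prime, f_inverse, y):
    """Evaluate (f^{-1})'(y) = 1 / f'(f^{-1}(y)), formula (2.5.11)."""
    return 1.0 / f_prime(f_inverse(y))


def central_difference(g, y, h=1e-6):
    """Numerical derivative of g at y, used only as an independent check."""
    return (g(y + h) - g(y - h)) / (2.0 * h)


# (name, f', f^{-1}, closed form from Example 2.5.19, sample point)
CASES = [
    ("ln", math.exp, math.log, lambda x: 1.0 / x, 2.5),
    ("arcsin", math.cos, math.asin,
     lambda x: 1.0 / math.sqrt(1.0 - x * x), 0.3),
    ("arccos", lambda t: -math.sin(t), math.acos,
     lambda x: -1.0 / math.sqrt(1.0 - x * x), 0.3),
    ("arctan", lambda t: 1.0 + math.tan(t) ** 2, math.atan,
     lambda x: 1.0 / (1.0 + x * x), 1.7),
]

for name, f_prime, f_inverse, closed_form, y in CASES:
    by_theorem = inverse_derivative(f_prime, f_inverse, y)
    numeric = central_difference(f_inverse, y)
    print(f"{name}: theorem={by_theorem:.8f} "
          f"closed={closed_form(y):.8f} numeric={numeric:.8f}")
```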

Example 2.5.20. In terms of Kruskal coordinates, the radial coordinate projection $r : \Omega \to (0,\infty)$ of the Schwarzschild solution of Einstein’s field equation is given by

$$r(u,v) = h^{-1}(u^2 - v^2)$$

for all $(v,u) \in \Omega$ where $h : (0,\infty) \to (-1,\infty)$ is defined by

$$h(x) := \left(\frac{x}{2M} - 1\right) e^{x/(2M)}$$

for all $x \in (0,\infty)$. Here

$$\Omega := \{(v,u) \in \mathbb{R}^2 : u^2 - v^2 > -1\} ,$$

and $M > 0$ is the mass of the black hole. In addition, geometrical units are used where the speed of light and the gravitational constant have the value $1$. Finally, $h$ is bijective and $h^{-1}$ is differentiable. Calculate

$$\frac{\partial r}{\partial v} , \quad \frac{\partial r}{\partial u}$$

for all $(v,u) \in \Omega$. Solution: For this, let $(v,u) \in \Omega$. In a first step, we conclude by Theorem 2.5.18 that

$$\frac{\partial r}{\partial v}(u,v) = -2v \cdot (h^{-1})'(u^2 - v^2) = -2v \cdot [\,h'(h^{-1}(u^2 - v^2))\,]^{-1} = -2v \cdot [\,h'(r(u,v))\,]^{-1} ,$$

$$\frac{\partial r}{\partial u}(u,v) = 2u \cdot (h^{-1})'(u^2 - v^2) = 2u \cdot [\,h'(h^{-1}(u^2 - v^2))\,]^{-1} = 2u \cdot [\,h'(r(u,v))\,]^{-1} .$$

Since

$$h'(x) = \frac{1}{2M}\,e^{x/(2M)} + \frac{1}{2M}\left(\frac{x}{2M} - 1\right)e^{x/(2M)} = \frac{x}{4M^2} \cdot e^{x/(2M)}$$

for every $x > 0$, this implies that

$$\frac{\partial r}{\partial v}(u,v) = -8M^2 v \left(\frac{e^{-r/(2M)}}{r}\right)(u,v) , \quad \frac{\partial r}{\partial u}(u,v) = 8M^2 u \left(\frac{e^{-r/(2M)}}{r}\right)(u,v) .$$

Fig. 47: Graphs of power functions corresponding to positive ($\geq 0$) and negative ($\leq 0$) $a$, respectively. See Definition 2.5.21.

The following defines general powers of strictly positive ($> 0$) real numbers in terms of the exponential function and its inverse, the natural logarithm function.

Definition 2.5.21. (General powers) For every $a \in \mathbb{R}$, we define the corresponding power function by

$$x^a := e^{a \cdot \ln x}$$

for all $x > 0$.

By Theorem 2.4.10, the power function $((0,\infty) \to \mathbb{R}, x \mapsto x^a)$ is differentiable with derivative

$$\frac{a}{x} \cdot e^{a \cdot \ln x} = \frac{a}{x} \cdot e^{\ln x + (a-1) \cdot \ln x} = \frac{a}{x} \cdot e^{\ln x} \cdot e^{(a-1) \cdot \ln x} = a \cdot x^{a-1}$$

in $x > 0$. Also the following calculational rules are simple consequences of the definition of general powers and basic properties of the exponential function and its inverse.

Example 2.5.22. Show that


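$$x^0 = 1 , \quad x^a y^a = (xy)^a , \quad x^a x^b = x^{a+b} , \quad (x^a)^b = x^{ab}$$

for all $x, y > 0$ and $a, b \in \mathbb{R}$. Solution: By Definition 2.5.21, it follows for such $x$, $y$, $a$ and $b$ that

$$x^0 = e^{0 \cdot \ln x} = e^0 = 1 ,$$

$$x^a y^a = e^{a \cdot \ln x}\,e^{a \cdot \ln y} = e^{a \cdot \ln x + a \cdot \ln y} = e^{a \cdot (\ln x + \ln y)} = e^{a \ln(xy)} = (xy)^a ,$$

$$x^a x^b = e^{a \cdot \ln x}\,e^{b \cdot \ln x} = e^{a \cdot \ln x + b \cdot \ln x} = e^{(a+b) \cdot \ln x} = x^{a+b} ,$$

$$(x^a)^b = e^{b \cdot \ln x^a} = e^{b \cdot \ln e^{a \cdot \ln x}} = e^{b\,a \ln x} = e^{a\,b \ln x} = x^{ab} .$$

Fig. 48: Graphs of $\ln$ and polynomial approximations corresponding to $a = 1/2$, $1$ and $2$. See Example 2.5.23.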

The following derives frequently used polynomial approximations of the natural logarithm function as a further example for the application of Theorem 2.5.10. A verbalization of the estimate (2.5.12) is that the natural logarithm ‘$\ln(x)$ is growing more slowly than any positive power of $x$ for large $x$’.

Example 2.5.23. Show that for every $a > 0$

$$\ln(x) < \frac{1}{a}\,(x^a - 1) \tag{2.5.12}$$

for all $x > 1$. (See Exercise 2.3.2 for an application of the case $a = 1/2$.) Solution: Define the continuous function $f : [1,\infty) \to \mathbb{R}$ by

$$f(x) := \frac{1}{a}\,(x^a - 1) - \ln(x)$$

for all $x \geq 1$. Then $f$ is differentiable on $(1,\infty)$ with

$$f'(x) = \frac{1}{a} \cdot \frac{a}{x} \cdot e^{a \cdot \ln x} - \frac{1}{x} = \frac{1}{x}\left(e^{a \cdot \ln x} - 1\right) > 0$$

for $x > 1$ and $f(1) = 0$. Hence (2.5.12) follows by Theorem 2.5.10.
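
The estimate (2.5.12) is easy to sample numerically. The following sketch is illustrative only: the function names, the exponents $a \in \{1/2, 1, 2\}$ (those named in the caption of Fig. 48) and the sample points are assumptions made for this check.

```python
import math


def log_upper_bound(x, a):
    """Right-hand side (x^a - 1) / a of (2.5.12), with x^a := exp(a ln x)."""
    return (math.exp(a * math.log(x)) - 1.0) / a


def bound_holds(a, xs):
    """True if ln(x) < (x^a - 1)/a at every sample point x > 1."""
    return all(math.log(x) < log_upper_bound(x, a) for x in xs)


SAMPLES = [1.5, 2.0, 10.0, 1e3, 1e6]
for a in (0.5, 1.0, 2.0):
    print(a, bound_holds(a, SAMPLES))
```

Smaller values of $a$ give a tighter bound for large $x$, which matches the verbal statement that $\ln$ grows more slowly than any positive power.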

Another important consequence of Theorem 2.5.6 is given by Taylor’s theorem which is frequently employed in applications. For its formulation, we need to introduce some additional terminology.

Definition 2.5.24. If $m, n \in \mathbb{N}$ are such that $m \leq n$ and $a_m, \dots, a_n \in \mathbb{R}$, we define

$$\sum_{k=m}^{n} a_k := a_m + a_{m+1} + \dots + a_n .$$

Note that, as a consequence of the associative law for addition, it is not necessary to indicate the order in which the summation is to be performed. Further, obviously,

$$\sum_{k=m}^{n} (a_k + b_k) = \left(\sum_{k=m}^{n} a_k\right) + \left(\sum_{k=m}^{n} b_k\right)$$

and

$$\lambda \sum_{k=m}^{n} a_k = \sum_{k=m}^{n} \lambda\,a_k$$

for every $\lambda \in \mathbb{R}$ and $b_m, \dots, b_n \in \mathbb{R}$. In addition, we define for every $n \in \mathbb{N}$ the corresponding factorial $n!$ recursively by

$$0! := 1 , \quad (k+1)! := (k+1)\,k!$$

for every $k \in \mathbb{N}$. Hence in particular, $1! = 1$, $2! = 2$, $3! = 6$, $4! = 24$, $5! = 120$ and so forth.

For the motivation of Taylor’s theorem, we consider a twice continuously differentiable function $f$ defined on an open subinterval $(a,b)$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $x_0, x \in (a,b)$. According to the mean value theorem, there is $\xi$ in the open interval between $x_0$ and $x$ such that

$$\frac{f(x) - f(x_0)}{x - x_0} = f'(\xi) .$$

This implies that

$$f(x) = f(x_0) + (x - x_0)\,f'(\xi) .$$

Further, by the same reasoning, it follows the existence of $\zeta$ in the open interval between $x_0$ and $\xi$ such that

$$f'(\xi) = f'(x_0) + (\xi - x_0)\,f''(\zeta) .$$

Hence we conclude that

$$\begin{aligned}
f(x) &= f(x_0) + (x - x_0)\,f'(\xi) \\
&= f(x_0) + (x - x_0)\,[\,f'(x_0) + (\xi - x_0)\,f''(\zeta)\,] \\
&= f(x_0) + (x - x_0)\,f'(x_0) + (x - x_0)(\xi - x_0)\,f''(\zeta)
\end{aligned}$$

and

$$|f(x) - f(x_0) - (x - x_0)\,f'(x_0)| \leq |x - x_0|^2 \cdot |f''(\zeta)| . \tag{2.5.13}$$

Since $f''$ is continuous, we conclude that for every arbitrary preassigned error bound $\varepsilon > 0$ there is an interval $I$ around $x_0$ such that

$$|f(x) - f(x_0) - (x - x_0)\,f'(x_0)| \leq \varepsilon$$

for every $x \in I$. Hence the restriction of $f$ to $I$ can be approximated within an error $\varepsilon$ by the restriction of the linear polynomial function

$$p_1(x) := f(x_0) + (x - x_0)\,f'(x_0)$$

for all $x \in \mathbb{R}$ to $I$. This polynomial is called the linearization of $f$ around the point $x_0$. Note that

$$p_1(x_0) = f(x_0) , \quad p_1'(x_0) = f'(x_0) .$$

Therefore, $p_1$ is the uniquely determined linear, i.e. of order $\leq 1$, polynomial that assumes the value $f(x_0)$ in $x_0$ and whose derivative assumes the value $f'(x_0)$ in $x_0$. In particular, its graph coincides with the tangent to the graph of $f$ in $x_0$. In applications, functions are frequently replaced by their linearizations around appropriate points to simplify subsequent reasoning. Often, this is done without performing an error estimate like (2.5.13) in the hope that the error introduced by the replacement is in some sense ‘small’.

If $f$ is sufficiently often differentiable, it is to be expected that $f$ can be described with higher precision near $x_0$ by polynomials of higher order than $1$. Indeed, this is true and Taylor’s theorem provides such so-called Taylor polynomials $p_n$ for $n \in \mathbb{N}^*$ with $n > 1$. It is tempting to speculate that $p_n$ is the uniquely determined polynomial of order $\leq n$ such that

$$p_n(x_0) = f(x_0) , \quad p_n^{(k)}(x_0) = f^{(k)}(x_0)$$

for $k = 1, \dots, n$. In that case, $p_n$ is easily determined to be of the form

$$p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k$$

for all $x \in \mathbb{R}$ where we set $f^{(0)} := f$ and $f$ is assumed to be $(n+1)$-times continuously differentiable. Indeed, this speculation turns out to be correct.

We first give Taylor’s theorem in a form which resembles that of the mean value theorem. Its proof proceeds by application of the last to a skillfully constructed auxiliary function.

Theorem 2.5.25. (Taylor’s theorem) Let $n \in \mathbb{N}^*$, $I$ be a non-trivial open interval and $f : I \to \mathbb{R}$ be $n$-times differentiable. Finally, let $a$ and $b$ be two different elements from $I$. Then there is $c$ in the open interval between $a$ and $b$ such that

$$f(b) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\,(b - a)^k + \frac{f^{(n)}(c)}{n!}\,(b - a)^n \tag{2.5.14}$$

where $f^{(0)} := f$ and $(b - a)^0 := 1$.

Proof. Define the auxiliary function $g : I \to \mathbb{R}$ by

$$g(x) := f(b) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x)}{k!}\,(b - x)^k$$

for all $x \in I$. Then it follows that $g(b) = 0$ and moreover that $g$ is differentiable with

$$g'(x) = -\sum_{k=0}^{n-1} \frac{f^{(k+1)}(x)}{k!}\,(b - x)^k + \sum_{k=1}^{n-1} \frac{f^{(k)}(x)}{(k-1)!}\,(b - x)^{k-1} = -\frac{f^{(n)}(x)}{(n-1)!}\,(b - x)^{n-1}$$

for all $x \in I$. Define a further auxiliary function $h : I \to \mathbb{R}$ by

$$h(x) := g(x) - \left(\frac{b - x}{b - a}\right)^n g(a)$$

for all $x \in I$. Then it follows that $h(a) = h(b) = 0$ and that $h$ is differentiable with

$$h'(x) = -\frac{f^{(n)}(x)}{(n-1)!}\,(b - x)^{n-1} + n\,\frac{(b - x)^{n-1}}{(b - a)^n}\,g(a)$$

for all $x \in I$. Hence according to Theorem 2.5.4, there is $c$ in the open interval between $a$ and $b$ such that

$$0 = h'(c) = -\frac{f^{(n)}(c)}{(n-1)!}\,(b - c)^{n-1} + n\,\frac{(b - c)^{n-1}}{(b - a)^n}\,g(a)$$

which implies (2.5.14).

Taylor’s Theorem 2.5.25 is usually applied in the following form.

Corollary 2.5.26. (Taylor’s formula) Let $n \in \mathbb{N}^*$, $I$ be a non-trivial open interval of length $L$ and $f : I \to \mathbb{R}$ be $n$-times differentiable. Finally, let $x_0 \in I$ and $C \geq 0$ be such that

$$|f^{(n)}(x)| \leq C$$

for all $x \in I$. Then

$$\left| f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k \right| \leq \frac{C L^n}{n!}$$

for all $x \in I$.

Remark 2.5.27. The polynomial

$$p_{n-1}(x) := \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k$$

for all $x \in \mathbb{R}$ in Corollary 2.5.26 is called ‘the $(n-1)$-degree polynomial of $f$ centered at $x_0$’. In particular, it follows (for the case $n = 2$) that

$$p_1(x) = f(x_0) + f'(x_0)\,(x - x_0)$$

for all $x \in \mathbb{R}$, which is also called the ‘linearization or linear approximation of $f$ at $x_0$’, and

$$|f(x) - p_1(x)| \leq \frac{C L^2}{2}$$

if $C \geq 0$ is such that

$$|f''(x)| \leq C$$

for all $x \in I$. In applications, one often meets the notation

$$f(x) \approx p_1(x)$$

saying that $f$ and $p_1$ are approximately the same near $x_0$. If the error can be seen to be ‘negligible’ for the application, this often leads to a replacement of $f$ by its linearization.

Fig. 49: Graphs of $f$ and $p_1$ from Example 2.5.28.

Example 2.5.28. Calculate the linearization $p_1$ of $f : [-1,\infty) \to \mathbb{R}$ defined by

$$f(x) := \sqrt{1 + x}$$

for all $x \in [-1,\infty)$ at $x = 0$, and estimate its error on the interval $[0, 1/2]$. Solution: $f$ is twice differentiable on $(-1,\infty)$ with

$$f'(x) = \frac{1}{2}\,(1 + x)^{-1/2} , \quad f''(x) = -\frac{1}{4}\,(1 + x)^{-3/2} .$$

Hence $p_1$ is given by

$$p_1(x) = 1 + \frac{1}{2}\,x$$

for all $x \in \mathbb{R}$. Because of

$$\left| \frac{1}{4}\,(1 + x)^{-3/2} \right| \leq \frac{1}{4}$$

for all $x \in [0, 1/2]$, it follows from Remark 2.5.27 and from $f(x) \geq 1$ for all $x \in [0, 1/2]$ that the absolute value of the relative error satisfies

$$\frac{|p_1(x) - f(x)|}{|f(x)|} \leq \frac{1}{32}$$

for all $x \in [0, 1/2]$.
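
The bound $1/32$ can be compared with the actual relative error on a grid. The sketch below is an illustration under assumptions: the function names and the grid resolution are choices made here, and sampling does not replace the estimate derived above.

```python
import math


def f(x):
    return math.sqrt(1.0 + x)


def p1(x):
    """Linearization of f at x0 = 0: p1(x) = 1 + x/2."""
    return 1.0 + 0.5 * x


def max_relative_error(a=0.0, b=0.5, steps=1000):
    """Largest sampled value of |p1(x) - f(x)| / |f(x)| on [a, b]."""
    worst = 0.0
    for i in range(steps + 1):
        x = a + (b - a) * i / steps
        worst = max(worst, abs(p1(x) - f(x)) / abs(f(x)))
    return worst


print(max_relative_error(), 1.0 / 32.0)
```

The sampled maximum lies below $1/32$, as expected, since the estimate of Remark 2.5.27 uses the worst-case bound on $f''$ and the full interval length.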

We know that the first derivative of a function $f$ in a point $p$ of its domain provides the slope of the tangent at the graph of the function in the point $(p, f(p))$. Hence it is natural to ask whether there is a geometrical interpretation of the second derivative. Indeed, such an interpretation can be given in terms of the way the graph of the function ‘bends’. This can be seen with the help of Taylor’s theorem. For this, we consider a three times continuously differentiable function $f$ defined on an open subinterval $(a,b)$ of $\mathbb{R}$ where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $x_0, x \in (a,b)$. According to Taylor’s theorem, there is $\xi$ in the open interval between $x_0$ and $x$ such that

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}\,(x - x_0)^2 + \frac{f'''(\xi)}{6}\,(x - x_0)^3 .$$

If $f''(x_0) \neq 0$, since $f'''$ is continuous, it follows for $x$ sufficiently near to $x_0$ that

$$\frac{|f''(x_0)|}{2}\,(x - x_0)^2 > \frac{|f'''(\xi)|}{6}\,|x - x_0|^3$$

and hence that

$$f(x) > f(x_0) + f'(x_0)(x - x_0)$$

if $f''(x_0) > 0$ and

$$f(x) < f(x_0) + f'(x_0)(x - x_0)$$

if $f''(x_0) < 0$. Hence if $f''(x_0) > 0$, for $x$ sufficiently near to $x_0$, the value of $f(x)$ exceeds the value of its linearization at $x_0$ or, equivalently, the point $(x, f(x))$ lies above the tangent at $x_0$. In this case, we say that $f$ is locally convex at $x_0$. If $f''(x_0) < 0$, for $x$ sufficiently near to $x_0$, the value of $f(x)$ is smaller than the value of its linearization at $x_0$ or, equivalently, the point $(x, f(x))$ lies below the tangent at $x_0$. In this case, we say that $f$ is locally concave at $x_0$.

Definition 2.5.29. (Convexity / concavity of a differentiable function) Let $f : (a,b) \to \mathbb{R}$ be differentiable where $a, b \in \mathbb{R}$ are such that $a < b$. We call $f$ convex (concave) if

$$f(x) > f(x_0) + f'(x_0)(x - x_0) \quad (\, f(x) < f(x_0) + f'(x_0)(x - x_0) \,)$$

for all $x_0, x \in (a,b)$ such that $x_0 \neq x$.

The following theorem proves the convexity / concavity of a function under less restrictive assumptions than our motivational analysis above.

Theorem 2.5.30. Let $f : (a,b) \to \mathbb{R}$ be twice differentiable on $(a,b)$, where $a, b \in \mathbb{R}$ are such that $a < b$, and such that $f''(x) > 0$ ($f''(x) < 0$) for all $x \in (a,b)$. Then

$$f(x) > f(x_0) + f'(x_0)(x - x_0) \quad (\, f(x) < f(x_0) + f'(x_0)(x - x_0) \,)$$

for all $x_0, x \in (a,b)$ such that $x_0 \neq x$, i.e., ‘$f$ is convex’ (‘$f$ is concave’).

Proof. First, we consider the case that $f''(x) > 0$ for all $x \in (a,b)$. For this, let $x_0 \in (a,b)$ and $x \in (a,b)$ be such that $x > x_0$. According to Theorem 2.5.6, there is $c \in (x_0, x)$ such that

$$\frac{f(x) - f(x_0)}{x - x_0} = f'(c) .$$

By Theorem 2.5.10, it follows that $f'$ is strictly increasing on $[x_0, x]$ and hence that

$$\frac{f(x) - f(x_0)}{x - x_0} = f'(c) > f'(x_0)$$

and that

$$f(x) > f(x_0) + f'(x_0)(x - x_0) . \tag{2.5.15}$$

Analogously, for $x \in (a,b)$ such that $x < x_0$, it follows that there is $c \in (x, x_0)$ such that

$$\frac{f(x_0) - f(x)}{x_0 - x} = f'(c)$$

and that $f'$ is strictly increasing on $[x, x_0]$ and hence that

$$\frac{f(x_0) - f(x)}{x_0 - x} = f'(c) < f'(x_0)$$

which implies (2.5.15). In the remaining case that $f''(x) < 0$ for all $x \in (a,b)$, application of the previous to $-f$ gives

$$-f(x) > -f(x_0) - f'(x_0)(x - x_0)$$

and hence

$$f(x) < f(x_0) + f'(x_0)(x - x_0)$$

for all $x_0 \in (a,b)$ and $x \in (a,b) \setminus \{x_0\}$.

Fig. 50: Graphs of $\exp$ along with linearizations around $x = 1$, $2$ and $3$.

Example 2.5.31. The exponential function $\exp$ is convex because of $\exp''(x) = \exp(x) > 0$ for all $x \in \mathbb{R}$. See Fig. 50.

Example 2.5.32. Find the intervals of convexity and concavity of $f : \mathbb{R} \to \mathbb{R}$ defined by

$$f(x) := x^4 + x^3 - 2x^2 + 1$$

for all $x \in \mathbb{R}$. Solution: $f$ is twice continuously differentiable with

$$f'(x) = 4x^3 + 3x^2 - 4x ,$$

$$f''(x) = 12x^2 + 6x - 4 = 12\left(x^2 + \frac{1}{2}\,x - \frac{1}{3}\right) = 12\left(x + \frac{1}{4} + \sqrt{\frac{19}{48}}\right) \cdot \left(x + \frac{1}{4} - \sqrt{\frac{19}{48}}\right)$$

for all $x \in \mathbb{R}$. Hence $f$ is convex on the intervals

$$\left(-\infty, -\frac{1}{4} - \sqrt{\frac{19}{48}}\right) , \quad \left(-\frac{1}{4} + \sqrt{\frac{19}{48}}, \infty\right)$$

and concave on the interval

$$\left(-\frac{1}{4} - \sqrt{\frac{19}{48}}, -\frac{1}{4} + \sqrt{\frac{19}{48}}\right) .$$

Fig. 51: Graph of $f$ from Example 2.5.32 and parallels to the $y$-axis through its inflection points.
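
The factorization of $f''$ and the resulting intervals can be checked with a few evaluations. The following sketch is illustrative; the names `f_second`, `ROOT_LEFT`, `ROOT_RIGHT` and `classify` as well as the sample points are assumptions introduced for this check.

```python
import math


def f_second(x):
    """f''(x) = 12 x^2 + 6 x - 4 for f(x) = x^4 + x^3 - 2 x^2 + 1."""
    return 12.0 * x * x + 6.0 * x - 4.0


# Zeros of f'' read off from the factorization: -1/4 -/+ sqrt(19/48).
ROOT_LEFT = -0.25 - math.sqrt(19.0 / 48.0)
ROOT_RIGHT = -0.25 + math.sqrt(19.0 / 48.0)


def classify(x):
    """Return 'convex' where f'' > 0 and 'concave' where f'' < 0."""
    value = f_second(x)
    if value > 0:
        return "convex"
    if value < 0:
        return "concave"
    return "inflection"


# One sample point from each of the three intervals.
for x in (-2.0, -0.25, 1.0):
    print(x, classify(x))
```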

The following theorem gives another useful characterization of a function defined on an interval $I$ of $\mathbb{R}$ to be convex. Such a function is convex if and only if for every $x, y \in I$ such that $x < y$ the graph of $f|_{(x,y)}$ lies below the straight line (‘secant’) between $(x, f(x))$ and $(y, f(y))$.

Theorem 2.5.33. Let $f : (a,b) \to \mathbb{R}$ be differentiable on $(a,b)$ where $a, b \in \mathbb{R}$ are such that $a < b$. Then $f$ is convex if and only if

$$f(z) < f(x) + (z - x)\,\frac{f(y) - f(x)}{y - x} \quad \left[\, = f(y) - (y - z)\,\frac{f(y) - f(x)}{y - x} \,\right]$$

for all $x, y, z \in (a,b)$ such that $x < z < y$.

Fig. 52: Graph of a convex function (black) and secant (blue). Compare Theorem 2.5.33.

Proof. If $f$ is convex, we conclude as follows. For the first step, let $x, y \in (a,b)$ be such that $x < y$. As a consequence of the convexity of $f$, it follows that

$$f(y) > f(x) + f'(x)(y - x) , \quad f(x) > f(y) + f'(y)(x - y)$$

and hence that

$$f'(x) < \frac{f(y) - f(x)}{y - x} < f'(y) .$$

This is true for all $x, y \in (a,b)$ such that $x < y$. Note that this implies that $f'$ is strictly increasing. For the second step, let $x, y, z \in (a,b)$ be such that $x < z < y$. By the mean value theorem, Theorem 2.5.6, it follows the existence of $\xi \in (x, y)$ such that

$$\frac{f(y) - f(x)}{y - x} = f'(\xi) .$$

In the case that $\xi \leq z$, it follows with the help of the first step that

$$\frac{f(y) - f(x)}{y - x} = f'(\xi) \leq f'(z) < \frac{f(y) - f(z)}{y - z}$$

and hence that

$$f(z) < f(y) - (y - z)\,\frac{f(y) - f(x)}{y - x} .$$

In the case that $z \leq \xi$, it follows with the help of the first step that

$$\frac{f(y) - f(x)}{y - x} = f'(\xi) \geq f'(z) > \frac{f(z) - f(x)}{z - x}$$

and hence that

$$f(z) < f(x) + (z - x)\,\frac{f(y) - f(x)}{y - x} .$$

On the other hand, if

$$f(z) < f(x) + (z - x)\,\frac{f(y) - f(x)}{y - x} \quad \left[\, = f(y) - (y - z)\,\frac{f(y) - f(x)}{y - x} \,\right]$$

for all $x, y, z \in (a,b)$ such that $x < z < y$, we conclude as follows. For this, note that the previous implies that

$$\frac{f(z) - f(x)}{z - x} < \frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} .$$

In the following, let $x, y, z, \xi \in (a,b)$ be such that $x < z < \xi < y$. It follows from the assumption that

$$\frac{f(z) - f(x)}{z - x} < \frac{f(\xi) - f(x)}{\xi - x} < \frac{f(y) - f(x)}{y - x} .$$

From this, it follows by taking the limit $z \to x$ that

$$f'(x) \leq \frac{f(\xi) - f(x)}{\xi - x} < \frac{f(y) - f(x)}{y - x} .$$

This implies that

$$f'(x) < \frac{f(y) - f(x)}{y - x}$$

and therefore that

$$f(y) > f(x) + f'(x)(y - x) . \tag{2.5.16}$$

Also, it follows from the assumption that

$$\frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} < \frac{f(y) - f(\xi)}{y - \xi} .$$

From this, it follows by taking the limit $\xi \to y$ that

$$\frac{f(y) - f(x)}{y - x} < \frac{f(y) - f(z)}{y - z} \leq f'(y) .$$

This implies that

$$\frac{f(y) - f(x)}{y - x} < f'(y)$$

and therefore that

$$f(x) > f(y) + f'(y)(x - y) . \tag{2.5.17}$$

Since (2.5.16) is true for all $x, y \in (a,b)$ such that $x < y$, we conclude the following for $x, y \in (a,b)$ such that $y < x$:

$$f(x) > f(y) + f'(y)(x - y) .$$

Finally, from this and (2.5.17), it follows that

$$f(x) > f(y) + f'(y)(x - y)$$

for all $x, y \in (a,b)$ such that $x \neq y$.

A typical example for the application of Theorems 2.5.30, 2.5.33 is given in the following example which derives an occasionally used lower bound for the sine function.

Fig. 53: Graph of sine function (black) and secant (blue). Compare Example 2.5.34.

Example 2.5.34. Show that

$$\sin(x) \geq 2x/\pi \tag{2.5.18}$$

for all $x \in [0, \pi/2]$. Solution: By application of Theorem 2.5.30, it follows that the restriction of $-\sin$ to $(0, \pi)$ is convex. According to Theorem 2.5.33, this implies that

$$-\sin(x) \leq -\sin(1/n) + [\,x - (1/n)\,] \cdot \frac{\sin(1/n) - 1}{(\pi/2) - (1/n)}$$

for all $x \in [1/n, \pi/2]$ where $n \in \mathbb{N}^*$. By taking the limit $n \to \infty$, this leads to

$$-\sin(x) \leq -2x/\pi$$

for all $x \in (0, \pi/2]$. From the last and the fact that (2.5.18) is trivially satisfied for $x = 0$, it follows the validity of (2.5.18) for all $x \in [0, \pi/2]$.
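
The inequality (2.5.18) can be sampled on a grid. The sketch below is only an illustration; the names `jordan_gap` and `min_gap` and the grid size are assumptions. Equality holds at both endpoints, so the sampled minimum is zero up to rounding.

```python
import math


def jordan_gap(x):
    """sin(x) - 2x/pi, which (2.5.18) asserts is >= 0 on [0, pi/2]."""
    return math.sin(x) - 2.0 * x / math.pi


def min_gap(steps=1000):
    """Smallest sampled value of the gap on a uniform grid of [0, pi/2]."""
    return min(jordan_gap(0.5 * math.pi * i / steps) for i in range(steps + 1))


print(min_gap())
print(jordan_gap(math.pi / 4.0))  # gap at the midpoint of the interval
```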

We know that the vanishing of the first derivative in a point $x$ of the domain is a necessary, but in general not sufficient, condition for a differentiable function $f$ to assume a local maximum or minimum in $x$. In that case, the tangent to the graph of $f$ in the point $(x, f(x))$ is horizontal. If $f$ is in addition twice continuously differentiable such that $f''(x) < 0$ ($f''(x) > 0$), then it follows by the continuity of $f''$ that the restriction of $f''$ to a sufficiently small interval around $x$ assumes strictly negative (strictly positive) values, hence that the restriction of $f$ to that interval is concave (convex), and therefore that $x$ marks the position of a local maximum (minimum) of $f$.

Theorem 2.5.35. (Sufficient condition for the existence of a local minimum/maximum) Let $f$ be a twice continuously differentiable real-valued function on some open interval $I$ of $\mathbb{R}$. Further, let $x_0 \in I$ be a critical point of $f$ such that $f''(x_0) > 0$ ($f''(x_0) < 0$). Then $f$ has a local minimum (maximum) at $x_0$.

Proof. Since $f''$ is continuous with $f''(x_0) > 0$ ($f''(x_0) < 0$), there is an open interval $J$ around $x_0$ such that $f''(x) > 0$ ($f''(x) < 0$) for all $x \in J$. ((Otherwise there is for every $n \in \mathbb{N}^*$ some $y_n \in I$ such that $|y_n - x_0| < 1/n$ and $f''(y_n) \leq 0$ ($f''(y_n) \geq 0$). In particular, this implies that $\lim_{n \to \infty} y_n = x_0$ and by the continuity of $f''$ also that $\lim_{n \to \infty} f''(y_n) = f''(x_0)$. Hence it follows by Theorem 2.3.12 that $f''(x_0) \leq 0$ ($f''(x_0) \geq 0$), a contradiction.)) Hence it follows by Theorem 2.5.30 and $f'(x_0) = 0$ that $f(x) > f(x_0)$ ($f(x) < f(x_0)$) for all $x \in J \setminus \{x_0\}$.

Example 2.5.36. Find the values of the local maxima and minima of

$$f(x) := \ln\left(\frac{5}{4} + \sin^2(x)\right)$$

for all $x \in \mathbb{R}$. Solution: $f$ is twice continuously differentiable with

$$f'(x) = \frac{\sin(2x)}{\frac{5}{4} + \sin^2(x)} , \quad f''(x) = \frac{2\cos(2x)}{\frac{5}{4} + \sin^2(x)} - \frac{\sin^2(2x)}{\left(\frac{5}{4} + \sin^2(x)\right)^2}$$

for all $x \in \mathbb{R}$. Hence the critical points of $f$ are at $x_k := k\pi/2$, $k \in \mathbb{Z}$, and for each $k \in \mathbb{Z}$:

$$f''(x_k) = \frac{2\,(-1)^k}{\frac{5}{4} + \sin^2(x_k)} .$$

Hence it follows by Theorem 2.5.35 that $f$ has a local minimum of value $\ln(5/4)$ at $x_{2k}$ and a local maximum of value $\ln(9/4)$ at $x_{2k+1}$, respectively, for each $k \in \mathbb{Z}$.
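
The second-derivative test of Theorem 2.5.35 can be replayed numerically for this example. The following sketch is illustrative; the names `f_second` and `classify_critical_point` and the range of indices $k$ are assumptions made for the check.

```python
import math


def f(x):
    return math.log(1.25 + math.sin(x) ** 2)


def f_second(x):
    """f'' as computed in the solution of Example 2.5.36."""
    s = 1.25 + math.sin(x) ** 2
    return 2.0 * math.cos(2.0 * x) / s - math.sin(2.0 * x) ** 2 / (s * s)


def classify_critical_point(k):
    """Second-derivative test at x_k = k*pi/2 (Theorem 2.5.35)."""
    x_k = k * math.pi / 2.0
    kind = "minimum" if f_second(x_k) > 0 else "maximum"
    return kind, f(x_k)


for k in range(-2, 3):
    print(k, classify_critical_point(k))
```

Even indices should report minima with value close to $\ln(5/4)$ and odd indices maxima with value close to $\ln(9/4)$, up to floating-point rounding in $\sin(k\pi/2)$.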

Another important consequence of Theorem 2.5.6 (or its equivalent, Rolle’s theorem) is given by Cauchy’s extended mean value theorem which is the basis for the proof of L’Hospital’s rule, Theorem 2.5.38, for the calculation of indeterminate forms. The proof of the extended mean value theorem proceeds by application of Rolle’s theorem to a skillfully devised auxiliary function.

Theorem 2.5.37. (Cauchy’s extended mean value theorem) Let $f, g : [a,b] \to \mathbb{R}$ be continuous functions where $a, b \in \mathbb{R}$ are such that $a < b$. Further, let $f, g$ be continuously differentiable on $(a,b)$ and such that $g'(x) \neq 0$ for all $x \in (a,b)$. Then there is $c \in (a,b)$ such that

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)} . \tag{2.5.19}$$

Proof. Since $g'$ is continuous with $g'(x) \neq 0$ for all $x \in (a,b)$, it follows by Theorem 2.3.37 that either $g'(x) > 0$ or $g'(x) < 0$ for all $x \in (a,b)$ and hence by Theorem 2.3.44 that $g$ is either strictly increasing or strictly decreasing on $(a,b)$. Since $g$ is continuous, from this also follows that $g(b) \neq g(a)$. Define the auxiliary function $h : [a,b] \to \mathbb{R}$ by

$$h(x) := f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)} \cdot (g(x) - g(a))$$

for all $x \in [a,b]$. Then $h$ is continuous as well as differentiable on $(a,b)$ such that

$$h'(x) = f'(x) - \frac{f(b) - f(a)}{g(b) - g(a)} \cdot g'(x)$$

for all $x \in (a,b)$ and $h(a) = h(b) = 0$. Hence according to Theorem 2.5.4, there is $c \in (a,b)$ such that

$$h'(c) = f'(c) - \frac{f(b) - f(a)}{g(b) - g(a)} \cdot g'(c) = 0 ,$$

which implies (2.5.19) since $g'(c) \neq 0$.


L’Hospital’s rule goes back to Johann Bernoulli who instructed the young French marquis Guillaume François Antoine de L’Hospital in 1692 in the new Leibnizian discipline of calculus during a visit in Paris. Johann signed a contract under which, in return for a regular salary, he agreed to send L’Hospital his discoveries in mathematics, to be used as the marquis might wish. The result was that one of Johann’s chief contributions to calculus from 1694 has ever since been known as L’Hospital’s rule on indeterminate forms after its publication in L’Hospital’s book ‘Analyse des infiniment petits’ in 1696 [69]. L’Hospital’s book was the first textbook on calculus and was met with great success.

We already met an indeterminate form in Example 2.3.54, where it was proved that

$$\lim_{x \to 0,\, x \neq 0} \frac{\sin(x)}{x} = 1 . \tag{2.5.20}$$

Formally, that limit is of the ‘indeterminate’ type

$$\frac{0}{0}$$

where the last formal expression is obtained by replacing $\sin(x)$ and $x$ in the quotient $\sin(x)/x$ by

$$\lim_{x \to 0,\, x \neq 0} \sin(x) \quad \text{and} \quad \lim_{x \to 0,\, x \neq 0} x ,$$

respectively. Since $\sin$ and the identical function on $\mathbb{R}$ are continuous and according to the limit laws, this expression would give the correct result for the limit (2.5.20) if it would involve division by a non-zero number. But, since division by zero is not defined, that expression is not defined and hence ‘indeterminate’. The following theorem treats also indeterminate limits of the type

$$\frac{\infty}{\infty} .$$

The calculation of limits of other indeterminate types can usually be reduced to the calculation of limits of these two types.

L’Hospital’s rule is a simple consequence of Cauchy’s extended mean value theorem.

Theorem 2.5.38. (Indeterminate forms/L’Hospital’s rule) Let $f : (a,b) \to \mathbb{R}$ and $g : (a,b) \to \mathbb{R}$ be continuously differentiable, where $a, b \in \mathbb{R}$ are such that $a < b$, and such that $g'(x) \neq 0$ for all $x \in (a,b)$. Further, let

$$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0 \tag{2.5.21}$$

or let $|f(x)| > 0$ and $|g(x)| > 0$ for all $x \in (a,b)$ as well as

$$\lim_{x \to a} \frac{1}{|f(x)|} = \lim_{x \to a} \frac{1}{|g(x)|} = 0 . \tag{2.5.22}$$

Finally, let

$$\lim_{x \to a} \frac{f'(x)}{g'(x)}$$

exist. Then

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} . \tag{2.5.23}$$

Proof. Since $g'$ is continuous with $g'(x) \neq 0$ for all $x \in (a,b)$, it follows by Theorem 2.3.37 that either $g'(x) > 0$ or $g'(x) < 0$ for all $x \in (a,b)$ and hence by Theorem 2.3.44 that $g$ is either strictly increasing or strictly decreasing on $(a,b)$. First, we consider the case (2.5.21). Then $f$ and $g$ can be extended to continuous functions on $[a,b)$ assuming the value $0$ in $a$. Now, let $x_0, x_1, \dots$ be a sequence of elements of $(a,b)$ converging to $a$. Then by Theorem 2.5.37 for every $n \in \mathbb{N}$ there is a corresponding $c_n \in (a, x_n)$ such that

$$\frac{f(x_n)}{g(x_n)} = \frac{f'(c_n)}{g'(c_n)} .$$

Obviously, the sequence $c_0, c_1, \dots$ is converging to $a$, and hence it follows that

$$\lim_{n \to \infty} \frac{f(x_n)}{g(x_n)} = \lim_{n \to \infty} \frac{f'(c_n)}{g'(c_n)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \tag{2.5.24}$$

and hence, finally, (2.5.23). Finally, we consider the second case. So let $|f(x)| > 0$ and $|g(x)| > 0$ for all $x \in (a,b)$, and in addition let (2.5.22) be satisfied. Further, let $x_0, x_1, \dots$ be some sequence of elements of $(a,b)$ converging to $a$ and let $b' \in (a,b)$. Because of (2.5.22), there is $n_0 \in \mathbb{N}$ such that

$$\left| \frac{f(b')}{f(x_n)} \right| < 1 \quad \text{and} \quad \left| \frac{g(b')}{g(x_n)} \right| < 1$$

for all $n \in \mathbb{N}$ such that $n \geq n_0$. Then according to Theorem 2.5.37 for any such $n$, there is a corresponding $c_n \in (x_n, b')$ such that

$$\frac{f(x_n) - f(b')}{g(x_n) - g(b')} = \frac{f'(c_n)}{g'(c_n)} .$$

Hence it follows that

$$\frac{f(x_n)}{g(x_n)} = \frac{1 - g(b')/g(x_n)}{1 - f(b')/f(x_n)} \cdot \frac{f'(c_n)}{g'(c_n)}$$

and since $c_0, c_1, \dots$ is converging to $a$, by (2.5.22) and Theorem 2.3.4, it follows the relation (2.5.24) and hence, finally, (2.5.23).

Example 2.5.39. Find

\[ \lim_{x \to 0+} x \ln(x) . \]

Solution: Define $f(x) := \ln(x)$ and $g(x) := 1/x$ for all $x \in (0,1)$. Then $f$ and $g$ are continuously differentiable and such that $g'(x) = -1/x^2 \neq 0$ for all $x \in (0,1)$. Further, $|f(x)| = |\ln(x)| > 0$, $|g(x)| = |1/x| = 1/|x| > 0$ for all $x \in (0,1)$. Finally, (2.5.22) is satisfied and

\[ \lim_{x \to 0} \frac{f'(x)}{g'(x)} = \lim_{x \to 0} (-x) = 0 . \]

Hence according to Theorem 2.5.38:

\[ \lim_{x \to 0+} x \ln(x) = 0 . \]

Example 2.5.40. Determine

\[ \lim_{x \to \infty} x e^{-x} . \]

Solution: Define $f(y) := 1/y$ and $g(y) := \exp(1/y)$ for all $y \in (0,1)$. Then $f$ and $g$ are continuously differentiable and such that $g'(y) = -y^{-2} \cdot \exp(1/y) \neq 0$ for all $y \in (0,1)$. Further, $|f(y)| = 1/|y| > 0$, $|g(y)| = \exp(1/y) > 0$ for all $y \in (0,1)$. Finally, (2.5.22) is satisfied and

\[ \lim_{y \to 0} \frac{f'(y)}{g'(y)} = \lim_{y \to 0+} \frac{1}{e^{1/y}} = 0 . \]

Hence according to Theorem 2.5.38:

\[ \lim_{x \to \infty} x e^{-x} = \lim_{y \to 0+} \frac{1/y}{e^{1/y}} = 0 . \]

Example 2.5.41. Calculate

\[ \lim_{x \to \infty} x^2 e^{-x} . \]

Solution: Define $f(y) := 1/y^2$ and $g(y) := \exp(1/y)$ for all $y \in (0,1)$. Then $f$ and $g$ are continuously differentiable as well as such that $g'(y) = -\exp(1/y)/y^2 \neq 0$ for all $y \in (0,1)$. Further, $|f(y)| = 1/y^2 > 0$, $|g(y)| = \exp(1/y) > 0$ for all $y \in (0,1)$. Finally, (2.5.22) is satisfied and by Example 2.5.40

\[ \lim_{y \to 0} \frac{f'(y)}{g'(y)} = \lim_{y \to 0+} \frac{2/y}{e^{1/y}} = 0 . \]

Hence according to Theorem 2.5.38:

\[ \lim_{x \to \infty} x^2 e^{-x} = \lim_{y \to 0+} \frac{1/y^2}{e^{1/y}} = 0 . \]

Remark 2.5.42. Recursively in this way, it can be shown that

\[ \lim_{x \to \infty} x^n e^{-x} = 0 \]

for all $n \in \mathbb{N}$.
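The limits in Examples 2.5.39 to 2.5.41 and Remark 2.5.42 can be made plausible numerically. The following Python sketch is only an illustration and not a proof; the function names `x_log_x` and `power_times_decay` are chosen here for convenience and do not appear in the text.

```python
import math


def x_log_x(x: float) -> float:
    """Value of x * ln(x) for x > 0, compare Example 2.5.39."""
    return x * math.log(x)


def power_times_decay(n: int, x: float) -> float:
    """Value of x**n * exp(-x), compare Remark 2.5.42."""
    return x**n * math.exp(-x)


if __name__ == "__main__":
    # x * ln(x) for x approaching 0 from the right
    for x in (1e-2, 1e-4, 1e-8):
        print(f"x = {x:g}: x ln(x) = {x_log_x(x):.3e}")
    # x**n * exp(-x) for growing x and several exponents n
    for n in (1, 2, 5):
        for x in (10.0, 50.0, 100.0):
            print(f"n = {n}, x = {x:g}: x^n e^(-x) = {power_times_decay(n, x):.3e}")
```

The printed values shrink towards zero in both cases, in agreement with the limits obtained from Theorem 2.5.38.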

That the condition that $g'(x) \neq 0$ for all $x \in (a,b)$ in Theorem 2.5.38 is not redundant can be seen from the following example.

Example 2.5.43. For this define

\[ f(x) := \frac{2}{x} + \sin\left(\frac{2}{x}\right) , \quad g(x) := \left[ \frac{2}{x} + \sin\left(\frac{2}{x}\right) \right] e^{\sin(1/x)} \]

for all $x \in (0, 2/5)$. Then $f$ and $g$ are continuously differentiable and satisfy

\[ \lim_{x \to 0} \frac{1}{|f(x)|} = \lim_{x \to 0} \frac{1}{|g(x)|} = 0 . \]

Since

\[ \frac{f(x)}{g(x)} = e^{-\sin(1/x)} \]

for all $x \in (0, 2/5)$, $(f/g)(x)$ does not have a limit value for $x \to 0$. Further, it follows that

\[ f'(x) = -\frac{2}{x^2} \left[ 1 + \cos\left(\frac{2}{x}\right) \right] = -\frac{4}{x^2} \cos^2\left(\frac{1}{x}\right) , \]

\[ g'(x) = -\frac{1}{x^2} \cos\left(\frac{1}{x}\right) e^{\sin(1/x)} \left[ \frac{2}{x} + \sin\left(\frac{2}{x}\right) + 4 \cos\left(\frac{1}{x}\right) \right] \]

and hence that

\[ \frac{f'(x)}{g'(x)} = \frac{4x \cos(1/x) \, e^{-\sin(1/x)}}{2 + x \sin(2/x) + 4x \cos(1/x)} \]

for all

\[ x \in (0, 2/5) \setminus \left\{ \frac{2}{(2k+1)\pi} : k \in \mathbb{N} \right\} . \]

We notice that

\[ \lim_{x \to 0+} \frac{4x \cos(1/x) \, e^{-\sin(1/x)}}{2 + x \sin(2/x) + 4x \cos(1/x)} = 0 . \]

This does not contradict Theorem 2.5.38 since $g'$ has zeros of the form $2/((2k+1)\pi)$, $k \in \mathbb{N}$. Hence there is no $b > 0$ such that the restrictions of $f$ and $g$ to $(0,b)$ would satisfy the assumptions in Theorem 2.5.38.

The following example shows that in general the existence of

\[ \lim_{x \to a} \frac{f(x)}{g(x)} \]

does not imply the existence of

\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} . \]

Example 2.5.44. For this, define

\[ f(x) := x \sin(1/x^2) \exp(-1/x) , \quad g(x) := \exp(-1/x) \]

for all $x > 0$. Then

\[ \lim_{x \to 0} f(x) = 0 , \quad \lim_{x \to 0} g(x) = 0 , \quad \lim_{x \to 0} \frac{f(x)}{g(x)} = 0 . \]

Further,

\[ f'(x) = \frac{1}{x^2} \left[ x(x+1) \sin(1/x^2) - 2 \cos(1/x^2) \right] \exp(-1/x) , \]

\[ g'(x) = \frac{1}{x^2} \exp(-1/x) , \]

\[ \frac{f'(x)}{g'(x)} = x(x+1) \sin(1/x^2) - 2 \cos(1/x^2) \]

for all $x > 0$. Hence $f'/g'$ does not have a limit value for $x \to 0$.

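For the motivation of the following contraction mapping lemma, we consider a method of calculating square roots of numbers which can be traced back to ancient Greek times, but there are indications that this method was already known in ancient Babylonia. For this, we consider the problem of approximation of $\sqrt{N}$ by fractions where $N$ is some non-zero natural number. If $q$ is some positive rational number such that

\[ q^2 < N , \]

then it follows that $q < \sqrt{N}$,

\[ \frac{q}{N} < \frac{\sqrt{N}}{N} = \frac{1}{\sqrt{N}} \]

and hence that

\[ \frac{N}{q} > \sqrt{N} . \]

Hence, the arithmetic mean

\[ \bar{q} := \frac{1}{2} \left( q + \frac{N}{q} \right) \]

of $q$ and $N/q$, which is the midpoint of the interval $[q, N/q]$, might be a better approximation to $\sqrt{N}$ than $q$. Indeed, a little calculation gives that

\begin{align*}
\bar{q}^{\,2} - N &= \frac{1}{4} \left( q + \frac{N}{q} \right)^2 - N = \frac{1}{4} \left( q^2 + 2N + \frac{N^2}{q^2} \right) - N \\
&= \frac{1}{4} \left( q^2 - 2N + \frac{N^2}{q^2} \right) = \frac{1}{4} \left[ q^2 - N + \frac{N}{q^2} (N - q^2) \right] \\
&= \frac{1}{4} \left( \frac{N}{q^2} - 1 \right) (N - q^2) \quad \left[ = \frac{(N - q^2)^2}{4q^2} > 0 \right]
\end{align*}

and hence that

\[ \bar{q}^{\,2} - N < N - q^2 \]

if

\[ \frac{1}{4} \left( \frac{N}{q^2} - 1 \right) < 1 \]

which is equivalent to $N < 5q^2$. Hence if

\[ q^2 < N < 5q^2 , \]

then $\bar{q}$ is a better approximation to $\sqrt{N}$ than $q$. Note that $\bar{q}$ does not satisfy the same inequalities since $\bar{q}^{\,2} > N$. On the other hand,

\[ \bar{\bar{q}} := \frac{1}{2} \left( \bar{q} + \frac{N}{\bar{q}} \right) \]

satisfies

\begin{align*}
\bar{\bar{q}}^{\,2} - N &= \frac{1}{4} \left( \frac{N}{\bar{q}^{\,2}} - 1 \right) (N - \bar{q}^{\,2}) \\
&= \frac{1}{4} \left( 1 - \frac{N}{\bar{q}^{\,2}} \right) (\bar{q}^{\,2} - N) \quad \left[ = \frac{(\bar{q}^{\,2} - N)^2}{4 \bar{q}^{\,2}} > 0 \right]
\end{align*}

and hence is a better approximation to $\sqrt{N}$ than $\bar{q}$ since

\[ \frac{1}{4} \left( 1 - \frac{N}{\bar{q}^{\,2}} \right) < 1 . \]

Hence by continuing this process, we arrive at rational approximations to $\sqrt{N}$ whose accuracy increases in every step.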

For instance, for $N = 2$ and $q = 1$ (note that $q^2 < N < 5q^2$ since $1 < 2 < 5$), we arrive at the following rational approximations to $\sqrt{2}$:

\[ \frac{3}{2} , \ \frac{17}{12} , \ \frac{577}{408} , \ \frac{665857}{470832} , \ \frac{886731088897}{627013566048} . \]

The value $17/12$, which gives $\sqrt{2}$ within an error of $3 \cdot 10^{-3}$, was used as a common rough approximation of $\sqrt{2}$ by the Babylonians. Starting from $q = 17/12$, Babylonian arithmetic leads to the fraction

\[ 1 + \frac{24}{60} + \frac{51}{60^2} + \frac{10}{60^3} = \frac{30547}{21600} \]

which was found on the Babylonian tablet YBC 7289 and gives $\sqrt{2}$ within an error of $6 \cdot 10^{-7}$.

Fig. 55: Graph of $T$ for the case $N = 2$ and auxiliary curves.
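The rational approximations listed above can be reproduced with exact rational arithmetic. The following Python sketch repeatedly replaces $q$ by the arithmetic mean of $q$ and $N/q$; the function name `babylonian_steps` is an illustrative choice made here and not part of the text.

```python
from fractions import Fraction


def babylonian_steps(N: int, q: Fraction, steps: int) -> list:
    """Return the first `steps` iterates of q -> (q + N/q) / 2 as exact fractions."""
    approximations = []
    for _ in range(steps):
        q = (q + N / q) / 2  # arithmetic mean of q and N/q
        approximations.append(q)
    return approximations


if __name__ == "__main__":
    for q in babylonian_steps(2, Fraction(1), 5):
        print(q, float(q))
```

Using `Fraction` instead of floating point numbers keeps every iterate exact, so that the output can be compared digit by digit with the fractions given in the text.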

A modern interpretation of the process in terms of maps is that

\[ q , \ T(q) , \ T(T(q)) = (T \circ T)(q) , \ T(T(T(q))) = (T \circ (T \circ T))(q) , \ \dots , \]

where $T : (0,\infty) \to (0,\infty)$ is defined by

\[ T(x) := \frac{1}{2} \left( x + \frac{N}{x} \right) \]

for every $x > 0$, gives a sequence of approximations to $\sqrt{N}$ of increasing accuracy. We expect that

\[ \lim_{n \to \infty} T^n(q) = \sqrt{N} \]

where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{(0,\infty)}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.

Indeed, if $x_0, x_1, \dots$ converges to some element $x_* \in (0,\infty)$, where $x > 0$ and

\[ x_k := T^k(x) \]

for all $k \in \mathbb{N}$, then

\[ x_{k+1} = T^{k+1}(x) = T(T^k(x)) = T(x_k) = \frac{1}{2} \left( x_k + \frac{N}{x_k} \right) , \]

and hence it follows by the limit laws that

\[ x_* = \lim_{k \to \infty} x_{k+1} = \lim_{k \to \infty} \frac{1}{2} \left( x_k + \frac{N}{x_k} \right) = \frac{1}{2} \left( x_* + \frac{N}{x_*} \right) . \]

Fig. 56: $(n, x_n)$ for $x = 1$ and $n = 1$ to $n = 20$.

As a consequence, in this case, $x_*$ satisfies the equation

\[ \frac{1}{2} \left( x_* - \frac{N}{x_*} \right) = 0 \]

or equivalently $x_*^2 = N$ which implies that

\[ x_* = \sqrt{N} \]

since it was assumed that $x_* > 0$. It is natural to ask in what sense $\sqrt{N}$ is a particular point for the map $T$. For this, we notice that

\[ T(\sqrt{N}) = \frac{1}{2} \left( \sqrt{N} + \frac{N}{\sqrt{N}} \right) = \sqrt{N} , \]

that is, $T$ maps $\sqrt{N}$ onto itself, i.e., $\sqrt{N}$ is a so called ‘fixed point’ of the map $T$. Also, every fixed point $x$ of $T$ satisfies the equation

\[ x = \frac{1}{2} \left( x + \frac{N}{x} \right) \]

which is equivalent to $x = \sqrt{N}$, i.e., there is no other fixed point of $T$.

Finally, it is natural to ask whether there is a special property of the map that leads to the convergence of $x_0, x_1, \dots$. For this, we notice that for $x \geq \sqrt{N}$ and $y \geq \sqrt{N}$, it follows that

\[ \frac{N}{xy} \leq 1 \]

and hence that

\begin{align*}
|T(x) - T(y)| &= \left| \frac{1}{2} \left( x - y + \frac{N}{x} - \frac{N}{y} \right) \right| = \left| \frac{1}{2} \left[ x - y - \frac{N}{xy} (x - y) \right] \right| \\
&= \frac{|x - y|}{2} \cdot \left| 1 - \frac{N}{xy} \right| \leq \frac{1}{2} |x - y| .
\end{align*}

This leads to

\[ |T(T(x)) - T(T(y))| \leq \frac{1}{2} |T(x) - T(y)| \leq \frac{1}{4} |x - y| \]

and inductively to

\[ |T^k(x) - T^k(y)| \leq \frac{1}{2^k} |x - y| \]

for all $k \in \mathbb{N}$. Since $\sqrt{N}$ is a fixed point of $T$, this implies that

\[ |T^k(x) - \sqrt{N}| \leq \frac{1}{2^k} |x - \sqrt{N}| \]

and hence that

\[ \lim_{k \to \infty} T^k(x) = \sqrt{N} . \tag{2.5.25} \]

Since for $x < \sqrt{N}$, as already observed above, it follows that

\[ (T(x))^2 - N = \frac{(x^2 - N)^2}{4x^2} > 0 \]

and hence that

\[ T(x) > \sqrt{N} . \]

Therefore, we conclude that (2.5.25) holds for all $x > 0$. In addition, we notice that the fact that $N \in \mathbb{N}^*$ was nowhere used in the previous discussion. As a consequence, summarizing that discussion, we proved the following result.

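Theorem 2.5.45. (Babylonian method of approximating roots of real numbers, I) Let $a > 0$ and $T : (0,\infty) \to (0,\infty)$ be defined by

\[ T(x) := \frac{1}{2} \left( x + \frac{a}{x} \right) \]

for every $x > 0$. Then

\[ \lim_{k \to \infty} T^k(x) = \sqrt{a} \]

for every $x > 0$, where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{(0,\infty)}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.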

Functions $T$ satisfying

\[ |T(x) - T(y)| \leq \alpha |x - y| \]

for some $0 \leq \alpha < 1$ and all $x, y$ of their domain are called contractions. We notice from the previous discussion that if such a function $T$ has a fixed point $x_*$ and maps its domain into that domain, then it follows as above that

\[ \lim_{k \to \infty} T^k(x) = x_* \]

for all $x \in D(T)$. On the other hand, in many cases the existence of such a fixed point is not obvious, but its existence can be shown with the help of Theorem 2.3.33 if the domain of $T$ is a closed interval of $\mathbb{R}$. This is the additional point that is treated in Lemma 2.5.46.

Lemma 2.5.46. (Contraction mapping lemma on the real line) Let $T : [a,b] \to \mathbb{R}$ be such that $T([a,b]) \subset [a,b]$ where $a, b \in \mathbb{R}$ are such that $a < b$. In addition, let $T$ be a contraction, i.e., let there exist $\alpha \in [0,1)$ such that

\[ |T(x) - T(y)| \leq \alpha \cdot |x - y| \tag{2.5.26} \]

for all $x, y \in [a,b]$. Then $T$ has a unique fixed point, i.e., a unique $x_* \in [a,b]$ such that

\[ T(x_*) = x_* . \]

Further,

\[ |x - x_*| \leq \frac{|x - T(x)|}{1 - \alpha} \tag{2.5.27} \]

and

\[ \lim_{n \to \infty} T^n(x) = x_* \tag{2.5.28} \]

for every $x \in [a,b]$ where $T^n$ for $n \in \mathbb{N}$ is inductively defined by $T^0 := \mathrm{id}_{[a,b]}$ and $T^{k+1} := T \circ T^k$, for $k \in \mathbb{N}$.

Proof. Note that (2.5.26) implies that $T$ is continuous. Further, define the hence continuous function $f : [a,b] \to \mathbb{R}$ by

\[ f(x) := |x - T(x)| \]

for all $x \in [a,b]$. Note that $x \in [a,b]$ is a fixed point of $T$ if and only if it is a zero of $f$. By Theorem 2.3.33, $f$ assumes its minimum in some point $x_* \in [a,b]$. Hence

\[ 0 \leq f(x_*) \leq f(T(x_*)) = |T(x_*) - T(T(x_*))| \leq \alpha \, |x_* - T(x_*)| = \alpha \, f(x_*) \]

and therefore $f(x_*) = 0$ since the assumption $f(x_*) \neq 0$ leads to the contradiction that $1 \leq \alpha$. If $x \in [a,b]$ is a fixed point of $T$, then

\[ |x_* - x| = |T(x_*) - T(x)| \leq \alpha \cdot |x_* - x| \]

and hence $x = x_*$ since the assumption $x \neq x_*$ leads to the contradiction that $1 \leq \alpha$. Finally, let $x \in [a,b]$. Then

\begin{align*}
|x - x_*| &= |x - T(x_*)| = |x - T(x) + T(x) - T(x_*)| \\
&\leq |x - T(x)| + |T(x) - T(x_*)| \leq f(x) + \alpha \cdot |x - x_*|
\end{align*}

and hence (2.5.27). Further, from (2.5.27) and

\[ |T^n(x) - x_*| = |T^n(x) - T^n(x_*)| \leq \alpha^n |x - x_*| \leq \frac{\alpha^n}{1 - \alpha} f(x) , \]

(2.5.28) follows since $\lim_{n \to \infty} \alpha^n = 0$.
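The lemma also suggests a practical stopping rule: iterate $T$ until the a posteriori bound (2.5.27) falls below a prescribed tolerance. The following Python sketch illustrates this under assumptions that are not taken from the text: the function name `fixed_point`, the tolerance, and the test map $\cos$ on $[0,1]$, which maps $[0,1]$ into itself and is a contraction there with $\alpha = \sin(1) < 1$ since $|\cos'(x)| = |\sin(x)| \leq \sin(1)$ on $[0,1]$.

```python
import math


def fixed_point(T, x, alpha, tol=1e-10, max_iter=10_000):
    """Iterate x -> T(x) until the bound |x - T(x)| / (1 - alpha) is below tol.

    Returns the current iterate together with that bound, which by (2.5.27)
    dominates the distance of the iterate to the fixed point of T.
    """
    for _ in range(max_iter):
        tx = T(x)
        bound = abs(x - tx) / (1.0 - alpha)
        if bound < tol:
            return x, bound
        x = tx
    raise RuntimeError("tolerance not reached within max_iter iterations")


if __name__ == "__main__":
    # Assumed example: T = cos on [0, 1] with contraction constant sin(1).
    x_star, bound = fixed_point(math.cos, 1.0, math.sin(1.0))
    print(x_star, bound)
```

The returned bound is computable without knowledge of the fixed point, which is the practical value of (2.5.27).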

The following example applies the previous lemma to the Babylonian method of approximating roots of real numbers. Here, more widely applicable methods are used in the proof of the invariance of the domain of the function $T$ and in the proof that $T$ is a contraction.

Example 2.5.47. (Babylonian method of approximating roots of real numbers, II) Let $a > 0$ and $N \in \mathbb{N}$ be such that $N^2 > a$. Finally, define $T : [\sqrt{a}, N] \to \mathbb{R}$ by

\[ T(x) := \frac{1}{2} \left( x + \frac{a}{x} \right) \]

for all $x \in [\sqrt{a}, N]$. Then

\[ \lim_{n \to \infty} T^n(N) = \sqrt{a} . \tag{2.5.29} \]

Proof. First, we note that

\[ T(\sqrt{a}) = \sqrt{a} , \quad T(N) = \frac{1}{2} \left( N + \frac{a}{N} \right) < N \]

and hence that $\sqrt{a}$ is a fixed point of $T$. Further, $T$ is twice differentiable on $(\sqrt{a}, N)$ with derivatives

\[ T'(x) = \frac{1}{2} \left( 1 - \frac{a}{x^2} \right) = \frac{1}{2x^2} \left( x^2 - a \right) > 0 , \quad T''(x) = \frac{a}{x^3} > 0 \]

for all $x \in (\sqrt{a}, N)$. Hence $T, T'$ are strictly increasing according to Theorem 2.3.44, $T([\sqrt{a}, N]) \subset [\sqrt{a}, N]$ and

\[ 0 \leq T'(x) \leq \frac{1}{2} \left( 1 - \frac{a}{N^2} \right) < \frac{1}{2} . \]

In particular, it follows by Theorem 2.5.6 that

\[ |T(x) - T(y)| \leq \frac{1}{2} \cdot |x - y| \]

for all $x, y \in [\sqrt{a}, N]$. By Lemma 2.5.46, it follows that $T$ has a unique fixed point, which hence is given by $\sqrt{a}$, and in particular (2.5.29).

Fig. 57: Graph of $(\mathbb{R} \to \mathbb{R}, x \mapsto x^3 - 2x - 5)$.

For instance for $N = 2$ and $q = 1$, we get in this way the first five approximating fractions

\[ \frac{3}{2} , \ \frac{17}{12} , \ \frac{577}{408} , \ \frac{665857}{470832} , \ \frac{886731088897}{627013566048} \]

with corresponding errors (according to (2.5.27)) less than or equal to

\[ \frac{1}{6} , \ \frac{1}{204} , \ \frac{1}{235416} , \ \frac{1}{313506783024} , \ \frac{1}{555992422174934068969056} . \]
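The error bounds above can be recomputed from (2.5.27) with $\alpha = 1/2$, i.e., as $2 \, |x - T(x)|$ for each iterate $x$. The following Python sketch does this in exact arithmetic; it assumes $a = 2$ and the starting value $2$ (for which the first iterate is $3/2$), and the names `T` and `a_posteriori_bounds` are illustrative choices made here.

```python
from fractions import Fraction


def T(x: Fraction, a: int) -> Fraction:
    """One step of the Babylonian iteration for the square root of a."""
    return (x + a / x) / 2


def a_posteriori_bounds(a: int, x: Fraction, steps: int,
                        alpha: Fraction = Fraction(1, 2)) -> list:
    """Bounds |x_k - T(x_k)| / (1 - alpha) from (2.5.27) for the first iterates."""
    bounds = []
    for _ in range(steps):
        x = T(x, a)
        bounds.append(abs(x - T(x, a)) / (1 - alpha))
    return bounds


if __name__ == "__main__":
    for bound in a_posteriori_bounds(2, Fraction(2), 5):
        print(bound)
```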

In 1669, Newton submitted a paper with title ‘De analysi per aequationes numero terminorum infinitas’ to the Royal Society. This paper was published only much later in 1712 [82]. Among others, Newton introduces by example an iterative method for the approximation of zeros of differentiable functions which is now named after him. For this, he considers the equation

\[ x^3 - 2x - 5 = 0 . \tag{2.5.30} \]

As a first approximation to the solution in the interval $[2,3]$, compare Fig. 57, he uses $x = 2$. Substitution of $x = 2 + p$ into (2.5.30) gives

\begin{align*}
0 = x^3 - 2x - 5 &= (2 + p)^3 - 2(2 + p) - 5 \\
&= 8 + 12p + 6p^2 + p^3 - 4 - 2p - 5 = -1 + 10p + 6p^2 + p^3 .
\end{align*}

Neglecting higher order terms in $p$ than first order, i.e., effectively replacing the last polynomial in $p$ by its linearization around $p = 0$, he arrives at the equation

\[ -1 + 10p = 0 \]

and hence at $p = 1/10$. In this way, he arrives at $x = 2.1$ as a second approximation to the solution. He then substitutes $x = 2.1 + q$ into (2.5.30) to obtain

\begin{align*}
0 = x^3 - 2x - 5 &= (2.1 + q)^3 - 2(2.1 + q) - 5 \\
&= 9.261 + 13.23q + 6.3q^2 + q^3 - 4.2 - 2q - 5 \\
&= 0.061 + 11.23q + 6.3q^2 + q^3 .
\end{align*}

Again, neglecting higher order terms in $q$ than first order, i.e., effectively replacing the last polynomial in $q$ by its linearization around $q = 0$, he arrives at the equation

\[ 0.061 + 11.23q = 0 \]

and hence at $q \approx -0.0054$ where only the leading digits are retained. In this way, he arrives at the rounded result $x = 2.0946$ as a third approximation to the solution which approximates that solution within an error of $5 \cdot 10^{-5}$.

It has to be taken into account that Newton’s paper does not contain references to his fluxions or fluents. On the other hand, in spirit, his procedure matches today’s version of the method. The only difference is that today’s method does not involve substitutions. It proceeds as follows. We define $f := (\mathbb{R} \to \mathbb{R}, x \mapsto x^3 - 2x - 5)$. Starting from the first approximation $x_0 = 2$ of its zero, we calculate the linearization $p_{10}$ of $f$ around $x_0$. Since

\[ f'(x) = 3x^2 - 2 \]

for all $x \in \mathbb{R}$, we arrive at

\[ p_{10}(x) = f(x_0) + f'(x_0)(x - x_0) = -1 + 10 \, (x - 2) = -21 + 10x \]

for all $x \in \mathbb{R}$. Effectively replacing the function $f$ by its linearization $p_{10}$, we arrive at the equation

\[ -21 + 10x = 0 \]

and hence, as Newton, at the first approximation $x_1 = 2.1$. In the second step, we calculate the linearization $p_{11}$ of $f$ around $x_1$. It is given by

\begin{align*}
p_{11}(x) &= f(x_1) + f'(x_1)(x - x_1) = 0.061 + 11.23 \, (x - 2.1) \\
&= -23.522 + 11.23x
\end{align*}

for all $x \in \mathbb{R}$. Again, effectively replacing the function $f$ by its linearization $p_{11}$, we arrive at the equation

\[ -23.522 + 11.23x = 0 \]

and hence, as Newton, at the second approximation $x_2 = 2.0946$ by repeating Newton’s way of rounding the result.
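The two linearization steps above can be repeated mechanically. The following Python sketch applies the step "replace $f$ by its linearization and solve" to Newton’s equation (2.5.30), starting from $x_0 = 2$; the function name `newton_iterates` is an illustrative choice, and no rounding of intermediate results is performed, in contrast to Newton’s own calculation.

```python
def newton_iterates(f, df, x, steps):
    """Return [x_0, x_1, ..., x_steps], where x_{n+1} is the zero of the
    linearization of f around x_n, i.e., x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    iterates = [x]
    for _ in range(steps):
        x = x - f(x) / df(x)
        iterates.append(x)
    return iterates


def f(x):
    return x**3 - 2 * x - 5


def df(x):
    return 3 * x**2 - 2


if __name__ == "__main__":
    for n, x in enumerate(newton_iterates(f, df, 2.0, 5)):
        print(n, x, f(x))
```

The first two iterates agree with $x_1 = 2.1$ and, after rounding to four decimal places, with $x_2 = 2.0946$ from the text.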

From today’s perspective, Newton’s method can be viewed as a particular application of the contraction mapping lemma. This is also used below to prove the convergence of the method and to provide an error estimate. The method is iterative and used to approximate solutions of the equation $f(x) = 0$ where $f : I \to \mathbb{R}$ is a differentiable function on a non-trivial open interval $I$ of $\mathbb{R}$. Starting from an approximation $x_n \in I$ to such a solution, the correction $x_{n+1}$ is given by the zero of the linearization around $x_n$,

\[ f(x_n) + f'(x_n)(x - x_n) , \]

$x \in \mathbb{R}$, and hence by

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{2.5.31} \]

assuming $f'(x_n) \neq 0$, thereby essentially replacing the function $f$ by its linearization around $x_n$.

It is instructive to analyze the recursion (2.5.31) in a little more detail where we assume that $f'$ is in addition continuous. For this, let’s assume that $f(x_n) > 0$. If $f$ is increasing in some neighborhood of $x_n$, i.e., if $f'(x_n) > 0$, then we would expect the solution to be located to the left (= towards smaller values) of $x_n$ and, indeed, in this case, $x_{n+1}$ is to the left of $x_n$. If $f$ is decreasing in some neighborhood of $x_n$, i.e., if $f'(x_n) < 0$, then we would expect the solution to be located to the right (= towards larger values) of $x_n$ and also $x_{n+1}$ is to the right of $x_n$. If $f(x_n) < 0$ and $f$ is increasing in some neighborhood of $x_n$, i.e., if $f'(x_n) > 0$, then we would expect the solution to be to the right of $x_n$ and also $x_{n+1}$ is to the right of $x_n$. Finally, if $f(x_n) < 0$ and $f$ is decreasing in some neighborhood of $x_n$, i.e., if $f'(x_n) < 0$, then we would expect that the solution is to the left of $x_n$ and also $x_{n+1}$ is to the left of $x_n$. Hence the recursion (2.5.31) shows a very intuitive behavior. On the other hand, for this reasoning to make sense, the solution should be very near to $x_n$. In particular, in cases where $x_n$ is near to a critical point of $f$, the method usually fails because it leads to corrections of a much too large size.

Finally, since the graph of the linearization of $f$ around $x_n$ gives the tangent to the graph of $f$ in the point $(x_n, f(x_n))$, $x_{n+1}$ gives the abscissa of the intersection of that tangent with the $x$-axis. This fact gives a geometric interpretation to Newton’s method.

Fig. 58: Graph of $f$ from Example 2.5.48 ($a = 2$) and Newton steps starting from $x_0 = 4$.

The following example shows that the Babylonian method of approximating roots of real numbers can be seen as a particular case of Newton’s method.

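Example 2.5.48. Let $a > 0$. Define $f : \mathbb{R} \to \mathbb{R}$ by

\[ f(x) := x^2 - a \]

for all $x \in \mathbb{R}$. Then

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - a}{2x_n} = \frac{1}{2} \left( x_n + \frac{a}{x_n} \right) \]

for $x_n \neq 0$ which is the iteration used in Theorem 2.5.45.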

Theorem 2.5.49. (Newton’s method) Let $f$ be a twice differentiable real-valued function on a non-trivial open interval $I$ of $\mathbb{R}$. Further, let $I$ contain a zero $x_0$ of $f$ and be such that $f'(x) \neq 0$ for all $x \in I$ and in particular such that

\[ \left| \frac{f(x) \, f''(x)}{[f'(x)]^2} \right| \leq \alpha \]

for all $x \in I$ and some $\alpha \in \mathbb{R}$ satisfying $0 \leq \alpha < 1$. Then

\[ \lim_{n \to \infty} T^n(x) = x_0 \tag{2.5.32} \]

for all $x \in I$ where

\[ T(x) := x - \frac{f(x)}{f'(x)} . \]

Finally,

\[ |x - x_0| \leq \frac{|x - T(x)|}{1 - \alpha} \tag{2.5.33} \]

for all $x \in I$.

Proof. First, it follows that $T$ is differentiable with derivative

\[ T'(x) = \frac{f(x) \, f''(x)}{[f'(x)]^2} \]

for all $x \in I$ and that $x_0$ is a fixed point of $T$. By Theorem 2.5.6 it follows that

\[ \left| \frac{T(x) - T(x_0)}{x - x_0} \right| = \left| \frac{T(x) - x_0}{x - x_0} \right| < 1 \]

for all $x \in I$ different from $x_0$ and hence that

\[ |T(x) - x_0| \leq |x - x_0| \tag{2.5.34} \]

for all $x \in I$. Now let $[a,b]$, where $a, b \in \mathbb{R}$ are such that $a < b$, be some closed subinterval of $I$ containing $x_0$. Then it follows by (2.5.34) that $T([a,b]) \subset [a,b]$ and by Theorem 2.5.6 that

\[ \left| \frac{T(x) - T(y)}{x - y} \right| \leq \alpha \]

for all $x, y \in [a,b]$ satisfying $x \neq y$ and hence that

\[ |T(x) - T(y)| \leq \alpha |x - y| \]

for all $x, y \in [a,b]$. Hence by Lemma 2.5.46, the relations (2.5.32) and (2.5.33) follow for all $x \in [a,b]$.

Fig. 59: Zero of $f$ from Example 2.5.50 given by the $x$-coordinate of the intersection of two graphs.

The following example gives an application of Newton’s method to a standard problem from quantum theory.

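Example 2.5.50. Find an approximation $x_1$ to the solution of

\[ x_0 = \cos(x_0) \]

such that $|x_0 - x_1| < 10^{-6}$. Solution: Define $f : \mathbb{R} \to \mathbb{R}$ by

\[ f(x) := x - \cos(x) \]

for all $x \in \mathbb{R}$. Then $f$ is infinitely often differentiable with

\[ f'(x) = 1 + \sin(x) , \quad f''(x) = \cos(x) , \quad \frac{f(x) \, f''(x)}{[f'(x)]^2} = \frac{\cos(x) \cdot (x - \cos(x))}{(1 + \sin(x))^2} \]

where only in the last identity it has to be assumed that $x$ is different from $-\pi/2 + 2k\pi$ for all $k \in \mathbb{Z}$. Further

\[ f\left(\frac{\pi}{6}\right) = \frac{\pi}{6} - \frac{\sqrt{3}}{2} < 0 , \quad f\left(\frac{\pi}{4}\right) = \frac{\pi}{4} - \frac{1}{\sqrt{2}} > 0 , \]

and hence according to Theorem 2.3.37, $f$ has a zero in the open interval $I := (\pi/6, \pi/4)$. Also

\[ f'(x) = 1 + \sin(x) > 1 + \sin(\pi/6) = 3/2 > 0 \]

for all $x \in I$. Further,

\[ \left( \frac{f \cdot f''}{(f')^2} \right)'(x) = \frac{3 \cos(x) + x \sin(x) - 2x}{(1 + \sin(x))^2} \]

and

\[ 3 \cos(x) + x \sin(x) - 2x \geq 3 \cos\left(\frac{\pi}{4}\right) + \frac{\pi}{6} \sin\left(\frac{\pi}{6}\right) - 2 \cdot \frac{\pi}{4} = \frac{3}{\sqrt{2}} - \frac{5\pi}{12} > 0 \]

for all $x \in [\pi/6, \pi/4]$, and hence $f f''/(f')^2$ is strictly increasing on $[\pi/6, \pi/4]$ as a consequence of Theorem 2.5.10. Therefore,

\begin{align*}
\frac{1}{27} \left( -9 + \sqrt{3} \, \pi \right) &= \frac{\cos\left(\frac{\pi}{6}\right) \cdot \left( \frac{\pi}{6} - \cos\left(\frac{\pi}{6}\right) \right)}{\left( 1 + \sin\left(\frac{\pi}{6}\right) \right)^2} < \frac{f(x) \, f''(x)}{[f'(x)]^2} \\
&< \frac{\cos\left(\frac{\pi}{4}\right) \cdot \left( \frac{\pi}{4} - \cos\left(\frac{\pi}{4}\right) \right)}{\left( 1 + \sin\left(\frac{\pi}{4}\right) \right)^2} = \frac{\pi - 2\sqrt{2}}{8 + 6\sqrt{2}}
\end{align*}

and

\[ \left| \frac{f(x) \, f''(x)}{[f'(x)]^2} \right| < \frac{1}{27} \left( 9 - \sqrt{3} \, \pi \right) < \alpha := \frac{1}{3} < 1 \]

for all $x \in I$. Starting the iteration from $0.7$ gives to six decimal places

\[ 0.739436 , \quad 0.739085 \]

with the corresponding errors

\[ 0.000527006 , \quad 4.08749 \cdot 10^{-8} . \]

Hence the zero $x_0$ of $f$ in the interval $I$ agrees with

\[ x_1 = 0.739085 \]

to six decimal places. That there is no further zero of $f$ can be concluded as follows. Since the derivative of $f$ does not vanish in the interval $(-\pi/2, \pi/2)$, it follows by Theorem 2.5.4 that there are no other zeros in this interval. Further, for $|x| \geq \pi/2$ $(> 1)$ there is no zero of $f$ because $|\cos(x)| \leq 1$ for all $x \in \mathbb{R}$. The quantity

\[ U_2 + (U_1 - U_2) \, x_0^2 \]

is the ground state energy of a particle in a finite square well potential with $U_3 = U_1$, $\gamma = 0$, $KL = 2$. See [79].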
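The numbers in Example 2.5.50 can be checked with a few lines of Python. The sketch below performs two Newton steps starting from $0.7$ and evaluates the error bound (2.5.33) with $\alpha = 1/3$ at each iterate; the names `T` and `error_bound` are illustrative choices and not taken from the text.

```python
import math


def f(x):
    return x - math.cos(x)


def df(x):
    return 1.0 + math.sin(x)


def T(x):
    """Newton map T(x) = x - f(x) / f'(x) for f(x) = x - cos(x)."""
    return x - f(x) / df(x)


def error_bound(x, alpha=1.0 / 3.0):
    """Right-hand side |x - T(x)| / (1 - alpha) of the estimate (2.5.33)."""
    return abs(x - T(x)) / (1.0 - alpha)


if __name__ == "__main__":
    x = 0.7
    for step in (1, 2):
        x = T(x)
        print(step, f"{x:.6f}", error_bound(x))
```

Since the bound only requires the evaluation of one further Newton step, it can serve as a stopping criterion for a prescribed accuracy such as $10^{-6}$.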

Problems

1) Give the maximum and minimum values of $f$ and the points where they are assumed.

a) $f(x) := x^2 + 5x + 7$, $x \in [-5, 0]$,
b) $f(t) := t^3 + 6t^2 + 9t + 14$, $t \in [-5, 0]$,
c) $f(s) := s^4 + (8/3) s^3 - 6s^2 + 1$, $s \in [-5, 5]$,
d) $f(t) := 4(t - 3)^2 \cdot (t^2 + 1)$, $t \in [-1, 4]$,
e) $f(x) := (9x + 12)/(3x^2 - 4)$, $x \in [-1, 0]$,
f) $f(x) := (x^2 + x + 1) \exp(-x)$, $x \in [-0.3, 1.5]$,
g) $f(x) := \exp(-x/\sqrt{3}\,) \cos(x)$, $x \in [0, \infty)$.

2) Consider a projectile that is shot into the atmosphere. If $v \geq 0$ is the component of its speed at initial time $0$ in the vertical direction, its height $z(t)$ above ground at time $t \geq 0$ is given by $z(t) = vt - gt^2/2$ where $g = 9.81\,\mathrm{m/s^2}$ is the acceleration due to gravity and it is assumed that $z(0) = 0$. Calculate the maximal height the projectile reaches and also the time of its flight, i.e., the time when it returns to the ground.

3) Reconsider the situation from the previous problem, but now with inclusion of a viscous frictional force opposing the motion of the projectile. Then $z(t) = \alpha \, [(v + \alpha g) \cdot (1 - \exp(-t/\alpha)) - gt]$ where it is again assumed that $z(0) = 0$. Here $\alpha = m/\lambda$ where $m > 0$ is the mass of the projectile and $\lambda > 0$ is a parameter describing the strength of the friction. Calculate the maximal height the projectile reaches and also the time of its flight, i.e., the time when it returns to the ground.

4) Let $a > 0$ and $b > 0$. Find an equation for the straight line through the point $(a, b)$ that cuts from the first quadrant a triangle of minimum area. State that area.

5) Let $a > 0$ and $b > 0$. Find an equation for the straight line through the point $(a, b)$ whose intersection with the first quadrant is shortest. State the length of that intersection.

6) Find the maximal volume of a cylinder of given surface area $A > 0$.

7) From each corner of a rectangular cardboard of side lengths $a > 0$ and $b > 0$, a square of side length $x \geq 0$ is removed, and the edges are turned up to form an open box. Find the value of $x$ for which the volume of that box is maximal.

8) A rectangular movie screen on a wall is $h_1$ meters above the floor and $h_2$ meters high. Imagine yourself sitting in front of the screen and looking into the direction of its center. Measured in this direction, what distance $x$ from the wall will give you the largest viewing angle $\theta$ of the movie screen? [This is the angle between the straight lines that connect your eyes to the lowest and the highest points on the screen.] Assume that the height of your eyes above the floor is $h_s$ meters where $h_s < h_1$.

9) Imagine that the upper half-plane $H_+ := \mathbb{R} \times (0, \infty)$ and the lower half-plane $H_- := \mathbb{R} \times (-\infty, 0)$ of $\mathbb{R}^2$ are filled with different ‘physical media’ with the $x$-axis being the interface $I$. Further, let $(x_1, y_1) \in H_+$, $(x_2, y_2) \in H_-$. Light rays in both media proceed along straight lines and at constant speeds $v_1$ and $v_2$, respectively. According to Fermat’s principle, a ray connecting $(x_1, y_1)$ and $(x_2, y_2)$ chooses the path that takes the least time. Show that that path satisfies Snell’s law, i.e., $\sin(\theta_1)/\sin(\theta_2) = v_1/v_2$ where $\theta_1$ ($\theta_2$) is the angle of the part of the ray in $H_+ \cup I$ ($H_- \cup I$) with the normal to the $x$-axis originating from its intersection with $I$.

10) For the following functions find the intervals of increase and decrease, the local maximum and minimum values and their locations and the intervals of convexity and concavity and the inflection points. Use the gathered information to sketch the graph of the function. If available, check your result with a graphing device.

a) $f(s) := 7s^4 - 3s^2 + 1$, $s \in \mathbb{R}$,
b) $f(t) := t^4 + (8/3) t^3 - 6t^2 + 3$, $t \in \mathbb{R}$,
c) $f(x) := 4(x - 3)^2 \cdot (x^2 + 1)$, $x \in \mathbb{R}$.


11) For the following functions find vertical and horizontal asymptotes, the intervals of increase and decrease, the local maximum and minimum values and their locations and the intervals of convexity and concavity and the inflection points. Use the gathered information to sketch the graph of the function. If available, check your result with a graphing device.

a) $f(x) := x/(1 + x^2)$, $x \in \mathbb{R}$,
b) $f(x) := \sqrt{x^2 + 1} - x$, $x \in \mathbb{R}$,
c) $f(x) := (9x + 12)/(3x^2 - 4)$, $x \in \mathbb{R} \setminus \{-2/\sqrt{3}, 2/\sqrt{3}\}$.

12) Calculate the linearization of $f$ around the given point.

a) $f(x) := (1 + x)^n$, $x > -1$, around $x = 0$ where $n \in \mathbb{R}$,
b) $f(x) := \ln(x)$, $x > 0$, around $x = 1$,
c) $f(\varphi) := \sin(\varphi)$, $\varphi \in \mathbb{R}$, around $\varphi = 0$,
d) $f(\varphi) := \tan(\varphi)$, $\varphi \in (-\pi/2, \pi/2)$, around $\varphi = 0$,
e) $f(x) := \sinh(x) := (e^x - e^{-x})/2$, $x \in \mathbb{R}$, around $x = 0$,
f) $f(\varphi) := \ln[(5/4) + \cos(3\varphi)]$, $\varphi \in \mathbb{R}$, around $\varphi_0 = 3\pi/4$,
g) $f(x) := (3x^2 - x + 5)/(5x^2 + 6x - 3)$, $x \in \mathbb{R} \setminus \{x \in \mathbb{R} : 5x^2 + 6x - 3 = 0\}$, around $x = 1$.

13) Show that

a) $(1 + x)^n > 1 + nx$ for all $x > 0$ and $n \geq 1$,
b) $(1 + x)^n < 1 + nx$ for all $x > 0$ and $0 < n < 1$,
c) $\ln x \leq x - 1$ for all $x > 0$,
d) $\sin(\varphi) < \varphi$ for all $\varphi > 0$,
e) $\tan(\varphi) > \varphi$ for all $\varphi \in (0, \pi/2)$,
f) $\sinh(x) := (e^x - e^{-x})/2 > x$ for all $x > 0$,
g) $\ln x \geq (x - 1)/x$ for all $x > 0$.

14) Calculate

a) $\lim_{x \to \infty} \left( 1 + \frac{a}{x} \right)^x$,
b) $\lim_{x \to \infty} \left( 1 - \frac{a}{x} \right)^{-x}$,
c) $\lim_{x \to 0} \frac{\tan(x)}{x}$,
d) $\lim_{x \to 0} \frac{x \tan(x)}{1 - \cos(x)}$,
e) $\lim_{x \to 0+} \frac{\sin(x)}{\sqrt{x}}$,
f) $\lim_{x \to 0} \left( \frac{1}{x} - \frac{1}{\sin(x)} \right)$,
g) $\lim_{x \to \infty} \frac{\ln(x)}{x}$,
h) $\lim_{x \to \infty} \frac{[\ln(x)]^{n+2}}{x}$,
i) $\lim_{x \to 1} \frac{\ln(x)}{\tan(\pi x)}$,
j) $\lim_{x \to 0+} x^x$,
k) $\lim_{x \to 0+} x^a \ln(x)$,
l) $\lim_{x \to 0+} [\sin(x)]^{\tan(x)}$,
m) $\lim_{x \to \infty} x^{1/x}$,
n) $\lim_{x \to 0+} x^{\sin(x)}$,
o) $\lim_{x \to \infty} [\cos(1/x)]^x$,
p) $\lim_{x \to 0+} \frac{\cos(3x) - \cos(2x)}{x^2}$,
q) $\lim_{x \to 1+} \frac{1 + \cos(\pi x)}{x^2 - 2x + 1}$

where $n \in \mathbb{N}$, $a \in \mathbb{R}$.

15) Explain why Newton’s method fails to find the zero(s) of $f$ in the following cases.

a) $f(x) := x^2 - x - 6$, $x \in \mathbb{R}$, with initial approximation $x = 1/2$,
b) $f(x) := x^{1/3}$, $x \in \mathbb{R}$.

16) A circular arch of length $L > 0$ and height $h > 0$ is to be constructed where $L/h > \pi$.

a) Show that $x := L/(2r)$, where $r > 0$ is the radius of the corresponding circle, satisfies the transcendental equation

\[ \cos(x) = 1 - \frac{2h}{L} \, x . \]

b) Assume that $L/h = 7$. By Newton’s method, find an approximation $x_0$ to $x$ such that $|x_0 - x| < 10^{-6}$.

17) The characteristic frequencies of the transverse oscillations of a stringof lengthL ą 0 with fixed left end and right end subject to the bound-ary condition v 1pLq ` hvpLq “ 0, where v : r0, Ls Ñ R is the am-plitude of deflection of the string and h P R, is given by ω “ xLwhere

tanx “ ´x

hL(2.5.35)

[20]. Assume hL “ 13, and find by Newton’s method an ap-proximation x0 to the smallest solution x ą 0 of (2.5.35) such that|x0 ´ x| ă 10´6.

18) The characteristic frequencies of the transverse vibrations of a homogeneous beam of length $L > 0$ with fixed ends are given by $\omega = [EJ/(\rho S)]^{1/2} (x/L)^2$ where

\[ \cosh(x) \cos(x) = 1 , \tag{2.5.36} \]

$E$ is Young’s modulus, $J$ is the moment of inertia of a transverse section, $S$ is the area of the section, $\rho$ is the density of the material of the beam, and $\cosh(y) := (e^y + e^{-y})/2$ for all $y \in \mathbb{R}$ [65]. By Newton’s method, find an approximation $x_0$ to the smallest solution $x > 0$ of (2.5.36) such that $|x_0 - x| < 10^{-6}$.

19) (Binomial theorem) Let $n \in \mathbb{N}^*$. Define $f : (-1, \infty) \to \mathbb{R}$ by

\[ f(x) := \sum_{k=0}^{n} \binom{n}{k} x^k \]

for all $x \in (-1, \infty)$ where the so called ‘binomial coefficients’ are defined by

\[ \binom{n}{0} := 1 , \quad \binom{n}{k} := \frac{1}{k!} \, n \cdot (n - 1) \cdots (n - (k - 1)) \]

for every $k \in \mathbb{N}^*$.

a) Show that

\[ (1 + x) f'(x) = n f(x) \]

for all $x \in (-1, \infty)$.

b) Conclude from part a) that

\[ f(x) = (1 + x)^n \]

for all $x \in (-1, \infty)$.

c) Show the binomial theorem, i.e., that

\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k} \]

for all $x, y \in \mathbb{R}$.


2.6 Riemann Integration

An early example of integration is given by Archimedes’ quadrature of the segment of the parabola. For this, he presents two proofs. Here, we display his first proof because it anticipates the definition of the Riemann integral. The second proof will be given at the beginning of Section 3.3 on series of real numbers. We use his method to calculate the area $A$ of the parabolic segment

\[ \{ (x, y) \in \mathbb{R}^2 : x \in [0, 1] \wedge 0 \leq y \leq 1 - x^2 \} \]

that is contained in the rectangle $[0,1] \times [0,1]$, see Fig. 60. He approximates $A$ by what would be called upper and lower sums today, but the construction of those sums was geometrically motivated. We slightly alter that construction, but otherwise closely follow his method. For this, we divide the $x$-axis into intervals of equal lengths, for instance, into four intervals

\[ [0, 1/4] , \ [1/4, 2/4] , \ [2/4, 3/4] , \ [3/4, 4/4] \]

of equal lengths $1/4$.

Fig. 60: The yellow area $A$ enclosed by the graph of $f := ([0,1] \to \mathbb{R}, x \mapsto 1 - x^2)$ and the coordinate axes is determined by Archimedes’ method.

Then the sum $U_4$ of the areas of the two-dimensional intervals

\[ [0, 1/4] \times [0, 1 - (0/4)^2] , \quad [1/4, 2/4] \times [0, 1 - (1/4)^2] , \]

\[ [2/4, 3/4] \times [0, 1 - (2/4)^2] , \quad [3/4, 4/4] \times [0, 1 - (3/4)^2] \]

given by

\[ U_4 = \frac{1}{4} \sum_{k=0}^{3} \left( 1 - \frac{k^2}{4^2} \right) \]

exceeds $A$, and the sum $L_4$ of the areas of the two-dimensional intervals

\[ [0, 1/4] \times [0, 1 - (1/4)^2] , \quad [1/4, 2/4] \times [0, 1 - (2/4)^2] , \]

\[ [2/4, 3/4] \times [0, 1 - (3/4)^2] , \quad [3/4, 4/4] \times [0, 1 - (4/4)^2] \]

given by

\[ L_4 = \frac{1}{4} \sum_{k=1}^{4} \left( 1 - \frac{k^2}{4^2} \right) \]

is smaller than $A$,

\[ L_4 \leq A \leq U_4 . \]

Fig. 61: The yellow area gives the upper bound $U_4$ for $A$, compare text.

Fig. 62: The yellow area gives the lower bound $L_4$ for $A$, compare text.

In the same way, by division of the $x$-domain into intervals of equal lengths $1/n$, where $n \in \mathbb{N}^*$, we arrive at

\[ U_n = \frac{1}{n} \sum_{k=0}^{n-1} \left( 1 - \frac{k^2}{n^2} \right) , \quad L_n = \frac{1}{n} \sum_{k=1}^{n} \left( 1 - \frac{k^2}{n^2} \right) \]

and the inequalities

\[ L_n \leq A \leq U_n . \]

Since

\[ U_n - L_n = \frac{1}{n} \left( \frac{n}{n} \right)^2 = \frac{1}{n} , \]

we conclude that

\[ L_n \leq A \leq L_n + \frac{1}{n} . \]

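Further,

\begin{align*}
L_n &= \frac{1}{n} \left( n - \sum_{k=1}^{n} \frac{k^2}{n^2} \right) = 1 - \frac{1}{n^3} \sum_{k=1}^{n} k^2 = 1 - \frac{(n + 1)(2n + 1)}{6n^2} \\
&= 1 - \frac{1}{3} \left( 1 + \frac{1}{n} \right) \left( 1 + \frac{1}{2n} \right)
\end{align*}

where it has been used that

\[ \sum_{k=1}^{n} k^2 = \frac{1}{6} \, n (n + 1)(2n + 1) . \]

The last formula was known to Archimedes. He proved it in his treatise on spirals [37]. Of course, it is tempting (and correct) to take the limit $n \to \infty$ to conclude that

\[ A \geq \lim_{n \to \infty} L_n = \frac{2}{3} , \quad A \leq \lim_{n \to \infty} \left( L_n + \frac{1}{n} \right) = \lim_{n \to \infty} L_n = \frac{2}{3} \]

and hence that

\[ A = \frac{2}{3} . \tag{2.6.1} \]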
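The sums $U_n$ and $L_n$ are easy to evaluate exactly, which allows a direct check of the relations used above. The following Python sketch is an illustration with function names (`upper_sum`, `lower_sum`) chosen here; it uses exact rational arithmetic so that the identities $U_n - L_n = 1/n$ and $L_n = 1 - (n+1)(2n+1)/(6n^2)$ can be verified without rounding errors.

```python
from fractions import Fraction


def upper_sum(n: int) -> Fraction:
    """U_n = (1/n) * sum_{k=0}^{n-1} (1 - k^2/n^2) for f(x) = 1 - x^2 on [0, 1]."""
    return Fraction(1, n) * sum(1 - Fraction(k * k, n * n) for k in range(n))


def lower_sum(n: int) -> Fraction:
    """L_n = (1/n) * sum_{k=1}^{n} (1 - k^2/n^2) for f(x) = 1 - x^2 on [0, 1]."""
    return Fraction(1, n) * sum(1 - Fraction(k * k, n * n) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (4, 10, 100, 1000):
        lower, upper = lower_sum(n), upper_sum(n)
        print(n, float(lower), float(upper), upper - lower)
```

For growing $n$, both sums approach $2/3$ from below and above, respectively, while their difference is exactly $1/n$.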

Below, the Riemann integral of $f : [0,1] \to \mathbb{R}$ defined by $f(x) := 1 - x^2$ for every $x \in [0,1]$, will be defined essentially as the common limit of the sequences $L_1, L_2, \dots$ and $U_1, U_2, \dots$, which give the area enclosed by the graph of $f$ and the coordinate axes, and denoted by

\[ \int_0^1 f(x) \, dx \]

where Leibniz’s sign $\int$ is a stylized S and is intended to remind of the summation involved in the definition of the integral. Hence the previous reasoning shows that

\[ \int_0^1 f(x) \, dx = \frac{2}{3} . \]

214

Note that (2.6.1) presupposes an intuitive geometric notion of the area A.Today, the limits would be used for the definition of A. As derivativesof functions are used to define tangents at curves, integrals of functionsare used to define areas (or volumes in Calculus III). Also, note that thewhole calculation, including the limit value, uses only rational numbers andtherefore does not pose a problem to ancient Greek mathematics. In othercases where the quadrature failed, like the quadrature of the circle, thatarea was not describable by a rational number. Finally, instead of (2.6.1),Archimedes showed an equivalent result that expressed A in terms of a ra-tional multiple of the area of a triangle inscribed into the parabolic segment.For the last result, we refer to the beginning of Section 3.3 in Calculus IIon series of real numbers.

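We return to the question of showing that $A = 2/3$. Since there was no limit concept at the time, this proof had to be performed by a so-called ‘double reductio ad absurdum’, i.e., by leading both assumptions, $A < 2/3$ and $A > 2/3$, to a contradiction, which leaves only the option that $A = 2/3$. Since

$$\frac{2}{3} - \frac{1}{n} \le \frac{2}{3} - \frac{3n + 1}{6n^2} = L_n \le A \le U_n = \frac{2}{3} + \frac{3n - 1}{6n^2} \le \frac{2}{3} + \frac{1}{n} ,$$

this can be done as follows. First, we assume that $A = (2/3) + \varepsilon$ for some $\varepsilon > 0$. Then it follows for $n > 1/\varepsilon$ that

$$\frac{2}{3} + \varepsilon = A \le \frac{2}{3} + \frac{1}{n} < \frac{2}{3} + \varepsilon ,$$

which is a contradiction. On the other hand, if $A = (2/3) - \varepsilon$ for some $\varepsilon > 0$, it follows for $n > 1/\varepsilon$ that

$$\frac{2}{3} - \varepsilon = A \ge \frac{2}{3} - \frac{1}{n} > \frac{2}{3} - \varepsilon ,$$

which is again a contradiction. Hence the only remaining possibility is that $A = 2/3$. Of course, in ancient Greece only rational $\varepsilon$ were considered in such an analysis.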
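The closed forms for $L_n$ and $U_n$ used in the inequality chain can be checked with exact rational arithmetic. The following is a minimal sketch, assuming that $L_n$ and $U_n$ are the lower and upper sums of $f(x) = 1 - x^2$ for the subdivision of $[0, 1]$ into $n$ subintervals of equal length; the function name `lower_upper` is purely illustrative.

```python
from fractions import Fraction


def lower_upper(n):
    """Return (L_n, U_n) for f(x) = 1 - x^2 on [0, 1] with n equal subintervals."""
    h = Fraction(1, n)

    def f(x):
        return 1 - x * x

    # f is decreasing on [0, 1]: on each subinterval the infimum is attained
    # at the right endpoint and the supremum at the left endpoint.
    lower = sum(f((k + 1) * h) * h for k in range(n))
    upper = sum(f(k * h) * h for k in range(n))
    return lower, upper


for n in (1, 2, 10, 100):
    L, U = lower_upper(n)
    # Closed forms appearing in the inequality chain of the text.
    assert L == Fraction(2, 3) - Fraction(3 * n + 1, 6 * n * n)
    assert U == Fraction(2, 3) + Fraction(3 * n - 1, 6 * n * n)
    # Both sums stay within 1/n of 2/3, which drives the double reductio.
    assert Fraction(2, 3) - Fraction(1, n) <= L <= Fraction(2, 3) <= U <= Fraction(2, 3) + Fraction(1, n)
    print(n, float(L), float(U))
```

Since only rational numbers occur, the check is exact, which mirrors the remark that the whole calculation stays within the rational numbers.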

A generalization of Archimedes’ result to natural powers of $x$ was made only in the 17th century by Descartes and Fermat, but unpublished, and in 1647 by Bonaventura Cavalieri [24]. The next decisive step was the discovery of the fundamental theorem of calculus independently by Newton [83] and Leibniz [68], see Theorems 2.6.19, 2.6.21, i.e., the realization that differentiation and integration are inverse processes.

For motivation of that theorem, we go back to the start of Section 2.4, to the discussion of Galileo’s results on bodies in free fall near the surface of the earth. Starting from the fallen distance $s(t)$ at time $t$,

$$s(t) = \frac{1}{2} g t^2 \tag{2.6.2}$$

for all $t \ge 0$, we determined the instantaneous speed $v(t)$ of the body at time $t$ as the derivative

$$v(t) = s'(t) = g t ,$$

where $g = 9.81 \, \mathrm{m/sec^2}$ is the acceleration of the earth’s gravitational field. We now investigate the reverse question, how to calculate $s(t)$ from the instantaneous speeds between times $0$ and $t$. There are two main approaches to this problem.

The first uses that $v(t) = s'(t)$ for every $t > 0$ and concludes that $s$ is the ‘anti-derivative’ of $v$ such that $s(0) = 0$ and hence (by application of Theorem 2.5.7) is given by (2.6.2).

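A second approach, leading to integration, uses the following relation between $s$ and $v$. For every $t > 0$ and $n \in \mathbb{N}^*$, it follows that

$$s(t) - s(0) = \sum_{k=0}^{n-1} \left[ s\!\left( \frac{k+1}{n} t \right) - s\!\left( \frac{k}{n} t \right) \right] = \sum_{k=0}^{n-1} \frac{ s\!\left( \frac{k+1}{n} t \right) - s\!\left( \frac{k}{n} t \right) }{ \frac{k+1}{n} t - \frac{k}{n} t } \cdot \left( \frac{k+1}{n} t - \frac{k}{n} t \right) .$$

Fig. 63: $S_6(1.2)$ is given by the yellow area under $G(v)$. (Horizontal axis: $t$ [sec]; vertical axis: $v(t)$ [m/sec].)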

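For $k \in \{0, \dots, n-1\}$,

$$\frac{ s\!\left( \frac{k+1}{n} t \right) - s\!\left( \frac{k}{n} t \right) }{ \frac{k+1}{n} t - \frac{k}{n} t }$$

is the average speed in the time interval

$$\left[ \frac{k}{n} t , \frac{k+1}{n} t \right] .$$

In this case, it is given by

$$\frac{ s\!\left( \frac{k+1}{n} t \right) - s\!\left( \frac{k}{n} t \right) }{ \frac{k+1}{n} t - \frac{k}{n} t } = \frac{n g t}{2} \left[ \left( \frac{k+1}{n} \right)^2 - \left( \frac{k}{n} \right)^2 \right] = g t \, \frac{k + \frac{1}{2}}{n} = v\!\left( \frac{k}{n} t \right) + \frac{g t}{2n}$$

in terms of the instantaneous speed $v$ at the beginning of the time interval. Hence, we conclude that

$$s(t) - s(0) = \frac{g t^2}{2n} + \sum_{k=0}^{n-1} v\!\left( \frac{k}{n} t \right) \frac{t}{n} = S_n(t) + \frac{g t^2}{2n} ,$$

where

$$S_n(t) := \sum_{k=0}^{n-1} v\!\left( \frac{k}{n} t \right) \frac{t}{n} .$$

This leads to

$$s(t) - s(0) = \lim_{n \to \infty} \left[ \sum_{k=0}^{n-1} v\!\left( \frac{k}{n} t \right) \frac{t}{n} \right] .$$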

Note that the sum $S_n(t)$ has the geometrical interpretation of an area under $G(v)$, see Fig. 63. Below, the limit

$$\lim_{n \to \infty} \left[ \sum_{k=0}^{n-1} v\!\left( \frac{k}{n} t \right) \frac{t}{n} \right]$$

will coincide with the integral of the function $v$ over the interval $[0, t]$, which is denoted by

$$\int_0^t v(\tau) \, d\tau .$$

Hence

$$s(t) - s(0) = \int_0^t v(\tau) \, d\tau$$

gives the relation between instantaneous speed and the distance traveled between times $0$ and $t$. It is satisfied for motion in one dimension in general. The last relation gives the connection between the integral of $v$ over the interval $[0, t]$, $t > 0$, and its anti-derivative $s$. It constitutes a special case of the fundamental theorem of calculus and is valid for a wide class of functions $v$. From the knowledge of an anti-derivative $s$ of $v$, i.e., some function $s$ such that $s'(\tau) = v(\tau)$ for all $\tau \in [0, t]$, this relation allows the calculation of the integral of $v$ over the interval $[0, t]$.
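The relation $s(t) - s(0) = S_n(t) + g t^2 / (2n)$ for the free fall can be illustrated numerically. The following sketch assumes the values $g = 9.81$ and $t = 1.2$ that appear in the text and in Fig. 63; the names `left_sum`, `v` and `s` are chosen here only for illustration.

```python
G = 9.81  # m/sec^2, the value of g used in the text


def v(t):
    """Instantaneous speed v(t) = g t."""
    return G * t


def s(t):
    """Fallen distance s(t) = g t^2 / 2."""
    return 0.5 * G * t * t


def left_sum(t, n):
    """S_n(t): sum of v(k t / n) * t / n for k = 0, ..., n - 1."""
    return sum(v(k * t / n) * t / n for k in range(n))


t = 1.2
for n in (6, 60, 600, 6000):
    gap = s(t) - left_sum(t, n)
    predicted_gap = G * t * t / (2 * n)
    # Up to floating-point rounding, the gap equals g t^2 / (2 n).
    assert abs(gap - predicted_gap) < 1e-9
    print(n, left_sum(t, n), gap)
```

The printed gaps shrink proportionally to $1/n$, in agreement with the limit stated above.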

As a consequence of the discovery of the fundamental theorem of calculus, during the 18th century, the integral was generally regarded as the inverse of the derivative, i.e., the statement of the fundamental theorem of calculus was used to define the integral. Only in cases where an anti-derivative could not be found, definitions of the integral as a limit of some sort of sums or as an area under a curve were used to derive approximations. In particular, the notion of area was still considered intuitive, such that no precise definition was needed.

At the beginning of the 19th century, the work of Fourier made it necessary to define integrals also of discontinuous functions. Cauchy was the first to give a definition for continuous functions. Still, it contained an unnatural element in its preference for function values assumed at the left ends of the intervals used to subdivide the domain of such a function. The first fully satisfactory definition, applicable to a large class of discontinuous functions, was given by Bernhard Riemann in 1854 in his habilitation thesis [87]. The equivalent definition used in this text is due to Jean-Gaston Darboux.

After this introduction, we start with natural definitions of the length of intervals, partitions of intervals and corresponding lower and upper sums of bounded functions. Such sums already appeared in the previous calculation of the area of the parabolic segment and in the motivation of the fundamental theorem of calculus. They corresponded to partitions of intervals into subintervals of equal length. In the limit of vanishing length, we arrived at the area $A$ as well as at integrals of $v$. Below, the size of a partition generalizes that length. On the other hand, we will allow for much more general partitions of intervals in the definition of the integral. As a consequence, those partitions cannot be characterized by a single parameter, and hence a definition of the integral in the form of a simple limit is not possible. Such a limit is replaced by the supremum of the lower sums and the infimum of the upper sums, which are required to coincide for integrable functions.

Definition 2.6.1.

(i) Let $a, b \in \mathbb{R}$ be such that $a \le b$. We define the lengths of the corresponding intervals $(a, b)$, $(a, b]$, $[a, b)$, $[a, b]$ by

$$l((a, b)) = l((a, b]) = l([a, b)) = l([a, b]) := b - a .$$

A partition $P$ of $[a, b]$ is an ordered sequence $(a_0, \dots, a_\nu)$ of elements of $[a, b]$ such that

$$a = a_0 \le a_1 \le \dots \le a_\nu = b ,$$

where $\nu$ is an element of $\mathbb{N}^*$. Since $(a, b)$ is such a partition of $[a, b]$, the set of all partitions of that interval is non-empty. A partition $P'$ of $[a, b]$ is called a refinement of $P$ if $P$ is a subsequence of $P'$.

(ii) A partition $P = (a_0, \dots, a_\nu)$ of a bounded closed interval $I$ of $\mathbb{R}$ induces a division of $I$ into, in general non-disjoint, subintervals

$$I = \bigcup_{j=0}^{\nu-1} I_j , \quad I_j := [a_j, a_{j+1}] , \quad j = 0, \dots, \nu - 1 .$$

The size of $P$ is defined as the maximum of the lengths of these subintervals. In addition, we define for every bounded function $f$ on $I$ the lower sum $L(f, P)$ and upper sum $U(f, P)$ corresponding to $P$ by:

$$L(f, P) := \sum_{j=0}^{\nu-1} \inf\{f(x) : x \in I_j\} \, l(I_j) ,$$

$$U(f, P) := \sum_{j=0}^{\nu-1} \sup\{f(x) : x \in I_j\} \, l(I_j) .$$

Note that if $K > 0$ is such that $|f(x)| \le K$ for all $x \in I$, it follows that

$$-K \le \inf\{f(x) : x \in J\} \le \sup\{f(x) : x \in J\} \le K$$

for every subset $J$ of $I$ and hence that

$$|L(f, P)| \le \sum_{j=0}^{\nu-1} |\inf\{f(x) : x \in I_j\}| \, l(I_j) \le K \sum_{j=0}^{\nu-1} l(I_j) = K \, l(I) ,$$

$$|U(f, P)| \le \sum_{j=0}^{\nu-1} |\sup\{f(x) : x \in I_j\}| \, l(I_j) \le K \sum_{j=0}^{\nu-1} l(I_j) = K \, l(I) .$$

As a consequence, the sets

$$\{L(f, P) : P \in \mathcal{P}\} , \quad \{U(f, P) : P \in \mathcal{P}\}$$

are bounded, where $\mathcal{P}$ denotes the set of all partitions of $I$.

Example 2.6.2. Consider the interval $I := [0, 1]$ and the continuous function $f : I \to \mathbb{R}$ defined by $f(x) := x$ for all $x \in I$. Then

$$P_0 := (0, 1) , \quad P_1 := (0, 1/2, 1)$$

are partitions of $I$. The size of $P_0$ is $1$, whereas the size of $P_1$ is $1/2$. Also, $P_1$ is a refinement of $P_0$. Finally,

$$L(f, P_0) = 0 \cdot 1 = 0 , \quad U(f, P_0) = 1 \cdot 1 = 1 ,$$

$$L(f, P_1) = 0 \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} ,$$

$$U(f, P_1) = \frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{3}{4}$$

and hence

$$L(f, P_0) \le L(f, P_1) \le U(f, P_1) \le U(f, P_0) .$$
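The sums of this example can be reproduced by a short computation. The sketch below is restricted to non-decreasing functions, for which the infimum and supremum over a subinterval are the values at its left and right endpoints; this restriction, and the helper names `size`, `lower_sum` and `upper_sum`, are assumptions made for the illustration and are not part of the definition.

```python
from fractions import Fraction


def size(partition):
    """Size of a partition: maximum length of its subintervals."""
    return max(right - left for left, right in zip(partition, partition[1:]))


def lower_sum(f, partition):
    """Lower sum L(f, P), assuming f is non-decreasing (inf at the left endpoint)."""
    return sum(f(left) * (right - left) for left, right in zip(partition, partition[1:]))


def upper_sum(f, partition):
    """Upper sum U(f, P), assuming f is non-decreasing (sup at the right endpoint)."""
    return sum(f(right) * (right - left) for left, right in zip(partition, partition[1:]))


def f(x):
    return x


P0 = (Fraction(0), Fraction(1))
P1 = (Fraction(0), Fraction(1, 2), Fraction(1))

print(size(P0), size(P1))
print(lower_sum(f, P0), upper_sum(f, P0))
print(lower_sum(f, P1), upper_sum(f, P1))

# The refinement P1 increases the lower sum and decreases the upper sum.
assert lower_sum(f, P0) <= lower_sum(f, P1) <= upper_sum(f, P1) <= upper_sum(f, P0)
```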

Intuitively, it is to be expected that a refinement of a partition of an interval leads to a decrease of the corresponding upper sums and an increase of the corresponding lower sums, as has also been found in the special case of the previous example. Indeed, this intuition is correct.

Lemma 2.6.3. Let $f$ be a bounded real-valued function on a closed interval $I$ of $\mathbb{R}$. Further, let $P, P'$ be partitions of $I$, and in particular let $P'$ be a refinement of $P$. Then

$$L(f, P) \le L(f, P') \le U(f, P') \le U(f, P) . \tag{2.6.3}$$

Proof. The middle inequality is obvious from the definition of lower and upper sums given in Definition 2.6.1(ii). Further, let $P = (a_0, \dots, a_\nu)$ be a partition of $I = [a, b]$ where $\nu \in \mathbb{N}^*$ and $a_0, \dots, a_\nu \in [a, b]$. Obviously, for the proof of the remaining inequalities it is sufficient (by the method of induction) to assume that $P' = (a_0, a_1', a_1, \dots, a_\nu)$ where $a_1' \in I$ is such that

$$a_0 \le a_1' \le a_1 ,$$

where, to keep the notation simple, the additional point is placed in the first subinterval. Then

$$\begin{aligned}
L(f, P') - L(f, P) &= \inf\{f(x) : x \in [a_0, a_1']\} \cdot l([a_0, a_1']) + \inf\{f(x) : x \in [a_1', a_1]\} \cdot l([a_1', a_1]) \\
&\quad - \inf\{f(x) : x \in [a_0, a_1]\} \cdot l([a_0, a_1]) \\
&\ge \inf\{f(x) : x \in [a_0, a_1]\} \cdot \{ l([a_0, a_1']) + l([a_1', a_1]) - l([a_0, a_1]) \} = 0 .
\end{aligned}$$

Analogously, it follows that

$$U(f, P') - U(f, P) \le 0$$

and hence, finally, (2.6.3).

As a consequence of their definition, lower sums are smaller than or equal to upper sums. It is not difficult to show that the same is true for the supremum of the lower sums and the infimum of the upper sums.

Theorem 2.6.4. Let $f$ be a bounded real-valued function on the interval $[a, b]$ of $\mathbb{R}$ and $\mathcal{P}$ be the set of all partitions of $[a, b]$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$. Then

$$\sup(\{L(f, P) : P \in \mathcal{P}\}) \le \inf(\{U(f, P) : P \in \mathcal{P}\}) . \tag{2.6.4}$$

Proof. By Lemma 2.6.3, it follows for all $P_1, P_2 \in \mathcal{P}$ that

$$L(f, P_1) \le L(f, P) \le U(f, P) \le U(f, P_2) ,$$

where $P \in \mathcal{P}$ is some corresponding common refinement, and hence that

$$\sup(\{L(f, P_1) : P_1 \in \mathcal{P}\}) \le U(f, P_2)$$

and (2.6.4).

As a consequence of Lemma 2.6.3 and since every partition $P$ of some interval of $\mathbb{R}$ is a refinement of the trivial partition containing only its initial and endpoints, we can make the following definition.

Definition 2.6.5. (The Riemann integral) Let $f$ be a bounded real-valued function on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$. Denote by $\mathcal{P}$ the set consisting of all partitions of $[a, b]$. We say that $f$ is Riemann-integrable on $[a, b]$ if

$$\sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}) .$$

In that case, we define the integral of $f$ on $[a, b]$ by

$$\int_a^b f(x) \, dx := \sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}) .$$

In particular, if $f(x) \ge 0$ for all $x \in [a, b]$, we define the area $A$ under the graph of $f$ by

$$A := \int_a^b f(x) \, dx .$$

Example 2.6.6. Let $f$ be a constant function of value $c \in \mathbb{R}$ on some interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$. In particular, $f$ is bounded. Further, let $P = (a_0, \dots, a_\nu)$ be a partition of $[a, b]$ where $\nu \in \mathbb{N}^*$ and $a_0, \dots, a_\nu \in [a, b]$. Then

$$L(f, P) = U(f, P) = \sum_{k=0}^{\nu-1} c \, l([a_k, a_{k+1}]) = \sum_{k=0}^{\nu-1} c \, (a_{k+1} - a_k) = c \sum_{k=0}^{\nu-1} (a_{k+1} - a_k) = c \, (b - a) .$$

Hence all lower and upper sums are equal to $c \cdot (b - a)$. As a consequence, $f$ is Riemann-integrable and

$$\int_a^b f(x) \, dx = c \cdot (b - a) .$$

Note that, for $c = 1$, this result can be restated as saying that

$$\int_a^b dx$$

is given by the difference of the values of the antiderivative $([a, b] \to \mathbb{R}, x \mapsto x)$ of the integrand at $b$ and $a$. That this is not just accidental will be seen later on. The same is also true in more general cases, as specified in the version Theorem 2.6.21 of the fundamental theorem of calculus.

Note that according to the previous example, the integral of every function defined on an interval containing precisely one point is zero. The value of the function in this point does not affect the value of the integral. This observation will lead further down to the definition of so-called zero sets.

Example 2.6.7. Consider the function $f : [a, b] \to \mathbb{R}$ defined by

$$f(x) := x$$

for all $x \in [a, b]$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$. Since $|f(x)| = |x| \le \max\{|a|, |b|\}$ for every $x \in [a, b]$, $f$ is bounded. For every $n \in \mathbb{N}^*$, define the partition $P_n$ of $[a, b]$ by

$$P_n := \left( a, \, a + \frac{b - a}{n}, \, \dots, \, a + \frac{n \cdot (b - a)}{n} = b \right) .$$

Calculate $L(f, P_n)$ and $U(f, P_n)$ for all $n \in \mathbb{N}^*$. Show that $f$ is Riemann-integrable over $[a, b]$ and calculate the value of

$$\int_a^b f(x) \, dx .$$

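Solution: With $I := [a, b]$, we have:

$$I = \bigcup_{j=0}^{n-1} \left[ a + \frac{j (b - a)}{n} , \, a + \frac{(j + 1)(b - a)}{n} \right]$$

and

$$\begin{aligned}
L(f, P_n) &= \sum_{j=0}^{n-1} \left[ a + \frac{j (b - a)}{n} \right] \cdot \frac{b - a}{n} = a \cdot (b - a) + \frac{(b - a)^2}{n^2} \sum_{j=0}^{n-1} j \\
&= a \cdot (b - a) + \frac{(b - a)^2}{n^2} \cdot \frac{n}{2} (n - 1) = a \cdot (b - a) + \frac{(b - a)^2}{2} \left( 1 - \frac{1}{n} \right) , \\
U(f, P_n) &= \sum_{j=0}^{n-1} \left[ a + \frac{(j + 1)(b - a)}{n} \right] \cdot \frac{b - a}{n} = a \cdot (b - a) + \frac{(b - a)^2}{n^2} \sum_{j=0}^{n-1} (j + 1) \\
&= a \cdot (b - a) + \frac{(b - a)^2}{n^2} \cdot \frac{n}{2} (n + 1) = a \cdot (b - a) + \frac{(b - a)^2}{2} \left( 1 + \frac{1}{n} \right) .
\end{aligned}$$

Hence

$$\lim_{n \to \infty} L(f, P_n) = \lim_{n \to \infty} U(f, P_n) = \frac{1}{2} \cdot (b^2 - a^2) .$$

As a consequence, it follows that

$$\frac{1}{2} \cdot (b^2 - a^2) \le \sup(\{L(f, P) : P \in \mathcal{P}\})$$

and that

$$\inf(\{U(f, P) : P \in \mathcal{P}\}) \le \frac{1}{2} \cdot (b^2 - a^2)$$

and hence by Theorem 2.6.4 that

$$\sup(\{L(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f, P) : P \in \mathcal{P}\}) = \frac{1}{2} \cdot (b^2 - a^2) ,$$

where $\mathcal{P}$ denotes the set of partitions of $[a, b]$. Hence $f$ is Riemann-integrable and

$$\int_a^b x \, dx = \frac{1}{2} \cdot (b^2 - a^2) .$$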

Note that the last result can be restated as saying that

$$\int_a^b x \, dx$$

is given by the difference of the values of the antiderivative $([a, b] \to \mathbb{R}, x \mapsto x^2/2)$ of the integrand at $b$ and $a$. That this is not just accidental will be seen later on. The same is also true in more general cases, as specified in the version Theorem 2.6.21 of the fundamental theorem of calculus.
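The closed forms for $L(f, P_n)$ and $U(f, P_n)$ in Example 2.6.7 can be verified with exact arithmetic. The following sketch uses the sample endpoints $a = -1$ and $b = 3$, which are chosen here only for illustration; the function name `equidistant_sums` is likewise hypothetical.

```python
from fractions import Fraction


def equidistant_sums(a, b, n):
    """Lower and upper sums of f(x) = x on [a, b] for n subintervals of equal length."""
    h = (b - a) / n
    # f(x) = x is increasing: inf at the left endpoint, sup at the right endpoint.
    lower = sum((a + j * h) * h for j in range(n))
    upper = sum((a + (j + 1) * h) * h for j in range(n))
    return lower, upper


a, b = Fraction(-1), Fraction(3)
integral = (b * b - a * a) / 2

for n in (1, 4, 25, 100):
    lower, upper = equidistant_sums(a, b, n)
    # Closed forms derived in the solution of Example 2.6.7.
    assert lower == a * (b - a) + (b - a) ** 2 / 2 * (1 - Fraction(1, n))
    assert upper == a * (b - a) + (b - a) ** 2 / 2 * (1 + Fraction(1, n))
    # The integral is squeezed between the two sums, whose gap is (b - a)^2 / n.
    assert lower <= integral <= upper
    assert upper - lower == (b - a) ** 2 / n
    print(n, lower, upper)
```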

In the past, we have seen that the property of convergence of a sequence, as well as the continuity and differentiability of functions, is automatically ‘transferred’ to sums, products and quotients, see Theorems 2.3.4, 2.3.46, 2.3.48 and 2.4.8. This fact considerably simplified the decision whether a given sequence is convergent or whether given functions are continuous or differentiable. In many cases, this is an obvious consequence of the convergence of elementary sequences as well as of the continuity or differentiability of elementary functions. For these reasons, it is natural to ask whether multiples, sums, products and quotients of integrable functions are integrable as well. Indeed, this is the case for multiples, sums and products. In the case of quotients, this is the case if the divisor is in addition nowhere vanishing and if the quotient is bounded. The corresponding proof is relatively simple in the case of multiples and sums of integrable functions and is part of the following theorem. In the case of products and quotients, the statement is a consequence of Lebesgue’s criterion for Riemann-integrability, Theorem 2.6.13, which is proved in the appendix. Within the definition of Riemann-integrability above, we also defined the area under the graph of a positive integrable function in terms of its integral. This is reasonable in view of applications only if that integral is positive. This positivity is a simple consequence of the positivity of the lower sums.

Theorem 2.6.8. (Linearity and positivity of the integral) Let $f, g$ be bounded and Riemann-integrable on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$ and $c \in \mathbb{R}$. Then $f + g$ and $cf$ are Riemann-integrable on $[a, b]$ and

$$\int_a^b (f(x) + g(x)) \, dx = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx ,$$

$$\int_a^b c f(x) \, dx = c \int_a^b f(x) \, dx .$$

If $f$ is in addition positive, then

$$\int_a^b f(x) \, dx \ge 0 .$$

Proof. In the following, we denote by $\mathcal{P}$ the set of all partitions of $[a, b]$. First, if $M_1 > 0$ and $M_2 > 0$ are such that $|f(x)| \le M_1$ and $|g(x)| \le M_2$, then

$$|(f + g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| \le M_1 + M_2 ,$$

$$|(cf)(x)| = |c f(x)| = |c| \, |f(x)| \le |c| M_1$$

for all $x \in [a, b]$ and hence $f + g$ and $cf$ are bounded for every $c \in \mathbb{R}$. Second, it follows for every subinterval $J$ of $I := [a, b]$ that

$$\inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\} \le f(x) + g(x) = (f + g)(x) ,$$

$$(f + g)(x) = f(x) + g(x) \le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\}$$

for all $x \in J$ and hence that

$$\begin{aligned}
\inf\{f(x) : x \in J\} + \inf\{g(x) : x \in J\} &\le \inf\{(f + g)(x) : x \in J\} \le \sup\{(f + g)(x) : x \in J\} \\
&\le \sup\{f(x) : x \in J\} + \sup\{g(x) : x \in J\} .
\end{aligned}$$

Hence it follows for every partition $P$ of $I$ that

$$L(f, P) + L(g, P) \le L(f + g, P) \le U(f + g, P) \le U(f, P) + U(g, P) .$$

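If $n \in \mathbb{N}^*$, by refining partitions, we can construct $P_n \in \mathcal{P}$ such that

$$\int_a^b f(x) \, dx - \frac{1}{2n} < L(f, P_n) , \quad \int_a^b g(x) \, dx - \frac{1}{2n} < L(g, P_n) ,$$

$$U(f, P_n) < \int_a^b f(x) \, dx + \frac{1}{2n} , \quad U(g, P_n) < \int_a^b g(x) \, dx + \frac{1}{2n} .$$

Hence

$$\int_a^b f(x) \, dx + \int_a^b g(x) \, dx - \frac{1}{n} \le L(f + g, P_n) \le U(f + g, P_n) \le \int_a^b f(x) \, dx + \int_a^b g(x) \, dx + \frac{1}{n}$$

and

$$\begin{aligned}
\int_a^b f(x) \, dx + \int_a^b g(x) \, dx - \frac{1}{n} &\le \sup\{L(f + g, P) : P \in \mathcal{P}\} \\
&\le \inf\{U(f + g, P) : P \in \mathcal{P}\} \le \int_a^b f(x) \, dx + \int_a^b g(x) \, dx + \frac{1}{n} .
\end{aligned}$$

Since the last is true for every $n \in \mathbb{N}^*$, we conclude that

$$\sup\{L(f + g, P) : P \in \mathcal{P}\} = \inf\{U(f + g, P) : P \in \mathcal{P}\} = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx .$$

Hence $f + g$ is Riemann-integrable and

$$\int_a^b (f(x) + g(x)) \, dx = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx .$$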

Further, if $c \ge 0$, it follows for every subinterval $J$ of $I$ that

$$\inf\{c f(x) : x \in J\} = c \, \inf\{f(x) : x \in J\} ,$$

$$\sup\{c f(x) : x \in J\} = c \, \sup\{f(x) : x \in J\}$$

and hence that

$$L(cf, P) = c \, L(f, P) , \quad U(cf, P) = c \, U(f, P)$$

for every partition $P$ of $I$. The last implies that

$$\sup\{L(cf, P) : P \in \mathcal{P}\} = c \, \sup\{L(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx ,$$

$$\inf\{U(cf, P) : P \in \mathcal{P}\} = c \, \inf\{U(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx .$$

If $c \le 0$, it follows for every subinterval $J$ of $I$ that

$$\inf\{c f(x) : x \in J\} = c \, \sup\{f(x) : x \in J\} ,$$

$$\sup\{c f(x) : x \in J\} = c \, \inf\{f(x) : x \in J\}$$

and hence that

$$L(cf, P) = c \, U(f, P) , \quad U(cf, P) = c \, L(f, P)$$

for every partition $P$ of $I$. The last implies that

$$\sup\{L(cf, P) : P \in \mathcal{P}\} = c \, \inf\{U(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx ,$$

$$\inf\{U(cf, P) : P \in \mathcal{P}\} = c \, \sup\{L(f, P) : P \in \mathcal{P}\} = c \int_a^b f(x) \, dx .$$

Hence it follows in both cases that

$$\int_a^b c f(x) \, dx = c \int_a^b f(x) \, dx .$$

Finally, if $f$ is such that $f(x) \ge 0$ for all $x \in I$, then

$$\inf\{f(x) : x \in J\} \ge 0$$

for all subintervals $J$ of $I$ and hence

$$L(f, P) \ge 0$$

for every partition $P$ of $I$. As a consequence,

$$\int_a^b f(x) \, dx = \sup\{L(f, P) : P \in \mathcal{P}\} \ge 0 .$$

The Riemann integral can be viewed as a map into the real numbers with domain given by the set of bounded Riemann-integrable functions over some bounded closed interval $I$ of $\mathbb{R}$. According to the previous theorem, that map is ‘linear’, i.e., the integral of the sum of such functions is equal to the sum of their corresponding integrals, and the integral of a scalar multiple of such a function is given by that multiple of the integral of that function. In addition, it is positive, in the sense that it maps such functions which are in addition positive, i.e., which assume only positive ($\ge 0$) values, into a positive real number. It is easy to see that the linearity and positivity of the map imply also its monotony, i.e., if such functions $f$ and $g$ satisfy $f \le g$, defined by $f(x) \le g(x)$ for all $x \in I$, then the integral of $f$ is less than or equal to the integral of $g$.

Corollary 2.6.9. (Monotony of the integral) Let $f, g$ be bounded and Riemann-integrable on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$, and in addition let $f(x) \le g(x)$ for all $x \in [a, b]$. Then

$$\int_a^b f(x) \, dx \le \int_a^b g(x) \, dx .$$

Proof. For this, we define the auxiliary function $h : [a, b] \to \mathbb{R}$ by $h(x) := g(x) - f(x)$ for all $x \in [a, b]$. According to Theorem 2.6.8, $h$ is bounded and Riemann-integrable. Finally, since $f(x) \le g(x)$ for all $x \in [a, b]$, it follows that $h(x) \ge 0$ for all $x \in [a, b]$. Hence it follows by the linearity and positivity of the integral that

$$0 \le \int_a^b h(x) \, dx = \int_a^b g(x) \, dx + \int_a^b [-f(x)] \, dx = \int_a^b g(x) \, dx - \int_a^b f(x) \, dx$$

and hence that

$$\int_a^b f(x) \, dx \le \int_a^b g(x) \, dx .$$

The reader might have wondered why we did not define divisions of intervals induced by partitions in such a way that they contain only pairwise disjoint intervals, although that would have been possible. In our definition, subsequent intervals in a division contain a common point. Hence, in a certain sense, associated upper and lower sums count the values of the function in such points twice. The reason for our definition is that it is technically simpler than one which uses pairwise disjoint intervals and that the use of a definition of the latter type would have led to the same integral. The last is reflected by the fact that values of functions in individual points do not influence the value of the integral. For this, note that by Example 2.6.6, the integral of any function defined on an interval containing only one point is zero. The value of the function in this point does not affect the value of the integral. The reason behind this behavior is, of course, the fact that we defined the length of intervals as the difference between their right and left boundary. Hence the length of an interval containing only one point is zero. Such intervals are examples of so-called zero sets. The values assumed by a function on a zero set do not influence the value of the integral. There are several possible definitions of zero sets. The following common definition uses the intuition that they should be, in some sense, of vanishing length.

Definition 2.6.10. (Sets of measure zero) A subset $S$ of $\mathbb{R}$ is said to have measure zero if for every $\varepsilon > 0$ there is a corresponding sequence $I_0, I_1, \dots$ of open subintervals of $\mathbb{R}$ such that the union of those intervals contains $S$ and at the same time such that

$$\lim_{n \to \infty} \sum_{k=0}^{n} l(I_k) < \varepsilon .$$

Remark 2.6.11. Note that any finite subset of $\mathbb{R}$ and also any subset of a set of measure zero has measure zero.

Theorem 2.6.12. Every countable subset $S$ of $\mathbb{R}$ is a set of measure zero.

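Proof. Since $S$ is countable, there is a bijection $\varphi : \mathbb{N} \to S$. Let $\varepsilon > 0$ and define for each $n \in \mathbb{N}$ the corresponding interval $I_n := (\varphi(n) - \varepsilon / 2^{n+3}, \varphi(n) + \varepsilon / 2^{n+3})$. Then the union of these intervals contains $S$, and for each $N \in \mathbb{N}$:

$$\sum_{n=0}^{N} l(I_n) = \varepsilon \cdot \sum_{n=0}^{N} \left( \frac{1}{2} \right)^{n+2} = \frac{\varepsilon}{4} \cdot \frac{1 - \left( \frac{1}{2} \right)^{N+1}}{1 - \frac{1}{2}} = \frac{\varepsilon}{2} \cdot \left[ 1 - \left( \frac{1}{2} \right)^{N+1} \right]$$

and hence

$$\lim_{N \to \infty} \sum_{k=0}^{N} l(I_k) = \frac{\varepsilon}{2} < \varepsilon .$$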
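The geometric sum in this proof can be checked exactly. The following sketch only computes the lengths $l(I_n) = \varepsilon / 2^{n+2}$ of the covering intervals; the value $\varepsilon = 1/10$ and the function name `cover_lengths` are assumptions made for the illustration.

```python
from fractions import Fraction


def cover_lengths(eps, count):
    """Lengths of I_0, ..., I_{count-1}, where I_n has radius eps / 2^(n+3)."""
    return [2 * eps / 2 ** (n + 3) for n in range(count)]


eps = Fraction(1, 10)
for N in (0, 1, 5, 20):
    total = sum(cover_lengths(eps, N + 1))
    # Closed form of the partial sum derived in the proof.
    assert total == eps / 2 * (1 - Fraction(1, 2) ** (N + 1))
    # Every partial sum stays below eps / 2, hence below eps.
    assert total < eps / 2 < eps
    print(N, total)
```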

So far, we proved existence of the integral only in a few simple cases. The following celebrated theorem due to Henri Lebesgue changes this. It gives a characterization of Riemann-integrability. Because of its technical character, the proof is transferred to the Appendix.

Theorem 2.6.13. (Lebesgue’s criterion for Riemann-integrability) Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $D$ be the set of discontinuities of $f$. Then $f$ is Riemann-integrable if and only if $D$ is a set of measure zero.

Proof. See the proof of Theorem 5.2.6 in the Appendix.

Remark 2.6.14. A property is said to hold almost everywhere on a subset $S$ of $\mathbb{R}$ if it holds everywhere on $S$ except for a set of measure zero. Thus, Theorem 2.6.13 states that a bounded function on a non-trivial bounded and closed interval of $\mathbb{R}$ is Riemann-integrable if and only if the function is almost everywhere continuous.

Since

$$|f(x)| = \sqrt{[f(x)]^2}$$

for every $x \in [a, b]$, if $f$ is bounded and Riemann-integrable on the interval $[a, b]$ of $\mathbb{R}$, where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$, we conclude by application of the previous theorem, Theorem 2.6.13, that also $|f|$ is bounded and Riemann-integrable. Since

$$-|f(x)| \le f(x) \le |f(x)|$$

for all $x \in [a, b]$, it follows by the monotony of the Riemann integral, Corollary 2.6.9, that

$$-\int_a^b |f(x)| \, dx \le \int_a^b f(x) \, dx \le \int_a^b |f(x)| \, dx$$

and hence that

$$\left| \int_a^b f(x) \, dx \right| \le \int_a^b |f(x)| \, dx .$$

The last estimate is frequently applied. For a first application, see Example 2.6.16. As a consequence, we proved the following theorem.

Theorem 2.6.15. Let $f$ be bounded and Riemann-integrable on the interval $[a, b]$ of $\mathbb{R}$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a \le b$. Then $|f|$ is bounded and Riemann-integrable and

$$\left| \int_a^b f(x) \, dx \right| \le \int_a^b |f(x)| \, dx .$$

Example 2.6.16. For many functions that are important for applications, there are integral representations which are often crucial for the derivation of their properties. For instance, for every $n \in \mathbb{Z}$, the corresponding Bessel function of the first kind $J_n$ satisfies

$$J_n(x) = \frac{1}{\pi} \int_0^\pi \cos(x \sin\theta - n\theta) \, d\theta$$

for all $x \in \mathbb{R}$ and is a solution of the differential equation

$$x^2 f''(x) + x f'(x) + (x^2 - n^2) f(x) = 0$$

for all $x \in \mathbb{R}$. By Theorem 2.6.15 and Corollary 2.6.9, the simple estimate

$$|J_n(x)| \le \frac{1}{\pi} \int_0^\pi |\cos(x \sin\theta - n\theta)| \, d\theta \le \frac{1}{\pi} \int_0^\pi d\theta = 1$$

follows for all $x \in \mathbb{R}$, and hence $J_n$ is a bounded function. Bessel functions occur frequently in the description of physical systems that are ‘axially symmetric’, i.e., symmetric with respect to rotations around an axis.

Fig. 64: Graph of $J_0$. (Horizontal axis: $x$; vertical axis: $y$.)
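The estimate $|J_n(x)| \le 1$ can be observed numerically by approximating the integral representation with a sum. The sketch below uses a midpoint rule with 2000 subintervals; both the quadrature rule and the name `bessel_j` are choices made here for illustration and are not taken from the text.

```python
import math


def bessel_j(n, x, steps=2000):
    """Midpoint-rule approximation of (1/pi) * integral over [0, pi] of cos(x sin(theta) - n theta)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.cos(x * math.sin(theta) - n * theta)
    return total * h / math.pi


for n in (0, 1, 2):
    for x in (-20.0, -5.0, 0.0, 1.0, 5.0, 20.0):
        value = bessel_j(n, x)
        # Each term of the sum has absolute value at most 1, so the
        # approximation obeys the same bound (up to rounding) as J_n itself.
        assert abs(value) <= 1.0 + 1e-12
        print(n, x, value)
```

For $x = 0$ the integrand reduces to $\cos(n\theta)$, so the approximation returns a value close to $1$ for $n = 0$ and close to $0$ for $n = 1$, in agreement with the integral representation.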

Within the definition of Riemann-integrability above, we also defined the area under the graph of a bounded integrable function that assumes only positive ($\ge 0$) values in terms of its integral. Geometric intuition suggests that areas are additive, that is, if $A$ is the set under the graph of a bounded integrable function and $A$ is the disjoint union of two such sets $B$ and $C$, we expect that the area of $A$ is equal to the sum of the areas of $B$ and $C$. Indeed, in the following it will be shown that this intuition is reflected in the additivity of the integral.

Theorem 2.6.17. (Additivity of upper and lower integrals) Let $f : [a, b] \to \mathbb{R}$ be bounded where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$ and $c \in [a, b]$. Then

$$\sup(\{L(f, P) : P \in \mathcal{P}\}) = \sup(\{L(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) + \sup(\{L(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) ,$$

$$\inf(\{U(f, P) : P \in \mathcal{P}\}) = \inf(\{U(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) + \inf(\{U(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) ,$$

where $\mathcal{P}, \mathcal{P}_{[a,c]}, \mathcal{P}_{[c,b]}$ denote the sets consisting of all partitions of $[a, b]$, $[a, c]$ and $[c, b]$, respectively.

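Proof. For this, let $P_1 = (a_0, \dots, a_\nu) \in \mathcal{P}_{[a,c]}$ and $P_2 = (a_{\nu+1}, \dots, a_{\nu+\mu}) \in \mathcal{P}_{[c,b]}$, where $\nu, \mu$ are some elements of $\mathbb{N}^*$, and

$$P := (a_0, \dots, a_\nu, a_{\nu+1}, \dots, a_{\nu+\mu})$$

the corresponding element of $\mathcal{P}$. Then

$$L(f, P) = L(f|_{[a,c]}, P_1) + L(f|_{[c,b]}, P_2) ,$$

$$U(f, P) = U(f|_{[a,c]}, P_1) + U(f|_{[c,b]}, P_2) .$$

Now let $\varepsilon > 0$. Obviously, because of Lemma 2.6.3, we can assume without restriction that $P$ is such that

$$\sup(\{L(f, P) : P \in \mathcal{P}\}) - L(f, P) \le \frac{\varepsilon}{3} ,$$

$$\sup(\{L(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) - L(f|_{[a,c]}, P_1) \le \frac{\varepsilon}{3} ,$$

$$\sup(\{L(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) - L(f|_{[c,b]}, P_2) \le \frac{\varepsilon}{3} .$$

Then also

$$\left| \sup(\{L(f, P) : P \in \mathcal{P}\}) - \sup(\{L(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) - \sup(\{L(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) \right| \le \varepsilon .$$

Analogously, because of Lemma 2.6.3, we can also assume without restriction that $P$ is such that

$$U(f, P) - \inf(\{U(f, P) : P \in \mathcal{P}\}) \le \frac{\varepsilon}{3} ,$$

$$U(f|_{[a,c]}, P_1) - \inf(\{U(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) \le \frac{\varepsilon}{3} ,$$

$$U(f|_{[c,b]}, P_2) - \inf(\{U(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) \le \frac{\varepsilon}{3} .$$

Then also

$$\left| \inf(\{U(f, P) : P \in \mathcal{P}\}) - \inf(\{U(f|_{[a,c]}, P) : P \in \mathcal{P}_{[a,c]}\}) - \inf(\{U(f|_{[c,b]}, P) : P \in \mathcal{P}_{[c,b]}\}) \right| \le \varepsilon .$$

Since $\varepsilon > 0$ is otherwise arbitrary, the statements follow.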

Corollary 2.6.18. (Additivity of the Riemann integral) Let $f : [a, b] \to \mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a \le b$, and $c \in [a, b]$. Then

$$\int_a^b f(x) \, dx = \int_a^c f(x) \, dx + \int_c^b f(x) \, dx .$$

Proof. The statement is a simple consequence of Theorem 2.6.13 and Theorem 2.6.17.

So far, we calculated the value of the integral only in some simple cases and from its definition. At the moment, by help of the linearity of the integral and the results in these cases, we can calculate integrals of linear functions over bounded closed intervals of $\mathbb{R}$ only. The next fundamental theorem will give us a powerful tool for such calculations. Below, that fundamental theorem will be given in two variations. Both are direct consequences of the additivity. The first displays that integration and differentiation are inverse processes. The second is a consequence of the first. For a certain class of integrands, it allows the calculation of the integral from the knowledge of the values of an antiderivative of its integrand at the ends of the interval of integration.

Theorem 2.6.19. Let $f : [a, b] \to \mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Then $F : [a, b] \to \mathbb{R}$ defined by

$$F(x) := \int_a^x f(t) \, dt$$

for every $x \in [a, b]$ is continuous. Furthermore, if $f$ is continuous in some point $x \in (a, b)$, then $F$ is differentiable in $x$ and

$$F'(x) = f(x) .$$

Proof. For $x, y \in [a, b]$, it follows by Corollaries 2.6.18, 2.6.9 that

$$|F(y) - F(x)| = \left| \int_x^y f(t) \, dt \right| \le M \cdot |y - x|$$

if $y \ge x$ as well as that

$$|F(y) - F(x)| = \left| \int_y^x f(t) \, dt \right| \le M \cdot |y - x|$$

if $y < x$, where $M \ge 0$ is such that $|f(t)| \le M$ for all $t \in [a, b]$, and hence the continuity of $F$. Further, let $f$ be continuous in some point $x \in (a, b)$. Hence given $\varepsilon > 0$, there is $\delta > 0$ such that

$$|f(t) - f(x)| < \varepsilon$$

for all $t \in [a, b]$ such that $|t - x| < \delta$. (Otherwise, there is some $\varepsilon > 0$ along with a sequence $t_0, t_1, \dots$ in $[a, b]$ such that $|f(t_n) - f(x)| \ge \varepsilon$ and $|t_n - x| < 1/n$ for all $n \in \mathbb{N}$. Then $t_0, t_1, \dots$ is converging to $x$, but $f(t_0), f(t_1), \dots$ is not convergent to $f(x)$.) Now let $h \in \mathbb{R}^*$ be such that $|h| < \delta$ and small enough such that $x + h \in (a, b)$. We consider the cases $h > 0$ and $h < 0$. In the first case, it follows by Theorem 2.6.13 and Corollaries 2.6.18, 2.6.9 that

$$\begin{aligned}
\left| \frac{F(x + h) - F(x)}{h} - f(x) \right| &= \left| \frac{1}{h} \left[ \int_a^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \right] - f(x) \right| \\
&= \left| \frac{1}{h} \left[ \int_a^x f(t) \, dt + \int_x^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \right] - f(x) \right| \\
&= \left| \frac{1}{h} \int_x^{x+h} [f(t) - f(x)] \, dt \right| \le \frac{1}{h} \int_x^{x+h} |f(t) - f(x)| \, dt \le \varepsilon .
\end{aligned}$$

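Analogously, in the second case it follows that

$$\begin{aligned}
\left| \frac{F(x + h) - F(x)}{h} - f(x) \right| &= \left| \frac{1}{h} \left[ \int_a^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \right] - f(x) \right| \\
&= \left| \frac{1}{h} \left[ \int_a^{x+h} f(t) \, dt - \int_a^{x+h} f(t) \, dt - \int_{x+h}^x f(t) \, dt \right] - f(x) \right| \\
&= \left| -\frac{1}{h} \int_{x+h}^x [f(t) - f(x)] \, dt \right| \le \frac{1}{|h|} \int_{x+h}^x |f(t) - f(x)| \, dt \le \varepsilon .
\end{aligned}$$

Hence it follows that

$$\lim_{h \to 0, h \ne 0} \frac{F(x + h) - F(x)}{h} = f(x)$$

and that $F$ is differentiable in $x$ with derivative $f(x)$.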

Remark 2.6.20. Note that because of Theorem 2.6.13, the function $F$ in Theorem 2.6.19 is differentiable with derivative $f(x)$ for almost all $x \in (a, b)$.

Theorem 2.6.21. (Fundamental Theorem of Calculus) Let $f : [a, b] \to \mathbb{R}$ be bounded and Riemann-integrable where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Further, let $F$ be a continuous function on $[a, b]$ as well as differentiable on $(a, b)$ such that $F'(x) = f(x)$ for all $x \in (a, b)$. Then

$$\int_a^b f(x) \, dx = F(b) - F(a) .$$

In calculations, we sometimes use the notation

$$[F(x)] \big|_a^b := F(b) - F(a) .$$

Proof. Let $\varepsilon > 0$ and $P = (a_0, \dots, a_\nu)$ be a partition of $[a, b]$ where $\nu$ is an element of $\mathbb{N}^*$. By Theorem 2.5.6, for every $j \in \{0, 1, \dots, \nu - 1\}$ there is a corresponding $c_j \in [a_j, a_{j+1}]$ such that

$$F(a_{j+1}) - F(a_j) = F'(c_j)(a_{j+1} - a_j) ,$$

where we define $F'(a) := f(a)$ and $F'(b) := f(b)$. Hence

$$F(b) - F(a) = \sum_{j=0}^{\nu-1} [F(a_{j+1}) - F(a_j)] = \sum_{j=0}^{\nu-1} f(c_j)(a_{j+1} - a_j)$$

and

$$L(f, P) \le F(b) - F(a) \le U(f, P) .$$

Hence

$$\sup(\{L(f, P) : P \in \mathcal{P}\}) \le F(b) - F(a) \le \inf(\{U(f, P) : P \in \mathcal{P}\}) = \sup(\{L(f, P) : P \in \mathcal{P}\}) ,$$

and therefore the integral of $f$ over $[a, b]$ is equal to $F(b) - F(a)$.

Example 2.6.22. Calculate

$$\int_0^\pi 7 \sin\left( \frac{x}{3} \right) dx .$$

Solution: By

$$f(x) := 7 \sin\left( \frac{x}{3} \right)$$

for all $x \in [0, \pi]$, there is defined a continuous and hence Riemann-integrable function on $[0, \pi]$. Further, by

$$F(x) := -21 \cos\left( \frac{x}{3} \right)$$

for all $x \in [0, \pi]$, there is defined a continuous function on $[0, \pi]$ which is differentiable on $(0, \pi)$ such that $F'(x) = f(x)$ for all $x \in (0, \pi)$. Hence by Theorem 2.6.21

$$\int_0^\pi 7 \sin\left( \frac{x}{3} \right) dx = -21 \cos\left( \frac{\pi}{3} \right) + 21 \cos(0) = 21 - \frac{21}{2} = \frac{21}{2} .$$
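The value $21/2$ can be compared with a direct approximation of the integral by sums. The sketch below uses midpoint sums, which lie between the lower and upper sums of the same equidistant partition because each sampled value lies between the infimum and supremum on its subinterval; the name `midpoint_sum` and the choice of 1000 subintervals are assumptions made for the illustration.

```python
import math


def midpoint_sum(f, a, b, n):
    """Sum of f at the midpoints of n subintervals of equal length, times that length."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


def f(x):
    return 7.0 * math.sin(x / 3.0)


def F(x):
    """Antiderivative of f used in Example 2.6.22."""
    return -21.0 * math.cos(x / 3.0)


exact = F(math.pi) - F(0.0)
approx = midpoint_sum(f, 0.0, math.pi, 1000)
print(exact, approx)

# Both values agree with 21/2 up to the accuracy of the approximation.
assert abs(exact - 10.5) < 1e-12
assert abs(approx - exact) < 1e-5
```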

Example 2.6.23. A simple number theoretic function is the greatest integeror floor function defined by rxs :“ n for all x P rn, n ` 1q and n P N.Calculate

ż x

0

rys dy ,

ż 0

x

rys dy

for all x ě 0 and x ă 0, respectively. Solution: Note that the greatestinteger functions is almost everywhere continuous and hence according to

239

-4 -2 2 4x

-4

-2

2

4y

Fig. 65: Graph of the greatest integer function and anti-derivative.

Theorem 2.6.13 also Riemann-integrable on any closed interval of R. Forevery n P N and every x P rn, n ` 1q, it follows by Corollary 2.6.18 andTheorem 2.6.21 that

ż x

0

rys dy “

ż n

0

rys dy `

ż x

n

rys dy “n´1ÿ

k“0

ż k`1

k

rys dy `

ż x

n

n dy

“

n´1ÿ

k“0

ż k`1

k

k dy ` npx´ nq “n´1ÿ

k“0

k ` npx´ nq

“n

2pn´ 1q ` npx´ nq “ n

ˆ

x´n` 1

2

˙

“ rxs

ˆ

x´1` rxs

2

˙

.

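Analogously, it follows for every $n \in \mathbb{Z}$ such that $n \le -1$ and every $x \in [n, n+1)$ that

$$\int_x^0 [y] \, dy = \int_x^{n+1} [y] \, dy + \int_{n+1}^0 [y] \, dy = \int_x^{n+1} n \, dy + \sum_{k=n+1}^{-1} \int_k^{k+1} [y] \, dy$$

$$= n(n + 1 - x) + \sum_{k=n+1}^{-1} \int_k^{k+1} k \, dy = n(n + 1 - x) + \sum_{k=n+1}^{-1} k$$

$$= n(n + 1 - x) - \frac{n}{2}(n + 1) = n \left( \frac{n+1}{2} - x \right) = -[x] \left( x - \frac{1 + [x]}{2} \right) .$$

See Fig. 65.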
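The closed forms of Example 2.6.23 can be compared with a direct numerical approximation. The sketch below is only an illustration: `floor_integral` and `midpoint_sum` are hypothetical helper names, and the sample points 3.7 and -2.5 are arbitrary choices.

```python
import math

def floor_integral(x):
    # Closed forms from Example 2.6.23:
    #   x >= 0: int_0^x [y] dy =  [x] (x - (1 + [x]) / 2)
    #   x <  0: int_x^0 [y] dy = -[x] (x - (1 + [x]) / 2)
    n = math.floor(x)
    value = n * (x - (1 + n) / 2)
    return value if x >= 0 else -value

def midpoint_sum(f, a, b, steps=100_000):
    # Midpoint Riemann sum of f over [a, b] with equal subintervals.
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))

pos = midpoint_sum(math.floor, 0.0, 3.7)    # approximates int_0^3.7 [y] dy
neg = midpoint_sum(math.floor, -2.5, 0.0)   # approximates int_-2.5^0 [y] dy
print(pos, floor_integral(3.7))
print(neg, floor_integral(-2.5))
```

Because the integrand jumps at the integers, the midpoint sum is only accurate up to a few multiples of the step width, which is sufficient for a plausibility check.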

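A basic method for the evaluation of integrals with trigonometric integrands consists in the application of the addition theorems for sine and cosine.

Example 2.6.24. Calculate

$$\int_0^\pi \sin(m\theta) \sin(n\theta) \, d\theta$$

where $m, n \in \mathbb{N}^*$. Solution: By help of the addition theorem for the cosine function, it follows that

$$\cos((m + n)\theta) = \cos(m\theta) \cos(n\theta) - \sin(m\theta) \sin(n\theta) ,$$

$$\cos((m - n)\theta) = \cos(m\theta) \cos(n\theta) + \sin(m\theta) \sin(n\theta) ,$$

and hence that

$$\sin(m\theta) \sin(n\theta) = \frac{1}{2} \left[ \cos((m - n)\theta) - \cos((m + n)\theta) \right]$$

for all $\theta \in \mathbb{R}$. This leads to

$$\int_0^\pi \sin(m\theta) \sin(n\theta) \, d\theta = \frac{1}{2} \int_0^\pi \left[ \cos((m - n)\theta) - \cos((m + n)\theta) \right] d\theta$$

$$= \frac{1}{2(m - n)} \left[ \sin((m - n)\theta) \right]_0^\pi - \frac{1}{2(m + n)} \left[ \sin((m + n)\theta) \right]_0^\pi = 0$$

if $m \ne n$ and

$$\int_0^\pi \sin(m\theta) \sin(n\theta) \, d\theta = \frac{1}{2} \int_0^\pi \left[ 1 - \cos((m + n)\theta) \right] d\theta$$

$$= \frac{\pi}{2} - \frac{1}{2(m + n)} \left[ \sin((m + n)\theta) \right]_0^\pi = \frac{\pi}{2}$$

if $m = n$.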
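The two cases $m \ne n$ and $m = n$ can be checked numerically. The following Python sketch uses a midpoint Riemann sum; the function names and the sample frequencies are hypothetical choices for illustration.

```python
import math

def midpoint_sum(f, a, b, steps=10_000):
    # Midpoint Riemann sum of f over [a, b] with equal subintervals.
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))

def sine_product_integral(m, n):
    # Approximates int_0^pi sin(m t) sin(n t) dt for positive integers m, n.
    return midpoint_sum(lambda t: math.sin(m * t) * math.sin(n * t), 0.0, math.pi)

different = sine_product_integral(2, 3)   # case m != n, expected near 0
equal = sine_product_integral(3, 3)       # case m == n, expected near pi/2
print(different, equal, math.pi / 2)
```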


Example 2.6.25. Find the solutions of the following (‘differential’) equation for $f : \mathbb{R} \to \mathbb{R}$:

$$f'(x) = e^{2x} + \sin(3x) \tag{2.6.5}$$

for all $x \in \mathbb{R}$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Hence it follows by Theorem 2.6.21 that

$$f(x) - f(x_0) = \int_{x_0}^x f'(y) \, dy = \int_{x_0}^x \left( e^{2y} + \sin(3y) \right) dy$$

$$= \left[ \frac{1}{2} e^{2y} - \frac{1}{3} \cos(3y) \right]_{x_0}^x = \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) - \frac{1}{2} e^{2x_0} + \frac{1}{3} \cos(3x_0)$$

where $x_0 \in \mathbb{R}$ and $x > x_0$. Hence

$$f(x) = \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) + c ,$$

for all $x \in \mathbb{R}$ where

$$c = f(0) - \frac{1}{2} + \frac{1}{3} = f(0) - \frac{1}{6} .$$

On the other hand, if $c \in \mathbb{R}$ and $f_c : \mathbb{R} \to \mathbb{R}$ is defined by

$$f_c(x) := \frac{1}{2} e^{2x} - \frac{1}{3} \cos(3x) + c$$

for all $x \in \mathbb{R}$, then it follows by direct calculation that $f_c$ satisfies (2.6.5) for all $x \in \mathbb{R}$. Hence the solutions of the differential equation are given by the family of functions $f_c$, $c \in \mathbb{R}$. Note that $c = f(0) - (1/6)$. Hence for every prescribed ‘initial value’ $f(0)$, there is precisely one solution of the differential equation, namely $f_c$ with $c = f(0) - (1/6)$. The same is true for initial values given in any other point of $\mathbb{R}$.

Example 2.6.26. Find the solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$.

$$f'(x) + a f(x) = 3 \tag{2.6.6}$$

for all $x \in \mathbb{R}$ where $a \in \mathbb{R}$, $a \ne 0$ (the division by $a$ below requires this). Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Further, by using the auxiliary function $h : \mathbb{R} \to \mathbb{R}$ defined by

$$h(x) := e^{ax}$$

for every $x \in \mathbb{R}$, it follows that

$$(hf)'(x) = h(x) f'(x) + h'(x) f(x) = e^{ax} f'(x) + a e^{ax} f(x) = e^{ax} [f'(x) + a f(x)] = 3 e^{ax}$$

for all $x \in \mathbb{R}$. Hence it follows by Theorem 2.6.21 that

$$(hf)(x) - (hf)(x_0) = \int_{x_0}^x 3 e^{ay} \, dy = \frac{3}{a} e^{ax} - \frac{3}{a} e^{ax_0}$$

and therefore that

$$(hf)(x) = (hf)(x_0) + \frac{3}{a} (1 - e^{ax_0}) + \frac{3}{a} (e^{ax} - 1)$$

for $x > x_0$ where $x_0 \in \mathbb{R}$. From this, we conclude that

$$(hf)(x) = \frac{3}{a} (e^{ax} - 1) + c$$

and

$$f(x) = \frac{3}{a} (1 - e^{-ax}) + c \, e^{-ax}$$

for all $x \in \mathbb{R}$ where $c = f(0)$. On the other hand, if $c \in \mathbb{R}$ and $f_c : \mathbb{R} \to \mathbb{R}$ is defined by

$$f_c(x) := \frac{3}{a} (1 - e^{-ax}) + c \, e^{-ax}$$

for all $x \in \mathbb{R}$, then it follows by direct calculation that $f_c$ satisfies (2.6.6) for all $x \in \mathbb{R}$. Hence the solutions of the differential equation are given by the family of functions $f_c$, $c \in \mathbb{R}$. Note that $c = f(0)$. Hence for every $c \in \mathbb{R}$, there is precisely one solution of the differential equation with ‘initial value’ $f(0) = c$. The same is true for initial values given in any other point of $\mathbb{R}$.
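The ‘direct calculation’ for Example 2.6.26 can be mirrored numerically. The following Python sketch approximates $f_c'$ by a central difference and evaluates the residual of (2.6.6). The parameter values $a = 2$ and $c = 0.5$, the step size and the function names are arbitrary assumptions made for this illustration.

```python
import math

def f_c(x, a, c):
    # Candidate solution of f'(x) + a f(x) = 3 from Example 2.6.26 (a != 0).
    return (3.0 / a) * (1.0 - math.exp(-a * x)) + c * math.exp(-a * x)

def residual(x, a, c, h=1e-5):
    # Central difference approximation of f_c'(x) + a f_c(x) - 3.
    derivative = (f_c(x + h, a, c) - f_c(x - h, a, c)) / (2.0 * h)
    return derivative + a * f_c(x, a, c) - 3.0

a, c = 2.0, 0.5   # sample parameter and initial value, chosen arbitrarily
worst = max(abs(residual(k / 10.0, a, c)) for k in range(-10, 21))
print(f_c(0.0, a, c), worst)   # initial value and largest residual on [-1, 2]
```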


Problems

1) Calculate

a)ż 3

2

px2 ` 5x` 7q dx , b)ż 2

0

9 t23 dt ,

c)ż 2

´1

pe2x ´ 3xq dx , d)ż 2

´3

sinpπxq

πdx ,

e)ż 2

1

ˆ

4?x´ 3x

˙

dx , f)ż 4

1

?3x` 2 dx ,

g)ż 1

0

x

2x2 ` 1dx , h)

ż 5

2

x2 ´ 3x` 5

x3dx ,

i)ż π2

0

1

3sinpx2q cospx2q dx dx ,

j)ż π

π2

4

5sinpxq cos2pxq dx , k)

ż 3

1

|3x´ 4| dx ,

l)ż 3

´1

“

x` | 5x´ 3 |2‰

dx ,

m)ż 2

´2

„

1

3|x´ 1 | ` 4 |x` 1 |

dx ,

n)ż 5

´2

5

4|x´ 1 | ¨ |x` 1 | ¨ |x` 2 | dx ,

o)ż 3π

0

| sinpx2q| dx , p)ż π6

´π6

| cosp3xq| dx ,

q)ż 2π

0

sinpmθq sinpnθq dθ , r)ż 2π

0

sinpmθq cospnθq dθ ,

s)ż 2π

0

cospmθq cospnθq dθ

where m,n P N˚.


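2) Define $f : \mathbb{R} \to \mathbb{R}$ by

$$f(x) := \begin{cases} \sqrt{3} \, (1 + x) & \text{if } x \le -1/2 \\ \sqrt{3}/2 & \text{if } -1/2 < x < 1/2 \\ \sqrt{3} \, (1 - x) & \text{if } 1/2 \le x \end{cases}$$

for every $x \in \mathbb{R}$. Calculate the area in $\mathbb{R}^2$ that is enclosed by the graph of $f$ and the x-axis. Verify your result using facts from elementary geometry. Use the result to calculate the area enclosed by a hexagon of side length 1.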

3) Calculate the area in $\mathbb{R}^2$ that is enclosed by the graphs of the polynomials

$$p_1(x) := -1 + (7/2) x - x^2 , \qquad p_2(x) := 4 - (7/2) x + x^2$$

where $x \in \mathbb{R}$.

4) Calculate the area in $\mathbb{R}^2$ that is enclosed by the curve

$$C := \{ (x, y) \in \mathbb{R}^2 : y^2 - 4x^2 + 4x^4 = 0 \} .$$

5) Show that

$$\cos(x) \le 1 - \frac{x^2}{\pi}$$

for all $x \in [0, \pi/2]$.

6) Find the solutions to the differential equation for f : RÑ R.

a) f 1pxq ´ 3fpxq “ x2 ,x P R ,b) f 1pxq ` 3fpxq “ ex4 ,x P R ,c) 2f 1pxq ´ fpxq “ 3 e´x ,x P R .

7) Consider the following differential equation for $f : \mathbb{R} \to \mathbb{R}$.

$$f''(x) = 3x + 4$$

for all $x \in \mathbb{R}$.

a) Find the solutions of this equation.
b) Find that solution which satisfies $f(0) = 1$ and $f'(0) = 2$.
c) Find that solution which satisfies $f(0) = 2$ and $f(1) = 3$.

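8) Calculate

$$a_0 := \frac{1}{2\pi} \int_0^{2\pi} f(x) \, dx , \qquad a_k := \frac{1}{\pi} \int_0^{2\pi} \cos(kx) f(x) \, dx , \qquad b_k := \frac{1}{\pi} \int_0^{2\pi} \sin(kx) f(x) \, dx$$

for all $k \in \mathbb{N}^*$.

a) $$f(x) := \begin{cases} -1 & \text{if } x \in [0, \pi] \\ 1 & \text{if } x \in (\pi, 2\pi] \end{cases} ,$$

b) $f(x) := x$ for all $x \in [0, 2\pi]$ ,

c) $$f(x) := \begin{cases} x & \text{if } x \in [0, \pi] \\ 2\pi - x & \text{if } x \in (\pi, 2\pi] \end{cases} .$$

[Remark: These are the coefficients of the Fourier expansion of $f$. The representation

$$f(x) = \lim_{n \to \infty} \left\{ a_0 + \sum_{k=1}^n \left[ a_k \cos(kx) + b_k \sin(kx) \right] \right\}$$

is valid for every point $x \in [0, 2\pi]$ of continuity of $f$.]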

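9) Calculate the area in $\mathbb{R}^2$ that is enclosed by the ellipse

$$C := \left\{ (x, y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \right\}$$

where $a, b > 0$.

10) Calculate the area in $\mathbb{R}^2$ that is enclosed by the branches of hyperbolas

$$C_1 := \left\{ (x, y) \in \mathbb{R}^2 : y \ge 0 \wedge \frac{y^2}{a^2} - \frac{x^2}{b^2} = 1 \right\} ,$$

$$C_2 := \left\{ (x, y) \in \mathbb{R}^2 : y \le c \wedge \frac{(y - c)^2}{a^2} - \frac{x^2}{b^2} = 1 \right\}$$

where $a, b > 0$ and $c > a$.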

11) Let $a, b \in \mathbb{R}$ be such that $a < b$. Further, let $f : [a, b] \to \mathbb{R}$ be positive, i.e., such that $f(x) \ge 0$ for all $x \in [a, b]$, and assume a value $> 0$ in some point of $[a, b]$. Show that

$$\int_a^b f(x) \, dx > 0 .$$

12) Let $a, b \in \mathbb{R}$ be such that $a < b$. Further, let $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ be bounded and Riemann-integrable. Show the following Cauchy-Schwarz inequality for integrals:

$$\left( \int_a^b f(x) g(x) \, dx \right)^2 \le \left( \int_a^b f^2(x) \, dx \right) \left( \int_a^b g^2(x) \, dx \right) .$$

In addition, show that equality holds if and only if there are $\alpha, \beta \in \mathbb{R}$ satisfying $\alpha^2 + \beta^2 \ne 0$ and such that $\alpha f + \beta g = 0$.

[Hint: Consider

$$\int_a^b [ f(x) + \lambda \, g(x) ]^2 \, dx$$

as a function of $\lambda \in \mathbb{R}$.]

13) Newton’s equation of motion for a point particle of mass $m \ge 0$ moving on a straight line is given by

$$m f''(t) = F(f(t)) \tag{2.6.7}$$

for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$ and $F(x)$ is the external force at the point $x$. For the specified force, calculate the solution function $f$ of (2.6.7) with initial position $f(0) = x_0$ and initial speed $f'(0) = v_0$ where $x_0, v_0 \in \mathbb{R}$.

a) $F(x) = 0$ , $x \in \mathbb{R}$ ,
b) $F(x) = F_0$ , $x \in \mathbb{R}$ , where $F_0$ is some real parameter .

14) Newton’s equation of motion for a point particle of mass $m \ge 0$ moving on a straight line under the influence of a viscous friction is given by

$$m f''(t) = -\lambda f'(t) \tag{2.6.8}$$

for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$ and $\lambda > 0$ is a parameter describing the strength of the friction. Calculate the solution function $f$ of (2.6.8) with initial position $f(0) = x_0$ and initial speed $f'(0) = v_0$ where $x_0, v_0 \in \mathbb{R}$. Investigate whether $f$ has a limit value for $t \to \infty$.

15) Newton’s equation of motion for a point particle of mass $m \ge 0$ moving on a straight line under the influence of low viscous friction, for instance friction exerted by air, is given by

$$m f''(t) = -\lambda \, (f'(t))^2 \tag{2.6.9}$$

for all $t \in \mathbb{R}$ where $f(t)$ is the position of the particle at time $t$ and $\lambda > 0$ is a parameter describing the strength of the friction. Find solutions $f$ of (2.6.9) with initial position $f(0) = x_0$ and initial speed $f'(0) = v_0$ where $x_0, v_0 \in \mathbb{R}$.

16) Consider a projectile that is shot into the atmosphere. According to Newton’s equation of motion, the height $f(t)$ above ground at time $t \in \mathbb{R}$ satisfies the equation

$$m f''(t) = -g - \lambda \, (f'(t))^2 \tag{2.6.10}$$

for all $t \in \mathbb{R}$ where $g = 9.81 \, \mathrm{m/s^2}$ is the acceleration due to gravity and $\lambda > 0$ is a parameter describing the strength of the friction. Find solutions $f$ of (2.6.10) with initial height $f(0) = z_0$ and initial speed component $f'(0) = v_0$ where $z_0, v_0 \in \mathbb{R}$.

3 Calculus II

3.1 Techniques of Integration

This section studies standard techniques of integration, namely the methods of change of variables (also referred to as ‘integration by substitution’), integration by parts, integration by decomposition of rational integrands into partial fractions and, finally, approximate numerical calculation of integrals.

3.1.1 Change of Variables

The method of change of variables (also referred to as ‘integration by substitution’) is based on the chain rule for differentiation. For motivation, we consider a continuously differentiable and increasing function $g$ defined on a non-trivial open interval $I$ of $\mathbb{R}$ and a continuously differentiable function $F$ that is defined on an open interval containing $\mathrm{Ran}(g)$. Further, let $c, d \in I$ be such that $c < d$.

Then it follows by the chain rule for differentiation that $F \circ g : I \to \mathbb{R}$ is continuously differentiable with derivative given by

$$(F \circ g)'(u) = F'(g(u)) \, g'(u)$$

for all $u \in I$. Further, it follows by the fundamental theorem of calculus, Theorem 2.6.21, that

$$\int_{g(c)}^{g(d)} F'(x) \, dx = F(g(d)) - F(g(c)) = (F \circ g)(d) - (F \circ g)(c) = \int_c^d (F \circ g)'(u) \, du = \int_c^d F'(g(u)) \, g'(u) \, du .$$

Hence by defining $f := F'$, we arrive at the formula for the change of variables

$$\int_{g(c)}^{g(d)} f(x) \, dx = \int_c^d f(g(u)) \, g'(u) \, du$$

for $f$. We note that the previous reasoning proves the validity of this equation if, in addition to the assumptions above on $g$, $c$ and $d$, $f$ is a continuous function that is defined on an open interval containing $\mathrm{Ran}(g)$ and for which there is an antiderivative $F$, i.e., for which there is a differentiable function $F : D(f) \to \mathbb{R}$ such that

$$F'(x) = f(x)$$

for all $x \in D(f)$. In the proof of the following theorem, the existence of such an antiderivative is concluded from the continuity of the function $f$ and the fundamental theorem of calculus in the form of Theorem 2.6.19.

Theorem 3.1.1. (Change of variables) Let $c, d \in \mathbb{R}$ be such that $c < d$. Further, let $g : [c, d] \to \mathbb{R}$ be continuous, such that $g(c) \le g(d)$, and continuously differentiable on $(c, d)$ with a derivative which can be extended to a continuous function on $[c, d]$. Finally, let $I$ be an open interval of $\mathbb{R}$ containing $g([c, d])$ and $f : I \to \mathbb{R}$ be continuous. Then

$$\int_{g(c)}^{g(d)} f(x) \, dx = \int_c^d f(g(u)) \cdot g'(u) \, du . \tag{3.1.1}$$

Proof. In the special case that $g$ is a constant function, the statement of the theorem is obviously true. In the remainder of this proof, we consider the case of a non-constant $g$. We denote by $g'$ the extension of the derivative of $g|_{(c,d)}$ to a continuous function on $[c, d]$ and define $G : [c, d] \to \mathbb{R}$ by

$$G(u) := \int_c^u f(g(u')) \cdot g'(u') \, du'$$

for all $u \in [c, d]$. By Theorem 2.6.19 it follows that $G$ is continuous as well as differentiable on $(c, d)$ with

$$G'(u) = f(g(u)) \cdot g'(u)$$

for all $u \in (c, d)$. Further, we define $F : [x_0, x_1] \to \mathbb{R}$ by

$$F(x) := \int_{x_0}^x f(x') \, dx' - \int_{x_0}^{g(c)} f(x') \, dx'$$

for all $x \in [x_0, x_1]$ where $x_0, x_1 \in I$ are such that $x_0$ is smaller than the minimum value of $g$ and $x_1$ is larger than the maximum value of $g$, respectively. By Theorem 2.6.19 it follows that $F$ is continuous as well as differentiable on $(x_0, x_1)$ with

$$F'(x) = f(x)$$

for all $x \in (x_0, x_1)$. Hence it follows by Theorems 2.3.51, 2.4.10 that $F \circ g$ is continuous as well as differentiable on $(c, d)$ with

$$(F \circ g)'(u) = f(g(u)) \cdot g'(u) = G'(u)$$

for all $u \in (c, d)$. From Theorem 2.5.7 and $F(g(c)) = G(c) = 0$, it follows that $F \circ g = G$ and hence by Corollary 2.6.18 also (3.1.1).
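Formula (3.1.1) can be tested numerically by approximating both sides separately. The following Python sketch does this for two sample choices of $f$ and $g$ that are not taken from the text: an increasing $g$ and a non-monotone $g$ which only satisfies $g(c) \le g(d)$. The helper names `midpoint_sum` and `both_sides` are hypothetical.

```python
import math

def midpoint_sum(f, a, b, steps=20_000):
    # Midpoint Riemann sum of f over [a, b] with equal subintervals.
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))

def both_sides(f, g, dg, c, d):
    # Left and right side of (3.1.1) for given f, g and g' on [c, d].
    lhs = midpoint_sum(f, g(c), g(d))
    rhs = midpoint_sum(lambda u: f(g(u)) * dg(u), c, d)
    return lhs, rhs

# Increasing g(u) = u^2 + u on [0, 1] with f = cos.
lhs1, rhs1 = both_sides(math.cos, lambda u: u * u + u, lambda u: 2 * u + 1, 0.0, 1.0)
# Non-monotone g(u) = sin(u) on [0, 2.5] with f = exp; here g(0) <= g(2.5).
lhs2, rhs2 = both_sides(math.exp, math.sin, math.cos, 0.0, 2.5)
print(lhs1, rhs1)
print(lhs2, rhs2)
```

The second case illustrates that Theorem 3.1.1, unlike the motivating discussion, does not assume that $g$ is increasing.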

Example 3.1.2. Calculate

$$\int_1^3 x \sqrt{x - 1} \, dx .$$

Solution: For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(u) := u + 1$ for all $u \in \mathbb{R}$. Then $g$ is increasing and continuously differentiable with a derivative function constant of value 1. In particular, $g(0) = 1$ and $g(2) = 3$. Further, we define the continuous function $f : \mathbb{R} \to \mathbb{R}$ by $f(x) := x \sqrt{|x - 1|}$ for all $x \in \mathbb{R}$. Hence it follows by Theorem 3.1.1 that

$$\int_1^3 x \sqrt{x - 1} \, dx = \int_{g(0)}^{g(2)} f(x) \, dx = \int_0^2 f(g(u)) \, g'(u) \, du = \int_0^2 (u + 1) \sqrt{u} \, du = \int_0^2 (u^{3/2} + u^{1/2}) \, du$$

$$= \left[ \frac{2}{5} u^{5/2} + \frac{2}{3} u^{3/2} \right]_0^2 = \left[ \frac{2}{15} (3u + 5) \, u^{3/2} \right]_0^2 = \frac{22}{15} \, 2^{3/2} = \frac{44}{15} \sqrt{2} .$$

Note that we could have achieved this result also by the following simpler reasoning.

$$\int_1^3 x \sqrt{x - 1} \, dx = \int_1^3 (x - 1 + 1) \sqrt{x - 1} \, dx = \int_1^3 \left[ (x - 1)^{3/2} + (x - 1)^{1/2} \right] dx$$

$$= \left[ \frac{2}{5} (x - 1)^{5/2} + \frac{2}{3} (x - 1)^{3/2} \right]_1^3 = \left[ \frac{2}{15} (3x + 2) \, (x - 1)^{3/2} \right]_1^3 = \frac{22}{15} \, 2^{3/2} = \frac{44}{15} \sqrt{2} .$$

Simple substitutions can often be avoided by application of such simple ‘tricks’. Below, we will give some examples where this is not the case.


Example 3.1.3. Calculate

$$\int_1^2 \frac{\sin(\ln x)}{x} \, dx .$$

Solution: For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(u) := e^u$ for all $u \in \mathbb{R}$. Then $g$ is increasing and continuously differentiable with derivative $g'(u) = e^u$ for all $u \in \mathbb{R}$. In particular, $g(0) = 1$ and $g(\ln 2) = \exp(\ln 2) = 2$. Further, we define the continuous function $f : (0, \infty) \to \mathbb{R}$ by $f(x) := \sin(\ln x)/x$ for all $x > 0$. Then, it follows by Theorem 3.1.1 that

$$\int_1^2 \frac{\sin(\ln x)}{x} \, dx = \int_{g(0)}^{g(\ln 2)} f(x) \, dx = \int_0^{\ln 2} f(g(u)) \, g'(u) \, du = \int_0^{\ln 2} \frac{\sin(\ln e^u)}{e^u} \, e^u \, du$$

$$= \int_0^{\ln 2} \sin u \, du = [ -\cos u ]_0^{\ln 2} = 1 - \cos(\ln 2) = 1 - \cos\left( 2 \, \frac{\ln 2}{2} \right)$$

$$= 1 - \cos^2\left( \frac{\ln 2}{2} \right) + \sin^2\left( \frac{\ln 2}{2} \right) = 2 \sin^2\left( \frac{\ln 2}{2} \right)$$

where, in particular, the addition theorem for the cosine was applied.

The continued simplification of the result $1 - \cos(\ln 2)$ is motivated by applications. Usually in applications, a calculation of the previous type is only a small step in a sequence of steps toward a final result. Hence, typically, such a result would be needed as input for the next step. Therefore, it is useful to reduce results in their ‘size’ in order to avoid a final result of even larger ‘size’. Usually, the implications of results of relatively large ‘size’ are less obvious. Note also that the final expression makes obvious the positivity of the integral, which is due to the positivity of the integrand in the interval of integration. The last can be seen from the inequality

$$0 \le \ln x \le x - 1 \le \pi$$

for all $x \in [1, 2]$ where the inequality (2.5.12) for the case $a = 1$ was applied. Quite generally, such a consistency check of the signs of results can avoid errors.

Also in this case, the application of change of variables could have been avoided. Usually, for a successful application of the method of change of variables, the presence of an ‘inner function’ in the integrand is needed. The function $g$ in Theorem 3.1.1 is then defined in such a way that this inner function is simplified. In many simple cases, the derivative of that inner function is also present in the integrand. Often, this can be used to ‘guess’ an antiderivative $F$ of the integrand. For instance in this case, an obvious candidate for an inner function is the natural logarithm function $\ln$. Since $\ln'(x) = 1/x$ for all $x > 0$, we see that its derivative is also present in the integrand. Hence a first guess (incorrect) for such $F$ might be

$$F(x) := \sin(\ln x)$$

for all $x > 0$. Then it would follow by the chain rule for differentiation that

$$F'(x) = \cos(\ln x) \cdot \frac{1}{x} = \frac{\cos(\ln x)}{x}$$

for all $x > 0$. $F'$ does not coincide with the integrand on the interval $[1, 2]$ because of the presence of the cosine function instead of the sine function. Of course, there is a simple remedy for this. A second (correct) guess for such $F$ would be

$$F(x) := -\cos(\ln x)$$

for all $x > 0$. As a consequence of the chain rule for differentiation, this gives

$$F'(x) = \sin(\ln x) \cdot \frac{1}{x} = \frac{\sin(\ln x)}{x}$$

and hence that $F$ is an antiderivative of the integrand. Hence, we conclude by the fundamental theorem of calculus that

$$\int_1^2 \frac{\sin(\ln x)}{x} \, dx = [ -\cos(\ln x) ]_1^2 = 1 - \cos(\ln 2) = 2 \sin^2\left( \frac{\ln 2}{2} \right) .$$

We now give in succession four examples of more serious applications of change of variables. The first three give standard trigonometric substitutions whose goal is the removal of square roots in integrands. The fourth example gives a standard substitution that is used to transform rational expressions in sine and cosine functions of the same argument into rational expressions of the new variable.

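Example 3.1.4. Calculate

$$\int_0^x \frac{dy}{\sqrt{y^2 + a^2}}$$

for every $x > 0$ where $a > 0$. Solution: Define $g : (-\pi/2, \pi/2) \to \mathbb{R}$ by

$$g(\theta) := a \cdot \tan\theta$$

for all $\theta \in (-\pi/2, \pi/2)$. Then $g$ is bijective as well as continuously differentiable such that

$$g'(\theta) = a \cdot (1 + \tan^2\theta)$$

for all $\theta \in (-\pi/2, \pi/2)$. The inverse $g^{-1}$ is given by

$$g^{-1}(x) := \arctan\left( \frac{x}{a} \right)$$

for all $x \in \mathbb{R}$. By Theorem 3.1.1

$$\int_0^x \frac{dy}{\sqrt{y^2 + a^2}} = \int_{g(0)}^{g(g^{-1}(x))} \frac{dy}{\sqrt{y^2 + a^2}} = \int_0^{g^{-1}(x)} \frac{g'(\theta) \, d\theta}{\sqrt{(g(\theta))^2 + a^2}}$$

$$= \int_0^{g^{-1}(x)} \frac{d\theta}{\cos\theta} = \ln\left( \frac{1 + \sin(g^{-1}(x))}{\cos(g^{-1}(x))} \right) = \ln\left( \frac{x}{a} + \sqrt{1 + \frac{x^2}{a^2}} \right) .$$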

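Example 3.1.5. Calculate

$$\int_0^2 \sqrt{9 - x^2} \, dx .$$

Solution: Define $g : (-\pi/2, \pi/2) \to (-3, 3)$ by $g(\theta) := 3 \sin\theta$ for all $\theta \in (-\pi/2, \pi/2)$. Then $g$ is bijective as well as continuously differentiable such that

$$g'(\theta) = 3 \cos\theta$$

for all $\theta \in (-\pi/2, \pi/2)$. The inverse $g^{-1}$ is given by

$$g^{-1}(x) = \arcsin\left( \frac{x}{3} \right)$$

for all $x \in (-3, 3)$. By Theorem 3.1.1

$$\int_0^2 \sqrt{9 - x^2} \, dx = \int_{g(0)}^{g(\arcsin(2/3))} \sqrt{9 - x^2} \, dx = \int_0^{\arcsin(2/3)} \sqrt{9 - (g(\theta))^2} \cdot g'(\theta) \, d\theta$$

$$= 9 \int_0^{\arcsin(2/3)} \cos^2\theta \, d\theta = \frac{9}{2} \int_0^{\arcsin(2/3)} (1 + \cos(2\theta)) \, d\theta$$

$$= \frac{9}{2} \left[ \arcsin\left( \frac{2}{3} \right) + \frac{1}{2} \sin(2 \arcsin(2/3)) \right] = \frac{9}{2} \left[ \arcsin\left( \frac{2}{3} \right) + \frac{2}{3} \cos(\arcsin(2/3)) \right]$$

$$= \sqrt{5} + \frac{9}{2} \arcsin\left( \frac{2}{3} \right) .$$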


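Example 3.1.6. Calculate

$$\int_4^5 x^{-4} \cdot \sqrt{x^2 - 9} \, dx .$$

Solution: Define $g : (0, \pi/2) \to (3, \infty)$ by $g(\theta) := 3/\cos\theta$ for all $\theta \in (0, \pi/2)$. Then $g$ is bijective as well as continuously differentiable such that

$$g'(\theta) = 3 \cdot \frac{\sin\theta}{\cos^2\theta}$$

for all $\theta \in (0, \pi/2)$. The inverse $g^{-1}$ is given by

$$g^{-1}(x) = \arccos\left( \frac{3}{x} \right)$$

for all $x \in (3, \infty)$. By Theorem 3.1.1

$$\int_4^5 x^{-4} \cdot \sqrt{x^2 - 9} \, dx = \int_{g(\arccos(3/4))}^{g(\arccos(3/5))} x^{-4} \cdot \sqrt{x^2 - 9} \, dx$$

$$= \int_{\arccos(3/4)}^{\arccos(3/5)} (g(\theta))^{-4} \cdot \sqrt{(g(\theta))^2 - 9} \cdot g'(\theta) \, d\theta = \frac{1}{9} \int_{\arccos(3/4)}^{\arccos(3/5)} \cos\theta \, \sin^2\theta \, d\theta$$

$$= \frac{1}{27} \left[ \sin^3(\arccos(3/5)) - \sin^3(\arccos(3/4)) \right] = \frac{1}{27} \left[ (4/5)^3 - (\sqrt{7}/4)^3 \right] .$$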


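Example 3.1.7. Calculate

$$\int_0^{\pi/2} \frac{d\theta}{5 + 4 \cos\theta} .$$

Solution: Define $g : \mathbb{R} \to (-\pi, \pi)$ by

$$g(x) := 2 \arctan x$$

for all $x \in \mathbb{R}$. This is a standard substitution to transform a rational integrand in $\sin$ and $\cos$ into a rational integrand. Then $g$ is bijective as well as continuously differentiable such that

$$g'(x) = \frac{2}{1 + x^2}$$

for all $x \in \mathbb{R}$. The inverse $g^{-1}$ is given by

$$g^{-1}(\theta) := \tan(\theta/2)$$

for all $\theta \in (-\pi, \pi)$. By Theorem 3.1.1

$$\int_0^{\pi/2} \frac{d\theta}{5 + 4 \cos\theta} = \int_{g(0)}^{g(1)} \frac{d\theta}{5 + 4 \cos\theta} = \int_0^1 \frac{g'(x) \, dx}{5 + 4 \cos(g(x))}$$

$$= \int_0^1 \frac{g'(x) \, dx}{5 + 4 \left[ 2 \cos^2\left( \frac{g(x)}{2} \right) - 1 \right]} = \int_0^1 \frac{g'(x) \, dx}{5 + 4 \left[ \frac{2}{1 + \tan^2\left( \frac{g(x)}{2} \right)} - 1 \right]}$$

$$= \int_0^1 \frac{2}{1 + x^2} \cdot \frac{dx}{5 + 4 \left( \frac{2}{1 + x^2} - 1 \right)} = 2 \int_0^1 \frac{dx}{x^2 + 9} = \frac{2}{3} \arctan\left( \frac{1}{3} \right) .$$

Note that

$$\cos(g(x)) = \frac{1 - x^2}{1 + x^2} , \qquad \sin(g(x)) = \frac{2x}{1 + x^2}$$

for all $x \in \mathbb{R}$.

Fig. 66: Graphs of solutions of the differential equation (3.1.2) in the case that $a = 1$ with initial values $-\pi$, $-\pi/2$, $\pi/2$ and $\pi$ at $x = 0$. Compare Example 3.1.8.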
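The substitution in Example 3.1.7 can be checked numerically by approximating the original integral, the transformed integral and the closed form. The following Python sketch is an illustration only; `midpoint_sum` is a hypothetical helper.

```python
import math

def midpoint_sum(f, a, b, steps=10_000):
    # Midpoint Riemann sum of f over [a, b] with equal subintervals.
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))

# Original integral in the variable theta.
original = midpoint_sum(lambda t: 1.0 / (5.0 + 4.0 * math.cos(t)), 0.0, math.pi / 2)
# Integral after the substitution theta = 2 arctan(x).
substituted = midpoint_sum(lambda x: 2.0 / (x * x + 9.0), 0.0, 1.0)
# Closed form stated in the example.
closed_form = (2.0 / 3.0) * math.atan(1.0 / 3.0)
print(original, substituted, closed_form)
```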

The following example gives a typical application of change of variables to the solution of (‘separable’) ordinary differential equations of the first order.

Example 3.1.8. Find solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$ with the specified initial values.

$$f'(x) = a \sin(f(x)) \tag{3.1.2}$$

for all $x \in \mathbb{R}$ where $a > 0$, $f(0) \in (0, \pi)$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Since $f(0) \in (0, \pi)$, it follows by the continuity of $f$ that there are $c, d \in \mathbb{R}$ such that $c < 0 < d$ and such that $f([c, d]) \subset (0, \pi)$. Since $a > 0$ and the sine function is $\ge 0$ on the interval $[0, \pi]$, it follows from (3.1.2) that

$$f(x_1) = f(x_0) + f(x_1) - f(x_0) = f(x_0) + \int_{x_0}^{x_1} f'(x) \, dx = f(x_0) + \int_{x_0}^{x_1} a \sin(f(x)) \, dx \ge f(x_0)$$

for all $x_0, x_1 \in [c, d]$ such that $x_0 \le x_1$. In addition, the restriction of $f$ to $[c, d]$ is non-constant since the sine function has no zeros on $(0, \pi)$. Hence we conclude from (3.1.2) by Theorem 3.1.1 for $x \in [c, d]$ that

$$a(x - c) = \int_c^x a \, du = \int_c^x \frac{f'(u)}{\sin(f(u))} \, du = \int_{f(c)}^{f(x)} \frac{d\theta}{\sin(\theta)} .$$

Further, it follows by use of the transformation $g$ from the previous Example 3.1.7 and Theorem 3.1.1 that

$$\int_{f(c)}^{f(x)} \frac{d\theta}{\sin(\theta)} = \int_{g(\tan(f(c)/2))}^{g(\tan(f(x)/2))} \frac{d\theta}{\sin(\theta)} = \int_{\tan(f(c)/2)}^{\tan(f(x)/2)} \frac{g'(t)}{\sin(g(t))} \, dt$$

$$= \int_{\tan(f(c)/2)}^{\tan(f(x)/2)} \frac{dt}{t} = \ln\left( \frac{\tan(f(x)/2)}{\tan(f(c)/2)} \right) .$$

Hence

$$a(x - c) = \ln\left( \frac{\tan(f(x)/2)}{\tan(f(c)/2)} \right) \tag{3.1.3}$$

which leads to

$$f(x) = 2 \arctan\left[ \tan\left( \frac{f(c)}{2} \right) e^{a(x - c)} \right] . \tag{3.1.4}$$

From (3.1.3), we conclude that

$$\tan\left( \frac{f(c)}{2} \right) e^{-ac} = \tan\left( \frac{f(0)}{2} \right) .$$

Substituting this identity into (3.1.4) gives

$$f(x) = 2 \arctan\left[ \tan\left( \frac{f(0)}{2} \right) e^{ax} \right] .$$

On the other hand, for every $c \in (-\pi, \pi)$, it follows by elementary calculation that $f : \mathbb{R} \to \mathbb{R}$ defined by

$$f(x) := 2 \arctan\left[ \tan\left( \frac{c}{2} \right) e^{ax} \right]$$

for all $x \in \mathbb{R}$ satisfies (3.1.2) and $f(0) = c$. As a side remark, note that for every $k \in \mathbb{Z}$ the constant function of value $k\pi$ is a solution of (3.1.2). In addition, if $f$ is a solution of (3.1.2), then for every $k \in \mathbb{Z}$ also $f_k : \mathbb{R} \to \mathbb{R}$ defined by $f_k(x) := f(x) + 2\pi k$ for every $x \in \mathbb{R}$ is a solution of (3.1.2).
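The ‘elementary calculation’ for Example 3.1.8 can be mirrored numerically. The following Python sketch approximates $f'$ by a central difference and evaluates the residual of (3.1.2). The value $a = 1$, the initial values $\pm\pi/2$, the step size and the function names are assumptions chosen for illustration.

```python
import math

def f(x, a, c):
    # Candidate solution of f' = a sin(f) with f(0) = c, for c in (-pi, pi).
    return 2.0 * math.atan(math.tan(c / 2.0) * math.exp(a * x))

def residual(x, a, c, h=1e-5):
    # Central difference approximation of f'(x) - a sin(f(x)).
    derivative = (f(x + h, a, c) - f(x - h, a, c)) / (2.0 * h)
    return derivative - a * math.sin(f(x, a, c))

a = 1.0
initial_values = (-math.pi / 2, math.pi / 2)   # sample initial values
worst = max(
    abs(residual(k / 10.0, a, c))
    for c in initial_values
    for k in range(-30, 31)
)
print([f(0.0, a, c) for c in initial_values], worst)
```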

For the motivation of the following theorem, we consider the map $R : \mathbb{R}^2 \to \mathbb{R}^2$ defined by

$$R(x, y) := (-x, y)$$

for all $(x, y) \in \mathbb{R}^2$. A geometrical interpretation of $R$ is that of a reflection in the y-axis. This can be seen as follows. Let $(x, y)$ be some point in $\mathbb{R}^2$. Then the line segment from $(x, y)$ to $R(x, y) = (-x, y)$, at the intersection $(0, y)$ with the y-axis, is at a right angle with the y-axis, and both points $(x, y)$ and $R(x, y)$ are at a distance $|x|$ from the y-axis. Therefore, $R$ meets the geometrical definition of the reflection in the y-axis.

Fig. 67: The line segment from $(-1, 3)$ to $R(-1, 3) = (1, 3)$ intersects the y-axis at a right angle and is halved by that axis. The yellow rectangles are mapped onto each other by $R$. Compare text.

Intuitively (according to elementary geometry), we would not expect that such a reflection changes areas, i.e., if $S$ is some subset of $\mathbb{R}^2$ of area $A$, then we expect that the set $R(S)$ has the same area. For instance, a rectangle

$$[a, b] \times [c, d]$$

in $\mathbb{R}^2$, where $a \le b$ and $c \le d$, is mapped by $R$ into the rectangle

$$R( [a, b] \times [c, d] ) = [-b, -a] \times [c, d] .$$

Both rectangles have the same area $(b - a)(d - c)$.

Within the definition of Riemann-integrability above, we defined the area under the graph of a bounded integrable $f : [a, b] \to \mathbb{R}$, where $a, b \in \mathbb{R}$ are such that $a < b$, that assumes only positive ($\ge 0$) values by

$$\int_a^b f(x) \, dx .$$

We consider the associated function $f_- : [-b, -a] \to \mathbb{R}$ defined by $f_-(x) := f(-x)$ for all $x \in [-b, -a]$. We claim that the graph of $f_-$ is the image of the graph of $f$ under $R$, i.e.,

$$G(f_-) = R(G(f)) .$$

Indeed, if $x \in [-b, -a]$, then $-x \in [a, b]$ and

$$(x, f_-(x)) = (x, f(-x)) = R(-x, f(-x)) \in R(G(f)) .$$

Also, if $x \in [a, b]$, then $-x \in [-b, -a]$ and

$$R(x, f(x)) = (-x, f(x)) = (-x, f(-(-x))) = (-x, f_-(-x)) \in G(f_-) .$$

Therefore, we expect that $f_-$ is bounded, integrable and that the area under the graph of $f_-$ is equal to the area under the graph of $f$, i.e., that

$$\int_a^b f(x) \, dx = \int_{-b}^{-a} f(-x) \, dx . \tag{3.1.5}$$

Indeed, it is shown within the proof of the following theorem that this is the case. Note that we can view this result as a kind of change of variables. For this, we define $g : \mathbb{R} \to \mathbb{R}$ by $g(x) := -x$. Then $g$ is decreasing and continuously differentiable with a derivative function which is constant of value $-1$. In particular, $g(-a) = a$ and $g(-b) = b$. Hence $g$ does not satisfy the assumptions of Theorem 3.1.1. A formal application of the change of variable formula (3.1.1) would give

$$\int_a^b f(x) \, dx = \int_{-a}^{-b} f(g(u)) \cdot g'(u) \, du \qquad \text{(incorrect)}$$

which does not make sense according to our definitions because $-a > -b$. The correct formula (3.1.5) can be ‘obtained’ from this formula by exchange of the integration limits.

Theorem 3.1.9. Let $f$ be a bounded Riemann-integrable function on $[a, b]$ where $a$ and $b$ are some elements of $\mathbb{R}$ such that $a < b$. Then

$$\int_a^b f(x) \, dx = \int_{-b}^{-a} f(-x) \, dx .$$

Fig. 68: The graphs of $( [-2, -1] \to \mathbb{R}, x \mapsto (-x)^2 )$ and $( [1, 2] \to \mathbb{R}, x \mapsto x^2 )$ are reflection symmetric with respect to the y-axis. Compare text.

Proof. Define $f_- : [-b, -a] \to \mathbb{R}$ by $f_-(x) := f(-x)$ for all $x \in [-b, -a]$. Then $f_-$ is bounded, and for any partition $P = (a_0, \dots, a_\nu)$ of $[a, b]$ where $\nu \in \mathbb{N}^*$, $a_0, \dots, a_\nu \in [a, b]$, $P_- := (-a_\nu, \dots, -a_0)$ is a partition of $[-b, -a]$, and in particular $L(f, P) = L(f_-, P_-)$, $U(f, P) = U(f_-, P_-)$. Analogously, for any partition $P = (a_0, \dots, a_\nu)$ of $[-b, -a]$ where $\nu \in \mathbb{N}^*$, $a_0, \dots, a_\nu \in [-b, -a]$, $P_- := (-a_\nu, \dots, -a_0)$ is a partition of $[a, b]$, and in particular $L(f_-, P) = L(f, P_-)$, $U(f_-, P) = U(f, P_-)$. Hence the set consisting of the lower sums of $f$ is equal to the set of lower sums of $f_-$, and the set consisting of the upper sums of $f$ is equal to the corresponding set of upper sums of $f_-$. Consequently, $f_-$ is Riemann-integrable and its integral coincides with that of $f$.

The following example displays a typical application of the previous theorem to functions $f$ that are defined on intervals that are symmetric to the origin, i.e., of the form $[-a, a]$, where $a \ge 0$, as well as bounded, integrable and antisymmetric, i.e., such that $f(-x) = -f(x)$ for all $x \in [-a, a]$. Their integrals vanish.

Example 3.1.10. Calculate

$$\int_{-1}^1 3 \sin(2x) \, dx .$$

Solution: By Theorem 3.1.9, it follows that

$$\int_{-1}^1 3 \sin(2x) \, dx = \int_{-1}^1 3 \sin(-2x) \, dx = - \int_{-1}^1 3 \sin(2x) \, dx$$

and hence that

$$\int_{-1}^1 3 \sin(2x) \, dx = 0 .$$

A variation of the previous reasoning is displayed in the next example.


Example 3.1.11. Calculate

$$\int_0^\pi x \sin^2(x) \, dx .$$

Solution: First by Theorem 3.1.9, it follows that

$$\int_0^\pi x \sin^2(x) \, dx = \int_{-\pi}^0 (-x) \sin^2(-x) \, dx = - \int_{-\pi}^0 x \sin^2(x) \, dx .$$

Further, it follows by Theorem 3.1.1 and Example 2.6.24 that

$$- \int_{-\pi}^0 x \sin^2(x) \, dx = - \int_0^\pi (y - \pi) \sin^2(y - \pi) \, dy = - \int_0^\pi y \sin^2(y) \, dy + \pi \int_0^\pi \sin^2(y) \, dy = - \int_0^\pi y \sin^2(y) \, dy + \frac{\pi^2}{2}$$

and, finally, that

$$\int_0^\pi x \sin^2(x) \, dx = \frac{\pi^2}{4} .$$
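The value $\pi^2/4$ and the auxiliary integral of $\sin^2$ over $[0, \pi]$ used in the argument can be checked numerically. The following Python sketch is an illustration only; `midpoint_sum` is a hypothetical helper.

```python
import math

def midpoint_sum(f, a, b, steps=10_000):
    # Midpoint Riemann sum of f over [a, b] with equal subintervals.
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))

# int_0^pi sin^2(y) dy, which equals pi/2 by Example 2.6.24 with m = n = 1.
sin_squared = midpoint_sum(lambda y: math.sin(y) ** 2, 0.0, math.pi)
# int_0^pi x sin^2(x) dx, expected to be close to pi^2 / 4.
weighted = midpoint_sum(lambda x: x * math.sin(x) ** 2, 0.0, math.pi)
print(sin_squared, math.pi / 2)
print(weighted, math.pi ** 2 / 4)
```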


Another typical application of Theorem 3.1.9 applies to functions $f$ defined on intervals that are symmetric to the origin, i.e., of the form $[-a, a]$, where $a \ge 0$, that are bounded, integrable and symmetric, i.e., such that $f(-x) = f(x)$ for all $x \in [-a, a]$. The value of the integral of such a function is twice the value of the corresponding integral of its restriction to $[0, a]$.

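Example 3.1.12. Show that

$$\int_{-\pi}^\pi \frac{\sin(x)}{x} \, dx = 2 \int_0^\pi \frac{\sin(x)}{x} \, dx .$$

Solution: By Corollary 2.6.18 and Theorem 3.1.9, it follows that

$$\int_{-\pi}^\pi \frac{\sin(x)}{x} \, dx = \int_{-\pi}^0 \frac{\sin(x)}{x} \, dx + \int_0^\pi \frac{\sin(x)}{x} \, dx = \int_0^\pi \frac{\sin(-x)}{-x} \, dx + \int_0^\pi \frac{\sin(x)}{x} \, dx = 2 \int_0^\pi \frac{\sin(x)}{x} \, dx .$$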

Remark 3.1.13. The solution of Problem 1 n) below illustrates the general rule that one should never rely blindly on computer programs. In Mathematica 5.1, the command

Integraterpx^2´ 2x` 4q^t32u, tx, 1, 2us

gives the output

$$\frac{1}{16} \left( 68 + 27 \, \mathrm{Log}[3] \right)$$

which is incorrect.

Problems

1) Calculate the value of the integral. For this, if the antiderivative ofthe integrand is not obvious, use a suitable substitution.

a)ż 1

0

p2x` 1q12 dx , b)ż 1

0

u p2u` 1q12 du ,


c)ż 1

´1

x p3x2 ` 1q12 dx , d)ż 3

2

s?

2s2 ` 3ds ,

e)ż 5

3

px´ 2q´2 ¨ sin

ˆ

x

x´ 2

˙

dx , f)ż 2

´2

4u` 2

u2 ` u´ 12du ,

g)ż 3

1

sinp?xq

?x

dx , h)ż 1

0

u eú22 du ,

i)ż x

0

tanpθq dθ , x P r0, π2q , j)ż 4

2

x?x` 2

dx ,

k)ż 4

3

b

3`?x dx , l)

ż 2

1

x12

x13 ` 4dx ,

m)ż 6

3

c

3`

b

2`?x dx , n)

ż 2

1

dx

px2 ´ 2x` 4q32,

o)ż 7

3

dx

x2?x2 ´ 4

, p)ż 3

1

dx

x2?x2 ` 9

,

q)ż 2

1

dx

x2?

5´ x2, r)

ż π2

0

dθ

sinp3θq ` 2,

s)ż π

0

a

1` sinp2θq dθ , t)ż π2

0

dθ

sinpθq ` 2 cospθq,

u)ż π2

´π2

cos4pθq

sin4pθq ` cos4pθq

dθ .

2) Let $a \in \mathbb{R}$, $f : [-a, a] \to \mathbb{R}$ be Riemann-integrable and $g : \mathbb{R} \to \mathbb{R}$ be Riemann-integrable over every interval $[b, c]$ where $b, c \in \mathbb{R}$ are such that $b \le c$. Show that

a) $$\int_{-a}^a f(x) \, dx = 0$$
if $f$ is antisymmetric, i.e., if $f(-x) = -f(x)$ for all $x \in [-a, a]$.

b) $$\int_{-a}^a f(x) \, dx = 2 \int_0^a f(x) \, dx$$
if $f$ is symmetric, i.e., if $f(-x) = f(x)$ for all $x \in [-a, a]$.

c) $$\int_b^c g(x) \, dx = \int_{b+\tau}^{c+\tau} g(x) \, dx$$
if $b, c \in \mathbb{R}$ are such that $b \le c$ and $g$ is periodic with period $\tau \ge 0$, i.e., if $g(x + \tau) = g(x)$ for all $x \in \mathbb{R}$.

3) Calculate the area in $(-\infty, 0]^2$ that is enclosed by the strophoid

$$C := \{ (x, y) \in \mathbb{R}^2 : (a - x) \, y^2 - (a + x) \, x^2 = 0 \}$$

where $a > 0$.

4) Find solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$ with the specified initial values.

$$f'(x) = 2 \cos(f(x)) + 3$$

for all $x \in \mathbb{R}$, $f(0) \in [-\pi, \pi)$.

3.1.2 Integration by Parts

The method of integration by parts is based on the product rule for differentiation. For motivation, we consider continuous functions $F : [a, b] \to \mathbb{R}$ and $G : [a, b] \to \mathbb{R}$ whose restrictions to the open interval $(a, b)$ are differentiable with derivatives which can be extended to bounded Riemann-integrable functions $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$, respectively. Then it follows by the fundamental theorem of calculus and the product rule for differentiation that

$$F(b) G(b) - F(a) G(a) = \int_a^b (FG)'(x) \, dx = \int_a^b [F'(x) G(x) + F(x) G'(x)] \, dx$$

$$= \int_a^b F'(x) G(x) \, dx + \int_a^b F(x) G'(x) \, dx = \int_a^b f(x) G(x) \, dx + \int_a^b F(x) g(x) \, dx$$

and hence that

$$\int_a^b F(x) g(x) \, dx = F(b) G(b) - F(a) G(a) - \int_a^b f(x) G(x) \, dx .$$

We note the sign change and how antiderivatives, denoted by capital letters, switch positions inside the integrals.

A typical application of the last formula consists in the following steps. The integrand of a given integral needs to be represented by a product of two functions. The first function will be differentiated in the process, and its derivative will appear as the first factor in the transformed integrand. For the second function an antiderivative should be available. That antiderivative will appear as the second factor in the transformed integrand. The final result is obtained in the form of a difference. The minuend is given by the difference of the product of the first factor with the antiderivative of the second factor evaluated at the upper limit of integration and the value of that product at the lower limit of integration. The subtrahend is given by the integral over the original interval of integration with the transformed integrand.

Theorem 3.1.14. (Integration by parts) Let $f$, $g$ be bounded Riemann-integrable functions on $[a, b]$ where $a$ and $b$ are elements of $\mathbb{R}$ such that $a < b$. Further, let $F, G$ be continuous functions on $[a, b]$ which are differentiable on $(a, b)$ and such that $F'(x) = f(x)$ and $G'(x) = g(x)$ for all $x \in (a, b)$. Then

$$\int_a^b F(x) g(x) \, dx = F(b) G(b) - F(a) G(a) - \int_a^b f(x) G(x) \, dx .$$

Proof. First, as a consequence of Theorem 2.6.13, $fG$ and $Fg$ are both Riemann-integrable as products of Riemann-integrable functions. Moreover, $FG$ is continuous and differentiable on $(a, b)$ such that $(FG)'(x) = f(x) G(x) + F(x) g(x)$ for all $x \in (a, b)$, and $fG + Fg$ is Riemann-integrable by Theorem 2.6.8 as a sum of Riemann-integrable functions. Hence by Theorem 2.6.21

$$\int_a^b f(x) G(x) \, dx + \int_a^b F(x) g(x) \, dx = \int_a^b \left( f(x) G(x) + F(x) g(x) \right) dx = F(b) G(b) - F(a) G(a) .$$

The first example gives a typical application of integration by parts where the occurrence of the derivative of the first factor in the transformed integrand is used to lower the order of a polynomial appearing in the original integral.

Example 3.1.15. Calculate

$$\int_0^\pi x \cos(3x) \, dx .$$

Solution: Define $F, G, f, g : [0, \pi] \to \mathbb{R}$ by

$$F(x) := x , \quad g(x) := \cos(3x) , \quad f(x) := 1 , \quad G(x) := \frac{1}{3} \sin(3x)$$

for all $x \in [0, \pi]$. Hence by Theorems 3.1.14, 2.6.21:

$$\int_0^\pi x \cos(3x) \, dx = - \frac{1}{3} \int_0^\pi \sin(3x) \, dx = \frac{1}{9} \cos(3\pi) - \frac{1}{9} \cos(0) = - \frac{2}{9} .$$

Another typical application consists in a repeated use of integration by parts until the original integral reappears, but multiplied by a factor which is different from 1. In such a case the resulting equation can be solved for the original integral.

Example 3.1.16. Calculate

$$\int_0^\pi e^x \sin(2x) \, dx .$$

Solution: Define $F, G, f, g : [0, \pi] \to \mathbb{R}$ by

$$F(x) := e^x , \quad g(x) := \sin(2x) , \quad f(x) := e^x , \quad G(x) := - \frac{1}{2} \cos(2x)$$

for all $x \in [0, \pi]$. Then by Theorem 3.1.14,

$$\int_0^\pi e^x \sin(2x) \, dx = \frac{1}{2} (1 - e^\pi) + \frac{1}{2} \int_0^\pi e^x \cos(2x) \, dx . \tag{3.1.6}$$

To determine the last integral, define $F, G, f, g : [0, \pi] \to \mathbb{R}$ by

$$F(x) := e^x , \quad g(x) := \cos(2x) , \quad f(x) := e^x , \quad G(x) := \frac{1}{2} \sin(2x)$$

for all $x \in [0, \pi]$. Then by Theorem 3.1.14,

$$\frac{1}{2} \int_0^\pi e^x \cos(2x) \, dx = - \frac{1}{4} \int_0^\pi e^x \sin(2x) \, dx \tag{3.1.7}$$

and hence by (3.1.6), (3.1.7) finally:

$$\int_0^\pi e^x \sin(2x) \, dx = - \frac{2}{5} (e^\pi - 1) .$$

Of course, every integrand can be represented by its product with the constant function of value 1. Such a representation can sometimes lead to a successful application of the method of partial integration, as in the following example.

Example 3.1.17. Calculate

\[
\int_1^e \ln(4x) \, dx .
\]

Solution: Define $F, G, f, g : [1, e] \to \mathbb{R}$ by

\[
F(x) := \ln(4x) , \quad g(x) := 1 , \quad f(x) := \frac{1}{x} , \quad G(x) := x
\]

for all $x \in [1, e]$. Then by Theorem 3.1.14,

\[
\int_1^e \ln(4x) \, dx = (1 + \ln 4) \cdot e - \ln 4 - \int_1^e dx = (e - 1) \cdot \ln 4 + 1 .
\]

Often, the method of partial integration can be used to derive a recursion relation for an integral containing a parameter. Such a case is considered in the following example. In particular, its result will lead to the subsequent Wallis' product representation of $\pi$.

Example 3.1.18. Calculate

\[
I_n := \int_0^\pi \sin^n(x) \, dx
\]

for $n \in \mathbb{N}^*$. Solution: For $n = 1, 2$, we conclude that

\[
\int_0^\pi \sin(x) \, dx = [ - \cos(x) ]_0^\pi = 2 , \quad
\int_0^\pi \sin^2(x) \, dx = \int_0^\pi \frac{1}{2} [ 1 - \cos(2x) ] \, dx = \frac{1}{2} \left[ x - \frac{1}{2} \sin(2x) \right]_0^\pi = \frac{\pi}{2} .
\]

For $n \geq 3$, we conclude by partial integration that

\[
\begin{aligned}
I_n &= \int_0^\pi \sin^n(x) \, dx = \int_0^\pi \sin^{n-1}(x) \sin(x) \, dx \\
&= \left[ \sin^{n-1}(x) [ - \cos(x) ] \right]_0^\pi - \int_0^\pi (n-1) \sin^{n-2}(x) \cos(x) [ - \cos(x) ] \, dx \\
&= (n-1) \int_0^\pi \sin^{n-2}(x) \cos^2(x) \, dx \\
&= (n-1) \int_0^\pi \sin^{n-2}(x) [ 1 - \sin^2(x) ] \, dx = (n-1) (I_{n-2} - I_n)
\end{aligned}
\]

and hence that

\[
I_n = \frac{n-1}{n} \, I_{n-2} .
\]

Hence we conclude by induction that

\[
I_{2k+1} = 2 \cdot \frac{2}{3} \cdots \frac{2k}{2k+1} , \quad I_{2k} = \pi \cdot \frac{1}{2} \cdots \frac{2k-1}{2k}
\]

for all $k \in \mathbb{N} \setminus \{0, 1\}$.
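The recursion $I_n = \frac{n-1}{n} I_{n-2}$ with the starting values $I_1 = 2$ and $I_2 = \pi/2$ lends itself to a direct implementation. The following is a minimal sketch; the function name `sine_power_integral` and the representation of the result as a rational factor together with a flag for the factor $\pi$ are choices made here for illustration only.

```python
from fractions import Fraction
import math

def sine_power_integral(n):
    """Return (r, has_pi) such that I_n = r * pi if has_pi, else I_n = r.

    Uses I_1 = 2, I_2 = pi/2 and the recursion I_n = (n - 1)/n * I_{n-2}.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return Fraction(2), False
    if n == 2:
        return Fraction(1, 2), True
    r, has_pi = sine_power_integral(n - 2)
    return Fraction(n - 1, n) * r, has_pi

def sine_power_integral_float(n):
    """Floating-point value of I_n."""
    r, has_pi = sine_power_integral(n)
    return float(r) * (math.pi if has_pi else 1.0)
```

For instance, `sine_power_integral(3)` gives the rational value $4/3$, and `sine_power_integral(4)` gives the factor $3/8$ of $\pi$.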

The result from the previous example leads to John Wallis' product representation of $\pi$, which will be used in the subsequent derivation of Stirling's formula and in the calculation of Gaussian integrals.

Fig. 69: Sequences $a_1, a_2, \dots$ and $b_1, b_2, \dots$ from the proof of Wallis' product representation for $\pi$, Theorem 3.1.19, that converge to $\pi$ from below and above, respectively.

Theorem 3.1.19. (Wallis' product representation of $\pi$, 1656, [98])

\[
\lim_{k \to \infty} 4(k+1) \left[ \frac{2}{3} \cdots \frac{2k}{2k+1} \right]^2 = \pi .
\]

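Proof. In this, we use the notation from the previous example. Since $0 \leq \sin(x) \leq 1$ for all $x \in [0, \pi]$, it follows that

\[
\sin^{n+1}(x) = \sin(x) \sin^n(x) \leq \sin^n(x)
\]

for all $x \in [0, \pi]$ and hence that

\[
I_{n+1} = \int_0^\pi \sin^{n+1}(x) \, dx \leq \int_0^\pi \sin^n(x) \, dx = I_n
\]

for all $n \in \mathbb{N}^*$. As a consequence,

\[
2 \cdot \frac{2}{3} \cdots \frac{2k}{2k+1} = I_{2k+1} \leq I_{2k} = \pi \cdot \frac{1}{2} \cdots \frac{2k-1}{2k} \leq I_{2k-1} = 2 \cdot \frac{2}{3} \cdots \frac{2(k-1)}{2k-1}
\]

and

\[
\begin{aligned}
& 2 \cdot \frac{2}{3} \cdots \frac{2k}{2k+1} \cdot \frac{2}{1} \cdot \frac{4}{3} \cdots \frac{2(k-1)}{2k-3} \cdot \frac{2k}{2k-1} \cdot \frac{2k+1}{2k+1}
= a_k := (4k+2) \left[ \frac{2}{3} \cdots \frac{2k}{2k+1} \right]^2 \leq \pi \\
& \leq 2 \cdot \frac{2}{3} \cdots \frac{2(k-1)}{2k-1} \cdot \frac{2}{1} \cdot \frac{4}{3} \cdots \frac{2(k-1)}{2k-3} \cdot \frac{2k}{2k-1}
= b_k := 4k \left[ \frac{2}{3} \cdots \frac{2(k-1)}{2k-1} \right]^2
\end{aligned}
\]

for $k \in \mathbb{N} \setminus \{0, 1, 2\}$. Further,

\[
\begin{aligned}
\frac{a_{k+1}}{a_k} &= \frac{4k+6}{4k+2} \left( \frac{2(k+1)}{2k+3} \right)^2 = \frac{8(k+1)^2}{(4k+2)(2k+3)} = \frac{8k^2 + 16k + 8}{8k^2 + 16k + 6} > 1 , \\
\frac{b_{k+1}}{b_k} &= \frac{4(k+1)}{4k} \left( \frac{2k}{2k+1} \right)^2 = \frac{4k^2 + 4k}{4k^2 + 4k + 1} < 1 , \\
\frac{b_k}{a_k} &= \frac{4k}{4k+2} \left( \frac{2k+1}{2k} \right)^2 = \frac{2k+1}{2k} = 1 + \frac{1}{2k}
\end{aligned}
\]

for all $k \in \mathbb{N} \setminus \{0, 1, 2\}$. Hence the sequences $a_3, a_4, \dots$ and $b_3, b_4, \dots$ are convergent, as an increasing sequence that is bounded from above by $\pi$ and a decreasing sequence that is bounded from below by $\pi$, respectively. Since $b_k / a_k$ converges to 1, both converge to the same limit, which is therefore equal to $\pi$. Finally, the statement of the theorem follows from

\[
4(k+1) \left[ \frac{2}{3} \cdots \frac{2k}{2k+1} \right]^2 = \frac{4k+4}{4k+2} \, a_k .
\]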
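The two-sided bound $a_k \leq \pi \leq b_k$ from the proof can be observed numerically. The sketch below is an illustration only; the function name `wallis_bounds` is hypothetical, and floating-point arithmetic is assumed to be accurate enough for moderate $k$.

```python
import math

def wallis_bounds(k):
    """Return (a_k, b_k) as defined in the proof of Theorem 3.1.19, for k >= 1.

    a_k = (4k + 2) * [(2/3)(4/5)...(2k/(2k+1))]^2
    b_k = 4k * [(2/3)(4/5)...(2(k-1)/(2k-1))]^2
    """
    prod = 1.0  # running product (2/3)(4/5)...(2j/(2j+1))
    for j in range(1, k):
        prod *= 2 * j / (2 * j + 1)
    b_k = 4 * k * prod ** 2
    prod *= 2 * k / (2 * k + 1)
    a_k = (4 * k + 2) * prod ** 2
    return a_k, b_k

# The gap b_k - a_k = a_k / (2k) shrinks only like 1/k, so convergence is slow.
a_1000, b_1000 = wallis_bounds(1000)
```

The slow convergence visible here is the reason why Wallis' product is of theoretical rather than computational interest for the determination of $\pi$.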

Essentially as an application of Wallis' product formula, we prove Stirling's asymptotic formula for factorials, which is often used in applications.

Theorem 3.1.20. (Stirling's formula, 1730, [92])

\[
\lim_{n \to \infty} \frac{n!}{\sqrt{n}} \left( \frac{n}{e} \right)^{-n} = \sqrt{2\pi} . \tag{3.1.8}
\]

Fig. 70: Graph of $\big( (0, \infty) \to \mathbb{R}, \, x \mapsto \Gamma(x+1) \, (x/e)^{-x} / \sqrt{2 \pi x} \, \big)$. Note that $\Gamma(n+1) = n!$ for every $n \in \mathbb{N}$. See Theorem 3.1.20.

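Proof. First, we notice that $\ln$ is concave since $\ln''(x) = -1/x^2 < 0$ for all $x > 0$. Hence it follows by Theorem 2.5.33 that

\[
\begin{aligned}
\int_x^{x+1} \ln(y) \, dy &\geq \int_x^{x+1} \left[ \ln(x) + (y - x) \ln\left( \frac{x+1}{x} \right) \right] dy \\
&= \ln(x) - x \ln\left( \frac{x+1}{x} \right) + \left( x + \frac{1}{2} \right) \ln\left( \frac{x+1}{x} \right) = \frac{1}{2} [ \ln(x) + \ln(x+1) ]
\end{aligned}
\]

for all $x > 0$. In addition, it follows from the Definition 2.5.29 of the concavity of a differentiable function that

\[
\ln(y) \leq \ln(x) + \frac{y - x}{x} , \quad \ln(y) \leq \ln(x+1) + \frac{y - (x+1)}{x+1} ,
\]

where $x > 0$ and $y > 0$, and hence that

\[
\int_x^{x+1} \ln(y) \, dy \leq \ln(x) - 1 + 1 + \frac{1}{2x} = \ln(x) + \frac{1}{2x} ,
\]

\[
\int_x^{x+1} \ln(y) \, dy \leq \ln(x+1) - 1 + 1 - \frac{1}{2(x+1)} = \ln(x+1) - \frac{1}{2(x+1)}
\]

and

\[
\int_x^{x+1} \ln(y) \, dy \leq \frac{1}{2} \left( \ln(x) + \ln(x+1) + \frac{1}{2x} - \frac{1}{2(x+1)} \right)
= \frac{1}{2} [ \ln(x) + \ln(x+1) ] + \frac{1}{4} \left( \frac{1}{x} - \frac{1}{x+1} \right) .
\]

Hence

\[
0 \leq \int_x^{x+1} \ln(y) \, dy - \frac{1}{2} [ \ln(x) + \ln(x+1) ] \leq \frac{1}{4} \left( \frac{1}{x} - \frac{1}{x+1} \right)
\]

for all $x > 0$ and hence that

\[
0 \leq \int_1^n \ln(y) \, dy - \frac{1}{2} \sum_{k=1}^{n-1} [ \ln(k) + \ln(k+1) ] \leq \frac{1}{4} \sum_{k=1}^{n-1} \left( \frac{1}{k} - \frac{1}{k+1} \right) = \frac{1}{4} \left( 1 - \frac{1}{n} \right) \leq \frac{1}{4} .
\]

Therefore, we conclude that the sequence $S_1, S_2, \dots$, where

\[
\begin{aligned}
S_n &:= \int_1^n \ln(y) \, dy - \frac{1}{2} \sum_{k=1}^{n-1} [ \ln(k) + \ln(k+1) ] \\
&= \int_1^n \ln(y) \, dy - \ln(n!) + \frac{\ln(n)}{2} = [ \, y \ln(y) - y \, ]_1^n - \ln(n!) + \frac{\ln(n)}{2} \\
&= n \ln(n) - (n - 1) - \ln(n!) + \frac{\ln(n)}{2} = 1 + \ln\left[ \frac{\sqrt{n}}{n!} \left( \frac{n}{e} \right)^n \right]
\end{aligned}
\]

for every $n \in \mathbb{N}^*$, is increasing as well as bounded from above and therefore convergent to an element of the closed interval from $0$ to $1/4$. Hence it also follows that the limit

\[
\lim_{n \to \infty} \frac{n!}{\sqrt{n}} \left( \frac{n}{e} \right)^{-n}
\]

exists. This limit will be denoted by $a$ in the following. For the determination of its value, we use Wallis' product. According to Theorem 3.1.19

\[
\begin{aligned}
\frac{\pi}{2} &= \lim_{k \to \infty} 2(k+1) \left[ \frac{2}{3} \cdots \frac{2k}{2k+1} \right]^2
= \lim_{k \to \infty} \frac{2(k+1)}{2k+1} \cdot \frac{(2^k k!)^4}{(2k+1) \, [(2k)!]^2}
= \lim_{k \to \infty} \frac{(2^k k!)^4}{(2k+1) \, [(2k)!]^2} \\
&= \lim_{k \to \infty} \frac{2^{-(4k+1)} \, k \, (2^k k!)^4}{(2k+1) \, [(2k)!]^2}
\left[ \frac{1}{\sqrt{k}} \left( \frac{k}{e} \right)^{-k} \right]^4
\left[ \sqrt{2k} \left( \frac{2k}{e} \right)^{2k} \right]^2 \\
&= \lim_{k \to \infty} \frac{k \, (k!)^4}{2 (2k+1) \, [(2k)!]^2}
\left[ \frac{1}{\sqrt{k}} \left( \frac{k}{e} \right)^{-k} \right]^4
\left[ \sqrt{2k} \left( \frac{2k}{e} \right)^{2k} \right]^2
= \frac{a^2}{4} .
\end{aligned}
\]

Hence it follows that $a = \sqrt{2\pi}$ and, finally, (3.1.8).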
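The convergence stated in (3.1.8) can be observed numerically. The following sketch is an illustration under the assumption that evaluating the ratio through logarithms (using the standard library function `math.lgamma`, with $\Gamma(n+1) = n!$) is sufficiently accurate; the function name `stirling_ratio` is hypothetical.

```python
import math

def stirling_ratio(n):
    """n! / (sqrt(n) * (n/e)**n), computed via logarithms to avoid overflow.

    Uses lgamma(n + 1) = ln(n!).
    """
    log_ratio = math.lgamma(n + 1) - 0.5 * math.log(n) - n * (math.log(n) - 1)
    return math.exp(log_ratio)

# Values for growing n approach sqrt(2*pi) from above.
ratios = [stirling_ratio(n) for n in (1, 10, 100, 10**6)]
limit = math.sqrt(2 * math.pi)
```

In agreement with the proof, where $S_n$ is increasing, the ratios decrease towards $\sqrt{2\pi}$.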

The example below gives another application of the method of partial integration to an integrand containing a parameter, which leads to Euler's famous product representations of the sine and the cosine. These representations will be used later on in the proof of the reflection formula for the gamma function. For the formulation of these representations, we need to introduce the product symbol.

Definition 3.1.21. (Product symbol) If $I$ is some non-empty finite index set and $a_i \in \mathbb{R}$ for every $i \in I$, the symbol

\[
\prod_{i \in I} a_i
\]

denotes the product of all $a_i$ where $i$ runs through the elements of $I$. Note that, as a consequence of the commutativity and associativity of multiplication, the order in which the products are performed is inessential.

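Example 3.1.22. (Euler's product representation of the sine and cosine, 1748, [38]) Show that for every $x \in \mathbb{R}$

\[
\sin\left( \frac{\pi x}{2} \right) = \frac{\pi x}{2} \lim_{n \to \infty} \prod_{k=1}^{n} \left( 1 - \frac{x^2}{4k^2} \right) , \quad
\cos\left( \frac{\pi x}{2} \right) = \lim_{n \to \infty} \prod_{k=0}^{n} \left( 1 - \frac{x^2}{(2k+1)^2} \right) . \tag{3.1.9}
\]

Solution: For this, we define for every $n \in \mathbb{N}$ a corresponding $I_n : \mathbb{R} \to \mathbb{R}$ by

\[
I_n(x) := \int_0^{\pi/2} \cos(xt) \cos^n(t) \, dt
\]

for every $x \in \mathbb{R}$. In particular, this implies that

\[
I_0(x) = \frac{\pi}{2} \begin{cases} 1 & \text{if } x = 0 \\ \sin(\pi x / 2) / (\pi x / 2) & \text{if } x \neq 0 , \end{cases}
\]

for $x \notin \{-1, 1\}$

\[
\begin{aligned}
I_1(x) &= \int_0^{\pi/2} \cos(xt) \cos(t) \, dt = \int_0^{\pi/2} \frac{1}{2} [ \cos((x+1)t) + \cos((x-1)t) ] \, dt \\
&= \frac{1}{2} \left[ \frac{\sin((x+1)t)}{x+1} + \frac{\sin((x-1)t)}{x-1} \right]_0^{\pi/2} = \frac{\cos(\pi x / 2)}{1 - x^2} ,
\end{aligned}
\]

for $x \in \{-1, 1\}$

\[
I_1(x) = \int_0^{\pi/2} \cos^2(t) \, dt = \int_0^{\pi/2} \frac{1}{2} [ 1 + \cos(2t) ] \, dt = \frac{1}{2} \left[ t + \frac{1}{2} \sin(2t) \right]_0^{\pi/2} = \frac{\pi}{4}
\]

and hence

\[
I_1(x) = \begin{cases} \pi / 4 & \text{if } x \in \{-1, 1\} \\ \cos(\pi x / 2) / (1 - x^2) & \text{if } x \notin \{-1, 1\} . \end{cases}
\]

In the following, let $x \in \mathbb{R}$. For $n \in \mathbb{N} \setminus \{0, 1\}$, we conclude by partial integration that

\[
\begin{aligned}
x^2 I_n(x) &= x \int_0^{\pi/2} \cos^n(t) \, x \cos(xt) \, dt = [ \, x \cos^n(t) \sin(xt) \, ]_0^{\pi/2} + n \int_0^{\pi/2} \sin(t) \cos^{n-1}(t) \, x \sin(xt) \, dt \\
&= \left[ - n \sin(t) \cos^{n-1}(t) \cos(xt) \right]_0^{\pi/2} + n \int_0^{\pi/2} \left[ \cos^n(t) - (n-1) \sin^2(t) \cos^{n-2}(t) \right] \cos(xt) \, dt \\
&= n \int_0^{\pi/2} \left[ n \cos^n(t) - (n-1) \cos^{n-2}(t) \right] \cos(xt) \, dt \\
&= n^2 I_n(x) - n(n-1) I_{n-2}(x) .
\end{aligned}
\]

Therefore, it follows that

\[
I_{n-2}(x) = \frac{n^2 - x^2}{n(n-1)} \, I_n(x)
\]

and hence that

\[
\frac{I_{n-2}(x)}{I_{n-2}(0)} = \left( 1 - \frac{x^2}{n^2} \right) \frac{I_n(x)}{I_n(0)} .
\]

From this, it follows by induction that

\[
\frac{I_0(x)}{I_0(0)} = \frac{I_{2n}(x)}{I_{2n}(0)} \prod_{k=1}^{n} \left( 1 - \frac{x^2}{(2k)^2} \right) , \quad
\frac{I_1(x)}{I_1(0)} = \frac{I_{2n+1}(x)}{I_{2n+1}(0)} \prod_{k=1}^{n} \left( 1 - \frac{x^2}{(2k+1)^2} \right)
\]

for every $n \in \mathbb{N}^*$. In the following, we show that

\[
\lim_{n \to \infty} \frac{I_n(x)}{I_n(0)} = 1 . \tag{3.1.10}
\]

For this, we note that

\[
| \cos(xt) - 1 | = | \cos(|x| t) - 1 | = \left| \int_0^{|x| t} [ - \sin(s) ] \, ds \right| \leq |x| t
\]

for $t \geq 0$. Hence it follows for every $n \in \mathbb{N}^*$ that

\[
| I_n(x) - I_n(0) | = \left| \int_0^{\pi/2} [ \cos(xt) - 1 ] \cos^n(t) \, dt \right|
\leq |x| \int_0^{\pi/2} t \cos(t) \cos^{n-1}(t) \, dt \leq |x| \int_0^{\pi/2} \sin(t) \cos^{n-1}(t) \, dt = \frac{|x|}{n}
\]

where it has been used that

\[
t \cos(t) \leq \sin(t)
\]

for $0 \leq t \leq \pi/2$. In addition, since $\cos(t) \geq 1 - t^2/2 \geq 1 - 1/(2n)$ for $0 \leq t \leq 1/\sqrt{n}$ and $(1 - 1/(2n))^n \geq 1/2$, it follows that $I_n(0) \geq 1/(2 \sqrt{n})$ and therefore that $| I_n(x) / I_n(0) - 1 | \leq 2 |x| / \sqrt{n}$. Hence it follows (3.1.10) and, finally, (3.1.9). For this, note that the second relation in (3.1.9) is trivially satisfied for $x \in \{-1, 1\}$.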
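The product representations (3.1.9) can be illustrated by evaluating partial products. The following is a minimal sketch with hypothetical function names `sine_product` and `cosine_product`. The partial products converge slowly, with a relative error roughly of the order $x^2/(4n)$, so a large number of factors is needed for moderate accuracy.

```python
import math

def sine_product(x, n):
    """(pi*x/2) * prod_{k=1}^{n} (1 - x^2/(4 k^2)), approximating sin(pi*x/2)."""
    prod = math.pi * x / 2
    for k in range(1, n + 1):
        prod *= 1 - x * x / (4 * k * k)
    return prod

def cosine_product(x, n):
    """prod_{k=0}^{n} (1 - x^2/(2k+1)^2), approximating cos(pi*x/2)."""
    prod = 1.0
    for k in range(0, n + 1):
        prod *= 1 - x * x / (2 * k + 1) ** 2
    return prod

approx_sin = sine_product(0.5, 100000)    # compare with sin(pi/4)
approx_cos = cosine_product(0.5, 100000)  # compare with cos(pi/4)
```

For $x \in \{-1, 1\}$ the factor with $k = 0$ in the cosine product vanishes, in agreement with $\cos(\pm \pi/2) = 0$.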

The following application of the method of partial integration to an integrand containing a parameter leads to a recursion formula that will be used in the method of integration of rational expressions by decomposition into partial fractions displayed in the next section.

Example 3.1.23. Let $m$ be some natural number $\geq 1$, $a \in \mathbb{R}$ and $c > 0$. Define $F, G, f, g : \mathbb{R} \to \mathbb{R}$ by

\[
F(y) := (y^2 + c^2)^{-m} , \quad g(y) := 1 , \quad f(y) := - 2 m y \cdot (y^2 + c^2)^{-(m+1)} , \quad G(y) := y
\]

for all $y \in \mathbb{R}$. Then by Theorem 3.1.14 for every $x > a$

\[
\begin{aligned}
\int_a^x \frac{dy}{(y^2 + c^2)^m} &= \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} + 2m \int_a^x \frac{y^2 \, dy}{(y^2 + c^2)^{m+1}} \\
&= \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} + 2m \int_a^x \frac{dy}{(y^2 + c^2)^m} - 2 m c^2 \int_a^x \frac{dy}{(y^2 + c^2)^{m+1}}
\end{aligned}
\]

and hence it follows the recursion (or 'reduction') formula

\[
\int_a^x \frac{dy}{(y^2 + c^2)^{m+1}} = \frac{1}{2 m c^2} \left[ \frac{x}{(x^2 + c^2)^m} - \frac{a}{(a^2 + c^2)^m} \right] + \frac{2m - 1}{2 m c^2} \int_a^x \frac{dy}{(y^2 + c^2)^m} ,
\]

which is used in the method of integration by decomposition into partial fractions below.
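The reduction formula translates directly into a recursive procedure. The sketch below is an illustration only; the function name `power_integral` is hypothetical, and the base case $m = 1$ assumes the standard antiderivative $\frac{1}{c} \arctan(y/c)$ of $1/(y^2 + c^2)$, which is not derived in the example itself.

```python
import math

def power_integral(m, a, x, c):
    """Integral of 1/(y^2 + c^2)^m over [a, x] for an integer m >= 1 and c > 0.

    The case m = 1 uses the arctan antiderivative; larger m are reduced with
    the recursion formula of Example 3.1.23.
    """
    if m < 1 or c <= 0:
        raise ValueError("need m >= 1 and c > 0")
    if m == 1:
        return (math.atan(x / c) - math.atan(a / c)) / c
    k = m - 1  # the formula expresses exponent k + 1 in terms of exponent k
    boundary = x / (x * x + c * c) ** k - a / (a * a + c * c) ** k
    return (boundary + (2 * k - 1) * power_integral(k, a, x, c)) / (2 * k * c * c)
```

For example, `power_integral(2, 1.0, 2.0, 1.0)` evaluates $\int_1^2 dy / (y^2+1)^2$ as $\frac{1}{2} \left( \frac{2}{5} - \frac{1}{2} \right) + \frac{1}{2} \left( \arctan(2) - \frac{\pi}{4} \right)$.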

The following final example gives another typical application of the method of partial integration. Here too, the integrand contains a parameter. The method is used to derive an estimate for a special function, a Bessel function, defined in terms of an integral. It is a remarkable fact that estimates even of elementary functions are often easier to achieve with the help of integral representations.

Example 3.1.24. Show that

\[
| J_n(x) | \leq \frac{2}{\pi} \cdot \frac{x}{n^2 - x^2}
\]

for all $n \in \mathbb{N}^*$ and $x \in \mathbb{R}$ such that $0 \leq x < n$. Solution: Define

\[
F(\theta) := \frac{1}{x \cos\theta - n} , \quad g(\theta) := (x \cos\theta - n) \cdot \cos(x \sin\theta - n\theta) ,
\]

\[
f(\theta) := \frac{x \sin(\theta)}{(x \cos\theta - n)^2} , \quad G(\theta) := \sin(x \sin\theta - n\theta)
\]

for all $\theta \in [0, \pi]$. Then by Theorem 3.1.14,

\[
J_n(x) = \frac{1}{\pi} \int_0^\pi \cos(x \sin\theta - n\theta) \, d\theta
= - \frac{1}{\pi} \int_0^\pi \frac{x \sin(\theta)}{(x \cos\theta - n)^2} \cdot \sin(x \sin\theta - n\theta) \, d\theta ,
\]

and hence

\[
| J_n(x) | \leq \frac{1}{\pi} \int_0^\pi \frac{x \sin(\theta)}{(x \cos\theta - n)^2} \, d\theta = \frac{2}{\pi} \cdot \frac{x}{n^2 - x^2} .
\]
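The estimate can be checked numerically from the integral representation used in the solution. The following is a rough sketch; the function names `bessel_j` and `bessel_bound` are hypothetical, and the midpoint rule with a fixed number of steps is an assumption that works well here because the integrand is smooth.

```python
import math

def bessel_j(n, x, steps=2000):
    """J_n(x) = (1/pi) * integral over [0, pi] of cos(x sin(theta) - n theta),
    approximated by the midpoint rule with the given number of steps."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.cos(x * math.sin(theta) - n * theta)
    return total * h / math.pi

def bessel_bound(n, x):
    """Upper bound (2/pi) * x / (n^2 - x^2), valid for 0 <= x < n."""
    if not 0 <= x < n:
        raise ValueError("need 0 <= x < n")
    return 2 / math.pi * x / (n * n - x * x)
```

Note that the bound becomes large as $x$ approaches $n$ from below, so it is most informative for $x$ small compared to $n$.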

Problems

1) Calculate the value of the integral. In this, where applicable, $n \in \mathbb{N}^*$.

a) $\int_0^3 4 t^2 e^{-5t} \, dt$ , b) $\int_0^{\pi/2} \varphi \cdot [ \sin(2\varphi) + 3 \cos(7\varphi) ] \, d\varphi$ ,

c) $\int_0^\pi e^{-\varphi} \cos(2\varphi) \, d\varphi$ , d) $\int_1^2 \frac{\ln(2x)}{x^2} \, dx$ ,

e) $\int_0^1 x^2 \arctan(3x) \, dx$ , f) $\int_0^{1/\sqrt{2}} \ln(2x^2 + 1) \, dx$ ,

g) $\int_0^\pi x \sin(nx) \, dx$ , h) $\int_1^3 x^n \ln(x) \, dx$ .

2) Derive a reduction formula where the integral is expressed in terms of the same integral with a smaller $n$. In this $n \in \mathbb{N}^*$, $a \in \mathbb{R}$, $x \geq a$ and, where applicable, $m \in \mathbb{N}$, $b, c \in \mathbb{R}^*$.

a) $\int_a^x \sin^n(y) \, dy$ , b) $\int_a^x \cos^n(y) \, dy$ ,

c) $\int_a^x y^n e^{by} \, dy$ , d) $\int_a^x y^n \sin(by) \, dy$ ,

e) $\int_a^x y^n \cos(by) \, dy$ , f) $\int_a^x y^m [\ln(y)]^n \, dy$ ,

g) $\int_a^x e^{-cy} \sin(by) \, dy$ , h) $\int_a^x e^{-cy} \cos(by) \, dy$ .

3) Let $I$ be some non-empty open interval of $\mathbb{R}$, $h : I \to \mathbb{R}$ a map and $a, b \in I$ be such that $a < b$.

a) If $h$ is twice differentiable on $I$ and such that $h(a) = h(b) = 0$, show that

\[
\int_a^b h(x) \, dx = \frac{1}{2} \int_a^b (x - b)(x - a) \, h''(x) \, dx .
\]

b) If $h$ is four times differentiable on $I$ and such that $h(a) = h(b) = h'(a) = h'(b) = 0$, show that

\[
\int_a^b h(x) \, dx = \frac{1}{24} \int_a^b (x - b)^2 (x - a)^2 \, h^{(iv)}(x) \, dx .
\]

[Remark: Note that if $h = f - p$ where $f : I \to \mathbb{R}$ is twice and four times differentiable, respectively, and $p : I \to \mathbb{R}$ is a polynomial function of the order 1, 3, respectively, then $h'' = f''$, $h^{(iv)} = f^{(iv)}$, respectively. In connection with the above formulas, this fact is used in the estimation of the errors for the Trapezoid Rule / Simpson Rule for the numerical approximation of integrals. See Section 3.1.4.]

4) Let $a, b \in \mathbb{R}$ be such that $a < b$ and $f, g : [a, b] \to \mathbb{R}$ be restrictions to $[a, b]$ of twice continuously differentiable functions defined on open intervals of $\mathbb{R}$ containing $[a, b]$. In addition, let $f(a) = f(b) = 0$ and $g(a) = g(b) = 0$.

a) Show that

\[
\int_a^b g(x) f''(x) \, dx = \int_a^b g''(x) f(x) \, dx .
\]

b) In addition, assume that $f$ and $g$ solve the differential equations

\[
- f''(x) + U(x) f(x) = \lambda f(x) , \quad - g''(x) + U(x) g(x) = \mu g(x)
\]

where $U : [a, b] \to \mathbb{R}$ is continuous and $\lambda, \mu \in \mathbb{R}$ are such that $\lambda \neq \mu$. Show that

\[
\int_a^b f(x) g(x) \, dx = 0 .
\]

3.1.3 Partial Fractions

The method of integration of rational expressions by decomposition into partial fractions is suggested by the following simple observation. For this, let $a_1, a_2, A_1, A_2 \in \mathbb{R}$. Then

\[
\frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} = \frac{A_1 (x - a_2) + A_2 (x - a_1)}{(x - a_1)(x - a_2)} = \frac{(A_1 + A_2) x - (A_1 a_2 + A_2 a_1)}{x^2 - (a_1 + a_2) x + a_1 a_2}
\]

for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$. Note that for the left hand side of the last equation, as a function of $x$, there is an antiderivative which is given by

\[
A_1 \ln(|x - a_1|) + A_2 \ln(|x - a_2|)
\]

for every $x \in \mathbb{R} \setminus \{a_1, a_2\}$.

On the other hand, for a given quotient $p/q$ of polynomials $p$ of first order and $q$ of second order an antiderivative is usually not obvious. Here we exclude the case that the quotient can be reduced to the quotient of a zero order polynomial and a first order polynomial. Also, we assume that the coefficient of the leading order of $q$ is equal to 1, which can always be achieved by appropriate definition of $p$ and $q$. Therefore, for the purpose of integration, it is natural to try to represent such a quotient $p/q$ in the form

\[
\frac{p(x)}{q(x)} = \frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} \tag{3.1.11}
\]

for all $x \in \mathbb{R} \setminus [ \, \{a_1, a_2\} \cup q^{-1}(\{0\}) \, ]$ and for some $a_1, a_2 \in \mathbb{R}$, $A_1, A_2 \in \mathbb{R}^*$ such that $a_1 \neq a_2$. In this, we notice that the vanishing of one of the coefficients $A_1, A_2$ or $a_1 = a_2$ would lead to the excluded case that the quotient can be reduced to a quotient of a zero order polynomial and a first order polynomial.

In the following, we will determine $A_1, A_2, a_1$ and $a_2$. We immediately note from the singular behavior of the right hand side of equation (3.1.11) near $a_1$ and $a_2$ that $q$ needs to vanish in the points $a_1$ and $a_2$. This can also be shown as follows. The equation (3.1.11) implies that

\[
[ A_1 (x - a_2) + A_2 (x - a_1) ] \, q(x) = p(x) (x - a_1)(x - a_2)
\]

for all $x \in \mathbb{R} \setminus [ \, \{a_1, a_2\} \cup q^{-1}(\{0\}) \, ]$. Hence

\[
A_1 (a_1 - a_2) \, q(a_1) = \lim_{x \to a_1} [ A_1 (x - a_2) + A_2 (x - a_1) ] \, q(x) = \lim_{x \to a_1} p(x) (x - a_1)(x - a_2) = 0 ,
\]

\[
A_2 (a_2 - a_1) \, q(a_2) = \lim_{x \to a_2} [ A_1 (x - a_2) + A_2 (x - a_1) ] \, q(x) = \lim_{x \to a_2} p(x) (x - a_1)(x - a_2) = 0 .
\]

Since $A_1 \neq 0$, $A_2 \neq 0$ and $a_1 \neq a_2$, this implies that

\[
q(a_1) = q(a_2) = 0 .
\]

Hence $q$ has the two different zeros $a_1, a_2$ and

\[
q(x) = (x - a_1)(x - a_2)
\]

for all $x \in \mathbb{R}$. Then (3.1.11) implies that

\[
p(x) = A_1 (x - a_2) + A_2 (x - a_1)
\]

for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$ and therefore that

\[
p(a_1) = \lim_{x \to a_1} p(x) = A_1 (a_1 - a_2) , \quad p(a_2) = \lim_{x \to a_2} p(x) = A_2 (a_2 - a_1) .
\]

The last system gives

\[
A_1 = - \frac{p(a_1)}{a_2 - a_1} , \quad A_2 = \frac{p(a_2)}{a_2 - a_1} .
\]

Indeed, if $p$ is a polynomial of first order and $a_1, a_2 \in \mathbb{R}$ are such that $a_1 \neq a_2$, then

\[
- \frac{p(a_1)}{a_2 - a_1} \cdot \frac{1}{x - a_1} + \frac{p(a_2)}{a_2 - a_1} \cdot \frac{1}{x - a_2} = \frac{- p(a_1)(x - a_2) + p(a_2)(x - a_1)}{a_2 - a_1} \cdot \frac{1}{(x - a_1)(x - a_2)}
\]

for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$. In addition,

\[
\frac{- p(a_1)(a_1 - a_2) + p(a_2)(a_1 - a_1)}{a_2 - a_1} = p(a_1) , \quad
\frac{- p(a_1)(a_2 - a_2) + p(a_2)(a_2 - a_1)}{a_2 - a_1} = p(a_2) .
\]

Hence

\[
\frac{- p(a_1)(x - a_2) + p(a_2)(x - a_1)}{a_2 - a_1} = p(x)
\]

for all $x \in \mathbb{R}$, since two polynomials of order at most one that coincide in two different points are equal, and

\[
\frac{p(x)}{(x - a_1)(x - a_2)} = - \frac{p(a_1)}{a_2 - a_1} \cdot \frac{1}{x - a_1} + \frac{p(a_2)}{a_2 - a_1} \cdot \frac{1}{x - a_2}
\]

for all $x \in \mathbb{R} \setminus \{a_1, a_2\}$ gives a decomposition as required. In particular, an antiderivative of

\[
\left( \mathbb{R} \setminus \{a_1, a_2\} \to \mathbb{R} , \; x \mapsto \frac{p(x)}{(x - a_1)(x - a_2)} \right)
\]

is given by

\[
- \frac{p(a_1)}{a_2 - a_1} \ln(|x - a_1|) + \frac{p(a_2)}{a_2 - a_1} \cdot \ln(|x - a_2|)
\]

for every $x \in \mathbb{R} \setminus \{a_1, a_2\}$.
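The formulas for $A_1$, $A_2$ and the resulting antiderivative can be written down as a short program. The following is a minimal sketch; the function names `partial_fraction_coefficients` and `antiderivative_distinct_roots` are hypothetical, and `p` is assumed to be a Python callable representing a polynomial of order at most one.

```python
import math

def partial_fraction_coefficients(p, a1, a2):
    """Return (A1, A2) with p(x)/((x-a1)(x-a2)) = A1/(x-a1) + A2/(x-a2).

    Uses A1 = -p(a1)/(a2 - a1) and A2 = p(a2)/(a2 - a1); requires a1 != a2.
    """
    if a1 == a2:
        raise ValueError("the zeros a1 and a2 must be different")
    return -p(a1) / (a2 - a1), p(a2) / (a2 - a1)

def antiderivative_distinct_roots(p, a1, a2):
    """Antiderivative x -> A1 ln|x - a1| + A2 ln|x - a2| on R without {a1, a2}."""
    A1, A2 = partial_fraction_coefficients(p, a1, a2)
    return lambda x: A1 * math.log(abs(x - a1)) + A2 * math.log(abs(x - a2))

# Illustration: p(x) = 4, q(x) = (x - 3)(x + 3), integrated over [0, 2].
F = antiderivative_distinct_roots(lambda x: 4.0, 3.0, -3.0)
value = F(2.0) - F(0.0)
```

In this illustration the coefficients are $A_1 = 2/3$, $A_2 = -2/3$, and `value` is $-\frac{2}{3} \ln(5)$. Note that the difference of antiderivative values gives the integral only if the interval of integration contains neither $a_1$ nor $a_2$.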

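As noticed above, a decomposition of the type (3.1.11) is impossible if the polynomial $q$ has a double zero or no real zero. For this reason, we try to find a similar decomposition also for these cases. If $q$ has a double zero $a \in \mathbb{R}$, then $p/q$ is given by

\[
\frac{p(x)}{(x - a)^2}
\]

for all $x \in \mathbb{R} \setminus \{a\}$. Then

\[
\frac{p(x)}{(x - a)^2} = \frac{p'(a) \, (x - a) + p(a)}{(x - a)^2} = \frac{p'(a)}{x - a} + \frac{p(a)}{(x - a)^2}
\]

for all $x \in \mathbb{R} \setminus \{a\}$. Hence an antiderivative of

\[
\left( \mathbb{R} \setminus \{a\} \to \mathbb{R} , \; x \mapsto \frac{p(x)}{(x - a)^2} \right)
\]

is given by

\[
p'(a) \ln(|x - a|) - \frac{p(a)}{x - a}
\]

for every $x \in \mathbb{R} \setminus \{a\}$.

Finally, if $q$ has no real zero, then

\[
q(x) = x^2 + c x + d = \left( x + \frac{c}{2} \right)^2 + d - \frac{c^2}{4}
\]

for all $x \in \mathbb{R}$ where $c, d \in \mathbb{R}$ are such that

\[
d > \frac{c^2}{4} .
\]

Further, $p$ is given by $p(x) = a x + b$ for all $x \in \mathbb{R}$ and some $a, b \in \mathbb{R}$. Then

\[
\begin{aligned}
\frac{p(x)}{q(x)} &= \frac{a x + b}{x^2 + c x + d} = \frac{a x + \frac{ac}{2} + b - \frac{ac}{2}}{x^2 + c x + d}
= \frac{1}{2} \cdot \frac{2 a x + a c}{x^2 + c x + d} + \left( b - \frac{ac}{2} \right) \cdot \frac{1}{\left( x + \frac{c}{2} \right)^2 + d - \frac{c^2}{4}} \\
&= \frac{a}{2} \cdot \frac{2 x + c}{x^2 + c x + d} + \frac{b - \frac{ac}{2}}{\sqrt{d - \frac{c^2}{4}}} \cdot \frac{1}{\sqrt{d - \frac{c^2}{4}}} \cdot \frac{1}{1 + \left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right)^2}
\end{aligned}
\]

for all $x \in \mathbb{R}$. The first summand on the right hand side of the last equation, as a function of $x$, has an antiderivative given by

\[
\frac{a}{2} \cdot \ln(x^2 + c x + d)
\]

for all $x \in \mathbb{R}$. Hence it remains to find an antiderivative for the second summand. Since we know from Calculus I that

\[
\arctan'(x) = \frac{1}{1 + x^2}
\]

for all $x \in \mathbb{R}$, such is given by

\[
\frac{b - \frac{ac}{2}}{\sqrt{d - \frac{c^2}{4}}} \cdot \arctan\left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right)
\]

for all $x \in \mathbb{R}$. Note that in the last step, we could also have employed change of variables, but the procedure here is more direct. Hence in the case that

\[
d > \frac{c^2}{4} ,
\]

an antiderivative of $p/q$, given by

\[
\frac{a x + b}{x^2 + c x + d}
\]

for every $x \in \mathbb{R}$, is given by

\[
\frac{a}{2} \cdot \ln(x^2 + c x + d) + \frac{b - \frac{ac}{2}}{\sqrt{d - \frac{c^2}{4}}} \cdot \arctan\left( \frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}} \right)
\]

for all $x \in \mathbb{R}$.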
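The antiderivative for the case without real zeros can be coded directly from the last formula. The following is a minimal sketch with a hypothetical function name `antiderivative_irreducible`; the parameters `a, b, c, d` correspond to $p(x) = ax + b$ and $q(x) = x^2 + cx + d$ with $d > c^2/4$.

```python
import math

def antiderivative_irreducible(a, b, c, d):
    """Antiderivative of (a x + b)/(x^2 + c x + d) on R, assuming d > c^2/4."""
    disc = d - c * c / 4
    if disc <= 0:
        raise ValueError("x^2 + c x + d must have no real zero (d > c^2/4)")
    root = math.sqrt(disc)

    def F(x):
        log_part = a / 2 * math.log(x * x + c * x + d)
        atan_part = (b - a * c / 2) / root * math.atan((x + c / 2) / root)
        return log_part + atan_part

    return F

# Illustration: integral of (3x + 4)/(x^2 + 2x + 2) over [0, 3].
F = antiderivative_irreducible(3.0, 4.0, 2.0, 2.0)
value = F(3.0) - F(0.0)
```

In the illustration, `value` equals $\frac{3}{2} \ln\left( \frac{17}{2} \right) + \arctan(4) - \frac{\pi}{4}$.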

The previous analysis can be generalized to quotients of the form $p/q$ where $p, q$ are polynomials of order $m$ and $n$, respectively, such that $m < n$. The result is given below without proof. The proof can be found in texts on function theory, that is, the theory of functions of one complex variable. For readers who already know complex numbers, we just indicate how their introduction might be helpful in this respect. For this, we consider the case that $p(x) = 1$ and $q(x) = x^2 + 1$ for all $x \in \mathbb{R}$. The polynomial $q$ has no real zero, but if we extend $q$ to the complex plane by $q(z) := z^2 + 1$ for every complex number $z$, then $q$ has the roots $i$ and $-i$, where $i$ denotes the imaginary unit, since

\[
q(i) = i^2 + 1 = -1 + 1 = 0 , \quad q(-i) = (-i)^2 + 1 = -1 + 1 = 0 .
\]

In particular,

\[
\frac{1}{q(z)} = \frac{i}{2} \left( \frac{1}{z + i} - \frac{1}{z - i} \right)
\]

for every complex $z$ different from $i$ and $-i$. As reflected in this example, the introduction of complex numbers allows in every case the decomposition of the extension of $p/q$ to complex numbers into sums of functions that assume the values

\[
\frac{1}{z - a} , \dots , \frac{1}{(z - a)^{\mu(a)}} ,
\]

in every complex $z$ not among the zeros of that extension of $q$, where $a$ runs through the zeros of $q$ and for every such $a$ the symbol $\mu(a)$ denotes the corresponding multiplicity. This fact simplifies the discussion significantly.

Lemma 3.1.25. Let $p, q : \mathbb{R} \to \mathbb{R}$ be polynomials of degree $m, n \in \mathbb{N}^*$, respectively, where $m < n$. Finally, let $a_1, \dots, a_r$ be the (possibly empty) sequence of pairwise different real roots of $q$, where $r \in \mathbb{N}$, and let $m_1, \dots, m_r$ be the sequence in $\mathbb{N}^*$ consisting of the corresponding multiplicities.

(i) There are $s \in \mathbb{N}$ along with (possibly empty and apart from reordering unique) sequences $(b_{r+1}, c_{r+1}), \dots, (b_{r+s}, c_{r+s})$ of pairwise different elements of $\mathbb{R} \times (0, \infty)$ and $m_{r+1}, \dots, m_{r+s}$ in $\mathbb{N}^*$ such that

\[
q(x) = q_n \cdot (x - a_1)^{m_1} \cdots (x - a_r)^{m_r} \cdot \left[ (x - b_{r+1})^2 + c_{r+1} \right]^{m_{r+1}} \cdots \left[ (x - b_{r+s})^2 + c_{r+s} \right]^{m_{r+s}}
\]

for all $x \in \mathbb{R}$ where $q_n$ is the coefficient of the $n$-th order of $q$.

(ii) There are unique sequences of real numbers $A_{11}, \dots, A_{1 m_1}, \dots, A_{r1}, \dots, A_{r m_r}$ and pairs of real numbers $(B_{r+1,1}, C_{r+1,1}), \dots, (B_{r+1,m_{r+1}}, C_{r+1,m_{r+1}}), \dots, (B_{r+s,1}, C_{r+s,1}), \dots, (B_{r+s,m_{r+s}}, C_{r+s,m_{r+s}})$, respectively, such that

\[
\begin{aligned}
\frac{p(x)}{q(x)} = {} & \frac{A_{11}}{x - a_1} + \cdots + \frac{A_{1 m_1}}{(x - a_1)^{m_1}} + \dots + \frac{A_{r1}}{x - a_r} + \cdots + \frac{A_{r m_r}}{(x - a_r)^{m_r}} \\
& + \frac{B_{r+1,1} \, x + C_{r+1,1}}{(x - b_{r+1})^2 + c_{r+1}} + \cdots + \frac{B_{r+1,m_{r+1}} \, x + C_{r+1,m_{r+1}}}{[ (x - b_{r+1})^2 + c_{r+1} ]^{m_{r+1}}} + \dots \\
& + \frac{B_{r+s,1} \, x + C_{r+s,1}}{(x - b_{r+s})^2 + c_{r+s}} + \cdots + \frac{B_{r+s,m_{r+s}} \, x + C_{r+s,m_{r+s}}}{[ (x - b_{r+s})^2 + c_{r+s} ]^{m_{r+s}}}
\end{aligned}
\]

for all $x \in \mathbb{R} \setminus \{a_1, \dots, a_r\}$.

Proof. See Function Theory.

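Corollary 3.1.26. Let $p, q, m, n$; $a_1, \dots, a_r$, $m_1, \dots, m_r$, $(b_{r+1}, c_{r+1}), \dots, (b_{r+s}, c_{r+s})$, $m_{r+1}, \dots, m_{r+s}$, $A_{11}, \dots, A_{1 m_1}, \dots, A_{r1}, \dots, A_{r m_r}$ and $(B_{r+1,1}, C_{r+1,1}), \dots, (B_{r+1,m_{r+1}}, C_{r+1,m_{r+1}}), \dots, (B_{r+s,1}, C_{r+s,1}), \dots, (B_{r+s,m_{r+s}}, C_{r+s,m_{r+s}})$ be as in Lemma 3.1.25. Then by

\[
\begin{aligned}
F(x) := {} & A_{11} \ln(|x - a_1|) - \cdots - \frac{1}{m_1 - 1} \cdot \frac{A_{1 m_1}}{(x - a_1)^{m_1 - 1}} + \dots \\
& + A_{r1} \ln(|x - a_r|) - \cdots - \frac{1}{m_r - 1} \cdot \frac{A_{r m_r}}{(x - a_r)^{m_r - 1}} + \dots \\
& + \frac{B_{r+1,1}}{2} \ln[ (x - b_{r+1})^2 + c_{r+1} ] + \frac{b_{r+1} B_{r+1,1} + C_{r+1,1}}{\sqrt{c_{r+1}}} \arctan\left( \frac{x - b_{r+1}}{\sqrt{c_{r+1}}} \right) + \dots \\
& + \frac{B_{r+1,m_{r+1}}}{2 (1 - m_{r+1})} \cdot \frac{1}{[ (x - b_{r+1})^2 + c_{r+1} ]^{m_{r+1} - 1}} + ( b_{r+1} B_{r+1,m_{r+1}} + C_{r+1,m_{r+1}} ) \cdot F_{r+1}(x) + \dots \\
& + \frac{B_{r+s,1}}{2} \ln[ (x - b_{r+s})^2 + c_{r+s} ] + \frac{b_{r+s} B_{r+s,1} + C_{r+s,1}}{\sqrt{c_{r+s}}} \arctan\left( \frac{x - b_{r+s}}{\sqrt{c_{r+s}}} \right) + \dots \\
& + \frac{B_{r+s,m_{r+s}}}{2 (1 - m_{r+s})} \cdot \frac{1}{[ (x - b_{r+s})^2 + c_{r+s} ]^{m_{r+s} - 1}} + ( b_{r+s} B_{r+s,m_{r+s}} + C_{r+s,m_{r+s}} ) \cdot F_{r+s}(x)
\end{aligned}
\]

for all $x \in \mathbb{R} \setminus \{a_1, \dots, a_r\}$, there is defined an anti-derivative $F$ of $p/q$. Here $F_{r+1}, \dots, F_{r+s} : \mathbb{R} \to \mathbb{R}$ denote anti-derivatives satisfying

\[
F_{r+l}'(x) = \frac{1}{[ (x - b_{r+l})^2 + c_{r+l} ]^{m_{r+l}}}
\]

for all $x \in \mathbb{R}$ and $l = 1, \dots, s$. Note that such functions can be calculated by the recursion formula from Example 3.1.23.

In the following, we give five examples of typical applications of the previous lemma and its corollary. The fifth example gives such an application to the solution of a ('separable') first order differential equation.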

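Example 3.1.27. Calculate

\[
\int_0^2 \frac{4}{x^2 - 9} \, dx .
\]

Solution:

\[
\begin{aligned}
\int_0^2 \frac{4}{x^2 - 9} \, dx &= \int_0^2 \frac{4}{(x - 3)(x + 3)} \, dx = \int_0^2 \frac{2}{3} \left( \frac{1}{x - 3} - \frac{1}{x + 3} \right) dx \\
&= \frac{2}{3} \big( \ln(|2 - 3|) - \ln(|2 + 3|) \big) - \frac{2}{3} \big( \ln(|-3|) - \ln(|3|) \big) = - \frac{2}{3} \ln(5) ,
\end{aligned}
\]

where it has been used that for every function $f$

\[
\frac{1}{(f(x) + a)(f(x) + b)} = \frac{1}{b - a} \left( \frac{1}{f(x) + a} - \frac{1}{f(x) + b} \right) , \tag{3.1.12}
\]

where $a, b \in \mathbb{R}$ are such that $a \neq b$ and $x \in D(f)$ is such that $f(x) \notin \{-a, -b\}$. The previous identity is also of use in applications of the method of integration by partial fractions to more complicated situations.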


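Example 3.1.28. Calculate

\[
\int_0^3 \frac{3x + 4}{x^2 + 2x + 2} \, dx .
\]

Solution:

\[
\begin{aligned}
\int_0^3 \frac{3x + 4}{x^2 + 2x + 2} \, dx &= \int_0^3 \frac{3}{2} \cdot \frac{2x + 2}{x^2 + 2x + 2} \, dx + \int_0^3 \frac{1}{(x + 1)^2 + 1} \, dx \\
&= \frac{3}{2} \ln(3^2 + 2 \cdot 3 + 2) + \arctan(3 + 1) - \frac{3}{2} \ln(2) - \arctan(1) \\
&= \frac{3}{2} \ln\left( \frac{17}{2} \right) + \arctan(4) - \frac{\pi}{4} .
\end{aligned}
\]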


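Example 3.1.29. Calculate

\[
\int_1^2 \frac{1}{x^2 (x^2 + 1)^2} \, dx .
\]

Solution: Since the integrand is a restriction of the composition of the maps $( \mathbb{R}^* \to \mathbb{R}, x \mapsto 1 / [ x (x + 1)^2 ] )$ and $( \mathbb{R}^* \to \mathbb{R}, x \mapsto x^2 )$, by Lemma 3.1.25 there are $A, B, C \in \mathbb{R}$ such that

\[
\frac{1}{x^2 (x^2 + 1)^2} = \frac{A}{x^2} + \frac{B}{x^2 + 1} + \frac{C}{(x^2 + 1)^2} \tag{3.1.13}
\]

for all $x \neq 0$. Hence for all $x \in \mathbb{R}$

\[
1 = A (x^2 + 1)^2 + B x^2 (x^2 + 1) + C x^2 = (A + B) x^4 + (2A + B + C) x^2 + A
\]

and hence $A = 1$, $B = -1$ and $C = -1$. Hence it follows by the recursion formula from Example 3.1.23, applied with $m = 1$, $c = 1$ and the limits of integration 1 and 2, that

\[
\begin{aligned}
\int_1^2 \frac{1}{x^2 (x^2 + 1)^2} \, dx &= \int_1^2 \frac{1}{x^2} \, dx - \int_1^2 \frac{1}{x^2 + 1} \, dx - \int_1^2 \frac{1}{(x^2 + 1)^2} \, dx \\
&= \frac{1}{2} + \frac{\pi}{4} - \arctan(2) - \int_1^2 \frac{1}{(x^2 + 1)^2} \, dx \\
&= \frac{1}{2} + \frac{\pi}{4} - \arctan(2) - \frac{1}{2} \left( \frac{2}{5} - \frac{1}{2} \right) - \frac{1}{2} \int_1^2 \frac{1}{x^2 + 1} \, dx \\
&= \frac{11}{20} + \frac{3}{2} \left( \frac{\pi}{4} - \arctan(2) \right) .
\end{aligned}
\]

Another way of arriving at the decomposition (3.1.13) is by help of the identity (3.1.12) which leads to

\[
\begin{aligned}
\frac{1}{x^2 (x^2 + 1)^2} &= \frac{1}{x^2 + 1} \cdot \frac{1}{x^2 (x^2 + 1)} = \frac{1}{x^2 + 1} \left( \frac{1}{x^2} - \frac{1}{x^2 + 1} \right) \\
&= \frac{1}{x^2 (x^2 + 1)} - \frac{1}{(x^2 + 1)^2} = \frac{1}{x^2} - \frac{1}{x^2 + 1} - \frac{1}{(x^2 + 1)^2}
\end{aligned}
\]

for all $x \in \mathbb{R}^*$.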
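Since the integrand of this example is positive on $[1, 2]$, the value of the integral has to be positive, which provides a quick plausibility test for the closed form. The sketch below is an illustration only; it assumes a composite Simpson rule (hypothetical helper `simpson`) and compares the numerical value with $\frac{11}{20} + \frac{3}{2} \left( \frac{\pi}{4} - \arctan(2) \right)$, which is what the recursion formula of Example 3.1.23 yields with the boundary term $\frac{1}{2} \left( \frac{2}{5} - \frac{1}{2} \right)$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def integrand(x):
    """1 / (x^2 (x^2 + 1)^2)."""
    return 1.0 / (x * x * (x * x + 1) ** 2)

def decomposed(x):
    """Partial fraction decomposition (3.1.13) with A = 1, B = -1, C = -1."""
    return 1.0 / (x * x) - 1.0 / (x * x + 1) - 1.0 / (x * x + 1) ** 2

numeric = simpson(integrand, 1.0, 2.0)
closed_form = 11 / 20 + 1.5 * (math.pi / 4 - math.atan(2.0))
```

Both values are approximately $0.0674$.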

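Example 3.1.30. Calculate

\[
\int_a^x \frac{dy}{1 + y^4}
\]

where $a \in \mathbb{R}$ and $x \geq a$. Solution: Since $y^4 + 1 > 0$ for all $y \in \mathbb{R}$, according to Lemma 3.1.25 there are $b, c, d, e \in \mathbb{R}$ such that

\[
\begin{aligned}
1 + y^4 &= (y^2 + b y + c) \cdot (y^2 + d y + e) \\
&= y^4 + d y^3 + e y^2 + b y^3 + b d y^2 + b e y + c y^2 + c d y + c e \\
&= y^4 + (b + d) y^3 + (c + e + b d) y^2 + (b e + c d) y + c e
\end{aligned} \tag{3.1.14}
\]

for all $y \in \mathbb{R}$. This equation is satisfied if and only if

\[
b + d = 0 , \quad c + e + b d = 0 , \quad b e + c d = 0 , \quad c e = 1 .
\]

From the first equation, we conclude that $d = -b$ which leads to the equivalent reduced system

\[
d = -b , \quad e + c = b^2 , \quad b (e - c) = 0 , \quad c e = 1 .
\]

The assumption that $b = 0$ leads to $e = -c$ and $-1 = -c e = c^2$, which is impossible for real $c$. Hence it follows that $b \neq 0$. Therefore, the third equation of the last system leads to the equivalent reduced system

\[
d = -b , \quad b^2 = 2c , \quad e = c , \quad c^2 = 1
\]

which has the solution $c = e = 1$ and $b = \sqrt{2}$, $d = -\sqrt{2}$. (The other remaining solution $c = e = 1$ and $b = -\sqrt{2}$, $d = \sqrt{2}$ results in a reordering of the factors in (3.1.14).) Hence it follows that

\[
1 + y^4 = (y^2 + \sqrt{2} \, y + 1) \cdot (y^2 - \sqrt{2} \, y + 1)
\]

for all $y \in \mathbb{R}$. Note that the last could also have been derived more simply as follows

\[
1 + y^4 = 1 + 2 y^2 + y^4 - 2 y^2 = (y^2 + 1)^2 - (\sqrt{2} \, y)^2 = (y^2 + \sqrt{2} \, y + 1) \cdot (y^2 - \sqrt{2} \, y + 1)
\]

valid for all $y \in \mathbb{R}$. Further, according to Lemma 3.1.25 there are uniquely determined $A, B, C, D \in \mathbb{R}$ such that

\[
\frac{1}{1 + y^4} = \frac{A y + B}{y^2 + \sqrt{2} \, y + 1} + \frac{C y + D}{y^2 - \sqrt{2} \, y + 1} \tag{3.1.15}
\]

for all $y \in \mathbb{R}$. In particular, this implies that

\[
\frac{1}{1 + y^4} = \frac{1}{1 + (-y)^4} = \frac{- A y + B}{y^2 - \sqrt{2} \, y + 1} + \frac{- C y + D}{y^2 + \sqrt{2} \, y + 1}
= \frac{- C y + D}{y^2 + \sqrt{2} \, y + 1} + \frac{- A y + B}{y^2 - \sqrt{2} \, y + 1}
\]

for all $y \in \mathbb{R}$. Since $A, B, C$ and $D$ are uniquely determined by the equations (3.1.15) for every $y \in \mathbb{R}$, it follows that $C = -A$ and $D = B$. Hence we conclude that there are uniquely determined $A, B \in \mathbb{R}$ such that

\[
\frac{1}{1 + y^4} = \frac{A y + B}{y^2 + \sqrt{2} \, y + 1} + \frac{- A y + B}{y^2 - \sqrt{2} \, y + 1}
\]

for all $y \in \mathbb{R}$. In particular,

\[
1 = \frac{1}{1 + 0^4} = 2B
\]

and hence $B = 1/2$. Also

\[
\begin{aligned}
\frac{1}{2} = \frac{1}{1 + 1^4} &= \frac{A + (1/2)}{2 + \sqrt{2}} + \frac{- A + (1/2)}{2 - \sqrt{2}}
= \frac{2 - \sqrt{2}}{2} \left( A + \frac{1}{2} \right) + \frac{2 + \sqrt{2}}{2} \left( - A + \frac{1}{2} \right) \\
&= - \frac{2 \sqrt{2}}{2} \, A + \frac{2}{2}
\end{aligned}
\]

and hence $A = \sqrt{2} / 4$. We conclude that

\[
\begin{aligned}
\frac{1}{1 + y^4} &= \frac{1}{4} \left[ \frac{\sqrt{2} \, y + 2}{y^2 + \sqrt{2} \, y + 1} + \frac{- \sqrt{2} \, y + 2}{y^2 - \sqrt{2} \, y + 1} \right] \\
&= \frac{1}{4} \left[ \frac{\sqrt{2} \, y + 1}{y^2 + \sqrt{2} \, y + 1} - \frac{\sqrt{2} \, y - 1}{y^2 - \sqrt{2} \, y + 1} \right]
+ \frac{\sqrt{2}}{4} \left[ \frac{\sqrt{2}}{\left( \sqrt{2} \, y + 1 \right)^2 + 1} + \frac{\sqrt{2}}{\left( \sqrt{2} \, y - 1 \right)^2 + 1} \right]
\end{aligned}
\]

for all $y \in \mathbb{R}$. Hence it follows that

\[
\begin{aligned}
\int_a^x \frac{dy}{1 + y^4} = {} & \frac{\sqrt{2}}{8} \left[ \ln\left( \frac{x^2 + \sqrt{2} \, x + 1}{a^2 + \sqrt{2} \, a + 1} \right) - \ln\left( \frac{x^2 - \sqrt{2} \, x + 1}{a^2 - \sqrt{2} \, a + 1} \right) \right] \\
& + \frac{\sqrt{2}}{4} \left[ \arctan(\sqrt{2} \, x + 1) - \arctan(\sqrt{2} \, a + 1) \right] \\
& + \frac{\sqrt{2}}{4} \left[ \arctan(\sqrt{2} \, x - 1) - \arctan(\sqrt{2} \, a - 1) \right] .
\end{aligned}
\]

Fig. 71: Graph of the antiderivative $F$ of $f(x) := 1/(1 + x^4)$, $x \in \mathbb{R}$, satisfying $F(0) = 0$. Compare Example 3.1.30.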
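The antiderivative obtained in Example 3.1.30 can be tested by numerical differentiation. The following is a minimal sketch; the names `antiderivative` and `integral` are hypothetical, and the central difference quotient used in the check is only an approximation of the derivative.

```python
import math

SQRT2 = math.sqrt(2.0)

def antiderivative(y):
    """Antiderivative of 1/(1 + y^4) vanishing at y = 0, from Example 3.1.30."""
    log_part = math.log((y * y + SQRT2 * y + 1) / (y * y - SQRT2 * y + 1))
    atan_part = math.atan(SQRT2 * y + 1) + math.atan(SQRT2 * y - 1)
    return SQRT2 / 8 * log_part + SQRT2 / 4 * atan_part

def integral(a, x):
    """Integral of 1/(1 + y^4) over [a, x]."""
    return antiderivative(x) - antiderivative(a)

def derivative_error(y, h=1e-5):
    """Difference between a central difference quotient and 1/(1 + y^4)."""
    quotient = (antiderivative(y + h) - antiderivative(y - h)) / (2 * h)
    return quotient - 1.0 / (1.0 + y ** 4)
```

Both quadratic polynomials under the logarithm are positive on all of $\mathbb{R}$, so that `antiderivative` is defined for every real argument. For large $x$, `integral(0, x)` approaches $\sqrt{2} \pi / 4$.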

Remark 3.1.31. The previous example gives another illustration of the general rule that one should never blindly rely on computer programs. In Mathematica 5.1, the command

Integrate[1/(1 + x^4), x]

gives the output

\[
\frac{1}{4 \sqrt{2}} \left( -2 \, \mathrm{ArcTan}[1 - \sqrt{2} \, x] + 2 \, \mathrm{ArcTan}[1 + \sqrt{2} \, x] - \mathrm{Log}[-1 + \sqrt{2} \, x - x^2] + \mathrm{Log}[1 + \sqrt{2} \, x + x^2] \right)
\]

which is incorrect. A first inspection of the last formula reveals that the argument of the first natural logarithm function, $-1 + \sqrt{2} \, x - x^2 = - [ (x - \sqrt{2}/2)^2 + 1/2 ]$, is negative for every real $x$, in particular for large $x$, such that the logarithm is not defined. This gives a first indication that the expression is incorrect. Comparison with the result from Example 3.1.30 shows that the sign of that argument has to be reversed.

Fig. 72: Graphs of the solutions $f_0$, $f_{1/4}$, $f_{1/2}$, $f_{3/4}$ and $f_1$ of (3.1.16) in the case that $a = 1$. Compare Example 3.1.32.

Example 3.1.32. Find solutions of the following differential equation for $f : \mathbb{R} \to \mathbb{R}$ with the specified initial values.

\[
f'(x) = a f(x) (1 - f(x)) \tag{3.1.16}
\]

for all $x \in \mathbb{R}$ where $a > 0$, $f(0) \in (0, 1)$. Solution: If $f$ is such a function, it follows that $f$ is continuously differentiable. Since $f(0) \in (0, 1)$, it follows by the continuity of $f$ the existence of $c, d \in \mathbb{R}$ such that $c < 0 < d$ and such that $f([c, d]) \subset (0, 1)$. Since $a > 0$ and the function $a f (1 - f)$ is $> 0$ on the interval $[c, d]$, it follows from (3.1.2) that

\[
f(x_1) = f(x_0) + f(x_1) - f(x_0) = f(x_0) + \int_{x_0}^{x_1} f'(x) \, dx = f(x_0) + \int_{x_0}^{x_1} a f(x) (1 - f(x)) \, dx \geq f(x_0)
\]

for all $x_0, x_1 \in [c, d]$ such that $x_0 \leq x_1$. In addition, the restriction of $f$ to $[c, d]$ is non-constant since the function $( \mathbb{R} \to \mathbb{R}, x \mapsto a x (1 - x) )$ has no zeros on $(0, 1)$. Hence we conclude from (3.1.2) by Theorem 3.1.1 for $x \in [c, d]$ that

\[
\begin{aligned}
a (x - c) &= \int_c^x a \, dy = \int_c^x \frac{f'(y)}{f(y) (1 - f(y))} \, dy = \int_{f(c)}^{f(x)} \frac{du}{u (1 - u)} \\
&= \int_{f(c)}^{f(x)} \left( \frac{1}{u} + \frac{1}{1 - u} \right) du = \left[ \ln\left( \frac{u}{1 - u} \right) \right]_{f(c)}^{f(x)} = \ln\left( \frac{1 - f(c)}{f(c)} \cdot \frac{f(x)}{1 - f(x)} \right)
\end{aligned}
\]

and hence that

\[
\frac{f(c)}{1 - f(c)} \, e^{-ac} e^{ax} = \frac{f(x)}{1 - f(x)} = \frac{f(x) - 1 + 1}{1 - f(x)} = \frac{1}{1 - f(x)} - 1 .
\]

For $x = 0$, this implies that

\[
\frac{f(c)}{1 - f(c)} \, e^{-ac} = \frac{f(0)}{1 - f(0)}
\]

and hence that

\[
\frac{f(0)}{1 - f(0)} \, e^{ax} = \frac{1}{1 - f(x)} - 1 .
\]

Finally, this leads to

\[
f(x) = 1 - \frac{1}{1 + \frac{f(0)}{1 - f(0)} \, e^{ax}} = \frac{e^{ax}}{e^{ax} + \frac{1 - f(0)}{f(0)}} .
\]

On the other hand, for every $c \in \mathbb{R}^*$, it follows by elementary calculation that the function $f_c$ defined by

\[
f_c(x) := \begin{cases}
\dfrac{e^{ax}}{e^{ax} + \frac{1 - c}{c}} \quad \text{for } x \in \mathbb{R} & \text{if } 0 < c \leq 1 \\[2ex]
\dfrac{e^{ax}}{e^{ax} + \frac{1 - c}{c}} \quad \text{for } x \in \mathbb{R} \setminus \{ a^{-1} \ln((c - 1)/c) \} & \text{if } c > 1 \text{ or } c < 0
\end{cases}
\]

satisfies (3.1.16). Note also that $f_0$, defined as the constant function of value zero on $\mathbb{R}$, is a further solution of (3.1.16) such that $f_0(0) = 0$.

Fig. 73: Graphs of the solutions $f_2$ and $f_4$ of (3.1.16) in the case that $a = 1$. Compare Example 3.1.32.
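That the functions $f_c$ satisfy (3.1.16) can also be checked numerically. The sketch below is an illustration under stated assumptions: the names `logistic_solution` and `residual` are hypothetical, and the derivative is approximated by a central difference quotient, so that the residual is only expected to be small, not exactly zero.

```python
import math

def logistic_solution(a, c):
    """Return f_c with f_c(x) = e^{ax} / (e^{ax} + (1 - c)/c), where c != 0.

    For c > 1 or c < 0 the returned function has a pole at ln((c - 1)/c)/a.
    """
    if c == 0:
        raise ValueError("c must be non-zero")
    k = (1 - c) / c

    def f(x):
        e = math.exp(a * x)
        return e / (e + k)

    return f

def residual(f, a, x, h=1e-5):
    """Approximate f'(x) - a f(x) (1 - f(x)) using a central difference."""
    deriv = (f(x + h) - f(x - h)) / (2 * h)
    return deriv - a * f(x) * (1 - f(x))

f_quarter = logistic_solution(1.0, 0.25)  # initial value f(0) = 1/4
f_two = logistic_solution(1.0, 2.0)       # initial value f(0) = 2, pole at -ln(2)
```

The initial value is recovered as $f_c(0) = c$, and for $0 < c < 1$ the values of $f_c$ stay inside the interval $(0, 1)$.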

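Problems

1) Calculate the integral.

a) $\int_{-2}^{2} \frac{3u + 2}{u^2 + u - 12} \, du$ , b) $\int_0^2 \frac{2x + 1}{x^2 - 6x + 9} \, dx$ ,

c) $\int_3^4 \frac{u^3}{u^2 + 3} \, du$ , d) $\int_0^1 \frac{3x + 1}{x^3 - 7x - 6} \, dx$ ,

e) $\int_2^3 \frac{x^2 + 3}{x^3 + 6x^2 + 12x + 8} \, dx$ , f) $\int_1^3 \frac{x^2 + 3x - 1}{x^3 - 2x^2 - 7x - 4} \, dx$ ,

g) $\int_{-1}^{1} \frac{x^2 + 3x + 4}{x^3 + 4x^2 + x + 4} \, dx$ , h) $\int_{-3}^{3} \frac{x^2 - 1}{x^4 + 4x^2 + 4} \, dx$ ,

i) $\int_0^4 \frac{3x + 5}{x^4 + 4x^2 + 3} \, dx$ , j) $\int_2^4 \frac{3x + 7}{x^4 + 4x^3 + 6x^2 + 4x + 1} \, dx$ ,

k) $\int_{-1/2}^{1/2} \frac{x^3 + 4x + 5}{x^4 - 2x^2 + 1} \, dx$ , l) $\int_0^1 \frac{x^3 + 3x^2 + 1}{x^4 - 15x^2 + 10x + 24} \, dx$ ,

m) $\int_{-2}^{0} \frac{2x^2 + 1}{x^4 + x^3 - 9x^2 + 11x - 4} \, dx$ ,

n) $\int_0^1 \frac{x^3 + x + 1}{x^4 - 3x^3 - 3x^2 + 7x + 6} \, dx$ ,

o) $\int_0^3 \frac{x^2 + 1}{x^4 + 2x^3 + 3x^2 + 4x + 2} \, dx$ ,

p) $\int_1^2 \frac{2x^3 + x^2 + 4}{x^4 + 3x^3 + 5x^2 + 9x + 6} \, dx$ .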

3.1.4 Approximate Numerical Calculation of Integrals

Usually, in cases where an evaluation of a given integral in terms of known functions appears to be impossible, resort is taken to approximation methods. Basic numerical methods for this, the midpoint rule, the trapezoid rule and Simpson's rule, are given within this section. Each of them uses approximations of integrands analogous to those leading to upper and lower sums in the definition of the Riemann integral. For this, partitions of the interval of integration $I$ are used which induce divisions into subintervals of equal length. Generally, the decrease of that length leads to better approximations. On each subinterval, the corresponding restriction of the integrand $f : I \to \mathbb{R}$ is replaced by a certain polynomial approximation characteristic for each method. The integral of $f$ over $I$ is then approximated by the sum of the integrals of the approximating polynomials over the subintervals. The midpoint rule uses on each subinterval the constant polynomial whose value coincides with the value of $f$ in the midpoint of that interval. This is equivalent to the approximation of $f$ by its linearization around the midpoint of the subinterval, since the integral of the non-constant part of the polynomial over that interval vanishes. The last is the reason why the midpoint rule leads to results which are similar in accuracy to those of the trapezoid method. The trapezoid method approximates $f$ on each subinterval by the linear polynomial that interpolates between the values of $f$ at the interval ends, i.e., by that linear polynomial that assumes the same values as $f$ at both ends of the subinterval. Finally, Simpson's method approximates $f$ on each subinterval by the quadratic polynomial that interpolates between the values of $f$ at the end points and at the midpoint of that interval.

From this description, it might be expected that among those methods, Simpson's rule is the most accurate, followed by the trapezoid rule and the midpoint rule. Indeed, Simpson's rule is the most accurate, which is also reflected in the fact that its error is proportional to $n^{-4}$ where $n$ is the number of subintervals of the division. On the other hand, the error of both the midpoint and the trapezoid rule is proportional to $n^{-2}$. Often, the trapezoid rule gives better approximations than the midpoint rule, but there are also cases known where the opposite is true. For instance, in the examples below this is the case. All these methods can lead to poor results in the case of an oscillating $f$ as long as the length of the subintervals is comparable to the average distance of subsequent minima and maxima of $f$. Such cases are depicted in the figures below.

The key for the following derivation of an error estimate for the midpointrule is the observation that the associated integrals over the subintervals co-incide with those of the linearization of the integrand around the midpoints.As a consequence, the remainder estimate of Corollary 2.5.26 to Taylor’stheorem can be applied.

Theorem 3.1.33. (Midpoint Rule) Let $a, b \in \mathbb{R}$ be such that $a < b$, $f : [a, b] \to \mathbb{R}$ be bounded and twice differentiable on $(a, b)$ such that $|f''(x)| \le K$ for all $x \in (a, b)$ and some $K \ge 0$. Then

(i)
\[ \left| \int_a^b f(x)\,dx - f\left(\frac{a+b}{2}\right)(b-a) \right| \le \frac{K}{24}\,(b-a)^3 . \]

(ii) In addition, let $n \in \mathbb{N}^*$, $h := (b-a)/n$ and $a_i := a + i \cdot h$ for all $i \in \{0, \dots, n\}$. Then
\[ \left| \int_a^b f(x)\,dx - h \sum_{i=0}^{n-1} f\left(\frac{a_i + a_{i+1}}{2}\right) \right| \le \frac{K}{24}\,\frac{(b-a)^3}{n^2} . \]

Fig. 74: Midpoint approximation.

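Proof. (i) By Corollary 2.5.26 to Taylor’s theorem, it follows that
\[ |f(x) - p_1(x)| \le \frac{K}{2} \left( x - \frac{a+b}{2} \right)^2 \]
for all $x \in (a, b)$, where
\[ p_1(x) := f\left(\frac{a+b}{2}\right) + f'\left(\frac{a+b}{2}\right) \left( x - \frac{a+b}{2} \right) \]
for all $x \in \mathbb{R}$ is the first-degree Taylor polynomial of $f$ centered at $(a+b)/2$. Further,
\[ \int_a^b p_1(x)\,dx = f\left(\frac{a+b}{2}\right)(b-a) + f'\left(\frac{a+b}{2}\right) \cdot \int_a^b \left( x - \frac{a+b}{2} \right) dx \]
\[ = f\left(\frac{a+b}{2}\right)(b-a) + \frac{1}{2} \cdot f'\left(\frac{a+b}{2}\right) \cdot \left. \left( x - \frac{a+b}{2} \right)^2 \right|_a^b = f\left(\frac{a+b}{2}\right)(b-a) . \]
In addition,
\[ \left| \int_a^b f(x)\,dx - \int_a^b p_1(x)\,dx \right| \le \int_a^b |f(x) - p_1(x)|\,dx \]
\[ \le \frac{K}{2} \int_a^b \left( x - \frac{a+b}{2} \right)^2 dx = \frac{K}{6} \left. \left( x - \frac{a+b}{2} \right)^3 \right|_a^b = \frac{K}{24}\,(b-a)^3 . \]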

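(ii) is a simple consequence of (i): applying (i) to each of the $n$ subintervals $[a_i, a_{i+1}]$ of length $h$ and summing the resulting estimates gives the bound $n \cdot \frac{K}{24}\,h^3 = \frac{K}{24}\,\frac{(b-a)^3}{n^2}$.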

Example 3.1.34. We use the midpoint rule to approximate the value of
\[ \ln(2) = \int_1^2 \frac{dx}{x} . \]
For this, we use the partition
\[ (a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4) \]
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
\[ h \sum_{i=0}^{3} f\left(\frac{a_i + a_{i+1}}{2}\right) = \frac{1}{2} \sum_{i=0}^{3} \frac{1}{a_i + a_{i+1}} = \frac{1}{2} \left( \frac{1}{\frac{4}{4} + \frac{5}{4}} + \frac{1}{\frac{5}{4} + \frac{6}{4}} + \frac{1}{\frac{6}{4} + \frac{7}{4}} + \frac{1}{\frac{7}{4} + \frac{8}{4}} \right) \]
\[ = 2 \left( \frac{1}{9} + \frac{1}{11} + \frac{1}{13} + \frac{1}{15} \right) = \frac{4448}{6435} \approx 0.691 \]
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to three decimal places. To three decimal places, $\ln(2)$ is given by
\[ \ln(2) \approx 0.693 . \]
The result of this application of the midpoint rule gives $\ln(2)$ within an error of $2 \cdot 10^{-3}$. Since
\[ |f''(x)| = 2x^{-3} \le 2 \]
for all $x \in (1, 2)$, Theorem 3.1.33 (ii) leads to the error bound
\[ \left| \frac{4448}{6435} - \ln(2) \right| \le \frac{2}{24 \cdot 16} = \frac{1}{192} < 6 \cdot 10^{-3} . \]

Fig. 75: Trapezoid approximation.
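The computation of Example 3.1.34 can be reproduced numerically. The following Python sketch is only an illustration; the function name `midpoint_rule` is an assumption of the sketch and does not appear in the text.

```python
import math


def midpoint_rule(f, a, b, n):
    """Composite midpoint rule on [a, b] with n subintervals of equal length."""
    h = (b - a) / n
    # Sum the integrand values at the midpoints a + (i + 1/2) h of the subintervals.
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


approx = midpoint_rule(lambda x: 1.0 / x, 1.0, 2.0, 4)
error = abs(approx - math.log(2.0))
# Error bound of Theorem 3.1.33 (ii) with K = 2, b - a = 1 and n = 4.
bound = 2.0 / 24.0 * (2.0 - 1.0) ** 3 / 4 ** 2
print(approx, error, bound)
```

The printed error should lie below the printed bound, in agreement with the estimate of the example.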

The following derivation of an error estimate for the trapezoid rule exploits the fact that the difference of the approximating polynomial on a subinterval and the restriction of the integrand vanishes at the interval ends. By partial integration, the integral of such a difference can be transformed into an integral containing the second order derivative of the difference, instead. Since the approximating polynomial is only of first order, the last coincides with the second order derivative of the restriction of the integrand. This leads to an error estimate in terms of a bound on the second derivative of $f$.

Theorem 3.1.35. (Trapezoid Rule) Let $I$ be some non-empty open interval of $\mathbb{R}$, $f : I \to \mathbb{R}$ be twice continuously differentiable and $a, b \in I$ be such that $a < b$. In particular, let $|f''(x)| \le K$ for all $x \in (a, b)$ and some $K \ge 0$. Then

(i)
\[ \left| \int_a^b f(x)\,dx - \frac{f(a) + f(b)}{2} \cdot (b-a) \right| \le \frac{K}{12}\,(b-a)^3 . \]

(ii) In addition, let $n \in \mathbb{N}^*$, $h := (b-a)/n$ and $a_i := a + i \cdot h$ for all $i \in \{0, \dots, n\}$. Then
\[ \left| \int_a^b f(x)\,dx - h \sum_{i=0}^{n-1} \frac{f(a_i) + f(a_{i+1})}{2} \right| \le \frac{K}{12}\,\frac{(b-a)^3}{n^2} . \]

Proof. Define
\[ p(x) := f(a) + \frac{f(b) - f(a)}{b-a} \cdot (x - a) \]
for all $x \in \mathbb{R}$ and $h := f - p$. In particular, it follows that $h(a) = h(b) = 0$ and
\[ \int_a^b p(x)\,dx = \frac{f(a) + f(b)}{2} \cdot (b-a) . \]
By partial integration, it follows that
\[ \int_a^b h(x)\,dx = \frac{1}{2} \int_a^b (x-b)(x-a)\,h''(x)\,dx = \frac{1}{2} \int_a^b (x-b)(x-a)\,f''(x)\,dx \]
and hence that
\[ \left| \int_a^b h(x)\,dx \right| \le \frac{1}{2} \int_a^b (b-x)(x-a)\,|f''(x)|\,dx \le \frac{K}{2} \int_a^b (b-x)(x-a)\,dx = \frac{K}{12}\,(b-a)^3 . \]
(ii) is a simple consequence of (i).

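Example 3.1.36. As before with the midpoint rule, we use the trapezoid rule to approximate the value of
\[ \ln(2) = \int_1^2 \frac{dx}{x} . \]
Again, we use the partition
\[ (a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4) \]
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
\[ h \sum_{i=0}^{n-1} \frac{f(a_i) + f(a_{i+1})}{2} = \frac{1}{8} \sum_{i=0}^{3} \left( \frac{1}{a_i} + \frac{1}{a_{i+1}} \right) = \frac{1}{8} \left( \frac{4}{4} + \frac{4}{5} + \frac{4}{5} + \frac{4}{6} + \frac{4}{6} + \frac{4}{7} + \frac{4}{7} + \frac{4}{8} \right) \]
\[ = \frac{1}{2} \left( \frac{1}{4} + \frac{2}{5} + \frac{2}{6} + \frac{2}{7} + \frac{1}{8} \right) = \frac{1171}{1680} \approx 0.697 \]
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to three decimal places. To three decimal places, $\ln(2)$ is given by
\[ \ln(2) \approx 0.693 . \]
The result of this application of the trapezoid rule gives $\ln(2)$ within an error of $4 \cdot 10^{-3}$. Since
\[ |f''(x)| = 2x^{-3} \le 2 \]
for all $x \in (1, 2)$, Theorem 3.1.35 (ii) leads to the error bound
\[ \left| \frac{1171}{1680} - \ln(2) \right| \le \frac{2}{12 \cdot 16} = \frac{1}{96} < 11 \cdot 10^{-3} . \]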
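As a numerical cross-check of Example 3.1.36, the following Python sketch evaluates the composite trapezoid sum for the same partition. The function name `trapezoid_rule` is an assumption of the sketch, not notation from the text.

```python
import math


def trapezoid_rule(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n subintervals of equal length."""
    h = (b - a) / n
    # Average of the integrand values at both ends of each subinterval.
    return h * sum((f(a + i * h) + f(a + (i + 1) * h)) / 2.0 for i in range(n))


approx = trapezoid_rule(lambda x: 1.0 / x, 1.0, 2.0, 4)
error = abs(approx - math.log(2.0))
# Error bound of Theorem 3.1.35 (ii) with K = 2, b - a = 1 and n = 4.
bound = 2.0 / 12.0 * (2.0 - 1.0) ** 3 / 4 ** 2
print(approx, error, bound)
```

For this integrand, the trapezoid error comes out roughly twice as large as the midpoint error, which matches the ratio of the constants 1/12 and 1/24 in the two error bounds.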

The following derivation of an error estimate for Simpson’s rule is similar to that for the trapezoid rule. Again, it uses partial integration to exploit the fact that the difference of the approximating polynomial on a subinterval and the restriction of the integrand vanishes at the endpoints and also in the middle of the interval. This leads to an error estimate in terms of a bound on the fourth derivative of the integrand.

Theorem 3.1.37. (Simpson’s Rule) Let $h > 0$, $I$ be some open interval of $\mathbb{R}$ containing $[-h, h]$, $f : I \to \mathbb{R}$ be four times continuously differentiable and $|f^{(iv)}(x)| \le K$ for all $x \in (-h, h)$ and some $K \ge 0$. Then
\[ \left| \int_{-h}^{h} f(x)\,dx - \frac{1}{3}\,[f(-h) + 4f(0) + f(h)] \cdot h \right| \le \frac{K}{90}\,h^5 . \]

Proof. Define
\[ p(x) := \left\{ \frac{1}{2}\,[f(h) + f(-h)] - f(0) \right\} \cdot \frac{x^2}{h^2} + \frac{1}{2}\,[f(h) - f(-h)] \cdot \frac{x}{h} + f(0) \]
for all $x \in \mathbb{R}$ and $g := f - p$. Then $g(-h) = g(0) = g(h) = 0$ and
\[ \int_{-h}^{h} p(x)\,dx = \left\{ \frac{1}{2}\,[f(h) + f(-h)] - f(0) \right\} \cdot \frac{2h}{3} + f(0) \cdot 2h = \frac{1}{3}\,[f(-h) + 4f(0) + f(h)] \cdot h . \]
By partial integration, it follows that
\[ \int_{-h}^{0} (x+h)^3 (3x-h)\,g^{(iv)}(x)\,dx + \int_{0}^{h} (x-h)^3 (3x+h)\,g^{(iv)}(x)\,dx \]
\[ = \int_{0}^{h} (x-h)^3 (3x+h)\,[g^{(iv)}(x) + g^{(iv)}(-x)]\,dx = 72 \int_{-h}^{h} g(x)\,dx \]
and hence, since $g^{(iv)} = f^{(iv)}$ because $p$ is a polynomial of second order, that
\[ \left| \int_{-h}^{h} g(x)\,dx \right| \le \frac{K}{36} \int_{0}^{h} (h-x)^3 (3x+h)\,dx = \frac{K}{36} \left[ \frac{3}{5}\,(h-x)^5 - h\,(h-x)^4 \right]_0^h = \frac{K}{90}\,h^5 . \]

Fig. 76: Simpson’s approximation.

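Corollary 3.1.38. Let $I$ be some non-empty open interval of $\mathbb{R}$, $f : I \to \mathbb{R}$ be four times continuously differentiable and $a, b \in I$ be such that $a < b$. In particular, let $|f^{(iv)}(x)| \le K$ for all $x \in (a, b)$ and some $K \ge 0$. Finally, let $n \in \mathbb{N}^*$, $h := (b-a)/n$ and $a_i := a + i \cdot h$ for all $i \in \{0, \dots, n\}$. Then
\[ \left| \int_a^b f(x)\,dx - \frac{h}{6} \sum_{i=0}^{n-1} \left[ f(a_i) + 4 f\left(\frac{a_i + a_{i+1}}{2}\right) + f(a_{i+1}) \right] \right| \le \frac{K}{2880}\,\frac{(b-a)^5}{n^4} . \]
Note that
\[ \frac{h}{6} \sum_{i=0}^{n-1} \left[ f(a_i) + 4 f\left(\frac{a_i + a_{i+1}}{2}\right) + f(a_{i+1}) \right] = \frac{2}{3} \cdot h \sum_{i=0}^{n-1} f\left(\frac{a_i + a_{i+1}}{2}\right) + \frac{1}{3} \cdot h \sum_{i=0}^{n-1} \frac{f(a_i) + f(a_{i+1})}{2} \]
hence equals the sum of two-thirds of the corresponding sum for the midpoint rule and one-third of the corresponding sum for the trapezoid rule.

Proof. The corollary is a simple consequence of Theorem 3.1.37, applied to each subinterval $[a_i, a_{i+1}]$ after translation to the interval $[-h/2, h/2]$. Each subinterval contributes an error of at most $\frac{K}{90}\,(h/2)^5 = \frac{K}{2880}\,h^5$.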

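Example 3.1.39. As before with the midpoint and trapezoid rules, we use Simpson’s rule to approximate the value of
\[ \ln(2) = \int_1^2 \frac{dx}{x} . \]
Again, we use the partition
\[ (a_0, a_1, a_2, a_3, a_4) = (4/4, 5/4, 6/4, 7/4, 8/4) \]
leading to a division of $[1, 2]$ into four subintervals of length $h = 1/4$. Then
\[ \frac{h}{6} \sum_{i=0}^{3} \left[ f(a_i) + 4 f\left(\frac{a_i + a_{i+1}}{2}\right) + f(a_{i+1}) \right] = \frac{2}{3} \cdot \frac{4448}{6435} + \frac{1}{3} \cdot \frac{1171}{1680} = \frac{1498711}{2162160} \approx 0.693155 \]
where $f(x) := 1/x$ for all $x \in [1, 2]$ and the last approximation is to six decimal places. Also, the corresponding sums for the midpoint rule and the trapezoid rule have been used. To six decimal places, $\ln(2)$ is given by
\[ \ln(2) \approx 0.693147 . \]
The result of this application of Simpson’s rule gives $\ln(2)$ within an error of $8 \cdot 10^{-6}$. Since
\[ |f^{(iv)}(x)| = 24x^{-5} \le 24 \]
for all $x \in (1, 2)$, Corollary 3.1.38 leads to the error bound
\[ \left| \frac{1498711}{2162160} - \ln(2) \right| \le \frac{24}{2880 \cdot 256} = \frac{1}{30720} < 4 \cdot 10^{-5} . \]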
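The relation between the three rules noted after Corollary 3.1.38 can be checked numerically. The following Python sketch is an illustration under the assumption that each rule is implemented as the composite sum over $n$ subintervals of equal length; all function names are hypothetical.

```python
import math


def midpoint_rule(f, a, b, n):
    """Composite midpoint rule with n subintervals of equal length."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def trapezoid_rule(f, a, b, n):
    """Composite trapezoid rule with n subintervals of equal length."""
    h = (b - a) / n
    return h * sum((f(a + i * h) + f(a + (i + 1) * h)) / 2.0 for i in range(n))


def simpson_rule(f, a, b, n):
    """Composite Simpson rule using end points and midpoint of each subinterval."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        left, right = a + i * h, a + (i + 1) * h
        total += f(left) + 4.0 * f((left + right) / 2.0) + f(right)
    return h / 6.0 * total


def f(x):
    return 1.0 / x


s = simpson_rule(f, 1.0, 2.0, 4)
combo = 2.0 / 3.0 * midpoint_rule(f, 1.0, 2.0, 4) + 1.0 / 3.0 * trapezoid_rule(f, 1.0, 2.0, 4)
error = abs(s - math.log(2.0))
# Error bound of Corollary 3.1.38 with K = 24, b - a = 1 and n = 4.
bound = 24.0 / 2880.0 / 4 ** 4
print(s, combo, error, bound)
```

Both `s` and `combo` should agree up to rounding, and the error should stay below the bound $1/30720$.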


Problems

1) Calculate the integral. In addition, evaluate the integral approximately, using the midpoint rule, the trapezoid rule and Simpson’s rule. In this, subdivide the interval of integration into 4 intervals of equal length. Compare the approximation to the exact result.

a) $\displaystyle \int_0^1 \frac{du}{(1+u)^2}$ , b) $\displaystyle \int_0^1 \frac{2x}{(1+x^2)^2}\,dx$ , c) $\displaystyle \int_0^1 \frac{3u^2}{(1+u^3)^2}\,du$ .

2) By using Simpson’s rule, approximate the area in $\mathbb{R}^2$ that is enclosed by the Cartesian leaf
\[ C := \{ (x, y) \in \mathbb{R}^2 : 3\sqrt{2}\,(y^2 - x^2) + 2x\,(x^2 + 3y^2) = 0 \} . \]
In this, subdivide the interval of integration into 4 intervals of equal length. Compare the approximation to the exact result which is given by 1.5.

3) The time for one complete swing (‘period’) $T$ of a pendulum with length $L > 0$ is given by
\[ T = \frac{2\sqrt{L/g}}{\sqrt{1-k^2}} \left( \pi - k^2 I(k) \right) \]
where
\[ I(k) = \int_{-1}^{1} \frac{\sqrt{1-u^2}}{\sqrt{1-k^2u^2}\,\left( \sqrt{1-k^2} + \sqrt{1-k^2u^2} \right)}\,du , \]
$\theta_0 \in (-\pi/2, \pi/2)$ is the initial angle of elongation from the position of rest of the pendulum, $k := |\sin(\theta_0/2)|$, and where $g$ is the acceleration of the Earth’s gravitational field. By using Simpson’s rule, approximate $T$ for $\theta_0 = \pi/4$. In this, subdivide the interval of integration into 4 intervals of equal length.

3.2 Improper Integrals

A large number of integrals in applications are ‘improper’ in the sense that they are not Riemann integrals of functions over bounded closed intervals of $\mathbb{R}$. For instance in physics, integrals over unbounded sets occur naturally in the description of systems of infinite extension which are basic for physics. Another important source of improper integrals is the theory of special functions, where the majority of integral representations is in the form of improper Riemann integrals (or, alternatively, Lebesgue integrals). Special functions also have important applications. The majority appears as solutions of differential equations from applications, like Bessel functions, hypergeometric functions, confluent hypergeometric functions or elliptic functions. Others, like the gamma function or the beta function, appear naturally in the definitions of the former.

For this reason, in this section we also introduce basic special functions, the gamma function and the beta function, by help of such integral representations and derive their basic properties. In particular, Legendre’s duplication formula, Euler’s reflection formula and Gauss’ representation for the gamma function are proved in this section. In applications, these results are often needed also for complex arguments. As is known, these follow from those for real arguments by help of the principle of analytic continuation. In addition, elementary properties of Gaussian integrals are derived that are frequently used in quantum theory and in probability theory. Original proofs of some of these results used improper double integrals. In the meantime, more elementary proofs have been found that allow their derivation already at an early stage in a calculus course. In particular, we use results from [26] and [61].

For motivation, in the following we consider the problem of the calculation of the period of a simple pendulum in Earth’s gravitational field, which leads in a natural way to an improper Riemann integral. A simple pendulum is defined as a particle of mass $m > 0$ suspended from a point $O$ by a string of length $L > 0$ and of negligible mass. During the time of development of calculus in the 17th century, such motion was considered in 1673 by the inventor of the pendulum clock, Christian Huygens [56]. In the analysis below, we use Newton’s equation of motion, which was not known to Huygens at that time.

Fig. 77: A simple pendulum of mass $m$ and length $L$ suspended from $O$, with angle of elongation $\theta$. The dashed line marks the rest position. Compare text.

Newton’s equation of motion gives the following differential equation for the angle of elongation $\theta$ from the rest position of the pendulum as a function of time:
\[ \theta'' + \frac{g}{L} \sin\theta = 0 \tag{3.2.1} \]
where $g$ is the acceleration of Earth’s gravitational field. The general solution of this equation is not expressible in terms of elementary functions, but only in terms of special functions called ‘elliptic functions’. In the following, instead of finding the solutions of (3.2.1), we pursue the goal of finding the time $\tau$ for the pendulum to reach the angle $0$ after release from rest at initial time $0$, i.e., $\theta'(0) = 0$, and with initial elongation $\theta_0 \in (0, \pi/2)$. The time $\tau$ corresponds to one-fourth of the time necessary for completion of one complete swing, i.e., to one-fourth of the period of the pendulum. For this, we assume that there is a unique solution $\theta : \mathbb{R} \to \mathbb{R}$ of (3.2.1) such that $\theta(0) = \theta_0$, $\theta'(0) = 0$, $0 \in \mathrm{Ran}(\theta)$, and we define $\tau := \min \theta^{-1}(\{0\})$. In the following, we consider only this solution of (3.2.1), whose existence and uniqueness can be proved. Note that these assumptions imply that $\theta$ is twice differentiable and, as a particular consequence of (3.2.1), that $\theta''$ is continuous.

In a first step, we use the conserved energy for the solutions of (3.2.1), see Example 2.5.9, to derive a differential equation for $\theta$ that contains no derivatives of $\theta$ of higher than first order. Multiplication of (3.2.1) by $\theta'$ gives
\[ 0 = \theta' \theta'' + \frac{g}{L}\,\theta' \sin\theta = \left( \frac{1}{2}\,\theta'^{\,2} - \frac{g}{L} \cos\theta \right)' . \]
Hence it follows by Theorem 2.5.7 that the function inside the brackets is constant and therefore that
\[ \frac{1}{2}\,(\theta'(t))^2 - \frac{g}{L} \cos\theta(t) = \frac{1}{2}\,(\theta'(0))^2 - \frac{g}{L} \cos\theta(0) = -\frac{g}{L} \cos\theta_0 \]
which leads to
\[ (\theta'(t))^2 = \frac{2g}{L}\,[\cos\theta(t) - \cos\theta_0] \]
for every $t \in \mathbb{R}$. The solution of the last equation for $\theta'(t)$ for some $t \in \mathbb{R}$ requires the knowledge of the sign of $\theta'(t)$. By the fundamental theorem of calculus, it follows from (3.2.1) that
\[ \theta'(t) = \theta'(t) - \theta'(0) = \int_0^t \theta''(s)\,ds = -\frac{g}{L} \int_0^t \sin\theta(s)\,ds \le 0 \]
for all $t \in [0, \tau]$ where it has been used that $\theta(\tau) = 0$ and $\theta(t) \in [0, \theta_0] \subset [0, \pi/2)$. Both follow from the definition of $\tau$. Hence, we conclude that
\[ \theta'(t) = -\sqrt{\frac{2g}{L}}\,\sqrt{\cos\theta(t) - \cos\theta_0} \]
for all $t \in [0, \tau]$.

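Since $\theta'(t) < 0$ for all $t \in (0, \tau)$, it follows by Theorems 2.3.44, 2.5.10 and 2.5.18 that for the restriction of $\theta$ to the interval $[0, \tau]$ there is a strictly decreasing continuous inverse function $\theta^{-1} : [0, \theta_0] \to \mathbb{R}$ whose restriction to $(0, \theta_0)$ is differentiable such that
\[ (\theta^{-1})'(\varphi) = \frac{1}{\theta'(\theta^{-1}(\varphi))} = -\sqrt{\frac{L}{2g}}\,\frac{1}{\sqrt{\cos\theta(\theta^{-1}(\varphi)) - \cos\theta_0}} \]
\[ = -\sqrt{\frac{L}{2g}}\,\frac{1}{\sqrt{\cos\varphi - \cos\theta_0}} = -\frac{1}{2} \sqrt{\frac{L}{g}}\,\frac{1}{\sqrt{\sin^2\left(\frac{\theta_0}{2}\right) - \sin^2\left(\frac{\varphi}{2}\right)}} \]
for all $\varphi \in (0, \theta_0)$ where the addition theorem for the cosine has been used to conclude that
\[ \cos\alpha = \cos\left( 2 \cdot \frac{\alpha}{2} \right) = \cos^2\left(\frac{\alpha}{2}\right) - \sin^2\left(\frac{\alpha}{2}\right) = 1 - 2 \sin^2\left(\frac{\alpha}{2}\right) \]
for every $\alpha \in \mathbb{R}$. Hence it follows by the fundamental theorem of calculus that
\[ \tau = \theta^{-1}(0) = \theta^{-1}(\varphi) - [\theta^{-1}(\varphi) - \theta^{-1}(0)] = \theta^{-1}(\varphi) - \int_0^{\varphi} (\theta^{-1})'(\varphi)\,d\varphi \]
\[ = \theta^{-1}(\varphi) + \frac{1}{2} \sqrt{\frac{L}{g}} \int_0^{\varphi} \frac{d\varphi}{\sqrt{\sin^2\left(\frac{\theta_0}{2}\right) - \sin^2\left(\frac{\varphi}{2}\right)}} = \theta^{-1}(\varphi) + \frac{1}{2k} \sqrt{\frac{L}{g}} \int_0^{\varphi} \frac{d\varphi}{\sqrt{1 - \frac{1}{k^2} \sin^2\left(\frac{\varphi}{2}\right)}} \]
for every $\varphi \in [0, \theta_0)$ where $k \in (0, 1)$ is defined by
\[ k := \sin\left(\frac{\theta_0}{2}\right) . \]
By use of the substitution $g : [0, \sin(\varphi/2)/k] \to \mathbb{R}$ defined by
\[ g(u) := 2 \arcsin(ku) \]
for every $u \in [0, \sin(\varphi/2)/k]$, we arrive at
\[ \tau = \theta^{-1}(\varphi) + \sqrt{\frac{L}{g}} \int_0^{\frac{1}{k} \sin(\varphi/2)} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} . \]
Finally, since $\theta^{-1} : [0, \theta_0] \to \mathbb{R}$ is continuous and $\theta^{-1}(\theta_0) = 0$, we conclude that
\[ \tau = \lim_{\varphi \to \theta_0} \sqrt{\frac{L}{g}} \int_0^{\frac{1}{k} \sin(\varphi/2)} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} = \lim_{u \to 1} \sqrt{\frac{L}{g}} \int_0^{u} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} . \]
It would be natural to indicate the last by
\[ \tau = \sqrt{\frac{L}{g}} \int_0^{1} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} , \]
but the integrand of the last ‘integral’ is not defined at $u = 1$ and its restriction to the interval $[0, 1)$ is an unbounded function. Hence the last ‘integral’ is no Riemann integral. The definitions below turn it into an improper Riemann integral defined by
\[ \int_0^{1} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} := \lim_{u \to 1} \int_0^{u} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} . \]
As a side remark, we mention that
\[ \int_0^{u} \frac{du}{\sqrt{(1-u^2)(1-k^2u^2)}} \]
for $0 \le u \le 1$ is called an elliptic integral of the first kind (in Jacobian form) and is denoted by the symbol $F(u|k)$.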
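The formula for $\tau$ can be evaluated numerically. The following Python sketch is an illustration only. It assumes the substitution $u = \sin t$, which is not used in the text and which turns the improper integral into $\int_0^{\pi/2} dt / \sqrt{1 - k^2 \sin^2 t}$ with a bounded integrand, so that the midpoint rule applies. The function name `quarter_period` and the default value for $g$ are assumptions of the sketch.

```python
import math


def quarter_period(theta0, length, gravity=9.81, n=1000):
    """Approximate tau = sqrt(L/g) * integral_0^1 du / sqrt((1-u^2)(1-k^2 u^2)).

    After the (assumed) substitution u = sin(t) the integrand becomes
    1 / sqrt(1 - k^2 sin(t)^2) on [0, pi/2]; it is integrated by the midpoint rule.
    """
    k = math.sin(theta0 / 2.0)
    h = (math.pi / 2.0) / n
    integral = h * sum(
        1.0 / math.sqrt(1.0 - (k * math.sin((i + 0.5) * h)) ** 2) for i in range(n)
    )
    return math.sqrt(length / gravity) * integral


small_angle_period = 2.0 * math.pi * math.sqrt(1.0 / 9.81)
period_45_degrees = 4.0 * quarter_period(math.pi / 4.0, 1.0)
# Ratio of the period for theta0 = pi/4 to the small-angle period.
print(period_45_degrees / small_angle_period)
```

For very small $\theta_0$ the computed period approaches the familiar small-angle value $2\pi\sqrt{L/g}$; for $\theta_0 = \pi/4$ it is about four percent larger.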

Definition 3.2.1. (Improper Riemann integrals)

(i) Let $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $b \neq \infty$ and $f : [a, b) \to \mathbb{R}$ be almost everywhere continuous. Then $F : [a, b) \to \mathbb{R}$, defined by
\[ F(x) := \int_a^x f(y)\,dy \]
for every $x \in [a, b)$, is a continuous function according to Theorem 2.6.19. We say that $f$ is improper Riemann-integrable if there is $L \in \mathbb{R}$ such that
\[ \lim_{x \to b} F(x) = \lim_{x \to b} \int_a^x f(y)\,dy = L . \]
In this case, we define the improper Riemann integral of $f$ by
\[ \int_a^b f(y)\,dy = \lim_{x \to b} \int_a^x f(y)\,dy . \]

(ii) Let $a \in \mathbb{R} \cup \{-\infty\}$, $b \in \mathbb{R}$ be such that $a < b$ if $a \neq -\infty$ and $f : (a, b] \to \mathbb{R}$ be almost everywhere continuous. Then $F : (a, b] \to \mathbb{R}$, defined by
\[ F(x) := \int_x^b f(y)\,dy \]
for every $x \in (a, b]$, is a continuous function according to Theorem 2.6.19. We say that $f$ is improper Riemann-integrable if there is some $L \in \mathbb{R}$ such that
\[ \lim_{x \to a} F(x) = \lim_{x \to a} \int_x^b f(y)\,dy = L . \]
In this case, we define the improper Riemann integral of $f$ by
\[ \int_a^b f(y)\,dy = \lim_{x \to a} \int_x^b f(y)\,dy . \]

(iii) Let $a \in \mathbb{R} \cup \{-\infty\}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $a \neq -\infty$ and $b \neq \infty$. Further, let $f : (a, b) \to \mathbb{R}$ be almost everywhere continuous. We say that $f$ is improper Riemann-integrable if both $f|_{(a,c]}$ and $f|_{[c,b)}$ are improper Riemann-integrable for some $c \in (a, b)$. In this case, we define
\[ \int_a^b f(x)\,dx := \int_a^c f(x)\,dx + \int_c^b f(x)\,dx . \]
That this definition is indeed independent of $c$ is a consequence of the additivity of the Riemann integral, Theorem 2.6.18. The proof of this will be given in the second remark below.

Remark 3.2.2. Note that according to the previous definition, the restrictions to $(a, b]$, $[a, b)$ or $(a, b)$ of a continuous function defined on a bounded closed interval $[a, b]$, where $a, b \in \mathbb{R}$ are such that $a < b$, are improper Riemann-integrable, and that the values of the associated improper integrals all coincide with the Riemann integral of that function.

Remark 3.2.3. In the following, we use the notation from Definition 3.2.1. That Definition 3.2.1 (iii) is independent of $c \in (a, b)$ can be seen as follows. For this, let $d \in (c, b)$ and let $a_1, a_2, \dots$ in $(a, d]$, $b_1, b_2, \dots$ in $[d, b)$ be sequences that are convergent to $a$ and $b$, respectively. Then it follows by the additivity of the Riemann integral, Theorem 2.6.18, for sufficiently large $k \in \mathbb{N}^*$ that
\[ \int_{a_k}^{c} f(x)\,dx + \int_{c}^{d} f(x)\,dx = \int_{a_k}^{d} f(x)\,dx , \qquad \int_{c}^{d} f(x)\,dx + \int_{d}^{b_k} f(x)\,dx = \int_{c}^{b_k} f(x)\,dx . \]
Hence it follows by the limit laws that
\[ \int_{a}^{c} f(x)\,dx = \lim_{k \to \infty} \int_{a_k}^{c} f(x)\,dx = -\int_{c}^{d} f(x)\,dx + \lim_{k \to \infty} \int_{a_k}^{d} f(x)\,dx , \]
\[ \int_{c}^{b} f(x)\,dx = \lim_{k \to \infty} \int_{c}^{b_k} f(x)\,dx = \int_{c}^{d} f(x)\,dx + \lim_{k \to \infty} \int_{d}^{b_k} f(x)\,dx . \]
Therefore, $f|_{(a,d]}$ and $f|_{[d,b)}$ are improper Riemann-integrable and satisfy
\[ \int_{a}^{c} f(x)\,dx = -\int_{c}^{d} f(x)\,dx + \int_{a}^{d} f(x)\,dx , \qquad \int_{c}^{b} f(x)\,dx = \int_{c}^{d} f(x)\,dx + \int_{d}^{b} f(x)\,dx . \]
The last implies that
\[ \int_{a}^{c} f(x)\,dx + \int_{c}^{b} f(x)\,dx = \int_{a}^{d} f(x)\,dx + \int_{d}^{b} f(x)\,dx . \]
The case that $d \in (a, c)$ is analogous. Let $a_1, a_2, \dots$ in $(a, d]$, $b_1, b_2, \dots$ in $[d, b)$ be sequences that are convergent to $a$ and $b$, respectively. Then it follows by the additivity of the Riemann integral, Theorem 2.6.18, for sufficiently large $k \in \mathbb{N}^*$ that
\[ \int_{a_k}^{d} f(x)\,dx + \int_{d}^{c} f(x)\,dx = \int_{a_k}^{c} f(x)\,dx , \qquad \int_{d}^{c} f(x)\,dx + \int_{c}^{b_k} f(x)\,dx = \int_{d}^{b_k} f(x)\,dx . \]
Hence it follows by the limit laws that
\[ \int_{a}^{c} f(x)\,dx = \lim_{k \to \infty} \int_{a_k}^{c} f(x)\,dx = \int_{d}^{c} f(x)\,dx + \lim_{k \to \infty} \int_{a_k}^{d} f(x)\,dx , \]
\[ \int_{c}^{b} f(x)\,dx = \lim_{k \to \infty} \int_{c}^{b_k} f(x)\,dx = -\int_{d}^{c} f(x)\,dx + \lim_{k \to \infty} \int_{d}^{b_k} f(x)\,dx . \]
Therefore, $f|_{(a,d]}$ and $f|_{[d,b)}$ are improper Riemann-integrable and satisfy
\[ \int_{a}^{c} f(x)\,dx = \int_{d}^{c} f(x)\,dx + \int_{a}^{d} f(x)\,dx , \qquad \int_{c}^{b} f(x)\,dx = -\int_{d}^{c} f(x)\,dx + \int_{d}^{b} f(x)\,dx . \]
The last implies that
\[ \int_{a}^{c} f(x)\,dx + \int_{c}^{b} f(x)\,dx = \int_{a}^{d} f(x)\,dx + \int_{d}^{b} f(x)\,dx . \]

In the following, we give two prime examples of improper integrals whose integrands are restrictions of powers of the identity function on $(0, \infty)$. The first example shows that such powers are improper Riemann-integrable over an interval $(0, a]$, where $a > 0$, if and only if the power is greater than $-1$. The second example shows that such powers are improper Riemann-integrable over an interval $[a, \infty)$, where $a > 0$, if and only if the power is smaller than $-1$.

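Example 3.2.4. Define $f_\alpha := ((0, a] \to \mathbb{R}, x \mapsto 1/x^\alpha)$ for every real $\alpha$ where $a > 0$. Show that

(i) $f_\alpha$ is improper Riemann-integrable for every $\alpha < 1$ and
\[ \int_0^a \frac{dx}{x^\alpha} = \frac{a^{1-\alpha}}{1-\alpha} . \]

(ii) $f_\alpha$ is not improper Riemann-integrable for every $\alpha \ge 1$.

Solution: For $\alpha \in \mathbb{R} \setminus \{1\}$ and $\varepsilon \in (0, a)$, it follows that
\[ \int_\varepsilon^a \frac{dx}{x^\alpha} = \left[ \frac{x^{1-\alpha}}{1-\alpha} \right]_\varepsilon^a = \frac{a^{1-\alpha} - \varepsilon^{1-\alpha}}{1-\alpha} \]
and for $\alpha = 1$ that
\[ \int_\varepsilon^a \frac{dx}{x} = [\,\ln(x)\,]_\varepsilon^a = \ln(a) - \ln(\varepsilon) \]
and hence the statements.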

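Example 3.2.5. Define $f_\alpha := ([a, \infty) \to \mathbb{R}, x \mapsto 1/x^\alpha)$ for every real $\alpha$ where $a > 0$. Show that

(i) $f_\alpha$ is improper Riemann-integrable for every $\alpha > 1$ and
\[ \int_a^\infty \frac{dx}{x^\alpha} = \frac{1}{\alpha - 1} \cdot \frac{1}{a^{\alpha-1}} . \]

(ii) $f_\alpha$ is not improper Riemann-integrable for every $\alpha \le 1$.

Solution: For $\alpha \in \mathbb{R} \setminus \{1\}$ and $x > a$, it follows that
\[ \int_a^x \frac{dy}{y^\alpha} = \left[ \frac{y^{1-\alpha}}{1-\alpha} \right]_a^x = \frac{x^{1-\alpha} - a^{1-\alpha}}{1-\alpha} \]
and for $\alpha = 1$
\[ \int_a^x \frac{dy}{y} = [\,\ln(y)\,]_a^x = \ln(x) - \ln(a) \]
and hence the statements.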
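The behavior described in Examples 3.2.4 and 3.2.5 can be observed directly from the closed-form values of the Riemann integrals over $[\varepsilon, a]$ and $[a, x]$. The following Python sketch is an illustration only; the function name `integral_power` is hypothetical.

```python
import math


def integral_power(alpha, lower, upper):
    """Exact Riemann integral of x**(-alpha) over [lower, upper], 0 < lower < upper."""
    if alpha == 1.0:
        return math.log(upper) - math.log(lower)
    return (upper ** (1.0 - alpha) - lower ** (1.0 - alpha)) / (1.0 - alpha)


# Lower limit tending to 0 on (0, 1]: convergence for alpha < 1, divergence for alpha = 1.
for eps in (1e-2, 1e-4, 1e-8):
    print(eps, integral_power(0.5, eps, 1.0), integral_power(1.0, eps, 1.0))

# Upper limit tending to infinity on [1, infinity): convergence for alpha > 1.
for x in (1e2, 1e4, 1e8):
    print(x, integral_power(2.0, 1.0, x), integral_power(1.0, 1.0, x))
```

For $\alpha = 1/2$ and $a = 1$ the values approach $a^{1-\alpha}/(1-\alpha) = 2$, for $\alpha = 2$ and $a = 1$ they approach $1$, while for $\alpha = 1$ they grow without bound in both situations.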

The following gives an important criterion for improper Riemann integrability. It is based on the fact that for every bounded continuous and increasing function $F : [a, b) \to \mathbb{R}$, where $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ are such that $a < b$ if $b \neq \infty$, the limit
\[ \lim_{x \to b} F(x) \]
exists. Within the following theorem, $F$ is an antiderivative of the absolute value of an almost everywhere continuous integrand. In this connection, the theorem is applied by showing that that absolute value has an improper Riemann-integrable majorant.

Theorem 3.2.6. Let $a \in \mathbb{R}$, $b \in \mathbb{R} \cup \{\infty\}$ be such that $a < b$ if $b \neq \infty$ and $f : [a, b) \to \mathbb{R}$ be almost everywhere continuous. Finally, let $G : [a, b) \to \mathbb{R}$ be defined by
\[ G(x) := \int_a^x |f(y)|\,dy \]
for all $x \in [a, b)$ and be bounded. Then, $f$ and $|f|$ are improper Riemann-integrable.

Proof. Let $(b_n)_{n \in \mathbb{N}}$ be a sequence in $[a, b)$ which is convergent to $b$. Since $G$ is bounded and increasing, $\sup \mathrm{Ran}\,G$ exists. As a consequence, for given $\varepsilon > 0$, there is some $c \in [a, b)$ such that $\sup \mathrm{Ran}\,G - G(c) < \varepsilon$. Hence it also follows that $\sup \mathrm{Ran}\,G - G(x) < \varepsilon$ for all $x \in [c, b)$. Then there is $n_0 \in \mathbb{N}$ such that $b_n \ge c$ for all $n \in \mathbb{N}$ satisfying $n \ge n_0$. Therefore it also follows that $|\sup \mathrm{Ran}\,G - G(b_n)| < \varepsilon$ for $n \in \mathbb{N}$ satisfying $n \ge n_0$ and hence finally
\[ \lim_{n \to \infty} G(b_n) = \sup \mathrm{Ran}\,G . \]
Hence $|f|$ is improper Riemann-integrable and
\[ \int_a^b |f(x)|\,dx = \sup \mathrm{Ran}\,G . \]
Further, for every $x \in [a, b)$, it follows that
\[ \int_a^x (|f(y)| - f(y))\,dy \le 2 \int_a^x |f(y)|\,dy \le 2 \sup \mathrm{Ran}\,G . \]
Since $|f| - f \ge 0$, the first part of the proof applies to $|f| - f$. Hence $|f| - f$ and therefore also $f$ is improper Riemann-integrable.

As an application of the previous theorem, the next example defines the gamma function. After a shift of the argument by one, it extends the factorial function $(\mathbb{N} \to \mathbb{N}, n \mapsto n!)$ to a function whose domain consists of all real numbers that are not negative integers, such that the functional relation $(n+1)! = (n+1)\,n!$ for all $n \in \mathbb{N}$ is preserved. The proof of the last is given in the example after the next one. A main reason for the importance of the gamma function for applications is the fact that it appears naturally in the definition of many special functions that are solutions of differential equations from applications.

Example 3.2.7. Show that $f_y := ((0, \infty) \to \mathbb{R}, x \mapsto e^{-x} x^{y-1})$ is improper Riemann-integrable for every $y > 0$. Hence we can define the gamma function $\Gamma : (0, \infty) \to \mathbb{R}$ by
\[ \Gamma(y) := \int_0^\infty e^{-x} x^{y-1}\,dx \]
for all $y \in (0, \infty)$. Solution: Let $y > 0$. For every $\varepsilon \in (0, 1)$, it follows that
\[ \int_\varepsilon^1 e^{-x} x^{y-1}\,dx \le \int_\varepsilon^1 x^{y-1}\,dx \le \frac{1}{y} \]
and hence by Theorem 3.2.6 that $f_y|_{(0,1]}$ is improper Riemann-integrable and that
\[ \int_0^1 e^{-x} x^{y-1}\,dx \le \frac{1}{y} . \]
Further, $h_y := ([1, \infty) \to \mathbb{R}, x \mapsto e^{-x/2} x^{y-1})$ has a maximum at $x_0 := \max\{1, 2(y-1)\}$. Hence it follows for every $R \ge 1$ that
\[ \int_1^R e^{-x} x^{y-1}\,dx \le h_y(x_0) \int_1^R e^{-x/2}\,dx \le 2\,h_y(x_0)\,e^{-1/2} \]
and by Theorem 3.2.6 that $f_y|_{[1,\infty)}$ is improper Riemann-integrable. Note for later use that
\[ \Gamma(1) = \int_0^1 e^{-x}\,dx + \lim_{R \to \infty} \int_1^R e^{-x}\,dx = \lim_{R \to \infty} \int_0^R e^{-x}\,dx = \lim_{R \to \infty} (1 - e^{-R}) = 1 . \]

Fig. 78: Graphs of the gamma function $\Gamma$ (left) and $1/\Gamma$ (right).



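Example 3.2.8. Show that
\[ \Gamma(y + 1) = y \cdot \Gamma(y) \tag{3.2.2} \]
for all $y > 0$ and hence that
\[ \Gamma(n + 1) = n! \tag{3.2.3} \]
for all $n \in \mathbb{N}$. Solution: By partial integration it follows for every $y > 0$, $\varepsilon \in (0, 1)$ and $R \in (1, \infty)$ that
\[ \int_\varepsilon^R e^{-x} x^{y}\,dx = e^{-\varepsilon} \varepsilon^{y} - e^{-R} R^{y} + y \int_\varepsilon^R e^{-x} x^{y-1}\,dx \]
and hence (3.2.2) by taking the limits $\varepsilon \to 0$ and $R \to \infty$. Since $\Gamma(1) = 1 = 0!$, from this follows (3.2.3) by induction.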
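The relations (3.2.2) and (3.2.3) can be checked numerically. The following Python sketch approximates the defining integral of the gamma function by the midpoint rule on a truncated interval. The truncation point `upper`, the number of subintervals and the function name `gamma_numeric` are assumptions of the sketch; it is only meant for arguments $y \ge 1$, where the integrand is bounded.

```python
import math


def gamma_numeric(y, upper=60.0, n=100_000):
    """Midpoint-rule approximation of the integral of exp(-x) * x**(y-1) over (0, upper]."""
    h = upper / n
    return h * sum(
        math.exp(-x) * x ** (y - 1.0) for x in ((i + 0.5) * h for i in range(n))
    )


g3 = gamma_numeric(3.0)
g4 = gamma_numeric(4.0)
# Functional equation (3.2.2) for y = 3 and the factorial values (3.2.3).
print(g3, g4, 3.0 * g3, math.factorial(2), math.factorial(3))
```

The values should be close to $\Gamma(3) = 2! = 2$ and $\Gamma(4) = 3 \cdot \Gamma(3) = 3! = 6$.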

As another example of an application of Theorem 3.2.6, the next example defines Gaussian integrals. Such integrals appear in quantum theory in the study of the quantization of the harmonic oscillator, which is of fundamental importance for physics. In addition, they appear naturally in the study of the normal distribution in probability theory. The last distribution is frequently used for the description of error progression due to random errors occurring in measurements of physical quantities.

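Example 3.2.9. (Gaussian integrals, I) Show that $f_{m,n} := ([0, \infty) \to \mathbb{R}, x \mapsto x^m e^{-nx^2/2})$ is improper Riemann-integrable for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$. In particular, show that $I : \mathbb{N} \times \mathbb{N}^* \to \mathbb{R}$ defined by
\[ I(m, n) := \int_0^\infty x^m e^{-nx^2/2}\,dx \]
for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ satisfies
\[ I(m + 2, n) = \frac{m+1}{n}\,I(m, n) \tag{3.2.4} \]
for all $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ and, in particular,
\[ I(2k + 1, n) = \frac{2^k k!}{n^{k+1}} , \qquad I(2(k + 1), n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}} \cdot \frac{1}{\sqrt{n}}\,I(0, 1) \tag{3.2.5} \]
for all $k \in \mathbb{N}$. Solution: First, it follows for $n \in \mathbb{N}^*$ and $x \ge 1$ that
\[ \int_0^x y\,e^{-ny^2/2}\,dy = -\frac{1}{n} \int_0^x (-ny)\,e^{-ny^2/2}\,dy = -\frac{1}{n} \left[ e^{-ny^2/2} \right]_0^x = \frac{1}{n} \left( 1 - e^{-nx^2/2} \right) , \]
\[ \int_0^x e^{-ny^2/2}\,dy = \int_0^1 e^{-ny^2/2}\,dy + \int_1^x e^{-ny^2/2}\,dy \le \int_0^1 e^{-ny^2/2}\,dy + \int_0^x y\,e^{-ny^2/2}\,dy \]
and hence by Theorem 3.2.6 that $f_{0,n}$ and $f_{1,n}$ are improper Riemann-integrable as well as that
\[ I(1, n) = \int_0^\infty y\,e^{-ny^2/2}\,dy = \frac{1}{n} . \tag{3.2.6} \]
Further, according to Example 2.5.12, $e^x \ge 1$ and $e^x \ge x$ for all $x \ge 0$. Hence it follows that
\[ e^x \ge e^x - 1 = \int_0^x e^y\,dy \ge \int_0^x y\,dy = \frac{x^2}{2} \]
for all $x \ge 0$ and in this way inductively that
\[ e^x \ge \frac{x^m}{m!} \]
for all $x \ge 1$ and $m \in \mathbb{N}$. In addition, it follows for $m \in \mathbb{N}$, $n \in \mathbb{N}^*$ and $x \ge 0$ that
\[ \int_0^x y^{m+2} e^{-ny^2/2}\,dy = -\frac{1}{n} \int_0^x y^{m+1} (-ny)\,e^{-ny^2/2}\,dy = \left[ -\frac{1}{n}\,y^{m+1} e^{-ny^2/2} \right]_0^x + \frac{1}{n} \int_0^x (m+1)\,y^m e^{-ny^2/2}\,dy \]
\[ = -\frac{1}{n}\,x^{m+1} e^{-nx^2/2} + \frac{m+1}{n} \int_0^x y^m e^{-ny^2/2}\,dy . \tag{3.2.7} \]
Since, for $x \ge 1$,
\[ \left| -\frac{1}{n}\,x^{m+1} e^{-nx^2/2} \right| \le \frac{(m+1)!}{n}\,e^{x - nx^2/2} , \]
we notice that
\[ \lim_{x \to \infty} \left( -\frac{1}{n}\,x^{m+1} e^{-nx^2/2} \right) = 0 . \]
Hence the improper Riemann-integrability of $f_{m,n}$ for all $m \in \mathbb{N}$ and $n \in \mathbb{N}^*$ follows inductively from (3.2.7), as well as the validity of (3.2.4) for all $m \in \mathbb{N}$ and $n \in \mathbb{N}^*$. Further, it follows from (3.2.6) and (3.2.7) by induction that
\[ I(2k + 1, n) = \frac{2^k k!}{n^{k+1}} , \qquad I(2(k + 1), n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}}\,I(0, n) \]
for all $k \in \mathbb{N}$ and, finally, as a consequence of
\[ \int_0^x e^{-ny^2/2}\,dy = \frac{1}{\sqrt{n}} \int_0^{\sqrt{n}\,x} e^{-u^2/2}\,du \]
that
\[ I(2(k + 1), n) = \frac{1 \cdot 3 \cdots (2k+1)}{n^{k+1}} \cdot \frac{1}{\sqrt{n}}\,I(0, 1) . \]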

Equation (3.2.5) reduces the calculation of the Gaussian integrals $I(m, 1)$ for even $m \in \mathbb{N}$ to the calculation of
\[ \int_0^\infty e^{-x^2/2}\,dx . \]
The determination of the last is the object of the following example. As an application of the result, the value of $\Gamma(1/2)$ is calculated in the subsequent example.

Example 3.2.10. (Gaussian integrals, II) Together with Wallis’ product representation of $\pi$ from Theorem 3.1.19, the application of the results of Example 3.2.9 allows the calculation of $I(0, 1)$ as follows. Employing the notation of Example 3.2.9, in a first step, we conclude for $m \in \mathbb{N}$, $n \in \mathbb{N}^*$, $t \in \mathbb{R}$ and $x > 0$ that
\[ 0 < \int_0^x y^m (y - t)^2 e^{-ny^2/2}\,dy = \int_0^x y^{m+2} e^{-ny^2/2}\,dy - 2t \int_0^x y^{m+1} e^{-ny^2/2}\,dy + t^2 \int_0^x y^{m} e^{-ny^2/2}\,dy \]
and hence that
\[ \left( \int_0^x y^{m} e^{-ny^2/2}\,dy \right) \left( \int_0^x y^{m+2} e^{-ny^2/2}\,dy \right) - \left( \int_0^x y^{m+1} e^{-ny^2/2}\,dy \right)^2 > 0 \]
for all $x > 0$ and, finally,
\[ I(m + 1, n) < \sqrt{I(m, n)\,I(m + 2, n)} . \]
In particular, since according to (3.2.4)
\[ I(n + 1, n) = I(n - 1, n) , \qquad I(n + 2, n) = \frac{n+1}{n}\,I(n, n) , \]
we conclude that
\[ I(n + 1, n) < \sqrt{I(n, n)\,I(n + 2, n)} = \sqrt{\frac{n}{n+1}}\;I(n + 2, n) < I(n + 2, n) , \]
\[ I(n, n) < \sqrt{I(n - 1, n)\,I(n + 1, n)} = I(n + 1, n) \]
and hence that
\[ I(n, n) < I(n + 1, n) < I(n + 2, n) . \]
In particular, the case $n = 2k + 1$ where $k \in \mathbb{N}$, leads to
\[ \frac{2^k k!}{(2k+1)^{k+1}} = I(2k + 1, 2k + 1) < I(2k + 2, 2k + 1) = \frac{1 \cdot 3 \cdots (2k+1)}{(2k+1)^{k+1}} \cdot \frac{1}{\sqrt{2k+1}}\,I(0, 1) < I(2k + 3, 2k + 1) = \frac{2^{k+1} (k+1)!}{(2k+1)^{k+2}} \]
and hence to
\[ \sqrt{\frac{2k+1}{4(k+1)}} \cdot 2\sqrt{k+1} \cdot \frac{2 \cdot 4 \cdots 2k}{3 \cdots (2k+1)} < I(0, 1) < \sqrt{\frac{2k+1}{4(k+2)}} \cdot \frac{2k+3}{2k+1} \cdot 2\sqrt{k+2} \cdot \frac{2 \cdot 4 \cdots 2(k+1)}{3 \cdots (2k+3)} . \]
Finally, taking the limit $k \to \infty$ in the last expression and applying Wallis’ product representation of $\pi$ (3.2.5) leads to
\[ I(0, 1) = \int_0^\infty e^{-y^2/2}\,dy = \sqrt{\frac{\pi}{2}} . \tag{3.2.8} \]
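The results (3.2.4), (3.2.6) and (3.2.8) can be compared with a direct numerical evaluation of the Gaussian integrals. The following Python sketch is an illustration only. It truncates the interval of integration at a finite point, where the integrand is already negligible for the small values of $m$ and $n$ used here, and applies the midpoint rule; the function name `gaussian_moment` and its default parameters are assumptions of the sketch.

```python
import math


def gaussian_moment(m, n, upper=12.0, steps=60_000):
    """Midpoint-rule approximation of I(m, n), truncated to the interval [0, upper]."""
    h = upper / steps
    return h * sum(
        x ** m * math.exp(-n * x * x / 2.0)
        for x in ((i + 0.5) * h for i in range(steps))
    )


i01 = gaussian_moment(0, 1)
i12 = gaussian_moment(1, 2)
i32 = gaussian_moment(3, 2)
# (3.2.8): I(0, 1) = sqrt(pi / 2); (3.2.6): I(1, 2) = 1/2; (3.2.4): I(3, 2) = (2/2) I(1, 2).
print(i01, math.sqrt(math.pi / 2.0), i12, i32)
```

The recursion (3.2.4) with $m = 1$ and $n = 2$ predicts $I(3, 2) = I(1, 2) = 1/2$, which the numerical values should reproduce to several decimal places.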


Example 3.2.11. Show that
\[ \Gamma(1/2) = \sqrt{\pi} . \]
Solution: For this, let $\varepsilon, R > 0$. By change of variables, it follows that
\[ \int_\varepsilon^R e^{-y^2/2}\,dy = \frac{1}{\sqrt{2}} \int_{\varepsilon^2/2}^{R^2/2} x^{-1/2} e^{-x}\,dx \]
and hence by taking the limits that
\[ \sqrt{\frac{\pi}{2}} = \frac{1}{\sqrt{2}}\,\Gamma(1/2) . \]

Fig. 79: Graph of the Beta function.

As another example of an application of Theorem 3.2.6, the next example defines Euler’s beta function.

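Example 3.2.12. (Beta function, I) Show that $f_{x,y} := ((0, 1) \to \mathbb{R}, t \mapsto t^{x-1} (1-t)^{y-1})$ is improper Riemann-integrable for all $x > 0$, $y > 0$. Hence we can define the Beta function $B : (0, \infty)^2 \to \mathbb{R}$ by
\[ B(x, y) := \int_0^1 t^{x-1} (1-t)^{y-1}\,dt \]
for all $x > 0$, $y > 0$. Solution: For this, let $x > 0$, $y > 0$. In addition, let $\varepsilon, \delta \in (0, 1/2)$. Then
\[ \int_\varepsilon^{1/2} t^{x-1} (1-t)^{y-1}\,dt \le 2 \int_\varepsilon^{1/2} t^{x-1}\,dt = \left[ \frac{2t^x}{x} \right]_\varepsilon^{1/2} = \frac{2}{x} \left[ \left( \frac{1}{2} \right)^x - \varepsilon^x \right] \le \frac{1}{x} \left( \frac{1}{2} \right)^{x-1} , \]
\[ \int_{1/2}^{1-\delta} t^{x-1} (1-t)^{y-1}\,dt \le 2 \int_{1/2}^{1-\delta} (1-t)^{y-1}\,dt = \left[ -\frac{2(1-t)^y}{y} \right]_{1/2}^{1-\delta} = \frac{2}{y} \left[ \left( \frac{1}{2} \right)^y - \delta^y \right] \le \frac{1}{y} \left( \frac{1}{2} \right)^{y-1} . \]
Hence it follows by Theorem 3.2.6 that $f_{x,y}|_{(0,1/2]}$ and $f_{x,y}|_{[1/2,1)}$ are improper Riemann-integrable and that
\[ \int_0^{1/2} t^{x-1} (1-t)^{y-1}\,dt \le \frac{1}{x} \left( \frac{1}{2} \right)^{x-1} , \qquad \int_{1/2}^{1} t^{x-1} (1-t)^{y-1}\,dt \le \frac{1}{y} \left( \frac{1}{2} \right)^{y-1} . \]
As a consequence, $f_{x,y}$ is improper Riemann-integrable and satisfies
\[ \int_0^1 t^{x-1} (1-t)^{y-1}\,dt \le \frac{1}{x} \left( \frac{1}{2} \right)^{x-1} + \frac{1}{y} \left( \frac{1}{2} \right)^{y-1} . \]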

The next example represents the gamma function essentially as a limit of the beta function.

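Example 3.2.13. (Beta function, II) Show that
\[ \lim_{y \to \infty} y^x B(x, y) = \Gamma(x) \tag{3.2.9} \]
for all $x > 0$. Solution: For this, let $x > 0$, $y > 2$. In addition, let $\varepsilon, \delta \in (0, 1/2)$. Then
\[ \int_\varepsilon^{1-\delta} t^{x-1} (1-t)^{y-1}\,dt = \frac{1}{y-1} \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} \left( \frac{s}{y-1} \right)^{x-1} \left( 1 - \frac{s}{y-1} \right)^{y-1} ds \]
\[ = \frac{1}{(y-1)^x} \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} \left( 1 - \frac{s}{y-1} \right)^{y-1} ds . \]
Further,
\[ \left| \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} \left( 1 - \frac{s}{y-1} \right)^{y-1} ds - \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s}\,ds \right| \]
\[ \le \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s} \left| 1 - \left( 1 - \frac{s}{y-1} \right)^{y-1} e^{s} \right| ds . \]
We consider the auxiliary function $h : [0, \infty) \to \mathbb{R}$ defined by
\[ h(s) := 1 - \left( 1 - \frac{s}{y-1} \right)^{y-1} e^{s} \]
for all $s \in [0, \infty)$. Then $h(0) = 0$, $h$ is continuous and differentiable on $(0, \infty)$ with derivative
\[ h'(s) = \frac{s}{y-1} \left( 1 - \frac{s}{y-1} \right)^{y-2} e^{s} > 0 \]
for $0 < s < y - 1$. Hence it follows for $s \in [0, y - 1]$ that
\[ |h(s)| = h(s) = \int_0^s \frac{u}{y-1} \left( 1 - \frac{u}{y-1} \right)^{y-2} e^{u}\,du = \int_0^s \frac{u}{y-1} \exp\left[ u + (y-2) \ln\left( 1 - \frac{u}{y-1} \right) \right] du \]
\[ \le \int_0^s \frac{u}{y-1} \exp\left[ u - \frac{(y-2)\,u}{y-1} \right] du = \int_0^s \frac{u}{y-1} \exp\left( \frac{u}{y-1} \right) du \le \frac{e\,s^2}{2(y-1)} \]
where the case $a = 1$ of (2.5.12) has been used. Hence it follows further that
\[ \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s} \left| 1 - \left( 1 - \frac{s}{y-1} \right)^{y-1} e^{s} \right| ds \le \int_{(y-1)\varepsilon}^{(y-1)(1-\delta)} s^{x-1} e^{-s}\,\frac{e\,s^2}{2(y-1)}\,ds \le \frac{e}{2(y-1)}\,\Gamma(x + 2) . \]
From the previous, we conclude that
\[ |(y-1)^x B(x, y) - \Gamma(x)| \le \frac{e}{2(y-1)}\,\Gamma(x + 2) \]
and hence that
\[ \lim_{y \to \infty} y^x B(x, y) = \Gamma(x) . \]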
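The limit (3.2.9) can be observed numerically. The following Python sketch approximates $B(x, y)$ by the midpoint rule and compares $y^x B(x, y)$ with the value of the gamma function from the Python standard library. The function name `beta_numeric` is an assumption of the sketch, and it is only meant for $x \ge 1$, $y \ge 1$, where the integrand is bounded.

```python
import math


def beta_numeric(x, y, n=100_000):
    """Midpoint-rule approximation of B(x, y) for x >= 1, y >= 1."""
    h = 1.0 / n
    return h * sum(
        t ** (x - 1.0) * (1.0 - t) ** (y - 1.0)
        for t in ((i + 0.5) * h for i in range(n))
    )


x = 2.0
deviations = {}
for y in (10.0, 100.0, 1000.0):
    value = y ** x * beta_numeric(x, y)
    deviations[y] = abs(value - math.gamma(x))
    print(y, value, math.gamma(x))
```

For $x = 2$ the exact value is $y^2 B(2, y) = y/(y+1)$, so the deviation from $\Gamma(2) = 1$ decays like $1/y$, in line with the estimate derived in the example.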

The following example expresses the beta function in terms of the gamma function.

Example 3.2.14. ( Beta function, III ) Show that

Bpx, yq “ΓpxqΓpyq

Γpx` yq(3.2.10)

for all x, y ą 0. Solution: For this, let x, y ą 0. In the first step, we showby use of partial integration that

Bpx, y ` 1q “y

xBpx` 1, yq . (3.2.11)

For this, let ε, δ P p0, 12q. Thenż 1´δ

ε

tx´1p1´ tqy dt “

„

1

xtxp1´ tqy

1´δ

ε

`y

x

ż 1´δ

ε

txp1´ tqy´1 dt

“1

xrp1´ δqxδy ´ εxp1´ εqys `

y

x

ż 1´δ

ε

txp1´ tqy´1 dt

which implies (3.2.11). Further, it follows thatż 1´δ

ε

tx´1p1´ tqy dt`

ż 1´δ

ε

txp1´ tqy´1 dt

“

ż 1´δ

ε

p1´ t` tq tx´1p1´ tqy´1 dt “

ż 1´δ

ε

tx´1p1´ tqy´1 dt

and hence that

Bpx, y ` 1q `Bpx` 1, yq “ Bpx, yq .

As a consequence, we obtain from (3.2.11) the equation

Bpx, yq “ Bpx, y ` 1q `Bpx` 1, yq “x` y

yBpx, y ` 1q

328

which results inBpx, y ` 1q “

y

x` yBpx, yq . (3.2.12)

By induction, we conclude from (3.2.12) that

\[
B(x, y+n) = \frac{y \cdot (y+1) \cdots (y+n-1)}{(x+y) \cdot (x+y+1) \cdots (x+y+n-1)}\, B(x,y)
\]

for every $n \in \mathbb{N}^*$. In particular,

\begin{align*}
B(y, n) &= \frac{1 \cdot 2 \cdots (n-1)}{y \cdot (y+1) \cdots (y+n-1)} , \\
B(x+y, n) &= \frac{1 \cdot 2 \cdots (n-1)}{(x+y) \cdot (x+y+1) \cdots (x+y+n-1)} \tag{3.2.13}
\end{align*}

where it has been used that

\[
B(z, 1) = \int_0^1 t^{z-1}\, dt = \frac{1}{z}
\]

for every $z > 0$. Hence it follows that

\[
B(x,y) = \frac{B(y,n)\, B(x, y+n)}{B(x+y, n)}
= \left( \frac{n}{y+n} \right)^{x} \frac{n^{y} B(y,n)\; (y+n)^{x} B(x, y+n)}{n^{x+y} B(x+y, n)} .
\]

From this follows (3.2.10) by taking the limit $n \to \infty$ and applying (3.2.9).

Note that, as a consequence of (3.2.9) and the first identity of (3.2.13), we arrive at Gauss' representation of the gamma function

\[
\Gamma(x) = \lim_{n \to \infty} (n+1)^x B(x, n+1) = \lim_{n \to \infty} n^x B(x, n+1)
= \lim_{n \to \infty} \frac{n^x\, n!}{x \cdot (x+1) \cdots (x+n)}
\]

for every $x > 0$.

Theorem 3.2.15. (Gauss' representation of the gamma function) For every $x > 0$

\[
\Gamma(x) = \lim_{n \to \infty} \frac{n^x\, n!}{x \cdot (x+1) \cdots (x+n)} . \tag{3.2.14}
\]
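The convergence stated in (3.2.14) can be observed numerically. The following is only a minimal Python sketch and not part of the original text: the function name `gauss_gamma` is a hypothetical choice, logarithms are used solely to avoid overflow of $n!$ for large $n$, and the reference value is taken from the standard library function `math.gamma`.

```python
import math


def gauss_gamma(x: float, n: int) -> float:
    """Return n^x * n! / (x (x+1) ... (x+n)), the n-th member of the sequence in (3.2.14)."""
    # log(n!) = lgamma(n + 1); working with logarithms avoids overflow for large n.
    log_numerator = x * math.log(n) + math.lgamma(n + 1)
    log_denominator = math.fsum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_numerator - log_denominator)


if __name__ == "__main__":
    x = 2.5  # illustrative argument, chosen arbitrarily
    for n in (10, 1_000, 100_000):
        print(n, gauss_gamma(x, n), math.gamma(x))
```

The printed values approach `math.gamma(x)` only slowly, which suggests that (3.2.14) is of more theoretical than computational interest.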

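As an application of Gauss' representation of the gamma function and the product representation of the sine, (3.1.9), we prove the reflection formula for $\Gamma$.

Theorem 3.2.16. (Euler's reflection formula for the gamma function) The equation

\[
\Gamma(x)\, \Gamma(1-x) = \frac{\pi}{\sin(\pi x)} \tag{3.2.15}
\]

holds for all $0 < x < 1$.

Proof. For this, let $0 < x < 1$. Then it follows by (3.2.14) and (3.1.9) that

\begin{align*}
\Gamma(x)\, \Gamma(1-x)
&= \left[ \lim_{n \to \infty} \frac{n^x\, n!}{x \cdot (x+1) \cdots (x+n)} \right] \cdot \left[ \lim_{n \to \infty} \frac{n^{1-x}\, n!}{(1-x) \cdot (2-x) \cdots [(n+1)-x]} \right] \\
&= \frac{1}{x} \lim_{n \to \infty} \frac{(n!)^2}{(1 - x^2) \cdots (n^2 - x^2)} \cdot \frac{n}{(n+1) - x} \\
&= \frac{1}{x} \lim_{n \to \infty} \frac{1}{\left(1 - \frac{x^2}{1^2}\right) \cdots \left(1 - \frac{x^2}{n^2}\right)}
= \frac{1}{x} \cdot \frac{\pi x}{\sin(\pi x)} = \frac{\pi}{\sin(\pi x)} .
\end{align*}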

Remark 3.2.17. Note that the reflection formula (3.2.15) can be, and is, used to extend the gamma function to negative values of its argument. See Fig. 80.

As final examples for the application of improper integrals, Legendre's duplication formula for the gamma function is proved, and an occasionally occurring integral is evaluated in terms of the gamma function.

Fig. 80: Graphs of the extensions of the gamma function $\Gamma$ (left) and $1/\Gamma$ (right) to negative values of the argument.

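Example 3.2.18. Show Legendre's duplication formula for the gamma function

\[
\Gamma(2x) = \frac{1}{\sqrt{\pi}}\, 2^{2x-1}\, \Gamma(x)\, \Gamma(x + (1/2)) \tag{3.2.16}
\]

for all $x > 0$. Solution: For this, let $x > 0$ and $\varepsilon, \delta \in (0, 1/2)$. Then

\[
\frac{\Gamma(x)\Gamma(x)}{\Gamma(2x)} = B(x,x) = \int_0^1 [t(1-t)]^{x-1}\, dt .
\]

Further, it follows by change of variables that

\begin{align*}
\int_{\varepsilon}^{1-\delta} [t(1-t)]^{x-1}\, dt
&= \int_{\varepsilon}^{1-\delta} \left\{ \left( t - \frac{1}{2} + \frac{1}{2} \right) \left[ \frac{1}{2} - \left( t - \frac{1}{2} \right) \right] \right\}^{x-1} dt \\
&= \int_{\varepsilon - (1/2)}^{(1/2) - \delta} \left( \frac{1}{4} - u^2 \right)^{x-1} du
= 2^{2-2x} \int_{\varepsilon - (1/2)}^{(1/2) - \delta} \left[ 1 - (2u)^2 \right]^{x-1} du \\
&= 2^{1-2x} \int_{2\varepsilon - 1}^{1 - 2\delta} (1 - v^2)^{x-1}\, dv \\
&= 2^{1-2x} \left[ \int_0^{1-2\delta} (1 - v^2)^{x-1}\, dv + \int_{2\varepsilon - 1}^{0} (1 - v^2)^{x-1}\, dv \right] \\
&= 2^{1-2x} \left[ \int_0^{1-2\delta} (1 - v^2)^{x-1}\, dv + \int_0^{1-2\varepsilon} (1 - v^2)^{x-1}\, dv \right] . \tag{3.2.17}
\end{align*}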

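Further, it follows by change of variables that

\[
\int_a^b (1 - v^2)^{x-1}\, dv = \frac{1}{2} \int_{a^2}^{b^2} y^{-1/2} (1-y)^{x-1}\, dy ,
\]

where $0 < a < b$, and hence by taking the limit $a \to 0$ that

\[
\int_0^b (1 - v^2)^{x-1}\, dv = \frac{1}{2} \int_0^{b^2} y^{-1/2} (1-y)^{x-1}\, dy .
\]

Hence it follows from (3.2.17) that

\[
\int_{\varepsilon}^{1-\delta} [t(1-t)]^{x-1}\, dt
= 2^{-2x} \left[ \int_0^{(1-2\delta)^2} y^{-1/2} (1-y)^{x-1}\, dy + \int_0^{(1-2\varepsilon)^2} y^{-1/2} (1-y)^{x-1}\, dy \right]
\]

and by taking the limits that

\[
\frac{\Gamma(x)\Gamma(x)}{\Gamma(2x)} = 2^{1-2x}\, B(1/2, x) = 2^{1-2x}\, \frac{\Gamma(1/2)\, \Gamma(x)}{\Gamma(x + (1/2))} ,
\]

which, because of $\Gamma(1/2) = \sqrt{\pi}$, is equivalent to (3.2.16).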

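Example 3.2.19. Show that

\[
\int_0^{\pi/2} \sin^{\mu}(\theta) \cos^{\nu}(\theta)\, d\theta
= \frac{\Gamma\left( \frac{\mu+1}{2} \right) \Gamma\left( \frac{\nu+1}{2} \right)}{2\, \Gamma\left( \frac{\mu+\nu}{2} + 1 \right)} \tag{3.2.18}
\]

for all $\mu, \nu > -1/2$. Solution: For this, let $\mu, \nu > -1/2$ and $\varepsilon, \delta \in (0, 1/2)$. Then it follows by change of variables that

\begin{align*}
\int_{\varepsilon}^{1-\delta} t^{(\mu-1)/2} (1-t)^{(\nu-1)/2}\, dt
&= \int_{\arcsin(\sqrt{\varepsilon})}^{\arcsin(\sqrt{1-\delta})} 2 \sin(\theta) \cos(\theta)\, [\sin^2(\theta)]^{(\mu-1)/2}\, [1 - \sin^2(\theta)]^{(\nu-1)/2}\, d\theta \\
&= 2 \int_{\arcsin(\sqrt{\varepsilon})}^{\arcsin(\sqrt{1-\delta})} \sin^{\mu}(\theta) \cos^{\nu}(\theta)\, d\theta
\end{align*}

and hence by taking the limits that

\[
\frac{\Gamma\left( \frac{\mu+1}{2} \right) \Gamma\left( \frac{\nu+1}{2} \right)}{\Gamma\left( \frac{\mu+\nu}{2} + 1 \right)}
= B\left( \frac{\mu+1}{2}, \frac{\nu+1}{2} \right)
= 2 \int_0^{\pi/2} \sin^{\mu}(\theta) \cos^{\nu}(\theta)\, d\theta .
\]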
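Formula (3.2.18) can be sanity-checked numerically. The sketch below is an illustration only, under the following assumptions: the names `beta_formula` and `midpoint_integral` are hypothetical, the integral is approximated by a plain midpoint rule, and the exponents $\mu = 2$, $\nu = 3$ are chosen because the integral then has the elementary value $2/15$ (substitute $u = \sin\theta$).

```python
import math


def beta_formula(mu: float, nu: float) -> float:
    """Right-hand side of (3.2.18), evaluated with the library gamma function."""
    return (math.gamma((mu + 1) / 2) * math.gamma((nu + 1) / 2)
            / (2 * math.gamma((mu + nu) / 2 + 1)))


def midpoint_integral(mu: float, nu: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of sin^mu * cos^nu over [0, pi/2]."""
    h = (math.pi / 2) / steps
    return h * math.fsum(
        math.sin((i + 0.5) * h) ** mu * math.cos((i + 0.5) * h) ** nu
        for i in range(steps)
    )


if __name__ == "__main__":
    print(midpoint_integral(2, 3), beta_formula(2, 3), 2 / 15)
```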

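Problems

1) Show the existence in the sense of an improper Riemann integral and calculate the value. In this, if applicable, $s, a > 0$.

a) $\int_0^1 \ln(x)\, dx$ , b) $\int_0^1 x \ln(x)\, dx$ ,

c) $\int_0^{\infty} e^{-sx}\, dx$ , d) $\int_0^{\infty} e^{-sx} \sin(ax)\, dx$ ,

e) $\int_0^{\infty} e^{-sx} \cos(ax)\, dx$ , f) $\int_0^{\infty} e^{-\sqrt{x}}\, dx$ ,

g) $\int_0^{\infty} \frac{e^{-\sqrt{x}}}{\sqrt{x}}\, dx$ , h) $\int_{-\infty}^{\infty} x \exp(-x^2)\, dx$ ,

i) $\int_0^{\infty} \frac{dx}{1 + x^3}$ , j) $\int_0^{\infty} \frac{dx}{x^4 + 3}$ ,

k) $\int_{-\infty}^{\infty} \frac{dx}{e^x + e^{-x}}$ , l) $\int_{-\infty}^{\infty} \frac{dx}{x^2 + a^2}$ ,

m) $\int_0^{\infty} \frac{dx}{x^2 + 5x + 6}$ , n) $\int_0^{\infty} \frac{dx}{x^3 + 2x^2 + 3x + 6}$ .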

2) The radial part of the 'wave function' of an electron in a bound state around a proton is given by $R_{nl} : (0,\infty) \to \mathbb{R}$ where $n \in \mathbb{N}^*$, $l \in \{0, \dots, n-1\}$ are the principal quantum number and the azimuthal quantum number, respectively [62]. Calculate the expectation value $\langle r \rangle$ of the radial position of the electron in the corresponding state given by

\[
\langle r \rangle = \frac{\int_0^{\infty} r^3 R_{nl}^2(r)\, dr}{\int_0^{\infty} r^2 R_{nl}^2(r)\, dr} \cdot a
\]

where $a \approx 0.529 \cdot 10^{-8}$ cm is the Bohr radius.

a) $R_{10}(r) = 2\, e^{-r}$ , b) $R_{20}(r) = \frac{\sqrt{2}}{2} \left( 1 - \frac{r}{2} \right) e^{-r/2}$ ,

c) $R_{21}(r) = \frac{\sqrt{6}}{12}\, r\, e^{-r/2}$ ,

d) $R_{30}(r) = \frac{2\sqrt{3}}{9} \left( 1 - \frac{2}{3}\, r + \frac{2}{27}\, r^2 \right) e^{-r/3}$ ,

e) $R_{31}(r) = \frac{8}{27\sqrt{6}} \left( r - \frac{1}{6}\, r^2 \right) e^{-r/3}$ ,

f) $R_{32}(r) = \frac{4}{81\sqrt{30}}\, r^2\, e^{-r/3}$

for all $r > 0$.

Fig. 81: Graphs of $((0,\infty) \to \mathbb{R},\ r \mapsto r^2 R_{nl}^2(r))$ corresponding to a) to f). Compare Problem 2.

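3) The 'wave function' of a 'harmonic oscillator', i.e., the 'wave function' of a point particle of mass $m > 0$ under the influence of a linear restoring force, is given by $\psi_n : \mathbb{R} \to \mathbb{R}$ where $n \in \mathbb{N}$ is the principal quantum number [3]. Calculate the expectation value $\langle x \rangle$ of the position of the mass point in the corresponding state given by

\[
\langle x \rangle = \frac{\int_{-\infty}^{\infty} x\, \psi_n^2(x)\, dx}{\int_{-\infty}^{\infty} \psi_n^2(x)\, dx} .
\]

[In this, $a = (m\omega/\hbar)^{1/2}$, $\omega = (k/m)^{1/2}$, $k > 0$ is the spring constant and $\hbar$ is the reduced Planck constant.]

a) $\psi_0(x) = \left( \frac{a}{\sqrt{\pi}} \right)^{1/2} e^{-a^2 x^2/2}$ ,

b) $\psi_1(x) = \left( \frac{a}{2\sqrt{\pi}} \right)^{1/2} 2ax\, e^{-a^2 x^2/2}$ ,

c) $\psi_2(x) = \left( \frac{a}{8\sqrt{\pi}} \right)^{1/2} (4a^2 x^2 - 2)\, e^{-a^2 x^2/2}$ ,

d) $\psi_3(x) = \left( \frac{a}{48\sqrt{\pi}} \right)^{1/2} (8a^3 x^3 - 12ax)\, e^{-a^2 x^2/2}$

for all $x \in \mathbb{R}$.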

4) The time for one complete swing ('period') $T$ of a pendulum with length $L > 0$ is given by

\[
T = 2 \sqrt{\frac{L}{g}} \int_{-1}^{1} \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}}
\]

where $\theta_0 \in (-\pi/2, \pi/2)$ is the initial angle of elongation from the position of rest of the pendulum, $k := |\sin(\theta_0/2)|$, and where $g$ is the acceleration of the Earth's gravitational field. Show that the corresponding integral exists in the improper Riemann sense. Split the integrand into a Riemann integrable part and an improper Riemann integrable part, where the latter leads to an integral that can easily be calculated. In this way, give another representation of $T$ that involves only a proper Riemann integral.

Fig. 82: Squares of the wave functions of a harmonic oscillator. Compare Problem 3.

Fig. 83: Archimedes' construction in the quadrature of the parabola. Refer to text.

3.3 Series of Real Numbers

In this section, we start the study of series of real numbers. A special case of an important series, the geometric series, already appeared in Archimedes' second proof of his quadrature of the parabola. For motivation, this second proof is considered in the following.

For this, we consider a parabola along with a line segment AE between two points A and E on that parabola and the point C on the parabolic arc between A and E of largest distance from AE. See Fig. 83. Archimedes proved that the area of the parabolic segment ACE is 4/3 of the area of the inscribed triangle with corners A, C and E. He did this by dissecting the parabolic segment iteratively by triangles constructed from line segments between points on the parabola as follows. In the first step, two triangles with corners A, B, C and C, D, E are constructed from the line segments AC and CE, respectively, in the same way as the triangle with corners A, C, E was constructed from the line segment AE, i.e., the points B and D are the points on the corresponding parabolic arcs of largest distance from AC and CE, respectively. Then the same process is continued with the line segments AB, BC, CD, DE, leading to four new triangles, and so forth.

At the time Archimedes wrote his quadrature of the parabola, the following facts were known to be true for every line segment AE on a parabola. See Fig. 84.

Fig. 84: Auxiliary diagram for the description of results on parabolic segments used in Archimedes' proof. Refer to text.

(i) The tangent to the point C on the parabola of largest distance from AE is parallel to AE.

(ii) The parallel to the axis of the parabola through C halves every line segment BD between two points B and D on the parabola that is parallel to AE.

(iii) If I, G are the points of intersection of the parallel to the axis through C with BD and AE, respectively, then

\[
\frac{CI}{CG} = \frac{(BI)^2}{(AG)^2} . \tag{3.3.1}
\]

Note that they imply that

\[
A_T \le A_P \le 2 A_T \tag{3.3.2}
\]

where $A_P$ denotes the area of the parabolic segment ACE and $A_T$ denotes the area of the inscribed triangle with corners A, C and E. See Fig. 85.

Fig. 85: The double of the area of the triangle ACE gives an upper bound for the area of the parabolic segment ACE. Refer to text.

Fig. 86: Archimedes' construction of quadrature of the parabola including auxiliary lines (dashed) and points. Refer to text.

Archimedes did not prove these facts, but referred for such proofs to earlier works on conics by Euclid and Aristaeus. We will give proofs in Example 3.5.26 below using methods from analytical geometry. With the help of this knowledge, Archimedes concluded that the areas of the triangles ABC and CDE are 1/4 of the areas of the triangles ACG and GCE, respectively. See Fig. 86.

This can be seen as follows. We denote by I the intersection of the parallel to AE through B with the parallel to the axis through C. Note that Fig. 86 suggests that its prolongation goes through the point D, but this will not be used in the following. That this is indeed the case will be a side result of the proof. Further, we denote by G the intersection of AE with the parallel to the axis through C. Finally, we denote by J, H the intersections of the parallel to the axis of the parabola through B with AC and AE, respectively. Since this parallel halves AC, and since BH and CG as well as BI and HG are parallel, we conclude that

\[
AH = HG = \frac{1}{2}\, AG , \quad BI = HG , \quad BH = IG
\]

and hence by (3.3.1) that

\[
\frac{CI}{CG} = \frac{(BI)^2}{(AG)^2} = \frac{(HG)^2}{(2\,HG)^2} = \frac{1}{4} . \tag{3.3.3}
\]

Further, the triangles with corners A, J, H and A, C, G are similar. Hence

\[
\frac{JH}{CG} = \frac{AH}{AG} = \frac{1}{2} .
\]

In particular, with the help of the last and (3.3.3), it follows that

\[
BJ = BH - JH = IG - JH = CG - CI - JH = \frac{3}{4}\, CG - JH = \frac{3}{2}\, JH - JH = \frac{1}{2}\, JH .
\]

Hence the triangles ABC and ACH have the side AC in common, and the corresponding height of the triangle ABC (= distance from AC to B) is half of the corresponding height of the triangle ACH (= distance from AC to H). Hence the area of the triangle ABC is half the area of the triangle ACH. Now also the triangles ACH and ACG have the side AC in common, and the corresponding height of the triangle ACH (= distance from AC to H) is half of the corresponding height of the triangle ACG (= distance from AC to G). Hence it follows that the area of the triangle ABC is 1/4 of the area of the triangle ACG.

The reasoning is analogous for the areas of the triangles CDE and GCE, respectively. See Fig. 86. For this, we denote by I the intersection of the parallel to AE through D with the parallel to the axis through C. Note that this definition of the point I could conflict with its previous definition. But only the last definition will be used in the following, and a by-product of the proof is that these points indeed coincide. As before, we denote by G the intersection of AE with the parallel to the axis through C. Finally, we denote by K, F the intersections of the parallel to the axis of the parabola through D with CE and AE, respectively. Since this parallel halves CE, and since CG and DF as well as ID and GF are parallel, we conclude that

\[
GF = FE = \frac{1}{2}\, GE , \quad ID = GF , \quad DF = IG
\]

and hence by (3.3.1) that

\[
\frac{CI}{CG} = \frac{(ID)^2}{(GE)^2} = \frac{(GF)^2}{(2\,GF)^2} = \frac{1}{4} . \tag{3.3.4}
\]

Note that this implies that both previous definitions of I coincide. Further, the triangles with corners F, K, E and G, C, E are similar. Hence

\[
\frac{KF}{CG} = \frac{FE}{GE} = \frac{1}{2} .
\]

In particular, with the help of the last and (3.3.4), it follows that

\[
DK = DF - KF = IG - KF = \frac{3}{4}\, CG - KF = \frac{3}{2}\, KF - KF = \frac{1}{2}\, KF .
\]

Hence the triangles CDE and CEF have the side CE in common, and the corresponding height of the triangle CDE (= distance from CE to D) is half of the corresponding height of the triangle CEF (= distance from CE to F). Hence the area of the triangle CDE is half the area of the triangle CEF. Now also the triangles CEF and GCE have the side CE in common, and the corresponding height of the triangle CEF (= distance from CE to F) is half of the corresponding height of the triangle GCE (= distance from CE to G). Hence it follows that the area of the triangle CDE is 1/4 of the area of the triangle GCE.

As a consequence, the sum of the areas of the triangles ABC and CDE is 1/4 of the area of the triangle ACE. Hence it follows that

\[
\frac{A_T}{4} \le A_P - A_T \le 2\, \frac{A_T}{4}
\]

and inductively that

\[
\frac{A_T}{4^{n+1}} \le A_P - A_T \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k \le \frac{2 A_T}{4^{n+1}} \tag{3.3.5}
\]

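for every $n \in \mathbb{N}$. At this point, one observes that

\[
\frac{1}{4^{n+1}} + \frac{1}{3} \cdot \frac{1}{4^{n+1}} = \frac{4}{3} \cdot \frac{1}{4^{n+1}} = \frac{1}{3} \cdot \frac{1}{4^{n}}
\]

for every $n \in \mathbb{N}$, which leads to

\[
\frac{1}{3} \cdot \frac{1}{4^{n+1}} + \sum_{k=0}^{n+1} \left( \frac{1}{4} \right)^k
= \frac{1}{4^{n+1}} + \frac{1}{3} \cdot \frac{1}{4^{n+1}} + \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k
= \frac{1}{3} \cdot \frac{1}{4^{n}} + \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k
\]

for every $n \in \mathbb{N}$. Hence it follows that

\[
\frac{1}{3} \cdot \frac{1}{4^{n}} + \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k
= \frac{1}{3} \cdot \frac{1}{4^{0}} + \sum_{k=0}^{0} \left( \frac{1}{4} \right)^k = \frac{4}{3}
\]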

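for every $n \in \mathbb{N}$. For every $n \in \mathbb{N}$, this leads to

\[
\frac{A_T}{4^{n+1}} \le A_P - A_T \left( \frac{4}{3} - \frac{1}{3} \cdot \frac{1}{4^{n}} \right) \le \frac{2 A_T}{4^{n+1}}
\]

which is equivalent to

\[
-\frac{1}{3} \cdot \frac{A_T}{4^{n+1}} = \frac{A_T}{4^{n+1}} - \frac{A_T}{3} \cdot \frac{1}{4^{n}}
\le A_P - \frac{4}{3}\, A_T
\le \frac{2 A_T}{4^{n+1}} - \frac{A_T}{3} \cdot \frac{1}{4^{n}} = \frac{2}{3} \cdot \frac{A_T}{4^{n+1}} . \tag{3.3.6}
\]

Differently to Archimedes, we can conclude from this with the help of Theorem 2.3.12 directly that

\[
A_P = \frac{4}{3}\, A_T .
\]

Since the limit concept was not developed at that time, Archimedes had to employ the usual 'double reductio ad absurdum' argument for this, i.e., to lead both assumptions that $A_P < 4A_T/3$ and that $A_P > 4A_T/3$ to a contradiction, which leaves only the option that $A_P = 4A_T/3$. This can be done as follows. In both cases, we use that $4^{n+1} > n \ln 4$ for every $n \in \mathbb{N}$. First, we assume that $A_P = 4A_T/3 - \varepsilon$ for some $\varepsilon > 0$. Then it follows for $n \in \mathbb{N}$ satisfying

\[
n > \frac{A_T}{3\, \varepsilon \ln 4}
\]

that

\[
A_P - \frac{4}{3}\, A_T = -\varepsilon < -\frac{1}{3} \cdot \frac{A_T}{4^{n+1}}
\]

which contradicts (3.3.6). Second, we assume that $A_P = 4A_T/3 + \varepsilon$ for some $\varepsilon > 0$. Then it follows for $n \in \mathbb{N}$ satisfying

\[
n > \frac{2 A_T}{3\, \varepsilon \ln 4}
\]

that

\[
A_P - \frac{4}{3}\, A_T = \varepsilon > \frac{2}{3} \cdot \frac{A_T}{4^{n+1}}
\]

which contradicts (3.3.6). Hence the only remaining possibility is that $A_P = 4A_T/3$. Of course, in ancient Greece only rational $\varepsilon$ were considered in such analysis.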

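A modern way of stating Archimedes' result can be given as follows. Since it follows from (3.3.5) that

\[
\frac{1}{A_T} \left( A_P - \frac{2 A_T}{4^{n+1}} \right)
\le \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k
\le \frac{1}{A_T} \left( A_P - \frac{A_T}{4^{n+1}} \right)
\le \frac{A_P}{A_T} \tag{3.3.7}
\]

for every $n \in \mathbb{N}$, the sequence $S_0, S_1, \dots$, defined by

\[
S_n := \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k
\]

for every $n \in \mathbb{N}$, is increasing and bounded from above by $A_P / A_T$ and hence convergent. In particular, it follows from (3.3.7) by Theorem 2.3.12 that

\[
A_P = A_T \lim_{n \to \infty} \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k .
\]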

In the following, the natural notation

\[
\sum_{k=0}^{\infty} \left( \frac{1}{4} \right)^k := \lim_{n \to \infty} \sum_{k=0}^{n} \left( \frac{1}{4} \right)^k
\]

will be used and referred to as the 'sum of the sequence $x_0, x_1, \dots$', defined by

\[
x_k := \left( \frac{1}{4} \right)^k
\]

for every $k \in \mathbb{N}$. In addition, the sequence $S_0, S_1, \dots$ will be called 'the sequence of partial sums of $x_0, x_1, \dots$'. Sequences of partial sums are also called 'series'. In this sense, Archimedes calculates the sum of the sequence $1, q, q^2, \dots$ for the case $q = 1/4$, which is given by $4/3$. The series corresponding to the sequences $1, q, q^2, \dots$ where $q$ runs through all real numbers are called 'geometric series'.

Definition 3.3.1. Let $x_1, x_2, \dots$ be a sequence of elements of $\mathbb{R}$. We say that $x_1, x_2, \dots$ is summable if the corresponding sequence of partial sums $S_1, S_2, \dots$, defined by

\[
S_n := \sum_{k=1}^{n} x_k \tag{3.3.8}
\]

for every $n \in \mathbb{N}^*$, is convergent to some real number. In this case, the sum of $x_1, x_2, \dots$ is denoted by

\[
\sum_{k=1}^{\infty} x_k . \tag{3.3.9}
\]

Otherwise, we say that $x_1, x_2, \dots$ is not summable. The sequence in (3.3.8) is also called a series and, in case of its convergence, a convergent series with its sum denoted by (3.3.9). In case of its divergence, that series is called divergent.

In the following, we give two examples of series that play an important role in the analysis of the convergence of series: geometric series and the harmonic series. The former contain a real parameter. The corresponding geometric series converges if and only if the absolute value of that parameter is smaller than 1. The harmonic series is divergent.

Example 3.3.2. (Geometric series) Let $x \in \mathbb{R}$. In the following, we use the convention that $x^0 := 1$. Show that the so-called geometric series $S_0, S_1, \dots$, defined by

\[
S_n := \sum_{k=0}^{n} x^k
\]

for every $n \in \mathbb{N}$, is convergent if and only if $|x| < 1$. In the last case, show that

\[
\sum_{k=0}^{\infty} x^k = \frac{1}{1-x} .
\]

Solution: Note that in the case $x = 1$, it follows that $S_n = n + 1$ and hence the divergence of the corresponding geometric series. For $x \ne 1$, it follows that

\[
x \cdot S_n = \sum_{k=0}^{n} x^{k+1} = \sum_{k=1}^{n+1} x^k = S_n - 1 + x^{n+1}
\]

and hence that

\[
S_n = \frac{1 - x^{n+1}}{1 - x} .
\]

As a consequence, the sequence of partial sums is convergent if and only if $|x| < 1$, and in this case

\[
\sum_{k=0}^{\infty} x^k = \lim_{n \to \infty} S_n = \frac{1}{1-x} .
\]

Fig. 87: Partial sums of the harmonic series and graphs of $\ln$ and $2^{-1} + [2 \ln(2)]^{-1} \ln$.
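The closed form of the partial sums derived in Example 3.3.2 can be compared with direct summation. The following short Python sketch is illustrative only; the function names are hypothetical, and the parameter values ($x = 1/4$, the case treated by Archimedes, and $x = -1/2$) are arbitrary choices.

```python
def geometric_partial_sum(x: float, n: int) -> float:
    """S_n = sum of x^k for k = 0, ..., n, computed term by term."""
    return sum(x ** k for k in range(n + 1))


def geometric_closed_form(x: float, n: int) -> float:
    """S_n = (1 - x^(n+1)) / (1 - x), valid for x != 1."""
    return (1 - x ** (n + 1)) / (1 - x)


if __name__ == "__main__":
    for x in (0.25, -0.5):
        for n in (5, 10, 30):
            print(x, n, geometric_partial_sum(x, n), geometric_closed_form(x, n), 1 / (1 - x))
```

For $x = 1/4$ the partial sums approach $4/3$, in agreement with the area ratio found in the quadrature of the parabola.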

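Example 3.3.3. (Harmonic series) Show that the harmonic series, defined by

\[
S_n := \sum_{k=1}^{n} \frac{1}{k}
\]

for every $n \in \mathbb{N}^*$, is divergent. Solution: For every $n \in \mathbb{N} \setminus \{0, 1\}$, it follows that

\begin{align*}
\sum_{k=1}^{2^n} \frac{1}{k}
&\ge \sum_{k=1}^{2^0} \frac{1}{k} + \sum_{k=2^1+1}^{2^2} \frac{1}{k} + \dots + \sum_{k=2^{n-1}+1}^{2^n} \frac{1}{k} \\
&\ge \sum_{k=1}^{2^0} \frac{1}{2^0} + \sum_{k=2^1+1}^{2^2} \frac{1}{2^2} + \dots + \sum_{k=2^{n-1}+1}^{2^n} \frac{1}{2^n} \\
&= 1 + (2^2 - 2^1) \cdot \frac{1}{2^2} + \dots + (2^n - 2^{n-1}) \cdot \frac{1}{2^n} \\
&= 1 + \frac{1}{2} + \dots + \frac{1}{2} = 1 + \frac{n-1}{2} = \frac{n+1}{2}
\left( = \frac{\ln(2^n) + \ln(2)}{2 \ln(2)} \right)
\end{align*}

and hence the divergence of the harmonic series.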
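The lower bound $(n+1)/2$ for the $2^n$-th partial sum can be checked numerically. The sketch below is an illustration under the assumption that floating-point summation with `math.fsum` is accurate enough for this purpose; the function name `harmonic` is hypothetical.

```python
import math


def harmonic(n: int) -> float:
    """Partial sum S_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return math.fsum(1 / k for k in range(1, n + 1))


if __name__ == "__main__":
    for n in range(2, 15):
        # Compare S_{2^n} with the lower bound (n + 1) / 2 from Example 3.3.3.
        print(n, harmonic(2 ** n), (n + 1) / 2)
```

The output shows the partial sums growing without bound, although only logarithmically in the number of summands.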

Remark 3.3.4. Note that because series are sequences of partial sums, we can apply the limit laws of Theorem 2.3.4 to series.

Often, a given series consists of the partial sums corresponding to a sequence of the form $f(1), f(2), \dots$ where $f : [1,\infty) \to \mathbb{R}$ is some function. For instance, in the case of a geometric series corresponding to $q > 0$, such a function is given by

\[
f(x) := e^{(x-1) \ln q}
\]

for $x \ge 1$, and in the case of the harmonic series, such a function is given by

\[
f(x) := \frac{1}{x}
\]

for $x \ge 1$. We note that in such a case, the sequence of partial sums of $f(1), f(2), \dots$, defined by

\[
\sum_{k=1}^{n} f(k)
\]

for every $n \in \mathbb{N}^*$, has the form of a Riemann sum, i.e., the form of sums used in the definition of the Riemann integral, corresponding to a decomposition of $[1,\infty)$ into the intervals $[1,2], [2,3], \dots$ of length 1. Hence, we would expect that there is a relationship between the existence of the improper Riemann integral of $f$ and the convergence of the series. Indeed, this is true for a particular class of functions $f$.

Theorem 3.3.5. (Integral test) Let $f : [1,\infty) \to \mathbb{R}$ be positive, decreasing and almost everywhere continuous. Then $f(1), f(2), \dots$ is summable if and only if $f$ is improper Riemann-integrable. In this case,

\[
\int_1^{\infty} f(x)\, dx \le \sum_{k=1}^{\infty} f(k) \le f(1) + \int_1^{\infty} f(x)\, dx \tag{3.3.10}
\]

as well as

\[
\int_{m+1}^{\infty} f(x)\, dx \le \sum_{k=m+1}^{\infty} f(k) \le \int_m^{\infty} f(x)\, dx \tag{3.3.11}
\]

for every $m \in \mathbb{N}^*$.

Proof. For this, we define the auxiliary function $g : [1,\infty) \to \mathbb{R}$ by $g(1) := f(2)$ as well as $g(x) := f(k+1)$ for all $x \in (k, k+1]$ and $k \in \mathbb{N}^*$. Then

\[
\int_m^{n+1} g(x)\, dx = \sum_{k=m}^{n} \int_k^{k+1} g(x)\, dx = \sum_{k=m+1}^{n+1} f(k)
\]

for every $m, n \in \mathbb{N}^*$ such that $m \le n$. If $f$ is improper Riemann-integrable, it follows because of $|g| \le f$ and by Theorem 3.2.6 that $g$ is improper Riemann-integrable and hence that $f(1), f(2), \dots$ is summable and

\[
\sum_{k=m+1}^{\infty} f(k) \le \int_m^{\infty} f(x)\, dx
\]

for every $m \in \mathbb{N}^*$. If on the other hand $f(1), f(2), \dots$ is summable, we define the auxiliary function $h : [1,\infty) \to \mathbb{R}$ by $h(x) := f(k)$ for all $x \in [k, k+1)$ and $k \in \mathbb{N}^*$. Then

\[
\int_m^{x} f(y)\, dy \le \int_m^{x} h(y)\, dy \le \sum_{k=m}^{\infty} f(k)
\]

for every $m \in \mathbb{N}^*$ and $x \in [1,\infty)$. Hence it follows by Theorem 3.2.6 that $f$ is improper Riemann-integrable and that

\[
\sum_{k=m}^{\infty} f(k) \ge \int_m^{\infty} f(y)\, dy
\]

for every $m \in \mathbb{N}^*$.

Remark 3.3.6. Note that (3.3.11) can be used to estimate remainder terms of the sequence.
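As an illustration of the remainder estimate (3.3.11), consider $f(x) = 1/x^2$, for which the integrals are elementary: $\int_m^{\infty} x^{-2}\, dx = 1/m$. The Python sketch below is a hedged example, not part of the original text. It assumes the known value $\sum_{k=1}^{\infty} 1/k^2 = \pi^2/6$, which is not derived in this section, in order to compute the remainder exactly up to rounding; all function names are hypothetical.

```python
import math


def remainder_bounds(m: int) -> tuple[float, float]:
    """Lower and upper bound from (3.3.11) for f(x) = 1/x^2: 1/(m+1) and 1/m."""
    return 1 / (m + 1), 1 / m


def remainder(m: int) -> float:
    """Sum of 1/k^2 over k > m, computed as pi^2/6 minus the m-th partial sum.

    The value pi^2/6 of the full sum is an assumption taken from outside this section.
    """
    return math.pi ** 2 / 6 - math.fsum(1 / k ** 2 for k in range(1, m + 1))


if __name__ == "__main__":
    for m in (1, 10, 100):
        lower, upper = remainder_bounds(m)
        print(m, lower, remainder(m), upper)
```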

The following two examples give applications of the integral test to further series that play an important role in the analysis of the convergence of series. In particular, the following example defines Riemann's zeta function, which has important applications in the description of the distribution of the prime numbers. Further applications are in quantum statistical physics and quantum field theory. Finally, there is a famous problem concerning the zeros of the extension of Riemann's zeta function to complex numbers. All even integers that are smaller than 0 are zeros of that extension. Riemann's conjecture from 1859 claims that all other zeros have the real part 1/2. It is not yet known whether this is true. The solution to this problem would have profound consequences in the theory of numbers.

Example 3.3.7. (Riemann's zeta function) Show that by

\[
\zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s}
\]

for every $s \in (1,\infty)$ there is defined a function $\zeta : (1,\infty) \to \mathbb{R}$. This function is called Riemann's zeta function. Solution: For every $s \in (1,\infty)$ the corresponding function $f_s : [1,\infty) \to \mathbb{R}$ defined by $f_s(x) := 1/x^s$ for every $x \ge 1$ is positive, decreasing and continuous and by Example 3.2.5 improper Riemann-integrable. Hence the statement follows from Theorem 3.3.5. In addition, it follows by (3.3.10) that

\[
\frac{1}{s-1} = \int_1^{\infty} f_s(x)\, dx \le \zeta(s) \le 1 + \int_1^{\infty} f_s(x)\, dx = \frac{s}{s-1} .
\]

Fig. 88: Graphs of $\zeta$ (black), $1/(s-1)$ (blue) and $s/(s-1)$ (red).

Fig. 89: Partial sums of the series from Example 3.3.8 for the case $p = 1$.

Example 3.3.8. Let $p \ge 1$. Determine whether the sequence $a_2, a_3, \dots$ defined by

\[
a_n := \sum_{k=2}^{n} \frac{1}{k\, (\ln(k))^p}
\]

for every $n \in \mathbb{N} \setminus \{0, 1\}$ is convergent or divergent. Solution: For this, we define the auxiliary function $h : [2,\infty) \to \mathbb{R}$ by $h(x) := x \cdot (\ln(x))^p$ for every $x \ge 2$. Then $h$ is strictly positive, strictly increasing and continuous, and hence $f : [1,\infty) \to \mathbb{R}$ defined by $f(x) := 1/[(x+1)(\ln(x+1))^p]$ for every $x \ge 1$ is positive, strictly decreasing and continuous. Further, for $p = 1$:

\[
\int_1^{n} \frac{dx}{(x+1) \ln(x+1)} = \ln(\ln(n+1)) - \ln(\ln(2))
\]

for every $n \in \mathbb{N}^*$. Using that $\ln(\ln(2^m)) = \ln(m \ln(2))$ for every $m \in \mathbb{N}^*$, it follows that $f$ is not improper Riemann-integrable. Hence the divergence of the corresponding sequence $a_2, a_3, \dots$ follows by Theorem 3.3.5. For $p > 1$ it follows that

\[
\int_1^{x} \frac{dy}{(y+1)(\ln(y+1))^p} = \frac{1}{1-p} \cdot \left[ (\ln(x+1))^{1-p} - (\ln(2))^{1-p} \right]
\]

for every $x \ge 1$ and hence the improper Riemann-integrability of $f$ and, by Theorem 3.3.5, the convergence of the corresponding sequence $a_2, a_3, \dots$.

The following comparison test is often applied to decide the convergence of a given series. For motivation, we investigate the convergence of the series $S_1, S_2, \dots$ defined by

\[
S_n := \sum_{k=1}^{n} \frac{1}{k^2 + 2}
\]

for all $n \in \mathbb{N}^*$. A basic strategy in the solution of any problem is to investigate whether that problem has a peculiarity that prevents its immediate solution. Indeed, without the addition of 2 in the denominator of the summands, $S_1, S_2, \dots$ would coincide with the zeta series corresponding to $s = 2$, which was shown to converge. In such cases, it is often possible to reduce, in some sense, the solution of the given problem to the solution of the simpler problem. For instance in this case, we notice that

\[
S_n = \sum_{k=1}^{n} \frac{1}{k^2 + 2} \le \sum_{k=1}^{n} \frac{1}{k^2} \le \sum_{k=1}^{\infty} \frac{1}{k^2} = \zeta(2)
\]

for every $n \in \mathbb{N}^*$. Hence $S_1, S_2, \dots$ is an increasing sequence that is bounded from above and therefore convergent (with a sum that is smaller than $\zeta(2)$). The following theorem generalizes this method of comparison of series.

Theorem 3.3.9. (Comparison test) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of positive real numbers. Further, let $x_n \le c\, y_n$ for all $n \in \{N, N+1, \dots\}$ where $c \ge 0$ and $N$ is some element of $\mathbb{N}^*$. If $y_1, y_2, \dots$ is summable, then $x_1, x_2, \dots$ is summable, too.

Proof. If $y_1, y_2, \dots$ is summable, it follows that

\[
\sum_{k=1}^{n} x_k = \sum_{k=1}^{N-1} x_k + \sum_{k=N}^{n} x_k
\le \sum_{k=1}^{N-1} x_k + c \sum_{k=N}^{n} y_k
\le \sum_{k=1}^{N-1} x_k + c \sum_{k=1}^{\infty} y_k
\]

for every $n \in \mathbb{N}^*$ such that $n \ge N$. Hence the sequence of partial sums of $x_1, x_2, \dots$ is increasing (since $x_k \ge 0$ for all $k \in \mathbb{N}^*$) and bounded from above and therefore convergent.

In Example 3.3.3, we proved that the harmonic series is divergent by showing that

\[
\sum_{k=1}^{2^n} \frac{1}{k} \ge \frac{\ln(2^n) + \ln(2)}{2 \ln(2)}
\]

for every $n \in \mathbb{N}$. In addition, Fig. 87 supported the validity of the more general estimate

\[
\sum_{k=1}^{n} \frac{1}{k} \ge \frac{\ln(n) + \ln(2)}{2 \ln(2)}
\]

for every $n \in \mathbb{N}^*$. The last could indicate a logarithmic increase of the partial sums of the harmonic series with the number of summands. Indeed, as another application of the previous theorem, the following example proves the more precise statement that

\[
\lim_{n \to \infty} \left[ \left( \sum_{k=1}^{n} \frac{1}{k} \right) - \ln(n) \right] = \gamma
\]

where $\gamma$ is a real number in the interval $[0, 1]$ called Euler's constant. To seven decimal places, $\gamma$ is given by 0.5772156.

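Example 3.3.10. Show that the sequence $a_1, a_2, \dots$ defined by

\[
a_n := \left( \sum_{k=1}^{n} \frac{1}{k} \right) - \ln(n)
\]

for all $n \in \mathbb{N}^*$ is convergent. See Fig. 87. Solution: For this, we define an auxiliary sequence $b_1, b_2, \dots$ by

\[
b_n := \left( \sum_{k=1}^{n} \frac{1}{k} \right) - \ln(n+1) = a_n + \ln(n) - \ln(n+1) = a_n + \ln\left( \frac{n}{n+1} \right)
\]

for all $n \in \mathbb{N}^*$. Then

\begin{align*}
b_{n+1} - b_n &= \frac{1}{n+1} - \ln\left( \frac{n+2}{n+1} \right)
= \frac{1}{n+1} \int_0^1 \left( 1 - \frac{n+1}{x+n+1} \right) dx \\
&= \frac{1}{n+1} \int_0^1 \frac{x}{x+n+1}\, dx
\end{align*}

and hence

\[
0 \le b_{n+1} - b_n \le \frac{1}{(n+1)^2}
\]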

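for all $n \in \mathbb{N}^*$. Therefore $b_1, b_2, \dots$ is increasing and bounded from above since

\[
b_n = b_1 + \sum_{k=1}^{n-1} (b_{k+1} - b_k) \le b_1 + \sum_{k=1}^{\infty} \frac{1}{k^2}
\]

for all $n \in \mathbb{N} \setminus \{0, 1\}$. Hence $b_1, b_2, \dots$ and $a_1, a_2, \dots$ are convergent. The constant

\[
\gamma := \lim_{n \to \infty} \left[ \left( \sum_{k=1}^{n} \frac{1}{k} \right) - \ln(n) \right]
\]

is known as Euler's constant. Presently, it is not yet known whether it is rational or irrational. Since

\begin{align*}
\ln(n+1) &= \int_1^{n+1} \frac{dx}{x} = \sum_{k=1}^{n} \int_k^{k+1} \frac{dx}{x}
\le \sum_{k=1}^{n} \frac{1}{k} = 1 + \sum_{k=1}^{n-1} \frac{1}{k+1} \\
&\le 1 + \sum_{k=1}^{n-1} \int_k^{k+1} \frac{dx}{x} = 1 + \int_1^{n} \frac{dx}{x} = 1 + \ln(n) ,
\end{align*}

it follows that

\[
0 \le \ln\left( \frac{n+1}{n} \right) \le a_n \le 1
\]

for every $n \in \mathbb{N} \setminus \{0, 1\}$ and hence that $0 \le \gamma \le 1$. To seven decimal places it is given by 0.5772156.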
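The two sequences used in Example 3.3.10 can be evaluated numerically to see how they enclose Euler's constant. The following Python sketch is for illustration only; the function names `a` and `b` mirror the sequences of the example, and the observation that $a_n$ approaches $\gamma$ from above is an empirical remark that is not proved in the text.

```python
import math


def a(n: int) -> float:
    """a_n = (1 + 1/2 + ... + 1/n) - ln(n)."""
    return math.fsum(1 / k for k in range(1, n + 1)) - math.log(n)


def b(n: int) -> float:
    """b_n = (1 + 1/2 + ... + 1/n) - ln(n + 1), the increasing auxiliary sequence."""
    return math.fsum(1 / k for k in range(1, n + 1)) - math.log(n + 1)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        # b_n increases towards the limit; a_n is observed to approach it from above.
        print(n, b(n), a(n))
```

For $n = 100000$ both values agree with 0.5772156 to about five decimal places.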

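The following example derives Weierstrass' representation of the gamma function as a simple consequence of the previous result and Gauss' representation (3.2.14) of the gamma function.

Example 3.3.11. (Weierstrass' representation of the gamma function) Show Weierstrass' representation of the gamma function

\[
\frac{1}{\Gamma(x)} = x\, e^{\gamma x} \lim_{n \to \infty} \prod_{k=1}^{n} \left( 1 + \frac{x}{k} \right) e^{-x/k} \tag{3.3.12}
\]

for every $x > 0$ where $\gamma$ is Euler's constant. Solution: For this, let $x > 0$. According to (3.2.14), $\Gamma(x)$ is given by

\[
\Gamma(x) = \lim_{n \to \infty} \frac{n^x\, n!}{x \cdot (x+1) \cdots (x+n)} = x^{-1} \lim_{n \to \infty} n^x \prod_{k=1}^{n} \frac{1}{1 + \frac{x}{k}} .
\]

Further,

\begin{align*}
n^x = \exp(x \ln(n))
&= \exp\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \exp\left( \sum_{k=1}^{n} \frac{x}{k} \right) \\
&= \exp\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \prod_{k=1}^{n} e^{x/k} .
\end{align*}

Hence

\begin{align*}
\Gamma(x) &= x^{-1} \lim_{n \to \infty} \exp\left( x \left[ \ln(n) - \sum_{k=1}^{n} \frac{1}{k} \right] \right) \prod_{k=1}^{n} \frac{e^{x/k}}{1 + \frac{x}{k}} \\
&= x^{-1} e^{-\gamma x} \lim_{n \to \infty} \prod_{k=1}^{n} \frac{e^{x/k}}{1 + \frac{x}{k}} .
\end{align*}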

The following comparison test is a simple consequence of the comparison test from Theorem 3.3.9.

Theorem 3.3.12. (Limit comparison test) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of positive real numbers. Further, let

\[
\lim_{n \to \infty} \frac{x_n}{y_n} = 1 .
\]

(Note that this implies that $y_n > 0$ for all $n \in \{N, N+1, \dots\}$ and some $N \in \mathbb{N}^*$.) Then $x_1, x_2, \dots$ is summable if and only if $y_1, y_2, \dots$ is summable.

Proof. Since $\lim_{k \to \infty} (x_k / y_k) = 1$, there is $N \in \mathbb{N}$ satisfying $N \ge 2$ and such that

\[
\frac{1}{2} \le \frac{x_k}{y_k} \le \frac{3}{2}
\]

for all $k \in \mathbb{N}^*$ such that $k \ge N$. In particular, this implies that

\[
0 \le y_k \le 2 x_k , \quad 0 \le x_k \le \frac{3 y_k}{2}
\]

for all $k \in \mathbb{N}^*$ such that $k \ge N$. Hence it follows with the help of Theorem 3.3.9 that the sequence $x_N, x_{N+1}, \dots$ is summable if and only if $y_N, y_{N+1}, \dots$ is summable. Since

\[
\sum_{k=1}^{n} x_k = \sum_{k=1}^{N-1} x_k + \sum_{k=N}^{n} x_k , \quad
\sum_{k=1}^{n} y_k = \sum_{k=1}^{N-1} y_k + \sum_{k=N}^{n} y_k
\]

for every $n \in \mathbb{N}^*$ satisfying $n \ge N$, the last also implies the statement of the theorem.

In the following, we give three typical applications of the previous comparison tests. In particular, the two subsequent examples study series which are frequently used in the analysis of the convergence of given series. Furthermore, the following example, along with the fact that the sequence whose members are all equal to 1 is not summable, shows that the sequence $1, 1/2^s, 1/3^s, \dots$ is not summable for $s \le 1$ and hence that $\zeta(s)$ cannot be defined for $s \le 1$ in the same way as for $s > 1$.

Example 3.3.13. Let $p < 1$. Determine whether the sequence $a_1, a_2, \dots$ defined by

\[
a_n := \frac{1}{n^p}
\]

for all $n \in \mathbb{N}^*$, is summable. Solution: Since $p < 1$, it follows that

\[
\frac{1}{n^p} \ge \frac{1}{n}
\]

for all $n \in \mathbb{N}^*$ and hence by Theorem 3.3.9 and Example 3.3.3 that $a_1, a_2, \dots$ is not summable.

Example 3.3.14. Let $p < 1$. Determine whether the sequence $a_2, a_3, \dots$ defined by

\[
a_n := \sum_{k=2}^{n} \frac{1}{k \cdot (\ln(k))^p}
\]

for every $n \in \mathbb{N} \setminus \{0, 1\}$ is convergent or divergent. Solution: Since $p < 1$, it follows for every $k \ge 3$ that

\[
(\ln(k))^{1-p} \ge 1^{1-p} = 1
\]

and hence that

\[
\frac{1}{k \cdot (\ln(k))^p} \ge \frac{1}{k \cdot \ln(k)} .
\]

Hence it follows by Theorem 3.3.9 and Example 3.3.8 that the sequence $a_2, a_3, \dots$ is divergent.

Example 3.3.15. Determine whether the sequence $a_1, a_2, \dots$ defined by

\[
a_n := \frac{3n^2 + n + 1}{(n^5 + 2)^{1/2}}
\]

for all $n \in \mathbb{N}^*$ is summable. Solution: Define

\[
b_n := \frac{3}{n^{1/2}}
\]

for all $n \in \mathbb{N}^*$. Then $b_1, b_2, \dots$ is not summable according to Example 3.3.13. In addition, it follows that

\[
\lim_{n \to \infty} \frac{a_n}{b_n} = 1
\]

and hence by Theorem 3.3.12 also that $a_1, a_2, \dots$ is not summable.

In the 17th and 18th centuries, it was generally assumed that the reordering of the members of a sequence leads to a sequence which is summable if and only if the same is true for the original sequence, and that in that case the sums of both sequences coincide. Indeed, we will see in the following that this is true for absolutely summable sequences, which include sequences of positive ($\ge 0$) real numbers. On the other hand, we will also see that the above statement is false in more general cases. This false belief led to contradictions which plagued the calculus in those centuries. We present one example from that time [60] of a too naive handling of series resulting from a reordering of a sequence whose members alternate in sign.

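Since 1668 [78], it was known that

\[
\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = \ln(2) ,
\]

a fact that will be proved in Example 3.4.19. On the other hand, it was argued that therefore

\begin{align*}
\ln(2) &= \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \dots \right) \\
&= \left( 1 + \frac{1}{3} + \frac{1}{5} + \dots \right) + \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \dots \right) - 2 \left( \frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \dots \right) \\
&= \left( 1 + \frac{1}{2} + \frac{1}{3} + \dots \right) - \frac{2}{2} \left( \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \dots \right) = 0 .
\end{align*}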

It should be noted that already the second line in the above 'derivation' cannot be concluded by the limit laws because all three series inside the brackets diverge. Hence the above can also be viewed as a classic example of the false treatment of $\infty$ as a real number, which was quite common at that time. The discovery of such apparent contradictions contributed essentially to a re-examination and rigorous founding of the theory of infinite series.

A simple example for the fact that the reordering of a sequence can affect its sum is the following. For this, we consider a reordering of the sequence $a_1, a_2, \dots$ defined by

\[
a_k := \frac{(-1)^{k+1}}{k}
\]

for every $k \in \mathbb{N}^*$. The partial sums of this sequence are called the alternating harmonic series, whose sum was also considered in the above 'derivation'.

Fig. 90: Partial sums of the alternating harmonic series and its rearrangement from Example 3.3.16.

Example 3.3.16. (A rearrangement of the alternating harmonic series) For this, we define the sequence $a_1, a_2, \dots$ by

$$a_k := \frac{(-1)^{k+1}}{k}$$

for every $k \in \mathbb{N}^*$, and the sequence $b_1, b_2, \dots$ by

$$\begin{aligned}
b_{3k-2} &:= \frac{1}{4k-3} = (-1)^{4k-2} \cdot \frac{1}{4k-3} = a_{4k-3} , \\
b_{3k-1} &:= \frac{1}{4k-1} = (-1)^{4k} \cdot \frac{1}{4k-1} = a_{4k-1} , \\
b_{3k} &:= -\frac{1}{2k} = (-1)^{2k+1} \cdot \frac{1}{2k} = a_{2k}
\end{aligned}$$

for every $k \in \mathbb{N}^*$. From the last, we conclude that the sequence $b_1, b_2, \dots$ contains only members of the sequence $a_1, a_2, \dots$. The fact that it contains all of them can be seen as follows. For this, let $k \in \mathbb{N}^*$. If $k$ is even, then

$$b_{3(k/2)} = a_{2(k/2)} = a_k .$$

If $k$ is odd, then there is $l \in \mathbb{N}^*$ such that $k = 2l - 1$. If $l$ is even, then

$$b_{3(l/2)-1} = a_{4(l/2)-1} = a_{2l-1} = a_k .$$

Finally, if $l$ is odd, then

$$b_{3((l+1)/2)-2} = a_{4((l+1)/2)-3} = a_{2l-1} = a_k .$$

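Hence, $b_1, b_2, \dots$ is a reordering of $a_1, a_2, \dots$. The ninth partial sum corresponding to $a_1, a_2, \dots$ is given by

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \frac{1}{9} ,$$

whereas the ninth partial sum corresponding to $b_1, b_2, \dots$ is given by

$$1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} .$$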

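Assuming the convergence of the alternating harmonic series, which is proved in Example 3.3.19, it follows that

$$\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} < 1 - \frac{1}{2} + \frac{1}{3} = \frac{5}{6} ,$$

since the members following the third one can be grouped into pairs $-\frac{1}{2k} + \frac{1}{2k+1} < 0$, $k \geq 2$. Further, because of

$$b_{3k-2} + b_{3k-1} + b_{3k} = \frac{8k-3}{2k(4k-3)(4k-1)} > 0$$

for every $k \in \mathbb{N}^*$, it follows that

$$\sum_{k=1}^{3n} b_k \geq \frac{5}{6}$$

for every $n \in \mathbb{N}^*$, with strict inequality for $n \geq 2$, and that these partial sums are strictly increasing in $n$. Therefore, either $b_1, b_2, \dots$ is not summable (!), or

$$\sum_{k=1}^{\infty} b_k > \frac{5}{6} > \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} . \quad (!)$$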
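The comparison in Example 3.3.16 can be explored numerically. The following is a minimal Python sketch, not part of the original text; the function names `a`, `b` and `partial_sum` are chosen here for illustration only. It evaluates partial sums of both sequences. The observation that the rearranged partial sums settle near $\frac{3}{2} \ln(2)$ is a known fact that is not derived in this section and is used only as a plausibility check.

```python
import math


def a(k):
    """k-th member (k >= 1) of the alternating harmonic sequence."""
    return (-1) ** (k + 1) / k


def b(j):
    """j-th member (j >= 1) of the rearranged sequence b_1, b_2, ..."""
    k, r = divmod(j + 2, 3)  # j = 3k - 2 + r with r in {0, 1, 2}
    if r == 0:
        return a(4 * k - 3)  # b_{3k-2} = a_{4k-3}
    if r == 1:
        return a(4 * k - 1)  # b_{3k-1} = a_{4k-1}
    return a(2 * k)          # b_{3k}   = a_{2k}


def partial_sum(seq, n):
    """Sum of the first n members seq(1), ..., seq(n)."""
    return math.fsum(seq(k) for k in range(1, n + 1))


for n in (9, 300, 3000):
    print(n, partial_sum(a, n), partial_sum(b, n))
```

For large $n$, the first column of sums stays below $5/6$ while the second stays above it, in line with the inequalities of the example.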

In the following, we continue the study of series with a view to sums of alternating sequences. Any sequence $y_1, y_2, \dots$ of real numbers can be represented in the equivalent form $x_1|y_1|, x_2|y_2|, \dots$ where the sequence $x_1, x_2, \dots$ assumes values in $\{-1, 1\}$. In this sense, $y_1, y_2, \dots$ is always a product of a bounded sequence that describes sign changes and a sequence of positive ($\geq 0$) numbers. In the case that the partial sums of $x_1, x_2, \dots$ stay bounded, as is the case for alternating $y_1, y_2, \dots$, consideration of this product structure is helpful in the analysis of the convergence of the series that corresponds to $y_1, y_2, \dots$. The basis for such analysis is provided by the following summation by parts formula, which resembles the formula for partial integration.

Theorem 3.3.17. (Summation by parts) Let $x_1, x_2, \dots$ and $y_1, y_2, \dots$ be sequences of real numbers and $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$. Then

$$\sum_{k=m}^{n} x_k y_k = (S_n + c) y_{n+1} - (S_{m-1} + c) y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k)$$

for all $m, n \in \mathbb{N}^*$ such that $n \geq m$ and all $c \in \mathbb{R}$, where we define $S_0 := 0$.

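Proof. It follows for all $m, n \in \mathbb{N}^*$ such that $n \geq m$ that

$$\begin{aligned}
\sum_{k=m}^{n} x_k y_k &= \sum_{k=m}^{n} (S_k - S_{k-1}) y_k = \sum_{k=m}^{n} S_k y_k - \sum_{k=m-1}^{n-1} S_k y_{k+1} \\
&= \sum_{k=m}^{n} S_k y_k - \sum_{k=m}^{n} S_k y_{k+1} + S_n y_{n+1} - S_{m-1} y_m \\
&= S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} S_k (y_{k+1} - y_k) \\
&= S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k) + \sum_{k=m}^{n} c \, (y_{k+1} - y_k) \\
&= S_n y_{n+1} - S_{m-1} y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k) + c y_{n+1} - c y_m \\
&= (S_n + c) y_{n+1} - (S_{m-1} + c) y_m - \sum_{k=m}^{n} (S_k + c)(y_{k+1} - y_k) .
\end{aligned}$$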
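The summation by parts formula of Theorem 3.3.17 can be checked on concrete data. The sketch below is an illustration only and not part of the original text; the helper names `direct_sum` and `by_parts` and the sample sequences are assumptions made for this example. Exact rational arithmetic is used so that both sides can be compared for equality.

```python
from fractions import Fraction


def direct_sum(x, y, m, n):
    """Left-hand side: sum of x_k * y_k for k = m, ..., n."""
    return sum(x[k] * y[k] for k in range(m, n + 1))


def by_parts(x, y, m, n, c):
    """Right-hand side of the summation by parts formula, with S_0 := 0."""
    S = {0: Fraction(0)}
    for k in range(1, n + 1):
        S[k] = S[k - 1] + x[k]  # partial sums of x_1, x_2, ...
    boundary = (S[n] + c) * y[n + 1] - (S[m - 1] + c) * y[m]
    correction = sum((S[k] + c) * (y[k + 1] - y[k]) for k in range(m, n + 1))
    return boundary - correction


# Sample data: x_k = (-1)^(k+1), y_k = 1/k, indexed from 1.
x = {k: Fraction((-1) ** (k + 1)) for k in range(1, 11)}
y = {k: Fraction(1, k) for k in range(1, 12)}

print(direct_sum(x, y, 3, 10), by_parts(x, y, 3, 10, Fraction(-1, 2)))
```

Both printed values agree for any admissible choice of $m$, $n$ and $c$, which reflects that the constant $c$ cancels in the formula.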

The following Dirichlet’s test is mainly a consequence of the summation by parts formula. This test is frequently used in connection with the summation of alternating sequences, also because it provides a very simple estimate of the error resulting from the truncation of the series after finitely many terms.

Theorem 3.3.18. (Dirichlet’s test) Let $x_1, x_2, \dots$ be a sequence of real numbers such that its partial sums $S_1, S_2, \dots$ form a bounded sequence and $y_1, y_2, \dots$ be a decreasing sequence of real numbers such that $\lim_{k \to \infty} y_k = 0$. Then the sequence $x_1 y_1, x_2 y_2, \dots$ is summable,

$$\sum_{k=1}^{\infty} x_k y_k = M_1 y_1 + \sum_{k=1}^{\infty} (S_k - M_1)(y_k - y_{k+1}) \qquad (3.3.13)$$

and for every $n \in \mathbb{N}^*$

$$\left| \sum_{k=n+1}^{\infty} x_k y_k \right| \leq (M_2 - M_1) \cdot y_{n+1} \qquad (3.3.14)$$

where $M_1, M_2 \in \mathbb{R}$ are a lower bound and an upper bound, respectively, of the partial sums of $x_1, x_2, \dots$.

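Proof. For this, let $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$ and $M_1, M_2 \in \mathbb{R}$ be lower and upper bounds, respectively. Then by Theorem 3.3.17

$$\sum_{k=1}^{n} x_k y_k = (S_n - M_1) y_{n+1} + M_1 y_1 + \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1})$$

as well as

$$0 \leq \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \leq \sum_{k=1}^{n} (M_2 - M_1)(y_k - y_{k+1}) = (M_2 - M_1)(y_1 - y_{n+1}) \leq (M_2 - M_1) \, y_1$$

for all $n \in \mathbb{N}^*$. Therefore the sequence

$$\sum_{k=1}^{1} (S_k - M_1)(y_k - y_{k+1}), \ \sum_{k=1}^{2} (S_k - M_1)(y_k - y_{k+1}), \ \dots$$

is increasing as well as bounded from above and hence convergent. Therefore, since $\lim_{k \to \infty} y_k = 0$ and $S_1, S_2, \dots$ is bounded, the summability of $x_1 y_1, x_2 y_2, \dots$ and (3.3.13) follow. Finally, it follows for every $n \in \mathbb{N}^*$ that

$$\sum_{k=n+1}^{\infty} x_k y_k = -(S_n - M_1) y_{n+1} + \sum_{k=n+1}^{\infty} (S_k - M_1)(y_k - y_{k+1})$$

and hence

$$\sum_{k=n+1}^{\infty} x_k y_k \geq -(S_n - M_1) y_{n+1} \geq -(M_2 - M_1) y_{n+1} ,$$

$$\sum_{k=n+1}^{\infty} x_k y_k \leq -(S_n - M_1) y_{n+1} + \sum_{k=n+1}^{\infty} (M_2 - M_1)(y_k - y_{k+1}) = (M_2 - S_n) y_{n+1} \leq (M_2 - M_1) y_{n+1}$$

and (3.3.14).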
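The truncation estimate (3.3.14) can be tried out on the alternating harmonic series, where $x_k = (-1)^{k+1}$, $y_k = 1/k$, and the partial sums of $x_1, x_2, \dots$ lie between $M_1 = 0$ and $M_2 = 1$. The following Python sketch is an illustration only; the function names are chosen here, and the sum $\ln(2)$ is taken from the statement quoted earlier in this section.

```python
import math


def partial_sum(n):
    """n-th partial sum of the alternating harmonic series."""
    return math.fsum((-1) ** (k + 1) / k for k in range(1, n + 1))


def tail_bound(n, m1=0.0, m2=1.0):
    """Right-hand side of (3.3.14) for y_k = 1/k: (M2 - M1) * y_{n+1}."""
    return (m2 - m1) / (n + 1)


for n in (1, 7, 100, 1000):
    error = abs(math.log(2) - partial_sum(n))
    print(n, error, tail_bound(n))
```

In each printed row the actual truncation error does not exceed the bound $1/(n+1)$.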

Example 3.3.19. Let $s > 0$. Determine whether the sequence $a_1, a_2, \dots$ defined by

$$a_n := \frac{(-1)^{n-1}}{n^s}$$

for all $n \in \mathbb{N}^*$ is summable. Solution: Define

$$x_n := (-1)^{n-1} , \quad y_n := \frac{1}{n^s}$$

for all $n \in \mathbb{N}^*$. Then the partial sums $S_1, S_2, \dots$ of $x_1, x_2, \dots$ oscillate between $0$ and $1$, and $y_1, y_2, \dots$ is decreasing as well as convergent to $0$. Hence by Theorem 3.3.18 $a_1, a_2, \dots$ is summable and

$$\begin{aligned}
\sum_{k=1}^{\infty} a_k &= \sum_{k=0}^{\infty} \left( \frac{1}{(2k+1)^s} - \frac{1}{(2k+2)^s} \right) \\
&= \sum_{k=0}^{\infty} \left( \frac{1}{(2k+1)^s} + \frac{1}{(2k+2)^s} \right) - 2^{1-s} \cdot \zeta(s) = (1 - 2^{1-s}) \cdot \zeta(s)
\end{aligned}$$

if, in addition, $s > 1$. Note that the last formula can be, and is, used to define $\zeta$ on $(0, 1)$. See Fig. 91.

Fig. 91: Graph of an extended Riemann’s zeta function $\zeta$.
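The identity $\sum_{k=1}^{\infty} a_k = (1 - 2^{1-s}) \cdot \zeta(s)$ for $s > 1$ can be checked approximately by comparing partial sums of both series. The sketch below is an illustration only; the names `eta_partial` and `zeta_partial` are chosen here, and $\zeta(s)$ is approximated by a truncated sum of $1/k^s$, so agreement is only expected up to the truncation error.

```python
import math


def eta_partial(s, n):
    """n-th partial sum of the sequence a_k = (-1)^(k-1) / k^s."""
    return math.fsum((-1) ** (k - 1) / k ** s for k in range(1, n + 1))


def zeta_partial(s, n):
    """n-th partial sum of 1 / k^s, approximating zeta(s) for s > 1."""
    return math.fsum(1 / k ** s for k in range(1, n + 1))


n = 2000
for s in (2, 3):
    left = eta_partial(s, n)
    right = (1 - 2 ** (1 - s)) * zeta_partial(s, n)
    print(s, left, right)
```

For $s \leq 1$ the second sum no longer converges, which is why the formula is restricted to $s > 1$ in the example.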

In some cases where Dirichlet’s test cannot be applied, Abel’s test is of use. Abel’s test, too, is mainly a consequence of the summation by parts formula.

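Theorem 3.3.20. (Abel’s test) Let $x_1, x_2, \dots$ be a summable sequence of real numbers and $y_1, y_2, \dots$ a decreasing convergent sequence of real numbers. Then the sequence $x_1 y_1, x_2 y_2, \dots$ is summable and

$$\sum_{k=1}^{\infty} x_k y_k = M_1 y_1 + \left( \sum_{k=1}^{\infty} x_k - M_1 \right) \cdot \lim_{n \to \infty} y_n + \sum_{k=1}^{\infty} (S_k - M_1)(y_k - y_{k+1}) \qquad (3.3.15)$$

where $S_1, S_2, \dots$ denotes the sequence of partial sums of $x_1, x_2, \dots$ and $M_1 \in \mathbb{R}$ is a lower bound of these partial sums.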

Proof. For this, let $S_1, S_2, \dots$ be the sequence of partial sums of $x_1, x_2, \dots$ and $M_1, M_2 \in \mathbb{R}$ be lower and upper bounds, respectively. Further, let $M_3, M_4 \in \mathbb{R}$ be lower and upper bounds, respectively, of $y_1, y_2, \dots$. Then by Theorem 3.3.17

$$\sum_{k=1}^{n} x_k y_k = (S_n - M_1) y_{n+1} + M_1 y_1 + \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1})$$

as well as

$$0 \leq \sum_{k=1}^{n} (S_k - M_1)(y_k - y_{k+1}) \leq \sum_{k=1}^{n} (M_2 - M_1)(y_k - y_{k+1}) = (M_2 - M_1)(y_1 - y_{n+1}) \leq (M_2 - M_1)(M_4 - M_3)$$

for all $n \in \mathbb{N}^*$. Therefore, the sequence

$$\sum_{k=1}^{1} (S_k - M_1)(y_k - y_{k+1}), \ \sum_{k=1}^{2} (S_k - M_1)(y_k - y_{k+1}), \ \dots$$

is increasing as well as bounded from above and hence convergent. Finally, the summability of $x_1 y_1, x_2 y_2, \dots$ and (3.3.15) follow by the limit laws for sequences.

The following example gives an application of Abel’s test.

Example 3.3.21. Show that the sequence $a_1, a_2, \dots$ defined by

$$a_{2n-1} := \frac{n-1}{n^2} , \quad a_{2n} := -\frac{1}{n+1}$$

for every $n \in \mathbb{N}^*$ is summable.

Fig. 92: Sequences of absolute values and partial sums of the sequence from Example 3.3.21.

Solution: We note that

$$|a_{2n}| - |a_{2n-1}| = \frac{1}{n+1} - \frac{n-1}{n^2} = \frac{1}{n^2 (n+1)} > 0 ,$$

$$|a_{2n}| - |a_{2n+1}| = \frac{1}{n+1} - \frac{n}{(n+1)^2} = \frac{1}{(n+1)^2} > 0$$

for all $n \in \mathbb{N}^*$ and hence that the sequence $|a_1|, |a_2|, \dots$ is neither decreasing nor increasing. Hence Dirichlet’s test cannot be directly applied. On the other hand, Abel’s test can be applied successfully as follows. For this, we define

$$x_1 := 1 , \quad x_{2n+1} := \frac{1}{n+1} , \quad x_{2n} := -\frac{1}{n} ,$$

$$y_1 := 0 , \quad y_{2n+1} := \frac{n}{n+1} , \quad y_{2n} := \frac{n}{n+1}$$

for every $n \in \mathbb{N}^*$. Then

$$x_1 \cdot y_1 = 0 = a_1 , \quad x_{2n+1} \cdot y_{2n+1} = \frac{1}{n+1} \cdot \frac{n}{n+1} = \frac{n}{(n+1)^2} = a_{2n+1} ,$$

$$x_{2n} \cdot y_{2n} = -\frac{1}{n} \cdot \frac{n}{n+1} = -\frac{1}{n+1} = a_{2n}$$

for all $n \in \mathbb{N}^*$. The partial sums of the sequence $x_1, x_2, \dots$ are given by

$$\sum_{k=1}^{2n-1} x_k = \sum_{k=1}^{n} x_{2k-1} + \sum_{k=1}^{n-1} x_{2k} = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n-1} \frac{1}{k} = \frac{1}{n} ,$$

$$\sum_{k=1}^{2n} x_k = \sum_{k=1}^{n} x_{2k-1} + \sum_{k=1}^{n} x_{2k} = \sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{k} = 0$$

for every $n \in \mathbb{N}$ such that $n \geq 2$, and hence $x_1, x_2, \dots$ is summable. Further, $-y_1, -y_2, \dots$ is decreasing and convergent to $-1$. Hence it follows by Abel’s test that $-a_1, -a_2, \dots$ is summable. Therefore, $a_1, a_2, \dots$ is summable, too.
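The decomposition used in Example 3.3.21 can be verified term by term with exact arithmetic. The following Python sketch is an illustration only; the functions `a`, `x`, `y` and the index bookkeeping via `divmod` are chosen here and merely encode the definitions given in the example.

```python
from fractions import Fraction


def split(j):
    """Write j = 2n - 1 + r with n >= 1 and r in {0, 1}; return (n, r)."""
    return divmod(j + 1, 2)


def a(j):
    n, r = split(j)
    return Fraction(n - 1, n * n) if r == 0 else Fraction(-1, n + 1)


def x(j):
    n, r = split(j)
    return Fraction(1, n) if r == 0 else Fraction(-1, n)


def y(j):
    n, r = split(j)
    return Fraction(n - 1, n) if r == 0 else Fraction(n, n + 1)


# Partial sums of x_1, x_2, ...: expected 1/n at index 2n - 1 and 0 at index 2n.
total = Fraction(0)
for j in range(1, 11):
    total += x(j)
    print(j, a(j), x(j) * y(j), total)
```

The printed rows show $a_j = x_j \cdot y_j$ and the alternation of the partial sums of $x_1, x_2, \dots$ between $1/n$ and $0$.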

In the following, we define and study absolutely summable sequences. Any reordering of such a sequence leads to a convergent series whose sum coincides with the sum of the original series. In applications mainly absolutely summable sequences occur. Exceptions are rare. One such exception is described in [9].

Definition 3.3.22. (Absolute summability) A sequence $x_1, x_2, \dots$ of real numbers is said to be absolutely summable if the corresponding sequence $|x_1|, |x_2|, \dots$ is summable. It is called conditionally summable if it is summable, but $|x_1|, |x_2|, \dots$ is not.

Of course, the previous definition is reasonable only if any absolutely summable sequence is summable, too. The last is easy to prove.

Theorem 3.3.23. Any absolutely summable sequence of real numbers is summable.

Proof. For this, let $x_1, x_2, \dots$ be some absolutely summable sequence of real numbers. Then $x_1 + |x_1|, x_2 + |x_2|, \dots$ is a sequence of positive ($\geq 0$) real numbers and

$$\sum_{k=1}^{n} (x_k + |x_k|) \leq 2 \sum_{k=1}^{n} |x_k| \leq 2 \sum_{k=1}^{\infty} |x_k|$$

for all $n \in \mathbb{N}^*$. Hence the sequence of partial sums corresponding to the sequence $x_1 + |x_1|, x_2 + |x_2|, \dots$ is increasing as well as bounded from above and hence convergent. Therefore, $x_1 + |x_1|, x_2 + |x_2|, \dots$ is summable. Hence it follows by the limit laws that $x_1, x_2, \dots$ is summable, too.

Remark 3.3.24. Note that the previous definition and theorem reduce the decision whether a given sequence is absolutely summable (and therefore also summable) to the decision whether a corresponding sequence of positive real numbers is summable. Usually, the last decision is relatively easy, and we have already developed a number of tools for this. For this reason, the second step in the analysis is often the inspection whether the sequence is absolutely summable. Usually, the first step inspects whether the summability of the sequence can be concluded with the help of the limit laws from the already known summability of certain sequences, or whether there are obvious reasons why the sequence is not summable. If this fails, absolute summability is investigated. If this also fails, the applicability of Dirichlet’s test or Abel’s test is investigated next.

Example 3.3.25. In Example 3.3.2, we have seen that the geometric series, defined by

$$S_n := \sum_{k=0}^{n} x^k$$

for every $n \in \mathbb{N}$, is convergent if and only if $|x| < 1$, where $x^0 := 1$. In the last case, this also implies that the geometric series defined by

$$S_n := \sum_{k=0}^{n} |x^k| = \sum_{k=0}^{n} |x|^k ,$$

where $|x|^0 := 1$, is convergent and hence that $1, x, x^2, \dots$ is absolutely summable.

Example 3.3.26. Determine whether the sequence

$$\frac{\sin(1)}{1^2} , \ \frac{\sin(2)}{2^2} , \ \frac{\sin(3)}{3^2} , \ \dots \qquad (3.3.16)$$

is absolutely summable. Solution: For every $k \in \mathbb{N}^*$, it follows that

$$\left| \frac{\sin(k)}{k^2} \right| \leq \frac{1}{k^2} .$$

Hence it follows by Example 3.3.7 and Theorem 3.3.9 that the sequence (3.3.16) is absolutely summable.

Example 3.3.27. The examples of the harmonic series, Example 3.3.3, and the alternating harmonic series, i.e., the case $s = 1$ in Example 3.3.19, show that not every summable sequence is absolutely summable.

The following characterization of summability is sometimes useful in the analysis of sequences and will be used later on. It is a simple consequence of the definition of summability of a sequence and the completeness of the real number system in the form of Theorem 2.3.17.

Theorem 3.3.28. (Cauchy’s characterization of summable sequences) A sequence $x_1, x_2, \dots$ of real numbers is summable if and only if the corresponding sequence of partial sums is a Cauchy sequence, i.e., if and only if for every $\varepsilon > 0$, there is some $N \in \mathbb{N}^*$ such that

$$\left| \sum_{k=m}^{n} x_k \right| \leq \varepsilon$$

for all $m, n \in \mathbb{N}^*$ satisfying $n \geq m \geq N$.

Proof. First, if $x_1, x_2, \dots$ is a sequence of real numbers whose corresponding sequence of partial sums is a Cauchy sequence, then it follows from Theorem 2.3.17 that the last sequence is convergent and hence that $x_1, x_2, \dots$ is summable. If $x_1, x_2, \dots$ is a summable sequence of real numbers, then the corresponding sequence of partial sums is convergent and hence also a Cauchy sequence according to Theorem 2.3.17. The last can also be proved directly as follows. For this, let $\varepsilon > 0$. Since $x_1, x_2, \dots$ is summable, there is $N \in \mathbb{N}^*$ such that

$$\left| \sum_{k=1}^{m} x_k - \sum_{k=1}^{\infty} x_k \right| \leq \frac{\varepsilon}{2}$$

for all $m \in \mathbb{N}^*$ satisfying $m \geq N$. Hence it follows for all $m, n \in \mathbb{N}^*$ such that $n \geq m \geq N + 1$ that

$$\left| \sum_{k=m}^{n} x_k \right| = \left| \sum_{k=1}^{n} x_k - \sum_{k=1}^{m-1} x_k \right| \leq \left| \sum_{k=1}^{n} x_k - \sum_{k=1}^{\infty} x_k \right| + \left| \sum_{k=1}^{\infty} x_k - \sum_{k=1}^{m-1} x_k \right| \leq \varepsilon .$$
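Cauchy’s characterization can be illustrated by looking at block sums $\sum_{k=m}^{n} x_k$ far out in a sequence. The sketch below is an illustration only; `block_sum`, `harmonic` and `inverse_square` are names chosen here. It uses two standard facts that are not derived at this point: the blocks $\sum_{k=N+1}^{2N} 1/k$ of the harmonic series never drop below $1/2$, whereas the corresponding blocks of $1/k^2$ are at most $1/N$.

```python
from fractions import Fraction


def block_sum(x, m, n):
    """Sum of x(k) for k = m, ..., n, as in Cauchy's characterization."""
    return sum(x(k) for k in range(m, n + 1))


def harmonic(k):
    return Fraction(1, k)


def inverse_square(k):
    return Fraction(1, k * k)


for N in (1, 10, 100):
    print(N,
          float(block_sum(harmonic, N + 1, 2 * N)),
          float(block_sum(inverse_square, N + 1, 2 * N)))
```

The first column of block sums does not become small, so the harmonic series fails the criterion, while the second column tends to $0$.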

The following corollary is often used to show that a given sequence is not summable.

Corollary 3.3.29. Let $x_1, x_2, \dots$ be a summable sequence of real numbers. Then

$$\lim_{n \to \infty} x_n = 0 .$$

Example 3.3.30. We consider the sequence $x_1, x_2, \dots$ defined by

$$x_n := (-1)^n \cdot \frac{n+1}{n}$$

for every $n \in \mathbb{N}^*$. If $x_1, x_2, \dots$ were convergent to $0$, then each of its subsequences would also converge to $0$. On the other hand,

$$\lim_{n \to \infty} x_{2n} = \lim_{n \to \infty} \frac{2n+1}{2n} = 1 .$$

Hence $x_1, x_2, \dots$ is not convergent to $0$ and therefore also not summable.

In the following, we give the two most important tests, the ratio test and the root test, for the decision whether a given sequence is absolutely summable or not. Both tests compare, by application of Theorem 3.3.9, the corresponding series to geometric series. Usually, the structure of the members of the sequence decides which of the tests is applied. The ratio test uses for this the ratio of the absolute values of subsequent members and the root test the $n$-th root of the absolute value of the $n$-th member. Since the structure of the last is often more complicated than that of the ratio, the ratio test is more frequently applied.

Theorem 3.3.31. (Ratio test) Let $x_1, x_2, \dots$ be a sequence of real numbers.

(i) If there are $q \in (0, 1)$ and $N \in \mathbb{N}^*$ such that

$$\left| \frac{x_{n+1}}{x_n} \right| \leq q$$

for all $n \in \mathbb{N}^*$ such that $n \geq N$, then $x_1, x_2, \dots$ is absolutely summable. Note that this can only be the case if only finitely many of the members of $x_1, x_2, \dots$ are zero.

(ii) If there is $N \in \mathbb{N}^*$ such that

$$\left| \frac{x_{n+1}}{x_n} \right| \geq 1$$

for all $n \in \mathbb{N}^*$ such that $n \geq N$, then $x_1, x_2, \dots$ is not summable. This, too, can only be the case if only finitely many of the members of $x_1, x_2, \dots$ are zero.

Proof. ‘(i)’: For this, let $q \in (0, 1)$ and $N \in \mathbb{N}^*$ be such that $|x_{n+1}| \leq q |x_n|$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Then it follows by induction that

$$|x_n| \leq |x_N| \cdot q^{n-N}$$

for all $n \in \mathbb{N}^*$ such that $n \geq N$. Hence the absolute summability of $x_1, x_2, \dots$ follows by Example 3.3.2 and Theorem 3.3.9. ‘(ii)’: For this, let $N \in \mathbb{N}^*$ be such that $|x_{n+1}| / |x_n| \geq 1$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Then it follows by induction that

$$|x_n| \geq |x_N|$$

for all $n \in \mathbb{N}^*$ satisfying $n \geq N$ and hence, since $x_N \neq 0$, that $x_1, x_2, \dots$ is not converging to $0$. Hence it follows by Corollary 3.3.29 that $x_1, x_2, \dots$ is not summable.

Example 3.3.32. Find all real values $x$ for which the sequence

$$\frac{x^0}{0!} , \ \frac{x^1}{1!} , \ \frac{x^2}{2!} , \ \dots$$

is summable. Solution: For $x = 0$, the corresponding sequence is obviously absolutely summable. For $x \in \mathbb{R}^*$, it follows that

$$\lim_{n \to \infty} \left| \frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n} \right| = \lim_{n \to \infty} \frac{|x|}{n+1} = 0$$

and hence by Theorem 3.3.31 the absolute summability of the corresponding sequence.
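The ratios appearing in Example 3.3.32 can be tabulated directly. The following Python sketch is an illustration only; the names `term`, `ratio` and `partial_sum` are chosen here, and $x \neq 0$ is assumed when forming ratios. The comparison of the partial sums with `math.exp(x)` relies on the well-known value of this series, which is not established in this section.

```python
import math


def term(x, n):
    """n-th member x^n / n! of the sequence (n >= 0)."""
    return x ** n / math.factorial(n)


def ratio(x, n):
    """Absolute ratio of subsequent members; equals |x| / (n + 1) for x != 0."""
    return abs(term(x, n + 1) / term(x, n))


def partial_sum(x, n):
    """Sum of the members with indices 0, ..., n."""
    return math.fsum(term(x, k) for k in range(n + 1))


x = 5.0
for n in (1, 4, 9, 19, 49):
    print(n, ratio(x, n), partial_sum(x, n))
```

Once $n + 1 > |x|$, the ratios fall below $1$ and keep decreasing, so part (i) of the ratio test applies with any $q$ between the current ratio and $1$.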

Theorem 3.3.33. (Root test) Let $x_1, x_2, \dots$ be a sequence of real numbers.

(i) If there are $q \in [0, 1)$ and $N \in \mathbb{N}^*$ such that

$$|x_n|^{1/n} \leq q$$

for all $n \in \mathbb{N}^*$ satisfying $n \geq N$, then $x_1, x_2, \dots$ is absolutely summable.

(ii) If there is $N \in \mathbb{N}^*$ such that

$$|x_n|^{1/n} \geq 1$$

for all $n \in \mathbb{N}^*$ satisfying $n \geq N$, then $x_1, x_2, \dots$ is not summable.

Proof. ‘(i)’: For this, let $q \in [0, 1)$ and $N \in \mathbb{N}^*$ be such that $|x_n|^{1/n} \leq q$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Then it follows that

$$|x_n| \leq q^n$$

for all $n \in \mathbb{N}^*$ satisfying $n \geq N$ and hence by Example 3.3.2 and Theorem 3.3.9 the absolute summability of $x_1, x_2, \dots$. ‘(ii)’: For this, let $N \in \mathbb{N}^*$ be such that $|x_n|^{1/n} \geq 1$ for all $n \in \mathbb{N}^*$ satisfying $n \geq N$. Then it follows that

$$|x_n| \geq 1$$

for all $n \in \mathbb{N}^*$ such that $n \geq N$ and hence that $x_1, x_2, \dots$ is not converging to $0$. Hence it follows by Corollary 3.3.29 that $x_1, x_2, \dots$ is not summable.

Example 3.3.34. Determine whether the sequence

$$(-1)^2 \cdot \frac{1}{(\ln 2)^2} , \ (-1)^3 \cdot \frac{1}{(\ln 3)^3} , \ (-1)^4 \cdot \frac{1}{(\ln 4)^4} , \ \dots$$

is summable. Solution: For $n \in \mathbb{N} \setminus \{0, 1\}$, it follows that

$$\lim_{n \to \infty} \left| (-1)^n \cdot \frac{1}{(\ln n)^n} \right|^{1/n} = \lim_{n \to \infty} \frac{1}{\ln n} = 0$$

and hence by Theorem 3.3.33 the absolute summability of the sequence.

Example 3.3.35. Note that in the case of the sequence $a_1, a_2, \dots$ defined by

$$a_n := \frac{1}{n^s}$$

for all $n \in \mathbb{N}^*$, where $s > 0$, neither the ratio test nor the root test can be applied, since

$$\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n \to \infty} \left( \frac{n}{n+1} \right)^s = 1^s = 1 ,$$

$$\lim_{n \to \infty} n^{-s/n} = \lim_{n \to \infty} e^{-s \ln(n)/n} = e^0 = 1 .$$
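The two limits in Example 3.3.35 explain why both tests are inconclusive: the quantities stay strictly below $1$, so part (ii) of either test does not apply, yet they exceed every fixed $q < 1$ eventually, so part (i) does not apply either. A small Python sketch, with function names chosen here for illustration, makes this visible.

```python
def ratio(s, n):
    """|a_{n+1} / a_n| for a_n = 1 / n^s, i.e. (n / (n + 1))^s."""
    return (n / (n + 1)) ** s


def root(s, n):
    """|a_n|^(1/n) for a_n = 1 / n^s, i.e. n^(-s/n)."""
    return n ** (-s / n)


for s in (0.5, 1, 2):
    for n in (10, 1000, 10 ** 6):
        print(s, n, ratio(s, n), root(s, n))
```

The behaviour is the same for $s = 1$, where the sequence is not summable, and for $s = 2$, where it is, so neither quantity can distinguish the two cases.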

Finally, by application of Cauchy’s characterization of summable sequences, Theorem 3.3.28, we prove that every reordering of an absolutely summable sequence leads to a convergent series whose sum coincides with the sum of the original series.

Theorem 3.3.36. (Rearrangements of absolutely convergent series) Let $x_1, x_2, \dots$ be an absolutely summable sequence of real numbers. Further, let $f : \mathbb{N}^* \to \mathbb{N}^*$ be bijective. Then the sequence $x_{f(1)}, x_{f(2)}, \dots$ is also absolutely summable and

$$\sum_{k=1}^{\infty} x_k = \sum_{k=1}^{\infty} x_{f(k)} . \qquad (3.3.17)$$

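Proof. First, it follows that the sequence of partial sums of $|x_{f(1)}|, |x_{f(2)}|, \dots$ is increasing with upper bound $\sum_{k=1}^{\infty} |x_k|$ and hence convergent. Hence $x_{f(1)}, x_{f(2)}, \dots$ is absolutely summable. Further, let $\varepsilon > 0$. By Theorem 3.3.28, there is $N \in \mathbb{N}^*$ such that for all $n, m \in \mathbb{N}^*$ satisfying $n \geq m \geq N$, it follows that

$$\sum_{k=m}^{n} |x_k| \leq \varepsilon .$$

Since $f$ is bijective, there is $N_f \in \mathbb{N}^*$ such that

$$\{1, \dots, N\} \subset \{f(1), \dots, f(N_f)\} .$$

For every $n \in \mathbb{N}^*$ satisfying $n \geq \max\{N, N_f\}$, the members $x_1, \dots, x_N$ cancel in the difference of the two partial sums, which is therefore a sum of finitely many members $\pm x_k$ with pairwise different indices $k > N$. Hence it follows for every such $n$:

$$\left| \sum_{k=1}^{n} x_{f(k)} - \sum_{k=1}^{n} x_k \right| \leq \varepsilon .$$

Hence (3.3.17) follows as well.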
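Theorem 3.3.36 can be contrasted numerically with Example 3.3.16 by applying the same reordering to an absolutely summable sequence. The sketch below is an illustration only: the sequence $x_k = (-1)^{k+1}/k^2$ and the names `x`, `f` and `partial_sum` are choices made here, and `f` encodes the reordering $f(3k-2) = 4k-3$, $f(3k-1) = 4k-1$, $f(3k) = 2k$ of that example.

```python
import math


def x(k):
    """Absolutely summable sample sequence x_k = (-1)^(k+1) / k^2."""
    return (-1) ** (k + 1) / k ** 2


def f(j):
    """Reordering of the indices as in Example 3.3.16."""
    k, r = divmod(j + 2, 3)  # j = 3k - 2 + r with r in {0, 1, 2}
    if r == 0:
        return 4 * k - 3
    if r == 1:
        return 4 * k - 1
    return 2 * k


def partial_sum(n, index=lambda k: k):
    """Sum of x(index(1)), ..., x(index(n))."""
    return math.fsum(x(index(k)) for k in range(1, n + 1))


for n in (30, 300, 3000):
    print(n, partial_sum(n), partial_sum(n, f))
```

In contrast to the alternating harmonic series, the two columns approach each other as $n$ grows.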

Problems

1) Express the periodic decimal expansion as a fraction.

a) 0.9 , b) 0.3 , c) 0.377 .

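2) Calculate

a) $\sum_{n=1}^{\infty} \frac{3 (1 + (-1)^n)}{2^n}$ , b) $\sum_{n=1}^{\infty} \frac{1}{4} \left( \frac{1}{n} - \frac{1}{n+4} \right)$ ,

c) $\sum_{n=1}^{\infty} \frac{1}{n(n+3)}$ , d) $\sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+3)}$ .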

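3) Determine whether the sequence $a_1, a_2, \dots$ is absolutely summable, conditionally summable or not summable.

a) $a_k = \frac{1}{k} - \left( \frac{4}{5} \right)^k$ , b) $a_k = 5^{1/k}$ ,

a) $a_k = \frac{1}{(\ln(k+1))^k}$ , b) $a_k = \frac{(-1)^k}{3^{1/k}}$ ,

c) $a_k = \frac{\arctan(k)}{k^{4/3} + 1}$ , d) $a_k = \frac{2^{2k-1}}{(2k-1)!}$ ,

e) $a_k = \frac{(-3)^k (1 + k^2)}{k!}$ , f) $a_k = \frac{\sqrt{k+3} - \sqrt{k}}{k}$ ,

g) $a_k = (-1)^k \cdot \frac{e^{-1/k}}{k^2 + 2}$ , h) $a_k = (-1)^k \cdot \frac{k}{k^2 + 3}$ ,

i) $a_k = (-1)^k \cdot \frac{k^2 - 3}{k^2 + k + 2}$ , j) $a_k = \frac{3 k^3}{e^{k^2}}$ ,

k) $a_k = \frac{(2k)!}{(3k)!}$ , l) $a_k = \frac{k^k}{k!}$ ,

k) $a_k = \left( 1 + \frac{1}{k} \right)^k$ , l) $a_k = \frac{[\ln(k)]^3}{k^2}$

where $k \in \mathbb{N}^*$.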

4) Determine the values $q \geq 1$ for which the corresponding sequence $a_3, a_4, \dots$

$$a_k := \frac{1}{k \ln(k) \, [\ln(\ln(k))]^q} ,$$

$k \in \{3, 4, \dots\}$, is summable. Give reasons for your answer.

5) Define $a_{4k} := a_{4k+1} := 1$, $a_{4k+2} := a_{4k+3} := -1$ for every $k \in \mathbb{N}$. Determine whether the sequence

ak

3?k ` 7

,

$k \in \mathbb{N}$, is absolutely summable, conditionally summable or not summable. Give reasons for your answer.

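6) Estimate the error if the sum of the first $N$ terms is used as an approximation of the series.

a) $\sum_{n=1}^{\infty} \frac{1}{n^2}$ , $N = 3$ , b) $\sum_{n=2}^{\infty} \frac{1}{n \, [\ln(n)]^2}$ , $N = 9$ ,

a) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$ , $N = 7$ , b) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2}$ , $N = 14$ .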

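7) Calculate the sum correct to 3 decimal places.

a) $\sum_{n=1}^{\infty} \frac{1}{n^4}$ , b) $\sum_{n=2}^{\infty} \frac{1}{n^5 \ln(n)}$ ,

a) $\sum_{n=1}^{\infty} (-1)^n \frac{1}{(2n)!}$ , b) $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{n!}{n^{2n}}$ .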

8) A rubber ball falls from an initial height of 3 m. Whenever it hits the ground, it bounces up to 3/4 of the previous height. What total distance is covered by the ball before it comes to rest?

9) If $a_1, a_2, \dots$ is a sequence of real numbers such that

$$\lim_{n \to \infty} a_n = 0 ,$$

does this imply the summability of the sequence? Give reasons for your answer.

10) Give an example for a convergent sequence of real numbers $a_1, a_2, \dots$ and a divergent sequence of real numbers $b_1, b_2, \dots$ satisfying

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = 1 , \quad \lim_{n \to \infty} \frac{b_{n+1}}{b_n} = 1 .$$

11) Give an example for a convergent sequence of real numbers $a_1, a_2, \dots$ and a divergent sequence of real numbers $b_1, b_2, \dots$ satisfying

$$\lim_{n \to \infty} (a_n)^{1/n} = 1 , \quad \lim_{n \to \infty} (b_n)^{1/n} = 1 .$$

12) Assume that $a_1, a_2, \dots$ is a summable sequence of positive real numbers. Show that the sequence $a_1^2, a_2^2, \dots$ is also summable. Is the last also generally true if members of $a_1, a_2, \dots$ can be negative?
