Taylor's theorem

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
(Redirected from Lagrange error bound) Approximation of a function by a truncated power series
The exponential function $y=e^{x}$ (red) and the corresponding Taylor polynomial of degree four (dashed green) around the origin.

In calculus, Taylor's theorem gives an approximation of a $k$-times differentiable function around a given point by a polynomial of degree $k$, called the $k$-th-order Taylor polynomial. For a smooth function, the Taylor polynomial is the truncation at the order $k$ of the Taylor series of the function. The first-order Taylor polynomial is the linear approximation of the function, and the second-order Taylor polynomial is often referred to as the quadratic approximation. There are several versions of Taylor's theorem, some giving explicit estimates of the approximation error of the function by its Taylor polynomial.

Taylor's theorem is named after the mathematician Brook Taylor, who stated a version of it in 1715, although an earlier version of the result was already mentioned in 1671 by James Gregory.

Taylor's theorem is taught in introductory-level calculus courses and is one of the central elementary tools in mathematical analysis. It gives simple arithmetic formulas to accurately compute values of many transcendental functions such as the exponential function and trigonometric functions. It is the starting point of the study of analytic functions, and is fundamental in various areas of mathematics, as well as in numerical analysis and mathematical physics. Taylor's theorem also generalizes to multivariate and vector valued functions. It provided the mathematical basis for some landmark early computing machines: Charles Babbage's Difference Engine calculated sines, cosines, logarithms, and other transcendental functions by numerically integrating the first 7 terms of their Taylor series.

Motivation

Graph of $f(x)=e^{x}$ (blue) with its linear approximation $P_{1}(x)=1+x$ (red) at $a=0$.

If a real-valued function $f(x)$ is differentiable at the point $x=a$, then it has a linear approximation near this point. This means that there exists a function $h_{1}(x)$ such that

$$f(x)=f(a)+f'(a)(x-a)+h_{1}(x)(x-a),\quad \lim_{x\to a}h_{1}(x)=0.$$

Here

$$P_{1}(x)=f(a)+f'(a)(x-a)$$

is the linear approximation of $f(x)$ for x near the point a, whose graph $y=P_{1}(x)$ is the tangent line to the graph $y=f(x)$ at x = a. The error in the approximation is:

$$R_{1}(x)=f(x)-P_{1}(x)=h_{1}(x)(x-a).$$

As x tends to a, this error goes to zero much faster than $f'(a)(x-a)$, making $f(x)\approx P_{1}(x)$ a useful approximation.

Graph of $f(x)=e^{x}$ (blue) with its quadratic approximation $P_{2}(x)=1+x+\dfrac{x^{2}}{2}$ (red) at $a=0$. Note the improvement in the approximation.

For a better approximation to $f(x)$, we can fit a quadratic polynomial instead of a linear function:

$$P_{2}(x)=f(a)+f'(a)(x-a)+\frac{f''(a)}{2}(x-a)^{2}.$$

Instead of just matching one derivative of $f(x)$ at $x=a$, this polynomial has the same first and second derivatives, as is evident upon differentiation.

Taylor's theorem ensures that the quadratic approximation is, in a sufficiently small neighborhood of $x=a$, more accurate than the linear approximation. Specifically,

$$f(x)=P_{2}(x)+h_{2}(x)(x-a)^{2},\quad \lim_{x\to a}h_{2}(x)=0.$$

Here the error in the approximation is

$$R_{2}(x)=f(x)-P_{2}(x)=h_{2}(x)(x-a)^{2},$$

which, given the limiting behavior of $h_{2}$, goes to zero faster than $(x-a)^{2}$ as x tends to a.
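
The two limits in this section can be checked numerically. The following is a minimal sketch, assuming $f(x)=e^{x}$ and $a=0$ as in the figures; the helper names `p1`, `p2`, `h1`, and `h2` are illustrative and not part of the article. It evaluates $h_{1}(x)=R_{1}(x)/(x-a)$ and $h_{2}(x)=R_{2}(x)/(x-a)^{2}$ at points approaching $a$.

```python
import math

def p1(x):
    """Linear Taylor polynomial of exp at a = 0."""
    return 1.0 + x

def p2(x):
    """Quadratic Taylor polynomial of exp at a = 0."""
    return 1.0 + x + x * x / 2.0

def h1(x):
    """h_1(x) = R_1(x) / (x - a) with a = 0 (x must be nonzero)."""
    return (math.exp(x) - p1(x)) / x

def h2(x):
    """h_2(x) = R_2(x) / (x - a)**2 with a = 0 (x must be nonzero)."""
    return (math.exp(x) - p2(x)) / (x * x)

# Both quotients should shrink as x approaches the base point a = 0.
for x in (0.5, 0.1, 0.01):
    print(f"x={x:<5} h1={h1(x):.6f} h2={h2(x):.6f}")
```

Both columns shrink as x approaches 0, and for small x the second column is smaller than the first, which is the sense in which the quadratic fit improves on the linear one near the base point.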

Approximation of $f(x)=\dfrac{1}{1+x^{2}}$ (blue) by its Taylor polynomials $P_{k}$ of order $k=1,\ldots,16$ centered at $x=0$ (red) and $x=1$ (green). The approximations do not improve at all outside $(-1,1)$ and $(1-\sqrt{2},1+\sqrt{2})$, respectively.

Similarly, we might get still better approximations to f if we use polynomials of higher degree, since then we can match even more derivatives with f at the selected base point.

In general, the error in approximating a function by a polynomial of degree k will go to zero much faster than $(x-a)^{k}$ as x tends to a. However, there are functions, even infinitely differentiable ones, for which increasing the degree of the approximating polynomial does not increase the accuracy of approximation. We say such a function fails to be analytic at x = a: it is not (locally) determined by its derivatives at this point.

Taylor's theorem is of asymptotic nature: it only tells us that the error $R_{k}$ in an approximation by a $k$-th order Taylor polynomial $P_{k}$ tends to zero faster than any nonzero $k$-th degree polynomial as $x\to a$. It does not tell us how large the error is in any concrete neighborhood of the center of expansion, but for this purpose there are explicit formulas for the remainder term (given below) which are valid under some additional regularity assumptions on f. These enhanced versions of Taylor's theorem typically lead to uniform estimates for the approximation error in a small neighborhood of the center of expansion, but the estimates do not necessarily hold for neighborhoods which are too large, even if the function f is analytic. In that situation one may have to select several Taylor polynomials with different centers of expansion to have reliable Taylor approximations of the original function (see animation on the right).

There are several ways we might use the remainder term:

  1. Estimate the error for a polynomial $P_{k}(x)$ of degree k estimating $f(x)$ on a given interval (a − r, a + r). (Given the interval and degree, we find the error.)
  2. Find the smallest degree k for which the polynomial $P_{k}(x)$ approximates $f(x)$ to within a given error tolerance on a given interval (a − r, a + r). (Given the interval and error tolerance, we find the degree.)
  3. Find the largest interval (a − r, a + r) on which $P_{k}(x)$ approximates $f(x)$ to within a given error tolerance. (Given the degree and error tolerance, we find the interval.)
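
The three uses listed above can be sketched in code. This is only an illustration under an assumption: it takes as given a constant M with $|f^{(k+1)}|\leq M$ on the interval for every k considered, and uses the uniform estimate $M r^{k+1}/(k+1)!$ that the article derives in the section on estimates for the remainder. The function names are hypothetical. Because the estimate is only an upper bound, the degree and radius it returns are sufficient, not necessarily optimal.

```python
import math

def error_bound(M, r, k):
    """Use 1: bound on |R_k(x)| for |x - a| < r, given |f^(k+1)| <= M there."""
    return M * r ** (k + 1) / math.factorial(k + 1)

def smallest_degree(M, r, tol, k_max=200):
    """Use 2: first k whose bound is within tol (assumes the same M works for all k)."""
    for k in range(k_max + 1):
        if error_bound(M, r, k) <= tol:
            return k
    raise ValueError("no degree up to k_max meets the tolerance")

def largest_radius(M, k, tol):
    """Use 3: radius r at which the bound M * r**(k+1) / (k+1)! equals tol."""
    return (tol * math.factorial(k + 1) / M) ** (1.0 / (k + 1))

# Example with f = sin, where every derivative is bounded by M = 1.
print(error_bound(1.0, 1.0, 3))        # bound for degree 3 on (-1, 1)
print(smallest_degree(1.0, 1.0, 1e-3)) # degree sufficient for tolerance 1e-3
print(largest_radius(1.0, 3, 1e-3))    # radius sufficient for degree 3
```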

Taylor's theorem in one real variable

Statement of the theorem

The precise statement of the most basic version of Taylor's theorem is as follows:

Taylor's theorem — Let k ≥ 1 be an integer and let the function $f:\mathbb{R}\to\mathbb{R}$ be k times differentiable at the point $a\in\mathbb{R}$. Then there exists a function $h_{k}:\mathbb{R}\to\mathbb{R}$ such that

$$f(x)=\sum_{i=0}^{k}\frac{f^{(i)}(a)}{i!}(x-a)^{i}+h_{k}(x)(x-a)^{k},$$

and $\lim_{x\to a}h_{k}(x)=0$. This is called the Peano form of the remainder.

The polynomial appearing in Taylor's theorem is the $k$-th order Taylor polynomial

$$P_{k}(x)=f(a)+f'(a)(x-a)+\frac{f''(a)}{2!}(x-a)^{2}+\cdots+\frac{f^{(k)}(a)}{k!}(x-a)^{k}$$

of the function f at the point a. The Taylor polynomial is the unique "asymptotic best fit" polynomial in the sense that if there exists a function $h_{k}:\mathbb{R}\to\mathbb{R}$ and a $k$-th order polynomial p such that

$$f(x)=p(x)+h_{k}(x)(x-a)^{k},\quad \lim_{x\to a}h_{k}(x)=0,$$

then $p=P_{k}$. Taylor's theorem describes the asymptotic behavior of the remainder term

$$R_{k}(x)=f(x)-P_{k}(x),$$

which is the approximation error when approximating f with its Taylor polynomial. Using the little-o notation, the statement in Taylor's theorem reads as

$$R_{k}(x)=o(|x-a|^{k}),\quad x\to a.$$
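
As an illustration of the little-o statement, the sketch below evaluates a Taylor polynomial from a list of derivative values and prints $R_{4}(x)/|x-a|^{4}$ for $f=\cos$ at $a=0$. The choice of function, the helper `taylor_poly`, and the sample points are assumptions made for the example, not part of the theorem.

```python
import math

def taylor_poly(derivs, a, x):
    """Evaluate sum_{i=0}^{k} derivs[i] / i! * (x - a)**i, where derivs[i] = f^(i)(a)."""
    total = 0.0
    term = 1.0  # holds (x - a)**i / i!
    for i, d in enumerate(derivs):
        if i > 0:
            term *= (x - a) / i
        total += d * term
    return total

# Derivatives of cos at a = 0 up to order 4: cos, -sin, -cos, sin, cos.
COS_DERIVS_AT_0 = [1.0, 0.0, -1.0, 0.0, 1.0]

def peano_ratio(x):
    """R_4(x) / |x|**4 for f = cos and a = 0; expected to tend to 0 as x -> 0."""
    return (math.cos(x) - taylor_poly(COS_DERIVS_AT_0, 0.0, x)) / abs(x) ** 4

for x in (0.5, 0.1, 0.02):
    print(f"x={x:<5} R_4(x)/|x|^4 = {peano_ratio(x):.3e}")
```

The printed ratios decrease in magnitude as x approaches 0, consistent with $R_{4}(x)=o(|x|^{4})$.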

Explicit formulas for the remainder

Under stronger regularity assumptions on f there are several precise formulas for the remainder term $R_{k}$ of the Taylor polynomial, the most common ones being the following.

Mean-value forms of the remainder — Let $f:\mathbb{R}\to\mathbb{R}$ be k + 1 times differentiable on the open interval with $f^{(k)}$ continuous on the closed interval between $a$ and $x$. Then

$$R_{k}(x)=\frac{f^{(k+1)}(\xi_{L})}{(k+1)!}(x-a)^{k+1}$$

for some real number $\xi_{L}$ between $a$ and $x$. This is the Lagrange form of the remainder.

Similarly,

$$R_{k}(x)=\frac{f^{(k+1)}(\xi_{C})}{k!}(x-\xi_{C})^{k}(x-a)$$

for some real number $\xi_{C}$ between $a$ and $x$. This is the Cauchy form of the remainder.

Both can be thought of as specific cases of the following result: for any $p>0$,

$$R_{k}(x)=\frac{f^{(k+1)}(\xi_{S})}{k!}(x-\xi_{S})^{k+1-p}\frac{(x-a)^{p}}{p}$$

for some real number $\xi_{S}$ between $a$ and $x$. This is the Schlömilch form of the remainder (sometimes called the Schlömilch-Roche form). The choice $p=k+1$ is the Lagrange form, whilst the choice $p=1$ is the Cauchy form.

These refinements of Taylor's theorem are usually proved using the mean value theorem, whence the name. Additionally, notice that this is precisely the mean value theorem when $k=0$. Other similar expressions can also be found. For example, if G(t) is continuous on the closed interval and differentiable with a non-vanishing derivative on the open interval between $a$ and $x$, then

$$R_{k}(x)=\frac{f^{(k+1)}(\xi)}{k!}(x-\xi)^{k}\frac{G(x)-G(a)}{G'(\xi)}$$

for some number $\xi$ between $a$ and $x$. This version covers the Lagrange and Cauchy forms of the remainder as special cases, and is proved below using Cauchy's mean value theorem. The Lagrange form is obtained by taking $G(t)=(x-t)^{k+1}$ and the Cauchy form is obtained by taking $G(t)=t-a$.
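
The mean-value forms only assert that a suitable intermediate point exists. For a concrete function it can be located numerically. The sketch below assumes $f(x)=e^{x}$, $a=0$, and the sample values x = 1, k = 3; the function names are illustrative. For the Lagrange form the point is available in closed form, and for the Cauchy form the sketch uses bisection, which is a choice made here and not the article's proof technique.

```python
import math

def remainder_exp(x, k):
    """R_k(x) = e^x - P_k(x) for the expansion of exp at a = 0."""
    p_k = sum(x ** j / math.factorial(j) for j in range(k + 1))
    return math.exp(x) - p_k

def lagrange_xi(x, k):
    """Solve e^xi * x**(k+1) / (k+1)! = R_k(x) for xi in closed form."""
    return math.log(remainder_exp(x, k) * math.factorial(k + 1) / x ** (k + 1))

def cauchy_xi(x, k, iterations=200):
    """Solve e^xi * (x - xi)**k * x / k! = R_k(x) by bisection on (0, x), for x > 0."""
    target = remainder_exp(x, k)

    def g(xi):
        return math.exp(xi) * (x - xi) ** k * x / math.factorial(k) - target

    lo, hi = 0.0, x
    if not (g(lo) > 0.0 > g(hi)):
        raise ValueError("no sign change on (0, x); bisection bracket is not valid")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(lagrange_xi(1.0, 3), cauchy_xi(1.0, 3))  # both values lie between 0 and 1
```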

The statement for the integral form of the remainder is more advanced than the previous ones, and requires understanding of Lebesgue integration theory for the full generality. However, it holds also in the sense of Riemann integral provided the (k + 1)th derivative of f is continuous on the closed interval $[a,x]$.

Integral form of the remainder — Let $f^{(k)}$ be absolutely continuous on the closed interval between $a$ and $x$. Then

$$R_{k}(x)=\int_{a}^{x}\frac{f^{(k+1)}(t)}{k!}(x-t)^{k}\,dt.$$

Due to the absolute continuity of $f^{(k)}$ on the closed interval between $a$ and $x$, its derivative $f^{(k+1)}$ exists as an $L^{1}$-function, and the result can be proven by a formal calculation using the fundamental theorem of calculus and integration by parts.
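
A numerical check of the integral form is sketched below, assuming $f(x)=e^{x}$ and $a=0$ so that $f^{(k+1)}(t)=e^{t}$. The composite Simpson rule and the helper names are choices made for this illustration. The integral is compared with the directly computed remainder $f(x)-P_{k}(x)$.

```python
import math

def simpson(func, lo, hi, n=1000):
    """Composite Simpson rule with n (even) subintervals; works for hi < lo as well."""
    if n % 2:
        raise ValueError("n must be even")
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def integral_remainder_exp(x, k):
    """Integral form of R_k(x) for f = exp and a = 0."""
    return simpson(lambda t: math.exp(t) * (x - t) ** k / math.factorial(k), 0.0, x)

def direct_remainder_exp(x, k):
    """R_k(x) computed directly as e^x - P_k(x)."""
    p_k = sum(x ** j / math.factorial(j) for j in range(k + 1))
    return math.exp(x) - p_k

print(integral_remainder_exp(1.0, 3), direct_remainder_exp(1.0, 3))
```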

Estimates for the remainder

It is often useful in practice to be able to estimate the remainder term appearing in the Taylor approximation, rather than having an exact formula for it. Suppose that f is (k + 1)-times continuously differentiable in an interval I containing a. Suppose that there are real constants q and Q such that

$$q\leq f^{(k+1)}(x)\leq Q$$

throughout I. Then the remainder term satisfies the inequality

$$q\frac{(x-a)^{k+1}}{(k+1)!}\leq R_{k}(x)\leq Q\frac{(x-a)^{k+1}}{(k+1)!},$$

if x > a, and a similar estimate if x < a. This is a simple consequence of the Lagrange form of the remainder. In particular, if

$$|f^{(k+1)}(x)|\leq M$$

on an interval I = (a − r, a + r) with some $r>0$, then

$$|R_{k}(x)|\leq M\frac{|x-a|^{k+1}}{(k+1)!}\leq M\frac{r^{k+1}}{(k+1)!}$$

for all x ∈ (a − r, a + r). The second inequality is called a uniform estimate, because it holds uniformly for all x on the interval (a − r, a + r).

Example

Approximation of $e^{x}$ (blue) by its Taylor polynomials $P_{k}$ of order $k=1,\ldots,7$ centered at $x=0$ (red).

Suppose that we wish to find the approximate value of the function $f(x)=e^{x}$ on the interval $[-1,1]$ while ensuring that the error in the approximation is no more than $10^{-5}$. In this example we pretend that we only know the following properties of the exponential function:

$$e^{0}=1,\qquad \frac{d}{dx}e^{x}=e^{x},\qquad e^{x}>0,\qquad x\in\mathbb{R}.$$ (★)

From these properties it follows that $f^{(k)}(x)=e^{x}$ for all $k$, and in particular, $f^{(k)}(0)=1$. Hence the $k$-th order Taylor polynomial of $f$ at $0$ and its remainder term in the Lagrange form are given by

$$P_{k}(x)=1+x+\frac{x^{2}}{2!}+\cdots+\frac{x^{k}}{k!},\qquad R_{k}(x)=\frac{e^{\xi}}{(k+1)!}x^{k+1},$$

where $\xi$ is some number between 0 and x. Since $e^{x}$ is increasing by (★), we can simply use $e^{x}\leq 1$ for $x\in[-1,0]$ to estimate the remainder on the subinterval $[-1,0]$. To obtain an upper bound for the remainder on $[0,1]$, we use the property $e^{\xi}<e^{x}$ for $0<\xi<x$ to estimate

$$e^{x}=1+x+\frac{e^{\xi}}{2}x^{2}<1+x+\frac{e^{x}}{2}x^{2},\qquad 0<x\leq 1$$

using the second order Taylor expansion. Then we solve for $e^{x}$ to deduce that

$$e^{x}\leq \frac{1+x}{1-\frac{x^{2}}{2}}=2\frac{1+x}{2-x^{2}}\leq 4,\qquad 0\leq x\leq 1$$

simply by maximizing the numerator and minimizing the denominator. Combining these estimates for $e^{x}$ we see that

$$|R_{k}(x)|\leq \frac{4|x|^{k+1}}{(k+1)!}\leq \frac{4}{(k+1)!},\qquad -1\leq x\leq 1,$$

so the required precision is certainly reached when

$$\frac{4}{(k+1)!}<10^{-5}\quad\Longleftrightarrow\quad 4\cdot 10^{5}<(k+1)!\quad\Longleftrightarrow\quad k\geq 9.$$

(See factorial or compute by hand the values $9!=362880$ and $10!=3628800$.) As a conclusion, Taylor's theorem leads to the approximation

$$e^{x}=1+x+\frac{x^{2}}{2!}+\cdots+\frac{x^{9}}{9!}+R_{9}(x),\qquad |R_{9}(x)|<10^{-5},\qquad -1\leq x\leq 1.$$

For instance, this approximation provides a decimal expression $e\approx 2.71828$, correct up to five decimal places.
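
The arithmetic of this example can be reproduced in a few lines. The sketch below searches for the smallest k satisfying the sufficient condition $4/(k+1)!<10^{-5}$ and then measures the actual error of $P_{k}$ on a grid over $[-1,1]$. The grid spacing and function names are assumptions made for the illustration, and `math.exp` is used only as the reference value.

```python
import math

def smallest_k(tol):
    """Smallest k with 4 / (k + 1)! < tol, the sufficient condition from the example."""
    k = 0
    while 4.0 / math.factorial(k + 1) >= tol:
        k += 1
    return k

def taylor_exp(x, k):
    """k-th order Taylor polynomial of exp at 0."""
    return sum(x ** j / math.factorial(j) for j in range(k + 1))

k = smallest_k(1e-5)
grid = [-1.0 + i / 500.0 for i in range(1001)]  # sample points in [-1, 1]
max_err = max(abs(math.exp(x) - taylor_exp(x, k)) for x in grid)
print(k, taylor_exp(1.0, k), max_err)
```

The observed error on the grid is well below the tolerance, which is expected because the bound 4/(k + 1)! is deliberately crude.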

Relationship to analyticity

Taylor expansions of real analytic functions

Let $I\subset\mathbb{R}$ be an open interval. By definition, a function $f:I\to\mathbb{R}$ is real analytic if it is locally defined by a convergent power series. This means that for every a ∈ I there exists some r > 0 and a sequence of coefficients $c_{k}\in\mathbb{R}$ such that (a − r, a + r) ⊂ I and

$$f(x)=\sum_{k=0}^{\infty}c_{k}(x-a)^{k}=c_{0}+c_{1}(x-a)+c_{2}(x-a)^{2}+\cdots,\qquad |x-a|<r.$$

In general, the radius of convergence of a power series can be computed from the Cauchy–Hadamard formula

$$\frac{1}{R}=\limsup_{k\to\infty}|c_{k}|^{\frac{1}{k}}.$$

This result is based on comparison with a geometric series, and the same method shows that if the power series based on a converges for some $b\in\mathbb{R}$, it must converge uniformly on the closed interval $[a-r_{b},a+r_{b}]$, where $r_{b}=|b-a|$. Here only the convergence of the power series is considered, and it might well be that (a − R, a + R) extends beyond the domain I of f.

The Taylor polynomials of the real analytic function f at a are simply the finite truncations

$$P_{k}(x)=\sum_{j=0}^{k}c_{j}(x-a)^{j},\qquad c_{j}=\frac{f^{(j)}(a)}{j!}$$

of its locally defining power series, and the corresponding remainder terms are locally given by the analytic functions

$$R_{k}(x)=\sum_{j=k+1}^{\infty}c_{j}(x-a)^{j}=(x-a)^{k}h_{k}(x),\qquad |x-a|<r.$$

Here the functions

$$\begin{aligned}&h_{k}:(a-r,a+r)\to\mathbb{R}\\&h_{k}(x)=(x-a)\sum_{j=0}^{\infty}c_{k+1+j}(x-a)^{j}\end{aligned}$$

are also analytic, since their defining power series have the same radius of convergence as the original series. Assuming that [a − r, a + r] ⊂ I and r < R, all these series converge uniformly on (a − r, a + r). Naturally, in the case of analytic functions one can estimate the remainder term $R_{k}(x)$ by the tail of the sequence of the derivatives f′(a) at the center of the expansion, but using complex analysis also another possibility arises, which is described below.

Taylor's theorem and convergence of Taylor series

The Taylor series of f will converge in some interval in which all its derivatives are bounded and do not grow too fast as k goes to infinity. (However, even if the Taylor series converges, it might not converge to f, as explained below; f is then said to be non-analytic.)

One might think of the Taylor series

$$f(x)\approx\sum_{k=0}^{\infty}c_{k}(x-a)^{k}=c_{0}+c_{1}(x-a)+c_{2}(x-a)^{2}+\cdots$$

of an infinitely many times differentiable function $f:\mathbb{R}\to\mathbb{R}$ as its "infinite order Taylor polynomial" at a. Now the estimates for the remainder imply that if, for any r, the derivatives of f are known to be bounded over (a − r, a + r), then for any order k and for any r > 0 there exists a constant $M_{k,r}>0$ such that

$$|R_{k}(x)|\leq M_{k,r}\frac{|x-a|^{k+1}}{(k+1)!}$$ (★★)

for every x ∈ (a − r, a + r). Sometimes the constants $M_{k,r}$ can be chosen in such a way that $M_{k,r}$ is bounded above, for fixed r and all k. Then the Taylor series of f converges uniformly to some analytic function

$$\begin{aligned}&T_{f}:(a-r,a+r)\to\mathbb{R}\\&T_{f}(x)=\sum_{k=0}^{\infty}\frac{f^{(k)}(a)}{k!}(x-a)^{k}\end{aligned}$$

(One also gets convergence even if $M_{k,r}$ is not bounded above as long as it grows slowly enough.)

The limit function $T_{f}$ is by definition always analytic, but it is not necessarily equal to the original function f, even if f is infinitely differentiable. In this case, we say f is a non-analytic smooth function, for example a flat function:

$$\begin{aligned}&f:\mathbb{R}\to\mathbb{R}\\&f(x)=\begin{cases}e^{-\frac{1}{x^{2}}}&x>0\\0&x\leq 0.\end{cases}\end{aligned}$$

Using the chain rule repeatedly by mathematical induction, one shows that for any order k,

$$f^{(k)}(x)=\begin{cases}\frac{p_{k}(x)}{x^{3k}}\cdot e^{-\frac{1}{x^{2}}}&x>0\\0&x\leq 0\end{cases}$$

for some polynomial $p_{k}$ of degree 2(k − 1). The function $e^{-\frac{1}{x^{2}}}$ tends to zero faster than any polynomial as $x\to 0$, so f is infinitely many times differentiable and $f^{(k)}(0)=0$ for every positive integer k. The above results all hold in this case:

  • The Taylor series of f converges uniformly to the zero function $T_{f}(x)=0$, which is analytic with all coefficients equal to zero.
  • The function f is unequal to this Taylor series, and hence non-analytic.
  • For any order k ∈ N and radius r > 0 there exists $M_{k,r}>0$ satisfying the remainder bound (★★) above.

However, as k increases for fixed r, the value of $M_{k,r}$ grows more quickly than $r^{k}$, and the error does not go to zero.
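
The behavior of the flat function can be seen directly in a short sketch. The function names are illustrative. Since every derivative at 0 vanishes, each Taylor polynomial at 0 is identically zero, so the remainder equals the function itself no matter how large k is, even though the function is extremely small near 0.

```python
import math

def flat(x):
    """The flat function: exp(-1/x**2) for x > 0 and 0 otherwise."""
    return math.exp(-1.0 / (x * x)) if x > 0 else 0.0

def taylor_poly_at_zero(x, k):
    """k-th order Taylor polynomial of flat at 0; all derivatives at 0 are zero."""
    return 0.0

x = 0.5
for k in (1, 10, 100):
    remainder = flat(x) - taylor_poly_at_zero(x, k)
    print(f"k={k:<4} R_k({x}) = {remainder:.6f}")  # same value for every k

# Near 0 the function is tiny even compared with a high power of x.
print(flat(0.1) / 0.1 ** 10)
```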

Taylor's theorem in complex analysis

Taylor's theorem generalizes to functions $f:\mathbb{C}\to\mathbb{C}$ which are complex differentiable in an open subset U ⊂ C of the complex plane. However, its usefulness is dwarfed by other general theorems in complex analysis. Namely, stronger versions of related results can be deduced for complex differentiable functions f : U → C using Cauchy's integral formula as follows.

Let r > 0 be such that the closed disk B(z, r) ∪ S(z, r) is contained in U. Then Cauchy's integral formula with a positive parametrization $\gamma(t)=z+re^{it}$ of the circle S(z, r) with $t\in[0,2\pi]$ gives

$$f(z)=\frac{1}{2\pi i}\int_{\gamma}\frac{f(w)}{w-z}\,dw,\quad f'(z)=\frac{1}{2\pi i}\int_{\gamma}\frac{f(w)}{(w-z)^{2}}\,dw,\quad\ldots,\quad f^{(k)}(z)=\frac{k!}{2\pi i}\int_{\gamma}\frac{f(w)}{(w-z)^{k+1}}\,dw.$$

Here all the integrands are continuous on the circle S(z, r), which justifies differentiation under the integral sign. In particular, if f is once complex differentiable on the open set U, then it is actually infinitely many times complex differentiable on U. One also obtains Cauchy's estimate

$$|f^{(k)}(z)|\leq\frac{k!}{2\pi}\int_{\gamma}\frac{M_{r}}{|w-z|^{k+1}}\,dw=\frac{k!M_{r}}{r^{k}},\quad M_{r}=\max_{|w-c|=r}|f(w)|$$

for any z ∈ U and r > 0 such that B(z, r) ∪ S(c, r) ⊂ U. The estimate implies that the complex Taylor series

$$T_{f}(z)=\sum_{k=0}^{\infty}\frac{f^{(k)}(c)}{k!}(z-c)^{k}$$

of f converges uniformly on any open disk $B(c,r)\subset U$ with $S(c,r)\subset U$ to some function $T_{f}$. Furthermore, using the contour integral formulas for the derivatives $f^{(k)}(c)$,

$$\begin{aligned}T_{f}(z)&=\sum_{k=0}^{\infty}\frac{(z-c)^{k}}{2\pi i}\int_{\gamma}\frac{f(w)}{(w-c)^{k+1}}\,dw\\&=\frac{1}{2\pi i}\int_{\gamma}\frac{f(w)}{w-c}\sum_{k=0}^{\infty}\left(\frac{z-c}{w-c}\right)^{k}\,dw\\&=\frac{1}{2\pi i}\int_{\gamma}\frac{f(w)}{w-c}\left(\frac{1}{1-\frac{z-c}{w-c}}\right)\,dw\\&=\frac{1}{2\pi i}\int_{\gamma}\frac{f(w)}{w-z}\,dw\\&=f(z),\end{aligned}$$

so any complex differentiable function f in an open set U ⊂ C is in fact complex analytic. All that is said for real analytic functions here holds also for complex analytic functions with the open interval I replaced by an open subset U ⊂ C and a-centered intervals (a − r, a + r) replaced by c-centered disks B(c, r). In particular, the Taylor expansion holds in the form

$$f(z)=P_{k}(z)+R_{k}(z),\quad P_{k}(z)=\sum_{j=0}^{k}\frac{f^{(j)}(c)}{j!}(z-c)^{j},$$

where the remainder term $R_{k}$ is complex analytic. Methods of complex analysis provide some powerful results regarding Taylor expansions. For example, using Cauchy's integral formula for any positively oriented Jordan curve $\gamma$ which parametrizes the boundary $\partial W\subset U$ of a region $W\subset U$, one obtains expressions for the derivatives $f^{(j)}(c)$ as above, and modifying slightly the computation for $T_{f}(z)=f(z)$, one arrives at the exact formula

$$R_{k}(z)=\sum_{j=k+1}^{\infty}\frac{(z-c)^{j}}{2\pi i}\int_{\gamma}\frac{f(w)}{(w-c)^{j+1}}\,dw=\frac{(z-c)^{k+1}}{2\pi i}\int_{\gamma}\frac{f(w)\,dw}{(w-c)^{k+1}(w-z)},\qquad z\in W.$$

The important feature here is that the quality of the approximation by a Taylor polynomial on the region $W\subset U$ is dominated by the values of the function f itself on the boundary $\partial W\subset U$. Similarly, applying Cauchy's estimates to the series expression for the remainder, one obtains the uniform estimates

$$|R_{k}(z)|\leq\sum_{j=k+1}^{\infty}\frac{M_{r}|z-c|^{j}}{r^{j}}=\frac{M_{r}}{r^{k+1}}\frac{|z-c|^{k+1}}{1-\frac{|z-c|}{r}}\leq\frac{M_{r}\beta^{k+1}}{1-\beta},\qquad\frac{|z-c|}{r}\leq\beta<1.$$
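
The contour formulas and estimates of this section lend themselves to a numerical illustration. The sketch below is an assumption-laden example, not part of the article: it takes $f=\exp$, center $c=0$, radius $r=1$, and discretizes the circle with the trapezoidal rule. Substituting $w=c+re^{it}$ into Cauchy's formula for $f^{(k)}(c)$ gives $\frac{k!}{2\pi r^{k}}\int_{0}^{2\pi}f(c+re^{it})e^{-ikt}\,dt$, which is what `cauchy_derivative` approximates. All function names are hypothetical.

```python
import cmath
import math

def cauchy_derivative(f, c, r, k, n=64):
    """Approximate f^(k)(c) via Cauchy's integral formula on |w - c| = r,
    using the trapezoidal rule with n equally spaced nodes."""
    total = 0.0 + 0.0j
    for m in range(n):
        e_it = cmath.exp(1j * 2.0 * math.pi * m / n)
        total += f(c + r * e_it) / e_it ** k
    return math.factorial(k) * total / (n * r ** k)

def max_on_circle(f, c, r, n=64):
    """Sampled estimate of M_r = max |f(w)| on |w - c| = r (never above the true max)."""
    return max(abs(f(c + r * cmath.exp(1j * 2.0 * math.pi * m / n))) for m in range(n))

def remainder_bound(M_r, beta, k):
    """Uniform estimate M_r * beta**(k+1) / (1 - beta) for |z - c| / r <= beta < 1."""
    return M_r * beta ** (k + 1) / (1.0 - beta)

d3 = cauchy_derivative(cmath.exp, 0.0, 1.0, 3)  # exact value is exp(0) = 1
M_r = max_on_circle(cmath.exp, 0.0, 1.0)        # attained at w = 1 for exp
z, k = 0.5, 5
actual = abs(cmath.exp(z) - sum(z ** j / math.factorial(j) for j in range(k + 1)))
print(d3, M_r, actual, remainder_bound(M_r, 0.5, k))
```

For this example the actual remainder is far smaller than the uniform estimate, which only uses the size of f on the circle.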

Example

Complex plot of $f(z)=\frac{1}{1+z^{2}}$. Modulus is shown by elevation and argument by coloring: cyan = $0$, blue = $\frac{\pi}{3}$, violet = $\frac{2\pi}{3}$, red = $\pi$, yellow = $\frac{4\pi}{3}$, green = $\frac{5\pi}{3}$.

The function

$$\begin{aligned}&f:\mathbb{R}\to\mathbb{R}\\&f(x)=\frac{1}{1+x^{2}}\end{aligned}$$

is real analytic, that is, locally determined by its Taylor series. This function was plotted above to illustrate the fact that some elementary functions cannot be approximated by Taylor polynomials in neighborhoods of the center of expansion which are too large. This kind of behavior is easily understood in the framework of complex analysis. Namely, the function f extends into a meromorphic function

$$\begin{aligned}&f:\mathbb{C}\cup\{\infty\}\to\mathbb{C}\cup\{\infty\}\\&f(z)=\frac{1}{1+z^{2}}\end{aligned}$$

on the compactified complex plane. It has simple poles at $z=i$ and $z=-i$, and it is analytic elsewhere. Now its Taylor series centered at $z_{0}$ converges on any disc $B(z_{0},r)$ with $r<|z-z_{0}|$, where the same Taylor series converges at z ∈ C. Therefore, the Taylor series of f centered at 0 converges on B(0, 1) and it does not converge for any z ∈ C with |z| > 1 due to the poles at i and −i. For the same reason the Taylor series of f centered at 1 converges on $B(1,\sqrt{2})$ and does not converge for any z ∈ C with $|z-1|>\sqrt{2}$.
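
The effect of the poles at ±i on the real line can be observed numerically. The sketch below uses the fact that the Taylor series of $\frac{1}{1+x^{2}}$ at 0 is the geometric series $\sum_{j\geq 0}(-1)^{j}x^{2j}$; the function names and sample points are chosen for illustration only.

```python
def f(x):
    """The function 1 / (1 + x**2)."""
    return 1.0 / (1.0 + x * x)

def taylor_partial_sum(x, m):
    """P_{2m}(x) = sum_{j=0}^{m} (-1)**j * x**(2j), the Taylor polynomial of f at 0."""
    return sum((-1) ** j * x ** (2 * j) for j in range(m + 1))

for x in (0.5, 1.5):
    for m in (5, 20, 40):
        err = abs(taylor_partial_sum(x, m) - f(x))
        print(f"x={x} degree={2 * m:<3} error={err:.3e}")
```

Inside the interval (−1, 1) the error shrinks rapidly with the degree, while at x = 1.5 it grows without bound, matching the radius of convergence 1 dictated by the poles.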

Generalizations of Taylor's theorem

Higher-order differentiability

A function $f:\mathbb{R}^{n}\to\mathbb{R}$ is differentiable at $\boldsymbol{a}\in\mathbb{R}^{n}$ if and only if there exists a linear functional $L:\mathbb{R}^{n}\to\mathbb{R}$ and a function $h:\mathbb{R}^{n}\to\mathbb{R}$ such that

$$f(\boldsymbol{x})=f(\boldsymbol{a})+L(\boldsymbol{x}-\boldsymbol{a})+h(\boldsymbol{x})\lVert\boldsymbol{x}-\boldsymbol{a}\rVert,\qquad\lim_{\boldsymbol{x}\to\boldsymbol{a}}h(\boldsymbol{x})=0.$$

If this is the case, then $L=df(\boldsymbol{a})$ is the (uniquely defined) differential of f at the point a. Furthermore, then the partial derivatives of f exist at a and the differential of f at a is given by

$$df(\boldsymbol{a})(\boldsymbol{v})=\frac{\partial f}{\partial x_{1}}(\boldsymbol{a})v_{1}+\cdots+\frac{\partial f}{\partial x_{n}}(\boldsymbol{a})v_{n}.$$

Introduce the multi-index notation

$$|\alpha|=\alpha_{1}+\cdots+\alpha_{n},\quad\alpha!=\alpha_{1}!\cdots\alpha_{n}!,\quad\boldsymbol{x}^{\alpha}=x_{1}^{\alpha_{1}}\cdots x_{n}^{\alpha_{n}}$$

for $\alpha\in\mathbb{N}^{n}$ and $\boldsymbol{x}\in\mathbb{R}^{n}$. If all the $k$-th order partial derivatives of $f:\mathbb{R}^{n}\to\mathbb{R}$ are continuous at $\boldsymbol{a}\in\mathbb{R}^{n}$, then by Clairaut's theorem, one can change the order of mixed derivatives at a, so the short-hand notation

$$D^{\alpha}f=\frac{\partial^{|\alpha|}f}{\partial\boldsymbol{x}^{\alpha}}=\frac{\partial^{\alpha_{1}+\ldots+\alpha_{n}}f}{\partial x_{1}^{\alpha_{1}}\cdots\partial x_{n}^{\alpha_{n}}}$$

for the higher order partial derivatives is justified in this situation. The same is true if all the (k − 1)-th order partial derivatives of f exist in some neighborhood of a and are differentiable at a. Then we say that f is k times differentiable at the point a.

Taylor's theorem for multivariate functions

Using notations of the preceding section, one has the following theorem.

Multivariate version of Taylor's theorem. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $k$-times continuously differentiable function at the point $\boldsymbol{a} \in \mathbb{R}^n$. Then there exist functions $h_{\alpha} : \mathbb{R}^n \to \mathbb{R}$, where $|\alpha| = k$, such that

$$\begin{aligned}&f(\boldsymbol{x}) = \sum_{|\alpha| \leq k} \frac{D^{\alpha}f(\boldsymbol{a})}{\alpha!}(\boldsymbol{x}-\boldsymbol{a})^{\alpha} + \sum_{|\alpha| = k} h_{\alpha}(\boldsymbol{x})(\boldsymbol{x}-\boldsymbol{a})^{\alpha},\\&\text{and}\quad \lim_{\boldsymbol{x} \to \boldsymbol{a}} h_{\alpha}(\boldsymbol{x}) = 0.\end{aligned}$$

If the function $f : \mathbb{R}^n \to \mathbb{R}$ is $k+1$ times continuously differentiable in a closed ball $B = \{\boldsymbol{y} \in \mathbb{R}^n : \|\boldsymbol{a} - \boldsymbol{y}\| \leq r\}$ for some $r > 0$, then one can derive an exact formula for the remainder in terms of the $(k+1)$-th order partial derivatives of $f$ in this neighborhood. Namely,

$$\begin{aligned}&f(\boldsymbol{x}) = \sum_{|\alpha| \leq k} \frac{D^{\alpha}f(\boldsymbol{a})}{\alpha!}(\boldsymbol{x}-\boldsymbol{a})^{\alpha} + \sum_{|\beta| = k+1} R_{\beta}(\boldsymbol{x})(\boldsymbol{x}-\boldsymbol{a})^{\beta},\\&R_{\beta}(\boldsymbol{x}) = \frac{|\beta|}{\beta!} \int_0^1 (1-t)^{|\beta|-1} D^{\beta}f\big(\boldsymbol{a} + t(\boldsymbol{x}-\boldsymbol{a})\big)\,dt.\end{aligned}$$

In this case, due to the continuity of (k+1)-th order partial derivatives in the compact set B, one immediately obtains the uniform estimates

$$\left|R_{\beta}(\boldsymbol{x})\right| \leq \frac{1}{\beta!} \max_{|\alpha| = |\beta|} \max_{\boldsymbol{y} \in B} |D^{\alpha}f(\boldsymbol{y})|, \qquad \boldsymbol{x} \in B.$$

Example in two dimensions

For example, the third-order Taylor polynomial of a smooth function $f : \mathbb{R}^2 \to \mathbb{R}$ is, denoting $\boldsymbol{x} - \boldsymbol{a} = \boldsymbol{v}$,

$$\begin{aligned}P_3(\boldsymbol{x}) = f(\boldsymbol{a}) + {} & \frac{\partial f}{\partial x_1}(\boldsymbol{a})v_1 + \frac{\partial f}{\partial x_2}(\boldsymbol{a})v_2 + \frac{\partial^2 f}{\partial x_1^2}(\boldsymbol{a})\frac{v_1^2}{2!} + \frac{\partial^2 f}{\partial x_1 \partial x_2}(\boldsymbol{a})v_1v_2 + \frac{\partial^2 f}{\partial x_2^2}(\boldsymbol{a})\frac{v_2^2}{2!}\\& + \frac{\partial^3 f}{\partial x_1^3}(\boldsymbol{a})\frac{v_1^3}{3!} + \frac{\partial^3 f}{\partial x_1^2 \partial x_2}(\boldsymbol{a})\frac{v_1^2v_2}{2!} + \frac{\partial^3 f}{\partial x_1 \partial x_2^2}(\boldsymbol{a})\frac{v_1v_2^2}{2!} + \frac{\partial^3 f}{\partial x_2^3}(\boldsymbol{a})\frac{v_2^3}{3!}\end{aligned}$$
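The multi-index sum can be checked numerically. The following is a minimal Python sketch, not part of the original article: the helper names `multi_indices` and `taylor_poly`, the test function $f(x_1, x_2) = e^{x_1 + 2x_2}$ and the evaluation point are all illustrative assumptions. The test function is chosen because its partial derivatives are known in closed form, $D^{\alpha}f = 2^{\alpha_2} f$.

```python
import math
from itertools import product

def multi_indices(n, max_order):
    """Yield all multi-indices alpha in N^n with |alpha| <= max_order."""
    for alpha in product(range(max_order + 1), repeat=n):
        if sum(alpha) <= max_order:
            yield alpha

def taylor_poly(deriv, a, x, k):
    """Evaluate sum over |alpha| <= k of D^alpha f(a) / alpha! * (x - a)^alpha.

    deriv(alpha, point) must return D^alpha f(point); it is supplied by the caller.
    """
    total = 0.0
    for alpha in multi_indices(len(a), k):
        alpha_factorial = math.prod(math.factorial(e) for e in alpha)
        monomial = math.prod((xi - ai) ** e for xi, ai, e in zip(x, a, alpha))
        total += deriv(alpha, a) / alpha_factorial * monomial
    return total

# Hypothetical test function (an assumption): f(x1, x2) = exp(x1 + 2*x2).
def f(p):
    return math.exp(p[0] + 2.0 * p[1])

def deriv_f(alpha, p):
    # For this particular f, D^alpha f = 2**alpha_2 * f.
    return 2.0 ** alpha[1] * f(p)

a = (0.0, 0.0)
x = (0.1, 0.05)
p3 = taylor_poly(deriv_f, a, x, 3)   # third-order Taylor polynomial at x
error = abs(f(x) - p3)               # size of the remainder at x
```

For this choice of $f$ the polynomial collapses to the one-variable Taylor polynomial of $e^s$ at $s = v_1 + 2v_2$, which gives an independent check on the multi-index bookkeeping.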

Proofs

Proof for Taylor's theorem in one real variable

Let

$$h_k(x) = \begin{cases}\dfrac{f(x) - P(x)}{(x-a)^k} & x \neq a\\ 0 & x = a\end{cases}$$

where, as in the statement of Taylor's theorem,

$$P(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(k)}(a)}{k!}(x-a)^k.$$

It is sufficient to show that

$$\lim_{x \to a} h_k(x) = 0.$$

The proof here is based on repeated application of L'Hôpital's rule. Note that, for each $j = 0, 1, \ldots, k-1$, $f^{(j)}(a) = P^{(j)}(a)$. Hence each of the first $k-1$ derivatives of the numerator in $h_k(x)$ vanishes at $x = a$, and the same is true of the denominator. Also, since the condition that the function $f$ be $k$ times differentiable at a point requires differentiability up to order $k-1$ in a neighborhood of that point (differentiability requires a function to be defined in a whole neighborhood of a point), the numerator and its first $k-2$ derivatives are differentiable in a neighborhood of $a$. The denominator also satisfies this condition and, in addition, does not vanish unless $x = a$. Therefore all conditions necessary for L'Hôpital's rule are fulfilled, and its use is justified. So

$$\begin{aligned}\lim_{x \to a} \frac{f(x) - P(x)}{(x-a)^k} &= \lim_{x \to a} \frac{\frac{d}{dx}(f(x) - P(x))}{\frac{d}{dx}(x-a)^k}\\&= \cdots\\&= \lim_{x \to a} \frac{\frac{d^{k-1}}{dx^{k-1}}(f(x) - P(x))}{\frac{d^{k-1}}{dx^{k-1}}(x-a)^k}\\&= \frac{1}{k!} \lim_{x \to a} \frac{f^{(k-1)}(x) - P^{(k-1)}(x)}{x-a}\\&= \frac{1}{k!}\left(f^{(k)}(a) - P^{(k)}(a)\right) = 0\end{aligned}$$

where the second-to-last equality follows by the definition of the derivative at $x = a$.
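The limit $h_k(x) \to 0$ can be observed numerically. The sketch below is only an illustration, not a proof. It assumes $f = \exp$, $a = 0$ and $k = 3$, and the function names are hypothetical.

```python
import math

def taylor_poly_exp(x, k):
    """k-th order Taylor polynomial of exp around a = 0."""
    return sum(x ** j / math.factorial(j) for j in range(k + 1))

def h(x, k):
    """h_k(x) = (f(x) - P(x)) / (x - a)^k for f = exp and a = 0."""
    if x == 0.0:
        return 0.0
    return (math.exp(x) - taylor_poly_exp(x, k)) / x ** k

k = 3
# Evaluate h_k at x = 0.1, 0.01, 0.001; the values shrink roughly like x / 24.
values = [h(10.0 ** -m, k) for m in range(1, 4)]
```

Much smaller values of $x$ are avoided deliberately: the subtraction in the numerator loses floating-point precision long before the mathematical limit is reached.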

Alternate proof for Taylor's theorem in one real variable

Let $f$ be a real-valued function that is $n$ times differentiable on an open interval containing $a$ and $x$, to be approximated by its Taylor polynomial.

Step 1: Define the functions $F$ and $G$ by

$$F(x) = f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x-a)^k$$

$$G(x) = (x-a)^n$$

Step 2: Properties of $F$ and $G$:

$$\begin{aligned}F(a) &= f(a) - f(a) - f'(a)(a-a) - \cdots - \frac{f^{(n-1)}(a)}{(n-1)!}(a-a)^{n-1} = 0\\G(a) &= (a-a)^n = 0\end{aligned}$$

Similarly,

$$F'(a) = f'(a) - f'(a) - \frac{f''(a)}{(2-1)!}(a-a)^{(2-1)} - \cdots - \frac{f^{(n-1)}(a)}{(n-2)!}(a-a)^{n-2} = 0$$

$$\begin{aligned}G'(a) &= n(a-a)^{n-1} = 0\\&\qquad \vdots\\G^{(n-1)}(a) &= F^{(n-1)}(a) = 0\end{aligned}$$

Step 3: Apply Cauchy's mean value theorem

Let $f_1$ and $g_1$ be continuous functions on $[a, b]$. Since $a < x < b$, we can work with the interval $[a, x]$. Let $f_1$ and $g_1$ be differentiable on $(a, x)$, and assume $g_1'(t) \neq 0$ for all $t \in (a, b)$. Then there exists $c_1 \in (a, x)$ such that

$$\frac{f_1(x) - f_1(a)}{g_1(x) - g_1(a)} = \frac{f_1'(c_1)}{g_1'(c_1)}$$

Note that $G'(t) \neq 0$ for $t \in (a, b)$ and $F(a) = G(a) = 0$, so

$$\frac{F(x)}{G(x)} = \frac{F(x) - F(a)}{G(x) - G(a)} = \frac{F'(c_1)}{G'(c_1)}$$

for some $c_1 \in (a, x)$.

Since $F'(a) = G'(a) = 0$, the same argument can be applied on $(a, c_1)$:

$$\frac{F'(c_1)}{G'(c_1)} = \frac{F'(c_1) - F'(a)}{G'(c_1) - G'(a)} = \frac{F''(c_2)}{G''(c_2)}$$

for some $c_2 \in (a, c_1)$. This can be continued up to $c_n$.

This gives an ordered sequence of points in $(a, b)$:

$$a < c_n < c_{n-1} < \dots < c_1 < x$$

with

$$\frac{F(x)}{G(x)} = \frac{F'(c_1)}{G'(c_1)} = \dots = \frac{F^{(n)}(c_n)}{G^{(n)}(c_n)}.$$

Set $c = c_n$:

$$\frac{F(x)}{G(x)} = \frac{F^{(n)}(c)}{G^{(n)}(c)}$$

Step 4: Substitute back

$$\frac{F(x)}{G(x)} = \frac{f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x-a)^k}{(x-a)^n} = \frac{F^{(n)}(c)}{G^{(n)}(c)}$$

By the power rule, differentiating $(x-a)^n$ repeatedly gives $G^{(n)}(c) = n(n-1) \cdots 1 = n!$. Moreover, $F^{(n)}(c) = f^{(n)}(c)$, because the polynomial subtracted from $f$ in $F$ has degree at most $n-1$ and its $n$-th derivative vanishes. So:

$$\frac{F^{(n)}(c)}{G^{(n)}(c)} = \frac{f^{(n)}(c)}{n(n-1) \cdots 1} = \frac{f^{(n)}(c)}{n!}.$$

This leads to:

$$f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x-a)^k = \frac{f^{(n)}(c)}{n!}(x-a)^n.$$

By rearranging, we get:

$$f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{f^{(n)}(c)}{n!}(x-a)^n,$$

where $a < c < x$. In particular, $c \to a$ as $x \to a$. If $f^{(n)}$ is continuous at $a$, then $f^{(n)}(c) \to f^{(n)}(a)$, and therefore

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + o\big((x-a)^n\big) \quad \text{as } x \to a.$$
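For a concrete function, the intermediate point $c$ in the mean value form $f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x-a)^k = \frac{f^{(n)}(c)}{n!}(x-a)^n$ can sometimes be solved for explicitly. The sketch below assumes $f = \exp$ and $a = 0$, so that $f^{(n)}(c) = e^c$ can be inverted with a logarithm. The function name `lagrange_point_exp` and the sample values are illustrative.

```python
import math

def lagrange_point_exp(x, n):
    """Solve exp(x) - sum_{k<n} x^k / k! = exp(c) * x^n / n! for c (f = exp, a = 0)."""
    partial_sum = sum(x ** k / math.factorial(k) for k in range(n))
    remainder = math.exp(x) - partial_sum
    # exp(c) = remainder * n! / x^n, so c is the logarithm of that ratio.
    return math.log(remainder * math.factorial(n) / x ** n)

x, n = 1.0, 3
c = lagrange_point_exp(x, n)   # lies strictly between a = 0 and x = 1
```

The theorem only asserts that such a $c$ exists in $(a, x)$. A closed form like this is available only because $f^{(n)}$ is invertible for this particular example.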

Derivation for the mean value forms of the remainder

Let $G$ be any real-valued function, continuous on the closed interval between $a$ and $x$ and differentiable with a non-vanishing derivative on the open interval between $a$ and $x$, and define

$$F(t) = f(t) + f'(t)(x-t) + \frac{f''(t)}{2!}(x-t)^2 + \cdots + \frac{f^{(k)}(t)}{k!}(x-t)^k$$

for $t$ in the closed interval between $a$ and $x$. Then, by Cauchy's mean value theorem,

$$\frac{F'(\xi)}{G'(\xi)} = \frac{F(x) - F(a)}{G(x) - G(a)}$$ (★★★)

for some $\xi$ on the open interval between $a$ and $x$. Note that here the numerator $F(x) - F(a) = R_k(x)$ is exactly the remainder of the Taylor polynomial for $y = f(x)$. Compute

$$\begin{aligned}F'(t) = {} & f'(t) + \big(f''(t)(x-t) - f'(t)\big) + \left(\frac{f^{(3)}(t)}{2!}(x-t)^2 - \frac{f^{(2)}(t)}{1!}(x-t)\right) + \cdots\\& \cdots + \left(\frac{f^{(k+1)}(t)}{k!}(x-t)^k - \frac{f^{(k)}(t)}{(k-1)!}(x-t)^{k-1}\right) = \frac{f^{(k+1)}(t)}{k!}(x-t)^k,\end{aligned}$$

plug it into (★★★) and rearrange terms to find that

$$R_k(x) = \frac{f^{(k+1)}(\xi)}{k!}(x-\xi)^k \frac{G(x) - G(a)}{G'(\xi)}.$$

This is the form of the remainder term mentioned after the actual statement of Taylor's theorem with remainder in the mean value form. The Lagrange form of the remainder is found by choosing $G(t) = (x-t)^{k+1}$ and the Cauchy form by choosing $G(t) = t - a$.
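The two specializations can be verified algebraically, or spot-checked numerically as in the following sketch. It assumes $f = \exp$ (so every derivative is again $\exp$) and uses an arbitrary sample point `xi` between $a$ and $x$. This `xi` is not the particular $\xi$ whose existence the theorem guarantees, so the code only confirms that the general expression reduces to the Lagrange and Cauchy expressions, not the value of the remainder itself. All names are illustrative.

```python
import math

def remainder_general(dk1_f, k, a, x, xi, G, dG):
    """Evaluate f^(k+1)(xi) / k! * (x - xi)^k * (G(x) - G(a)) / G'(xi)."""
    return dk1_f(xi) / math.factorial(k) * (x - xi) ** k * (G(x) - G(a)) / dG(xi)

k, a, x, xi = 3, 0.0, 1.0, 0.4   # xi: arbitrary sample point strictly between a and x
dk1_f = math.exp                 # assumption: f = exp, so f^(k+1) = exp

# Lagrange choice: G(t) = (x - t)^(k+1), with G'(t) = -(k+1) (x - t)^k.
lagrange = remainder_general(
    dk1_f, k, a, x, xi,
    G=lambda t: (x - t) ** (k + 1),
    dG=lambda t: -(k + 1) * (x - t) ** k,
)

# Cauchy choice: G(t) = t - a, with G'(t) = 1.
cauchy = remainder_general(
    dk1_f, k, a, x, xi,
    G=lambda t: t - a,
    dG=lambda t: 1.0,
)

# Closed-form Lagrange and Cauchy expressions evaluated at the same xi.
lagrange_closed = math.exp(xi) / math.factorial(k + 1) * (x - a) ** (k + 1)
cauchy_closed = math.exp(xi) / math.factorial(k) * (x - xi) ** k * (x - a)
```

In the Lagrange case the factor $(x-\xi)^k$ cancels against $G'(\xi)$, which is why the resulting expression depends on $\xi$ only through $f^{(k+1)}(\xi)$.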

Remark. Using this method one can also recover the integral form of the remainder by choosing

$$G(t) = \int_a^t \frac{f^{(k+1)}(s)}{k!}(x-s)^k\,ds,$$

but the requirements on $f$ needed for the use of the mean value theorem are too strong if one aims to prove the claim in the case that $f^{(k)}$ is only absolutely continuous. However, if one uses the Riemann integral instead of the Lebesgue integral, the assumptions cannot be weakened.

Derivation for the integral form of the remainder

Due to the absolute continuity of $f^{(k)}$ on the closed interval between $a$ and $x$, its derivative $f^{(k+1)}$ exists as an $L^1$-function, and we can use the fundamental theorem of calculus and integration by parts. The same proof applies for the Riemann integral, assuming that $f^{(k)}$ is continuous on the closed interval and differentiable on the open interval between $a$ and $x$, and this leads to the same result as using the mean value theorem.

The fundamental theorem of calculus states that

$$f(x) = f(a) + \int_a^x f'(t)\,dt.$$

Now we can integrate by parts and use the fundamental theorem of calculus again to see that

$$\begin{aligned}f(x) &= f(a) + \Big(xf'(x) - af'(a)\Big) - \int_a^x tf''(t)\,dt\\&= f(a) + x\left(f'(a) + \int_a^x f''(t)\,dt\right) - af'(a) - \int_a^x tf''(t)\,dt\\&= f(a) + (x-a)f'(a) + \int_a^x (x-t)f''(t)\,dt,\end{aligned}$$

which is exactly Taylor's theorem with remainder in the integral form in the case $k = 1$. The general statement is proved using induction. Suppose that

$$f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \cdots + \frac{f^{(k)}(a)}{k!}(x-a)^k + \int_a^x \frac{f^{(k+1)}(t)}{k!}(x-t)^k\,dt.$$ (eq1)

Integrating the remainder term by parts we arrive at

$$\begin{aligned}\int_a^x \frac{f^{(k+1)}(t)}{k!}(x-t)^k\,dt = {} & -\left[\frac{f^{(k+1)}(t)}{(k+1)k!}(x-t)^{k+1}\right]_a^x + \int_a^x \frac{f^{(k+2)}(t)}{(k+1)k!}(x-t)^{k+1}\,dt\\= {} & \frac{f^{(k+1)}(a)}{(k+1)!}(x-a)^{k+1} + \int_a^x \frac{f^{(k+2)}(t)}{(k+1)!}(x-t)^{k+1}\,dt.\end{aligned}$$

Substituting this into the formula in (eq1) shows that if it holds for the value $k$, it must also hold for the value $k+1$. Therefore, since it holds for $k = 1$, it must hold for every positive integer $k$.
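The integral form can be spot-checked with numerical quadrature. The following sketch assumes $f = \sin$, whose derivatives satisfy $\sin^{(j)}(t) = \sin(t + j\pi/2)$, and approximates the remainder integral with a composite Simpson rule. The helper names, the choice of quadrature rule and the sample values of $a$, $x$ and $k$ are illustrative assumptions.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * step)
    return total * step / 3

def sin_derivative(j, t):
    """j-th derivative of sin at t, via sin^(j)(t) = sin(t + j*pi/2)."""
    return math.sin(t + j * math.pi / 2)

def taylor_with_integral_remainder(a, x, k):
    """Return the k-th order Taylor polynomial of sin at x and the integral remainder."""
    poly = sum(sin_derivative(j, a) / math.factorial(j) * (x - a) ** j
               for j in range(k + 1))
    remainder = simpson(
        lambda t: sin_derivative(k + 1, t) / math.factorial(k) * (x - t) ** k,
        a, x)
    return poly, remainder

poly, remainder = taylor_with_integral_remainder(a=0.5, x=1.2, k=3)
# poly + remainder should agree with math.sin(1.2) up to quadrature error.
```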

Derivation for the remainder of multivariate Taylor polynomials

We prove the special case where $f : \mathbb{R}^n \to \mathbb{R}$ has continuous partial derivatives up to the order $k+1$ in some closed ball $B$ with center $\boldsymbol{a}$. The strategy of the proof is to apply the one-variable case of Taylor's theorem to the restriction of $f$ to the line segment joining $\boldsymbol{x}$ and $\boldsymbol{a}$. Parametrize the line segment between $\boldsymbol{a}$ and $\boldsymbol{x}$ by $\boldsymbol{u}(t) = \boldsymbol{a} + t(\boldsymbol{x} - \boldsymbol{a})$. We apply the one-variable version of Taylor's theorem to the function $g(t) = f(\boldsymbol{u}(t))$:

$$f(\boldsymbol{x}) = g(1) = g(0) + \sum_{j=1}^{k} \frac{1}{j!}g^{(j)}(0) + \int_0^1 \frac{(1-t)^k}{k!}g^{(k+1)}(t)\,dt.$$

Applying the chain rule for several variables gives

$$\begin{aligned}g^{(j)}(t) &= \frac{d^j}{dt^j}f(\boldsymbol{u}(t))\\&= \frac{d^j}{dt^j}f(\boldsymbol{a} + t(\boldsymbol{x}-\boldsymbol{a}))\\&= \sum_{|\alpha| = j} \binom{j}{\alpha}(D^{\alpha}f)(\boldsymbol{a} + t(\boldsymbol{x}-\boldsymbol{a}))(\boldsymbol{x}-\boldsymbol{a})^{\alpha}\end{aligned}$$

where $\tbinom{j}{\alpha}$ is the multinomial coefficient. Since $\tfrac{1}{j!}\tbinom{j}{\alpha} = \tfrac{1}{\alpha!}$, we get:

$$f(\boldsymbol{x}) = f(\boldsymbol{a}) + \sum_{1 \leq |\alpha| \leq k} \frac{1}{\alpha!}(D^{\alpha}f)(\boldsymbol{a})(\boldsymbol{x}-\boldsymbol{a})^{\alpha} + \sum_{|\alpha| = k+1} \frac{k+1}{\alpha!}(\boldsymbol{x}-\boldsymbol{a})^{\alpha} \int_0^1 (1-t)^k (D^{\alpha}f)(\boldsymbol{a} + t(\boldsymbol{x}-\boldsymbol{a}))\,dt.$$
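The chain-rule identity for $g^{(j)}(t)$ used in this derivation can be checked on an example where both sides are available in closed form. The sketch below assumes $n = 2$ and the hypothetical test function $f(x_1, x_2) = e^{x_1 + 2x_2}$, for which $D^{\alpha}f = 2^{\alpha_2} f$ and $g^{(j)}(t) = (v_1 + 2v_2)^j\, g(t)$ with $\boldsymbol{v} = \boldsymbol{x} - \boldsymbol{a}$. The function names and sample points are illustrative.

```python
import math

def f(p):
    """Hypothetical test function f(x1, x2) = exp(x1 + 2*x2)."""
    return math.exp(p[0] + 2.0 * p[1])

def partial(alpha, p):
    """D^alpha f at p; for this f it equals 2**alpha_2 * f(p)."""
    return 2.0 ** alpha[1] * f(p)

def g_derivative(j, a, x, t):
    """Sum over |alpha| = j of (j choose alpha) * D^alpha f(a + t v) * v^alpha."""
    v = (x[0] - a[0], x[1] - a[1])
    point = (a[0] + t * v[0], a[1] + t * v[1])
    total = 0.0
    for first in range(j + 1):
        alpha = (first, j - first)
        # Multinomial coefficient j! / (alpha_1! * alpha_2!).
        coeff = math.factorial(j) // (math.factorial(alpha[0]) * math.factorial(alpha[1]))
        total += coeff * partial(alpha, point) * v[0] ** alpha[0] * v[1] ** alpha[1]
    return total

def g_derivative_direct(j, a, x, t):
    """Differentiate g(t) = exp(a1 + 2*a2 + t*s) directly, where s = v1 + 2*v2."""
    s = (x[0] - a[0]) + 2.0 * (x[1] - a[1])
    return s ** j * math.exp(a[0] + 2.0 * a[1] + t * s)

a = (0.3, -0.2)
x = (0.5, 0.1)
t = 0.7
pairs = [(g_derivative(j, a, x, t), g_derivative_direct(j, a, x, t)) for j in range(5)]
```

Agreement of each pair reflects the binomial theorem in this two-variable case. The general statement relies on the equality of mixed partial derivatives noted earlier.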

Footnotes

  1. (2013). "Linear and quadratic approximation" Retrieved December 6, 2018
  2. Taylor, Brook (1715). Methodus Incrementorum Directa et Inversa [Direct and Reverse Methods of Incrementation] (in Latin). London. p. 21–23 (Prop. VII, Thm. 3, Cor. 2). Translated into English in Struik, D. J. (1969). A Source Book in Mathematics 1200–1800. Cambridge, Massachusetts: Harvard University Press. pp. 329–332.
  3. Kline 1972, pp. 442, 464.
  4. Genocchi, Angelo; Peano, Giuseppe (1884), Calcolo differenziale e principii di calcolo integrale, (N. 67, pp. XVII–XIX): Fratelli Bocca ed.
  5. Spivak, Michael (1994), Calculus (3rd ed.), Houston, TX: Publish or Perish, p. 383, ISBN 978-0-914098-89-8
  6. "Taylor formula", Encyclopedia of Mathematics, EMS Press, 2001
  7. The hypothesis of $f^{(k)}$ being continuous on the closed interval between $a$ and $x$ is not redundant. Although $f$ being $k+1$ times differentiable on the open interval between $a$ and $x$ does imply that $f^{(k)}$ is continuous on the open interval between $a$ and $x$, it does not imply that $f^{(k)}$ is continuous on the closed interval between $a$ and $x$, i.e. it does not imply that $f^{(k)}$ is continuous at the endpoints of that interval. Consider, for example, the function $f : [0,1] \to \mathbb{R}$ defined to equal $\sin(1/x)$ on $(0,1]$ and with $f(0) = 0$. This is not continuous at 0, but is continuous on $(0,1)$. Moreover, one can show that this function has an antiderivative. Therefore that antiderivative is differentiable on $(0,1)$, its derivative (the function $f$) is continuous on the open interval $(0,1)$, but its derivative $f$ is not continuous on the closed interval $[0,1]$. So the theorem would not apply in this case.
  8. Kline 1998, §20.3; Apostol 1967, §7.7.
  9. Apostol 1967, §7.7.
  10. Apostol 1967, §7.5.
  11. Apostol 1967, §7.6
  12. Rudin 1987, §10.26
  13. This follows from iterated application of the theorem that if the partial derivatives of a function f exist in a neighborhood of a and are continuous at a, then the function is differentiable at a. See, for instance, Apostol 1974, Theorem 12.11.
  14. Königsberger Analysis 2, p. 64 ff.
  15. Folland, G. B. "Higher-Order Derivatives and Taylor's Formula in Several Variables" (PDF). Department of Mathematics | University of Washington. Retrieved 2024-02-21.
  16. Stromberg 1981
  17. Hörmander 1976, pp. 12–13
