Implicit function theorem

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

This is an old revision of this page, as edited by Pokipsy76 (talk | contribs) at 14:37, 6 January 2010 (Statement of the theorem: +regularity). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In the branch of mathematics called multivariable calculus, the implicit function theorem is a tool which allows relations to be converted to functions. It does this by representing the relation as the graph of a function. There may not be a single function whose graph is the entire relation, but there may be such a function on a restriction of the domain of the relation. The implicit function theorem gives a sufficient condition to ensure that there is such a function.

The theorem states that if the equation R(x, y) = 0 (an implicit function) satisfies some mild conditions on its partial derivatives, then one can in principle solve this equation for y, at least over some small interval. Geometrically, the locus defined by R(x,y) = 0 will overlap locally with the graph of a function y = f(x) (an explicit function, see article on implicit functions).

First example

Figure: The unit circle can be specified as the level curve $f(x,y)=1$ of the function $f(x,y)=x^{2}+y^{2}$. Around point A, $y$ can be expressed as a function $y(x)$, specifically $g_{1}(x)=\sqrt{1-x^{2}}$. No such function exists around point B.

If we define the function $f(x,y)=x^{2}+y^{2}$, then the equation $f(x,y)=1$ cuts out the unit circle as the level set $\{(x,y) \mid f(x,y)=1\}$. There is no way to represent the unit circle as the graph of a function of one variable $y=g(x)$ because for each choice of $x\in(-1,1)$ there are two choices of $y$, namely $\pm\sqrt{1-x^{2}}$.

However, it is possible to represent part of the circle as the graph of a function of one variable. If we let $g_{1}(x)=\sqrt{1-x^{2}}$ for $-1<x<1$, then the graph of $y=g_{1}(x)$ provides the upper half of the circle. Similarly, if $g_{2}(x)=-\sqrt{1-x^{2}}$, then the graph of $y=g_{2}(x)$ gives the lower half of the circle.

The purpose of the implicit function theorem is to tell us the existence of functions like $g_{1}(x)$ and $g_{2}(x)$, even in situations where we cannot write down explicit formulas. It guarantees that $g_{1}(x)$ and $g_{2}(x)$ are differentiable, and it even works in situations where we do not have a formula for $f(x,y)$.
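
The theorem itself only asserts that such functions exist; it does not say how to compute them. As a rough numerical illustration (not part of the theorem), the sketch below evaluates an implicitly defined function pointwise with Newton's method in the y variable. The helper name `solve_implicit` and the quintic relation are assumptions chosen for this example only.

```python
def solve_implicit(f, df_dy, x, y0, tol=1e-12, max_iter=50):
    """Solve f(x, y) = 0 for y near y0 by Newton's method, holding x fixed."""
    y = y0
    for _ in range(max_iter):
        step = f(x, y) / df_dy(x, y)  # fails if df/dy is zero, as at (1, 0) on the circle
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")


# Unit circle written as f(x, y) = x^2 + y^2 - 1 = 0.
def circle(x, y):
    return x * x + y * y - 1.0


def circle_dy(x, y):
    return 2.0 * y


# The starting guess selects the branch: near +1 gives g1, near -1 gives g2.
upper = solve_implicit(circle, circle_dy, 0.6, 1.0)
lower = solve_implicit(circle, circle_dy, 0.6, -1.0)


# A relation with no elementary formula for y in terms of x: y^5 + y - x = 0.
def quintic(x, y):
    return y**5 + y - x


def quintic_dy(x, y):
    return 5.0 * y**4 + 1.0


y_at_2 = solve_implicit(quintic, quintic_dy, 2.0, 1.5)
```

For the circle the results can be compared with the explicit formulas (the two branches at x = 0.6 are 0.8 and -0.8). For the quintic there is no such formula, but the value at x = 2 can still be checked by substitution, since y = 1 satisfies the equation there.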

Statement of the theorem

Let $f : \mathbb{R}^{n+m} \to \mathbb{R}^{m}$ be a continuously differentiable function. We think of $\mathbb{R}^{n+m}$ as the Cartesian product $\mathbb{R}^{n} \times \mathbb{R}^{m}$, and we write a point of this product as $(\mathbf{x},\mathbf{y}) = (x_1, \ldots, x_n, y_1, \ldots, y_m)$. The function $f$ is the given relation. Our goal is to construct a function $g : \mathbb{R}^{n} \to \mathbb{R}^{m}$ whose graph $(\mathbf{x}, g(\mathbf{x}))$ is precisely the set of all $(\mathbf{x}, \mathbf{y})$ such that $f(\mathbf{x}, \mathbf{y}) = 0$.

As noted above, this may not always be possible. As such, we will fix a point $(\mathbf{a},\mathbf{b}) = (a_1, \ldots, a_n, b_1, \ldots, b_m)$ which satisfies $f(\mathbf{a}, \mathbf{b}) = 0$, and we will ask for a $g$ that works near the point $(\mathbf{a}, \mathbf{b})$. In other words, we want an open set $U$ of $\mathbb{R}^{n}$, an open set $V$ of $\mathbb{R}^{m}$, and a function $g : U \to V$ such that the graph of $g$ satisfies the relation $f = 0$ on $U \times V$. In symbols,

$$\{(\mathbf{x}, g(\mathbf{x}))\} = \{(\mathbf{x}, \mathbf{y}) \mid f(\mathbf{x}, \mathbf{y}) = 0\} \cap (U \times V).$$

To state the implicit function theorem, we need the Jacobian, also called the differential or total derivative, of $f$. This is the matrix of partial derivatives of $f$. Abbreviating $(a_1, \ldots, a_n, b_1, \ldots, b_m)$ to $(\mathbf{a}, \mathbf{b})$, the Jacobian matrix is

$$
(Df)(\mathbf{a},\mathbf{b}) = \left[\begin{array}{ccc|ccc}
\frac{\partial f_1}{\partial x_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_1}{\partial x_n}(\mathbf{a},\mathbf{b}) & \frac{\partial f_1}{\partial y_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_1}{\partial y_m}(\mathbf{a},\mathbf{b}) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_m}{\partial x_n}(\mathbf{a},\mathbf{b}) & \frac{\partial f_m}{\partial y_1}(\mathbf{a},\mathbf{b}) & \cdots & \frac{\partial f_m}{\partial y_m}(\mathbf{a},\mathbf{b})
\end{array}\right] = \begin{bmatrix} X & | & Y \end{bmatrix}
$$

where $X$ is the matrix of partial derivatives in the $x$'s and $Y$ is the matrix of partial derivatives in the $y$'s. The implicit function theorem says that if $Y$ is an invertible matrix, then there are $U$, $V$, and $g$ as desired. Writing all the hypotheses together gives the following statement.

Let $f : \mathbb{R}^{n+m} \to \mathbb{R}^{m}$ be a continuously differentiable function, and let $\mathbb{R}^{n+m}$ have coordinates $(\mathbf{x}, \mathbf{y})$. Fix a point $(a_1, \ldots, a_n, b_1, \ldots, b_m) = (\mathbf{a}, \mathbf{b})$ with $f(\mathbf{a}, \mathbf{b}) = \mathbf{c}$, where $\mathbf{c} \in \mathbb{R}^{m}$. If the matrix $Y = \left[\frac{\partial f_i}{\partial y_j}(\mathbf{a}, \mathbf{b})\right]$ is invertible, then there exist an open set $U$ containing $\mathbf{a}$, an open set $V$ containing $\mathbf{b}$, and a unique continuously differentiable function $g : U \to V$ such that
$$\{(\mathbf{x}, g(\mathbf{x}))\} = \{(\mathbf{x}, \mathbf{y}) \mid f(\mathbf{x}, \mathbf{y}) = \mathbf{c}\} \cap (U \times V).$$
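
To make the hypothesis concrete, the sketch below checks the invertibility of the block of partial derivatives in the y's for a small system with n = 1 and m = 2. The system, the two sample points, and the helper names `y_block` and `det2` are assumptions made up for this illustration; the partial derivatives are written out by hand.

```python
def f(x, y1, y2):
    """Two equations in one x-variable and two y-variables (n = 1, m = 2)."""
    return (x * x + y1 * y1 + y2 * y2 - 3.0,
            x + y1 - y2 - 1.0)


def y_block(x, y1, y2):
    """Partial derivatives of (f1, f2) with respect to (y1, y2)."""
    return [[2.0 * y1, 2.0 * y2],
            [1.0, -1.0]]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Both points satisfy f = (0, 0), so both lie on the solution set.
good_point = (1.0, 1.0, 1.0)
bad_point = (-1.0, 1.0, -1.0)

det_good = det2(y_block(*good_point))  # nonzero: the theorem applies here
det_bad = det2(y_block(*bad_point))    # zero: the theorem gives no conclusion here
```

At the first point the determinant is nonzero, so the theorem guarantees that (y1, y2) can be written locally as a continuously differentiable function of x. At the second point the determinant vanishes. Because the condition is only sufficient, this does not by itself prove that no such function exists; the theorem is simply silent there.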

Regularity

It can be proved that if, in addition, $f$ is $k$ times continuously differentiable in $U \times V$, then the explicit function $g$ is also $k$ times continuously differentiable in $U$, and

$$\frac{\partial g}{\partial x_j}(\mathbf{x}) = -\left(\frac{\partial f}{\partial \mathbf{y}}(\mathbf{x}, g(\mathbf{x}))\right)^{-1} \frac{\partial f}{\partial x_j}(\mathbf{x}, g(\mathbf{x})).$$

The circle example

Let us go back to the example of the unit circle. In this case $n=m=1$ and $f(x,y)=x^{2}+y^{2}-1$. The matrix of partial derivatives is just a 1×2 matrix, given by

$$(Df)(a,b) = \begin{bmatrix} \frac{\partial f}{\partial x}(a,b) & \frac{\partial f}{\partial y}(a,b) \end{bmatrix} = \begin{bmatrix} 2a & 2b \end{bmatrix}.$$

Thus, here, $Y$ is just a number; the linear map defined by it is invertible iff $b\neq 0$. By the implicit function theorem we see that we can write the circle in the form $y=g(x)$ for all points where $y\neq 0$. For $(-1,0)$ and $(1,0)$ we run into trouble, as noted before.
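
As a worked check, the derivative formula from the Regularity section can be applied to the circle at any point with nonzero y-coordinate, with both partial derivatives evaluated on the circle. The result agrees with differentiating the explicit upper branch directly:

```latex
\begin{align*}
g'(x) &= -\left(\frac{\partial f}{\partial y}(x, g(x))\right)^{-1}
          \frac{\partial f}{\partial x}(x, g(x))
       = -\frac{2x}{2\,g(x)} = -\frac{x}{g(x)}, \\
\frac{d}{dx}\sqrt{1-x^{2}} &= -\frac{x}{\sqrt{1-x^{2}}} = -\frac{x}{g_{1}(x)}.
\end{align*}
```

The expression is undefined exactly where g(x) = 0, that is, at the two points (-1, 0) and (1, 0) where the theorem does not apply.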

Application: change of coordinates

Suppose we have an m-dimensional space, parametrised by a set of coordinates $(x_{1},\ldots,x_{m})$. We can introduce a new coordinate system $(x'_{1},\ldots,x'_{m})$ by supplying m functions $h_{1},\ldots,h_{m}$. These functions allow us to calculate the new coordinates $(x'_{1},\ldots,x'_{m})$ of a point, given the point's old coordinates $(x_{1},\ldots,x_{m})$, using $x'_{1}=h_{1}(x_{1},\ldots,x_{m}),\ldots,x'_{m}=h_{m}(x_{1},\ldots,x_{m})$. One might want to verify whether the opposite is possible: given coordinates $(x'_{1},\ldots,x'_{m})$, can we 'go back' and calculate the same point's original coordinates $(x_{1},\ldots,x_{m})$? The implicit function theorem will provide an answer to this question. The (new and old) coordinates $(x'_{1},\ldots,x'_{m},x_{1},\ldots,x_{m})$ are related by $f=0$, with

$$f(x'_{1},\ldots,x'_{m},x_{1},\ldots,x_{m}) = (h_{1}(x_{1},\ldots,x_{m})-x'_{1},\ldots,h_{m}(x_{1},\ldots,x_{m})-x'_{m}).$$

Now the Jacobian matrix of $f$ at a certain point $(a,b)$ [where $a=(x'_{1},\ldots,x'_{m})$, $b=(x_{1},\ldots,x_{m})$] is given by

$$
(Df)(a,b) = \begin{bmatrix}
-1 & \cdots & 0 & \frac{\partial h_{1}}{\partial x_{1}}(b) & \cdots & \frac{\partial h_{1}}{\partial x_{m}}(b) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & -1 & \frac{\partial h_{m}}{\partial x_{1}}(b) & \cdots & \frac{\partial h_{m}}{\partial x_{m}}(b)
\end{bmatrix} = \begin{bmatrix} -1_{m} & | & J \end{bmatrix},
$$

where $1_{m}$ denotes the $m\times m$ identity matrix, and $J$ is the $m\times m$ matrix of partial derivatives, evaluated at $(a,b)$. (In the above, these blocks were denoted by $X$ and $Y$. As it happens, in this particular application of the theorem, neither matrix depends on $a$.) The implicit function theorem now states that we can locally express $(x_{1},\ldots,x_{m})$ as a function of $(x'_{1},\ldots,x'_{m})$ if $J$ is invertible. Demanding that $J$ is invertible is equivalent to $\det J\neq 0$, thus we see that we can go back from the primed to the unprimed coordinates if the determinant of the Jacobian $J$ is non-zero. This statement is also known as the inverse function theorem.

Example: polar coordinates

As a simple application of the above, consider the plane, parametrised by polar coordinates $(R,\theta)$. We can go to a new coordinate system (Cartesian coordinates) by defining functions $x(R,\theta)=R\cos\theta$ and $y(R,\theta)=R\sin\theta$. This makes it possible, given any point $(R,\theta)$, to find corresponding Cartesian coordinates $(x,y)$. When can we go back and convert Cartesian into polar coordinates? By the previous example, we need $\det J\neq 0$, with

$$
J = \begin{bmatrix}
\frac{\partial x(R,\theta)}{\partial R} & \frac{\partial x(R,\theta)}{\partial \theta} \\
\frac{\partial y(R,\theta)}{\partial R} & \frac{\partial y(R,\theta)}{\partial \theta}
\end{bmatrix} = \begin{bmatrix}
\cos\theta & -R\sin\theta \\
\sin\theta & R\cos\theta
\end{bmatrix}.
$$

Since $\det J=R$, the conversion back to polar coordinates is only possible if $R\neq 0$. This is a consequence of the fact that at the origin, polar coordinates don't exist: the value of $\theta$ is not well-defined.
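
The following sketch illustrates the same point numerically, under the assumption that a local inverse is computed by Newton's method from a nearby starting guess. The function names (`to_cartesian`, `jacobian`, `to_polar_newton`) are hypothetical helpers for this example. The Jacobian is the matrix J above, and the iteration refuses to proceed when its determinant is zero.

```python
import math


def to_cartesian(r, theta):
    """Forward map from polar to Cartesian coordinates."""
    return r * math.cos(theta), r * math.sin(theta)


def jacobian(r, theta):
    """Matrix J of partial derivatives of (x, y) with respect to (R, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def to_polar_newton(x, y, r0, theta0, tol=1e-10, max_iter=50):
    """Locally invert to_cartesian near the guess (r0, theta0) by Newton's method."""
    r, theta = r0, theta0
    for _ in range(max_iter):
        fx, fy = to_cartesian(r, theta)
        ex, ey = fx - x, fy - y          # residual of the forward map
        j = jacobian(r, theta)
        d = det2(j)
        if d == 0.0:
            raise ZeroDivisionError("Jacobian is singular (R = 0)")
        # Solve J * (dr, dtheta) = (ex, ey) with Cramer's rule.
        dr = (ex * j[1][1] - j[0][1] * ey) / d
        dtheta = (j[0][0] * ey - j[1][0] * ex) / d
        r, theta = r - dr, theta - dtheta
        if abs(dr) + abs(dtheta) < tol:
            return r, theta
    raise RuntimeError("Newton iteration did not converge")


# Round trip: start from polar (2, 0.5), convert, and recover it from a nearby guess.
x, y = to_cartesian(2.0, 0.5)
r, theta = to_polar_newton(x, y, 1.8, 0.4)

det_away_from_origin = det2(jacobian(2.0, 0.5))  # equals R = 2 up to rounding
det_at_origin = det2(jacobian(0.0, 0.5))         # zero: no local inverse guaranteed
```

The inverse recovered this way is only local: a different starting guess (for example an angle shifted by a full turn) can converge to a different pair of polar coordinates for the same Cartesian point.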

Generalizations

Banach space version

Based on the inverse function theorem in Banach spaces, it is possible to extend the implicit function theorem to Banach space valued mappings.

Let $X$, $Y$, $Z$ be Banach spaces. Let the mapping $f : X\times Y\to Z$ be continuously Fréchet differentiable. If $(x_{0},y_{0})\in X\times Y$, $f(x_{0},y_{0})=0$, and $y\mapsto Df(x_{0},y_{0})(0,y)$ is a Banach space isomorphism from $Y$ onto $Z$, then there exist neighbourhoods $U$ of $x_{0}$ and $V$ of $y_{0}$ and a Fréchet differentiable function $g : U\to V$ such that $f(x,g(x))=0$ and $f(x,y)=0$ if and only if $y=g(x)$, for all $(x,y)\in U\times V$.

Implicit functions from non-differentiable functions

Various forms of the implicit function theorem exist for the case when the function $f$ is not differentiable. It is standard that local strict monotonicity suffices in one dimension. The following more general form was proven by Kumagai, based on an observation by Jittorntrum.

Consider a continuous function $f : \mathbb{R}^{n}\times \mathbb{R}^{m}\to \mathbb{R}^{n}$ such that $f(x_{0},y_{0})=0$. If there exist open neighbourhoods $A\subset \mathbb{R}^{n}$ and $B\subset \mathbb{R}^{m}$ of $x_{0}$ and $y_{0}$, respectively, such that, for all $y\in B$, $f(\cdot,y) : A\to \mathbb{R}^{n}$ is locally one-to-one, then there exist open neighbourhoods $A_{0}\subset \mathbb{R}^{n}$ and $B_{0}\subset \mathbb{R}^{m}$ of $x_{0}$ and $y_{0}$, such that, for all $y\in B_{0}$, the equation

$$f(x,y)=0$$

has a unique solution

$$x=g(y)\in A_{0},$$

where $g$ is a continuous function from $B_{0}$ into $A_{0}$.
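
A one-dimensional illustration of this situation (n = m = 1) is sketched below. The function f is an assumption picked for the example: it is continuous and strictly increasing in x for every y, hence one-to-one in x, but it is not differentiable at x = 0, so the classical theorem cannot be applied there. Bisection, rather than any derivative-based method, is used to evaluate the implicit function; the bracket [-10, 10] is likewise an arbitrary choice.

```python
def f(x, y):
    """Continuous and strictly increasing in x, with a kink at x = 0."""
    return 2.0 * x + abs(x) - y


def g(y, lo=-10.0, hi=10.0, tol=1e-12):
    """Solve f(x, y) = 0 for x in [lo, hi] by bisection (no derivatives needed)."""
    if f(lo, y) > 0.0 or f(hi, y) < 0.0:
        raise ValueError("no sign change of f on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, y) < 0.0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return 0.5 * (lo + hi)


x_pos = g(3.0)    # on the branch x >= 0, where f = 3x - y
x_neg = g(-2.0)   # on the branch x < 0, where f = x - y
x_zero = g(0.0)   # at the kink
```

Solving by hand gives g(y) = y/3 for y >= 0 and g(y) = y for y < 0, which is continuous but has a corner at y = 0. This is consistent with the statement above, which promises continuity of g and nothing more.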

Notes

  1. L. D. Kudryavtsev, "Implicit function," in Encyclopedia of Mathematics, M. Hazewinkel, Ed. Dordrecht, The Netherlands: Kluwer, 1990.
  2. S. Kumagai, "An implicit function theorem: Comment," Journal of Optimization Theory and Applications, 31(2):285-288, June 1980.
  3. K. Jittorntrum, "An Implicit Function Theorem," Journal of Optimization Theory and Applications, 25(4), 1978.