%% 27. $\hbox{Pad\'e}$ approximation ATAPformats %% % % Suppose $f$ is a function with a Taylor series % $$f(z) = c_0 + c_1 z + c_2 z^2 + \cdots \eqno (27.1)$$ % at $z=0$.\footnote{This chapter is % adapted from Gonnet, G\"uttel and Trefethen .} % Whether or not the series converges doesn't matter in this % chapter (it is enough for $f$ to be a {\em formal power series}). For any % integer $m\ge 0$, the {\em \boldmath degree $m$ Taylor approximant} to % $f$ is the unique polynomial $p_m\in{\cal P}_m$ that matches the series % as far as possible, which will be at least through degree $m$, % $$(f - p_m)(z) = O(z^{m+1}). \eqno (27.2)$$ % Pad\'e approximation is the % generalization of this idea to rational approximation. For any integers % $m,n \ge 0$, $r\in {\cal R}_{mn}$ is the {\em type $(m,n)$ % Pad\'e approximant} to $f$ if their Taylor series at $z=0$ agree as far % as possible: % $$(f- r_{mn})(z) = O(z^{\hbox{\footnotesize\rm maximum}}). % \eqno (27.3)$$ % In these conditions the big O'' notation has its usual precise meaning. % Equation (27.2) asserts, for example, that the first nonzero % term in the Taylor series for $f-p_m$ is of order $z^k$ for some % $k\ge m+1$, but not necessarily $k=m+1$. % %% % % Pad\'e approximation can be viewed as the special case of % rational interpolation in which the interpolation points coalesce at a % single point. % Thus there is a close analogy between the mathematics of % the last chapter and this one, though some significant differences too % that spring from the fact that the powers $z^0, z^1, \dots$ are % ordered whereas the roots of unity are all equal in status. % We shall see that a key property is that % $r_{mn}$ exists and is unique. Generically, it matches % $f$ through term $m+n$, % $$(f- r_{mn})(z) = O(z^{m+n+1}), % \eqno (27.4)$$ % but in some cases, the matching will be to lower or higher order. % %% % For example, the type $(1,1)$ $\hbox{Pad\'e}$ approximant to $e^z$ is % $(1+\textstyle{1\over 2}z)/(1-\textstyle{1\over 2}z)$, as we can verify % numerically with the Chebfun command |padeapprox|: %% % \vskip -2em [r,a,b] = padeapprox(@exp,1,1); fprintf(' Numerator coeffs: %19.15f %19.15f\n',a) fprintf(' Denominator coeffs: %19.15f %19.15f\n',b) %% % The algorithm used by |padeapprox| will be discussed in the second half % of this chapter. %% % % The early history of Pad\'e approximation is hard to disentangle because % every continued fraction can be regarded as a Pad\'e approximant (Exercise 27.7), % and continued fractions got a lot of attention in past centuries. % For example, Gauss derived the idea of Gauss quadrature from a continued % fraction that amounts to a Pad\'e approximant to the function % $\log((z+1)/(z-1))$ at the point $z=\infty$ [Gauss 1814, Takahasi \& Mori % 1971, Trefethen 2008]. Ideas related to Pad\'e approximation have been credited to % Anderson (1740), % Lambert (1758) and Lagrange (1776), and contributions were certainly % made by Cauchy  and Jacobi . The study of % Pad\'e approximants began to come closer to the current form with the % papers of Frobenius  and % Pad\'e himself , who was a student of % Hermite and published many articles % after his initial thesis on the subject. % Throughout the early literature, and also in the more recent era, % much of the discussion of Pad\'e approximation % is connected with continued fractions, determinants, and % recurrence relations, but here we shall follow a more robust matrix formulation. % %% % % We begin with a theorem about existence, uniqueness, and % characterization, analogous to Theorem 24.1 for rational best % approximation on an interval. There, the key idea was to count points of % equioscillation of the error function $f-r$. Here, we count how many % initial terms of the Taylor series of $f-r$ are zero. The arguments are % similar, and again, everything depends on the integer known as the % defect. Recall that if $r\in {\cal R}_{mn}$ is of exact type $(\kern % 1pt\mu,\nu)$ for some $\mu\le m$, $\nu \le n$, then the defect of $r$ % with respect to ${\cal R}_{mn}$ is $d = \min\{m-\mu, n-\nu\}\ge 0$, with % $\mu=-\infty$ and $d=n$ in the special case $r=0$. % %% % % {\em {\bf Theorem 27.1: Characterization of Pad\'e approximants.} % For each $m,n\ge 0$, a function $f$ has a unique Pad\'e approximant % $r_{mn} \in {\cal R}_{mn}$ as defined by the condition $(27.3)$, and a % function $r \in {\cal R}_{mn}$ is equal to $r_{mn}$ if and only if % $(\kern .5pt f-r)(z) = O(z^{m+n+1-d\kern .5pt})$, where $d$ is the defect % of $r$ in ${\cal R}_{mn}$.} % %% % % {\em Proof.} The first part of the argument is analogous to parts 2 and % 4 of the proof of Theorem 24.1: we show that if $r$ satisfies $(\kern % .5pt f-r)(z) = O(z^{m+n+1-d\kern .5pt})$, then $r$ is the unique type % $(m,n)$ Pad\'e approximant to $f$ as defined by the condition (27.3). % Suppose then that $(\kern .5pt f-r)(z) = O(z^{m+n+1-d\kern .5pt})$ and % that $(\kern .5pt f-\tilde r)(z) = O(z^{m+n+1-d\kern .5pt})$ also for % some possibly different function $\tilde r\in {\cal R}_{mn}$. Then % $(r-\tilde r)(z) = O(z^{m+n+1-d\kern .5pt})$. However, $r-\tilde r$ is of % type $(m+n-d, 2n-d\kern .7pt )$, so it can only have $m+n-d$ zeros at % $z=0$ unless it is identically zero. This implies $\tilde r = r$. % %% % % The other half of the proof is to show that there exists a % function $r$ with $(\kern .5pt f-r)(z) = O(z^{m+n+1-d\kern .5pt})$. This % part of the argument makes use of linear algebra and is given in the two % paragraphs following (27.8). % $~\hbox{\vrule width 2.5pt depth 2.5 pt height 3.5 pt}$ % %% % Let us consider some examples to illustrate the characterization of % Theorem 27.1. First, a generic case, we noted above that the type (1,1) % $\hbox{Pad\'e}$ approximant to $e^z$ is $r_{11}(z) = (1+\textstyle{1\over % 2}z)/(1-\textstyle{1\over 2}z)$. The defect of $r_{11}$ in ${\cal % R}_{11}$ is $d=0$, and we have % $$r_{11}(z) - e^z = % \textstyle{1\over 12}z^3 + \textstyle{1\over 12} z^4 + \cdots = O(z^3).$$ % Since $m+n+1-d = 3$, this confirms that $r_{11}$ is the $\hbox{Pad\'e}$ % approximant. %% % On the other hand, if $f$ is even or odd, we soon % find ourselves in the non-generic case. Suppose we consider % $$f(z) = \cos(z) = 1 - \textstyle{1\over 2} z^2 + \textstyle{1\over 24} % z^4 - \cdots$$ % and the rational approximation % $$r(z) = 1 - \textstyle{1\over 2} z^2$$ % of exact type $(2,0)$. This gives % $$(f-r)(z) = O(z^4), ~ \ne O(z^5) .$$ % By Theorem 27.1, this implies that $r$ is the $\hbox{Pad\'e}$ % approximation to $f$ for four different choices of $(m,n)$: $(2,0)$, % $(3,0)$, $(2,1)$, and $(3,1)$. With $(m,n)= (2,0)$, for example, the % defect is $d=0$ and we need $(\kern .5pt f-r)(z) = O(z^{2+0+1-0}) = % O(z^3)$, and with $(m,n)= (3,1)$, the defect is $d=1$ and we need $(\kern % .5pt f-r)(z) = O(z^{3+1+1-1}) = O(z^4)$. Both matching conditions are % satisfied, so $r$ is the $\hbox{Pad\'e}$ approximant of both types % $(2,0)$ and $(3,1)$. Similarly it is also the $\hbox{Pad\'e}$ approximant % of types $(3,0)$ and $(2,1)$, but for no other values of $(m,n)$. %% % % This example involving an even function suggests the general situation. % In analogy to the Walsh table of Chapter 24, the {\em Pad\'e table} of a % function $f$ consists of the set of its Pad\'e approximants for various % $m,n\ge 0$ laid out in an array, with $m$ along the horizontal and $n$ % along the vertical: % $$\pmatrix{ % r_{00} & r_{10}& r_{20} & \dots \cr % r_{01} & r_{11}& r_{21} & \dots \cr % r_{02} & r_{12}& r_{22} & \dots \cr % \vdots & \vdots & \vdots & \ddots}.$$ % The idea of the Pad\'e table was proposed by Pad\'e , who called it % a table of approximate rational fractions$\dots$ analogous to the % multiplication table, unbounded to the right and below.'' Like the Walsh % table for real rational approximation on an interval, % the Pad\'e table breaks into square blocks of degenerate entries, % again as a consequence of the equioscillation-type characterization % [Trefethen 1987]: % %% % % {\em {\bf Theorem 27.2. Square blocks in the Pad\'e table.} The Pad\'e % table for any function $f$ breaks into precisely square blocks containing % identical entries. (If $f$ is rational, one of these will be infinite in % extent.) The only exception is that if an entry $r=0$ appears in the % table, then it fills all of the columns to the left of some fixed index % $m = m_0$.} % %% % _Proof._ % Essentially the same as the proof of Theorem 24.2. % $~\hbox{\vrule width 2.5pt depth 2.5 pt height 3.5 pt}$ %% % As in the case of best real approximation on an interval discussed % in Chapter 24, square blocks and defects have a variety of % consequences for $\hbox{Pad\'e}$ approximants. In particular, % the $\hbox{\em Pad\'e approximation operator}$, which maps % Taylor series $f$ to their $\hbox{Pad\'e}$ approximants % $r_{mn}$, is continuous at $f$ with respect a norm based on % Taylor coefficients if and only if $r_{mn}$ has defect $d=0$. % Another related result is that best supremum-norm approximations % on intervals $[-\varepsilon,\varepsilon]$ converge to the % $\hbox{Pad\'e}$ approximant as $\varepsilon\to 0$ if $d=0$, but % not, in general, if $d> 0$. % These results come from [Trefethen & Gutknecht 1985], with % earlier partial results due to Walsh; Werner and Wuytak; and Chui, Shisha % and Smith. %% % At this point we have come a good way into the theory of $\hbox{Pad\'e}$ % approximation without doing any algebra. To finish the job, and to lead % into algorithms, it is time to introduce vectors and matrices, closely % analogous to those of the last chapter. %% % % Given a function $f$ with Taylor coefficients $\{c_j\}$, suppose we look for % a representation of the Pad\'e approximant $r_{mn}$ as a quotient $r=p/q$ % with $p\in {\cal P}_m$ and $q\in {\cal P}_n$. Equation (27.4) is % nonlinear, but multiplying through by the denominator suggests the linear % condition % $$p(z) = f(z) q(z) + O(z^{m+n+1}), % \eqno (27.5)$$ % just as (26.4) led to (26.5). % To express this equation in matrix form, suppose that $p$ and $q$ are % represented by coefficient vectors $\bf a$ and ${\kern .5pt \bf b}$: % $${\bf a} = \pmatrix{a_0\cr a_1 \cr \vdots \cr a_m}, \qquad % {\kern .5pt \bf b} = \pmatrix{b_0\cr b_1 \cr \vdots \cr b_n},$$ % $$p(z) = \sum_{k=0}^m a_kz^k, \qquad % q(z) = \sum_{k=0}^n b_kz^k.$$ % Then (27.5) can be written as an equation involving a Toeplitz matrix of % Taylor coefficients of $f$, % that is, a matrix with constant entries along each diagonal. For $m\ge % n$, the equation looks like this: % \pmatrix{a_0\cr\noalign{\vskip 2pt} a_1 \cr \vdots \cr a_n \cr \vdots \cr % a_m \rule[-.6em]{0pt}{1em}\cr \hline \cr a_{m+1}\cr \vdots % \cr a_{m+n}} = % \pmatrix{c_0\cr\noalign{\vskip 2pt} % c_1 & c_0 \cr % \vdots & \vdots & \ddots\cr % c_n & c_{n-1} & \dots & c_0 \cr % \vdots & \vdots & & \vdots \cr % c_m & c_{m-1} & \dots & c_{m-n} \rule[-.6em]{0pt}{1em}\cr % \hline\cr % c_{m+1} & c_{m} & \dots & c_{m+1-n} \cr % \vdots & \vdots & \ddots & \vdots \cr % c_{m+n} & c_{m+n-1} & \dots & c_m % } % \pmatrix{b_0\cr b_1 \cr \vdots \cr b_n} \eqno (27.6) % coupled with the condition % $$a_{m+1} = \cdots = a_{m+n} = 0. \eqno (27.7)$$ % In other words, ${\kern .5pt \bf b}$ must be a null vector of the % $n\times(n+1)$ matrix displayed below the horizontal line. If $m %% % % One solution to (27.6)--(27.7) would be${\bf a} = {\bf 0}$and${\kern % .5pt \bf b} = {\bf 0}$, corresponding to the useless candidate$r = 0/0$. % However, an$n\times (n+1)$matrix always has a nonzero null vector, % $$C \kern .7pt {\bf b\kern .5pt} = {\bf 0}, \quad % {\bf b\kern .5pt}\ne {\bf 0},$$ % and once${\kern .5pt \bf b}$is chosen, the coefficients$a_0,\dots , % a_m$of$p$can be obtained by multiplying out the matrix-vector product % above the line. Thus there is always a solution to (27.5) with$q\ne 0$. % %% % % If$b_0\ne 0$, then dividing (27.5) by$q$shows that$p/q$is a solution % to (27.4). Some nonzero null vectors${\kern .5pt \bf b}$, however, may % begin with one or more zero components. Suppose that${\kern .5pt \bf % b}$is a nonzero null vector with$b_0 = b_1 = \cdots = b_{\sigma-1} = 0$% and$b_\sigma\ne 0$for some$\sigma \ge 1$. Then the corresponding % vector$\bf a$will also have$a_0 = a_1 = \cdots = a_{\sigma-1} = 0$% (and$a_\sigma$might be zero or nonzero). It follows from the Toeplitz % structure of (27.6) that we can shift both$\bf a$and${\kern .5 pt \bf % b}$upward by$\sigma$positions to obtain new vectors${\bf \tilde a} = % (a_\sigma,\dots,a_m,0,\dots,0)^T$and${\kern .5 pt \bf \tilde b} = % (b_\sigma,\dots,b_n,0,\dots,0)^T$while preserving the quotient$r = % \tilde p/\tilde q = p/q$. Thus$r$has defect$d\ge \sigma$, and % equations (27.6)--(26.7) are still satisfied except that$\tilde % a_{m+n-\sigma+1},\dots, \tilde a_{m+n}$may no longer be zero, implying %$(f-r)(z) = O(z^{m+n+1-\sigma})$. Thus$(f-r)(z) = O(z^{m+n+1-d})$, and % this completes the proof of Theorem 27.1. % %% % We have just shown that any nonzero null vector of the matrix$C$of % (27.8) gives a function$r$that satisfies the condition for a %$\hbox{Pad\'e}$approximation, hence must be the unique approximant % provided by Theorem 27.1. So we have proved the following theorem. %% % % {\em{\bf Theorem 27.3. Linear algebra solution of Pad\'e problem.} Given % a function$f$with Taylor coefficients$\{c_j\}$, let\/${\bf b\kern % .5pt}$be any nonzero null vector of the matrix\/$C $of\/$(27.8)$, % let\/${\bf a}$be the corresponding vector obtained from$(27.6)$, and % let\/$p\in{\cal P}_m$and\/$q\in{\cal P}_n$be the corresponding % polynomials. Then$r_{mn} = p/q$is the unique type$(m,n)$Pad\'e % approximant to$f$.} % %% % We emphasize that the vectors$\bf a$and${\kern .5pt \bf b}$are not % unique in general: a function in${\cal R}_{mn}$may have many % representations$p/q$. Nevertheless, all choices of$\bf a$and${\kern % .5pt \bf b}$lead to the same$r_{mn}$. %% % % From Theorems 27.1--27.3 one can derive a precise characterization of the % algebraic structure of the Pad\'e approximants to a function$f$, as % follows. Let$\hat r$be a rational function of exact type$(\kern % .5pt\mu,\nu)$that is the Pad\'e approximant to$f$in a %$(k+1)\times(k+1)$square block for some$k\ge 0: % \pmatrix{r_{\mu\nu} & \dots & r_{\mu+k,\nu} \cr\noalign{\vskip 3pt} % \vdots & & \vdots \cr \noalign{\vskip 3pt}r_{\mu,\nu+k} & \dots & r_{\mu+k,\nu+k}}. % Write\hat r = \hat p/\hat q$with$\hat p$and$\hat q$of % exact degrees$\mu$and$\nu$. % From Theorem 27.1 we know that the defect$d$must be distributed within % the square block according to this pattern illustrated for %$k=5$: % $$% \llap{\hbox{defect d:~~~}}\pmatrix{ % 0&0&0&0&0&0\cr % 0&1&1&1&1&1\cr % 0&1&2&2&2&2\cr % 0&1&2&3&3&3\cr % 0&1&2&3&4&4\cr % 0&1&2&3&4&5}. \eqno (27.9) %$$ % According to Theorem 27.3, the polynomials$p$and$q$that result % from solving the matrix problem (27.6)--(27.7) must be related to %$\hat p$and$\hat q$by % $$p(z) = \pi(z) \hat p(z), \quad q(z) = \pi(z) \hat q(z)$$ % for some polynomial$\pi$of degree at most$d$. Now let us define the % {\bf deficiency}$\lambda$of$r$as the distance below the % cross-diagonal in the square block, with the following pattern: % $$% \llap{\hbox{deficiency \lambda:~~~}}\pmatrix{ % 0&0&0&0&0&0\cr % 0&0&0&0&0&1\cr % 0&0&0&0&1&2\cr % 0&0&0&1&2&3\cr % 0&0&1&2&3&4\cr % 0&1&2&3&4&5}. \eqno (27.10) %$$ % From Theorem 27.1, we know that in the positions of the block with %$\lambda>0$,$(f-r)(z) = O(z^{m+n+1-\lambda})$,$\ne % O(z^{m+n+2-\lambda})$, for otherwise, the block would be bigger. For % this to happen,$\pi(z)$must be divisible by$z^\lambda$, so that at % least$\lambda$powers of$z$are lost when solutions$p$and$q$from % (27.6) are normalized to$\hat p$and$\hat q$. Since$\pi$may have % degree up to$d$, the number of degrees of freedom remaining in$p$and %$q$is$d-\lambda$, an integer we denote by$\chi$, distributed within % the block according to this pattern: % $$% \llap{\hbox{rank deficiency \chi:~~~}}\pmatrix{ % 0&0&0&0&0&0\cr % 0&1&1&1&1&0\cr % 0&1&2&2&1&0\cr % 0&1&2&2&1&0\cr % 0&1&1&1&1&0\cr % 0&0&0&0&0&0}. \eqno (27.11) %$$ % Thus the dimensionality of the space of vectors$q$is$\chi+1$, and the % same for$p$. We call$\chi$the {\em rank deficiency} of$r$because of % a fact of linear algebra: the rank of the$n\times (n+1)$matrix$C$of % (27.8) must be equal to$n-\chi$, so that its space of null vectors will % have the required dimension$\chi+1$. Some ideas related to these % developments can be found in [Heinig \& Rost 1984]. % %% % We have just outlined a proof of the following theorem, which can be % found in Section 3 of [Gragg 1972]. %% % % {\em{\bf Theorem 27.4. Structure of a Pad\'e approximant.} Let$f$and %$m,n\ge 0$be given, let the type$(m,n)$Pad\'e approximant$r_{mn}$of %$f$have exact type$(\kern .5pt \mu,\nu)$, and let$\hat p$and$\hat % q\ne 0$be polynomials of exact degrees$\mu$and$\nu$with$r_{mn} = % \hat p/\hat q$. Let the defect$d$, deficiency$\lambda$, and rank % deficiency$\chi = d-\lambda$be defined as above. Then the matrix$C $% of$(27.8)$has rank$n-\chi$, and two polynomials$p\in P_m$and$q\in % P_n$satisfy$(27.5)$if and only if % $$% p(z) = \pi(z) \kern .7pt \hat p(z), % \quad q(z) = \pi(z)\kern .7pt \hat q(z) \eqno (27.12) %$$ % for some$\pi\in {\cal P}_d$divisible by$z^\lambda$.} % %% % Although we did not state it in the last chapter, there is % an analogue of this theorem for rational interpolation in % distinct points, proved by Maehly and Witzgall  % and discussed also in [Gutknecht 1990] and %$\hbox{[Pach\'on,}$Gonnet & Van Deun 2011]. %% % With the results of the past few pages to guide us, % we are now prepared to talk about algorithms. %% % % At one level, the computation of$\hbox{Pad\'e}$approximants is trivial, % just a matter of implementing the linear algebra of (27.6)--(27.7). In % particular, in the generic case, the matrix$C $of (27.8) will have full % rank, and so will its$n\times n$submatrix obtained by deleting the % first column. One computational approach to Pad\'e approximation is % accordingly to normalize$\kern .5pt \bf b$by setting$b_0=1$and then % determine the rest of the entries of {\bf b\kern .5pt} by solving a % system of equations involving this square matrix. % %% % % This approach will fail, however, when the square matrix is singular, and % it is nonrobust with respect to rounding errors even when the matrix is % nonsingular. To work with (27.8) robustly, it is a better idea to % normalize by the condition % $$\| {\kern .5pt \bf b} \| = 1,$$ % where$\|\cdot\|$is the vector 2-norm, as in equation (26.6) of % the last chapter. We then % again consider the SVD (singular value decomposition) of$C $, a % factorization $$C = U \Sigma \kern .7pt V^*, \eqno (27.13)$$ where$U$% is$n\times n$and unitary,$V$is$(n+1)\times (n+1)$and unitary, and %$\Sigma$is an$n\times (n+1)$real diagonal matrix with diagonal entries %$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$. % %% % Suppose$\sigma_n>0$. Then$C $has rank$n$, and the final column of %$V$provides a unique nonzero null vector${\kern .5pt \bf b}$of$C$up % to a scale factor. This null vector defines a polynomial$q\in {\cal % P}_n$. Moreover, from (27.11), we know that$(m,n)$must lie on the % outer boundary of its square block in the$\hbox{Pad\'e}$table. If$q$% is divisible by$z^\lambda$for some$\lambda\ge 1$, then$(m,n)$must % lie in the bottom row or right column, and dividing$p$and$q$by %$z^\lambda$brings it to the left column or top row, respectively. A % final trimming of any trailing zeros in$p$or$q$brings them to the % minimal forms$\hat p$and$\hat q$with exact degrees$\mu$and$\nu$. %% % On the other hand, suppose$\sigma_n = 0$, so that the number of zero % singular values of$C$is$\chi\ge 1$. In this case (27.11) tells us % that$(m,n)$must lie in the interior of its square block at a distance %$\chi$from the boundary. Both$m$and$n$can accordingly be reduced by %$\chi$and the process repeated with a new matrix and a new SVD,$\chi$% steps closer to the upper-left$(\kern .5pt \mu,\nu)$corner. After a % small number of such steps (never more than$2+\log_2(d+1)$, where$d$is % the defect), convergence is guaranteed. %% % These observations suggest the following SVD-based algorithm, introduced % in [Gonnet,$\hbox{G\"uttel}$& Trefethen 2012]. %% % % {\em {\sc Algorithm 27.1. Pure Pad\'e approximation in exact arithmetic} % \par\vskip 3pt % Input:$m\ge 0$,$n\ge 0$, and a vector {\bf c} of Taylor coefficients %$c_0,\dots,c_{m+n}$of a function$f$. % \par\vskip 3pt % Output: Polynomials$p(z) = a_0+ \cdots + a_\mu z^\mu$and %$q(z) = b_0+ \cdots + b_\nu z^\nu$,$b_0 = 1$, of the minimal degree type$(m,n)$% Pad\'e approximation of$f$. % \par\vskip 3pt %$1.$If$c_0 = \cdots = c_m = 0$, set$p=0$and$q=1$and stop. % \par\vskip 3pt %$2.$If\/$n=0$, set$p(z) = c_0 + \cdots + c_mz^m$% and$q=1$and go to Step\/~$8$. % \par\vskip 3pt % \hangindent=12pt$3.$Compute the SVD\/$(27.13)$of the$n\times (n+1)$matrix$C$. % Let$\rho\le n$be the number of nonzero singular values. % \par\vskip 3pt %$4.$If$\rho %% % In exact arithmetic, this algorithm produces the unique $\hbox{Pad\'e}$ % approximant $r_{mn}$ in a minimal-degree representation of type $(\kern % 1pt \mu,\nu)$ with $b_0=1$. The greatest importance of Algorithm 27.1, % however, is that it generalizes readily to numerical computation with % rounding errors, or with noisy Taylor coefficients $\{c_j\}$. All one % needs to do is modify the tests for zero singular values or zero % coefficients so as to incorporate a suitable tolerance, such as % $10^{-14}$ for computations in standard 16-digit arithmetic. The % following modified algorithm also comes from [Gonnet, $\hbox{G\"uttel}$ % & Trefethen 2012]. %% % % {\em {\sc Algorithm 27.2. Robust Pad\'e approximation for noisy data or % floating point arithmetic} % \par\vskip 3pt % Input: $m\ge 0$, $n\ge 0$, a vector {\bf c} of Taylor % coefficients $c_0,\dots,c_{m+n}$ % of a function $f$, and a relative tolerance ${\tt tol} \ge 0$. % \par\vskip 3pt % Output: Polynomials $p(z) = a_0+ \cdots + a_\mu z^\mu$ and % $q(z) = b_0+ \cdots + b_\nu z^\nu$, $b_0=1$, % of the minimal degree type $(m,n)$ % Pad\'e approximation of a function close to $f$. % \par\vskip 3pt % \hangindent=12pt $1.$ Rescale $f(z)$ to $f(z/\gamma)$ for some $\gamma>0$ if desired % to get a function whose Taylor coefficients % $c_0,\dots,c_{m+n}$ do not vary too widely. % \par\vskip 3pt % $2.$ Define $\tau = {\tt tol} \cdot \|{\bf c}\|_2$. % If $|c_0| = \cdots = |c_m| \le \tau,$ set $p=0$ and $q=1$ and stop. % \par\vskip 3pt % $3.$ If\/ $n=0$, set $p(z) = c_0 +\cdots + c_mz^m$ and $q = 1$ % and go to Step\/~$7$. % \par\vskip 3pt % \hangindent=12pt $4.$ Compute the SVD\/ $(27.13)$ of the $n\times (n+1)$ matrix $C$. % Let $\rho\le n$ be the number of singular % values of $C$ that are greater than $\tau$. % \par\vskip 3pt % $5.$ % If $\rho %% % % Algorithm 27.2 has been implemented in a Matlab code called {\tt % padeapprox} that is included in the Chebfun distribution, though it does % not involve chebfuns. In its basic usage, {\tt padeapprox} takes as % input a vector {\bf c} of Taylor coefficients together with a % specification of$m$and$n$, with${\tt tol} = 10^{-14}$by default. For % example, following [Gragg 1972], suppose % $$f(z) = {1-z+z^3\over 1-2z+z^2} = % 1 + z + z^2 + 2z^3 + 3z^4 + 4z^5 + \cdots.$$ Then the type$(2,5)$% Pad\'e approximation of$f$comes out with the theoretically correct % exact type$(0,3)$: % %% % \vskip -2em c = [1 1 (1:50)]; [r,a,b] = padeapprox(c,2,5); format short disp('Coefficients of numerator:'), disp(a.') disp('Coefficients of denominator:'), disp(b.') %% % % To illustrate the vital role of the SVD in such a calculation, here % is what happens if robustness is turned off by setting${\tt tol}=0$: % %% % \vskip -2em [r,a,b] = padeapprox(c,2,5,0); disp('Coefficients of numerator:'), disp(a.') disp('Coefficients of denominator:'), disp(b.') %% % We now see longer vectors with enormous entries, on the order of the % inverse of machine precision. The type appears to be$(2,5)$, but % the zeros and poles reveal that this is spurious: %% % \vskip -2em format long g disp('Zeros:'), disp(roots(a(end:-1:1))) disp('Poles:'), disp(roots(b(end:-1:1))) %% % We see that the two zeros are virtually cancelled by two poles that % differ from them by only about$10^{-24}$. Thus this approximant has two % spurious pole-zero pairs, or Froissart doublets, introduced by rounding % errors. Many$\hbox{Pad\'e}$computations over the years have been % contaminated by such effects, and in an attempt to combat them, many % authors have asserted that it is necessary to compute$\hbox{Pad\'e}$% approximations in high precision arithmetic. %% % If |padeapprox| is called with a Matlab function handle$f$rather than a % vector as its first argument, then it assumes$f$is a function analytic % in a neighborhood of the closed unit disk and computes Taylor % coefficients by the Fast Fourier Transform. For example, here is the type %$(2,2)\hbox{Pad\'e}$approximant of$f(z) = \cos(z)$: %% % \vskip -2em format long [r,a,b] = padeapprox(@cos,2,2); disp('Coefficients of numerator:'), disp(a.') disp('Coefficients of denominator:'), disp(b.') %% % One appealing application of |padeapprox| is the numerical computation of % block structure in the$\hbox{Pad\'e}$table for a given function$f$. % For example, here is a table of the computed pair$(\kern .5pt \mu,\nu)$% for each$(m,n)$in the upper-left portion of the %$\hbox{Pad\'e}$table of$\cos(z)$with$0\le m,n\le 8$. One sees the %$2\times 2$block structure resulting from the evenness of$\cos(z)$. %% % \vskip -2em nmax = 8; for n = 0:nmax for m = 0:nmax [r,a,b,mu,nu] = padeapprox(@cos,m,n); fprintf(' (%1d,%1d)',mu,nu) end fprintf('\n') end %% % We can also show the block structure with a color plot, like this: %% % \vskip -2em d = zeros(nmax+2); rand('state',7); h = tan(2*rand(50)-1); h(8,1) = 1; for n = 0:nmax, for m = 0:nmax [r,a,b,mu,nu] = padeapprox(@cos,m,n); d(n+1,m+1) = h(mu+1,nu+1); end, end pcolor(d), axis ij square off %% % \vskip 1pt %% % The pattern of$2\times 2$blocks is broken if we compute a larger segment % of the table, such as$0\le m,n \le 16$: %% % \vskip -2em nmax = 16; d = zeros(nmax+2); for n = 0:nmax, for m = 0:nmax [r,a,b,mu,nu] = padeapprox(@cos,m,n); d(n+1,m+1) = h(mu+1,nu+1); end, end pcolor(d), axis ij square off %% % \vskip 1pt %% % What is going on here is that for$m+n$greater than about$16$, %$\cos(z)$is resolved to machine precision, and the diagonal stripes of % the plot show that |padeapprox| has automatically cut$m$and$n$down to % this level. %% % For an arbitrary'' function$f$with gaps in its Taylor series, the % block structure can be quite intriguing, as illustrated by this example % with$f(z) = 1 + z + z^4 + z^7 + z^{10} + z^{13} + z^{16} + z^{17}$: %% % \vskip -2em nmax = 16; d = zeros(nmax+2); f = @(z) 1+z+z.^4+z.^7+z.^10+z.^13.+z.^16+z.^17; for n = 0:nmax, for m = 0:nmax [r,a,b,mu,nu] = padeapprox(f,m,n); d(n+1,m+1) = h(mu+1,nu+1); end, end pcolor(d), axis ij square off %% % \vskip 1pt %% % Apart from$z^{17}$, these are the initial terms of the % Taylor series of % $$f(z) = {1+z-z^3\over 1-z^3} , \eqno (27.14)$$ % an example for which$\hbox{Pad\'e}$worked out the block structure % for$0\le m\le 7$,$0\le n \le 5\hbox{[Pad\'e 1892],}$showing % vividly a$2\times 2$block, two$3\times 3$blocks, and the beginning % of the infinite block at position$(3,3)$. %% % In this chapter we have discussed how to compute$\hbox{Pad\'e}$% approximants, but not what they are useful for. As outlined in chapter % 23, applications of these approximations typically involve situations % where we know a function in one region of the$z$-plane and wish % to evaluate it in another region that lies near or beyond certain % singularities. The next chapter is devoted to a practical exploration of % such problems. %% % From a theoretical perspective, a central question for more than a % century has been, what sort of convergence of$\hbox{Pad\'e}$% approximants of a function$f$can we expect as$m$and/or$n$increase % to$\infty$? In the simplest case, suppose that$f$is an entire function, % that is, analytic for all$z$. Then for any compact set$K$in the % complex plane, we know that the type$(m,0)\hbox{Pad\'e}$approximants % converge uniformly on$K$as$m\to\infty$, since these are just the % Taylor approximants. One might hope that the same would be true of type %$(m,n_0)$approximants for fixed$n_0\ge 1$as$m\to\infty$, or of type %$(n,n)$approximants as$n\to\infty$, but in fact, pointwise convergence % need not occur in either of these situations. The problem is % that spurious pole-zero pairs, Froissart doublets, may appear at % seemingly arbitrary locations in the plane. As$m$and/or$n$increase, % the doublets get weaker and their effects more localized, but they can % never be guaranteed to go away. (In fact, there exist functions$f$% whose$\hbox{Pad\'e}$approximants have so many spurious poles that the sequence % of$(n,n)$approximants is unbounded for every$z\ne 0$[Perron 1929, % Wallin 1972].) % The same applies if$f$is meromorphic, % i.e., analytic apart from poles, or if it has more complicated % singularities such as branch points. All this is in true in exact % mathematics, and when there are rounding % errors on a computer, the doublets become ubiquitous. %% % Despite these complexities, important theorems have been proved. The % theorem of de Montessus de Ballore  concerns the case of$m\to % \infty$with fixed$n$, guaranteeing convergence in a disk about$z=0$if %$f$has exactly$n$poles there. The Nuttall--Pommerenke theorem % [Nuttall 1970, Pommerenke 1973] concerns$m = n \to \infty$and ensures % convergence for meromorphic$f$not pointwise but _in measure_ or _in % capacity_, these being precise notions that require accuracy over most % of a region as$m,n\to\infty$while allowing for localized % anomalies. This result was powerfully generalized for functions with % branch points by Stahl , who showed that as$n\to \infty$, almost all the % poles of type$(n,n)\hbox{Pad\'e}$approximants line up along branch cuts % that have a property of minimal capacity in the$z^{-1}$-plane. % For discussion of these results see [Baker & Graves-Morris 1996]. % There are also analogous results for multipoint$\hbox{Pad\'e}$approximation and % other forms of rational interpolation. For example, an analogue of the % de Montessus de Ballore theorem for interpolation % as in the last chapter was proved by Saff . %% % As a practical matter, these complexities of convergence are well combatted by % the SVD approach we have described, which can be regarded as a method of % regularization of the$\hbox{Pad\'e}$problem. %% % For reasons explained in the last chapter, the whole discussion of this % chapter has been based on the behavior of a function$f(z)$at$z=0$% rather than this book's usual context of a function$f(x)$on an interval % such as$[-1,1]$. There is an analogue of$\hbox{Pad\'e}$approximation % for$[-1,1]$called %$\hbox{\em Chebyshev--Pad\'e approximation}$, developed by % Hornecker , Maehly , Frankel and Gragg , % Clenshaw and Lord , and Geddes . The idea is % to consider the % analogue of (27.3) for Chebyshev series rather than Taylor series: % $$(f- r_{mn})(x) = O(T_{\hbox{\footnotesize\rm maximum}}(x)). % \eqno (27.14)$$ % (The Maehly version starts from the analogue of the linearized form (27.5).) % In analogy to Theorem 27.1, it turns out that any$r\in {\cal R}_{mn}$% satisfying$(\kern .5pt f-r)(x) = O(T_{m+n+1-d\kern .5pt}(x))$is the % unique$\hbox{Chebyshev--Pad\'e}$approximant according to this definition, but now, % there is no guarantee that such a function$r$exists. % For theoretical details, see [Trefethen & Gutknecht 1987], and for % computations in Chebfun, there is a code called |chebpade|. As of today, % there has not yet been a study of %$\hbox{Chebyshev--Pad\'e}$approximation employing the SVD-based robustness % ideas described in this chapter for %$\hbox{Pad\'e}$approximation. %% % For extensive information about$\hbox{Pad\'e}$approximation, see the % book by Baker and Graves-Morris . However, that % monograph uses an alternative definition according to which a %$\hbox{Pad\'e}$approximant only exists if equation (27.4) can be % satisfied, and in fact the present treatment is mathematically closer to % the landmark review of Gragg , which uses the definition (27.3). %% % % \begin{displaymath} % \framebox[4.7in][c]{\parbox{4.5in}{\vspace{2pt}\sl % {\sc Summary of Chapter 27.} Pad\'e approximation is the generalization % of Taylor polynomials to rational approximation, that is, rational % interpolation at a single point. Pad\'e approximants are characterized by % a kind of equioscillation condition and can be computed % robustly by an algorithm based on the SVD.\ \ The analogue on the % interval$[-1,1]$is known as Chebyshev--Pad\'e % approximation.\vspace{2pt}}} % \end{displaymath} % %% % \small % \parskip=2pt % \par % {\bf Exercise 27.1. Pad\'e approximation of a logarithm.} Show from Theorem 27.1 that the function %$f(z) = \log(1+z)$has Pad\'e approximants$r_{00} = 0$, %$r_{1,0}(z) = z$,$r_{01}(z) = 0$, and$r_{11} = z/(1+\textstyle{1\over 2} z)$. % \par % {\bf Exercise 27.2. Reciprocals and exponentials.} (a) Suppose$r_{mn}$is the % type$(m,n)$Pad\'e approximant to a function$f$with$f(0)\ne 0$. % Show that$1/r_{mn}$is the type$(n,m)$Pad\'e approximant to$1/f$. % (b) As a corollary, state a theorem relating the$(m,n)$and$(n,m)$% Pad\'e approximants of$e^z$. % \par % {\bf Exercise 27.3. Prescribed block structures.} % Devise functions$f$with the following structures in their % Pad\'e tables, and verify your claims numerically by color plots % for$0\le m,n \le 20$. % (a)$3\times 3$blocks everywhere. % (b)$1\times 1$blocks everywhere, except that$r_{11} = r_{21} = r_{12} = r_{22}$. % (c)$1\times 1$blocks everywhere, except that all$r_{mn}$with$n\le 2$are the same. % \par % {\bf Exercise 27.4. Order stars.} The {\em order star} of a function %$f$and its approximation$r$is the set of points$z$in the complex % plane for which$|f(z)| = |r(z)|$. Use the Matlab {\tt contour} command % to plot the order stars of the Pad\'e approximations$r_{11}$,$r_{22}$, %$r_{32}$and$r_{23}$to$e^z$. Comment on the behavior near the origin. % \par % {\bf Exercise 27.5. Nonsingularity and normality.} Show that for a given %$f$and$(m,n)$, the Pad\'e approximation$r_{mn}$has defect$d=0$if and % only if the square matrix obtained by deleting the first column of (27.8) % is nonsingular. (If all such matrices are nonsingular, the Pad\'e table % of$f$is accordingly {\bf normal}, with all its entries distinct.) % \par % {\bf Exercise 27.6. Arbitrary patterns of square blocks?} % Knowing that degeneracies in the Pad\'e table always occupy square blocks, % one might conjecture that, given any tiling of % the quarter-plane$m\ge 0$,$n\ge 0$by % square blocks, there exists a function$f$with this pattern % in its Pad\'e table. Prove that this conjecture is false. % (Hint: consider the case where the first two rows of the table are % filled with$2\times 2$blocks [Trefethen 1984].) % \par % {\bf Exercise 27.7. Continued fractions and the Pad\'e table.} % If$d_0, d_1,\dots $is a sequence of numbers, the {\em continued fraction} % $$d_0 + {d_1z \over \displaystyle 1+{d_2z\over \displaystyle 1 + \cdots}} \eqno (27.15)$$ % is a shorthand for the sequence of rational functions % $$d_0, ~ d_0 + d_1z, ~ d_0 + {d_1z\over \displaystyle 1+d_2 z},~ \dots , \eqno (27.16)$$ % known as {\em convergents} of the continued fraction. % (a) Show that if$d_0,\dots,d_{k-1}\ne 0$and$d_k=0$, then (27.15) defines % a rational function$r(z)$, and determine its exact type. % (b) Assuming$d_k \ne 0$for all$k$, % show that the convergents are the Pad\'e approximants of % types$(0,0), (1,0), (1,1), (2,1), (2,2), \dots\$ of a certain formal % power series. %