%% 20. Caratheodory--Fejer approximation ATAPformats %% % We have seen that Chebyshev interpolants are near-best % approximations in the sense that they come within a factor of at most % $O(\log n)$ of best approximations, usually even closer. For most % applications, this is all one could ask for. But there is another kind % of near-best approximations that are so close to best that for smooth % functions, they are often indistinguishable from best approximations to % machine precision on a computer. These are {\em CF % (Carath\'eodory--Fej\'er) approximations}, introduced by Gutknecht and % Trefethen . Earlier related ideas were proposed in [Darlington % 1970, Elliott 1973, Lam 1972, Talbot 1976], and the theoretical basis % goes back to the early 20th century [Carath\'eodory \& Fej\'er 1911, % Schur 1918].\footnote{Logically, this chapter could have appeared % earlier, perhaps just after Chapter 10. We have deferred it to this % point of the book, however, since the material is relatively difficult % and none of the other chapters depend on it.} %% % Before explaining the mathematics of CF approximants, let us illustrate % the remarkable degree of near-optimality they sometimes achieve. Here is % the optimal $\infty$-norm error in approximation of $f(x) = e^x$ on % $[-1,1]$ by a polynomial of degree 2: %% % \vskip -2em x = chebfun('x'); format long f = exp(x); n = 2; pbest = remez(f,n); errbest = norm(f-pbest,inf) %% % Here is the corresponding error for CF approximation computed by the % Chebfun |cf| command: %% % \vskip -2em pcf = cf(f,n); errcf = norm(f-pcf,inf) %% % These two numbers agree to an extraordinary 9 significant digits. % Comparing the best and CF polynomials directly to one another, we confirm % that they are almost the same: %% % \vskip -2em norm(pbest-pcf,inf) %% % That was for degree $n=2$, and the near-optimality of the CF approximants % grows stronger as $n$ increases. Let us explore the dependence on $n$. On % a semilog plot, the upper curve in the next figure shows the accuracy of % the best polynomial as an approximation to $f(x)$, while the lower curve % shows the accuracy of the CF polynomial as an approximation to the best % polynomial. The two errors are of entirely different orders, and for $n> % 3$, the CF and best polynomials are indistinguishable in floating point % arithmetic. %% % \vskip -2em nn = 0:10; err1 = []; err2 = []; for n = nn pbest = remez(f,n); err1 = [err1 norm(f-pbest,inf)]; pcf = cf(f,n); err2 = [err2 norm(pbest-pcf,inf)]; end hold off, semilogy(nn,err1,'.-'), grid on hold on, semilogy(nn,err2,'.-r') FS = 'fontsize'; text(7.5,2e-6,'f-p_{best}','color','b',FS,10) text(1.2,1e-14,'p_{best}-p_{CF}','color','r',FS,10) ylim([1e-18,1e2]), xlabel('n',FS,9) title(['For smooth functions, ' ... 'CF approx is almost the same as best approx'],FS,9) %% % \vskip 1pt %% % Here is the same experiment repeated for $f(x) = \tanh(4(x-0.3))$. %% % \vskip -2em f = tanh(4*(x-.3)); nn = 0:30; err1 = []; err2 = []; for n = nn pbest = remez(f,n); err1 = [err1 norm(f-pbest,inf)]; pcf = cf(f,n); err2 = [err2 norm(pbest-pcf,inf)]; end hold off, semilogy(nn,err1,'.-'), grid on hold on, semilogy(nn,err2,'.-r') text(16,2e-2,'f-p_{best}','color','b',FS,10) text(5.3,1e-13,'p_{best}-p_{CF}','color','r',FS,10) ylim([1e-18,1e2]), xlabel('n',FS,9) title('Same curves for another function f',FS,9) %% % \vskip 1pt %% % Again we see that |pbest| $\!\!-\!\!$ |pcf| is much smaller than |f| % $\!\!-\!\!$ |pbest|, implying that the CF approximant is for practical % purposes essentially optimal. (Concerning the erratic oscillations, see % Exercise 20.4.) Yet it is far easier to compute: %% % \vskip -2em tic, remez(f,20); tbest = toc tic, cf(f,20); tcf = toc %% % Turning to a non-smooth function, here again is the jagged example from % Chapter 10 with its best approximation of degree 20: %% % \vskip -2em f = cumsum(sign(sin(20*exp(x)))); hold off, plot(f,'k'), grid on tic, [pbest,err] = remez(f,20); tbest = toc; hold on, plot(pbest) title('Jagged function and best approximation',FS,9) %% % \vskip 1pt %% % We saw the error curve before: %% % \vskip -2em hold off, plot(f-pbest), grid on, hold on, axis([-1 1 -.08 .08]) plot([-1 1],err*[1 1],'--k'), plot([-1,1],-err*[1 1],'--k') title('Best approximation error curve',FS,9) %% % \vskip 1pt %% % In CF approximation, we must start from a polynomial, not a jagged % function. As a rule of thumb, truncating the Chebyshev series at 5 times % the degree of the desired approximation is usually pretty safe. Here is % what we get: %% % \vskip -2em f100 = chebfun(f,100); tic, pcf = cf(f100,20); tcf = toc; hold off, plot(f-pcf), grid on, hold on, axis([-1 1 -.08 .08]) plot([-1 1],err*[1 1],'--k'), plot([-1,1],-err*[1 1],'--k') title('CF approximation error curve',FS,9) %% % \vskip 1pt %% % Evidently the error falls short of optimality by just a few percent. Yet % again the computation is much faster: %% % \vskip -2em tbest %% tcf %% % Here for comparison is the error in Chebyshev interpolation. %% % \vskip -2em pinterp = chebfun(f,21); hold off, plot(f-pinterp), grid on, hold on, axis([-1 1 -.08 .08]) plot([-1 1],err*[1 1],'--k'), plot([-1,1],-err*[1 1],'--k') title('Chebyshev interpolation error curve',FS,9) %% % \vskip 1pt %% % The time has come to describe what CF approximation is all about. We % shall see that the hallmark of this method is the use of eigenvalues and % eigenvectors (or singular values and singular vectors) of a Hankel matrix % of Chebyshev coefficients. %% % We start with a real function $f$ on $[-1,1]$, which we want to % approximate by a polynomial of degree $n\ge 0$. Following Theorem 3.1, we % assume that $f$ is Lipschitz continuous, so it has an absolutely % convergent Chebyshev series % $$f(x) = \sum_{k=0}^\infty a_k T_k(x).$$ % Since our aim is polynomial approximation, there is no loss of generality % if we suppose that $a_0 = a_1 = \cdots = a_n = 0$, so % that the Chebyshev series of $f$ begins at the term $T_{n+1}$. For % technical simplicity, let us further suppose that the series is a finite % one, ending at the term $T_N$ for some $N \ge n+1$. Then $f$ has the % Chebyshev series % $$f(x) = \sum_{k=n+1}^N a_k T_k(x).$$ % We now transplant $f$ to a function $F$ on the unit circle in the complex % $z$-plane by defining $F(z) = F(z^{-1}) = f(x)$ for $|z|=1$, where $x = % \hbox{Re}\,z = (z+z^{-1})/2$. As in the proof of Theorem 3.1, this gives % us a formula for $F$ as a Laurent polynomial, % $$F(z) = {1\over 2} \sum_{k=n+1}^N a_k (z^k + z^{-k}).$$ % We can divide $F$ into two parts, $F(z) = G(z) + G(z^{-1})$, with % $$G(z) = {1\over 2} \sum_{k=n+1}^N a_k z^k.$$ % The function $G$ is called the _analytic part_ of $F$, since it can be % analytically continued to an analytic function in $|z|\le 1$. Similarly % $G(z^{-1})$ is the _coanalytic part_ of $F$, analytic for $1 \le |z|\le % \infty$. %% % % Now we ask the following question: what is the best % approximation $\tilde P$ to $G$ on the unit circle of the form % $$\tilde P(z) = {1\over 2} \sum_{k=-\infty}^n b_k z^k, \eqno (20.1)$$ % where the series converges for all $z$ with $1\le |z| < \infty$? In other % words, $\tilde P$ must be analytic in the exterior of the unit disk apart % from a pole of order at most $n$ at $z=\infty$. This is the problem that % Carath\'eodory and Fej\'er solved, and the solution is % elegant. First of all, $\tilde P$ exists, and it is unique. Secondly, % $G-\tilde P$ maps the unit circle onto a perfect circle that winds % counterclockwise around the origin a number of times: the winding number % is at least $n+1$. Third, as shown by Schur a few years after % Carath\'eodory and Fej\'er [Schur 1918], $\tilde P$ can % be constructed explicitly by solving a certain matrix singular value % problem. Let $H$ denote the $(N-n)\times (N-n)$ real symmetric matrix of % Chebyshev coefficients arranged like this, % H = \pmatrix{ a_{n+1} & a_{n+2} & a_{n+3} & \dots & a_N \cr\noalign{\vskip 3pt} % a_{n+2} & a_{n+3} & \cr\noalign{\vskip 3pt} a_{n+3} \cr\noalign{\vskip 3pt} \vdots \cr\noalign{\vskip 3pt} a_N }, \eqno (20.2) % where the entries in the lower-right triangle are zero. A matrix with % this structure, constant along diagonals so that $a_{ij}$ depends only on % $i+j$, is called a {\em Hankel matrix}. Let $\lambda$ be the largest % eigenvalue of $H$ in absolute value, let $u = (u_0, u_1, \dots , % u_{N-n-1})^T$ be a corresponding real eigenvector, and define % $$u(z) = u_0^{} + u_1^{} z + \cdots + u_{N-n-1}^{}z^{N-n-1}.$$ % Here is the theorem due to % Carath\'eodory and Fej\'er and Schur. % %% % *Theorem 20.1. $\hbox{Carath\'eodory--Fej\'er--Schur}$ theorem*. % _The approximation problem described above has a unique solution % $\tilde P$, and it is given by the error formula_ % $$(G-\tilde P)(z) = \lambda \kern 1pt z^{n+1} \kern 1pt % {u(z)\over \,\overline{u(z)}\,}. \eqno (20.3)$$ _The function $G-\tilde % P$ maps the unit circle to a circle of radius $|\lambda|$ and winding % number $\ge n+1$, and if $|\lambda|>|\mu|$ for all other eigenvalues % $\mu$, the winding number is exactly $n+1$._ %% % _Proof._ The result is due to % $\hbox{Carath\'eodory and Fej\'er}$  and % Schur . See Theorem 1.1 of [Gutknecht & Trefethen 1982] and % Theorem 4 of [Hayashi, Trefethen & Gutknecht 1990]. % $~\hbox{\vrule width 2.5pt depth 2.5 pt height 3.5 pt}$ %% % % Theorem 20.1 is a mathematical assertion about the approximation of a % function $G$ on the unit circle by an infinite series. We use this result % to construct the polynomial CF approximant as follows. Since $G-\tilde P$ % maps the unit circle to a circle of winding number $\ge n+1$, its real % part (times 2) % $$(G-\tilde P)(z) + (G-\tilde P)(z^{-1})$$ % maps $[-1,1]$ to an equioscillating curve with at least % $n+2$ extreme points. Thus the function % $$\tilde p(x) = \tilde P(z) + \tilde P(z^{-1})$$ % yields the equioscillatory behavior that characterizes a best % approximation polynomial of degree $n$ to $f(x)$ on $[-1,1]$ (Theorem % 10.1). Unfortunately, $\tilde p(x)$ is not a polynomial of degree $n$. % However, it will generally be very close to one. The function $\tilde P$ % will normally have Laurent series coefficients $b_k$ that decay as % $k\to-\infty$. We truncate these at degree $-n$ to define % $${\cal P}_{\hbox{\tiny CF}}^{}(z) = {1\over 2} \sum_{k=-n}^n b_k z^k,$$ % with real part (times 2) % $$p_{\hbox{\tiny CF}}^{}(x) = % {\cal P}_{\hbox{\tiny CF}}^{}(z) + {\cal P}_{\hbox{\tiny CF}}^{}(z^{-1}) % = {1\over 2} \sum_{k=-n}^n (b_k + b_{-k}) z^k.$$ If the truncated terms % are small, $f-p_{\hbox{\tiny CF}}^{}$ maps $[-1,1]$ to a curve that comes % very close to equioscillation with $\ge n+2$ extrema, and thus % $p_{\hbox{\tiny CF}}^{}$ is close to optimal. % %% % For more details on real polynomial CF approximation, with numerical % examples, see [Gutknecht & Trefethen 1982], [Trefethen 1983], and % [Hayashi, Trefethen & Gutknecht 1990]. %% % Our experiments in the opening pages of this chapter showed that CF % approximants can be exceedingly close to best. The truncation described % above gives an idea of how this happens. In the simplest case, suppose % $f$ is an analytic function on $[-1,1]$. Then by Theorem 8.1, its % Chebyshev coefficients decrease geometrically, and let us suppose that % this happens smoothly at a rate $a_k = O(\rho^k)$. Then, roughly % speaking, the dominant degree $n+1$ term of $f$ is of order % $\rho^{-n-1}$, and the terms $b_n, b_{n-1}, \dots, b_{-n}$ are of orders % $\rho^{-n-2},\rho^{-n-3},\dots, \rho^{-3n-2}$. This suggests that the % truncation in going from $\tilde p$ to $p_{\hbox{\tiny CF}}^{}$ will % introduce an error of order $\rho^{-3n-3}$. This is usually a very small % number, and in particular, much smaller than the error $\|f-p^*\|$ of % order $\rho^{-n-1}$. %% % In fact, the actual order of accuracy for polynomial CF approximation is % one order higher, $\rho^{-3n-4}$ rather than $\rho^{-3n-3}$. (The reason % is that the first truncated term is a multiple of $T_{3n+3}$, the same % Chebyshev polynomial that dominates the error $f-p^*$ itself, and so it % is not until the second truncated term, $T_{3n+4}$, that the % equioscillation is broken.) On the other hand, to go from this rough % argument to a precise theorem is not so easy, because in fact, Chebyshev % series need not decay smoothly (Exericse 20.3). Here we quote without % proof a theorem from [Gutknecht & Trefethen 1982]. %% % % {\em{\bf Theorem 20.2. Accuracy of polynomial CF approximation.} For any fixed % $m\ge 0$, let $f$ have a Lipschitz continuous $(3m+3)$rd derivative on % $[-1,1]$ with a nonzero $(m+1)$st derivative at $x=0$, and for each $s\in % (0,1]$, let $p^*$ and $p_{\hbox{\tiny CF}}^{}$ be the best and the CF % approximations of degree $m$ to $f(sx)$ on $[-1,1]$, respectively. Then % as $s\to 0$, % $$\|f-p^*\| = O(s^{m+1}) \eqno (20.4)$$ % and % $$\|f-p^*\| \ne O(s^{m+2}) \eqno (20.5)$$ % and % $$\|\kern .7pt p_{\hbox{\tiny CF}}^{}-p^*\| = O(s^{3m+4}). \eqno (20.6)$$ % } % %% % _Proof._ See Theorem 3.4 of [Gutknecht & Trefethen 1982]. % $~\hbox{\vrule width 2.5pt depth 2.5 pt height 3.5 pt}$ %% % We can verify this result numerically. The two plots below display norms % for $m=1$ and $m=2$ for the function $f(x) = e^{5x}$. %% % \vskip -2em ff = @(x) exp(5*x); for m = 1:2 ss = .8.^(0:20); errfp = []; errpp = []; for s = ss f = chebfun(@(x) ff(s*x)); pbest = remez(f,m); pcf = cf(f,m); errfp = [errfp norm(f-pbest,inf)]; errpp = [errpp norm(pcf-pbest,inf)]; end hold off, loglog(ss,errfp,'.-') hold on, loglog(ss,errpp,'.-r'), loglog(ss,ss.^(m+1),'--'); s = 0.025; text(s,.1*s^(m+1)/4,'s^{m+1}','color','b',FS,10) loglog(ss,ss.^(3*m+4),'--r') text(s,.02*s^(3*m+4)*1e4,'s^{3m+4}','color','r',FS,10) text(.015,.01+(2-m)*.5,'f-p_{best}','color','b',FS,10) text(.25,1e-12+(2-m)*1e-8,'p_{best}-p_{CF}','color','r',FS,10) axis([1e-2 1 1e-18 1e3]), xlabel('s',FS,9), ylabel error title(['Convergence for m = ' int2str(m)],FS,9), snapnow end %% % \vspace{1pt} %% % In this chapter we have considered CF approximation in its simplest % context of approximation of one polynomial $f$ of degree $N$ by another % polynomial $p_{\hbox{\tiny CF}}^{}$ of degree $n$. In fact, the method % is much more general. So long as $f$ has an absolutely convergent % Chebyshev series, which is implied for example if it is Lipschitz % continuous, then Theorem 20.1 still applies [Hayashi, Trefethen & % Gutknecht 1990]. Now $H$ is an infinite matrix which can be shown to % represent a compact operator on $\ell^2$ or $\ell^1$, its dominant % eigenvector is an infinite vector, and $u(z)$ is defined by an infinite % series. The error curve is still a continuous function of winding number % at least $n+1$. %% % Another generalization is to approximation by rational functions rather % than polynomials. Everything goes through in close analogy to what has % been written here, and now the other eigenvalues of the Hankel matrix % come into play. The theoretical underpinnings of rational CF % approximation can be found in papers of Takagi , Adamyan, Arov and % Krein , and Trefethen and Gutknecht [1983b], as well as the article % by Hayashi, Trefethen and Gutknecht cited above. Quite apart from theory, % one can compute these approximations readily by the Chebfun |cf| command % using capabilities introduced by Joris Van Deun. For details and % examples see [Van Deun & Trefethen 2011]. %% % % Further generalizations of CF approximation concern approximation of vector or matrix % functions rather than just scalars, and here, such techniques are % associated with the name {\em $H^\infty$ approximation}. An important early % paper was Glover , and there have been many extensions and generalizations % since then [Antoulas 2005, Zhou, Doyle \& Glover 1996]. % %% % % We have emphasized the practical power of CF approximants as % providing near-best approximations at low computational cost. % The conceptual and theoretical significance of the technique, however, goes % beyond this. Indeed, the eigenvalue/singular value analysis of % Carath\'eodory--Fej\'er approximation seems to be the principal known % algebraic window into the detailed analysis of best approximations, and % in most cases where best approximations of a function happen to be % known exactly, these best approximations are CF approximations in which % an approximant like $\tilde P$ or $\tilde p$ already has the required finite % form, so that nothing must be truncated to get to $P$ or $p$ [Gutknecht 1983]. % %% % % \begin{displaymath} % \framebox[4.7in][c]{\parbox{4.5in}{\vspace{2pt}\sl % {\sc Summary of Chapter 20.} % Carath\'eodory--Fej\'er approximation constructs near-best % approximations of a function $f\in C([-1,1])$ from the singular values % and vectors of a Hankel matrix of Chebyshev coefficients. If $f$ is % smooth, CF approximants are often indistinguishable in machine precision % from true best approximants.\vspace{2pt}}} % \end{displaymath} % %% % \small\smallskip\parskip=2pt % {\bf Exercise 20.1. Approximating \boldmath$\cos(nx)$.} % (a) For $n = 2,4,8,16,\dots,$ compute the degree $n$ CF approximant % to $f(x) = \cos(nx)$ and plot the error curve. How high can you go in % this sequence? % (b) What happens if $\cos(nx)$ is changed to $\cos(0.9\kern .7pt nx)$? % \par % {\bf Exercise 20.2. Approximating the jagged function.} Four of the % figures of this chapter concerned approximations of degree 20 to a jagged % function. (a) How do the $L^2$ norms of the best and CF approximations % compare? (b) The CF approximation was based on truncation of the % Chebyshev series at term $N=100$. How does the $\infty$-norm of the % error vary with $N$? % (c) Draw a conclusion from this exploration: is the imperfect % equioscillation of the error curve in the figure given in the text for % this function mostly to the fact that CF approximation is not best % approximation, or to the fact that $N <\infty$? % \par % {\bf Exercise 20.3. Complex approximation on the unit disk.} (a) Suppose % $f$ is an analytic function on the closed unit disk and $p$ is a % polynomial of degree $n$. Prove that $p$ is a best approximation to $f$ % in the $\infty$-norm on the disk $|z|\le 1$ if and only if it is a best % approximation on the circle $|z|=1$. % (b) Look up $\hbox{Rouch\'e's}$ theorem and write down a careful % statement, citing your source. (c) Suppose $f$ is an analytic function in % the closed unit disk and $p$ is a polynomial of degree $n$ such that % $f-p$ maps the unit circle to a circle of winding number at least $n+1$. % Prove that $p$ is a best approximation to $f$ on the unit disk. (In fact % it is unique, though this is not obvious.) % \par % {\bf Exercise 20.4. Irregularity of CF approximation.} The second % figure of this chapter showed quite irregular dependence of % $\|\kern .7pt p_{\hbox{\tiny CF}}^{} - p^*\|$ on the degree $n$ for the function % $f(x) = \tanh(4(x-0.3))$. In particular, $n=15$ and $n=16$ give very % different results. Following the derivation of $p_{\hbox{\tiny CF}}^{}$ in % the text, investigate this difference numerically. (a) For $n=15$, how % do the coefficients $|b_k|$ depend on $k$, and how big are the truncated % terms in going from $\tilde p$ to $p_{\hbox{\tiny CF}}^{}$? % (b) Answer the same questions for $n=16$. % \par