- f is the function to be approximated
- n is the degree of the polynomial that must approximate f
- monomials is a list of integers or a list of function. It indicates the basis for the approximation of f
- formats is a list indicating the formats that the coefficients of the polynomial must have
- range is the interval where the function must be approximated
- L is a list of interpolation points used by the method
- indic1 (optional) is one of the optional indication parameters. See the detailed description below.
- indic2 (optional) is one of the optional indication parameters. See the detailed description below.
- indic3 (optional) is one of the optional indication parameters. See the detailed description below.
- P (optional) is the minimax polynomial to be considered for solving the problem.
- Lcoeffs (optional) is a list containing the coefficients of the minimax polynomial to be considered for solving the problem.

- fpminimax uses a heuristic (but practically efficient) method to find a
good polynomial approximation of a function f on an interval range. It
implements the method published in the article:

Efficient polynomial L^\infty - approximations Nicolas Brisebarre and Sylvain Chevillard

Proceedings of the 18th IEEE Symposium on Computer Arithmetic (ARITH 18)

pp. 169-176 - The basic usage of this command is fpminimax(f, n, formats, range). It computes a polynomial approximation of f with degree at most n on the interval range. formats is a list of integers or format types (such as double, doubledouble, etc.). The polynomial returned by the command has its coefficients that fit the formats indications. For instance, if formats[0] is 35, the coefficient of degree 0 of the polynomial will fit a floating-point format of 35 bits. If formats[1] is D, the coefficient of degree 1 will be representable by a floating-point number with a precision of 53 bits (which is not necessarily an IEEE 754 double precision number. See the remark below), etc.
- In its most general form, fpminimax takes a function f to be approximated, a list of basis functions g0, ..., gn, a list of points L, and a constrained part q. It returns an expression p of the form p = q+a0*g0+...+an*gn such that (p-f) or (p-f)/f (depending whether the error to be considered is the absolute of relative error) is as small as possible on the points of L and with coefficients a0, ..., an fitting on given formats. fpminimax admits several syntaxes listed above and described below to conveniently handle usual cases (such as when g0, ..., gn is the standard monomial basis X^0, ..., X^n) and to automatically compute a suitable list of points L on a given range.
- The second argument may be either an integer, a list of integers or a list
of functions. An integer indicates the degree of the desired polynomial
approximation. A list of integers indicates the list of desired monomials.
For instance, the list [|0,2,4,6|] indicates that the polynomial must be
even and of degree at most 6. Giving an integer n as second argument is
equivalent as giving [|0,...,n|].
Finally, a list of function g_k indicates that the desired approximation
must be a linear combination of the g_k.

The list of formats is interpreted with respect to the list of monomials. For instance, if the list of monomials is [|0,2,4,6|] and the list of formats is [|161,107,53,24|], the coefficients of degree 0 is searched as a floating-point number with precision 161, the coefficient of degree 2 is searched as a number of precision 107, and so on. - The list of formats may contain either integers or format types (halfprecision, single, double, doubledouble, tripledouble, doubleextended and quad). The list may be too large or even infinite. Only the first indications will be considered. For instance, for a degree n polynomial, formats[n+1] and above will be discarded. This lets one use elliptical indications for the last coefficients. For historical reasons, and as opposed to the general rule within Sollya, the case of an end-elliptic list whose last defined element is numeric is interpreted as the infinite list where the last defined element is repeated: therefore, e.g., [|24...|] is a perfect synonym for [|SG...|]. Because this behavior can be confusing, using for the formats an end-elliptic list whose last defined element is numeric is deprecated and will be forbidden in a future release.
- The floating-point coefficients considered by fpminimax do not have an exponent range. In particular, in the format list, double is an exact synonym for 53. Currently, fpminimax only ensures that the corresponding coefficient has at most 53 bits of mantissa. It does not imply that it is an IEEE-754 double.
- By default, the list of formats is interpreted as a list of floating-point
formats. This may be changed by passing fixed as an optional argument (see
below). Let us take an example: fpminimax(f, 2, [|107, DD, 53|], [0;1]).
Here the optional argument is missing (we could have set it to floating).
Thus, fpminimax will search for a polynomial of degree 2 with a constant
coefficient that is a 107 bits floating-point number, etc.

Currently, doubledouble is just a synonym for 107 and tripledouble a synonym for 161. This behavior may change in the future (taking into account the fact that some double-doubles are not representable with 107 bits).

Second example: fpminimax(f, 2, [|25, 18, 30|], [0;1], fixed). In this case, fpminimax will search for a polynomial of degree 2 with a constant coefficient of the form m/2^25 where m is an integer. In other words, it is a fixed-point number with 25 bits after the point. Note that even with argument fixed, the formats list is allowed to contain halfprecision, single, double, doubleextended, doubledouble, quad or tripledouble. In this this case, it is just a synonym for 11, 24, 53, 64, 107, 113 or 161. This is deprecated and may change in the future. - The fourth argument may be a range or a list. Lists are for advanced users that know what they are doing. The core of the method is a kind of approximated interpolation. The list given here is a list of points that must be considered for the interpolation. It must contain at least as many points as unknown coefficients. If you give a list, it is also recommended that you provide the minimax polynomial as last argument. If you give a range, the list of points will be automatically computed.
- The fifth, sixth and seventh arguments are optional. By default, fpminimax
will approximate f while optimizing the relative error, and interpreting
the list of formats as a list of floating-point formats.

This default behavior may be changed with these optional arguments. You may provide zero, one, two or three of the arguments in any order. This lets the user indicate only the non-default arguments.

The three possible arguments are:- relative or absolute: the error to be optimized;
- floating or fixed: formats of the coefficients;
- a constrained part q.

Notice that it is perfectly allowed that the monomial basis contains monomials also present in the constrained part. This can be useful for instance to fix the upper part of a doubledouble and look only for the second double of the expansion. For instance, in the previous example, we could use the same constrained part q = 1 + x + x^2/2 but with the full monomial basis [|0,..., 4|]. The resulting polynomial will be of the form p = 1 + x + x^2/2 + a0 + a1 x + a2 x^2 + a3 x^3 + a4 x^4, or equivalently (1+a0) + (1+a1) x + (1/2 + a2) x^2 + a3 x^3 + a4 x^4. Therefore, if for instance the coefficients a0 to a4 are asked to be double, the resulting polynomial p has its three first coefficients that are doubledouble with fixed upper parts. - The last argument is for advanced users. It is the minimax polynomial that
approximates the function f in the given basis. If it is not given
this polynomial will be automatically computed by fpminimax.

This minimax polynomial is used to compute the list of interpolation points required by the method. It is also used, when floating-point coefficients are desired, to give an initial assumption for the exponents of the coeffcients. In general, you do not have to provide this argument. But if you want to obtain several polynomials of the same degree that approximate the same function on the same range, just changing the formats, you should probably consider computing only once the minimax polynomial and the list of points instead of letting fpminimax recompute them each time.

Note that in the case when a constrained part is given, the minimax polynomial must take that into account. For instance, in the previous example, the minimax would be obtained by the following command: P = remez(1-(1+x+x^2/2)/exp(x), [|3,4|], range, 1/exp(x)); Note that the constrained part is not to be added to P.

In the case when the second argument is an integer or a list of integers, there is no restriction for P, as long as it is a polynomial. However, when the second argument is a list of functions, and even if these functions are all polynomials, P must be expanded in the given basis. For instance, if the second argument is 2 or [|0, 1, 2|], P can be given in Horner form. However, if the second argument is [|1, x, x^2|], P must be written as a linear combination of 1, x and x^2, otherwise, the algorithm will fail to recover the coefficients of P and will fail with an error message.

Please also note that recovering the coefficients of P in an arbitrary basis is performed heuristically and no verification is performed to check that P does not contain other functions than the functions of the basis.

For all these reasons, it is sometimes useful to provide a list of coefficients Lcoeffs instead of P. In this case, the convention is that the i-th element of Lcoeffs is the coefficient of the i-th basis function in the minimax polynomial. Lcoeffs might contain more elements than there are basis functions, in which case the extra elements will be ignored. - For historical reasons, fpminimax and remez have different syntaxes.
Indeed, the syntax of fpminimax has been designed to be higher level and
closer to usual customer needs. However, the reader's attention is drawn on
the fact that this could be confusing: the argument f in fpminimax is always
the function to be approximated, regardless of the mode (absolute or
relative) which is passed as a separate argument, while remez takes two
arguments f and w and looks for the approximation p that minimizes f-p*w
(with w=1 by default). Therefore, in remez looking for the approximation p
that minimizes the relative error with respect to a given function g is
performed by taking f=1 and w=1/g.

It may seem at first sight that the syntax of remez is more powerful than fpminimax as one can imagine to simultaneously use a non trivial f and a non trivial w, hence performing a weighted approximation. The same result can actually be achieved with fpminimax (to the price of a slightly more involved syntax) by approximating f in the customed monomial basis formed of w, w(x)*x, w(x)*x^2, ..., w(x)*x^j, etc. - Note that fpminimax internally computes a minimax polynomial (using the same algorithm as remez command). Thus fpminimax may encounter the same problems as remez. In particular, it may be very slow when Haar condition is not fulfilled. Another consequence is that currently fpminimax has to be run with a sufficiently high working precision.

> printexpansion(P);

(0x3ff0000000000000 + 0xbc09fda15e029b00) + x * ((0x3af9eb57163024a8 + 0x37942c2f3f3e3839) + x * (0xbfdfffffffffff98 + x * (0xbbd1693f9c028849 + x * (0x3fa5555555145337 + x * (0x3c7a25f610ad9ebc + x * 0xbf56c138142da5b0)))))

> display = powers!;

> P;

x * (1 + x^2 * (-357913941 * 2^(-31) + x^2 * (35789873 * 2^(-32))))

> display = powers!;

> P;

1 + x * (1 + x * (1 * 2^(-1) + x * (375300225001191 * 2^(-51) + x * (5592621 * 2^(-27)))))

Weighted approximation: look for p such that (p-f)*w is small

> n = 3;

> f = exp(x);

> w = sqrt(1-x^2);

> monomials = [||];

> for i from 0 to n do { monomials = monomials :. x^i*sqrt(1-x^2); };

> q = fpminimax(f*w, monomials, [|D...|], [-99/100,99/100],absolute);

> display = dyadic!;

> print(q);

3194834406077049b-54 * x^3 * sqrt(1 - x^2) + 4792048811386295b-53 * x^2 * sqrt(1 - x^2) + 8982938415377717b-53 * x * sqrt(1 - x^2) + 561436817847349b-49 * sqrt(1 - x^2)

> leadingcoeff = proc(expr) {

var res;

match(expr) with

a*b+c : {res = [|a,c|]; }

a*b : {res = [|a,0|]; }

default: {res = error; };

return res;

};

> p = 0;

> for i from n to 0 by -1 do {

L = leadingcoeff(q);

p = p + L[0]*x^i;

q = L[1];

};

> print(p);

3194834406077049b-54 * x^3 + 4792048811386295b-53 * x^2 + 8982938415377717b-53 * x + 561436817847349b-49 * x^0

> display=decimal!;

> dirtyinfnorm((p-f)*w, [-1,1]);

2.7416001717508988167160223812127007458632536307517e-3

> r = remez(f*w,n,[-99/100,99/100],w);

> dirtyinfnorm((r-f)*w, [-1,1]);

2.7416001717508734413506680717675258974378965304354e-3

> P0 = fpminimax(exp(x), 4, [|DD,DD,DD,D,SG|], I);

> P1 = fpminimax(exp(x), 4, [|D,D,D,D,SG|], I, 1+x+x^2/2);

> for k from 0 to 2 do { DD(coeff(P0,k)) == coeff(P0,k); };

true

true

true

> for k from 0 to 2 do { DD(coeff(P1,k)) == coeff(P1,k); };

true

true

true

> dirtyinfnorm(P0/exp(x)-1, I);

6.2744565211592155928616387674345117702132844508175e-35

> dirtyinfnorm(P1/exp(x)-1, I);

5.8284777124834959383794298527146965060524845952998e-35

> pstar = remez(f, 5, [-1b-7;1b-7]);

> listpoints = dirtyfindzeros(f-pstar, [-1b-7; 1b-7]);

> P1 = fpminimax(f, 5, [|DD...|], listpoints, absolute, default, default, pstar);

> P2 = fpminimax(f, 5, [|D...|], listpoints, absolute, default, default, pstar);

> P3 = fpminimax(f, 5, [|D, D, D, 24...|], listpoints, absolute, default, default, pstar);

> print("Error of pstar: ", dirtyinfnorm(f-pstar, [-1b-7; 1b-7]));

Error of pstar: 7.9048441259903026332577436001060063099817726177425e-16

> print("Error of P1: ", dirtyinfnorm(f-P1, [-1b-7; 1b-7]));

Error of P1: 7.9048441259903026580081299123420463921479618202064e-16

> print("Error of P2: ", dirtyinfnorm(f-P2, [-1b-7; 1b-7]));

Error of P2: 8.2477144579950871737109573536791331686347620955985e-16

> print("Error of P3: ", dirtyinfnorm(f-P3, [-1b-7; 1b-7]));

Error of P3: 1.08454277156993282593701156841863009789063333951055e-15

> g = (2^x-1)/x;

> p = fpminimax(g, L, [|D...|], [-1/16;1/16],absolute);

> display = powers!;

> p;

-3267884797436153 * 2^(-54) * sin(x^3) + 5247089102535885 * 2^(-53) * (cos(x) - 1) + -8159095033730771 * 2^(-54) * sin(x) + 6243315658446641 * 2^(-53) * exp(x)

> T = [|1, x|];

> for i from 2 to n do T[i] = canonical(2*x*T[i-1]-T[i-2]);

> g = (2^x-1)/x;

> PCheb = fpminimax(g, T, [|DD,DE...|], [-1/16;1/16],absolute);

> display = dyadic!;

> print(PCheb);

112469560253864905423137861008261b-107 + x * (76130818934358736339243350051b-98 + x * (34355379230514621195759363b-89 + x * (381013348011182609334519067b-95 + x * (825307274780189286345649b-89 + x * (3050983523250669954220137b-94 + x * (1180123116639845252291b-86 + x * (6543992088039485657053b-92 + x * (15750497046710770365b-87 + x * 8733930098894247371b-90))))))))

> n = 4; f = 1; w = sqrt(x)/acos(1-x);

> p = remez(f, n, d, w);

> monomials = [||]; Lcoeffs = [||];

> for i from 0 to n do {

monomials = monomials :. x^i*sqrt(x);

Lcoeffs = Lcoeffs :. coeff(p, i);

};

> pf = fpminimax(acos(1-x), monomials, [|D...|], d, relative, floating, 0, Lcoeffs);

> display = dyadic!;

> print(pf);

6237591828287231b-61 * x^4 * sqrt(x) + 1137310727877715b-57 * x^3 * sqrt(x) + 7642862123083659b-58 * x^2 * sqrt(x) + 1061508612083747b-53 * x * sqrt(x) + 6369051672525773b-52 * sqrt(x)

See also: remez, dirtyfindzeros, absolute, relative, fixed, floating, default, halfprecision, single, double, doubleextended, doubledouble, quad, tripledouble, implementpoly, coeff, degree, roundcoefficients, guessdegree

Go back to the list of commands