RATING MIRRORS

by Mel Bartels

INTRODUCTION

A great deal has been written on the subject of rating mirrors.  Every book contains material (all have valuable information, but no book has it completely correct).  Rating mirrors is a popular subject of discussion at club meetings, and on the net (where a great deal of misinformation is passed on).

It is important for amateurs to be able to test their own optics.  The most common question when a new telescope arrives is, "How good are the optics?"  When the images are bad, it is important to know how to distinguish between poor optics, bad atmosphere, bad eyepieces, bad collimation, and so forth.

Unfortunately the vast majority of amateurs do not know how to test mirrors, or interpret the results.

When considering mirror test results, remember that there is a difference between resolution, repeatability, and accuracy! See http://www.neat.com/techinfo/pdf/accuracy.pdf

The most meaningful single number to judge mirror performance appears to be the Strehl ratio. Peter John Smith on the ATM listserv notes the following papers:
"Effects of wavefront Aberration on Visual Instrument Performance, and a Consequential Test Technique" by Nigel D. Haig and G. J. Burton. Applied Optics / Vol 26, No 3 / 1 Feb, 1987.
"Psychometrically Appropriate Assessment of Afocal Optics by Measurement of the Strehl Intensity Ratio" by Nigel D. Haig and T. L. Williams. Applied Optics / Vol 34, No 10 / 1 Apr 1995.
Peter John Smith says, "the authors conclude that Strehl is by far the best single quantifier of performance for Afocal systems (telescopes) - and it can be measured quite well if needed and in fact implemented on a production basis for assessment of optical instruments. The authors conclude that - despite the type of aberrations - the Strehl ratio remains very indicative and that 0.87 is close to the 'Just Noticeable Difference' criterium used."


M-L versus Strehl controversy

What is the best way to describe a mirror's quality? M-L or Strehl?
The Millies-Lacroix graphical approach to the Foucault test was introduced to American amateurs in the February 1976 issue of Sky and Telescope. It described a tolerance envelope based on the reflected light rays passing through the theoretical Airy disk. If the zonal measurements fit within the tolerance envelope, then the mirror is judged acceptable. Vast numbers of amateur mirrors including the majority of my own have been made using this approach. However, caution needs to be exercised when using M-L to rate a mirror's quality.
The M-L is a geometric analysis which ignores the diffractive effects of light. The light as seen by inspection of the diffraction pattern does not go where the transverse aberration says it does. See Conrady's Applied Optics and Optical Design I, chapter 3, for in-depth treatments.
For instance, a plot of the mirror slope such as the Millies-Lacroix ¼ wave tolerance “tornado plot” of Foucault measurements may show a slope that might lead one to believe that the light will be thrown far from the Airy disc.  Yet in reality, the Airy disc intensity is often reduced only a little while the diffraction rings are enhanced a little.

A mirror that fails the M-L test can have acceptable Strehl and RMS errors.  In addition, since the M-L plot is one of slope and not surface profile, the M-L test cannot, without further processing, tell us if a particular mirror zone is high or low.  For example, an M-L plot that curves down at the edge does not by itself prove a turned down edge (though most such plots do indicate one).  In general, mirrors figured to fit within the M-L tolerance are likely to be better, when considering how the starlight as a whole focuses at the eyepiece.  Conversely, mirrors that pass a minimum Strehl of 0.8 may not perform acceptably at the eyepiece.  Certain errors such as turned edge and astigmatism call for tighter tolerances than the barely diffraction limited minimum of 0.8 Strehl.
M-L succeeds by concentrating the telescope maker's attention on zonal measurements. Hitting the zonal measurements as accurately as possible is the best path to a good mirror. However, the slope measurements need to be integrated to obtain the true surface profile across that mirror diameter.
In addition, caution should be exercised in translating the mirror profile across a single diameter to Strehl and RMS which are numbers meant to represent all of the mirror surface.
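As a rough illustration of integrating slope measurements into a surface profile, here is a minimal Python sketch.  The zone radii and slope errors are hypothetical numbers, the function name is made up for this example, and the sign convention (positive slope error means the surface rises outward relative to the ideal) is an assumption.  It uses plain trapezoidal integration along one diameter, so the caution above about a single diameter still applies.

```python
def integrate_slopes(radii_mm, slope_urad):
    """Cumulative trapezoidal integration of slope error into height error.

    A radius step in mm times a slope in microradians gives a height in nm.
    The height at the first radius is taken as the zero reference.
    """
    heights_nm = [0.0]
    for i in range(1, len(radii_mm)):
        dr = radii_mm[i] - radii_mm[i - 1]
        mean_slope = 0.5 * (slope_urad[i] + slope_urad[i - 1])
        heights_nm.append(heights_nm[-1] + mean_slope * dr)
    return heights_nm


# Hypothetical slope errors across one radius of a 200 mm diameter mirror.
radii_mm = [0, 25, 50, 75, 100]
slope_urad = [0.0, 0.4, 0.0, -0.4, -1.2]

heights_nm = integrate_slopes(radii_mm, slope_urad)
for r, h in zip(radii_mm, heights_nm):
    print(f"r = {r:3d} mm   height error = {h:+6.1f} nm")
```

In this made-up data the slope turns negative toward the edge, and only after integration does it become clear that the 50 mm zone is the high one and the edge is low.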


STANDARDS

The wavelength that our eyes are most sensitive to is green light, which has a wavelength of 500 nanometers, or 0.5 microns, or (at 25,400 microns to the inch) 0.00002 inches, or 20 millionths of an inch.  Often a mirror is rated in red light at 750 nanometers.  This means that to our eyes, an error rated using red light will be 50% greater than stated.  We want the mirror to focus light to a fraction of this size.
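The red versus green arithmetic can be checked with a few lines of Python.  This is only a sketch: the function name is made up for this example, and the red test wavelength is taken as 750 nanometers against 500 nanometers for green.

```python
GREEN_NM = 500.0  # wavelength the article uses for the eye's peak sensitivity
RED_NM = 750.0    # red test wavelength in the article's example


def rescale_waves(error_waves, from_nm, to_nm):
    """Re-express an error given in waves at one wavelength in waves at another."""
    error_nm = error_waves * from_nm
    return error_nm / to_nm


quarter_wave_red = 0.25
in_green = rescale_waves(quarter_wave_red, RED_NM, GREEN_NM)
print(f"1/4 wave in red   = {quarter_wave_red * RED_NM:.1f} nm")
print(f"same error, green = {in_green:.3f} wave (about 1/{1 / in_green:.1f})")
```

A mirror advertised as 1/4 wave in red light is therefore closer to 3/8 wave in the green light the eye cares about.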

Danjon-Couder condition 2 or Rayleigh criterion: Maximum wavefront error must not exceed a quarter wave and, for the majority of the mirror's area, the defects should be appreciably less.  Some experienced planetary observers feel that 1/10 wave gives a slightly perceptibly better image, but since most mirrors are over-rated it's not possible to make this statement unequivocally.

Danjon-Couder condition 1: The radius of the circle of least aberration should be comparable with that of the theoretical diffraction disk and, on the average, the transverse aberrations should not exceed the diffraction disk radius. 

Marechal Limit: 1/14 wavefront RMS.

Strehl ratio: 80%.

Star test: Slight differences between intra and extra focal images, no turned edge.

WAYS TO RATE MIRRORS

wavefront P-V: Measured at the focus: the distance from the highest point, i.e., the peak, to the lowest point, i.e., the valley.  Unfortunately, this method of rating a mirror is practically meaningless.  For instance, a mirror that has just one tiny spot that's high will be rated no better than a mirror with multitudinous defects of the same height.  Yet in reality, the mirror with a single defect will focus much more of its light accurately.

surface P-V: Measured on the mirror's surface: the peak to valley (P-V) distance.  An error here is doubled at the wavefront so a mirror may 'appear' twice as good as it really is.  For instance, a surface P-V of 1/8 wave is actually only 1/4 wavefront.  As with the wavefront P-V, this means of rating a mirror does not say nearly as much as we would like.

+- surface P-V: Above and below error on the mirror's surface from a median value.  This number will be half the surface P-V and a quarter the wavefront P-V so a mirror may 'appear' to be four times better than it really is.  For instance, our example mirror will be penciled in at +- 1/16 wave!
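The three peak-to-valley conventions differ only by factors of two, which the following Python sketch makes explicit.  The function name and dictionary labels are made up for this example.

```python
from fractions import Fraction


def pv_conventions(wavefront_pv):
    """Return the same error in the three P-V conventions, in waves."""
    surface_pv = wavefront_pv / 2        # reflection doubles a surface error
    plus_minus_surface = surface_pv / 2  # half above, half below the median
    return {
        "wavefront P-V": wavefront_pv,
        "surface P-V": surface_pv,
        "+- surface P-V": plus_minus_surface,
    }


ratings = pv_conventions(Fraction(1, 4))
for name, value in ratings.items():
    print(f"{name:>15}: {value} wave")
```

The same 1/4 wavefront mirror prints as 1/4, 1/8 and 1/16 wave depending on which convention the seller picks.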

RMS wavefront: Root mean square error, or a statistically averaged error.  The RMS is computed by finding the wavefront error at a set of uniformly spaced points on the mirror, computing the average error, subtracting the average from each individual value, squaring each difference, summing the squares, dividing the sum by the number of data points minus 1, and then taking the square root.  (If you're familiar with statistics, it's the same equation as for a sample standard deviation.)  Because RMS says something about many points on a mirror, it is a much more meaningful measurement than peak to valley.
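The recipe above translates directly into Python.  In this sketch the sample data are hypothetical: one set has a single high spot, the other has high spots over half the sampled points, and both have the same peak to valley.

```python
import math


def rms_error(samples):
    """RMS about the mean, using the n - 1 divisor described above."""
    n = len(samples)
    mean = sum(samples) / n
    sum_sq = sum((s - mean) ** 2 for s in samples)
    return math.sqrt(sum_sq / (n - 1))


def peak_to_valley(samples):
    """Distance from the highest sample to the lowest."""
    return max(samples) - min(samples)


# Hypothetical wavefront errors, in waves, at 100 uniformly spaced points.
one_bump = [0.0] * 99 + [0.25]   # a single high spot
many_defects = [0.0, 0.25] * 50  # half of the points are high

for name, data in (("one bump", one_bump), ("many defects", many_defects)):
    print(f"{name:>12}: P-V = {peak_to_valley(data):.3f}  RMS = {rms_error(data):.4f}")
```

Both data sets rate 1/4 wave P-V, yet the RMS of the single-bump mirror is several times smaller, which matches the objection to P-V raised earlier.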

Oftentimes, P-V and RMS are related by a 3.5 to 1 ratio.  In our example mirror, if the mirror had but a single broad smooth defect, the 1/4 wavefront P-V could be converted by 1/(4*3.5) to 1/14 wavefront RMS.  Unfortunately, this is only true in the rare case that the mirror surface is a smooth and true conic.  Therefore RMS cannot be derived simply by taking the peak-valley measurement and dividing it by 3.5.  A quoted RMS that just happens to be 3.5x to 4x smaller than the P-V should be viewed skeptically.  Chances are the optician measured peak to valley then made a best case guess of the RMS error.  See: paraboloid.html for much more on this and other related topics.

RMS surface: RMS at the surface.  Hence, in our example mirror of 1/14 wavefront RMS, the mirror would be rated at 1/28 wave surface RMS.  Wave inflation is the derisive term given to making a mirror seem better than it really is by emphasizing inflated numbers such as the RMS surface.  For instance, in red light this 1/4 wavefront mirror becomes 1/40 wave surface RMS!
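The chain of "wave inflation" steps can be followed numerically.  This Python sketch assumes the best-case P-V to RMS factor of 3.5 and the 500 and 750 nanometer wavelengths used earlier; the helper name is made up for this example.

```python
def as_fraction(waves):
    """Format an error in waves as '1/N wave'."""
    return f"1/{1 / waves:.0f} wave"


wavefront_pv = 1 / 4                       # the example mirror, in green light
wavefront_rms = wavefront_pv / 3.5         # best-case smooth-error assumption
surface_rms = wavefront_rms / 2            # surface error is half the wavefront error
surface_rms_red = surface_rms * 500 / 750  # same error in waves of red light

print("wavefront P-V        :", as_fraction(wavefront_pv))
print("wavefront RMS        :", as_fraction(wavefront_rms))
print("surface RMS          :", as_fraction(surface_rms))
print("surface RMS, red     :", as_fraction(surface_rms_red))
```

The last figure works out to about 1/42 wave, in line with the roughly 1/40 wave quoted above.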

adjacent RMS: Larger mirrors may be spec'd where adjacent areas of the mirror must have better or lower RMS values than distant areas of the mirror.  For instance, areas 4-6 inches apart may be required to be within 1/20 wavefront RMS whereas areas 8-12 inches apart only need to be within 1/14 RMS.  Atmospheric seeing, thanks to the atmosphere's poor mixing qualities, mimics adjacent RMS.  Rating a mirror by adjacent RMS is saying that the mirror has but broad smooth gentle errors, and admitting that it will be used at the bottom of the ocean of air.

Strehl ratio: Thanks to the wave nature of light, even if all the light is brought to a single focus point, the light will actually form a small disk surrounded by ever fainter rings.  This is the Airy disk and rings.  A perfect mirror results in 84% of the light in the Airy disk and 16% of the light in the rings.  A less than perfect mirror places less light in the Airy disk and more in the rings.  (Incidentally, the Airy disk can shrink for a bad mirror, resulting in better resolution in certain cases.)  The Strehl ratio is defined as the intensity of the image spot at its central brightest point divided by the same image intensity without aberration.  A Strehl ratio of 100% means a perfect mirror - a mirror that is putting 84% of its light into the Airy disk.  Since all parts of the mirror should contribute to the light that gets into the Airy disk, the Strehl ratio is a measure of total surface quality.  The Strehl ratio can be calculated using the equation from "Modern Optical Engineering" by Warren J. Smith, pg 337, where Strehl ratio = (1 - 2 pi^2 RMS^2)^2, with RMS being the wavefront RMS in waves.  As with RMS, any rating that measures all of the surface quality is a good valid method.

For instance, Univ. of Arizona's Spinning Mirror Lab's Dean Ketelsen says that the 6 meter mirror was pushed out the door with 14 nm RMS.  If the secondary is perfect, the 6 meter system will have a Strehl ratio of  ~ 90%.
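The Strehl equation quoted above is easy to evaluate.  In this Python sketch the 14 nm figure is assumed to be a surface RMS and is evaluated at 500 nanometers; both of those are assumptions made for the example, and the formula itself is only meant for small errors.

```python
import math


def strehl_from_rms(wavefront_rms_waves):
    """Strehl ratio from wavefront RMS in waves: (1 - 2 pi^2 RMS^2)^2."""
    return (1 - 2 * math.pi ** 2 * wavefront_rms_waves ** 2) ** 2


# Marechal limit: 1/14 wave RMS on the wavefront.
print(f"1/14 wave RMS     -> Strehl {strehl_from_rms(1 / 14):.3f}")

# 14 nm RMS, assumed here to be a surface figure evaluated at 500 nm.
surface_rms_nm = 14.0
wavelength_nm = 500.0
wavefront_rms_waves = 2 * surface_rms_nm / wavelength_nm
print(f"14 nm surface RMS -> Strehl {strehl_from_rms(wavefront_rms_waves):.3f}")
```

Under those assumptions the 1/14 wave case lands just above 0.80 and the 14 nm case near 0.88, close to the ~90% quoted above.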

encircled energy: A measurement often used by professional opticians specifies the circle diameter that contains a certain percentage of the light rays.  For instance, 90% of the reflected light rays shall pass through a 1 arcsecond circle.
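The 84% figure quoted for the Airy disk is itself an encircled energy number, and it can be checked in Python.  This sketch assumes a perfect, unobstructed circular aperture and uses the classical result that the enclosed fraction is 1 - J0(x)^2 - J1(x)^2, where x is the reduced radius and x = 3.8317 marks the first dark ring.  The Bessel functions are computed from their integral form so that no extra library is needed; the function names are made up for this example.

```python
import math


def bessel_j(order, x, steps=2000):
    """Bessel function J_n(x) from its integral form, using the midpoint rule."""
    total = 0.0
    for k in range(steps):
        theta = math.pi * (k + 0.5) / steps
        total += math.cos(order * theta - x * math.sin(theta))
    return total / steps


def airy_encircled_energy(x):
    """Fraction of light inside reduced radius x for a perfect circular aperture."""
    return 1 - bessel_j(0, x) ** 2 - bessel_j(1, x) ** 2


FIRST_DARK_RING = 3.8317  # first zero of J1, the edge of the Airy disk
print(f"energy inside the Airy disk: {airy_encircled_energy(FIRST_DARK_RING):.3f}")
```

A real mirror specification of the kind described above would be checked against measured or ray-traced data rather than this ideal curve.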

social contract: Harold Suiter, in a post archived on the ATM list, says, "Somewhat arbitrarily, we draw a line at some level of improvement and say 'Beyond this, the optical system is diffraction-limited.'" "However, the way we draw this line must be agreed upon because it has no immediately obvious placement. It is a social contract rather than one imposed by physics."

star test: Finally we come to a method that is qualitative only.  Popularized by John Dobson and his protégé Bob Kestner (head optician for the COSTAR Hubble correction optics at Tinsley Labs), this method relies on slightly defocusing a test star, first inside of focus, then outside of focus.  By noting how the light from the entire mirror focuses, a rating in the form of: excellent, good, average, poor, or unacceptable, can be given to the mirror.

Here is Bratislav's scale:
1. Can't find anything wrong with it, absolutely perfect: '<expletive>' Yet to see one after ~25 yrs
2. Defects visible only in extrafocal images, and only after extensive star testing in best seeing conditions ( << 1/10 wf): 'You lucky b@$#@rd!' Can count these on fingers of one hand
3. Extrafocal defects readily visible,  but really minor ( < 1/10 wf): 'Excellent' Best examples of  best commercial telescopes (Zeiss,AP,Tak etc) Best examples of homemade optics
4. Extrafocal defects fairly obvious, but in focus image still essentially perfect ( 1/10 - 1/6 wf): 'Very good' Majority of current 'best commercial telescopes'; best examples of mass produced scopes
5. Large defects visible on extrafocal images, in focus image suffers only slightly ( 1/6 - 1/4 wf): 'Good' selected examples of mass produced telescopes, most well made amateur optics; some examples of 'best commercial scopes' can still be found here
6. In focus image visibly suffers ( ~1/4 wf): 'Acceptable' good mass produced scope, most good large/fast mirrors I've seen
7. Image deterioration serious, clearly beyond 1/4 wavefront: 'Light bucket' majority of older generation mass produced scopes, special purpose telescopes (astrographs)
8. It's difficult to determine when scope is in focus at all ( 1/2 - 1 wf): 'If you're happy with it ...' unfortunately, not that difficult to find !
9. Usable only at very low magnification  ( ~1 wf): 'I don't want to have anything to do with this one'
10. Absolutely useless: '<expletive>' unlike 1, I've seen these :-)
99% of all scopes I've seen fall into '4-10' bracket.

magnification roll-off: Another qualitative method, though it does demonstrate the MTF (modulation transfer function).  A low contrast object such as Jupiter is selected, and a series of magnifications is run through.  At some point, the image will roll off and begin to lose its sharpness.  Dividing this magnification by the aperture gives a rating such as, this mirror is good to 50x per inch of aperture.

Here is my scale:
Mirrors that can sustain 35x to 50x per inch of aperture I rate as excellent.
Mirrors that sustain 25x to 35x per inch of aperture I rate as good.
Mirrors that sustain 25x per inch of aperture are acceptable.
Mirrors that sustain 15x to 25x per inch of aperture are poor, usable only at lower powers.
Mirrors that fail at 12x per inch of aperture are just plain not finished.
For smaller mirrors, push these numbers a little higher, for gargantuan mirrors, push the numbers a little lower.

Here is Nils Olaf Carlin's comment relating Strehl and RMS, along with discussing the pitfalls of Peak-Valley measurements:
"One obvious problem with P-V is that it only considers two points on the mirror and thus can never predict the performance of the whole mirror. Instead, RMS considers the errors over the total mirror surface, in such a way as to predict how well the light will be concentrated at the image at best focus (at least as long as the phase deviations from best wavefront are "small" - and indeed we do our best to keep them small).
An attempt at explanation: To get a perfect Airy disk, all light at the center of the disk should add in phase - any deviation somewhere on the mirror surface will introduce a phase error at the corresponding part of the wavefront. Thus the in-phase contribution will be proportional to the cosine of the phase error, and the total error will be the average of this cosine term over the aperture (the sine contributions tend to cancel at best focus).  Since the cosine for small angles is 1-k*(phase error)^2, the loss of coherence is proportional to the mean of the squares of the deviations. Thus, the Mean Square error would be even more useful - only it has the unfortunate dimension of distance squared. But the main point is if you want to know how well the mirror can concentrate light at focus, RMS is the best, if not the only meaningful, measure of error.
Another important point where RMS error is extremely simple and useful (while P-V fails utterly) is comparing the relative importance of different types of errors. For example, given the measure of error of the mirror itself (assuming perfect suspension), and the known (by PLOP) deformation from the mirror cell, what will the resulting error be?
The very useful Strehl ratio gives the concentration of light at the center of the Airy disk, compared to ideal (=1).  It is easy to remember that the Strehl ratio is lowered by the square of each *surface* error (in nm) divided by (approx) 2000. Example: a mirror with 10 nm RMS surface error will have a  Strehl ratio of 1-(10^2)/2000= 0.95. Suppose the mirror cell contributes another 4 nm RMS to the surface error - how much worse will the resulting deviation be? The Strehl ratio is lowered by (4^2)/2000 =0.008 - the diffraction peak is lowered by less than a percent (!) by the deformation of the cell.
(For wavefront RMS error in nm, divide the squared error by 8000 – if you have it in fractional "wavelengths", try dividing 40 by the denominator squared - for simplicity, a wavelength of 560 nm is assumed here. Example - what is the Strehl ratio of a mirror with 1/14.1 wavelengths RMS error on the wavefront?  1-40/14.1^2 =0.80! Or one with 20 nm RMS surface error: 1-20^2/2000=0.80!!)
A mirror with an error of 20 nm RMS on the surface (or 40 nm RMS on the wavefront) is right at the edge of the jolly old Rayleigh "quarter-wave" criterion."
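The rule of thumb in the quote can be tried out in Python.  This sketch uses the 560 nm wavelength assumed in the quote and compares the "divide by 2000" shortcut with the small-error expression (2 pi sigma)^2, where sigma is the wavefront RMS in waves.  The function names are made up for this example, and combining the mirror and cell errors in quadrature assumes the two are independent of each other.

```python
import math

WAVELENGTH_NM = 560.0  # wavelength assumed in the quoted rule of thumb


def strehl_loss_surface(rms_nm):
    """Rule-of-thumb Strehl loss for a surface RMS error in nm."""
    return rms_nm ** 2 / 2000


def strehl_loss_small_error(rms_surface_nm):
    """Small-error loss (2 pi sigma)^2, with sigma the wavefront RMS in waves."""
    sigma_waves = 2 * rms_surface_nm / WAVELENGTH_NM
    return (2 * math.pi * sigma_waves) ** 2


mirror_nm, cell_nm = 10.0, 4.0
strehl_mirror = 1 - strehl_loss_surface(mirror_nm)
strehl_both = strehl_mirror - strehl_loss_surface(cell_nm)
combined_rms_nm = math.hypot(mirror_nm, cell_nm)  # assumes independent errors

print(f"mirror alone          : Strehl {strehl_mirror:.3f}")
print(f"mirror plus cell      : Strehl {strehl_both:.3f}")
print(f"combined surface RMS  : {combined_rms_nm:.2f} nm")
print(f"20 nm, rule of thumb  : Strehl {1 - strehl_loss_surface(20.0):.3f}")
print(f"20 nm, (2 pi sigma)^2 : Strehl {1 - strehl_loss_small_error(20.0):.3f}")
```

Subtracting the two losses separately gives the same answer as applying the rule once to the combined RMS, which is the practical appeal of RMS that the quote describes.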

Steve Koehler has worked up a small demonstration of how P-V error and RMS error compare as quality metrics.  The files are: 
    http://tech.groups.yahoo.com/group/interferometry/files/Steve%20Koehler/PV%20vs.%20RMS/pv.png
    http://tech.groups.yahoo.com/group/interferometry/files/Steve%20Koehler/PV%20vs.%20RMS/rms.png
 The first file has a fixed RMS error of 0.08, and a varying P-V error. The second file has a fixed P-V error of 1 wave and a varying RMS error.  Compare the rows of these tables to see which metric is a better measure of performance.

WHAT DOES DIFFRACTION LIMITED MEAN?

The term "diffraction limited" has so many meanings that it is meaningless.  Here are some meanings:

Optics limited by laws of diffraction.  Good and bad optics are equally limited by the laws of diffraction, so this definition applies equally well to all optics.

The optic's resolution is no worse than the Airy disk or, put another way, the output image is limited in its quality only by the aperture of the instrument and the effects of the central obstruction (a 1/3 obstruction is equivalent to a 1/4 wavefront P-V degradation).

The most common definition is 1/4 wavefront P-V.  Light rays of optics are ray traced with the goal that the optics bring all the light rays to a point, or more precisely, a circular area called the circle of energy or focus spot.  As the light rays are brought ever closer together, improvement is seen, but only up to a point.  No matter how tightly the light rays are traced, a dot and not a point remains, thanks to the diffractive nature of light.  When the improvement ceases to be significant, the optic is called 'diffraction limited'.  This is a somewhat arbitrary judgment.  Professional opticians most often relate this to 1/4 wavefront P-V.

Schroeder's ASTRONOMICAL OPTICS says that the optics are diffraction limited if the Strehl ratio is greater than 0.80, which matches the 1/4 wavefront Rayleigh tolerance for spherical aberration.

Assuming the somewhat rare case of a smooth conic mirror surface, this equates to the Marechal Limit of 1/14 wavefront RMS.

Yet another definition starts from the fact that the wave nature of light fundamentally limits the resolving power of the optics: diffraction limited then means that the errors in the optics don't materially degrade that resolving power.

Danjon-Couder condition #1 is met. As discussed earlier, this can be misleading.

McGraw-Hill Dictionary Of Scientific and Technical Terms, 5th ed. gives the following definition: "Capable of producing images whose separations are as small as the theoretical limit imposed by diffraction effects."

If a mirror gets 80% of the theoretical amount of light into the Airy disk, it's considered diffraction limited.
 

MIRROR TESTS

Foucault: This is the time-honored amateur test.  A series of zones across the mirror face are measured for longitudinal aberration.  Zonal differences down to 1/100 wave can be measured, see Burrows, Texereau, Hall.  My experience is 1/30 to 1/50 wave on small slower mirrors.  Mirror areas between the zones are interpolated.  The popular tornado analysis is faulty.  Be particularly wary of Foucault test results with too many digits of precision.
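For reference, the zonal readings that a perfect paraboloid should give at the center of curvature can be computed with the standard textbook approximation: r^2/(2R) when the light source moves with the knife edge and r^2/R when the source is fixed, where r is the zone radius and R the radius of curvature.  The Python sketch below uses a hypothetical 8 inch f/6 mirror with made-up zone radii, and the function name is invented for this example.

```python
def ideal_knife_edge_offsets(zone_radii, radius_of_curvature, moving_source=True):
    """Expected longitudinal readings for a paraboloid, relative to the center zone.

    moving_source=True : source and knife edge travel together, r^2 / (2R).
    moving_source=False: fixed source, moving knife edge, r^2 / R.
    """
    divisor = 2 * radius_of_curvature if moving_source else radius_of_curvature
    return [r ** 2 / divisor for r in zone_radii]


# Hypothetical 8 inch f/6 mirror: focal length 48 inches, so R = 96 inches.
radius_of_curvature_in = 96.0
zone_radii_in = [1.0, 2.0, 3.0, 4.0]

moving = ideal_knife_edge_offsets(zone_radii_in, radius_of_curvature_in)
fixed = ideal_knife_edge_offsets(zone_radii_in, radius_of_curvature_in, moving_source=False)

for r, m, f in zip(zone_radii_in, moving, fixed):
    print(f"zone r = {r:.1f} in   moving source {m:.4f} in   fixed source {f:.4f} in")
```

Measured readings are compared against these ideal values zone by zone, and the differences are what get reduced to surface or wavefront error.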

Caustic:  The returned light rays from the mirror actually do not come to a focus on axis, instead they form caustic horns to each side.  The Caustic test measures not only in the 'Y' axis as the Foucault test does, but also measures in the 'X' axis.  This can result in extremely precise results.  Unfortunately, the X-Y testing stage is intimidating to build.

poor man Caustic: This test was invalidated in late 1999.  See the atm archives.

Ronchi: A grating of about 100 lines per inch is placed in front of the light source and the eye.  The mirror will distort the returning lines into wavy bands.  By carefully comparing to computer generated ideal bands at specific distances from the mirror, the state of the mirror may be deduced.  I use this test in my mirror making classes - it's incredibly easy to use.  Download my older DOS based software dnld/ronchi.zip.

Star: Every amateur should learn this test. Since it is performed with the diagonal, mirror mount, and the rest of the scope in place, it is the final test.  But it can be difficult to discern multiple errors on the mirror.

Wire test

(CCD) Hartmann:

Combination of tests:
Jim Burrows' mirror math and his sixtests

Interferometric:
The thermal properties of Pyrex, as it constantly tries to equilibrate to slight changes in the air temperature, probably limit quality to 1/10 wavefront anyhow.

Zernike polynomials
 

MIRROR SMOOTHNESS

Wolfgang Rohr is an experienced optics tester in Germany who very frequently tests other people's telescopes.  Besides the normal interferometric tests with a Bath interferometer against a flat, he also applies phase contrast tests that show the roughness of the optics.  This test is made at double sensitivity as well, since the optic is measured in double pass against the flat.


MORE INFORMATION

Much of the information in this article was found at the ATM list archives.