Rating Mirrors

by Mel Bartels

Introduction

The most common question when a new telescope arrives is, "How good are the optics?". Distinguishing between poor optics, bad atmosphere, bad eyepieces, bad collimation, and so forth, is key to improving telescope performance. Unfortunately many amateurs do not know how to test mirrors, or interpret the results.

Rating mirrors means picking a metric or standard and selecting a particular test designed to measure that metric. Here's a short list of standards and tests.

First, we need to talk about a telescope's resolution. No matter how perfect the optic, resolution is limited by the diffractive nature of light. The reflected light at focus forms a disk, called the Airy disk, that is surrounded by a series of rings of ever dimmer light. Empirically we resolve a little better than theory suggests because we can see the elongation caused by slightly overlapping Airy disks.
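
As a rough guide to how small the theoretical resolution disk is, here is a minimal Python sketch. It relies on the standard diffraction result that the first dark ring of the Airy pattern lies at an angular radius of 1.22 x wavelength / aperture. That formula, the 550 nm default wavelength, and the function names are assumptions for illustration and are not taken from the text above.

```python
import math

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi


def airy_radius_arcsec(aperture_mm, wavelength_nm=550.0):
    """Angular radius of the Airy disk (out to the first dark ring), in arcseconds."""
    aperture_m = aperture_mm / 1000.0
    wavelength_m = wavelength_nm * 1e-9
    return 1.22 * wavelength_m / aperture_m * ARCSEC_PER_RADIAN


def airy_radius_microns(focal_ratio, wavelength_nm=550.0):
    """Linear radius of the Airy disk at the focal plane, in microns."""
    wavelength_microns = wavelength_nm / 1000.0
    return 1.22 * wavelength_microns * focal_ratio


if __name__ == "__main__":
    # A 6 inch (about 150 mm) F8 mirror, as an example.
    print(f"angular radius: {airy_radius_arcsec(150):.2f} arcsec")
    print(f"linear radius:  {airy_radius_microns(8):.2f} microns")
```

For a 150 mm F8 mirror this gives a disk radius a little under one arcsecond, or roughly five microns at the focal plane, which is the target that the reflected light has to hit.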

Our goal is very simple and direct: do not materially degrade resolution by poor optics that cause aberrations. Standards and tests are designed to ferret out poor optics.

Next, we need to talk slope tests and standards based on these tests, and surface tests and their standards.
Most standards and tests are based on slopes. The mirror is divided radially into zones, each zone ideally having the proper slope to reflect light at the correct angle such that the light passes through a mirror's theoretical resolution disk. It is important to keep in mind that the light does not go exactly where geometry says; instead, the light rays form an Airy disk surrounded by rings. Standards for slope tests ensure that the light is grouped tightly enough so that the Airy disk and rings are not materially altered in size or brightness.
Surface tests measure the mirror's surface by calculating the Optical Path Difference (OPD) or by measuring the surface directly with a profilometer.

Next, we need to talk about the numbers game. A 1/4 Peak to Valley (P-V) rating at the wavefront or image plane translates to 1/8 wave (P-V) on the mirror's surface or stated another way, +-1/16 wave (P-V). If the error is very smooth, then this will be tested, best case, to 1/14 wave RMS at the wavefront, or 1/28 wave RMS on the mirror's surface, and so on. It's useless to say that a mirror is quarter-wave without stating if it is peak to valley or RMS or at the wavefront or on the mirror's surface or is a +-.
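
The conversions in the numbers game can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the best-case P-V to RMS ratio of 3.5 is an assumption that holds only for a very smooth error, as the 1/4 to 1/14 example above implies.

```python
from fractions import Fraction


def wavefront_to_surface(error_wavefront):
    """Reflection doubles a surface error, so the surface error is half the wavefront error."""
    return error_wavefront / 2


def as_plus_minus(pv):
    """Express a peak-to-valley figure as a +- deviation about the mean."""
    return pv / 2


def best_case_rms(pv, pv_to_rms_ratio=Fraction(7, 2)):
    """Best-case RMS for a very smooth error (assumed P-V to RMS ratio of 3.5)."""
    return pv / pv_to_rms_ratio


if __name__ == "__main__":
    pv_wavefront = Fraction(1, 4)
    pv_surface = wavefront_to_surface(pv_wavefront)
    print("P-V wavefront:         ", pv_wavefront)
    print("P-V surface:           ", pv_surface)
    print("+- on the surface:     ", as_plus_minus(pv_surface))
    print("best-case RMS wavefront:", best_case_rms(pv_wavefront))
    print("best-case RMS surface:  ", best_case_rms(pv_surface))
```

The same mirror can honestly be quoted as 1/4, 1/8, 1/16, 1/14 or 1/28 wave, which is why a bare "quarter-wave" claim says so little.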

Next, let's talk about the wavelength of the light used in the test. The wavelength that our eyes are most sensitive to is green light, which has a wavelength of 500 nanometers, or 0.5 microns, or (at 25,400 microns to the inch) 0.00002 inches, or 20 millionths of an inch. Often a mirror is rated in red light at 750 nanometers. This means that to our eyes, an error rated using red light will be 50% greater than stated.
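
To see where the 50% comes from, here is a small sketch that restates an error measured in waves of the test light as waves of the light the eye is most sensitive to. The function name is hypothetical, and the red test wavelength is taken as 750 nanometers against 500 nanometers for green.

```python
def rescale_error(error_waves, test_wavelength_nm, eye_wavelength_nm=500.0):
    """Convert an error quoted in waves of the test light into waves of another wavelength."""
    # The physical size of the error does not change, only the yardstick does.
    error_nm = error_waves * test_wavelength_nm
    return error_nm / eye_wavelength_nm


if __name__ == "__main__":
    rated = 0.25  # a "1/4 wave" rating made in red light
    seen = rescale_error(rated, test_wavelength_nm=750.0)
    print(f"rated {rated} wave in red is {seen} wave in green")
```

A quarter wave in red light works out to 0.375 wave in green, so the test wavelength should always accompany the rating.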

Next, we need to talk about small slow optics versus large fast optics. Small slow optics benefit from tighter standards: the difference between 1/4 wavefront and 1/10 wavefront 6 inch [15cm] F8 mirrors is apparent. And, a mirror like this can be readily made to a high standard. Thanks to atmospheric seeing and the extreme difficulty making large fast mirrors, a peak to valley 1/4 wavefront with a smooth surface is a good rating for a large fast mirror.

Finally, we need to talk about optical envy and snobbery. Someone asks for help interpreting a test score. Let's say their mirror tests with an RTA of 0.5. Someone else jumps in to say that they would never use a mirror with an RTA greater than 0.25. Reputations suffer and people are made to feel that their mirrors are sub-standard. On the next night of poor seeing, it will be pretty hard for the owner to avoid thinking that they have a bad mirror. Yet the mirror is quite good.

Standards

The Danjon-Couder standard has two conditions:

  1. The mirror's surface should be as smooth as possible, and
  2. The overall peak to valley error on the mirror's surface should be a small fraction of the wavelength of light

The Millies-Lacroix (M-L) standard says that the light rays, geometrically speaking, should all pass through the mirror's theoretical resolution disk. The M-L graphical approach to plotting the zonal readings from the Foucault test was introduced to American amateurs in the February 1976 issue of Sky and Telescope. It describes a tornado-shaped envelope based on the reflected light rays passing through the theoretical disk of resolution. If the zonal measurements fit within the tolerance envelope, then the mirror is judged acceptable.

The Relative Transverse Aberration (RTA) standard says that light rays reflecting from the mirror should pass within the mirror's theoretical resolution disk. A passing test score is 1.0 or less. I find the weighted RTA (WRTA) useful.

The most meaningful single number to judge mirror performance appears to be the Strehl ratio. Peter John Smith on the ATM listserv notes the following papers:
"Effects of wavefront Aberration on Visual Instrument Performance, and a Consequential Test Technique" by Nigel D. Haig and G. J. Burton. Applied Optics / Vol 26, No 3 / 1 Feb, 1987.
"Psychometrically Appropriate Assessment of Afocal Optics by Measurement of the Strehl Intensity Ratio" by Nigel D. Haig and T. L. Williams. Applied Optics / Vol 34, No 10 / 1 Apr 1995.
Peter John Smith says, "the authors conclude that Strehl is by far the best single quantifier of performance for Afocal systems (telescopes) - and it can be measured quite well if needed and in fact implemented on a production basis for assessment of optical instruments. The authors conclude that - despite the type of aberrations - the Strehl ratio remains very indicative and that 0.87 is close to the 'Just Noticeable Difference' criterium used."
A perfect mirror results in 84% of the light in the Airy disk and 16% of the light in the rings. A less than perfect mirror places less light in the Airy disk and more in the rings. (Incidentally, the Airy disk can shrink for a bad mirror, resulting in better resolution in certain cases.) The Strehl ratio is defined as the intensity of the image spot at its central brightest point divided by the same image intensity without aberration. A Strehl ratio of 100% means a perfect mirror - a mirror that is putting 84% of its light into the Airy disk. Since all parts of the mirror should contribute to the light that gets into the Airy disk, the Strehl ratio is a measure of total surface quality. The Strehl ratio can be calculated using the equation from "Modern Optical Engineering" by Warren J. Smith, pg 337: Strehl ratio = (1 - 2 pi^2 RMS^2)^2, with the RMS wavefront error expressed in waves. As with RMS, any rating that measures all of the surface quality is a good valid method.
For example, Univ. of Arizona's Spinning Mirror Lab's Dean Ketelsen says that the 6 meter mirror was pushed out the door with 14 nm RMS. If the secondary is perfect, the 6 meter system will have a Strehl ratio of ~ 90%.
Sometimes, if not often, the Strehl is calculated from a pool of too few measurements. The resulting Strehls are wildly inaccurate and not useful, and can be smelled out from extravagant claims.
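
The Smith equation quoted above is easy to try out. The sketch below is illustrative only: it assumes the RMS is the wavefront error in waves, that the 14 nm figure is a surface error (so it doubles at the wavefront), and that the reference wavelength is 500 nm. The function names are made up for this example.

```python
import math


def strehl_smith(rms_wavefront_waves):
    """Strehl ratio = (1 - 2 pi^2 RMS^2)^2, RMS being the wavefront error in waves.

    Only meaningful for small errors; the expression is not valid for large RMS.
    """
    return (1 - 2 * math.pi**2 * rms_wavefront_waves**2) ** 2


def surface_nm_to_wavefront_waves(rms_surface_nm, wavelength_nm=500.0):
    """A surface error doubles on reflection; divide by the wavelength to get waves."""
    return 2 * rms_surface_nm / wavelength_nm


if __name__ == "__main__":
    print(f"1/14 wave RMS wavefront: Strehl = {strehl_smith(1 / 14):.2f}")
    rms_waves = surface_nm_to_wavefront_waves(14.0)
    print(f"14 nm RMS surface:       Strehl = {strehl_smith(rms_waves):.2f}")
```

Under these assumptions, 1/14 wave RMS gives a Strehl of about 0.81 and 14 nm RMS on the surface gives about 0.88, in line with the roughly 90% quoted for the 6 meter mirror.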

Here is Nils Olaf Carlin's comment on the pitfalls of Peak-Valley measurements:
"One obvious problem with P-V is that it only considers two points on the mirror and thus can never predict the performance of the whole mirror. Instead, RMS considers the errors over the total mirror surface, in such a way as to predict how well the light will be concentrated at the image at best focus (at least as long as the phase deviations from best wavefront are "small" - and indeed we do our best to keep them small).
An attempt at explanation: To get a perfect Airy disk, all light at the center of the disk should add in phase - any deviation somewhere on the mirror surface will introduce a phase error at the corresponding part of the wavefront. Thus the in-phase contribution will be proportional to the cosine of the phase error, and the total error will be the average of this cosine term over the aperture (the sine contributions tend to cancel at best focus). Since the cosine for small angles is 1-k*(phase error)^2, the loss of coherence is proportional to the mean of the squares of the deviations. Thus, the Mean Square error would be even more useful - only it has the unfortunate dimension of distance squared. But the main point is if you want to know how well the mirror can concentrate light at focus, RMS is the best, if not the only meaningful, measure of error.
Another important point where RMS error is extremely simple and useful (while P-V fails utterly) is comparing the relative importance of different types of errors. For example, given the measure of error of the mirror itself (assuming perfect suspension), and the known (by PLOP) deformation from the mirror cell, what will the resulting error be?
The very useful Strehl ratio gives the concentration of light at the center of the Airy disk, compared to ideal (=1). It is easy to remember that the Strehl ratio is lowered by the square of each *surface* error (in nm) divided by (approx) 2000. Example: a mirror with 10 nm RMS surface error will have a Strehl ratio of 1-(10^2)/2000 = 0.95. Suppose the mirror cell contributes another 4 nm RMS to the surface error - how much worse will the resulting deviation be? The Strehl ratio is lowered by (4^2)/2000 = 0.008 - the diffraction peak is lowered by less than a percent (!) by the deformation of the cell.
(For wavefront RMS error in nm, divide the squared error by 8000. If you have it in fractional "wavelengths", try dividing 40 by the denominator squared. For simplicity, a wavelength of 560 nm is assumed here.) Example - what is the Strehl ratio of a mirror with 1/14.1 wavelengths RMS error on the wavefront? 1-40/14.1^2 = 0.80! Or one with 20 nm RMS surface error: 1-20^2/2000 = 0.80!!
A mirror with an error of 20 nm RMS on the surface (or 40 nm RMS on the wavefront) is right at the edge of the jolly old Rayleigh "quarter-wave" criterion."
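
Carlin's rules of thumb can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names are invented here, the errors from different sources are assumed to be independent, and the "about 2000" divisor is taken to come from the usual small-error approximation Strehl = 1 - (2 pi x wavefront RMS in waves)^2 at 560 nm.

```python
import math


def surface_divisor(wavelength_nm=560.0):
    """Divisor behind the 'about 2000' rule for surface errors in nm: (wavelength / 4 pi)^2."""
    return (wavelength_nm / (4 * math.pi)) ** 2


def strehl_from_surface_nm(*rms_surface_nm, divisor=2000.0):
    """Each independent surface error (nm RMS) lowers the Strehl by error^2 / divisor."""
    return 1 - sum(e**2 for e in rms_surface_nm) / divisor


def strehl_from_wavefront_fraction(denominator):
    """Wavefront RMS given as 1/denominator of a wave: Strehl is about 1 - 40 / denominator^2."""
    return 1 - 40 / denominator**2


def combined_rms(*rms_values):
    """Independent RMS errors add in quadrature."""
    return math.sqrt(sum(e**2 for e in rms_values))


if __name__ == "__main__":
    print(f"exact divisor at 560 nm: {surface_divisor():.0f}")
    print(f"mirror alone (10 nm):    {strehl_from_surface_nm(10):.3f}")
    print(f"mirror plus cell (4 nm): {strehl_from_surface_nm(10, 4):.3f}")
    print(f"combined surface RMS:    {combined_rms(10, 4):.1f} nm")
    print(f"1/14.1 wave wavefront:   {strehl_from_wavefront_fraction(14.1):.2f}")
```

Because the squared errors simply add, it is easy to see which source of error dominates. That comparison is not possible with peak to valley figures.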

Marechal Limit: 1/14 wavefront RMS, which is loosely equivalent to a Strehl of 0.80. RMS is the root mean square error, or a statistically averaged error. The RMS is computed by finding the wavefront error at a number of uniformly spaced points on the mirror, computing the average error, subtracting the average from each individual value, squaring each difference, summing the squares, dividing the sum by the number of data points minus 1, and then taking the square root. (If you're familiar with statistics, it's the same equation as for a sample standard deviation.) Because RMS says something about many points on a mirror, it is a much more meaningful measurement than peak to valley. Oftentimes, P-V and RMS are related by a 3.5 to 1 ratio. If the mirror has but a single broad smooth defect, the 1/4 wavefront P-V could be converted by 1/(4*3.5) = 1/14 wavefront RMS. Unfortunately, this is only true in the rare case that the mirror surface is a smooth and true conic. Therefore RMS cannot be derived simply by taking the peak-valley measurement and dividing it by 3.5. A mirror whose quoted RMS just happens to be 3.5x to 4x smaller than its P-V should be viewed skeptically. Chances are that peak to valley was measured, then the best case guess of the RMS error was calculated as above. See: paraboloid.html for much more on this and other related topics.
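
The RMS recipe above translates directly into code. The sketch below follows the steps as described, with hypothetical function names and made-up sample values, and checks the result against Python's built-in sample standard deviation.

```python
import math
import statistics


def rms_error(errors):
    """RMS of the errors about their mean, dividing by (n - 1) as described above."""
    n = len(errors)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = sum(errors) / n
    sum_of_squares = sum((e - mean) ** 2 for e in errors)
    return math.sqrt(sum_of_squares / (n - 1))


def peak_to_valley(errors):
    """Difference between the highest and lowest measured points."""
    return max(errors) - min(errors)


if __name__ == "__main__":
    # Made-up wavefront errors (in waves) at uniformly spaced points.
    sample = [0.00, 0.03, 0.05, 0.02, -0.04, -0.06, 0.01, 0.04]
    rms = rms_error(sample)
    pv = peak_to_valley(sample)
    print(f"P-V = {pv:.3f} wave, RMS = {rms:.3f} wave, ratio = {pv / rms:.1f}")
    # Same number as the sample standard deviation.
    print(f"statistics.stdev agrees: {math.isclose(rms, statistics.stdev(sample))}")
```

The P-V to RMS ratio printed for an arbitrary set of readings will generally not be 3.5, which is the point made above about not deriving one from the other.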

Adjacent RMS: Larger mirrors may be spec'd where adjacent areas of the mirror must have better or lower RMS values than distant areas of the mirror. For instance, areas 4-6 inches apart may be required to be within 1/20 wavefront RMS whereas areas 8-12 inches apart only need to be within 1/14 RMS. Atmospheric seeing, thanks to the atmosphere's poor mixing qualities, mimics adjacent RMS. Rating a mirror by adjacent RMS is saying that the mirror has but broad smooth gentle errors, and admitting that it will be used at the bottom of the ocean of air.

Star test standard: immaterial differences between intra and extra focal images, diagonal breakout on either side of focus occurs at about the same distance, and there is no turned edge. The star test is qualitative only. Popularized by John Dobson and his protégé Bob Kestner (head optician for the COSTAR Hubble correction optics at Tinsley Labs), this method relies on slightly defocusing a test star, first inside of focus, then outside of focus. By noting how the light from the entire mirror focuses, a rating in the form of excellent, good, average, poor, or unacceptable can be given to the mirror.
Here is Bratislav's star test scale:

  1. Can't find anything wrong with it, absolutely perfect: 'expletive' Yet to see one after ~25 yrs.
  2. Defects visible only in extrafocal images, and only after extensive star testing in best seeing conditions (1/10 wf): 'You lucky b@$#@rd!' Can count these on fingers of one hand.
  3. Extrafocal defects readily visible, but really minor (1/10 wf): 'Excellent' Best examples of best commercial telescopes (Zeiss, AP, Tak, etc). Best examples of homemade optics.
  4. Extrafocal defects fairly obvious, but in focus image still essentially perfect (1/10 - 1/6 wf): 'Very good' Majority of current 'best commercial telescopes'; best examples of mass produced scopes.
  5. Large defects visible on extrafocal images, in focus image suffers only slightly (1/6 - 1/4 wf): 'Good' selected examples of mass produced telescopes, most well made amateur optics; some examples of 'best commercial scopes' can still be found here.
  6. In focus image visibly suffers (~1/4 wf): 'Acceptable' good mass produced scope, most good large/fast mirrors I've seen.
  7. Image deterioration serious, clearly beyond 1/4 wavefront: 'Light bucket' majority of older generation mass produced scopes, special purpose telescopes (astrographs).
  8. It's difficult to determine when scope is in focus at all ( 1/2 - 1 wf): 'If you're happy with it ...' unfortunately, not that difficult to find !
  9. Usable only at very low magnification ( ~1 wf): 'I don't want to have anything to do with this one'
  10. Absolutely useless: 'expletive' unlike 1, I've seen these :-)
99% of all scopes I've seen fall into '4-10' bracket.

Social contract: Harold Suiter, in an archived post to the ATM list, says, "Somewhat arbitrarily, we draw a line at some level of improvement and say 'Beyond this, the optical system is diffraction-limited.'" "However, the way we draw this line must be agreed upon because it has no immediately obvious placement. It is a social contract rather than one imposed by physics."

Magnification roll-off: Another qualitative method, though it does demonstrate the MTF (modulation transfer function). A low contrast object such as Jupiter is selected, and a series of magnifications is run through. At some point, the image will roll off and begin to lose its sharpness. Dividing this magnification by the aperture gives a rating such as, this mirror is good to 50x per inch of aperture.

Here is my scale:

For smaller slower mirrors, push these numbers higher; for large fast mirrors, push the numbers lower.

Encircled energy: A measurement often used by professional opticians specifies the circle diameter that contains a certain percentage of the light rays. For instance, 90% of the reflected light rays shall pass through a 1 arcsecond circle.
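
An encircled energy specification of this kind can be checked with a few lines of code once the landing positions of the rays are known. The sketch below is a simplified geometric illustration: the ray offsets are made-up sample data and the function names are hypothetical.

```python
import math


def encircled_fraction(ray_offsets_arcsec, circle_diameter_arcsec):
    """Fraction of rays landing inside a circle of the given diameter centered on the image."""
    if not ray_offsets_arcsec:
        raise ValueError("no rays supplied")
    radius = circle_diameter_arcsec / 2
    inside = sum(1 for x, y in ray_offsets_arcsec if math.hypot(x, y) <= radius)
    return inside / len(ray_offsets_arcsec)


def meets_spec(ray_offsets_arcsec, circle_diameter_arcsec=1.0, required_fraction=0.90):
    """True if at least the required fraction of rays passes through the circle."""
    return encircled_fraction(ray_offsets_arcsec, circle_diameter_arcsec) >= required_fraction


if __name__ == "__main__":
    # Made-up (x, y) ray offsets from the image center, in arcseconds.
    rays = [(0.0, 0.0), (0.1, 0.1), (0.3, 0.0), (0.0, 0.4), (0.6, 0.0)]
    print(f"fraction inside a 1 arcsecond circle: {encircled_fraction(rays, 1.0):.2f}")
    print(f"meets a 90% specification: {meets_spec(rays)}")
```

With this sample, four of the five rays fall inside the circle, so the mirror would miss a 90% specification.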

What does diffraction limited mean?

The term "diffraction limited" has so many meanings that it is meaningless. Here are some meanings:

Optics limited by laws of diffraction. Good and bad optics are equally limited by the laws of diffraction, so this definition applies equally well to all optics.

The optic's resolution is no worse than the Airy disk or, put another way, the output image is limited in its quality only by the aperture of the instrument and the effects of the central obstruction (a 1/3 obstruction is equivalent to a 1/4 wavefront P-V degradation).

The most common definition is 1/4 wavefront P-V. Light rays of optics are ray traced with the goal that the optics bring all the light rays to a point, or more precisely, a circular area called the circle of energy or focus spot. As the light rays are brought ever closer together, improvement is seen, but only up to a point. No matter how tightly the light rays are traced, a dot and not a point remains, thanks to the diffractive nature of light. When the improvement ceases to be significant, the optic is called 'diffraction limited'. This is a somewhat arbitrary judgment. Professional opticians most often relate this to 1/4 wavefront P-V.

Schroeder's ASTRONOMICAL OPTICS says that the optics are diffraction limited if the Strehl ratio is greater than 0.80, which matches the 1/4 wavefront Rayleigh tolerance for spherical aberration.

Assuming the somewhat rare case of a smooth conic mirror surface, this equates to the Marechal Limit of 1/14 wavefront RMS.

Yet another definition: since the wave nature of light fundamentally limits the resolving power of the optics, diffraction limited means that the errors in the optics don't materially degrade the resolving power.

Danjon-Couder condition #1 is met.

McGraw-Hill Dictionary Of Scientific and Technical Terms, 5th ed. gives the following definition: "Capable of producing images whose separations are as small as the theoretical limit imposed by diffraction effects."

If a mirror gets 80% of the theoretical amount into the Airy disk, it's considered diffraction limited.

Mirror tests

Caustic: The returned light rays from the mirror actually do not come to a focus on axis, instead they form caustic horns to each side. The Caustic test measures not only in the 'Y' axis as the Foucault test does, but also measures in the 'X' axis. This can result in extremely precise results. Unfortunately, the X-Y testing stage is intimidating to build.

Poor-man Caustic: This uses the 2nd derivative. See Jim Burrows' reduction program. Can be easy to use. See my 'parabolizing' webpage for information on how to execute the test.

Double Pass AutoCollimation Test (DPAC). The mirror is tested at focus using a large flat to return the collimated beam of light to focus. Errors are doubled and easier to see. The return beam is most commonly inspected with a Ronchi grating. The Delmarva Mirror Making Workshop uses this test with great effectiveness.

Foucault: This is the time honored amateur test using a knife-edge cutting into the returning light. A series of zones across the mirror face are measured for longitudinal aberration. The zones are created by using a Couder Mask. Zonal differences down to 1/100 wave can be measured, see Burrows, Texereau, Hall. Mirror areas between the zones are interpolated. Phil Oltmann recently completed a 32 inch [0.81m] F/2.4 using the Foucault test.

Unmasked Foucault: this test, first mentioned by Michael Peck and developed by Mark Cowan, uses a longitudinally travelling knife-edge to create a series of digital images that are analyzed by software to determine the radius of curvature of each part of the mirror. For more see this thread on the Cloudy Nights forum.

Hartmann test uses a grid of holes placed over the mirror. The resulting grid of light is measured for the desired paraboloidal distortion.

Holomask test developed by Mauritz Andersson. His page is here and my online calculator is here.

Interferometric (Bath). This is a test that measures the mirror's surface using Optical Path Differences; the components are inexpensive and Dale Eason's software is free - just an incredible service to the community.

The Lyot phase contrast test reveals very small scale surface roughness. Wolfgang Rohr worked on phase contrast tests. A motivated amateur can build a tester to reveal very small scale defects by using a candle to deposit soot on a piece of glass, then using a razor blade to create a smoked slit. Test as if a Foucault and as the beam of light moves across the mirror, there will be a magical point where all the light in phase is absorbed, leaving very small scale defects shining through. For more see Herbert Highstone's articles. Finally, if the Foucault knife-edge is positioned just so deep into the light, small scale defects can begin to be seen in the shadows.

Ronchi: A ruled grating of about 100 lines per inch is placed in front of the light source and the eye. The mirror will distort the returning lines into wavy bands. By carefully comparing to computer generated ideal bands at specific distances from the mirror, the state of the mirror may be deduced. I use this test - it's incredibly quick and can be quite accurate. Go here to see how I used the Ronchi Matching Test for my very fast mirrors including a 25 inch [0.64m] F/2.6. A Ronchi grating used in place of the eyepiece at focus is not particularly discerning.

Ross null test: this test uses a single large lens properly positioned to reverse the spherical aberration resulting in a null pattern. Steve Swayze uses this with good results as does the Delmarva Mirror Making Workshop.

SCOTS (Software Configurable Optical Test System) test. For more see this paper.

Star test: using a high power eyepiece, the inside of focus and outside of focus star disks are inspected for differences. The indoor version of the star test uses a scope of slightly greater aperture than the mirror under test with an illuminated pinhole placed at the focal plane to generate a collimated beam of light that is fed into a tube assembly with the test mirror. For more, see my star test page.

Waineo null test: popularized by Tom Waineo, this test uses a sphere placed at the conjugate focus.

Wire test. This can work nicely on large fast optics.

Check out Malacara's Optical Shop Testing for a comprehensive resource to mirror testing. Malacara describes dozens of tests; most appear workable by amateurs and capable of quarter wavefront accuracy and better.

More tests can easily be dreamed up, for example, a light source at focus projected through a mesh whose return parallel beam from the mirror can be projected onto a screen and measured for deformation.

Thoughts from leading opticians who serve the amateur community

Carl Zambuto
John Lightholder
Mike Lockwood
R.F. Royce
Terry Ostahowski