The Method of Least Squares

The method of least squares was introduced by Legendre in his work Nouvelles méthodes pour la détermination des orbites des comètes, published in 1805. The method was easy to apply, apart from the labor of the computations involved. What it lacked was a theoretical foundation. Here follows a summary of the various arguments put forward to justify the use of least squares.

Robert Ellis has given a detailed comparison of the so-called proofs of the method of least squares in "On the Method of Least Squares" which appeared in the Transactions of the Cambridge Philosophical Society in 1844, pp. 204-219. He treats the proofs of Gauss, Laplace and Ivory.

Cleveland Abbe discovered the proof by Adrain and reported this in "A Historical Note on the Method of Least Squares" which appeared in the American Journal of Science and Arts 1: 411-415 (1871).

J.W.L. Glaisher has contributed "On the Law of Facility of Errors of Observation, and on the Method of Least Squares", Memoirs of the Royal Astronomical Society, Vol. XXXIX (1872), pp. 75-124. In this he first examines the proofs of Adrain. Glaisher also groups proofs according to the following scheme:

Gauss' proof based on the principle of the Arithmetic Mean with which he includes work by Encke, De Morgan and Ellis.
Laplace's proof including Poisson's simplification and criticism by Ivory.
Gauss' second proof and its relation to that of Laplace.
Ivory's proofs and criticism of Ellis.
Herschel's proof and the criticisms of Ellis and Boole.
Tait's proof.
Donkin's proof.

This was followed by a study by Mansfield Merriman, who gave a chronology of proofs of the method of least squares in The Analyst for March 1877, Volume IV, No. 2, pp. 33-36. This journal and his paper are available through JSTOR. In addition, from the Transactions of the Connecticut Academy, Vol. IV, 1877, we have "A List of Writings relating to the Method of Least Squares, with historical and critical notes." This document lists 408 memoirs, books and parts of books related to the Theory of Errors.

The primary sources

Comments are derived from the aforementioned papers.

Robert Adrain, "Research concerning the probabilities of the errors which happen in making observations." The Analyst (1808) No. IV, pp. 93-109. Two proofs are contained therein. The first is on pages 93-95 and has been reprinted by Cleveland Abbe (see above); the second lies on pages 96-97. Merriman himself reprints the gist of it in his 1877 paper. Herschel's proof is similar. See also Ellis 1850 below.
Carl Gauss, Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium, 1809. pp. 205-224. Charles Henry Davis made an English translation published as Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections in 1857. See pages 253-273 there or paragraphs 175-189 of Book II Section 3. Dale F. Trotter also made a translation which is based on that of Bertrand.

Gauss assumes that the arithmetic mean is the most probable result for a sequence of observations of a quantity. The only law of error consistent with this assumption is the Gaussian distribution, from which the method of least squares follows.
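
The chain of reasoning in this argument can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the papers cited: x_1, ..., x_n are observations of an unknown quantity mu, e_i = x_i - mu are the errors, phi is the error density, and h is a precision constant. The middle step also assumes that the condition holds for every sample and every n, and that phi is smooth and positive.

```latex
% Sketch only; \varphi, h, \mu and e_i are notation assumed for illustration.
% Requirement: the most probable value of \mu is the arithmetic mean.
\[
  \frac{d}{d\mu} \sum_{i=1}^{n} \log \varphi(x_i - \mu) = 0
  \quad\text{must hold at}\quad
  \mu = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i .
\]
% Since \sum_i (x_i - \bar{x}) = 0 for every sample, this forces
% \varphi'(e)/\varphi(e) to be proportional to e:
\[
  \frac{\varphi'(e)}{\varphi(e)} = -2h^{2} e
  \quad\Longrightarrow\quad
  \varphi(e) = \frac{h}{\sqrt{\pi}}\, e^{-h^{2} e^{2}} .
\]
% The joint density of independent errors is then
\[
  \prod_{i=1}^{n} \varphi(e_i)
  = \left( \frac{h}{\sqrt{\pi}} \right)^{n}
    \exp\!\Big( -h^{2} \sum_{i=1}^{n} e_i^{2} \Big),
\]
% which is largest exactly when \sum_i e_i^2 is smallest.
```

Under these assumptions the most probable values of the unknowns are those that minimize the sum of the squared errors, which is the least squares criterion.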

Ellis states that details of Gauss's reasoning may be found in the paper by Bessel "Bestimmung der Axen des elliptischen Rotationssphaeroids, welches den vorhandenen Messungen von Meridianbögen der Erde am meisten entspricht" originally published in Astronomische Nachrichten 14, Nr. 333. Its translation as "Determination of the Axes of the Elliptic Spheroid of Revolution which most nearly corresponds with the existing Measurements of Arcs of the Meridian" is in Taylor's Scientific Memoirs Vol. 2 on pages 387-400.

We note that Encke also followed the same demonstration in the paper "Über die Methode der kleinsten Quadrate", Berliner Astronomisches Jahrbuch for 1834 (1832), pages 249-312 (including tables), of which a translation appears in the same Scientific Memoirs as "On the Method of Least Squares" on pages 317-369. This article is continued in Berliner Astronomisches Jahrbuch for 1835 (1833), pp. 253-320 and Berliner Astronomisches Jahrbuch for 1836 (1834), pp. 253-308. Finally, further notes appear in "Hr. Encke las einen Beitrag zur Begründung der Methode der kleinsten Quadrate," Bericht über die zur Bekanntmachung geeigneten Verhandlungen, (1850), pp. 211-213.

Augustus De Morgan discusses the theory of the arithmetic mean in "On the Theory of Errors of Observation" Cambridge Philosophical Transactions, Vol. X, (1864) pp. 409-42.

Schiaparelli provides a justification for the use of the arithmetic mean in "Sur le principe de la moyenne arithmétique" in Astronomische Nachrichten Vol. LXXXVII (1875), Nr. 2068, columns 55-58. A copy may be downloaded from the SAO/NASA Astrophysics Data System.
Pierre Laplace, "Mémoire sur les approximations des formules qui sont fonctions de très-grands nombres, et sur leur application aux probabilités" and supplement, "Mémoire sur les approximations des formules qui sont fonctions de très-grands nombres, et sur leur application aux probabilités (suite)". Mém. l'Institut France 1809 (1810), 353-415, 559-565. For the proof see pages 383-389 and 559-565.
Pierre Laplace reproduces the proof in the Théorie Analytique des Probabilités, Chapter IV (pp. 309-354 or paragraphs 18-24). His second demonstration may be found on pp. 318-319.

Laplace shows that the method of least squares follows if all observations follow the same law of error and the number of observations increases without limit. He limits himself to two unknowns. Of course, the proof fails if the law of error is the Cauchy distribution. For this, see Poisson "Sur la probabilité des résultats moyens des observations." Connaissance des Temps, 1827, pp. 273-302.
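
Why a large-sample argument breaks down for the Cauchy law can be seen from a short calculation. This is a sketch in notation assumed here (f for the error density, psi for its characteristic function, e-bar for the mean of n independent errors), not a reproduction of Poisson's own argument.

```latex
% Sketch only; f, \psi and \bar{e} are notation assumed for illustration.
% Standard Cauchy error density and its characteristic function:
\[
  f(e) = \frac{1}{\pi (1 + e^{2})}, \qquad
  \psi(t) = \mathbb{E}\, e^{\mathrm{i} t e} = e^{-|t|} .
\]
% Characteristic function of the mean of n independent such errors:
\[
  \psi_{\bar{e}}(t) = \Big[ \psi\!\big(\tfrac{t}{n}\big) \Big]^{n}
  = \big( e^{-|t|/n} \big)^{n} = e^{-|t|} .
\]
```

The mean of n such errors therefore has the same distribution as a single error, so averaging gains no precision, and the mean does not tend to a normal law as n increases.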

Ellis gave the extension to any number of unknowns in 1844. Glaisher simplifies the argument of Laplace in "Remarks on certain portions of Laplace's Proof of the Method of Least Squares", Philosophical Magazine Vol. 43, 4th Series, 1864 and again in 1872.

Todhunter also extends the method of Laplace in "On the Method of Least Squares", Transactions of the Cambridge Philosophical Society, Vol. 11 (1871). pp. 219-238.
Carl Gauss, "Theoria combinationis observationum erroribus minimis obnoxiae," Comm. Soc. Göttingen, Vol. V (1823), pp. 33-90. The English translation by Dale F. Trotter is based on the translation of Bertrand.

Here Gauss assumes that the importance of an error varies as the square of its magnitude. He has been accused of petitio principii, or begging the question. The mean value of the sum of the squares is taken as a measure of precision. Merriman believes this argument to be followed only by Helmert in 1872.
James Ivory, "On the method of Least Squares," Tilloch's Philosophical Magazine, Vol. LXV, (1825) pp. 3-10, 81-88, 161-168 and Tilloch's Philosophical Magazine, Vol. LXVIII (1826) pp. 161-165.

Ellis claims that Ivory gave three arguments. Glaisher finds four. The first argument (page 5) rests on an analogy to the condition of equilibrium which leads to the method of least squares. The second (pages 6-7) is based on minimizing the mean square error (the measure of precision) when several sets of observations are made. The third is a variant of this in that one minimizes the measure of precision among a set of observations. The last argument is based upon a symmetric law of error and independent errors in observations. In this case he claims the method of least squares follows from the equations of condition.
Hagen, Grundzüge der Wahrscheinlichkeits-Rechnung, 1837, second edition 1867. An error is the sum of indefinitely many elementary errors of equal magnitude, each equally likely to be positive or negative. That is, the elementary errors are distributed binomially with parameter p = 0.5, so that the distribution of an error is asymptotically normal. An exposition by Charles Kummel in English may be found in The Analyst Vol. III for 1876 on pages 133-140, 165-178.
Friedrich Bessel, "Untersuchungen über die Wahrscheinlichkeit der Beobachtungsfehler." Astronomische Nachrichten, 1838, Vol. XV, col. 369-404. A copy may be downloaded from the SAO/NASA Astrophysics Data System. Bessel demonstrates, by providing concrete examples, that the normal distribution of errors cannot be assumed a priori. However, the error law is closely approximated if an error is the result of many causes and no cause dominates.

Donkin, An essay on the Theory of the Combination of Observations (1844) Ashmolean Society and in translation as "Sur la Théorie de la Combinaison des Observations" in Liouville's Jour. Math. Vol. XV, (1855) pp. 297-322. Merriman asserts that the reasoning is neither clear nor rigorous.
John Herschel, "Quetelet on Probabilities." Edinburgh Review (1850) Vol. XCII, pp. 1-57. The proof lies on pages 19 and 20. It is the same as that given by Adrain. For a review of his review, see Robert Ellis in the Philosophical Magazine (1850), Vol. 37, pp. 321-328. Ellis puts the argument into mathematical language. See also Glaisher 1872. George Boole defends Herschel with "On the Application of the Theory of Probabilities to the Question of the Combination of Testimonies or Judgments," Transactions of the Royal Society of Edinburgh, Vol. XXI (1857).
Peter Guthrie Tait, "On the Law of Frequency of Error", Transactions of the Royal Society of Edinburgh, Vol. XXIV (1865). Reprinted in Scientific Papers Vol. 1, 1898.
Donkin, "On an analogy relating to the theory of probability and on the principle of least squares," Quarterly Jour. Math., 1857, Vol. I, pp. 152-162. An analysis of the reasoning is given in Glaisher 1872.
Crofton, "On the Proof of the Law of Errors of Observations." Phil. Trans. 1870, pp. 175-188. Here an error is the result of a large number of small errors positive and negative but not equally probable.
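
The hypothesis of elementary errors that appears above in Hagen (equal chances of a positive or negative elementary error) and in Crofton (unequal chances) can be checked numerically. The following is a minimal sketch, not a reconstruction of either author's derivation: it counts the positive elementary errors among n, which is exactly binomial, and compares that distribution with a normal density of the same mean and variance. The function names and the choices of n and p are hypothetical and made only for illustration.

```python
import math


def elementary_error_pmf(n, p=0.5):
    """Exact distribution of the number of positive elementary errors among n.

    Each elementary error is positive with probability p (assumed independent).
    """
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def normal_approximation(n, p=0.5):
    """Normal density with the binomial's mean and variance, at k = 0..n."""
    mean = n * p
    var = n * p * (1 - p)
    scale = 1.0 / math.sqrt(2 * math.pi * var)
    return [scale * math.exp(-((k - mean) ** 2) / (2 * var)) for k in range(n + 1)]


def max_gap(n, p=0.5):
    """Largest pointwise difference between the exact and the normal values."""
    exact = elementary_error_pmf(n, p)
    approx = normal_approximation(n, p)
    return max(abs(a - b) for a, b in zip(exact, approx))


if __name__ == "__main__":
    # p = 0.5 is the equal-chance case; p = 0.3 is an arbitrary unequal case.
    for n in (10, 100, 400):
        print(n, max_gap(n, 0.5), max_gap(n, 0.3))
```

In both cases the gap shrinks as the number of elementary errors grows, which is the sense in which the resulting error law is asymptotically normal.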

Several papers on the history of least squares have been published subsequently. These include:

R.L. Plackett, "A historical note on the method of least squares," Biometrika 36: 458-460 (1949)
R.L. Plackett, "The discovery of the method of least squares," Biometrika 59: 239-251 (1972)
O.B. Sheynin, "On the history of the principle of least squares," Archive for History of Exact Sciences 46: 39-54 (1993)

The most comprehensive enumeration to date of research in this area was compiled by H. Leon Harter. It was published as a series of articles in the International Statistical Review from 1974 through 1976.

Part I: Introduction, Vol. 42, No. 2, p 147, 1974.
Part I: Pre-least Squares Era (1632-1804), Vol. 42, No. 2, pp. 148-152, 1974.
Part I: Eighty Years of Least Squares (1805-1884), Vol. 42, No. 2, pp. 152-168, 1974.
References and Glossary of Code Letters, Vol. 42, No. 2, pp. 168-174, 1974.
Part II: The Awakening (1885-1945), Vol. 42, No. 3, pp. 235-264 & 282, 1974.
Part III: The Modern Era I (1946-1964), Vol. 43, No. 1, pp. 1-44, 1975.
Part IV: The Modern Era II (1965-1974), Vol. 43, No. 2, pp. 125-190, 1975.
Part V: Conclusions and Recommendations, Vol. 43, No. 3, pp. 269-278, 1975.
Addendum and Additional References, Vol. 43, No. 3, pp. 273-278, 1975.
Subject and Author Index, Vol. 44, No. 1, pp. 113-159, 1976.

Mention must also be made of the book Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900 by Richard W. Farebrother, Springer, 1999.