The method of least squares was introduced by Legendre in his work Nouvelles méthodes pour la détermination des orbites des comètes, published in 1805. The method was extremely easy to apply, apart from the computations involved. What it lacked was a theoretical foundation. Here follows a summary of the various arguments put forward to justify the use of least squares.
Robert Ellis has given a detailed comparison of the so-called proofs of the method of least squares in "On the Method of Least Squares" which appeared in the Transactions of the Cambridge Philosophical Society in 1844, pp. 204-219. He treats the proofs of Gauss, Laplace and Ivory.
Cleveland Abbe discovered the proof by Adrain and reported this in "A Historical Note on the Method of Least Squares" which appeared in the American Journal of Science and Arts 1: 411-415 (1871).
J.W.L. Glaisher has contributed "On the Law of Facility of Errors of Observation, and on the Method of Least Squares", Memoirs of the Royal Astronomical Society. Vol. XXXIX (1872) pp. 75-124. In this he first examines the proofs of Adrain. Glaisher also groups proofs according to the following scheme:
This was followed by a study by Mansfield Merriman, who gave a chronology of proofs of the method of least squares in The Analyst for March 1877, Volume IV, No. 2, pp. 33-36. This journal and his paper are available through JSTOR. In addition, from the Transactions of the Connecticut Academy, Vol. IV, 1877, we have "A List of Writings relating to the Method of Least Squares, with historical and critical notes." This document lists 408 memoirs, books and parts of books related to the Theory of Errors.
Comments are derived from the aforementioned papers.
Robert Adrain, "Research concerning the probabilities of the errors which happen in making observations." The Analyst (1808) No. IV, pp. 93-109. Two proofs are contained therein. The first is on pages 93-95 and has been reprinted by Cleveland Abbe (see above), and the second lies on pages 96-97. Merriman himself reprints the gist of it in his 1877 paper. Herschel's proof is similar. See also Ellis 1850 below.
Carl Gauss, Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium, 1809, pp. 205-224. Charles Henry Davis made an English translation published as Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections in 1857. See pages 253-273 there or paragraphs 175-189 of Book II, Section 3. Dale F. Trotter also made a translation which is based on that of Bertrand.
Gauss assumes that the arithmetic mean is the most probable result for a sequence of observations of a quantity. The only law of error consistent with this assumption is the Gaussian distribution, from which the method of least squares follows.
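The link between the normal law of error, the arithmetic mean and least squares can be checked numerically. The following is a minimal sketch, not Gauss's derivation: it assumes normally distributed errors with a known standard deviation `sigma`, and the observations and function names are hypothetical, chosen only for illustration. Under that assumption, the value that maximizes the likelihood is the same value that minimizes the sum of squared residuals, and both coincide with the arithmetic mean (up to the grid spacing used in the search).

```python
import math

def sum_of_squares(mu, observations):
    """Sum of squared residuals of the observations about a candidate value mu."""
    return sum((x - mu) ** 2 for x in observations)

def gaussian_log_likelihood(mu, observations, sigma=1.0):
    """Log-likelihood of mu, assuming independent normal errors with s.d. sigma."""
    n = len(observations)
    norm_const = -n * math.log(sigma * math.sqrt(2.0 * math.pi))
    return norm_const - sum_of_squares(mu, observations) / (2.0 * sigma ** 2)

# Hypothetical repeated observations of a single quantity.
observations = [10.2, 9.8, 10.5, 10.1, 9.9]
mean = sum(observations) / len(observations)

# Brute-force search over candidate values from 9.000 to 11.000 in steps of 0.001.
candidates = [9.0 + k * 0.001 for k in range(2001)]
best_ml = max(candidates, key=lambda m: gaussian_log_likelihood(m, observations))
best_ls = min(candidates, key=lambda m: sum_of_squares(m, observations))

print("arithmetic mean:          ", mean)
print("maximum-likelihood value: ", best_ml)
print("least-squares value:      ", best_ls)
```

A grid search is used instead of calculus so that the agreement of the three values is observed rather than assumed.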
Ellis states that details of Gauss's reasoning may be found in the paper by Bessel, "Bestimmung der Axen des elliptischen Rotationssphaeroids, welches den vorhandenen Messungen von Meridianbögen der Erde am meisten entspricht", originally published in Astronomische Nachrichten 14, Nr. 333. Its translation as "Determination of the Axes of the Elliptic Spheroid of Revolution which most nearly corresponds with the existing Measurements of Arcs of the Meridian" is in Taylor's Scientific Memoirs Vol. 2 on pages 387-400.
We note that Encke also followed the same demonstration in the paper "Über die Methode der kleinsten Quadrate", Berliner Astronomisches Jahrbuch for 1834 (1832), pages 249-312 (including tables), of which a translation appears in the same Scientific Memoirs as "On the Method of Least Squares" on pages 317-369. This article is continued in Berliner Astronomisches Jahrbuch for 1835 (1833), pp. 253-320 and Berliner Astronomisches Jahrbuch for 1836 (1834), pp. 253-308. Finally, Encke added further notes in "Hr. Encke las einen Beitrag zur Begründung der Methode der kleinsten Quadrate," Bericht über die zur Bekanntmachung geeigneten Verhandlungen (1850), pp. 211-213.
Augustus De Morgan discusses the theory of the arithmetic mean in "On the Theory of Errors of Observation", Cambridge Philosophical Transactions, Vol. X (1864), pp. 409-42.
Schiaparelli provides a justification for the use of the arithmetic mean in "Sur le principe de la moyenne arithmétique" in Astronomische Nachrichten Vol. LXXXVII (1875), Nr. 2068, columns 55-58. A copy may be downloaded from the SAO/NASA Astrophysics Data System.
Pierre Laplace, "Mémoire sur les approximations des formules qui sont fonctions de très-grands nombres, et sur leur application aux probabilités" and supplement, "Mémoire sur les approximations des formules qui sont fonctions de très-grands nombres, et sur leur application aux probabilités (suite)". Mém. l'Institut France 1809 (1810), 353-415, 559-565. For the proof see pages 383-389 and 559-565.
Pierre Laplace reproduces the proof in the Théorie Analytique des Probabilités, Chapter IV (pp. 309-354 or paragraphs 18-24). His second demonstration may be found on pp. 318-319.
Laplace shows that the method of least squares follows if all observations follow the same law of error and the number of observations increases without limit. He limits himself to two unknowns. Of course, the proof fails if the law of error is the Cauchy distribution. For this, see Poisson, "Sur la probabilité des résultats moyens des observations," Connaissance des Temps, 1827, pp. 273-302.
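The failure for the Cauchy law can be seen in a small simulation. The following is an illustrative sketch, not a reconstruction of Laplace's or Poisson's argument: the sample sizes, the seed, and the function names are arbitrary choices made here. It measures the spread (interquartile range) of the arithmetic mean over repeated samples. For normal errors the spread shrinks as the number of observations grows, while for Cauchy errors it does not shrink, because the mean of Cauchy variates has the same distribution as a single variate.

```python
import math
import random
import statistics

def draw_normal(rng):
    """One standard normal error."""
    return rng.gauss(0.0, 1.0)

def draw_cauchy(rng):
    """One standard Cauchy error, by inverse-CDF sampling."""
    return math.tan(math.pi * (rng.random() - 0.5))

def sample_means(draw, n_obs, n_repeats, rng):
    """Return n_repeats arithmetic means, each taken over n_obs simulated errors."""
    return [sum(draw(rng) for _ in range(n_obs)) / n_obs for _ in range(n_repeats)]

def interquartile_range(values):
    """Distance between the first and third quartiles of the values."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

rng = random.Random(1805)  # arbitrary fixed seed, for repeatability only
for n_obs in (10, 100, 1000):
    iqr_normal = interquartile_range(sample_means(draw_normal, n_obs, 200, rng))
    iqr_cauchy = interquartile_range(sample_means(draw_cauchy, n_obs, 200, rng))
    print(f"n = {n_obs:5d}  IQR of mean: normal {iqr_normal:.3f}, Cauchy {iqr_cauchy:.3f}")
```

The interquartile range is used instead of the variance because the Cauchy distribution has no finite variance to estimate.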
Ellis gave the extension to any number of unknowns in 1844.
Glaisher simplifies the argument of Laplace in "Remarks on certain portions of Laplace's Proof of the Method of Least Squares", Philosophical Magazine Vol. 43, 4th Series, 1864 and again in 1872.
Todhunter also extends the method of Laplace in "On the Method of Least Squares", Transactions of the Cambridge Philosophical Society, Vol. 11 (1871), pp. 219-238.
Carl Gauss, "Theoria combinationis observationum erroribus minimis obnoxiae," Comm. Soc. Göttingen, Vol. V (1823), pp. 33-90. Dale F. Trotter made a translation based on that of Bertrand.
Here Gauss assumes the importance of the error varies as the square of its magnitude. He has been accused of petitio principii, or begging the question. The mean value of the sum of the squares is taken as a measure of precision. Merriman believes this argument to be followed only by Helmert in 1872.
James Ivory, "On the method of Least Squares," Tilloch's Philosophical Magazine, Vol. LXV (1825), pp. 3-10, 81-88, 161-168 and Tilloch's Philosophical Magazine, Vol. LXVIII (1826), pp. 161-165.
Ellis claims that Ivory gave three arguments. Glaisher finds four. The first argument (page 5) rests on an analogy to the condition of equilibrium which leads to the method of least squares. The second (pages 6-7) is based on minimizing the mean square error (the measure of precision) when several sets of observations are made. The third is a variant of this in that one minimizes the measure of precision among a set of observations. The last argument is based upon a symmetric law of error and independent errors in observations. In this case he claims the method of least squares follows from the equations of condition.
Hagen, Grundzüge der Wahrscheinlichkeits-Rechnung, 1837, second edition 1867. An error is regarded as the sum of indefinitely many elementary errors of equal magnitude, each equally likely to be positive or negative. That is, the elementary errors are distributed binomially with parameter p = 0.5, so that the distribution of an error is asymptotically normal. An exposition by Charles Kummel in English may be found in The Analyst Vol. III for 1876 on pages 133-140, 165-178.
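The hypothesis of elementary errors can be illustrated by comparing the exact binomial distribution with its normal approximation. This is a minimal sketch under stated assumptions, not Hagen's own derivation: each elementary error is taken to be +eps or -eps with probability 1/2, and the function names and the values of n are hypothetical. Because the total error lies on a lattice of spacing 2*eps, the normal density is multiplied by that spacing before being compared with the exact probabilities.

```python
import math

def elementary_error_pmf(n, eps=1.0):
    """Exact distribution of the sum of n elementary errors, each +eps or -eps
    with probability 1/2. Maps each possible total error to its probability."""
    total = 2 ** n
    return {(2 * k - n) * eps: math.comb(n, k) / total for k in range(n + 1)}

def normal_lattice_approx(x, n, eps=1.0):
    """Normal density with variance n * eps**2 at x, times the lattice spacing 2 * eps."""
    sigma = math.sqrt(n) * eps
    density = math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))
    return 2.0 * eps * density

def max_abs_gap(n, eps=1.0):
    """Largest absolute difference between the exact and approximate probabilities."""
    pmf = elementary_error_pmf(n, eps)
    return max(abs(p - normal_lattice_approx(x, n, eps)) for x, p in pmf.items())

for n in (10, 100, 1000):
    print(f"n = {n:4d}  largest gap between binomial and normal: {max_abs_gap(n):.2e}")
```

The gap shrinks as n grows, which is the sense in which the distribution of the total error is asymptotically normal.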
Friedrich Bessel, "Untersuchungen über die Wahrscheinlichkeit der Beobachtungsfehler." Astronomische Nachrichten, 1838, Vol. XV, col. 369-404. A copy may be downloaded from the SAO/NASA Astrophysics Data System. Bessel demonstrates by concrete examples that the normal distribution of errors cannot be assumed a priori. However, the error law is closely approximated if an error is the result of many causes and no cause dominates.
Donkin, An essay on the Theory of the Combination of Observations (1844) Ashmolean Society and in translation as "Sur la Théorie de la Combinaison des Observations" in Liouville's Jour. Math. Vol. XV, (1855) pp. 297-322. Merriman asserts that the reasoning is neither clear nor rigorous.
John Herschel, "Quetelet on Probabilities." Edinburgh Review (1850) Vol. XCII, pp. 1-57. The proof lies on pages 19 and 20. It is the same as that given by Adrain. For a review of his review, see Robert Ellis in the Philosophical Magazine (1850), Vol. 37, pp. 321-328. Ellis puts the argument into mathematical language. See also Glaisher 1872. George Boole defends Herschel with "On the Application of the Theory of Probabilities to the Question of the Combination of Testimonies or Judgments," Transactions of the Royal Society of Edinburgh, Vol. XXI (1857).
Peter Guthrie Tait, "On the Law of Frequency of Error", Transactions of the Royal Society of Edinburgh, Vol. XXIV (1865). Reprinted in Scientific Papers Vol. 1, 1898.
Donkin, "On an analogy relating to the theory of probability and on the principle of least squares," Quarterly Jour. Math., 1857, Vol. I, pp. 152-162. An analysis of the reasoning is given in Glaisher 1872.
Crofton, "On the Proof of the Law of Errors of Observations." Phil. Trans. 1870, pp. 175-188. Here an error is the result of a large number of small errors, positive and negative, but not equally probable.
Several papers on the history of least squares have been published subsequently. These are
The most comprehensive enumeration to date of research in this area is that of H. Leon Harter, published as a series of articles in the International Statistical Review from 1974 through 1976.