3874 entries. Last updated May 21, 2013.

Statistics / Demography Timeline


1,000 BCE – 300 BCE

The Roman Census Circa 500 BCE

Servius Tullius, the sixth legendary king of ancient Rome and the second king of the Etruscan dynasty, introduced the Roman census to determine taxes. Conducted every five years, it provided a register of citizens and their property.


300 BCE – 30 CE

The First Census of Which Records are Preserved 2 CE

A map of Eastern China, the territories of the Han Dynasty highlighted in dark brown.

The first census of which records are preserved was taken in China during the Han Dynasty. At that time there were 57.5 million people living in Han China, the world’s largest population.


500 CE – 600

The Plague of Justinian 541 – 542

The Plague of Justinian afflicted the Eastern Roman Empire (Byzantine Empire), including its capital, Constantinople.

"The most commonly accepted cause of the pandemic is bubonic plague, which later became infamous for either causing or for contributing to the Black Death of the 14th century. The plagues' social and cultural impact during this period is comparable to that of the Black Death. In the views of 6th century Western historians, it was nearly worldwide in scope, striking central and south Asia, North Africa and Arabia, and Europe as far north as Denmark and as far west as Ireland.

"Until about 750, the plague would return with each generation throughout the Mediterranean basin. The wave of disease would also have a major impact on the future course of European history. Modern historians named this plague incident after the Eastern Roman Emperor Justinian I, who was in power at the time. He contracted the disease, but was one of a limited number of survivors" (Wikipedia article on Plague of Justinian, accessed 11-01-2010). 

In Giovanna Morelli et al., "Yersinia pestis genome sequencing identifies patterns of global phylogenetic diversity," Nature Genetics, 31 October 2010 | doi:10.1038/ng.705, the authors suggest a common origin for the Plague of Justinian and later pandemics of plague in the bacterial agent Yersinia pestis, originating in China.


1000 – 1100

More than One Million Charters Survive from the Period of Norman Rule in England 1066 – 1307

More than one million charters survive, either as originals or early copies, from the period of Norman rule in Britain, from 1066 to 1307. Many of these documents are records of property and land transactions written in Latin and recorded by religious or royal institutions. They are fundamental source material for historical research in medieval politics, economics and society.

Through these charters historians can study the rise and fall of military and religious organizations, among many other topics. For example, charters show how the Knights Hospitallers, or the Order of Saint John, a religious organization founded around 1023 to provide care for poor, sick or injured pilgrims to the Holy Land, became a religious and military organization after the Western Christian conquest of Jerusalem in 1099 during the First Crusade, when it was charged with the care and defense of the Holy Land.

In the late seventeenth and early eighteenth centuries, dating medieval charters was one of the problems that motivated Mabillon and Montfaucon to pioneer the science of palaeography. However, at least one million of the Norman charters remain undated, largely due to administrative changes introduced by William the Conqueror in 1066. To solve the problem of dating the huge number of undated charters, Gelila Tilahun and colleagues at the University of Toronto are applying computer-automated statistical techniques, with the goal of reducing the time and effort needed to date them manually and improving the accuracy of assigned dates.

"Their approach is to use a subset of some 10,000 charters that are dated and to look for changes in language over time that could be used to date other documents. For example, Tilahun and co say that the phrase “amicorum meorum vivorum et mortuorum”, which means 'of my friends living and dead', was popular between the years 1150 and 1240 but not at other times. And the phrase 'Francis et Anglicis', which is a form of address meaning 'to French and English', was phased out when England lost Normandy to the French in 1204. However, the statistical approach is much more rigorous than simply looking for common phrases. Tilahun and co’s computer search looks for patterns in the distribution of words occurring once, twice, three times and so on. 'Our goal is to develop algorithms to help automate the process of estimating the dates of undated charters through purely computational means,' they say.

"This approach reveals various patterns which they then test by attempting to date individual documents in this set. They say the best approach is one known as the maximum prevalence technique. This is a statistical technique that gives a most probable date by comparing the set of words in the document with the distribution in the training set.

"Tilahun and co say their approach also has other applications. For example, the same technique could be used to work out authorship and to weed out forgeries, of which there are known to be a substantial number.

"So how well does it work in practice? These guys finish their paper with a fascinating anecdote about a medieval English charter that was discovered in a drawer at the library of Brock University near Niagara Falls.

"The charter lacked a date so various historians attempted to work out when it was written. The first estimates pointed to the 14th century but these were later revised to the 13th century. Eventually, by comparing the charter to other records, one academic pinned it down to a date between 1235 and 1245.

"Inspired by the media interest in this charter, Tilahun and co ran the document through their automated maximum prevalence procedure. 'The date estimate we obtained was 1246,' they say, with just a little hint of pride. Not bad!" (MIT Technology Review, 01-16-2013, accessed 01-16-2013).

Gelila Tilahun, Andrey Feuerverger, and Michael Gervers, "Dating medieval English charters," Annals of Applied Statistics VI (2012) 1615-1640.
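
The general idea of estimating a date by comparing a document's words with their distribution in a dated training set can be illustrated with a toy sketch. This is not the maximum prevalence estimator of Tilahun and colleagues; it is a simplified stand-in. The mini-corpus, the Gaussian time kernel, its bandwidth of 40 years, and the add-one smoothing are all assumptions made for illustration. For each candidate year the sketch weights the dated documents by their closeness in time, builds a smoothed word distribution, and picks the year under which the undated document's words are most probable.

```python
import math
from collections import Counter

# Hypothetical toy training set of (year, words). Not real charter data.
TRAINING = [
    (1100, ["francis", "et", "anglicis", "terram", "ecclesie"]),
    (1150, ["francis", "et", "anglicis", "amicorum", "meorum"]),
    (1200, ["amicorum", "meorum", "vivorum", "et", "mortuorum"]),
    (1250, ["sciant", "presentes", "et", "futuri", "terram"]),
    (1300, ["sciant", "presentes", "et", "futuri", "quod"]),
]


def word_distribution(year, training, bandwidth):
    """Word counts around `year`, weighting each dated document by a Gaussian kernel in time."""
    counts = Counter()
    for doc_year, words in training:
        weight = math.exp(-0.5 * ((doc_year - year) / bandwidth) ** 2)
        for word in words:
            counts[word] += weight
    return counts


def log_likelihood(words, counts, vocab_size):
    """Log-probability of `words` under the weighted counts with add-one smoothing."""
    total = sum(counts.values())
    return sum(math.log((counts[w] + 1.0) / (total + vocab_size)) for w in words)


def estimate_date(words, training, candidate_years, bandwidth=40.0):
    """Return the candidate year that makes the document's words most probable."""
    vocab = {w for _, doc in training for w in doc} | set(words)
    scores = {
        year: log_likelihood(words, word_distribution(year, training, bandwidth), len(vocab))
        for year in candidate_years
    }
    return max(scores, key=scores.get)


undated = ["amicorum", "meorum", "vivorum", "et", "mortuorum"]
estimate = estimate_date(undated, TRAINING, range(1100, 1301, 10))
print(estimate)  # a year close to 1200 for this toy corpus
```

A real application would need a far larger dated corpus and a principled choice of bandwidth, for example by cross-validation on the dated charters.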

 


The Domesday Book, Recording the First English Census December 1085 – August 1086

The Domesday Book.

William I of England, better known as William the Conqueror, less well known as William the Bastard, commissioned the Domesday Book, which recorded the first English census.

The first draft of the Domesday Book was completed in August 1086 and contained records for 13,418 settlements in the English counties south of the rivers Ribble and Tees (the border with Scotland at the time). William commissioned the book to assess the extent of the land owned in England at the time, and the extent of the taxes he could raise. The information collected was recorded in two huge books in around one year's time. William died in 1087 before the Domesday Book was completed. It is preserved in The National Archives of Britain in Richmond, Greater London.

A page of the Domesday Book on Warwickshire.

The work was called the Domesday Book because:

"It was written by an observer of the survey that 'there was no single hide nor a yard of land, nor indeed one ox nor one cow nor one pig which was left out.' The grand and comprehensive scale on which the Domesday survey took place, and the irreversible nature of the information collected led people to compare it to the Last Judgement, or 'Doomsday', described in the Bible, when the deeds of Christians written in the Book of Life were to be placed before God for judgement. This name was not adopted until the late 12th Century."


1300 – 1400

The Black Death 1347 – 1353

The spread of the Bubonic plague in Europe.

The Black Death, one of the deadliest pandemics in human history, killed thirty to sixty percent of Europe's population.  

For centuries the epidemic continued to strike every 10 years or so, its last major outbreak being the Great Plague of London from 1665 to 1666. Though the vectors were not understood at the time, the disease was spread by rats and transmitted to people by fleas or, in some cases, directly by breathing.

"The pandemic is thought to have begun in Central Asia, and spread to Europe during the 1340s. The total number of deaths worldwide is estimated at 75 million people, approximately 25–50 million of which occurred in Europe. . . . It may have reduced the world's population from an estimated 450 million to between 350 and 375 million in 1400.

"The 14th century eruption of the Black Death had a drastic effect on Europe's population, irrevocably changing the social structure. It was a serious blow to the Roman Catholic Church, and resulted in widespread persecution of minorities such as Jews, foreigners, beggars, and lepers. The uncertainty of daily survival created a general mood of morbidity, influencing people to 'live for the moment', as illustrated by Giovanni Boccaccio in The Decameron (1353)" (Wikipedia article on Black Death, accessed 01-03-2009).

"The three plague waves [Plague of Justinian, Black Death, and that beginning in China's Yunnan province in 1894] have now been tied together in common family tree by a team of medical geneticists led by Mark Achtman of University College Cork in Ireland. By looking at genetic variations in living strains of Yersinia pestis, Dr. Achtman’s team has reconstructed a family tree of the bacterium. By counting the number of genetic changes, which clock up at a generally steady rate, they have dated the branch points of the tree, which enables the major branches to be correlated with historical events.

"In the issue of Nature Genetics published online Sunday [October 31, 2010], they conclude that all three of the great waves of plague originated from China, where the root of their tree is situated. Plague would have reached Europe across the Silk Road, they say. An epidemic of plague that reached East Africa was probably spread by the voyages of the Chinese admiral Zheng He who led a fleet of 300 ships to Africa in 1409" (http://www.nytimes.com/2010/11/01/health/01plague.html, accessed 11-01-2010).


1550 – 1600

The Beginning of the Collection of Medical Statistics 1592 – 1593

The collection, recording, and publishing of medical statistics in the form of Bills of Mortality began in England as a result of the epidemic of plague in 1592-93.

"The epidemic of plague, which reached its height in the year 1593, began to be felt in London in the autumn of 1592, and is said to have caused 2000 deaths before the end of the year. On the 7th September, soldiers from the north on their way to Southampton to embark for foreign parts had to pass round London 'to avoid the infection which is much spread abroad' in the city. On the 16th September, the spoil of a great Spanish carrack at Dartmouth could be brought no farther than Greenwich, on account of the contagion in London; no one to go from London to Dartmouth to buy the goods. It was an ominous sign that the infection lasted through the winter; even in mid winter people were leaving London: 'the plague is so sore that none of worth stay about these places.' On the 6th April 1593, one William Cecil who had been kept in the Fleet prison by the queen's command, writes that 'the place where he lies is a congregation of the unwholesome smells of the town, and season contagious, so many have died of the plague.' From a memorial of 1595, it appears that the neighbourhood of Fleet Ditch had been the most infected part of the whole city and liberties in 1593; 'in the last great plague more died about there than in three parishes besides.' The epidemic does not appear to have reached its height until summer. . . .

"Of that London epidemic a weekly record was kept by the Company of Parish Clerks, and published by them beginning with the weekly bill of 21st December, 1592. The clerk of the Company of Parish Clerks, writing in 1665, had the annual bill for 1593 before him, with the plague-deaths and other deaths in each of 109 parishes in alphabetical order, and the christenings as well. For the next two years, 1594 and 1595, he appears to have had before him not only the annual bills but also a complete set of the weekly bills of burials and christenings according to parishes. The same documents were used by Graunt in 1662, and had doubtless been used by John Stow at the time when they were published. The originals are all lost, and only a few totals extracted from them remain on record. . . .

"The London plague of 1592-93 called forth two known publications, an anonymous 'Good Councell against the Plague, showing sundry preservatives. . . to avoyde the infection lately begun in some places of this Cittie' (London, 1592), and the 'Defensative' of Simon Kellwaye (April, 1593). The dates of these two books show that the alarm had really begun in the end of 1592 and the early months of 1593" (Creighton, A History of Epidemics in Britain [1891] 352-53).


The earliest surviving copy of the Bills of Mortality is:

True bill of the vvhole number that hath died At London : printed by I.R[oberts]. for Iohn Trundle, and are to be sold at his shop in Barbican, neere Long lane end, [1603]

1 sheet ([1] p.) ; 1⁰. STC (2nd ed.), 16743 1-3.

 


1600 – 1650

The Earliest Known Graph of Statistical Data 1644

In 1644 Dutch astronomer and cartographer Michael Florent van Langren (Langrenus, Miguel Florencio, Michale Florent) published La Verdadera Longitud por Mar y Tierra in Antwerp as a pamphlet. To show the magnitude of the problem of determining longitude, van Langren created the first known graph of statistical data, showing the wide range of estimates of the distance in longitude between Toledo and Rome.

Friendly, Valero-Mora, and Ibáñez Ulargui, "The First (Known) Statistical Graph: Michael Florent van Langren and the 'Secret' of Longitude," 2010. http://www.datavis.ca/papers/langren-TAS09154.pdf, accessed 01-108-2013.


1650 – 1700

The Longest Series of Monthly Temperature Observations 1659

The Central England Temperature (CET) record, a meteorological dataset originally published by English climatologist Gordon Manley in 1953 and subsequently extended and updated in 1974 following many decades of painstaking work, documents the monthly mean surface air temperatures for the Midlands region of England, in degrees Celsius, from the year 1659 to the present. This record represents the longest series of monthly temperature observations in existence. It is monthly from 1659, and a daily version has been produced from 1772.

"The monthly means from November 1722 onwards are given to a precision of 0.1°C. The earliest years of the series, from 1659 to October 1722 inclusive, for the most part only have monthly means given to the nearest degree or half a degree, though there is a small 'window' of 0.1 degree precision from 1699 to 1706 inclusive. This reflects the number, accuracy, reliability and geographical spread of the temperature records that were available for the years in question" (Wikipedia article on Central England temperature, accessed 03-09-2013).

Manley, G., "The mean temperature of central England, 1698–1952," Quarterly Journal of the Royal Meteorological Society, vol. 79 (1953) 242-261.


Demography & Vital Statistics 1662

In 1662 John Graunt, a draper in London, published Natural and Political Observations Mentioned in a Following Index, and Made upon the Bills of Mortality. Basing his work primarily on London's weekly Bills of Mortality, which had been published since 1593, Graunt noted the regularity of certain vital phenomena, such as higher death rates for children under six years of age, constructed the first life expectancy tables, and attempted to use his data to describe various characteristics of populations.

Graunt was well aware of the limitations of his data, however, citing such defects as lack of thoroughness, inadequate disease vocabulary, and dishonest reporting of deaths from certain causes such as syphilis. His work first established the uniformity and predictability of many important biological phenomena when taken in large numbers, such as the greater number of male births, the longer lifespans of females, and the high mortality among infants.

It has long been debated how much Graunt's friend, the economist William Petty, contributed to the Observations; recent opinion has it that most of the work is Graunt's, although Petty may have made a few contributions. 

Carter & Muir, Printing and the Mind of Man (1967) No. 144.   Hook & Norman, The Haskell F. Norman Library of Science and Medicine (1991) No. 933.


The Great Plague of London April 1665 – September 1666

A scanning electron micrograph depicting a mass of Yersinia pestis bacteria, which is the cause of the Bubonic Plague.

Between April 1665 and September 1666 plague killed 75,000 to 100,000 people, up to a fifth of London's population. "The disease was historically identified as bubonic plague, an infection by the bacterium Yersinia pestis, transmitted through a flea vector. The 1665-1666 epidemic was on a far smaller scale than the earlier 'Black Death' pandemic, a virulent outbreak of disease in Europe between 1347 and 1353. The Bubonic Plague was only remembered afterwards as the 'great' plague because it was one of the last widespread outbreaks in England.

"At the time, the outbreak was blamed upon the French. In early April 1665, two infected French sailors were said to have collapsed and died at the junction of Drury Lane and Long Acre in London. These cases were said to have brought about all subsequent infections. This theory has been largely dismissed as anti-French propaganda. The British outbreak is actually thought to have originated from the Netherlands, where the bubonic plague had occurred intermittently since 1599, with the initial contagion arriving with Dutch trading ships carrying bales of cotton from Amsterdam. The dock areas outside of London, including the parish of St. Giles-in-the-Fields where poor workers crowded into ill-kept structures, were the first areas struck by the plague. Personal and public hygiene was very minimal during this period, contributing to the spread of disease. During the winter of 1664-1665, there were reports of several deaths. However, the very cold winter seemingly controlled the contagion. But spring and summer months were unusually warm and sunny, and the plague spread rapidly. As records were not kept on the deaths of the very poor, the first recorded case was a Rebecca Andrews, on April 12, 1665" (Wikipedia article on Great Plague of London, accessed 01-03-2009).


The First Census in North America 1666

Jean Talon, the first Intendant of New France, conducted the first census of New France (Canada). Talon conducted the census largely by himself, travelling door-to-door among the settlements of New France. He did not include Native American inhabitants of the colony, or the religious orders. This was the first census conducted in North America.

"According to Talon's census there were 3215 people in New France, and 538 separate families. There were 2034 men and 1181 women. Children and unmarried people were grouped together; there were 2154 of these, while only 1019 people were married (42 were widowed). 547 people lived in Quebec, 455 in Trois-Rivières, and 625 in Montreal. The largest single age group, 21-30 year olds, numbered 842. 763 people were professionals of some kind, and 401 of these were servants, while 16 were listed as 'gentlemen of means.' "


The Great Fire of London September 2 – September 5, 1666

The Great Fire of London swept through the central parts of the city.

"The fire gutted the medieval City of London inside the old Roman City Wall. It threatened, but did not reach, the aristocratic district of Westminster (the modern West End), Charles II's Palace of Whitehall, and most of the suburban slums. It consumed 13,200 houses, 87 parish churches, St. Paul's Cathedral, and most of the buildings of the City authorities. It is estimated that it destroyed the homes of 70,000 of the City's ca. 80,000 inhabitants. The death toll from the fire is unknown and is traditionally thought to have been small, as only six verified deaths were recorded. This reasoning has recently been challenged on the grounds that the deaths of poor and middle-class people were not recorded anywhere, and that the heat of the fire may have cremated many victims, leaving no recognizable remains.

"The social and economic problems created by the disaster were overwhelming; significant scapegoating occurred for some time after the fire. Evacuation from London and resettlement elsewhere were strongly encouraged by Charles II, who feared a London rebellion amongst the dispossessed refugees. Despite numerous radical proposals, London was reconstructed on essentially the same street plan used before the fire" (Wikipedia article on Great Fire of London, accessed 06-11-2009).


Political Arithmetick 1690

English economist Sir William Petty published in London Political Arithmetick, a major comparative study of the wealth and economic policies of England and her rivals France and Holland. This was the first of Petty's works to contain in its title the phrase he had coined to describe the application of statistics to economic theory and policy. Petty was the first to employ numerical evaluation in economics, and his work provided the decisive impulse toward econometrics and the general application of statistics.


The Breslau Tables 1693

English astronomer, mathematician, geophysicist, meteorologist and physicist Edmond Halley published "An Estimate of the Degrees of Mortality of Mankind, Drawn from Curious Tables of the Births and Funerals at the City of Breslaw, with an Attempt to Ascertain the Price of Annuities Upon Lives" in the Philosophical Transactions of the Royal Society of London. He compiled the "Breslau Tables" to show the proportion of men able to bear arms . . . to estimate mortality rates, and to ascertain the price of annuities upon lives.

J. Norman (ed.), Morton's Medical Bibliography, 5th ed. (1991) no. 1687.


1700 – 1750

One of the Earliest Applications of Statistics to a Socio-Medical Problem 1723

English physician and scientist James Jurin published A Letter . . . Containing, a Comparison Between the Mortality of the Natural Small Pox and that Given by Inoculation.

In this work, which is one of the earliest applications of statistics to a particular socio-medical problem, Jurin proved statistically that the fatality of inoculated smallpox is very much less than the fatality of natural smallpox.

J. Norman (ed.), Morton's Medical Bibliography, 5th ed. (1991) no. 1689.


Theory of Annuities 1725

Abraham de Moivre, a French Huguenot mathematician and demographer exiled in England, published Annuities upon Lives: Or, the Valuation of annuities upon any Number of lives; as also, of Reversions.

Using the mortality statistics gathered by Edmond Halley in the 1690s, Moivre formulated the theory of annuities, deriving his formulas from a postulated uniform rate of mortality and constant rates of interest on money.  "Here one finds the treatment of joint annuities on several lives, the inheritance of annuities, problems about the fair division of the costs of a tontine, and other contracts in which both age and interest on capital are relevant.  This mathematics became a standard part of all subsequent commercial applications in England" (Dictionary of Scientific Biography).

Hook & Norman, The Haskell F. Norman Library of Science and Medicine (1991) no. 1530.
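
The combination of a uniform rate of mortality with a constant rate of interest leads to a compact formula, which the following sketch illustrates. It is a modern restatement under stated assumptions, not a transcription of de Moivre's text: survivors are assumed to decline linearly to zero at a limiting age (86 is used here as an illustrative parameter), the annuity pays 1 at the end of each year survived, and the function names are hypothetical. With n = limiting age minus current age, the probability of surviving t more years is 1 - t/n, so the annuity value is the sum of v^t (1 - t/n) for t = 1..n, where v = 1/(1 + i). That sum collapses to (1 - (1 + i) a_n / n) / i, where a_n = (1 - v^n) / i is the value of an annuity certain for n years.

```python
def annuity_by_summation(age, rate, limiting_age=86):
    """Value of a life annuity of 1 per year by direct summation (linear survivorship assumed)."""
    n = limiting_age - age                # years until the assumed limiting age
    v = 1.0 / (1.0 + rate)                # one-year discount factor
    # Each payment is discounted and weighted by the probability of being alive to receive it.
    return sum(v ** t * (1.0 - t / n) for t in range(1, n + 1))


def annuity_closed_form(age, rate, limiting_age=86):
    """Same value from the closed-form expression (1 - (1 + i) * a_n / n) / i."""
    n = limiting_age - age
    v = 1.0 / (1.0 + rate)
    annuity_certain = (1.0 - v ** n) / rate   # annuity certain for n years
    return (1.0 - (1.0 + rate) * annuity_certain / n) / rate


direct = annuity_by_summation(50, 0.05)
closed = annuity_closed_form(50, 0.05)
print(round(direct, 4), round(closed, 4))  # the two values agree
```

The closed form matters historically because it replaces a long table-driven summation with a single calculation based on an ordinary annuity certain.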


Proving the Need for a Healthy and Industrious Population 1742

German army chaplain, statistician and demographer Johann Peter Süssmilch issued from Berlin Die göttliche Ordnung in den Veränderungen des menschlichen Geschlechts. In this work he showed the necessity of a healthy and industrious population for the survival of a nation.

J. Norman (ed.), Morton's Medical Bibliography, 5th ed. (1991) no. 1691.


The First Correct Life Tables 1746 – 1760

French mathematician and statistician Antoine Deparcieux issued in Paris Essai sur les probabilités de la durée de la vie humaine.  He published a supplement to this work entitled Addition à l'Essai sur les probabilités de la durée de la vie humaine in 1760. These works on annuities and mortality were the first correct "life tables."

J. Norman (ed.), Morton's Medical Bibliography, 5th ed. (1991) no. 1691.1.


1750 – 1800

The Earliest Formal Treatment of "Data-Processing" 1755

In 1755 English mathematician Thomas Simpson published "On the Advantage of Taking the Mean of a Number of Observations, in Practical Astronomy" in the Philosophical Transactions of the Royal Society 49, part 1, 82-93.  Simpson's paper was "a milestone in statistical inference, as well as the earliest formal treatment of any data-processing practice" (Hook & Norman, Origins of Cyberspace [2002] No. 16).
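
The advantage Simpson argued for can be seen in a small simulation. This is a modern illustration, not Simpson's own derivation: the uniform error distribution on [-1, 1], the choice of six observations, the number of trials, and the fixed random seed are all assumptions made for the sketch. It compares the average absolute error of a single observation with that of the mean of several observations.

```python
import random


def mean_abs_error(num_obs, trials=20000, seed=0):
    """Average absolute error of the mean of `num_obs` observations with uniform errors in [-1, 1]."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(trials):
        errors = [rng.uniform(-1.0, 1.0) for _ in range(num_obs)]
        total += abs(sum(errors) / num_obs)
    return total / trials


single = mean_abs_error(1)     # error when trusting one observation
averaged = mean_abs_error(6)   # error when taking the mean of six
print(single, averaged)        # the averaged error is markedly smaller
```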


Bayes's Theorem 1763

Two years after his death "An Essay Towards Solving a Problem in the Doctrine of Chances" by English clergyman and mathematician Thomas Bayes was published in the Philosophical Transactions of the Royal Society 53 (1763) 370-418.

Bayes's paper enunciated Bayes's Theorem for calculating "inverse probabilities," the basis for methods of extracting patterns from data in decision analysis, data mining, statistical learning machines, Bayesian networks, and Bayesian inference.

Hook & Norman, Origins of Cyberspace (2002) no. 1.
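
In modern notation the theorem states that P(H | D) = P(D | H) P(H) / P(D), where P(D) = P(D | H) P(H) + P(D | not H) P(not H). The sketch below applies it to a two-hypothesis case. The function name and the numbers are hypothetical, chosen only to show how a small prior probability limits the posterior even when the evidence is fairly reliable.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | D) by Bayes's theorem for a hypothesis H and its complement."""
    # Total probability of observing the data D under both hypotheses.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs: P(H) = 0.01, P(D | H) = 0.90, P(D | not H) = 0.05.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(round(p, 4))  # 0.1538
```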


Early Graphic Representation of Statistics 1782

In 1782 French mathematician and director of fortifications Charles Louis de Fourcroy published Essai d'une table poléométrique, ou amusement d'un amateur de plans sur les grandeurs de quelques villes in Paris at the press of Dupain-Triel père. Fourcroy's Tableau poléométrique published in this work is

"one of the oldest proportional representations of human phenomena."

"Each city is represented by a square whose area is proportional to the geographic area occupied by the city (and for the smallest cities, by a half square only, divided by the diagonal line).

"When superimposed, the squares are classed automatically. This results in visual groupings, which lead the author to propose an 'urban classification'" (Bertin, Semiology of Graphics [2011] 202-03, with reproduction).

Fourcroy's 45-page work pioneered the use of graphs in cross-sectional and mathematical analysis.  In January 2013 a color reproduction of his graph was available at this link.


Foundation of Statistical Graphics: the Line Chart and Bar Chart 1785 – 1786

In 1785 Scottish engineer and political economist William Playfair issued in London a privately circulated preliminary edition of his The Commercial and Political Atlas; Representing, by Means of Stained Copper-Plate Charts, the Exports, Imports, and General Trade of England, at a Single View. 

The next year Playfair formally published the work in London with an even longer title as The Commercial and Political Atlas; Representing, by Means of Stained Copper-Plate Charts, the Exports, Imports, and General Trade of England, at a Single View. To which are Added, Charts of the Revenue and Debts of Ireland, Done in the Same Manner by James Correy. For this work Playfair invented the line chart (also called the line graph or time-series plot), present in the book in 43 variants, and the bar chart or bar graph, represented by a single example. The first 10 plates were engraved by Scottish engraver and cartographer John Ainslie in 1785 for the preliminary edition; the remainder were engraved by Samuel John Neele. It is thought that Playfair, often short of funds, may have hand-colored the charts himself, a coloring process that he curiously designated as "staining" in the titles.

As one inspiration for his information graphics concerning economics and finance, Playfair cited Priestley's timelines as published in his New Chart of History.

"Over the course of the next half century, Playfair's line graph, which counterposed two quantitative axes, (one for time, the other for economic measures such as exports, imports and debts) became one of the most recognizable chronographic forms" (Rosenberg & Grafton, Cartographies of Time [2010] 136).

"Playfair had a variety of careers. He was in turn a millwright, engineer, draftsman, accountant, inventor, silversmith, merchant, investment broker, economist, statistician, pamphleteer, translator, publicist, land speculator, convict, banker, ardent royalist, editor, blackmailer and journalist. On leaving Watt's company in 1782, he set up a silversmithing business and shop in London, which failed. In 1787 he moved to Paris, taking part in the storming of the Bastille two years later. He returned to London in 1793, where he opened a "security bank", which also failed. From 1775 he worked as a writer and pamphleteer and did some engineering work" (Wikipedia article on William Playfair, accessed 03-16-2010).

In 2005 the third edition (1801) of Playfair's atlas with the first edition (1801) of the breviary were reproduced in color as Playfair, The Commercial and Political Atlas and Statistical Breviary, Edited and Introduced by Howard Wainer and Ian Spence. 


The First U.S. Census August 2, 1790

The first Census of the United States was conducted. The results were used to allocate Congressional seats (congressional apportionment), electoral votes, and funding for government programs.

The federal census records for the first census are missing for five states: Delaware, Georgia, Kentucky, New Jersey and Virginia. They were destroyed some time between the time of the census-taking and 1830. The census estimated the population of the United States at 3,929,214, ". . . of which 697,681 were slaves, and . . . the largest cities were New York City with 33,000 inhabitants, Philadelphia, with 28,000, Boston, with 18,000, Charleston, South Carolina, with 16,000, and Baltimore, with 13,000."

In 1791 approximately 200 copies of the census were printed by Childs and Swaine of Philadelphia as:

Return of the Whole Number of Persons within the Several Districts of the United States, According to 'An Act Providing for the Enumeration of the Inhabitants of the United States,' Passed March the First, One Thousand Seven Hundred and Ninety-One.

♦ A copy of the original edition with the autograph signature of Thomas Jefferson sold for $122,500 in the James S. Copley sale at Sotheby's, New York, on April 14, 2010.


Discovery of the Method of Least Squares 1795

Though Adrien-Marie Legendre was the first to publish the method of least squares, in 1805, Carl Friedrich Gauss is credited with developing the fundamentals of least-squares analysis in 1795 at the age of eighteen.
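
In its simplest form, the method chooses the parameters of a model so that the sum of squared differences between observed and predicted values is as small as possible. The sketch below fits a straight line y = a + b x this way, using the standard closed-form solution. The data points and function name are hypothetical, and the example is far simpler than the orbital computations to which the method was first applied.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical noisy measurements scattered around a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
intercept, slope = fit_line(xs, ys)
print(round(intercept, 2), round(slope, 2))  # 1.1 1.96
```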

"An early demonstration of the strength of Gauss's method came when it was used to predict the future location of the newly discovered asteroid Ceres. On January 1, 1801, the Italian astronomer Giuseppe Piazzi discovered Ceres and was able to track its path for 40 days before it was lost in the glare of the sun. Based on this data, it was desired to determine the location of Ceres after it emerged from behind the sun without solving the complicated Kepler's nonlinear equations of planetary motion. The only predictions that successfully allowed Hungarian astronomer Franz Xaver von Zach to relocate Ceres were those performed by the 24-year-old Gauss using least-squares analysis.

"Gauss did not publish the method until 1809, when it appeared [in Hamburg] in volume two of his work on celestial mechanics, Theoria Motus Corporum Coelestium in sectionibus conicis solem ambientium" (Wikipedia article on Least squares, accessed 08-24-2009).

View Map + Bookmark Entry

Malthus on Population 1798

Economist and demographer Thomas Malthus published in London An Essay on the Principle of Population, as it Affects the Future Improvement of Society.

In this rebuttal of the utopian views of William Godwin, Malthus reasoned that populations increase by geometrical proportion but food supply only increases arithmetically. He argued that if both food and "the passion between the sexes" are necessary to man's existence, but populations have a much greater tendency to increase than does the food supply, then a "strong and constantly operating check," such as famine, disease, or sexual deprivation, must be imposed to keep the population level consistent with the level of subsistence.
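The contrast between geometric and arithmetic growth can be made concrete with a minimal sketch. The starting values, the doubling ratio, and the unit step below are illustrative assumptions chosen for the example, not data from the Essay as described here, and the function names are hypothetical.

```python
def geometric_series(start, ratio, periods):
    """Population under geometric growth: multiplied by `ratio` each period."""
    return [start * ratio ** t for t in range(periods + 1)]


def arithmetic_series(start, step, periods):
    """Food supply under arithmetic growth: increased by `step` each period."""
    return [start + step * t for t in range(periods + 1)]


# Assumed illustrative values: both series start at 1 unit; population
# doubles every period while food supply gains 1 unit every period.
population = geometric_series(1, 2, 8)
food = arithmetic_series(1, 1, 8)

for t, (p, f) in enumerate(zip(population, food)):
    print(f"period {t}: population {p:>3}, food supply {f}")
```

Under these assumed values the two series start equal, but after eight periods the population figure is 256 against a food supply of 9, which is the widening gap that the "check" in Malthus's argument is supposed to close.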

Malthus's suppositions, though reasonable, were largely intuitive. Though the Essay contained no supporting numerical data, it was extremely influential on passage of the Census Act or Population Act of 1800, which led in 1801 to the first Census of England, Scotland and Wales. Using some of the information gathered in the first census, Malthus supplied factual documentation to support his theories in the greatly expanded second edition of his Essay published in 1803.

Hook & Norman, The Haskell F. Norman Library of Science and Medicine (1991) no. 1431.

View Map + Bookmark Entry

1800 – 1850

The First Census of England, Scotland and Wales 1801

Following the passage of the Census Act or Population Act of 1800, which he was largely responsible for drafting, John Rickman supervised the first Census of England, Scotland and Wales— the first detailed census ever undertaken of any country.

"The 1801 census was in two parts: the first was concerned with the number of people, their occupations, and numbers of families and houses. The second was a collection of the numbers of baptisms, marriages and burials, thus giving an indication of the rate at which the population was increasing or decreasing. Information was collected by census enumerators who were usually the local Overseers of the Poor or (in Scotland) schoolmasters. They visited individual households and gathered the required information, before submitting statistical summaries. The details of households and individuals were important only in creating these local summaries and were destroyed in all but a few cases."

John Rickman first proposed the census in 1796 in an article in the Commercial, Agricultural, and Manufacturer's Magazine, which he edited. The Secretary to the Treasury, George Rose, noticed the article and in 1800 the Census Act, drafted by Rickman, was presented to parliament. Rickman then directed the census and was responsible for digesting and annotating the data.

The study of population was one of the major concerns of political economy at this time and the first census came at a crucial point in the debate. When Malthus published his Essay on population in 1798, demographic knowledge was necessarily limited. After the results of the first census were known, Malthus extensively revised and expanded the Essay, incorporating insights gained from the census and other sources, and published it virtually as a new work in 1803.

The census was published on December 21, 1801 as Abstract of the answers and returns made pursuant to an act, passed in the forty-first year of his majesty King George III. Intituled An act for taking an account of the population of Great Britain, and the increase or diminution thereof. A second volume was published on June 9, 1802.

View Map + Bookmark Entry

Invention of the Pie Chart 1801

Scottish engineer and political economist William Playfair published in London The Statistical Breviary; Shewing, on a Principle Entirely New, the Resources of Every State and Kingdom in Europe; Illustrated with Stained Copper-Plate Charts, Representing the Physical Powers of Each Distinct Nation with Ease and Perspicuity. To which is added, a Similar Exhibition of the Ruling Powers of Hindoostan.

In this work Playfair invented the pie chart. It has been suggested that Playfair, often short of funds, may have colored the charts himself, the process he characterized as "staining" in the title.

Playfair, The Commercial and Political Atlas and Statistical Breviary, edited and introduced by Howard Wainer and Ian Spence (2005). This edition reproduces in color the third edition of the atlas (1801) and the first edition of the breviary (1801).

View Map + Bookmark Entry

First Publication of the Method of Least Squares 1805

Adrien-Marie Legendre published Nouvelles méthodes pour la détermination des orbites des comètes. His appendix to this work, “Sur la Méthode des moindres quarrés,” represented the first publication of the method of least squares, the earliest form of regression analysis.
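As a minimal sketch of what the method of least squares does in its simplest modern form, the following fits a straight line to a handful of points by minimizing the sum of squared residuals. The observations are made up for illustration, the function names are hypothetical, and the closed-form formulas are the standard textbook ones rather than Legendre's own presentation.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the line minimizing squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Total squared vertical distance between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Made-up observations, roughly following y = 2x + 1 with small errors.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"sum of squared residuals = {sum_squared_residuals(xs, ys, slope, intercept):.4f}")
```

Any other line through the same points, for example one with a slightly different slope, gives a larger sum of squared residuals; that minimizing property is what defines the fit.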

View Map + Bookmark Entry

Foundation of the Birth Control Movement 1822

English tailor, economist and political radical Francis Place published in London Illustrations and Proofs of the Principle of Population: Including an Examination of the Proposed Remedies of Mr. Malthus, and a Reply to the Objections of Mr. Godwin and Others. 

Place's book was the foundation work of the birth-control movement. 

“Though many preceded Francis Place in discussing the technique of contraception, he seems to have been the first to venture, at first alone and unaided, upon an organized attempt to educate the masses. Place holds, therefore, the same position in social education on contraception that Malthus holds in the history of general population theory . . . it was Place who first gave birth control a body of social theory” (Himes, Medical History of Contraception [1936], 212-13).

Place, the son of an alcoholic London bailiff, overcame enormous economic hardship to become a successful master tailor. In his free time he taught himself mathematics, the law, history and economics; he also became involved in British radical politics, associating with such influential figures as Joseph Hume, Thomas Wakley, Sir Francis Burdett, Jeremy Bentham and John Stuart Mill. David Ricardo had sent Place a copy of Malthus's work, and in September 1821 Place sent Ricardo the manuscript of his book for comments; Ricardo replied in a lengthy letter to Place dated September 9, 1821.

Place’s Illustrations and Proofs arose from the long-standing controversy between Thomas Malthus and the utopian socialist William Godwin over the nature of human society. Godwin held that there was no limit on human perfectibility, and that society, if freed from the evils of government and other man-made institutions, would advance to an ideal state, free of poverty and governed entirely by reason. Malthus countered Godwin’s utopian claims with his famous Essay on the Principle of Population (1798 and subsequent editions), in which he argued that humanity’s improvement was necessarily limited by the constant struggle between a population’s natural tendency to increase (which was not susceptible to control by reason) and the restraints on population growth, such as famine and disease, imposed by scarce resources. In the second edition of the Essay (1803) Malthus proposed that poverty and other miseries caused by these opposing pressures on populations could be mitigated by voluntary growth-limiting measures such as “moral restraint”; i.e. delayed marriage and sexual continence prior to marriage. Malthus explicitly condemned artificial methods of contraception, however, claiming they were unnatural and would lead to immorality.

Although a supporter of Malthus’s views on population, Place emphatically disagreed with Malthus’s condemnation of birth control. His own life experience had given him first-hand knowledge of both grinding poverty and licentious behavior, and he knew how hopeless a task it was to persuade England’s poor to refrain from sex until they were economically prepared to support a family. His own early marriage, at the age of 19, had rescued him from a life of debauchery; however, “experience . . . emphatically warned him that early marriage meant many children” (quoted in Himes, Introduction, p. 10), a situation that kept poor families in poverty and led to such social evils as prostitution and child labor. “Thus it was that Place came to be dominated by the compelling persuasion, an opinion that amounted to an idée fixe, that Malthus’s remedy was impracticable, that it was as utopian in its own way . . . as Godwin’s notions of perfectibility. And thus it was that Place, feeling that he had a distinctive contribution to make to the discussion of population problems . . . came out unequivocally [in Illustrations and Proofs] for contraception as the best ‘means of preventing the numbers of mankind from increasing faster than food is provided’” (Himes, Introduction, p. 11). “It was a daring innovation in the history of economic thought . . . when, in 1822, Place published his Illustrations and Proofs of the Principle of Population, the first treatise on population in English to propose contraceptive measures as a substitute for Malthus’s ‘moral restraint’” (Himes, Medical History of Contraception, p. 213).

Place’s Illustrations sold poorly, which prompted him to use more direct methods of communicating his message. In 1823 he began distributing handbills advocating contraception, addressed to “The Married of Both Sexes,” “The Married of Both Sexes in Genteel Life,” and “The Married of Both Sexes of the Working People.” These “received considerable circulation not only in London, but in the industrial districts of the North; while the discussions which ensued caused them to be reprinted in several radical journals of the period . . . the handbills were in advance of modern medical opinion in maintaining that economic indications held a coordinate place with medical indications for contraception” (Himes, Medical History of Contraception, 213, 218).

Himes, “Editor’s introduction,” in Place, Illustrations and Proofs of the Principle of Population, ed. Himes (1930; repr. 1967), 7-63; Medical History of Contraception (1936), 212-20. J. Norman (ed) Morton's Medical Bibliography no. 1696.1.

View Map + Bookmark Entry

The First Opinion Poll 1824

According to Wikipedia, the first known example of an opinion poll is a local straw vote conducted by The Harrisburg Pennsylvanian in 1824.

The straw vote showed Andrew Jackson leading John Quincy Adams by 335 votes to 169 in the contest for the Presidency of the United States.

View Map + Bookmark Entry

The "Average Man" 1835

Belgian astronomer, mathematician, statistician and sociologist Lambert Adolphe Jacques Quetelet published in Paris Sur l'homme et le développement des facultés, ou essai de physique sociale. In this statistical study of the development of human physical and intellectual qualities Quetelet introduced the concept of the "average man." 

"Quetelet's use of the average man was founded upon the belief that if there is no change in any underlying causal relationship-- if there is a `persistence of causes'— then there will be a tendency for the average of large aggregates of even unhomogeneous data to be stable. . . . Quetelet italicized this as a fundamental principle: `The greater the number of individuals observed, the more do individual peculiarities, whether physical or moral, become effaced, and allow the general facts to predominate, by which society exists and is preserved' " (Stigler,  171-172). 

View Map + Bookmark Entry

Mathematical Model of a Continuously Growing Population 1838

Belgian mathematician Pierre François Verhulst published from Brussels "Notice sur la loi que la population suit dans son accroissement" in Correspondance mathématique et physique X, 113–121.

In this paper Verhulst constructed the simplest mathematical model of a continuously growing population with an upper limit to its size. "The concept of r/K selection theory derives its name from the competing dynamics of exponential growth and environmental limitation introduced here" (Wikipedia article on Pierre François Verhulst, accessed 01-13-2009).
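A model of this kind is usually written today as the logistic equation, dN/dt = rN(1 - N/K), where N is population size, r is the growth rate, and K is the upper limit. The following is a minimal sketch of its closed-form solution in that modern notation; the parameter values are assumptions picked for illustration and the function name is hypothetical, not Verhulst's own formulation or data.

```python
import math


def logistic_population(t, n0, r, k):
    """Closed-form solution of dN/dt = r*N*(1 - N/K) with N(0) = n0."""
    return k / (1 + ((k - n0) / n0) * math.exp(-r * t))


# Assumed illustrative parameters: initial size 10, growth rate 0.5 per
# unit of time, upper limit (carrying capacity) 1000.
n0, r, k = 10, 0.5, 1000

for t in (0, 5, 10, 15, 20, 30):
    print(f"t = {t:>2}: N = {logistic_population(t, n0, r, k):8.1f}")
```

Early on the curve grows almost exponentially, because N/K is small; as N approaches K the growth term shrinks toward zero and the population levels off at the limit.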

View Map + Bookmark Entry

The First of the Industrial Insurance Companies that Processed Immense Amounts of Data May 30, 1848

The Prudential Mutual Assurance, Investment and Loan Association was founded in Hatton Garden, London on May 30, 1848. The Prudential was the first of the great industrial life insurance companies that handled the insurance policies of millions of people, and processed an immense amount of data, initially by hand.

View Map + Bookmark Entry

1850 – 1875

Florence Nightingale's Rose Diagram 1858 – January 1859

In 1858 nurse, statistician, and reformer Florence Nightingale published Notes on Matters Affecting the Health, Efficiency, and Hospital Administration of the British Army. Founded Chiefly on the Experience of the Late War. Presented by Request to the Secretary of State for War. This privately printed work contained a color statistical graphic entitled "Diagram of the Causes of Mortality in the Army of the East" which showed that epidemic disease, which was responsible for more British deaths in the course of the Crimean War than battlefield wounds, could be controlled by a variety of factors including nutrition, ventilation, and shelter. The graphic, which Nightingale used as a way to explain complex statistics simply, clearly, and persuasively, has become known as Nightingale's "Rose Diagram." 

In January 1859 Nightingale more officially published and distributed A Contribution to the Sanitary History of the British Army During the Late War with Russia. This also contained a copy of the Rose Diagram.

View Map + Bookmark Entry

Having Refused to Support Babbage, the British Government Pays for a Difference Engine Produced in Sweden 1859

Long after refusing to fund the completion of Babbage’s Difference Engine No. 1 or the construction of his Analytical Engine, the British government paid for the construction of the Scheutzes' third difference engine.

Medical statistician William Farr first used the Engine in 1859 to print a table for his paper, published in Philosophical Transactions, “On the Construction of Life-Tables, Illustrated by a New Life-Table of the Healthy Districts of England.”

View Map + Bookmark Entry

The First Compilation of Baseball Statistics 1860

In 1860 Anglo-American sports journalist Henry Chadwick issued Beadle's Dime Base-Ball Player: a Compendium of the Game Comprising Elementary Instructions of this American Game of Ball; together with the Revised Rules and Regulations for 1860. This handbook, written by Chadwick and published in New York by Irwin P. Beadle, was the first baseball guide published for sale to the general public. In this work Chadwick  

"listed totals of games played, outs, runs, home runs, and strikeouts for hitters on prominent clubs, the first database of its kind. His goal was to provide numerical evidence to prove what players helped or hurt a team to win. . . . He is credited with devising the baseball box score (which he adapted from the cricket scorecard) for reporting game events. The first box score was a grid with nine rows for players and nine columns for innings. The original box scores also created the often puzzling abbreviation for strikeout as 'K' - 'K' being the last letter of 'struck' in 'struck out.' The basic format and structure of the box score has changed little since the earliest of ones designed by Chadwick. He is also credited with devising such statistical measures as batting average and earned run average. Ironically, ERA originated not in the goal of measuring a pitcher's worth but to differentiate between runs caused by batting skill (hits) and lack of fielding skill (errors). He is also noted as believing fielding range to be a superior skill to avoiding errors" (Wikipedia article on Henry Chadwick, accessed 10-06-2012).

View Map + Bookmark Entry

The First Instance of a Printing Calculator Used Extensively to do Original Work 1864

In 1864 English statistician and epidemiologist William Farr published English life table. Tables of lifetimes, annuities, and premiums. . . . Published by authority of the Registrar-General of births, deaths and marriages in England. The colophon leaf of this book indicated that 500 copies were printed. Farr's English Life Table contained what was, for its time, a tremendous amount of data: 6.5 million deaths sorted by age. Included in English Life Table no. 3 were the first lengthy working tables produced by the Scheutz printing calculator, the first instance of such a machine being used extensively to do original work. However, none of the hoped-for benefits of mechanizing the calculation of the tables were realized, since the Scheutz machine failed to include any of Babbage's security mechanisms to guard against mechanical error, and it required constant maintenance.

The machine did accomplish some of the typesetting, which it stamped into stereotype plates; however, the process was so problematic that automation yielded little cost saving. Of the 600 pages of printed tables in the book, only 28 pages were composed entirely by the machine; a further 216 pages were partially composed by the machine, and the rest were typeset by hand. Nor were there the hoped-for savings from using the machine to prepare stereotype plates. Her Majesty's Stationery Office, printer of the volume, stated that having the machine set the entire book automatically would have saved only 10 percent over the cost of conventional typesetting (Swade, The Cogwheel Brain [2000] 203-8).

Pages cxxxix-cxliv contained Farr's appendix entitled "Scheutz's calculating machine and its use in the construction of the English life table no. 3," in which he emphasized the usefulness of the new machine, but also the delicacy and skill necessary for its operation:

The Machine required incessant attention. The differences had to be inserted at the proper terms of the various series, checking was required, and when the mechanism got out of order it had to be set right. Of the first watch nothing is known, but the first steam-engine was indisputably imperfect; and here we had to do with the second Calculating Machine as it came from the designs of its constructors and from the workshop of the engineer. The idea had been as beautifully embodied in metal by Mr. Bryan Donkin as it had been conceived by the genius of its inventors; but it was untried. So its work had to be watched with anxiety, and its arithmetical music had to be elicited by frequent tuning and skilful handling, in the quiet most congenial to such productions.

This volume is the result; and thus—if I may use the expression—the soul of the Machine is exhibited in a series of Tables which are submitted to the criticism of the consummate judges of this kind of work in England and in the world (p. cxl)

Farr also noted Babbage's contribution to the venture—it was Babbage who "explained the principles [of the Scheutz calculator] and first demonstrated the practicability of performing certain calculations, and printing the results by machinery" (p. xiii).

Having invested so much time and money in the project while realizing only token gains, the British government showed little patience with the Scheutz calculating machine. The General Register Office soon reverted to manual calculations by human computers employing logarithms, which they used until the GRO's conversion to mechanical calculation methods in 1911.  Hook & Norman, Origins of Cyberspace (2002) No. 85.

View Map + Bookmark Entry

Possibly the Best Statistical Graphic Ever Drawn November 20, 1869

On November 20, 1869 French civil engineer Charles Joseph Minard published in Paris Carte figurative des pertes successives en hommes de l'Armée Française dans la campagne de Russie 1812-1813. This was a flow map of Napoleon's disastrous Russian campaign of 1812, in which he marched his army of 500,000 men from the Neman River to Moscow.

"The graph displays several variables in a single two-dimensional image:

"• the army's location and direction, showing where units split off and rejoined

"• the declining size of the army (note e.g. the crossing of the Berezina river on the retreat)

"• the low temperatures during the retreat.

"Étienne-Jules Marey first called notice to this dramatic depiction of the terrible fate of Napoleon's army in the Russian campaign, saying it "defies the pen of the historian in its brutal eloquence". Edward Tufte says it 'may well be the best statistical graphic ever drawn' and uses it as a prime example in The Visual Display of Quantitative Information" (Wikipedia article on Charles Joseph Minard, accessed 01-16-2011).

The chart is a lithograph 62 x 30 cm.  

♦ An essay on Minard's historical sources for the chart, and a different reproduction of the chart, was available at http://www.edwardtufte.com/tufte/minard in January 2011.

♦ The bibliographer of Minard's statistical graphics, Michael Friendly, posted several very interesting graphic variations on Minard's chart as Re-Visions of Minard. "I use 're-vision' in the sense of both 'to revise' and 'to see again', possibly from a new perspective." This was also available in January 2011.

View Map + Bookmark Entry

Mathematical Study of Anthropological Data 1871

Belgian astronomer, mathematician, statistician and sociologist Lambert Adolphe Jacques Quetelet published in Brussels Anthropométrie ou mesure des différentes facultés de l'homme. In Anthropométrie and in Physique sociale ou essai sur le développement des facultés de l'homme (1869), Quetelet established the basis for mathematical study of anthropological data. "Quetelet showed that if a series of anthropological measurements of either physical or intellectual qualities were plotted on squared paper, allowing x to be the measurements and y to be their frequency, they formed a curve like that representing the expansion of the binomial, or like that formed by plotting the errors of a great number of observers [i.e., the Gaussian curve]" (Penniman, 105). By applying the mathematics of the Gaussian curve to anthropological data, it became possible to plot the average or "standard" deviation from the statistical average, and thus to interpret anthropological data with greater exactness.

View Map + Bookmark Entry

The First National Thematic Atlas 1874

In 1874 American economist, statistician, journalist, educator, academic administrator, and military officer Francis Amasa Walker published in Washington, D.C. at the Government Printing Office Statistical Atlas of the United States Based on the Results of the Ninth Census 1870 with Contributions from Many Eminent Men of Science and Several Departments of the Government.

This oversized compendium of maps, graphs, statistical tables, and essays by scientists, economists, and federal officials was the first comprehensive thematic atlas produced by any nation.  It was hailed both at home and abroad for its innovative use of graphic elements to distill and display complex data. When he conceived and supervised production and publication of this work Walker was Chief of the U. S. Bureau of Statistics and superintendent of the 1870 census. The 60 large maps, most of which were printed in color, were chromolithographed in New York by Julius Bien, who produced the plates for the first American full-size reissue of portions of Audubon's Birds of America (1858-60).

Kinnahan, "Charting Progress: Francis Amasa Walker's Statistical Atlas of the United States and Narratives of Western Expansion," American Quarterly 60 (2008) 399-423.

View Map + Bookmark Entry

1875 – 1900

300 Clerks Reviewing 2,500,000 Insurance Policies with 24 Calculators 1877

It took three hundred clerks working at The Prudential headquartered in London six months to review its 2,500,000 insurance policies with the assistance of twenty-four Thomas de Colmar arithmometers.

View Map + Bookmark Entry

A Physician-Librarian Suggests the Idea for Electric Punched Card Tabulating 1882

At the U.S. Census Bureau physician John Shaw Billings, founder and librarian of the Surgeon General's Library (now the National Library of Medicine), suggested to Herman Hollerith that there ought to be a machine for speeding up the process of tabulating population and similar statistics.

Hollerith credited Billings for inspiring him to develop electric punched card tabulating for the census of 1890.

View Map + Bookmark Entry

Electromechanical Punched Card Tabulating 1889

In 1889 American statistician Herman Hollerith of Georgetown, Washington, D. C. was awarded three patents (U.S. Patent 395,781, U.S. Patent 395,782, U.S. Patent 395,783) for an electromechanical machine for tabulating information stored on punched cards.

"These patents described both paper tape and rectangular cards as possible recording media. The card shown in U.S. Patent 395,781 of June 8 was preprinted with a template and had holes arranged close to the edges so they could be reached by a railroad conductor's ticket punch, with the center reserved for written descriptions. Hollerith was originally inspired by railroad tickets that let the conductor encode a rough description of the passenger:  

"I was traveling in the West and I had a ticket with what I think was called a punch photograph...the conductor...punched out a description of the individual, as light hair, dark eyes, large nose, etc. So you see, I only made a punch photograph of each person." 

"Use of the ticket punch proved tiring and error prone, so Hollerith invented a pantograph 'keyboard punch' that allowed the entire card area to be used. It also eliminated the need for a printed template on each card, instead a master template was used at the punch; a printed reading board could be placed under a card that was to be read manually. Hollerith envisioned a number of card sizes. In an article he wrote describing his proposed system for tabulating the 1890 U.S. Census, Hollerith suggested a card 3 inches by 5½ inches of Manila stock "would be sufficient to answer all ordinary purposes."  

"The cards used in the 1890 census had round holes, 12 rows and 24 columns. A reading board for these cards can be seen at the Columbia University Computing History site. At some point, 31⁄4 by 73⁄8 inches (82.550 by 187.325 mm) became the standard card size, a bit larger than the United States one-dollar bill of the time (the dollar was changed to its current size in 1929). The Columbia site says Hollerith took advantage of available boxes designed to transport paper currency. Hollerith's original system used an ad-hoc coding system for each application, with groups of holes assigned specific meanings, e.g. sex or marital status. Later designs standardized the coding, with twelve rows, where the lower ten rows coded digits 0 through 9. This allowed groups of holes to represent numbers that could be added, instead of simply counting units " Wikipedia article on Punched Cards, accessed 12-21-2011).

Hollerith's electric punched card tabulator was used in the 1890 United States census — the first major data-processing project to use electrical machinery. It reduced data-processing time by 80 percent over manual methods. 

View Map + Bookmark Entry

Finger Prints as a Means of Identification 1892

Victorian polymath Francis Galton (geographer, meteorologist, explorer, statistician, psychometrician, and proto-geneticist) published in London his book Finger Prints, in which he presented a detailed statistical model of fingerprint analysis and identification and encouraged the use of fingerprints in forensic science.

View Map + Bookmark Entry

1900 – 1910

The Automatic Punched Card Feed 1900

To improve data processing of the 1900 census, American statistician and inventor Herman Hollerith added an automatic card feed to his electric punched card tabulating machine. 

View Map + Bookmark Entry

1910 – 1920

Hollerith Sells the Tabulating Machine Company to Flint 1911

In 1911 Herman Hollerith sold his Tabulating Machine Company to Charles R. Flint.

View Map + Bookmark Entry

The First National Opinion Poll? 1916

The Literary Digest, an influential general-interest weekly magazine published by Funk & Wagnalls, conducted a national survey of voter preference, mailing out millions of postcards and counting the returns, partly as a circulation-raising exercise. Using these results the Digest correctly predicted the election of Woodrow Wilson as president of the United States. This may be the first national opinion poll.

View Map + Bookmark Entry

1930 – 1940

The First Publications on Statistical Quality Control in Manufacturing April 1930 – 1939

In 1930 American physicist, engineer and statistician Walter Andrew Shewhart of Bell Labs published "Economic Quality Control of Manufactured Product," Bell System Technical Journal IX, No. 2 (April, 1930) 364-89. This paper, and its expansion in book form entitled Economic Control of Quality of Manufactured Product that Shewhart issued in 1931, represent the first publications on statistical quality control in manufacturing.

"Shewhart framed the problem in terms of assignable-cause and chance-cause variation and introduced the control chart as a tool for distinguishing between the two. Shewhart stressed that bringing a production process into a state of statistical control, where there is only chance-cause variation, and keeping it in control, is necessary to predict future output and to manage a process economically. Dr. Shewhart created the basis for the control chart and the concept of a state of statistical control by carefully designed experiments. While Dr. Shewhart drew from pure mathematical statistical theories, he understood data from physical processes never produce a 'normal distribution curve' (a Gaussian distribution, also commonly referred to as a 'bell curve'). He discovered that observed variation in manufacturing data did not always behave the same way as data in nature (Brownian motion of particles). Dr. Shewhart concluded that while every process displays variation, some processes display controlled variation that is natural to the process, while others display uncontrolled variation that is not present in the process causal system at all times" (Wikipedia article on Walter A. Shewhart, accessed 01-08-2013). 

In 1939 Shewhart issued Statistical Method from the Viewpoint of Quality Control . . . With the editorial assistance of W. Edwards Deming. Strangely the book was published in Washington, D.C. by The Graduate School of the Department of Agriculture.  Shewhart and Deming's book was the first work to extend the principles of statistical quality control in industry to the wider realms of science and statistical inference. Shewhart “extended the applications of statistical process control to the measurement processes of science, and stressed the importance of operational definitions of basic quantities in science, industry and commerce . . . [Statistical Method] has profoundly influenced statistical methods of research in the behavioral, biological, and physical sciences, and in engineering” (Dictionary of Scientific Biography).

Shewhart’s long and fruitful collaboration with the physicist, statistician and consultant W. Edwards Deming began in 1938. It involved work on productivity during World War II and Deming’s championship of Shewhart’s ideas in Japan from 1950 onwards, which was “the catalyst that gave birth to Japan’s industrial efficiency and emphasis on highest attainable quality of manufactured products” (Dictionary of Scientific Biography). Only after Japan successfully adopted Deming's ideas, and set higher standards for manufacturing, did competition motivate American manufacturers to aggressively implement statistical quality control in the United States.

Bradford's Law January 26, 1934

In a paper entitled "Sources of Information on Specific Subjects" (Engineering 137 [1934], 85-6), Samuel C. Bradford, British mathematician, librarian and documentalist at the Science Museum in London, published Bradford's Law, also known as "Bradford's law of scattering" and as the "Bradford distribution," showing the "exponentially diminishing returns of extending a library search."

"In many disciplines this pattern [described by Bradford's Law] is called a Pareto distribution. As a practical example, suppose that a researcher has five core scientific journals for his or her subject. Suppose that in a month there are 12 articles of interest in those journals. Suppose further that in order to find another dozen articles of interest, the researcher would have to go to an additional 10 journals. Then that researcher's Bradford multiplier bm is 2 (i.e. 10/5). For each new dozen articles, that researcher will need to look in bm times as many journals. After looking in 5, 10, 20, 40, etc. journals, most researchers quickly realize that there is little point in looking further.

"Different researchers have different numbers of core journals, and different Bradford multipliers. But the pattern holds quite well across many subjects, and may well be a general pattern for human interactions in social systems. Like Zipf's law, to which it is related, we do not have a good explanation for why it works. But knowing that it does is very useful for librarians. What it means is that for each specialty it is sufficient to identify the 'core publications' for that field and only stock those. Very rarely will researchers need to go outside that set" (Wikipedia article on Bradford's Law, accessed 02-21-2012).
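
The arithmetic of the quoted example can be reproduced in a short sketch. It assumes, as in the quotation, 5 core journals, a Bradford multiplier of 2, and 12 articles of interest per zone. The function names are hypothetical.

```python
def bradford_zones(core_journals, multiplier, zones):
    """Number of journals in each successive zone: core * multiplier**k."""
    return [core_journals * multiplier ** k for k in range(zones)]


def yield_per_journal(articles_per_zone, zone_sizes):
    """Articles of interest found per journal searched, zone by zone."""
    return [articles_per_zone / size for size in zone_sizes]


# Values taken from the quoted example: 5 core journals, multiplier 10/5 = 2.
zone_sizes = bradford_zones(core_journals=5, multiplier=2, zones=4)

# Each zone supplies the same dozen articles, spread over more journals.
yields = yield_per_journal(12, zone_sizes)

# Total journals searched after covering all four zones.
total_journals = sum(zone_sizes)
```

Here `zone_sizes` is 5, 10, 20 and 40, matching the sequence in the quotation. The yield per journal halves with each zone, which is the diminishing return Bradford described.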

The Social Security Program Creates a Giant Data-Processing Challenge 1935 – 1936

The Social Security Act of 1935 required the U.S. government to keep continuous records on the employment of 26 million individuals.

The first Social Security Numbers (SSNs) were issued by the Social Security Administration in November 1936 as part of the New Deal Social Security program.

"Within three months, 25 million numbers were issued.

"Before 1986, people often did not have a Social Security number until the age of about 14, since they were used for income tracking purposes, and those under that age seldom had substantial income. In 1986, American taxation law was altered so that individuals over 5 years old without Social Security numbers could not be successfully claimed as dependents on tax returns; by 1990 the threshold was lowered to 1 year old, and was later abolished altogether" (Wikipedia article on Social Security Number, accessed 01-17-2010).

1940 – 1950

Communication Theory as a Statistical Problem 1942

Having collaborated with engineer Julian Bigelow, mathematician Norbert Wiener published, as a classified document from MIT, The Extrapolation, Interpolation and Smoothing of Stationary Time Series.

According to Claude Shannon, this work contained “the first clear-cut formulation of communication theory as a statistical problem, the study of operations on time series.”

Contract for Production of the UNIVAC 1948

A contract was drawn up between Eckert-Mauchly Computer Corporation and the United States Census Bureau for the production of the UNIVAC.

Among the Earliest Extant Programs for a Stored-Program Computer March 15 – March 21, 1949

The United States Census Bureau wrote test programs for the BINAC. These manuscript programs, dated March 15 and March 21, were possibly among the earliest extant programs for a stored-program computer built in the United States.

1950 – 1960

The First Electronic Computer Commercially Manufactured in the United States March 31 – June 14, 1951

UNIVAC I, serial 1, was signed over to the United States Census Bureau on March 31, 1951.

The official dedication of the machine at the government offices occurred on June 14, 1951. Excluding the unique BINAC, the UNIVAC I was the first electronic computer to be commercially manufactured in the United States. Its development preceded the British Ferranti Mark 1; however, the British machine was actually delivered to its first customer one month earlier than the UNIVAC I.

Though the United States Census Bureau owned UNIVAC I, serial 1, the Eckert-Mauchly division of Remington Rand retained it in Philadelphia for sales demonstration purposes, and did not actually install it at government offices until twenty-one months later.

The First "Large Scale" Application of Humanities Computing in the U. S. 1959

The first "large scale" use of machine methods in humanities computing in the United States was Merle Curti's study of Trempealeau County, Wisconsin, The Making of an American Community: A Case Study of Democracy in a Frontier County (1959).

"Confronted with census material for the years 1850 through 1880–actually several censuses covering population, agriculture, and manufacturing–together with a population of over 17,000 persons by the latter date, Curti turned to punched cards and unit record equipment for the collection and analysis of his data. By this means a total of 38 separate items of information on each individual were recorded for subsequent manipulation. Quite obviously, the comprehensive nature of this study was due in part to the employment of data processing techniques" (Bowles [ed.] Computers in Humanistic Research (1967) 57-58).

1960 – 1970

"Computational Analysis of Present-Day American English" 1967

Henry Kucera (born Jindřich Kučera) of Brown University and Nelson Francis published Computational Analysis of Present-Day American English.

A founding work on corpus linguistics, this book "provided basic statistics on what is known today simply as the Brown Corpus. The Brown Corpus was a carefully compiled selection of current American English, totaling about a million words drawn from a wide variety of sources. Kucera and Francis subjected it to a variety of computational analyses, from which they compiled a rich and variegated opus, combining elements of linguistics, psychology, statistics, and sociology" (Wikipedia article on Brown Corpus, accessed 06-07-2010).

1980 – 1990

The Digital Domesday Project--Doomed to Early Digital Obsolescence 1984 – 1986

Acorn Computers Ltd, Philips, Logica and the BBC (with some funding from the European Commission's ESPRIT programme) marked the 900th anniversary of the original Domesday Book, an 11th century census of England, with the multimedia BBC Domesday Project. This publication is frequently cited as an example of digital obsolescence.

The Project "included a new 'survey' of the United Kingdom, in which people, mostly school children, wrote about geography, history or social issues in their local area or just about their daily lives. This was linked with maps, and many colour photos, statistical data, video and 'virtual walks'. Over 1 million people participated in the project. The project also incorporated professionally-prepared video footage, virtual reality tours of major landmarks and other prepared datasets such as the 1981 census.

"The project was stored on adapted laserdiscs in the LaserVision Read Only Memory (LV-ROM) format, which contained not only analog video and still pictures, but also digital data, with 300 MB of storage space on each side of the disc. The discs were mastered, produced, and tested by Philips at their Eindhoven headquarters factory. Viewing the discs required an Acorn BBC Master expanded with an SCSI controller and an additional coprocessor controlled a Philips VP415 "Domesday Player", a specially-produced laserdisc player. The user interface consisted of the BBC Master's keyboard and a trackball (known at the time as a trackerball). The software for the project was written in BCPL (a precursor to C), to make cross platform porting easier, although BCPL never attained the popularity that its early promise suggested it might.

"In 2002, there were great fears that the discs would become unreadable as computers capable of reading the format had become rare (and drives capable of accessing the discs even more rare). Aside from the difficulty of emulating the original code, a major issue was that the still images had been stored on the laserdisc as single-frame analogue video, which were overlaid by the computer system's graphical interface. The project had begun years before JPEG image compression and before truecolour computer video cards had become widely available.

"However, the BBC later announced that the CAMiLEON project (a partnership between the University of Leeds and University of Michigan) had developed a system capable of accessing the discs using emulation techniques. CAMiLEON copied the video footage from one of the extant Domesday laserdiscs. Another team, working for the UK National Archives (who hold the original Domesday Book) tracked down the original 1-inch videotape masters of the project. These were digitised and archived to Digital Betacam.

"A version of one of the discs was created that runs on a Windows PC. This version was reverse-engineered from an original Domesday Community disc and incorporates images from the videotape masters. It was initially available only via a terminal at the National Archives headquarters in Kew, Surrey but has been available since July 2004 on the web.

"The head of the Domesday Project, Mike Tibbets, has criticized the bodies to which the archive material was originally entrusted" (Wikipedia article on BBC Domesday Project, accessed 12-21-2008).

1990 – 2000

"Death by Government" Statistics 1900-1987 1994

In Death By Government (1994, revised 2005) political scientist Rudolph J. Rummel of the University of Hawaii estimated that "deaths at the hands of one's own government in the period 1900-87 amounted to 212 million persons, while deaths from warfare numbered 34 million. In other words, victims of their own government (what he calls democide) were in fact over six times greater than those killed in the century's wars. The largest number of fatalities was 78 million killed by the Chinese Communists, then 62 million by the Soviet Communists, 21 million by the Nazis, 10 million by the Chinese nationalists, and 6 million by the Japanese militarists. Even this listing is incomplete; as Rummel puts it, 'post-1987 democides by Iraq, Iran, Burundi, Serbia and Bosnian Serbs, Bosnia, Croatia, Sudan, Somalia, the Khmer Rouge guerrillas, Armenia, Azerbaijan, and others have not been included' " (http://www.danielpipes.org/blog/2012/01/anarchy-the-new-threat, accessed 01-31-2012).

The Bureau of Labor Statistics Begins Publishing on its Website January 1995

The U.S. Department of Labor Bureau of Labor Statistics, which began publication of statistics in print in 1886, began publishing statistics on its website.

2005 – 2010

The First Intelligible Word from an Extinct South American Civilization? August 12, 2005

Anthropologists Gary Urton and Carrie Brezine published "Khipu Accounting in Ancient Peru," Science 309 (2005) 1065-1067.

"Khipu [quipu] are knotted-string devices that were used for bureaucratic recording and communication in the Inka [Inca] Empire. We recently undertook a computer analysis of 21 khipu from the Inka administrative center of Puruchuco, on the central coast of Peru. Results indicate that this khipu archive exemplifies the way in which census and tribute data were synthesized, manipulated, and transferred between different accounting levels in the Inka administrative system" (Science).

"Researchers in the US believe they have come closer to solving a centuries-old mystery - by deciphering knotted string used by the ancient Incas.

"Experts say one bunch of knots appears to identify a city, marking the first intelligible word from the extinct South American civilisation.

"The coloured, knotted pieces of string, known as khipu, are believed to have been used for accounting information.

"The researchers say the finding could unlock the meaning of other khipu.

"Harvard University researchers Gary Urton and Carrie Brezine used computers to analyse 21 khipu.

"They found a three-knot pattern in some of the strings which they believe identifies the bunch as coming from the city of Puruchuco, the site of an Inca palace.

" 'We hypothesize that the arrangement of three figure-eight knots at the start of these khipu represented the place identifier, or toponym, Puruchuco,' they wrote in their report, published in the journal Science.

" 'We suggest that any khipu moving within the state administrative system bearing an initial arrangement of three figure-eight knots would have been immediately recognisable to Inca administrators as an account pertaining to the palace of Puruchuco' " (http://news.bbc.co.uk/2/hi/americas/4143968.stm, accessed 04-28-2009).

Using Currency Movements to Predict the Spread of Infectious Disease January 26, 2006

Dirk Brockmann, a theoretical physicist and computational epidemiologist at Northwestern University in Evanston, Illinois, L. Hufnagel, and T. Geisel published "The scaling laws of human travel," Nature 439 (2006) 462-65.

Using statistical data from the American currency tracking website, Where's George?, the paper described statistical laws of human travel in the United States, and developed a mathematical model of the spread of infectious disease.

[By January 31, 2009, Where's George? tracked over 149 million bills totaling more than $810 million. (Wikipedia).]

Statistical Analysis Correctly Forecasts the Election of Obama March 3, 2008

Statistical analyst and "sabermetrician" Nate Silver of Brooklyn, New York, founded fivethirtyeight.com.

Silver correctly predicted on March 7, 2008, roughly eight months before the election, that Barack Obama would be elected President of the United States.

China Becomes the Top User of the Internet January 14, 2009

"BEIJING, China (CNN) -- China surpassed the United States in 2008 as the world's top user of the Internet, according to a government-backed research group.

"The number of Web surfers in the country grew by nearly 42 percent to 298 million, according to the China Internet Network Information Center's January report. And there's plenty of room for growth, as only about 1 in every 4 Chinese has Internet access.  

"The rapid growth in China's Internet use can be tied to its swift economic gains and the government's push for the construction of telephone and broadband lines in the country's vast rural areas, the report says.  

"The Chinese government wants phone and broadband access in each village by 2010.

"Nearly 91 percent of China's Internet users are surfing the Web with a broadband connection -- an increase of 100 million from 2007. Mobile phone Internet users totaled 118 million by the end of 2008" (http://www.cnn.com/2009/TECH/01/14/china.internet/index.html, accessed 01-13-2010).

1.7 Billion Internet Users September 30, 2009

According to Internetworldstats.com there were about 1,733,993,000 Internet users on September 30, 2009. This compared with about 360,985,000 on December 31, 2000.

2010 – 2011

Culturomics Introduced by the Cultural Observatory December 16, 2010

A highly interdisciplinary group of scientists, primarily from Harvard University: Jean-Baptiste Michel, Yuan Kui Shen, Aviva P. Aiden, Adrian Veres, Matthew K. Gray, The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak and Erez Lieberman Aiden, published "Quantitative Analysis of Culture Using Millions of Digitized Books," Science 331, no. 6014 (14 January 2011) 176-182, published online December 16, 2010, DOI: 10.1126/science.1199644.

The authors were associated with the following organizations: Program for Evolutionary Dynamics; Institute for Quantitative Social Sciences; Department of Psychology; Department of Systems Biology; Computer Science and Artificial Intelligence Laboratory; Harvard Medical School; Harvard College; Google, Inc.; Houghton Mifflin Harcourt; Encyclopaedia Britannica, Inc.; Department of Organismic and Evolutionary Biology; Department of Mathematics; Broad Institute of Harvard and MIT, Cambridge; School of Engineering and Applied Sciences; Harvard Society of Fellows; Laboratory-at-Large.

This paper from the Cultural Observatory at Harvard and collaborators represented the first major publication resulting from The Google Labs N-gram (Ngram) Viewer,

"the first tool of its kind, capable of precisely and rapidly quantifying cultural trends based on massive quantities of data. It is a gateway to culturomics! The browser is designed to enable you to examine the frequency of words (banana) or phrases ('United States of America') in books over time. You'll be searching through over 5.2 million books: ~4% of all books ever published" (http://www.culturomics.org/Resources/A-users-guide-to-culturomics, accessed 12-19-2010).
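
The kind of query described here, the frequency of a word or phrase in books over time, can be sketched in a few lines. This is an illustration only, not the Ngram Viewer's implementation. It assumes that frequency means the count of a phrase divided by the total number of n-grams of the same length published in a given year. The function names and the toy corpus are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(corpus_by_year, phrase):
    """Map each year to the share of its n-grams that match the phrase.

    Assumption: tokens are lowercased and split on whitespace, which is
    far cruder than the processing applied to a real book corpus.
    """
    n = len(phrase.split())
    target = phrase.lower()
    result = {}
    for year, texts in sorted(corpus_by_year.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text.lower().split(), n))
        total = sum(counts.values())
        result[year] = counts[target] / total if total else 0.0
    return result


# Hypothetical two-year corpus standing in for millions of digitized books.
toy_corpus = {
    1900: ["the banana is ripe", "a ripe banana"],
    1950: ["the apple is ripe"],
}
freq = relative_frequency(toy_corpus, "banana")
```

Dividing by the yearly total matters because the number of books varies widely from year to year. Raw counts alone would mostly track the growth of publishing.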

"We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of "culturomics", focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. "Culturomics" extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities" (http://www.sciencemag.org/content/early/2010/12/15/science.1199644, accessed 12-19-2010).  

"The Cultural Observatory at Harvard is working to enable the quantitative study of human culture across societies and across centuries. We do this in three ways: creating massive datasets relevant to human culture; using these datasets to power wholly new types of analysis; developing tools that enable researchers and the general public to query the data" (http://www.culturomics.org/cultural-observatory-at-harvard, accessed 12-19-2010).

 

2011 – 2013

What Makes Spoken Lines in Movies Memorable April 30, 2012

Sentences that endure in the public mind may be viewed as evolutionary success stories, an analogy that compares “the fitness of language and the fitness of organisms.” On April 30, 2012 Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee of the Department of Computer Science at Cornell University published "You had me at hello: How phrasing affects memorability," arXiv:1203.6360v2 [cs.CL], 30 Apr 2012 (accessed 01-27-2013). Using the "memorable quotes" selected from the Internet Movie Database (IMDb), and the number of times that a particular movie line appeared on the Internet, they compared the memorable lines to the complete scripts of the movies in which they appeared, about 1,000 movies.

"To train their statistical algorithms on common sentence structure, word order and most widely used words, they fed their computers a huge archive of articles from news wires. The memorable lines consisted of surprising words embedded in sentences of ordinary structure. 'We can think of memorable quotes as consisting of unusual word choices built on a scaffolding of common part-of-speech patterns,' their study said.  

"Consider the line 'You had me at hello,' from the movie 'Jerry Maguire.' It is, Mr. Kleinberg notes, basically the same sequence of parts of speech as the quotidian 'I met him in Boston.' Or consider this line from 'Apocalypse Now': 'I love the smell of napalm in the morning.' Only one word separates that utterance from this: 'I love the smell of coffee in the morning.'

"This kind of analysis can be used for all kinds of communications, including advertising. Indeed, Mr. Kleinberg’s group also looked at ad slogans. Statistically, the ones most similar to memorable movie quotes included 'Quality never goes out of style,' for Levi’s jeans, and 'Come to Marlboro Country,' for Marlboro cigarettes.  

"But the algorithmic methods aren’t a foolproof guide to real-world success. One ad slogan that didn’t fit well within the statistical parameters for memorable lines was the Energizer batteries catchphrase, 'It keeps going and going and going.'

"Quantitative tools in the humanities and the social sciences, as in other fields, are most powerful when they are controlled by an intelligent human. Experts with deep knowledge of a subject are needed to ask the right questions and to recognize the shortcomings of statistical models.  

“ 'You’ll always need both,' says Mr. [Matthew] Jockers, the literary quant. 'But we’re at a moment now when there is much greater acceptance of these methods than in the past. There will come a time when this kind of analysis is just part of the tool kit in the humanities, as in every other discipline' " (http://www.nytimes.com/2013/01/27/technology/literary-history-seen-through-big-datas-lens.html?pagewanted=2&_r=0&nl=todaysheadlines&emc=edit_th_20130127, accessed 01-27-2013).
