3,200,000 Book Titles Are in Print in the U.S., 2000
In 2000 there were 3,200,000 new printed book titles listed for sale in the United States. The number of book titles in print in the world may have been about 8,000,000 at that time.
At some point in 2000 there were 72,398,092 Internet hosts and 9,950,491 websites.
By this time, Web-size estimates by Inktomi surpassed 1 billion indexable pages.
How much information?, a project done at the University of California at Berkeley by Peter Lyman and Hal R. Varian in 2000, attempted to measure the amount of information produced in the world each year.
"Heavy information overload: the world’s total yearly production of print, film, optical, and magnetic content would require roughly 1.5 billion gigabytes of storage. This is the equivalent of 250 megabytes per person for each man, woman, and child on earth.”
“Printed documents of all kinds comprise only 0.003% of the total. Magnetic storage is by far the largest medium for storing information and is the most rapidly growing, with shipped hard drive capacity doubling every year. Magnetic storage is rapidly becoming the universal medium for information storage.”
“Approximately 240 terabytes (compressed) of unique data are recorded on printed media worldwide each year.” The website provides a chart breaking down the printed media into categories.
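The per-capita figure quoted above can be verified with back-of-the-envelope arithmetic. A minimal check, assuming a world population of roughly 6 billion in 2000 (an assumption not stated in the quotation) and treating 1 gigabyte as 1,000 megabytes:

```python
# Check of the "250 megabytes per person" figure from the Berkeley study.
# Assumption (not in the source): world population in 2000 was about 6 billion.
total_production_gb = 1.5e9        # ~1.5 billion gigabytes produced per year
world_population = 6.0e9           # approximate population in 2000

mb_per_person = total_production_gb * 1000 / world_population
print(f"{mb_per_person:.0f} MB per person")  # → 250 MB per person
```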
In 2000 the National Digital Library Program sponsored by the Library of Congress digitized and made available online over 5,000,000 items.
In 2000 the Library of Congress initiated a prototype system called Minerva (Mapping the Internet Electronic Resources Virtual Archive) to collect and preserve open-access Web resources.
In 2000 Andrew Hoyem founded The Grabhorn Institute in San Francisco “for the purpose of preserving and continuing the use of one of the last integrated typefoundry, letterpress printing, and bookbinding facilities, and operating it as a living museum and educational and cultural center.”
In 2000 Lawrence Lessig of Stanford Law School published Code and Other Laws of Cyberspace, in which he argued:
"that cyberspace changes not only the technology of copying but also the power of law to protect against illegal copying (125-127). He explores the notion that computer code may regulate conduct in much the same way that legal codes do. He goes so far as to argue that code displaces the balance in copyright law and doctrines such as fair use (135). If it becomes possible to license every aspect of use (by means of trusted systems created by code), then no aspect of use would have the protection of fair use(136). The importance of this side of the story is generally underestimated and, as the examples will show, very often, code is even (only) considered as an extra tool to fight against 'unlimited copying'."
"Although the history of science and ideas is not my field, I could not imagine adopting Alfred North Whitehead's opinion that every science, in order to avoid stagnation, must forget its founders. To the contrary, it seems to me that the ignorance displayed by most scientists with regard to the history of their discipline, far from being a source of dynamism, acts as a brake on their creativity. To assign the history of science a role separate from that of research itself therefore seems to me mistaken. Science, like philosophy, needs to look back over its past from time to time, to inquire into its origins and to take a fresh look at models, ideas, and paths of investigation that had previously been explored but then for one reason or another were abandoned, great though the promise was. Many examples could be cited that confirm the usefulness of consulting history and, conversely, the wasted opportunities to which a neglect of history often leads. Thus we have witnessed in recent years, in the form of the theory of deterministic chaos, the rediscovery of Poincaré's dazzling intuitions and early results concerning nonlinear dynamics; the retum to macroscopic physics, and the study of fluid dynamics and disordered systems, when previously only the infinitely small and the infinitely large had seemed worthy of the attention of physicists; the revival of interest in embryology, ethology, and ecology, casting off the leaden cloak that molecular biology had placed over the study of living things; the renewed appreciation of Keynes's profound insights into the role of individual and collective expectations in market regulation, buried for almost fifty years by the tide of vidgar Keynesianism; and, last but not least, since it is one of the main themes of this book, the rediscovery by cognitive science of the cybernetic model devised by McCulloch and Pitts, known now by the name of 'neoconnectionism' or 'neural networks,' after several decades of domination by the 
cognitivist model' " (Dupuy, The Mechanization of the Mind: On the Origins of Cognitive Science, trans. M. B. DeBevoise , p. x.)
The inaugural issue of the journal
"defined Interactive Advertising as the 'paid and unpaid presentation and promotion of products, services and ideas by an identified sponsor through mediated means involving mutual action between consumers and producers.' This is most commonly performed through the Internet as a medium" (Wikipedia article on Interactive advertising, accessed 04-22-2009).
By about the year 2000 prepress became, for all printing processes except traditional letterpress, an entirely digital process. Prepress entails the processes and procedures that occur between the procurement of a manuscript and original artwork, and the manufacture of a printing plate, image carrier, or, in letterpress, forme, ready for mounting on a printing press.
When a photopolymer printing plate replaces the forme in letterpress, its prepress may also be considered a digital process.
In 2000 American inventor, scientist, engineer, entrepreneur, and author William Daniel "Danny" Hillis wrote a paper entitled Aristotle (The Knowledge Web). In 2007, at the time of founding Metaweb Technologies to develop aspects of ideas expressed in his Aristotle paper, Hillis wrote:
"In retrospect the key idea in the "Aristotle" essay was this: if humans could contribute their knowledge to a database that could be read by computers, then the computers could present that knowledge to humans in the time, place and format that would be most useful to them. The missing link to make the idea work was a universal database containing all human knowledge, represented in a form that could be accessed, filtered and interpreted by computers.
"One might reasonably ask: Why isn't that database the Wikipedia or even the World Wide Web? The answer is that these depositories of knowledge are designed to be read directly by humans, not interpreted by computers. They confound the presentation of information with the information itself. The crucial difference of the knowledge web is that the information is represented in the database, while the presentation is generated dynamically. Like Neal Stephenson's storybook, the information is filtered, selected and presented according to the specific needs of the viewer. ["In his book Diamond Age, the science fiction writer Neil Stephenson describes an automatic tutor called The Primer that grows up with a child. Stephenson's Primer does everything described above and more. It becomes a friend and playmate to the heroine of the novel, and guides not only her intellectual but also her emotional development" (from Hillis's Aristotle, 2000).
"John, Robert and I started a project, then a company, to build that computer-readable database. How successful we will be is yet to be determined, but we are really trying to build it: a universal database for representing any knowledge that anyone is willing to share. We call the company Metaweb, and the free database, Freebase.com. Of course it has none of the artificial intelligence described in the essay, but it is a database in which each topic is connected to other topics by links that describe their relationship. It is built so that computers can navigate and present it to humans. Still very primitive, a far cry from Neal Stephenson's magical storybook, it is a step, I hope, in the right direction" (http://edge.org/conversation/addendum-to-aristotle-the-knowledge-web, accessed 02-02-2014).
After his problematic experience earlier in 2000 with the electronic distribution of his e-book novella, Riding the Bullet, Stephen King decided to release serial installments of his epistolary novel, The Plant, directly from his website in Bangor, Maine, and unencrypted.
"People could pay a one-dollar fee for each installment using the honor system. He threatened, however, to drop the project if the percentage of paying readers fell below 75 percent. He viewed the release as an experiment in alternate forms of distribution, writing on his website at the time, 'My friends, we have the chance to become Big Publishing's worst nightmare.' More than 200,000 customers downloaded free copies of the story in a 24-hour promotion through the Barnes and Noble book-selling site.
"The book received more than the desired 75 percent for its first installment, but it fell to 70 percent after installment two. With the third installment, the numbers surged back to 75 percent. All told, after six installments, King revealed that he'd made nearly half a million dollars from the release of The Plant in what has been called his e-book experiment. King decided to double the cost of the fourth part of the novel to $2, while at the same time doubling the number of pages to 54. He also promised to cap the cost of the entire book at a total of $13. Paying readers dropped to 46 percent of downloads. The number of downloads decreased overall as well.
"The last installment was published on December 18, 2000. The book is yet to be completed. The original installments are now available for free on Stephen King's official website" (Wikipedia article on The Plant, accessed 10-19-2013).
In 2000 a research team from the Institute of Neuroinformatics ETHZ/UNI Zurich; Bell Laboratories, Murray Hill, NJ; and the Department of Brain and Cognitive Sciences & Department of Electrical Engineering and Computer Science at MIT created an electrical circuit of 16 "neurons" that could select and amplify input signals much like the cortex of the mammalian brain.
"Digital circuits such as the flip-flop use feedback to achieve multi-stability and nonlinearity tor estore signals to logical levels, for example 0 and 1. Analogue feedback circuits are generally designed to operate linearly, so that signals are over a range, and the response is unique. By contrast, the response of cortical circuits to sensory stimulation can be both multistable and graded. We propose that the neocortex combines digital selection of an active set of neurons with analogue response by dynamically varying the postive feedback inherent in its recurrent connections. Strong postive feedback causes differential instabilities that drive the selection of a set of active neurons under the constraints embedded in the synaptic weights. Once selected, the active neurons generate weaker, stable feedback that provides analogue amplication of the input. Here we present our model of cortical processing as an electronic circuit that emulates this hybrid operation, and so is able to perform computations that are similar to stimulus selection, gain modulation and spatiotemporal pattern generation in the neocortex" (Abstract)
R. Hahnloser, R. Sarpeshkar, M. Mahowald, R.J. Douglas and S. Seung: "Digital selection and analog amplification co-exist in an electronic circuit inspired by neocortex", Nature 405 (2000) 947-951.
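The hybrid operation the abstract describes, digital selection through strong positive feedback combined with a graded analogue response of the selected unit, can be sketched as a toy rate model. This is an illustrative winner-take-all simulation with made-up parameters, not the authors' silicon circuit:

```python
import numpy as np

def hybrid_selection(inputs, self_excitation=1.2, inhibition=0.3, steps=300):
    """Recurrent self-excitation amplifies each unit; shared inhibition makes
    the units compete. The strongest input wins the 'digital' selection and
    silences the rest, while the winner's final rate remains a graded
    ('analogue') function of its input, amplified by the feedback."""
    x = np.zeros_like(inputs, dtype=float)
    for _ in range(steps):
        drive = inputs + self_excitation * x - inhibition * x.sum()
        x = np.clip(drive, 0.0, 100.0)   # rectification plus a saturation bound
    return x

rates = hybrid_selection(np.array([1.0, 0.4, 0.2, 0.9]))
# Only the unit with the largest input stays active; its rate settles near
# input / (1 - self_excitation + inhibition), an analogue gain of 10 here.
```

With these parameters the winner's fixed point is 1.0 / (1 - 1.2 + 0.3) = 10.0, while the losing units are driven below threshold and rectified to zero, mirroring the paper's coexistence of digital selection and analogue amplification.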
An alternative to the WEIS coding system developed by Charles A. McClelland, Conflict and Mediation Event Observations (CAMEO) was developed by Philip A. Schrodt and colleagues at Pennsylvania State University beginning in 2000 as a framework for coding event data, especially to overcome difficulties in automating the WEIS coding process. It was typically used to study events that merit news coverage, and was generally applied to the study of political news and violence.
Schrodt, CAMEO. Conflict and Mediation Event Observations. Event and Actor Codebook (March 2012).
In January 2000 Will Glaser, Jon Kraft, and Tim Westergren founded Pandora Radio, an automated music recommendation service, and "custodian" of the Music Genome Project—a mathematical algorithm to organize music—in Oakland, California.
During the week of February 7, 2000 massive denial-of-service (DoS) attacks, or distributed denial-of-service (DDoS) attacks, were launched against major websites, including Yahoo!, Amazon and eBay. These attacks used computers at multiple locations to overwhelm the vendors’ computers and shut down their websites. They were the first documented massive DDoS attacks.
Using money from the dot-com company Bomis, on March 9, 2000 American entrepreneur Jimmy Wales founded the web encyclopedia Nupedia in San Diego, California, hiring philosopher Larry Sanger as editor-in-chief.
"Unlike Wikipedia, Nupedia was not a wiki; it was instead characterized by an extensive peer-review process, designed to make its articles of a quality comparable to that of professional encyclopedias. Nupedia wanted scholars to volunteer content for free. Before it ceased operating, Nupedia produced 24 articles that completed its review process (three articles also existed in two versions of different lengths), and 74 more articles were in progress.
"In June 2008, CNET hailed Nupedia as one of the greatest defunct websites in history" (Wikipedia article on Nupedia, accessed 05-23-2009).
After the NASDAQ Composite index peaked on March 10, 2000, the dot-com bubble began to burst.
In March 2000 the Oxford English Dictionary Online (OED Online) became available to subscribers.
On March 24, 2000 American author of contemporary horror, suspense, science fiction and fantasy Stephen King first published a novella entitled Riding the Bullet as an electronic book through Simon & Schuster, using technology by SoftLock. Available for $2.50, it was the first mass-market e-book. However, there were technical problems with downloading, and hackers eventually cracked the encryption.
"A movie adaptation of the story, starring Jonathan Jackson and David Arquette, was released in 2004" (Wikipedia article on Riding the Bullet, accessed 10-19-2013).
♦ In 2010 Stephen King was interviewed on CNN about the present state and future of books and ebooks.
On June 5, 2000 Steven Pendergast and Mindwise Media LLC, owned by Scott Pendergast, founded Fictionwise.com. The company became one of the largest distributors of ebooks in North America, and was acquired by Barnes & Noble in March 2009.
On June 26, 2000 "Celera Genomics [Rockville, Maryland] announced the first complete assembly of the human genome. Using whole genome shotgun sequencing, Celera began sequencing in September 1999 and finished in December. Assembly of the 3.12 billion base pairs of DNA, over the next six months, required some 500 million trillion sequence comparisons, and represented the most extensive computation ever undertaken in biology.
“The Human Genome Project reported it had finished a “working draft” of the genome, stating that the project had fully sequenced 85 percent of the genome. Five major institutions in the United States and Great Britain performed the bulk of sequencing, together with contributions from institutes in China, France, and Germany” (Genome News Network, Genetics and Genomics Timeline 2000, accessed 05-24-2009).
The ASCI White supercomputer at the Lawrence Livermore National Laboratory in California became operational on June 29, 2000. An IBM system, it covered a space the size of two basketball courts and weighed 106 tons. It contained six terabytes (6 TB) of memory— almost 50,000 times greater than the average personal computer at the time—and had more than 160 TB of Serial Disk System storage capacity—enough to hold six times the information stored in the 29 million books in the Library of Congress.
♦ In December 2013 I decided that the ASCI White would be the last supercomputer documented in From Cave Paintings to the Internet. The merits of supercomputers are mainly appreciated for their abilities to perform the most complex of calculations, and without the time and space and the ability to explain such calculations, descriptions of the ever-advancing magnitudes of supercomputers seemed beyond the scope of this project. Readers can follow the development of supercomputers through the Wikipedia article on supercomputer and through other websites, such as the TOP500 twice-annual ranking of the world's supercomputers. To review progress to 2000 and a bit afterward, I quote the section on Applications of Supercomputers from the Wikipedia article as it read in December 2013:
"Applications of supercomputers
"The stages of supercomputer application may be summarized in the following table:
Decade   Uses and computer involved
1970s    Weather forecasting, aerodynamic research (Cray-1)
1980s    Probabilistic analysis, radiation shielding modeling (CDC Cyber)
1990s    Brute force code breaking (EFF DES cracker)
2000s    3D nuclear test simulations as a substitute for testing under the Nuclear Non-Proliferation Treaty (ASCI Q)
2010s    Molecular dynamics simulation (Tianhe-1A)
"The IBM Blue Gene/P computer has been used to simulate a number of artificial neurons equivalent to approximately one percent of a human cerebral cortex, containing 1.6 billion neurons with approximately 9 trillion connections. The same research group also succeeded in using a supercomputer to simulate a number of artificial neurons equivalent to the entirety of a rat's brain.
"Modern-day weather forecasting also relies on supercomputers. The National Oceanic and Atmospheric Administration uses supercomputers to crunch hundreds of millions of observations to help make weather forecasts more accurate."
Reflective of rapid advancements in computational biology and genomics, in August 2000 IBM formed a Life Sciences Solutions division, incorporating its Computational Biology Center.
In September 2000 there were 20,000,000 websites on the Internet; the number had doubled since February 2000.
"Google Launches Self-Service Advertising Program
"Google's AdWords Program Offers Every Business a Fully Automated, Comprehensive and Quick Way to Start an Online Advertising Campaign /
"MOUNTAIN VIEW, Calif. - October 23, 2000 - Google Inc., developer of the award-winning Google search engine, today announced the immediate availability of AdWords(TM), a new program that enables any advertiser to purchase individualized and affordable keyword advertising that appears instantly on the google.com search results page. The AdWords program is an extension of Google's premium sponsorship program announced in August. The expanded service is available on Google's homepage or at the AdWords link at http://adwords.google.com, where users will find all the necessary design and reporting tools to get an online advertising campaign started" (http://www.google.com/press/pressrel/pressrelease39.html, accessed 06-09-2009).
The Senate and House versions of the Commodity Futures Modernization Act of 2000 were introduced and rushed through Congress on the last day before the Christmas holiday. The 11,000-page bill was never debated in the House or the Senate. Less than a week after it was passed by Congress, President Clinton signed it into Public Law 106-554 on December 21, 2000 (adapted from the Wikipedia article on Credit Default Swap).
On December 21, 2000 the U.S. Congress appropriated $99,800,000 for the planning and implementation of the National Digital Information Infrastructure and Preservation Program (NDIIPP), a collaborative initiative led by the Library of Congress.
In 2001 Lawrence Lessig, then a professor at Stanford Law School, published The Future of Ideas: The Fate of the Commons in a Connected World, in which he argued that while
". . . copyright helps artists get rewarded for their work, . . . a copyright regime that is too strict and grants copyright for too long a period of time (i.e. the current US legal climate) can destroy innovation, as the future movements by corporate interests to promote longer and tighter protection ofintellectual property in three layers: the code layer, the content layer, and the physical layer. . . . In the end, he stresses the importance of existing works entering the public domain in a reasonably short period of time, as the founding fathers intended."
In 2001 investigative journalist Edwin Black published IBM and the Holocaust: The Strategic Alliance Between Nazi Germany and America's Most Powerful Corporation. The book documented:
"how IBM's New York headquarters and CEO Thomas J. Watson acted through its overseas subsidiaries to provide the Third Reich with punch card machines that could help the Nazis to track down the European Jewry (especially in newly conquered territory). The book quotes extensively from numerous IBM and government memos and letters that describe how IBM in New York, IBM's Geneva office and Dehomag, its German subsidiary, were intimately involved in supporting Nazi oppression. The book also includes IBM's internal reports that admit that these machines made the Nazis much more efficient in their efforts. Several documentaries, including the 2003 film The Corporation Screened, C-SPAN broadcast and The Times, the Village Voice, the JTA and numerous other publications published close-ups of several documents demonstrating IBM's involvement in the Holocaust. These included IBM code sheets for concentration camps taken from the files of the National Archives. For example, IBM's Prisoner Code listed 8 for a Jew and Code 11 for a Gypsy. Camp Code 001 was Auschwitz, Code 002 was Buchenwald. Status Code 5 was executed by order, code 6 was gas chamber. One extensively quoted IBM report written by the company's European manager during WWII declared “in Germany a campaign started for, what has been termed … ‘organization of the second front.’ ” The memo added, “In military literature and in newspapers, the importance and necessity of having in all phases of life, behind the front, an organization which would remain intact and would function with ‘Blitzkrieg’ efficiency … was brought out. What we had been preaching in vain for years all at once began to be realized.”
"The book documents IBM's CEO Thomas J. Watson as being an active Nazi supporter. Watson made numerous statements in numerous venues that the international community ought to give Nazi Germany a break from the economic sanctions. As head of the International Chamber of Commerce, Watson engineered an annual meeting to be held in Berlin, where he was witnessed to publicly give a Nazi salute to Hitler in the infamous "Seig, Heil" fashion. Watson traveled to Germany numerous times after the Nazis took power in 1933, but it was on the Commerce trip that he received an honor medal from Hitler himself. Watson also dined privately with Hitler, and attended lavish dinners with many infamous Nazi officials at the same time that Jews were being officially robbed and driven from their homes.
"There was an IBM customer site, the Hollerith Abteilung, in almost every concentration camp, that either ran machines, sorted cards or prepared documents for IBM processing. The Auschwitz tattoo began as an IBM number.
"Although IBM actively worked with the Hitler regime from its inception in 1933 to its demise in 1945, IBM has asserted that since their German subsidiary came under temporary receivership by the Nazi authorities from 1941 to 1945, the main company was not responsible for its role in the latter years of the holocaust. Shortly after the war, the company worked aggressively to recover the profits made from the many Hollerith departments in the concentration camps, the printing of millions of punchcards used to keep track of the prisoners, the custom-built punchcard systems, and its servicing of the Extermination through labour program. The company also paid its employees special bonuses based on high sales volume to the Nazis and collaborator regimes. As in many corporate cases, when the US entered the war, the Third Reich left in place the original IBM managers who continued their contacts via Geneva, thus company activities continued without interruption" (Wikipedia article on IBM and the Holocaust, accessed 05-23-2009).
In 2001 American director, screenwriter and film producer Steven Spielberg directed, co-wrote and produced, through DreamWorks and Amblin Entertainment, the science fiction film A.I. Artificial Intelligence, telling the story of David, an android robot child programmed with the ability to love and to dream. The film explored the hopes and fears involved with efforts to simulate human thought processes, and the social consequences of creating robots that may be better than people at specialized tasks.
The film was a 1970s project of Stanley Kubrick, who eventually turned it over to Spielberg. The project languished in development hell for nearly three decades before technology advanced sufficiently for a successful production. The film required enormously complex puppetry, computer graphics, and make-up prosthetics, which are well-described and explained in the supplementary material in the two-disc special edition of the film issued on DVD in 2002.
The prehistory of Google Earth began in 2001 when a software development firm called Keyhole, Inc., was founded in Mountain View, California, which happened also to be Google's base of operations. Keyhole specialized in geospatial data visualization applications. The name "Keyhole" paid homage to the original KH reconnaissance satellites, also known as Corona satellites, which were operated by the U.S. between 1959 and 1972. Google acquired Keyhole in 2004, and Keyhole's Earth Viewer reached a wide public as Google Earth in 2005. Other aspects of Keyhole technology were incorporated into Google Maps.
In 2001 Norsam Technologies, Santa Fe, New Mexico, developed High Density Rosetta (HD-Rosetta) archival preservation technology, which "uses unique microscopic processes to provide analog and/or digital data, information or pictures on nickel plates." Density could be 20 times that of microfilm/microfiche.
196,000 pages of text could be etched with an electron microscope on a two square-inch plate.
"Benefits of the HD-ROSETTA Nickel Tablet System:
"Few environmental controls required
"Immune to technology obsolescence
"High temperature tolerance
"Immune to water damage
"Unaffected by electromagnetic radiation
"Highly durable over long periods of time."
Since 2001 U.S. Postal Service computers have been photographing the exterior of every piece of paper mail processed in the United States under the formerly secret mass surveillance program known as Mail Isolation Control and Tracking (MICT). Created in the aftermath of the 2001 anthrax attacks that killed five people, including two postal workers, MICT enables the Postal Service to track mail correspondence retroactively at the request of law enforcement, under the "Mail cover" program.
"The Federal Bureau of Investigations revealed MICT on June 7, 2013 when discussing the Bureau's investigation of ricin-laced letters sent to U.S. President Barack Obama and New York City mayor Michael Bloomberg. The FBI stated in a criminal complaint that the program was used to narrow its investigation to Shannon Richardson.Computer security and information privacy expert Bruce Schneier compared MICT to National Security Agency programs leaked in June 2013 by Edward Snowden and said,
" 'Basically they are doing the same thing as the other programs, collecting the information on the outside of your mail, the metadata, if you will, of names, addresses, return addresses and postmark locations, which gives the government a pretty good map of your contacts, even if they aren’t reading the contents.'
"James J. Wedick, a former FBI agent, said of MICT, 'It’s a treasure trove of information. Looking at just the outside of letters and other mail, I can see who you bank with, who you communicate with — all kinds of useful information that gives investigators leads that they can then follow up on with a subpoena.' He also said the program 'can be easily abused because it’s so easy to use and you don’t have to go through a judge to get the information. You just fill out a form.' (Wikipedia article on Mail isolation Control and Tracking, accessed 07-08-2013).
In 2001 the Oxford Internet Institute was founded at the University of Oxford. When I wrote this entry in November 2013 it remained the only major department in a top-ranked international university to offer multi-disciplinary social science degree programs focusing on the Internet, including a one-year MSc in Social Science of the Internet and a DPhil in Information, Communication and the Social Sciences.
In January 2001 the Digital Preservation Coalition was established in Heslington, York, United Kingdom "to foster joint action to address the urgent challenges of securing the preservation of digital resources in the UK and to work with others internationally to secure our global digital memory and knowledge base."
In January 2001 The Consultative Committee for Space Data Systems (CCSDS), Washington, D.C., issued Reference Model for an Open Archival Information System (OAIS).
"An OAIS is an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community. It meets a set of such responsibilities as defined in this document and this allows an OAIS archive to be distinguished from other uses of the term ‘archive’. The model provides a framework for the understanding and increased awareness of archival concepts needed for long-term digital information preservation and access, and for describing and comparing architectures and operations of existing and future archives. It also guides the identification and production of OAIS related standards." ISO Number : 1472
"In its first year, Wikipedia generated 20,000 articles, and had acquired 200 regular volunteers working to add more (this compares with the 55,000 articles in the Columbia [Encyclopedia], all subject to rigorous standards of editing and fact-checking, though this in itself was a small-scale enterprise compared to the behemoths of the industry like the Encyclopaedia Britannica, whose 1989 edition covered 400,000 different topics). By the end of 2002, the number of entries on Wikipedia had more than doubled. But it was only in 2003, once it became apparent that there was nothing to stop it continuing to double in size (which is what it did), that Wikipedia started to attract attention outside the small tech-community that had noticed its launch. In early 2004, there were 188,000 articles; by 2006, 895,000. In 2007 there were signs that the pace of growth might start to level off, and only in 2008 did it begin to look like the numbers might be stabilising. The English-language version of Wikipedia currently has more than 2,870,000 entries, a number that has increased by 500,000 over the last 12 months. However, the English-language version is only one of more than 250 different versions in other languages. German, French, Italian, Polish, Dutch and Japanese Wikipedia all have more than half a million entries each, with plenty of room to add. Xhosa Wikipedia currently has 110. Meanwhile, the Encyclopaedia Britannica had managed to increase the number of its entries from 400,000 in 1989 to 700,000 by 2007" (Runciman, "Like Boiling a Frog," Review of "The Wikipedia Revolution" by Andrew Lih, London Review of Books, 28 May 2009, accessed 05-23-2009).
"Seven months after the ceremony at the White House marking the completion of the human genome sequence, highlights from two draft sequences and analyses of the data were published in Science and Nature. Scientists at Celera Genomics and the publicly funded Human Genome Project independently found that humans have approximately 30,000 genes that carry within them the instructions for making the body's diverse collection of proteins.
"The findings cast new doubt on the old paradigm that one gene makes one protein. Rather, it appears that one gene can direct the synthesis of many proteins through mechanisms that include 'alternative splicing.' "It seems to be a matter of five or six proteins, on average, from one gene," said Victor A. McKusick of the Johns Hopkins University School of Medicine, who was a co-author of the Science paper.
"The finding that one gene makes many proteins suggests that biomedical research in the future will rely heavily on an integration of genomics and proteomics, the word coined to describe the study of proteins and their biological interactions. Proteins are markers of the early onset of disease, and are vital to prognosis and treatment; most drugs and other therapeutic agents target proteins. A detailed understanding of proteins and the genes from which they come is the next frontier.
"One of the questions raised by the sequencing of the human genome is this: Whose genome is it anyway? The answer turns out to be that it doesn't really matter. As scientists have long suspected, human beings are all very much alike when it comes to our genes. The paper in Science reported that the DNA of human beings is 99.9 percent alike—a powerful statement about the relatedness of all humankind" (Genome News Network, Genetics and Genomics Timeline 2001, accessed 05-24-2009)
Venter, J.C. et al. "The sequence of the human genome," Science 291, 1304-1351 (February 16, 2001).
Lander, E.S. et al. The Genome International Sequencing Consortium. "Initial sequencing and analysis of the human genome," Nature 409, 860-921 (February 15, 2001).
"An initial rough draft of the human genome was available in June 2000 and by February 2001 a working draft had been completed and published followed by the final sequencing mapping of the human genome on April 14, 2003. Although this was reported to be 99% of the human genome with 99.99% accuracy a major quality assessment of the human genome sequence was published in May 27, 2004 indicating over 92% of sampling exceeded 99.99% accuracy which is within the intended goal. Further analyses and papers on the HGP continue to occur. An initial rough draft of the human genome was available in June 2000 and by February 2001 a working draft had been completed and published followed by the final sequencing mapping of the human genome on April 14, 2003. Although this was reported to be 99% of the human genome with 99.99% accuracy a major quality assessment of the human genome sequence was published in May 27, 2004 indicating over 92% of sampling exceeded 99.99% accuracy which is within the intended goal. Further analyses and papers on the HGP continue to occur" (Wikipedia article on Human Genome Project, accessed 01-09-2013).
On February 21, 2001 Google acquired the Usenet archive of Deja.com (formerly Deja News Research Service) of Austin, Texas, dating back to 1995 and including 500,000,000 messages.
In its press release announcing the acquisition Google stated that it was performing 70,000,000 Internet searches per day.
In April 2001 American writer Nicholson Baker, of South Berwick, Maine, published Double Fold: Libraries and the Assault on Paper. Prior to the book an excerpt appeared in the July 24, 2000 issue of The New Yorker, under the title "Deadline: The Author's Desperate Bid to Save America's Past."
Baker's exhaustively researched polemic detailed his quest to expose the fate of thousands of books and newspapers that were replaced and often destroyed during the microfilming boom of the 1980s and '90s.
"The term 'double fold' refers to the test used by many librarians and preservation administrators to determine the brittleness and 'usability' of paper. The test consists of folding down the corner of a page of a book or newspaper, then folding it back in the opposite direction—one double fold. The action is then repeated until the paper breaks or is about to break. The more folds the page can withstand, the more durable it is. (In the late 1960s, preservation founding father William Barrow was fond of using a machine-run fold tester to back up his claims about the number of endangered books.) This experiment was used by library officials to identify their institution's brittle books, and, in some case, to justify withdrawing items from the shelves or replacing them with another format (most often microfilm). Baker's take on the double-fold test? '...utter horseshit and craziness. A leaf of a book is a semi-pliant mechanism. It was made for non-acute curves, not for origami.' (p. 157)"
"In 1999, Baker took matters into his own hands and founded the American Newspaper Repository in order to save some of the collections being auctioned off by the British Library. A year later he became the owner of thousands of volumes of old newspapers, including various runs of the New York Times, the Chicago Tribune, the New York Herald Tribune, and the New York World. In May 2004 the entire collection was moved to Duke University, where it is stored on climate-controlled shelves and looked after by the Rare Books and Special Collections division. As part of the gift agreement between the American Newspaper Repository and Duke, the collection will kept together in perpetuity, and no disbinding or experimental deacidification will be allowed.
"Baker makes four recommendations in Double Fold's epilogue: that libraries should be required to publish lists of discarded holdings on their websites, that the Library of Congress should fund a building that will serve as a storage repository for publications and documents not housed on-site, that some U.S. libraries should be designated with saving newspapers in bound form, and that both the U.S. Newspaper and the Brittle Books Programs should be abolished, unless they can promise that all conservation procedures will be non-destructive and that originals will be saved" (Wikipedia article on Double Fold, accessed 07-28-2009).
At the meeting of the San Francisco chapter of the Women's National Book Association on May 3, 2001 David Spiselman predicted that ebooks would be a 3.1 billion dollar business by 2004. He also predicted that by 2004 "screen quality will be superior to paper."
On July 11, 2001 Final Fantasy: The Spirits Within, a computer-animated (CGI) science fiction film by Japanese game designer Hironobu Sakaguchi, creator of the Final Fantasy series of role-playing games, was released in the United States by Columbia Pictures. The film, produced by Square Pictures of Honolulu, Hawaii, was the first attempt at a photorealistically rendered 3D feature film.
"Square Pictures rendered the film using some of the most advanced processing capabilities available for film animating at the time. A render farm consisting of 960 workstations was tasked with rendering each of the film's 141,964 frames. It took a staff of 200 and some four years to complete the film. Square intended to make the character of Aki Ross into the world's first photorealistic computer-animated actress, with plans for appearances in multiple films in different roles.
"The Spirits Within debuted to mixed critical reception, but was widely praised for the realism of the computer-animated characters. Due to rising costs, the film greatly exceeded its original budget towards the end of production, reaching a final cost of US$137 million, of which it made back only $85 million at the box office. The film has been called a box office bomb, and is blamed for the demise of Square Pictures" (Wikipedia article on Final Fantasy: The Spirits Within, accessed 03-23-2012).
"Roger Ebert was a strong advocate of the film; he gave the film 3 1/2 stars out of 4, praising it as a "technical milestone" while conceding that its 'nuts and bolts' story lacked 'the intelligence and daring of, say, Steven Spielberg's A.I.'. He also expressed a desire for the film to succeed in hopes of seeing more films made in its image, though he was skeptical of its ability to be accepted" (Wikipedia article on Final Fantasy: The Spirits Within, accessed 05-05-2009).
In August 2001 Michael K. Bergman, founder of BrightPlanet in Sioux Falls, South Dakota, published "The Deep Web: Surfacing Hidden Value," Journal of Electronic Publishing VII (2001) no. 1. With this paper Bergman was credited with coining the expression "the deep web."
"Searching on the Internet today can be compared to dragging a net across the surface of the ocean. While a great deal may be caught in the net, there is still a wealth of information that is deep, and therefore, missed. The reason is simple: Most of the Web's information is buried far down on dynamically generated sites, and standard search engines never find it.
"Traditional search engines create their indices by spidering or crawling surface Web pages. To be discovered, the page must be static and linked to other pages. Traditional search engines can not "see" or retrieve content in the deep Web — those pages do not exist until they are created dynamically as the result of a specific search. Because traditional search engine crawlers can not probe beneath the surface, the deep Web has heretofore been hidden.
"The deep Web is qualitatively different from the surface Web. Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request. But a direct query is a "one at a time" laborious way to search. BrightPlanet's search technology automates the process of making dozens of direct queries simultaneously using multiple-thread technology and thus is the only search technology, so far, that is capable of identifying, retrieving, qualifying, classifying, and organizing both "deep" and "surface" content.
If the most coveted commodity of the Information Age is indeed information, then the value of deep Web content is immeasurable. With this in mind, BrightPlanet has quantified the size and relevancy of the deep Web in a study based on data collected between March 13 and 30, 2000.
Our key findings include:
♦ Public information on the deep Web is currently 400 to 550 times larger than the commonly defined World Wide Web.
♦ The deep Web contains 7,500 terabytes of information compared to nineteen terabytes of information in the surface Web.
♦ The deep Web contains nearly 550 billion individual documents compared to the one billion of the surface Web.
♦ More than 200,000 deep Web sites presently exist.
♦ Sixty of the largest deep-Web sites collectively contain about 750 terabytes of information — sufficient by themselves to exceed the size of the surface Web forty times.
♦ On average, deep Web sites receive fifty per cent greater monthly traffic than surface sites and are more highly linked to than surface sites; however, the typical (median) deep Web site is not well known to the Internet-searching public.
♦ The deep Web is the largest growing category of new information on the Internet.
♦ Deep Web sites tend to be narrower, with deeper content, than conventional surface sites.
♦ Total quality content of the deep Web is 1,000 to 2,000 times greater than that of the surface Web.
♦ Deep Web content is highly relevant to every information need, market, and domain.
♦ More than half of the deep Web content resides in topic-specific databases.
♦ A full ninety-five per cent of the deep Web is publicly accessible information — not subject to fees or subscriptions.
"To put these findings in perspective, a study at the NEC Research Institute , published in Nature estimated that the search engines with the largest number of Web pages indexed (such as Google or Northern Light) each index no more than sixteen per cent of the surface Web. Since they are missing the deep Web when they use such search engines, Internet searchers are therefore searching only 0.03% — or one in 3,000 — of the pages available to them today. Clearly, simultaneous searching of multiple surface and deep Web sources is necessary when comprehensive information retrieval is needed.
After launching its two broadcast satellites, "Rock" and "Roll," XM Radio of Washington, D.C. initiated the first U.S. digital satellite radio service in Dallas/Ft. Worth and San Diego on September 25, 2001. (The original launch date of September 12 was pushed back after the 9/11 attacks.) Within two months the service extended across the U.S.
"3G networks enable network operators to offer users a wider range of more advanced services while achieving greater network capacity through improved spectral efficiency. Services include wide-area wireless voice telephony, video calls, and broadband wireless data, all in a mobile environment. Additional features also include HSPA data transmission capabilities able to deliver speeds up to 14.4 Mbit/s on the downlink and 5.8 Mbit/s on the uplink" (Wikipedia article on 3G, accessed 04-11-2009).
On October 23, 2001 Apple launched the iPod line of portable media players.
On October 24, 2001 The Internet Archive first made its retrospective data available through the Wayback Machine. The name Wayback Machine is a droll reference to a plot device in the animated cartoon series, The Rocky and Bullwinkle Show, in which Mr. Peabody and Sherman routinely used a time machine called the "WABAC machine" (pronounced "Wayback") to witness, participate in, and, more often than not, alter famous events in history.
"In 1996 Brewster Kahle, with Bruce Gilliat, developed software to crawl and download all publicly accessible World Wide Web pages, the Gopher hierarchy, the Netnews bulletin board system, and downloadable software. The information collected by these "crawlers" does not include all the information available on the Internet, since much of the data is restricted by the publisher or stored in databases that are not accessible. These "crawlers" also respect the robots exclusion standard for websites whose owners opt for them not to appear in search results or be cached. To overcome inconsistencies in partially cached websites, Archive-It.org was developed in 2005 by the Internet Archive as a means of allowing institutions and content creators to voluntarily harvest and preserve collections of digital content, and create digital archives.
"Information was kept on digital tape for five years, with Kahle occasionally allowing researchers and scientists to tap into the clunky database.When the archive reached its five-year anniversary, it was unveiled and opened to the public in a ceremony at the University of California" (Wikipedia article on Wayback Machine, accessed 12-06-2013).
In November 2001 the Council on Library and Information Resources, Washington, D.C., issued Evidence in Hand: Report of the Task Force on the Artifact in Library Collections by Stephen G. Nichols and Abby Smith, exploring the tension between library preservation and maintenance of physical and digital artifacts, including books.
On November 15, 2001 Microsoft launched the Xbox game console, its first entry into the gaming console market.
"According to the book Smartbomb, by Heather Chaplin and Aaron Ruby, the remarkable success of the upstart Sony PlayStation worried Microsoft in late 1990s. The growing video game market seemed to threaten the PC market which Microsoft had dominated and relied upon for most of its revenues. Additionally, a venture into the gaming console market would diversify Microsoft's product line, which up to that time had been heavily concentrated on software."
An invention introduced around this time enabled printing a food-grade color photograph on the surface of a birthday cake, or other iced baked goods, using a dedicated inkjet printer and edible inks.
The online music subscription service Rhapsody was launched in Seattle, Washington in December 2001.
"Downloaded files come with restrictions on their use, enforced by Helix, Rhapsody's version of digital rights management enforced on AAC+ or WMA files. The service also sells individual MP3s without digital rights management restrictions" (Wikipedia article on Rhapsody, accessed 03-18-2012).
In 2002 Charles Babbage’s Difference Engine No. 2, designed between 1847 and 1849, but never previously built, was completed and fully operational at the Science Museum, London. Babbage's purpose in designing the machine was to produce mathematical tables more accurate than any available in his day. To this end he designed a machine that could not only compute the tables but could also print them out and prepare stereotype printing plates so that the tables could be printed without the insertion of errors by human typesetters.
Built from Babbage’s engineering drawings roughly 150 years after it was originally designed, the calculating section of the machine weighs 2.6 tons and consists of 4000 machined parts. The automatic printing and stereotyping apparatus weighs an equal amount, with about the same number of parts. The machine is operated by turning hand-cranks.
The calculating section of the machine was completed in November 1991. After the Science Museum successfully built the computing section Nathan Myhrvold funded the construction of the output section, which performs both printing and stereotyping of calculated results. He also commissioned the construction of a second complete Difference Engine No. 2 for himself, which has been on display at the Computer History Museum in Mountain View, California, since May 10, 2008.
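The engine mechanized the method of finite differences, which reduces the tabulation of any polynomial to repeated addition, an operation gear wheels can perform. A minimal sketch of the idea in Python (illustrative only; the polynomial x² + x + 41 is a conventional demonstration example, not taken from the source above):

```python
def difference_table(coeffs, n):
    """Tabulate a polynomial by repeated addition only (method of finite differences).

    coeffs lists polynomial coefficients, lowest degree first. After the
    registers are seeded with the initial value and its finite differences,
    each new table entry requires nothing but additions, which is what made
    the computation mechanically feasible.
    """
    def p(x):
        return sum(c * x**k for k, c in enumerate(coeffs))

    degree = len(coeffs) - 1
    # Seed: p(0), p(1), ..., p(degree), reduced to a column of differences.
    diffs = [p(x) for x in range(degree + 1)]
    registers = []
    for _ in range(degree + 1):
        registers.append(diffs[0])
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]

    table = []
    for _ in range(n):
        table.append(registers[0])
        for i in range(degree):          # each register absorbs the one below it
            registers[i] += registers[i + 1]
    return table

# x^2 + x + 41, a conventional demonstration polynomial
print(difference_table([41, 1, 1], 5))  # prints [41, 43, 47, 53, 61]
```

After seeding, the inner loop performs only additions per table row, regardless of the polynomial's degree, which is why the design needed no multiplication mechanism.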
In 2002 there were 147,344,723 Internet hosts and 36,689,008 websites (Cisco). The estimated number of Internet users worldwide was about 600,000,000.
In 2002 Diana Hook and the author/editor of this database, Jeremy Norman, issued as a limited edition an annotated, descriptive bibliography entitled Origins of Cyberspace: A Library on the History of Computing, Networking, and Telecommunications. This was the first annotated descriptive bibliography on the history of these subjects. The brief timeline on the history of those subjects published in Origins of Cyberspace was the basis on which historyofinformation.com was later constructed.
In spite of the immense loss of information over the centuries, in 2002 there were about 45,000 Egyptian papyri, including fragments, in six institutional libraries and museums in the United States (Athena Review 2, no. 2). The main U.S. holders of papyri were Duke University, University of California at Berkeley, University of Michigan, Columbia, Yale, and Princeton. It was estimated that about 500,000 unpublished papyri were preserved elsewhere. Other major institutional collections of papyri were at the University of Heidelberg, University of Oxford, University of Lecce, and the University of Copenhagen.
In 2002 Cathy N. Davidson, former Vice Provost for Interdisciplinary Studies and co-founder of the John Hope Franklin Humanities Institute at Duke University, and David Theo Goldberg, Director of the University of California's state-wide Humanities Research Institute (UCHRI) based at UC Irvine, founded HASTAC (Humanities, Arts, Science and Technology Advanced Collaboratory, pronounced "haystack"), a virtual organization of individuals and institutions inspired by the possibilities that new technologies offer for shaping how society learns, teaches, communicates, creates, and organizes at the local and global levels. In 2012 the organization had over 7000 members.
"It is set primarily in Washington, D.C. and Northern Virginia in the year 2054, where "Precrime", a specialized police department, apprehends criminals based on foreknowledge provided by three psychics called 'precogs'. The cast includes Tom Cruise as Precrime officer John Anderton, Colin Farrell as Department of Justice agent Danny Witwer, Samantha Morton as the senior precog Agatha, and Max von Sydow as Anderton's superior Lamar Burgess. The film has a distinctive look, featuring desaturated colors that make it almost resemble a black-and-white film, yet the blacks and shadows have a high contrast, resembling film noir."
"Some of the technologies depicted in the film were later developed in the real world – for example, multi-touch interfaces are similar to the glove-controlled interface used by Anderton. Conversely, while arguing against the lack of physical contact in touch screen phones, PC Magazine's Sascha Segan argued in February 2009, 'This is one of the reasons why we don't yet have the famous Minority Report information interface. In that movie, Tom Cruise donned special gloves to interact with an awesome PC interface where you literally grab windows and toss them around the screen. But that interface is impractical without the proper feedback—without actually being able to feel where the edges of the windows are' " (Wikipedia article on Minority Report [film] accessed 05-25-2009).
The two-disc special edition of the film issued on DVD in 2002 contained excellent supplementary material on the special digital effects.
In 2002 Paul Marino founded the Academy of Machinima Arts and Sciences in New York.
"So, what is Machinima?
"Machinima (muh-sheen-eh-mah) is filmmaking within a real-time, 3D virtual environment, often using 3D video-game technologies.
"In an expanded definition, it is the convergence of filmmaking, animation and game development. Machinima is real-world filmmaking techniques applied within an interactive virtual space where characters and events can be either controlled by humans, scripts or artificial intelligence. By combining the techniques of filmmaking, animation production and the technology of real-time 3D game engines, Machinima makes for a very cost- and time-efficient way to produce films, with a large amount of creative control"
In 2002 Chinese novelist Murong Xuecun attracted a mass online readership by posting his first novel, Chengdu, Please Forget Me Tonight, on the Internet, where it became a cult hit among young middle-class readers.
"Mr. Murong owes his commercial success to the fact that he has found ways to practice his art and build a fan base on the Internet, outside the more heavily policed print industry.
"He addresses political issues on both a blog and a microblog account that resembles Twitter, which has nearly 1.1 million followers. He posts his novels chapter by chapter or in sections online under different pseudonyms as he writes. This Dickens-style serialization generates buzz, and the writing evolves with reader feedback. Once the book is finished or nearly so, Mr. Murong signs with a publisher. The censored print editions make money, but the Internet versions are more complete" (http://www.nytimes.com/2011/11/07/world/asia/murong-xuecun-pushes-censorship-limits-in-china.html?hp, accessed 11-10-2011).
The uncensored version of Murong's novel was translated into English by Harvey Thomlinson, and published in 2010 as Leave Me Alone: A Novel of Chengdu.
"Thirty-six year old Murong - Chinese literary superstar and reclusive celebrity - twenty eight and working as a sales manager in the car industry when he started posting his first novel Chengdu Please Forget Me Tonight on the internet. In 2002 it became a cult hit amongst young middle class Chinese looking for writing that pushed the boundaries of what was acceptable literature. Chengdu Please Forget Me Tonight was eventually posted on almost all of China's online bulletin boards, and attracted around 5 million online readers. Thousands of web commentaries and impassioned debates about the book appeared, while 'formal' commentaries and critiques amounted to more than 50,000 words. The novel won Murong the New Periodical 'Person of the Year', Xinliang website's 'Most popular novel', and the China Literary Journal's 2003 literature prize" (Amazon.com, accessed 11-10-2011).
From the beginning of Google at the Stanford Digital Libraries Initiative in 1996 Larry Page and Sergey Brin wanted to build a searchable digital library of the world's books, but put the project on hold in order to launch Google's web search. In 2002 Larry Page returned to the project of digitizing the world's books and making them searchable. To do that he asked Marissa Mayer, Google employee no. 20 and Google's first female engineer, to help test the idea by turning pages to the beat of a metronome as he snapped digital photos. It took him 40 minutes to photograph all the pages of a 300-page book.
To research the project Page and a small team visited other book digitizing projects, including one at his alma mater, the University of Michigan. There the staff estimated that it would take 1000 years to scan all seven million books in the university library; Page estimated that he could do it in six years.
Page then hired a robotics company to build an automatic scanner that could handle books with fragile pages, and programmers at Google created a page-recognition program that could recognize the widest range of typefaces of various sizes in 430 different languages.
After discussions with several university libraries, Oxford University became the first institution to allow Google to scan its one million nineteenth-century books over a three-year period.
Brandt, The Google Guys (2011) Chapter 9, "The Ruthless Librarians."
According to Martin Hilbert and Priscilla López in their paper "The World's Technological Capacity to Store, Communicate, and Compute Information," Science 332 (April 1, 2011) 60-64, the year 2002 could be considered the beginning of the "digital age"— the first year worldwide digital storage capacity overtook total analog capacity.
In January 2002 the Defense Advanced Research Projects Agency (DARPA) of the United States established the Information Awareness Office (IAO) to bring together several DARPA projects focused on applying surveillance and information technology to track and monitor terrorists, and other asymmetric threats to U.S. national security, by achieving Total Information Awareness (TIA). Asymmetric threats include threats from belligerents such as terrorists whose relative military power or strategy and tactics differ significantly from those of the U.S. government or military.
The IAO created
"enormous computer databases to gather and store the personal information of everyone in the United States, including personal e-mails, social networks, credit card records, phone calls, medical records, and numerous other sources, without any requirement for a search warrant. This information was then analyzed to look for suspicious activities, connections between individuals, and "threats". Additionally, the program included funding for biometric surveillance technologies that could identify and track individuals using surveillance cameras, and other methods.
"Following public criticism that the development and deployment of this technology could potentially lead to a mass surveillance system, the IAO was defunded by Congress in 2003. However, several IAO projects continued to be funded and merely run under different names, as revealed by Edward Snowden during the course of the 2013 mass surveillance disclosures" (Wikipedia article on Information Awareness Office, accessed 11-26-2013).
In May 2002 RLG in Mountain View, California, and OCLC in Dublin, Ohio issued the report, Trusted Digital Repositories: Attributes and Responsibilities.
Krishna Bharat, a research scientist at Google, created Google News in 2002 in the aftermath of the September 11, 2001 attacks in order to keep himself abreast of new developments. Google News watches more than 4500 worldwide news sources, aggregating content from more than 25,000 publishers. For the English language it covers more than 4500 sites; for other languages, fewer sites are covered.
According to Wikipedia, different versions of the aggregator were available for more than 60 regions in 28 languages as of March 15, 2012, with development continuing. As of January 2013, service was offered in the following languages: Arabic, Cantonese, Chinese, Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Hungarian, Italian, Japanese, Korean, Malayalam, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, and Vietnamese.
"As a news aggregator site, Google uses its own software to determine which stories to show from the online news sources it watches. Human editorial input does come into the system, however, in choosing exactly which sources Google News will pick from. This is where some of the controversy over Google News originates, when some news sources are included when visitors feel they don't deserve it, and when other news sources are excluded when visitors feel they ought to be included. . . .
"The actual list of sources is not known outside of Google. The stated information from Google is that it watches more than 4,500 English-language news sites. In the absence of a list, many independent sites have come up with their own ways of determining Google's news sources . . . ." (Wikipedia article on Google News, accessed 10-24-2014).
The Bibliotheca Alexandrina, or Maktabat al-Iskandarīyah (English: Library of Alexandria; Arabic: مكتبة الإسكندرية), a major library and cultural center located near the site of the original Royal Library of Alexandria, was opened to the public on October 16, 2002.
"The dimensions of the project are vast: the library has shelf space for eight million books, with the main reading room covering 70,000 m² on eleven cascading levels. The complex also houses a conference center; specialized libraries for maps, multimedia, the blind and visually impaired, young people, and for children; four museums; four art galleries for temporary exhibitions; 15 permanent exhibitions; a planetarium; and a manuscript restoration laboratory. The library's architecture is equally striking. The main reading room stands beneath a 32-meter-high glass-panelled roof, tilted out toward the sea like a sundial, and measuring some 160 m in diameter. The walls are of gray Aswan granite, carved with characters from 120 different human scripts.
"The collections at the Bibliotheca Alexandrina were donated from all over the world. The Spanish donated documents that detailed their period of Moorish rule. The French also donated, giving the library documents dealing with the building of the Suez Canal.
"Bibliotheca Alexandrina maintains the only copy and external backup of the Internet Archive" (Wikipedia article on Bibliotheca Alexandrina, accessed 03-18-2012).
In November 2002 physicist and software developer Blaise Agüera y Arcas and Paul Needham, Librarian of the Scheide Library at Princeton University, working on original editions in the Scheide Library, used high resolution scans of individual characters printed by Gutenberg, and image processing algorithms, to locate and compare variants of the same characters printed by Gutenberg. As a result of this research it appears that the method of producing movable type attributed to Gutenberg developed in phases rather than as a complete system.
"The irregularities in Gutenberg's type, particularly in simple characters such as the hyphen, made it clear that the variations could not have come from either ink smear or from wear and damage on the pieces of metal on the types themselves. While some identical types are clearly used on other pages, other variations, subjected to detailed image analysis, made for only one conclusion: that they could not have been produced from the same matrix. Transmitted light pictures of the page also revealed substructures in the type that could not arise from punchcutting techniques. They [Agüera y Arcas and Needham] hypothesized that the method involved impressing simple shapes to create alphabets in "cuneiform" style in a mould like sand. Casting the type would destroy the mould, and the alphabet would need to be recreated to make additional type. This would explain the non-identical type, as well as the substructures observed in the printed type. Thus, they feel that "the decisive factor for the birth of typography", the use of reusable moulds for casting type, might have been a more progressive process than was previously thought. . . . " (Summary from the Wikipedia article on Johannes Gutenberg, accessed 02-08-2209).
Blaise Agüera y Arcas and Paul Needham, "Computational analytical bibliography," Proceedings of the Bibliopolis Conference 'The Future History of the Book,' The Hague: Koninklijke Bibliotheek (November 2002).
Agüera y Arcas, "Temporary Matrices and Elemental Punches in Gutenberg's DK type," in: Jensen (ed.), Incunabula and Their Readers: Printing, Selling, and Using Books in the Fifteenth Century (2003) 1-12.
In December 2002 Creative Commons, founded in 2001, released its first project, a set of copyright licenses free for public use:
"Taking inspiration in part from the Free Software Foundation’s GNU General Public License (GNU GPL), Creative Commons has developed a Web application that helps people dedicate their creative works to the public domain — or retain their copyright while licensing them as free for certain uses, on certain conditions. Unlike the GNU GPL, Creative Commons licenses are not designed for software, but rather for other kinds of creative works: websites, scholarship, music, film, photography, literature, courseware, etc. We hope to build upon and complement the work of others who have created public licenses for a variety of creative works. Our aim is not only to increase the sum of raw source material online, but also to make access to that material cheaper and easier. To this end, we have also developed metadata that can be used to associate creative works with their public domain or license status in a machine-readable way. We hope this will enable people to use our search application and other online applications to find, for example, photographs that are free to use provided that the original photographer is credited, or songs that may be copied, distributed, or sampled with no restrictions whatsoever. We hope that the ease of use fostered by machine-readable licenses will further reduce barriers to creativity."
On December 1, 2002 the ECHO initiative was announced in Berlin. Funded by the European Commission, it was founded by the Max Planck Institute for the History of Art in Rome, by the Max Planck Institute for Psycholinguistics in Nijmegen, and by the Max Planck Institute for the History of Science in Berlin, together with their international partners.
"The new European Commission-funded project ECHO (European Cultural Heritage Online) to create an IT-based infrastructure for the humanities is taking shape today with its kick-off-meeting held in Berlin. With a budget of approximately 1.6 million Euros, 16 partners from 9 European countries including candidate countries together with their subcontractors, the initiative aims at achieving four major goals, scientific, technological, cultural and political, until May 2004:
"By 1) improving the situation for the humanities concerning the new information technologies through
"2) the fostering of a new IT-based infrastructure, adequate to future information technologies,
"3) cultural heritage in Europe will be brought online and
"4) be made freely accessible without any commercial constraints.
"The project, coordinated by the Max Planck Institute for the History of Science in Berlin, is highly welcomed by the EU commission as a chance to strengthen the competitiveness of European research by promoting an urgently needed concept for good practice in scholarly research in the humanities. In order to exploit the innovative potential of the new information technologies, the project will contribute to overcome the present fragmentation of approaches to transfer cultural heritage to the Internet.
"At present Europe lags behind in developing a large-scale infrastructure for the humanities adequate to the Internet age and competitive with similar ventures in the US. As a Europe-wide effort, ECHO aims at developing high-quality research in line with the ambition of the European Research Area and competitive with US and Japanese ventures. Only by overcoming the limitations of national perspectives can the critical mass be brought together that ensures the self-organisation of culture in the new medium.
"If the new media comprises an adequate representation of human cultural diversity they can offer also new opportunities reflecting on possible links and similarities e.g. between European and non-European cultures. A culturally informed Web may thus even constitute a public think-tank in which cultural diversity drives rather than conflicts with communication.
"The ECHO project is constituted by its main partners as well as by subcontractors. Even now, however, the informal network of actors willing to contribute extends far beyond the group of applicants. Some 25 academic, governmental, and private institutions from 15 European and 3 non-European countries (China, Mexico, and the USA) have declared their adherence to the project; they will be contacted during its first phase.
"The single most important added European value offered by the project to the citizens of Europe is a contribution to the preservation of, and an improved and extended access to, their own European cultural heritage. Its enhanced availability on the Internet will also create new opportunities for shaping a polyvalent European identity, including a realisation of the non-European origins of essential presuppositions of European culture as well as an awareness of its historical pitfalls. Border-crossing technologies such as language tools adapted to cultural sources contribute to European integration by making these treasures accessible to all Europeans (e-Europe). ECHO will provide web-accessible multimedia content together with navigation facilities, hence making it attractive for researchers, teachers, students, journalists, and also for the general public.
"In addition, the ECHO project will be directly concerned with copyright laws and open source policies. It will provide an opportunity for reflecting on the ongoing developments from a practical point of view and may lead to the definition of new policies encouraging the transfer of cultural heritage to the existing and new media.
"The project is defined in three major steps.
"• An assessment of the present situation in relation to bringing European cultural heritage online. In view of the fragmentation of endeavours presently undertaken, it is necessary to assess the implementation of Information Technology for preserving, sharing, and studying this heritage in different disciplines and nations.
"• The exploration of a novel IT-based cooperative research infrastructure. The project will create, within its limited scope, a model implementation of a new cooperative research infrastructure, that aims at mobilising and bringing together all relevant actors (universities, museums, libraries, archives, (national) research councils, digital heritage organisations, and companies) in the broad field of the humanities and cultural heritage in Europe.
"• A paradigmatic proof of the new potentials for research offered by this infrastructure. By taking up four paradigmatic content areas in the humanities, from the history of art, the history of science, language studies, and social and cultural anthropology, respectively, the project aims at demonstrating the innovative potential for research offered by this infrastructure.
"The highly ambitious ECHO project aims at the creation of a progressively growing agora, defining the management structure, data formats, tools and workflows. This in turn is intended to serve as a model for a larger-scale network within the 6th Framework Program of the EU. The subsequent project, possibly labelled ECHO 2, shall bring a major contribution to the preservation of Europe's cultural heritage as well as improved and extended access to this heritage for both scholars and the general public alike. This transformation of the Internet into a semantic web allowing the exchange and processing of information in the language of human culture within an emerging Open Library will serve as a framework for cooperative work on the sources and for the presentation of its results. It will also show socio-economic effects such as becoming a central resource of technology for storing and distributing information for institutions who lack such means; or for creating a basis for virtual tourism into the digitised realm of our rich cultural heritage in Europe."
How much information 2003: The research project from the University of California at Berkeley, first published on the web in 2000, updated its findings in 2003. Strikingly, it estimated that each person in the U.S. generated about 800 MB of recorded information, more than three times the per-capita figure the same research project had calculated for 2000. The remaining data in this entry of the database is quoted from the 2003 website:
"How much new information is created each year? Newly created information is stored in four physical media -- print, film, magnetic and optical --and seen or heard in four information flows through electronic channels -- telephone, radio and TV, and the Internet. This study of information storage and flows analyzes the year 2002 in order to estimate the annual size of the stock of new information recorded in storage media, and heard or seen each year in information flows. Where reliable data was available we have compared the 2002 findings to those of our 2000 study (which used 1999 data) in order to describe a few trends in the growth rate of information.
In 2003 the U.S. Internal Revenue Service began programming and development of the Customer Account Data Engine (CADE), a replacement for its legacy system for processing tax returns:
"The original operational date was set at Nov 1st 2006. Programming and development began in 2003 but actual processing on the system was delayed until 2005. The system initially processed only 1040EZ tax returns, the simplest type of electronic tax returns. In 2006 the capacity was increased for the system to begin processing a limited number of more complex 1040 forms and other support forms. In 2007 the system began to process Schedule C forms and other more complex tax forms.
"Because the system is still unable to handle the full load of IRS tax returns, a hybrid approach is used by the IRS with the overwhelming majority of tax returns still being processed with the old system. Current processing loads and returns done by CADE are used for testing purposes to determine the system's functionality.
"The system, although beset by regular setbacks due to funding, is expected to be fully operational by 2012" (Wikipedia article on Customer Account Data Engine, accessed 12-27-2008).
Under the pen name "Yoshi," in 2003 a Tokyo man published the first cell phone novel, Deep Love, the story of a teenage prostitute in Tokyo. Deep Love
"became so popular that it was published as an actual book, with 2.6 million copies sold in Japan, then spun off into a television series, a manga, and a movie. The cell phone novel became a hit mainly through word of mouth and gradually started to gain traction in China and South Korea among young adults. In Japan, several sites offer large prizes to authors (up to $100,000 US) and purchase the publishing rights to the novel."
"Cell phone or mobile phone novels called keitai shousetsu in Japanese, are the first literary genre to emerge from the cellular age via text messaging. Phone novels started out primarily read and authored by young Japanese women, on the subject of romantic fiction such as relationships, lovers, rape, love triangles, and pregnancy. However, mobile phone novels are trickling their way to a worldwide popularity on all subjects. Japanese ethos of the Internet regarding mobile phone novels are dominated by false names and forged identities. Therefore, identities of the Japanese authors of mobile phone novels are rarely disclosed. 'Net transvestites' are of the most extreme play actors of the sort. Differing from regular novels, mobile phone novels may be structured according to the author's preference. If a couple is fighting in the story, the author may choose to have the lines closely spaced and crowded. On the contrary, if the author writes a calm or soothing poem the line spacing may be further apart than normal. Overall, the line spacing of phone novels contains enough blank space for an easy read. Phone novels are meant to be read in 1,000 to 2,000-word (in China) or 70-word (in Japan) chapters via text message on mobile phones. They are downloaded in short installments and run on handsets as Java-based applications on a mobile phone. Cell phone novels often appear in three different formats: WMLD, JAVA and TXT. Maho i-Land is the largest cell phone novel site that carries more than a million titles, mainly novice writers, all which are available for free. Maho iLand provides templates for blogs and homepages. It is visited 3.5 billion times each month. In 2007 98 cell phone novels were published into books. "Love Sky" is a popular phone novel with approximately 12 million views on-line, written by "Mika", that was not only published but turned into a movie. www.textnovel.com is another popular mobile phone novel site, however, in English."
"Five out of the ten best selling novels in Japan in 2007 were originally cell phone novels" (Wikipedia article on Cell phone novel, accessed 08-23-2009).
According to Bowker, as cited by Robert Darnton in Publishers Weekly, 859,000 new book titles were published worldwide in 2003. This represented a significant increase over the 700,000 titles published in 1998.
"I have been invited to so many conferences on “The Death of the Book” during the past decade that I think books must be very much alive. The death notices remind me of one of my favorite graffiti, inscribed in the men's room of the Firestone Library at Princeton University:
"God is dead.
Then, added in another hand:
"Nietzsche is dead.
"The book is not dead. In fact, the world is producing more books than ever before. According to Bowker, 700,000 new titles were published worldwide in 1998; 859,000 in 2003; and 976,000 in 2007. Despite the Great Recession of 2009 that has hit the publishing industry so hard, one million new books will soon be produced each year" (http://www.publishersweekly.com/pw/print/20090914/451-on-the-ropes-robert-darnton-s-case-for-books.html).
In 2003 computer scientist Noah Wardrip-Fruin of the University of California, Santa Cruz, and Nick Montfort, professor of digital media at MIT, issued The New Media Reader, with introductions by Janet H. Murray of Georgia Institute of Technology and Lev Manovich, then professor at the University of California at San Diego.
This anthology represented the first attempt to present in a single volume significant representative documents covering the wide range of digital media. As Janet Murray wrote in her introduction, "This is a landmark volume, marking the first comprehensive effort at establishing the genealogy of the computer as an expressive medium." The 823-page physical volume, designed by Michael Crumpton and published by MIT Press, was innovative in several ways: most notably through the use of special symbols in the text and the margins that directed the reader to cross-references throughout the book, a kind of physical hypertext. The anthology also included a CD-ROM containing programs, videos, interactive fiction, and games.
In January 2003 Regulations.gov, a Federal regulatory clearinghouse, was launched as the first milestone of the Federal "E-Government eRulemaking" Initiative.
"This U.S. Government Web site encourages public participation in the federal decision-making by allowing you to view and submit comments and documents concerning federal regulations, adjudication, and other actions. Regulations.gov provides one-stop, online access to every rule published and open for comment, from more than 160 different Federal agencies.
"Regulations.gov has created universal access to the Federal regulatory process by removing barriers that previously made it difficult for the public to navigate the expanse of Federal regulatory activities. Regulations.gov is the first one-stop Internet site for the public to submit comments on all Federal rulemakings. It is also the first site that allows the public to submit comments via the Internet to virtually all Federal Agencies.
"The new generation of Regulations.gov, the eRulemaking Initiative's Federal Docket Management System (FDMS), launched in the fall of 2005, enabled the public to access entire rulemaking dockets from participating Federal Departments and Agencies. FDMS is a full-featured electronic docket management system that builds upon the capabilities of the original Regulations.gov and gives Federal rule writers and docket managers the ability to better manage their rulemaking and non-rulemaking activities. With this system, Federal Departments and Agencies can post Federal Register documents, supporting materials, and public comments on the Internet. The public can search, view, and download these documents on FDMS' public side, Regulations.gov."
In 2003 David M. Blei, Andrew Y. Ng, and Michael I. Jordan published "Latent Dirichlet Allocation," Journal of Machine Learning Research 3 (2003) 993-1022, introducing a widely used probabilistic model for discovering the latent topics in large collections of documents.
In 2012 Blei published an informative illustrated survey of algorithms for managing large document archives: "Probabilistic Topic Models," Communications of the ACM Vol. 55, No. 4 (2012) 77-84, accessed 10-20-2013.
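The mechanics of a topic model can be sketched in a few dozen lines. The following is a toy collapsed Gibbs sampler for LDA, written for illustration only: the corpus, number of topics, hyperparameters, and iteration count are invented, and real applications use optimized library implementations rather than code like this.

```python
import random
from collections import defaultdict

random.seed(0)

# Toy corpus: each document is a list of word tokens.
docs = [
    "book library print paper book".split(),
    "disk storage magnetic disk data".split(),
    "library archive book preservation".split(),
    "data storage gigabyte magnetic".split(),
]

K = 2                     # number of latent topics (chosen arbitrarily)
alpha, beta = 0.1, 0.01   # Dirichlet hyperparameters
vocab = sorted({w for d in docs for w in d})
V = len(vocab)

# Randomly initialize a topic assignment z for every token,
# and build the count tables the sampler maintains.
z = [[random.randrange(K) for _ in d] for d in docs]
ndk = [[0] * K for _ in docs]                # document-topic counts
nkw = [defaultdict(int) for _ in range(K)]   # topic-word counts
nk = [0] * K                                 # topic totals
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1

def sample(weights):
    """Draw an index proportionally to a list of unnormalized weights."""
    r = random.random() * sum(weights)
    for k, w in enumerate(weights):
        r -= w
        if r <= 0:
            return k
    return len(weights) - 1

# Collapsed Gibbs sampling: repeatedly resample each token's topic
# given the assignments of all other tokens.
for _ in range(200):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
            weights = [(ndk[d][k2] + alpha) * (nkw[k2][w] + beta) / (nk[k2] + V * beta)
                       for k2 in range(K)]
            k = sample(weights)
            z[d][i] = k
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1

# Report the most frequent words in each inferred topic.
for k in range(K):
    top = sorted(nkw[k], key=nkw[k].get, reverse=True)[:3]
    print(f"topic {k}: {top}")
```

On a corpus this small the result is noisy, but the same update rule is what scales, in refined form, to the archives of millions of documents Blei's survey discusses.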
The Battle of Umm Qasr was the first military confrontation in the Iraq War. At the start of the war, one of the first objectives was the port of Umm Qasr. On March 21, 2003, as allied forces advanced across Southern Iraq, an amphibious landing force captured the new port area of Umm Qasr. The assault was spearheaded by Royal Marines of the British 3 Commando Brigade, augmented by Marines of the US 15th Marine Expeditionary Unit and Polish GROM troops. Iraqi forces in the old town of Umm Qasr put up unexpectedly strong resistance, requiring several days' fighting before the area was cleared of defenders. The port was finally declared safe and reopened on March 25, 2003.
CNN's coverage of this battle, produced by Senior Executive Producer David Bohrman, was probably the first live coverage of an actual battle on the ground seen on television.
"Those watching CNN in the wee hours Sunday morning found themselves witnessing something we haven't seen much of—an actual battle, closeup, in real time. Until then, this war, at least the televised portion of it, had remained a fairly abstract affair: a mix of spectacular explosions in Baghdad (some idiot reporter at a Pentagon press conference referred to Friday's bombardment as "last night's show") and massive tank columns dashing unimpeded through the desert. Occasionally, an embedded reporter would talk of brief skirmishes that had just ended, but we at home never saw the action.
"The whole picture changed when a British TV pool in the southern Iraqi port of Umm Qasr started broadcasting live footage of the real, gritty thing. Early Saturday, U.S. Marines had supposedly secured this port, which will soon be receiving shiploads of military reinforcements and humanitarian aid. Then they started coming under fire. Around 1 a.m. EST, we saw troops of the 15th Marine Expeditionary Unit lying on their stomachs, eyes glued to their rifle scopes, aiming their barrels at a concrete building where the snipers seemed to be perched. A couple of Abrams M1-A1 tanks fired shells on the building, then rolled slowly toward the target to survey the damage. The Iraqi soldiers—who turned out to be a patrol from the Republican Guard—fired back. Two more Abrams tanks moved forward. We could smell the tension, feel the adrenalin. CNN's Aaron Brown, watching along in his studio, warned viewers that this was an unpredictable scene, that dreadful things might happen without notice. The fighting continued for another four hours, until a British Harrier jet was called in to take out the Iraqis from the air" (Fred Kaplan, "Reality War. The Battle of Umm Quasr - Live and in Color," Slate.com, October 23, 2003).
Between April 6 and April 12, 2003 The National Museum of Iraq in Baghdad lost an estimated 15,000 artifacts, including priceless relics of Mesopotamian civilization. The relics were stolen by looters in the days after Baghdad fell to U.S. forces in the Iraq War. Of the objects looted, about 5,000 were still missing in 2003, 4,000 were returned and 6,000 were recovered, according to Lawrence Rothfield, author of Antiquities Under Siege: Cultural Heritage Protection After the Iraq War (2008).
The BookScan 1200 was the first automatic, page-turning scanner for the conversion of bound volumes to digital files. The manufacturers claimed that it could scan volumes at up to 1200 pages per hour. The motto of the company was "Moving knowledge from Books to Bytes."
On April 14, 2003 the Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) went into effect.
"The Health Insurance Portability and Accountability Act (HIPAA) was enacted by the U.S. Congress in 1996. According to the Centers for Medicare and Medicaid Services (CMS) website, Title I of HIPAA protects health insurance coverage for workers and their families when they change or lose their jobs. Title II of HIPAA, known as the Administrative Simplification (AS) provisions, requires the establishment of national standards for electronic health care transactions and national identifiers for providers, health insurance plans, and employers. It helps people keep their information private.
"The Administration Simplification provisions also address the security and privacy of health data. The standards are meant to improve the efficiency and effectiveness of the nation's health care system by encouraging the widespread use of electronic data interchange in the U.S. health care system."
"The HIPAA Privacy Rule regulates the use and disclosure of certain information held by 'covered entities' (generally, health care clearinghouses, employer sponsored health plans, health insurers, and medical service providers that engage in certain transactions.) It establishes regulations for the use and disclosure of Protected Health Information (PHI). PHI is any information held by a covered entity which concerns health status, provision of health care, or payment for health care that can be linked to an individual. This is interpreted rather broadly and includes any part of an individual's medical record or payment history.
"Covered entities must disclose PHI to the individual within 30 days upon request. They also must disclose PHI when required to do so by law, such as reporting suspected child abuse to state child welfare agencies.
"A covered entity may disclose PHI to facilitate treatment, payment, or health care operations, or if the covered entity has obtained authorization from the individual. However, when a covered entity discloses any PHI, it must make a reasonable effort to disclose only the minimum necessary information required to achieve its purpose.
"The Privacy Rule gives individuals the right to request that a covered entity correct any inaccurate PHI. It also requires covered entities to take reasonable steps to ensure the confidentiality of communications with individuals. . . .
"The Privacy Rule requires covered entities to notify individuals of uses of their PHI. Covered entities must also keep track of disclosures of PHI and document privacy policies and procedures. They must appoint a Privacy Official and a contact person responsible for receiving complaints and train all members of their workforce in procedures regarding PHI.
"An individual who believes that the Privacy Rule is not being upheld can file a complaint with the Department of Health and Human Services Office for Civil Rights (OCR). However, according to the Wall Street Journal, the OCR has a long backlog and ignores most complaints. 'Complaints of privacy violations have been piling up at the Department of Health and Human Services. Between April 2003 and Nov. 30, the agency fielded 23,896 complaints related to medical-privacy rules, but it has not yet taken any enforcement actions against hospitals, doctors, insurers or anyone else for rule violations. A spokesman for the agency says it has closed three-quarters of the complaints, typically because it found no violation or after it provided informal guidance to the parties involved' " (Wikipedia article on Health Insurance Portability and Accountability Act, accessed 08-05-2009).
In 2003 the group blog Grand Text Auto (GTxA) was launched, "about computer mediated and computer generated works of many forms: interactive fiction, net.art, electronic poetry, interactive drama, hypertext fiction, computer games of all sorts, shared virtual environments, and more."
In May 2009 GTxA became "an aggregator for a distributed group of blogs in which we participate. The authors of these blogs work as both theorists and developers, and are interested in authorship, design, and technology, as well as issues of interaction and reception."
In July 2003 the International Internet Preservation Consortium (IIPC), netpreserve.org, was founded.
"In July 2003 the national libraries of Australia, Canada, Denmark, Finland, France, Iceland, Italy, Norway, Sweden, The British Library (UK), The Library of Congress (USA) and the Internet Archive (USA) acknowledged the importance of international collaboration for preserving Internet content for future generations. This group of 12 institutions chartered the IIPC to fund and participate in projects and working groups to accomplish the Consortium’s goals. The initial agreement was in effect for three years, during which time the membership was limited to the charter institutions. Since then, membership has expanded to include additional libraries, archives, museums and cultural heritage institutions involved in Web archiving.
"The goals of the consortium are:
" * To enable the collection, preservation and long-term access of a rich body of Internet content from around the world.
" * To foster the development and use of common tools, techniques and standards for the creation of international archives.
" * To be a strong international advocate for initiatives and legislation that encourage the collection, preservation and access to Internet content.
" * To encourage and support libraries, archives, museums and cultural heritage institutions everywhere to address Internet content collecting and preservation."
In August 2003 the Swedish entrepreneur Niklas Zennström, the Dane Janus Friis, and the Estonians Ahti Heinla and Priit Kasesalu launched the peer-to-peer voice over Internet Protocol (VoIP) telephony service Skype. The name evolved from "Sky peer-to-peer," shortened to "Skyper"; however, some of the domain names associated with "Skyper" were already taken, so the final "r" was dropped, leaving "Skype," for which domain names were available. Skype was sold to eBay, based in San Jose, California, in September 2005. On May 10, 2011 Microsoft agreed to purchase Skype for a reported $8.5 billion. According to Wikipedia, Skype had 663 million registered users in September 2011.
On October 23, 2003 Amazon.com made it possible to “search inside” the full text of 120,000 books from more than 190 publishers. This allowed Amazon users to search not only the full texts of individual titles but all 120,000 collectively.
On October 23, 2003 journalist Gary Wolf published an article about the cultural history of digital libraries, and more specifically Amazon's "Search Inside," in Wired magazine, entitled "The Great Library of Amazonia," from which I quote a portion:
"The more specific the search, the more rewarding the experience. For instance, I've recently become interested in Boss Tweed, New York's most famous pillager of public money. Manber types "Boss Tweed" into his search engine. Out pop a few books with Boss Tweed in the title. But the more intriguing results come from deep within books I never would have thought to check: A Confederacy of Dunces, by John Kennedy Toole; American Psycho, by Bret Easton Ellis; Forever: A Novel, by Pete Hamill. I immediately recognize the power of the archive to make connections hitherto unseen. As the number of searchable books increases, it will become possible to trace the appearance of people and events in published literature and to follow the most digressive pathways of our collective intellectual life.
"From the Hamill reference, I link to a page in the afterword on which he cites books that influenced his portrait of Tweed. There, on the screen, is the cream of the research performed by a great metropolitan writer and editor. Some of the books Hamill recommends are out of print, but all are available either new or used on Amazon.
"With persistence, serendipity and plenty of time in a library, I may have found these titles myself. The Amazon archive is dizzying not because it unearths books that would necessarily have languished in obscurity, but because it renders their contents instantly visible in response to a search. It allows quick query revisions, backtracking, and exploration. It provides a new form of map.
"Getting to this point represents a significant technological feat. Most of the material in the archive comes from scanned pages of actual books. This may be surprising, given that most books today are written on PCs, e-mailed to publishers, typeset on computers, and printed on digital presses. But many publishers still do not have push-button access to the digital files of the books they put out. Insofar as the files exist, they are often scattered around the desktops of editors, designers, and contract printers. For books more than a few years old, complete digital files may be lost. John Wiley & Sons contributed 5,000 titles to the Amazon project -- all of them in physical form.
"Fortunately, mass scanning has grown increasingly feasible, with the cost dropping to as low as $1 each. Amazon sent some of the books to scanning centers in low-wage countries like India and the Philippines; others were run in the United States using specialty machines to ensure accurate color and to handle oversize volumes. Some books can be chopped out of their bindings and fed into scanners, others have to be babied by a human, who turns pages one by one. Remarkably, Amazon was already doing so much data processing in its regular business that the huge task of reading the images of the books and converting them into a plain-text database was handled by idle computers at one of the company's backup centers."
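The plain-text database Wolf describes is what makes searching 120,000 books collectively possible: once scanned pages are converted to text, an inverted index maps every word to the books that contain it. A minimal sketch follows, using titles Wolf mentions but with invented page snippets standing in for the scanned texts; it is an illustration of the general technique, not Amazon's actual system.

```python
from collections import defaultdict

# Invented snippets standing in for the scanned, OCR'd text of each book.
books = {
    "A Confederacy of Dunces": "boss tweed appears in a passing remark",
    "Forever: A Novel": "hamill portrays boss tweed in old new york",
    "American Psycho": "a satire of manhattan excess",
}

# Inverted index: map each word to the set of books whose text contains it.
index = defaultdict(set)
for title, text in books.items():
    for word in text.lower().split():
        index[word].add(title)

def search(query):
    """Return, sorted, the titles containing every word of the query."""
    words = query.lower().split()
    results = index[words[0]].copy() if words else set()
    for w in words[1:]:
        results &= index[w]   # keep only books containing all query words
    return sorted(results)

print(search("boss tweed"))  # prints ['A Confederacy of Dunces', 'Forever: A Novel']
```

This is exactly the effect Wolf describes: a query for "Boss Tweed" surfaces novels one would never have thought to check, because the index searches inside the books rather than only across their titles.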
In November 2003 Hiroshi Ishiguro (石黒浩 Ishiguro Hiroshi), director of the Intelligent Robotics Laboratory, part of the Department of Adaptive Machine Systems (知能・機能創成工学専攻) at Osaka University, Japan, developed the Actroid, a humanoid robot and android with a lifelike appearance and visible behavior such as facial movements.
"In robot development, Professor Ishiguro concentrates on the idea of making a robot that is as similar as possible to a live human being; at the unveiling in July 2005 of the "female" android named Repliee Q1Expo, he was quoted as saying 'I have developed many robots before, but I soon realised the importance of its appearance. A human-like appearance gives a robot a strong feeling of presence. ... Repliee Q1Expo can interact with people. It can respond to people touching it. It's very satisfying, although we obviously have a long way to go yet.' In his opinion, it may be possible to build an android that is indistinguishable from a human, at least during a brief encounter" (Wikipedia article on Hiroshi Ishiguro, accessed 03-05-2011).
In 2003 Umberto Eco delivered a lecture, "Vegetal and Mineral Memory: The Future of Books," at the newly reopened Bibliotheca Alexandrina in Alexandria, Egypt. I quote from the beginning of the lecture:
"WE HAVE THREE TYPES OF MEMORY. The first one is organic, which is the memory made of flesh and blood and the one administrated by our brain. The second is mineral, and in this sense mankind has known two kinds of mineral memory: millennia ago, this was the memory represented by clay tablets and obelisks, pretty well known in this country, on which people carved their texts.
"However, this second type is also the electronic memory of today's computers, based upon silicon. We have also known another kind of memory, the vegetal one, the one represented by the first papyruses, again well known in this country, and then on books, made of paper. Let me disregard the fact that at a certain moment the vellum of the first codices were of an organic origin, and the fact that the first paper was made with rags and not with wood. Let me speak for the sake of simplicity of vegetal memory in order to designate books.
"This place has been in the past and will be in the future devoted to the conservation of books; thus, it is and will be a temple of vegetal memory. Libraries, over the centuries, have been the most important way of keeping our collective wisdom. They were and still are a sort of universal brain where we can retrieve what we have forgotten and what we still do not know.
"If you will allow me to use such a metaphor, a library is the best possible imitation, by human beings, of a divine mind, where the whole universe is viewed and understood at the same time. A person able to store in his or her mind the information provided by a great library would emulate in some way the mind of God. In other words, we have invented libraries because we know that we do not have divine powers, but we try to do our best to imitate them. To build, or better to rebuild, today one of the greatest libraries of the world might sound like a challenge, or a provocation. It happens frequently that in newspaper articles or academic papers some authors, facing the new computer and internet era, speak of the possible "death of books". However, if books are to disappear, as did the obelisks or the clay tablets of ancient civilisations, this would not be a good reason to abolish libraries. On the contrary, they should survive as museums conserving the finds of the past, in the same way as we conserve the Rosetta Stone in a museum because we are no longer accustomed to carving our documents on mineral surfaces.
"Yet, my praise for libraries will be a little more optimistic. I belong to the people who still believe that printed books have a future and that all fears à propos of their disappearance are only the last example of other fears, or of milleniaristic terrors about the end of something, the world included...."
Michael Hawley reviews his book, Bhutan: a Visual Odyssey Across the Kingdom
In December 2003 Michael Hawley, a scientist at MIT, issued the world's largest book, Bhutan: A Visual Odyssey Across the Kingdom. The work, which was also one of the most beautiful books ever published, was undertaken as a philanthropic endeavor. It had 112 pages and weighed 133 pounds on an included custom-built aluminum stand. Its page openings measured 7 x 5 feet. The work was initially offered in exchange for a $10,000 contribution. However, in November 2008 Amazon.com was offering copies for sale for $30,000 each.
A more practical and affordable way to appreciate this spectacular volume would be the trade edition published in 2004, a copy of which I acquired. In February 2009 this was offered for sale by Amazon.com for $100.00. In my opinion this is one of the finest and most spectacular trade books designed, printed and bound in America, though my aging eyes are not entirely comfortable reading white text against a black background. The clothbound volume, with an unusual dust jacket printed on both sides, measures 15¼ x 12¼ inches (39 x 31 cm).
The World Summit on the Information Society (WSIS) convened its first meeting in Geneva, Switzerland from December 10-12, 2003.
On December 16, 2003 the CAN-SPAM Act of 2003 was signed into law by President George W. Bush, establishing the United States' first national standards for the sending of commercial e-mail and requiring the Federal Trade Commission (FTC) to enforce its provisions.
"The acronym CAN-SPAM derives from the bill's full name: Controlling the Assault of Non-Solicited Pornography And Marketing Act of 2003. This is also a play on the usual term for unsolicited email of this type, spam. The bill was sponsored in Congress by Senators Conrad Burns and Ron Wyden.
"The CAN-SPAM Act is commonly referred to as the "You-Can-Spam" Act because the bill explicitly legalizes most e-mail spam. In particular, it does not require e-mailers to get permission before they send marketing messages. It also prevents states from enacting stronger anti-spam protections, and prohibits individuals who receive spam from suing spammers. The Act has been largely unenforced, despite a letter to the FTC from Senator Burns, who noted that "Enforcement is key regarding the CAN-SPAM legislation." In 2004 less than 1% of spam complied with the CAN-SPAM Act of 2003.
"The law required the FTC to report back to Congress within 24 months of the effectiveness of the act. No changes were recommended. It also requires the FTC to promulgate rules to shield consumers from unwanted mobile phone spam. On December 20, 2005 the FTC reported that the volume of spam has begun to level off, and due to enhanced anti-spam technologies, less was reaching consumer inboxes. A significant decrease in sexually-explicit e-mail was also reported.
"Later modifications changed the original CAN-SPAM Act of 2003 by (1) Adding a definition of the term "person"; (2) Modifying the term "sender"; (3) Clarifying that a sender may comply with the act by including a post office box or private mailbox and (4) Clarifying that to submit a valid opt-out request, a recipient cannot be required to pay a fee, provide information other than his or her email address and opt-out preferences, or take any other steps other than sending a reply email message or visiting a single page on an Internet website" (Wikipedia article on CAN-SPAM Act of 2003, accessed 01-19-2010).
In 2004 OCLC (Online Computer Library Center), Dublin, Ohio, served more than 50,540 libraries of all types in the U.S. and 84 countries and territories around the world. OCLC WorldCat contained 56 million catalogue records, representing 894 million holdings.
In 2004 approximately 800,000,000 people worldwide were using the Internet.
In 2004 the Library of Congress contained 130,000,000 physical items on approximately 530 miles of bookshelves. Its collections included more than 29 million books and other printed materials, 2.7 million recordings, 12 million photographs, 4.8 million maps, and 58 million manuscripts.
In 2004 1,200,000 unique book titles were sold. According to an article in the New York Times, only two percent sold more than 5,000 copies.
According to R.R. Bowker, publisher of Books in Print, 375,000 new unique books were published in English during 2004.
According to Sloan-C, A Consortium of Institutions and Organizations Committed to Quality Online Education, 2.35 million students were enrolled in online learning in the United States during 2004.
In 2004 Bob Stein, pioneering commercial multi-media publisher and co-founder of the Voyager Company and The Criterion Collection, co-founded The Institute for the Future of the Book, "a small think-and-do tank investigating the evolution of intellectual discourse as it shifts from printed pages to networked screens."
Inspired by the success of the Wikipedia (which began in 2001), in 2004 Steve Coast, then a computer science student at the University of London, created OpenStreetMap.org as a collaborative project to create a free editable map of the world. "Cartography-obsessed," Coast liked to bicycle around town with a GPS taped to his handlebars and a laptop recording its data in his backpack. Bolstered by the availability of map information and cheap GPS devices, OpenStreetMap (O.S.M) has since grown into a collaboration among some 300,000 map enthusiasts around the world. Anyone can contribute to it and use it free of charge.
"GarageBand is a streamlined digital audio workstation (DAW) and music sequencer that can record and play back multiple tracks of audio. Built-in audio filters that utilize the AU (audio unit) standard allow the user to enhance the audio track with various effects, including reverb, echo, and distortion amongst others. GarageBand also offers the ability to record at both 16-bit and 24-bit Audio Resolution. An included tuning system helps with pitch correction and can effectively imitate the Auto-Tune effect when tuned to the maximum level.
"Virtual software instruments
"GarageBand includes a large selection of realistic, sampled instruments and software modeled synthesizers. These can be used to create original compositions or play music live through the use of a USB MIDI keyboard connected to the computer, an on-screen virtual keyboard, or using a standard QWERTY keyboard with the "musical typing" feature. The synthesizers are broken into 2 groups: [virtual] analog and digital. Each synthesizer has a wide variety of adjustable parameters, including richness, glide, cut off, standard attack, decay, sustain, and release; these allow for a wide array of sounds to be created.
"In addition to the standard tracks, Garageband allows for guitar-specific tracks that can utilize a variety of simulated amplifiers, stomp boxes, and effects processors. These imitate popular hardware from companies including Marshall Amplification, Orange Music Electronic Company, and Fender Musical Instruments Corporation. Up to five simulated effects can be layered on top of the virtual amplifiers, which feature adjustable parameters including tone, reverb, and volume. Guitars can be connected to Macs using the built-in input (requires hardware that can produce a standard stereo signal using a 3.5mm output) or a USB interface.
"GarageBand can import MIDI files and offers piano roll or notation-style editing and playback. By complying with the MIDI Standard, a user can edit many different aspects of a recorded note, including pitch, velocity, and duration. Pitch can be set to 1/128 of a semi-tone, on a scale of 0-127 (sometimes described on a scale of 1-128 for clarification). Velocity, which determines amplitude (volume), can be set and adjusted on a scale of 0-127. Note duration can be adjusted manually via the piano roll or in the score view. Note rhythms can be played via the software instruments, or created in the piano roll environment; rhythm correction is also included to lock notes to any time signature subdivision. GarageBand also offers global editing capabilities to MIDI information with Enhanced Timing, also known as Quantizing. Whilst offering comprehensive control over MIDI files, GarageBand does not include several features of professional-level DAWs, such as a sequencer for drum tracks separate from the normal piano roll. However, many of these shortcomings have been addressed with each successive release of GarageBand.
"A new feature included with GarageBand '09 and later is the ability to download pre-recorded music lessons from GarageBand's Lesson Store for guitar and piano. There are two types of lesson available in the Lesson Store: Basic Lessons, which are a free download, and Artist Lessons, which must be purchased. The first Basic Lessons for both guitar and piano are included with GarageBand. In both types of lesson, a music teacher presents the lesson, which is in a special format offering high quality video and audio instructions. The lessons include a virtual guitar or piano, which demonstrates finger position and a musical notational area to show the correct musical notations. The music examples used in these lessons features popular music. In an Artist Lesson the music teacher is the actual musician/songwriter who composed the song being taught in the lesson. As of November 2009 the artists featured are: Sting (Roxanne, Message in a Bottle, Fragile), Sarah McLachlan (Angel), Patrick Stump of Fall Out Boy (I Don't Care, Sugar, We're Goin' Down), Norah Jones (Thinking About You), Colbie Caillat (Bubbly), Sara Bareilles (Love Song), John Fogerty (Proud Mary, Fortunate Son, Centerfield), Ryan Tedder of OneRepublic (Apologize), Ben Folds (Brick, Zak and Sara), John Legend (Ordinary People), and Alex Lifeson of Rush (Tom Sawyer, Limelight, Working Man, The Spirit of Radio). No new Artist Lessons have been released in 2010, and Apple has not announced plans to release any more" (Wikipedia article on GarageBand, accessed 08-12-2013).
In February 2004 Flickr, the photo- and video-sharing and social networking site, was launched by Ludicorp, a Vancouver, Canada-based company founded by Stewart Butterfield and Caterina Fake. It emerged out of tools originally created for Ludicorp's Game Neverending, a web-based massively multiplayer online game. Its organizational tools allowed photos to be tagged and browsed by folksonomic means.
Ludicorp and Flickr were purchased by Yahoo in March 2005.
"Yahoo reported in June 2011 that Flickr had a total of 51 million registered members and 80 million unique visitors. In August 2011 the site reported that it was hosting more than 6 billion images and this number continues to grow steadily according to reporting sources." (Wikipedia article on Flickr, accessed 03-23-2012).
On February 4, 2004, while a student at Harvard, Mark Zuckerberg founded Thefacebook.com.
The name of the site was later simplified to Facebook. Membership was initially limited to Harvard students, but then expanded to other colleges in the Ivy League. Facebook expanded further to include any university student, then high school students, and, finally, anyone aged 13 and over.
♦ In August 2013, after Facebook had over one billion users, a timeline entitled The Evolution of Facebook was available from The New York Times.
In March 2004 the National Endowment for the Humanities and the Library of Congress founded the National Digital Newspaper Program (NDNP).
"Ultimately over a period of approximately 20 years, NDNP will create a national, digital resource of historically significant newspapers from all the states and U.S. territories published between 1836 and 1922. This searchable database will be permanently maintained at the Library of Congress (LC) and be freely accessible via the Internet. An accompanying national newspaper directory of bibliographic and holdings information on the website will direct users to newspaper titles available in all types of formats."
In May 2004 there were 50,000,000 websites on the Internet.
On May 1, 2004 The Index-Catalogue of the Surgeon-General's Office, a 61-volume printed bibliographical resource for the history of medicine and science, published from 1880 to 1961, was made available online by the United States National Library of Medicine. In its online form the utility of the work was greatly enhanced, since it became a single searchable database rather than a series of physical volumes and different indices published over decades.
This was the culmination of a data conversion project which began in 1996.
On May 12, 2004 archaeologists announced finding what they believed to be the remains of the building site of the ancient Library of Alexandria.
The 13 lecture halls at the building site could have housed as many as 5000 students, raising the possibility that the Library of Alexandria might have been the world's first university.
On July 6, 2004 The Journal of Cell Biology began screening digital images submitted with electronic manuscripts to determine whether these images were manipulated in ways that misrepresented experimental results. The image-screening system that checked for image manipulation took 30 minutes per paper.
At the Frankfurt Book Fair in October 2004 Google announced the Google Print project to scan and make searchable on the Internet the texts of more than ten million books from the collections of the New York Public Library, and the libraries of Michigan, Stanford, Harvard and Oxford Universities.
The project was renamed Google Books in December 2005.
Chris Anderson published "The Long Tail" in the October 2004 issue of Wired magazine. In this article he described "the niche strategy of businesses, such as Amazon.com or Netflix, that sell a large number of unique items, each in relatively small quantities. Anderson elaborated the Long Tail concept in his book The Long Tail: Why the Future of Business Is Selling Less of More.
"A frequency distribution with a long tail—the concept at the root of Anderson's coinage—has been studied by statisticians since at least 1946. The distribution and inventory costs of these businesses allow them to realize significant profit out of selling small volumes of hard-to-find items to many customers, instead of only selling large volumes of a reduced number of popular items. The group that purchases a large number of "non-hit" items is the demographic called the Long Tail.
"Given a large enough availability of choice, a large population of customers, and negligible stocking and distribution costs, the selection and buying pattern of the population results in a power law distribution curve, or Pareto distribution. This suggests that a market with a high freedom of choice will create a certain degree of inequality by favoring the upper 20% of the items ("hits" or "head") against the other 80% ("non-hits" or "long tail"). This is known as the Pareto principle or 80–20 rule.
"The Long Tail concept has found a broad ground for application, research and experimentation. It is a common term in online business and the mass media, but also of importance in micro-finance (Grameen Bank, for example), user-driven innovation (Eric von Hippel), social network mechanisms (e.g., crowdsourcing, crowdcasting, Peer-to-peer), economic models, and marketing (viral marketing)" (Wikipedia article on The Long Tail, accessed 04-19-2009).
Google Maps originated in a C++ program designed by Lars and Jens Rasmussen of Where 2 Technologies in Sydney, Australia. They designed the program to be separately downloaded by users, but pitched it to Google as a Web-based product. Google acquired the company in October 2004. At Google it was transformed into the web application Google Maps. In the same month, Google acquired Keyhole, a geospatial data visualization company, and aspects of Keyhole technology were also incorporated into Google Maps.
Google Maps expanded its coverage and features at an extraordinary rate even for web developments, and its open A.P.I. was incorporated into a remarkable number of websites, including this one. On December 12, 2013 The New York Times published an article entitled "Google's Road Map to Global Domination," from which I quote:
"Today, Google’s map includes the streets of every nation on earth, and Street View has so far collected imagery in a quarter of those countries. The total number of regular users: A billion people, or about half of the Internet-connected population worldwide. Google Maps underlies a million different websites, making its map A.P.I. among the most-used such interfaces on the Internet. At this point Google Maps is essentially what Tim O’Reilly predicted the map would become: part of the information infrastructure, a resource more complete and in many respects more accurate than what governments have...."
"Web 2.0 is a term describing changing trends in the use of World Wide Web technology and web design that aims to enhance creativity, secure information sharing, collaboration and functionality of the web. Web 2.0 concepts have led to the development and evolution of web-based communities and its hosted services, such as social-networking sites, video sharing sites, wikis, blogs, and folksonomies. The term became notable after the first O'Reilly Media Web 2.0 conference in 2004. Although the term suggests a new version of the World Wide Web, it does not refer to an update to any technical specifications, but to changes in the ways software developers and end-users utilize the Web. . . .
"Some technology experts, notably Tim Berners-Lee, have questioned whether one can use the term in any meaningful way, since many of the technology components of "Web 2.0" have existed since the early days of the Web."
"Current thinking about long-term memory in the cortex is focused on changes in the strengths of connections between neurons. But ongoing structural plasticity in the adult brain, including synapse formation/elimination and remodelling of axons and dendrites, suggests that memory could also depend on learning-induced changes in the cortical ‘wiring diagram’. Given that the cortex is sparsely connected, wiring plasticity could provide a substantial boost in storage capacity, although at a cost of more elaborate biological machinery and slower learning."
"The human brain consists of 10 to the 11th power neurons connected by 10 to the 15 power synapses. This awesome network has a remarkable capacity to translate experiences into vast numbers of memories, some of which can last an entire lifetime. These long-term memories survive surgical anaesthesia and epileptic episodes, and thus must involve modifications of neural circuits, most likely at synapses" (Chklovskii, Mel & K. Svoboda, "Cortical Rewiring and Information Storage," Nature, Vol. 431, 782-88).
According to the Pew Internet and American Life Project, by November 2004 8,000,000 American adults said they had created blogs.