Google Books Ngram Viewer


The Google Books Ngram Viewer is an online search engine that charts the frequencies of any set of search strings using a yearly count of n-grams found in printed sources published between 1500 and 2022 in Google's text corpora in English, Chinese (simplified), French, German, Hebrew, Italian, Russian, or Spanish. There are also some specialized English corpora, such as American English, British English, and English Fiction.

The program can search for a word or a phrase, including misspellings or gibberish. The n-grams are matched against the text of the selected corpus and, if found in 40 or more books, are plotted on a graph. The Google Books Ngram Viewer also supports part-of-speech searches and wildcards. It is routinely used in research.

History

During development, Google teamed up with two Harvard researchers, Jean-Baptiste Michel and Erez Lieberman Aiden, and quietly released the program on December 16, 2010. Before the release, it was difficult to quantify the rate of linguistic change because no database had been designed for that purpose, said Steven Pinker, a well-known linguist and a co-author of the Science paper published on the same day. The Google Books Ngram Viewer was developed in the hope of opening a new window onto quantitative research in the humanities, and from the very beginning the publicly available database contained 500 billion words from 5.2 million books.

The intended audience was scholarly, but the Google Books Ngram Viewer made it easy for anyone with a computer to see a graph of the diachronic change in the use of words and phrases. Lieberman Aiden told The New York Times that the developers aimed to give even children the ability to browse cultural trends throughout history. In the Science paper, Lieberman Aiden and his collaborators called the method of high-volume data analysis of digitized texts "culturomics".

Usage

Commas delimit user-entered search terms; each comma-separated term is searched in the database as an n-gram (for example, "nursery school" is a 2-gram, or bigram). The Ngram Viewer then returns a line chart. Because of limits on the size of the Ngram database, only n-grams found in at least 40 books are indexed.
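
As a rough illustration of the query handling described above, the following Python sketch splits a comma-delimited query into terms, derives each term's n-gram order from a simple whitespace split, and applies the 40-book threshold. The function names, the `book_counts` mapping, and its figures are hypothetical and chosen only for illustration; the Ngram Viewer's actual tokenizer and index may behave differently.

```python
MIN_BOOKS = 40  # threshold stated in the text above


def parse_query(query: str) -> list[tuple[str, int]]:
    """Split a comma-delimited query into (term, n) pairs.

    n is the number of whitespace-separated tokens, which is a
    simplification of whatever tokenization the real service applies.
    """
    terms = [t.strip() for t in query.split(",")]
    return [(t, len(t.split())) for t in terms if t]


def indexed_terms(terms, book_counts):
    """Keep only the terms that appear in at least MIN_BOOKS books."""
    return [t for t, _ in terms if book_counts.get(t, 0) >= MIN_BOOKS]


# Made-up book counts, purely for demonstration.
book_counts = {"nursery school": 1200, "kindergarten": 5400, "child care": 39}

parsed = parse_query("nursery school, kindergarten, child care")
print(parsed)
print(indexed_terms(parsed, book_counts))
```

In this made-up data set, "child care" falls one book short of the threshold and would therefore not be charted.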

Limitations

The data sets of the Ngram Viewer have been criticized for their reliance on inaccurate optical character recognition (OCR) and for including large numbers of incorrectly dated and categorized texts. Because of these errors, and because the corpora are not controlled for bias (such as the increasing amount of scientific literature, which causes other terms to appear to decline in popularity), care must be taken when using them to study language or test theories. Furthermore, the data sets may not reflect general linguistic or cultural change and can only hint at such an effect, because, to avoid potential copyright infringement, they do not include metadata such as date published, author, length, or genre.
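
The composition bias mentioned above can be illustrated with a small worked example. The sketch below assumes that a term's plotted value is its yearly count divided by the total count for that year; all figures are invented, and the function name `relative_frequency` is hypothetical.

```python
def relative_frequency(term_count: int, total_count: int) -> float:
    """Share of all n-grams in a year accounted for by one term."""
    return term_count / total_count


# Hypothetical yearly figures: the term's absolute use is constant,
# but the corpus doubles in size (for example, through added
# scientific literature).
years = {
    1950: {"term": 1_000, "total": 1_000_000},
    2000: {"term": 1_000, "total": 2_000_000},
}

freqs = {y: relative_frequency(c["term"], c["total"]) for y, c in years.items()}
for year, freq in freqs.items():
    print(year, f"{freq:.4%}")
```

Under these assumptions the term's relative frequency halves between the two years although its absolute count is unchanged, which is the kind of apparent decline that an uncontrolled change in corpus composition can produce.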

Systematic errors, such as the confusion of s and f in pre-19th-century texts (caused by ſ, the long s, which resembles f), can introduce systematic bias. Although the Google Books team claims that the results are reliable from 1800 onward, poor OCR and insufficient data mean that frequencies given for languages such as Chinese may only be accurate from 1970 onward; earlier parts of the corpus show no results at all for common terms, and data for some years contain more than 50% noise.
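
One way to probe the s/f confusion is to generate the spelling that OCR would plausibly produce and query it alongside the modern form. The Python sketch below is a minimal illustration, not a method taken from the sources cited here. It makes the simplifying assumption that every lowercase s except a word-final one was printed as a long s and misread as f, whereas historical typesetting conventions were more varied. The function name `long_s_misreading` is hypothetical.

```python
def long_s_misreading(word: str) -> str:
    """Return the form OCR might produce if every long s were read as 'f'.

    Simplifying assumption: every lowercase 's' except a word-final one
    was printed as a long s. Real typesetting rules were more varied.
    """
    if not word:
        return word
    body, last = word[:-1], word[-1]
    return body.replace("s", "f") + last


for w in ["best", "session", "glass"]:
    print(w, "->", long_s_misreading(w))
```

A researcher could, for instance, enter both "best" and the generated variant "beft" as comma-separated terms and compare the two curves for early years.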

Guidelines for conducting research with Google Ngram data have been proposed to address some of the issues discussed above.

References

Michel, Jean-Baptiste; Shen, Yuan K.; Aiden, Aviva P.; Veres, Adrian; Gray, Matthew K.; The Google Books Team; Pickett, Joseph P.; Hoiberg, Dale; Clancy, Dan; Norvig, Peter; Orwant, Jon; Pinker, Steven; Nowak, Martin A.; Aiden, Erez L. (2010). "Quantitative Analysis of Culture Using Millions of Digitized Books". Science. 331 (6014): 176–182. doi:10.1126/science.1199644. PMC 3279742. PMID 21163965.
Bosker, Bianca (2010-12-17). "Google Ngram Database Tracks Popularity Of 500 Billion Words". The Huffington Post. Retrieved 2012-05-31.
Whitney, Lance (2010-12-17). "Google's Ngram Viewer: A time machine for wordplay". Cnet.com. Archived from the original on 2014-01-23. Retrieved 2012-05-31.
@searchliaison (July 13, 2020). "The Google Books Ngram Viewer has now been updated with fresh data through 2019" (Tweet). Retrieved 2020-08-11 – via Twitter.
"Google Books Ngram Viewer - University at Buffalo Libraries". Lib.Buffalo.edu. 2011-08-22. Archived from the original on 2013-07-02. Retrieved 2012-05-31.
"Google Books Ngram Viewer - Information". Retrieved 2024-06-01.
Greenfield, Patricia M. (2013). "The Changing Psychology of Culture From 1800 Through 2000". Psychological Science. 24 (9): 1722–1731. doi:10.1177/0956797613479387. ISSN 0956-7976. PMID 23925305. S2CID 6123553.
Younes, Nadja; Reips, Ulf-Dietrich (2018). "The changing psychology of culture in German-speaking countries: A Google Ngram study". International Journal of Psychology. 53: 53–62. doi:10.1002/ijop.12428. PMID 28474338. S2CID 7440938.
"In 500 Billion Words, New Window on Culture". The New York Times. 2010-12-16. Retrieved 2024-06-01.
"Steven Pinker – The Stuff of Thought: Language as a window into human nature". Royal Society of Arts. 2010-02-04. Retrieved 2024-06-02 – via YouTube.
Nunberg, Geoff (2010-12-16). "Humanities research with the Google Books corpus". Archived from the original on 2016-03-10. Retrieved 2015-04-19.
Pechenick, Eitan Adam; Danforth, Christopher M.; Dodds, Peter Sheridan; Barrat, Alain (2015-10-07). "Characterizing the Google Books Corpus: Strong Limits to Inferences of Socio-Cultural and Linguistic Evolution". PLOS One. 10 (10): e0137041. arXiv:1501.00960. Bibcode:2015PLoSO..1037041P. doi:10.1371/journal.pone.0137041. PMC 4596490. PMID 26445406.
Zhang, Sarah. "The Pitfalls of Using Google Ngram to Study Language". WIRED. Retrieved 2017-05-24.
Koplenig, Alexander (2015-09-02). "The impact of lacking metadata for the measurement of cultural and linguistic change using the Google Ngram data sets—Reconstructing the composition of the German corpus in times of WWII". Digital Scholarship in the Humanities. 32 (1). Oxford Academic (published 2017-04-01): 169–188. doi:10.1093/llc/fqv037. ISSN 2055-7671.
"Google n-grams and pre-modern Chinese". digitalsinology.org. Retrieved 2015-04-19.
"When n-grams go bad". digitalsinology.org. Retrieved 2015-04-19.
Younes, Nadja; Reips, Ulf-Dietrich (2019-03-22). "Guideline for improving the reliability of Google Ngram studies: Evidence from religious terms". PLOS One. 14 (3): e0213554. Bibcode:2019PLoSO..1413554Y. doi:10.1371/journal.pone.0213554. ISSN 1932-6203. PMC 6430395. PMID 30901329.

Bibliography

Lin, Yuri; et al. (July 2012). "Syntactic Annotations for the Google Books Ngram Corpus" (PDF). Proceedings of the 50th Annual Meeting. Demo Papers. 2. Jeju, Republic of Korea: Association for Computational Linguistics: 169–174. 2390499. Whitepaper presenting the 2012 edition of the Google Books Ngram Corpus

