August 24th, 2005

Data is or data are / pronouncing colonel / more about the vowel W

by Barbara Wallraff

J.R. Schelhass, of Livonia, Mich., writes: “‘The data show that ...’ ‘The data are not ...’ I guess these are correct, but they sound clumsy, contrived and snobbish. To me, ‘data’ is a collection of information. So a collection ‘is.’ I would appreciate your comments.”

Dear J.R.: “Data,” like “bacteria,” “criteria” and “phenomena,” is a Latin plural. But we tend to overlook this grammatical fact for a few reasons. For one, our English-tuned ears expect plurals to end in “s”: “Data” doesn’t sound plural. For another, hardly anybody ever uses the singular: When was the last time you heard someone talking about a “datum”? Nowadays people say “data point” instead. What’s more, as you point out, the word often means “a collection of information.” So a sentence like “The data is incomplete” seems natural.

I suspect it’s inevitable that “data” will become mainly a singular noun in English. When it does, it will be following in the footsteps of “agenda” -- which used to be plural, with an “agendum” being an individual item of business to be considered. For now, though, we have choices to make about how to use “data.” May I suggest that to show a decent respect for its history, you treat it as plural when it means, essentially, “facts”? Thus “The data show that ...” means “The facts show ...” But sometimes “data” is something abstract and uncountable, and the word means more nearly “information,” as in “The data is in electronic form.” (Admittedly, the two meanings often overlap.) When you could substitute “information” but not “facts” for “data,” then using it as a singular noun is fine. Until further notice, though, “bacteria,” “criteria” and “phenomena” remain plural only.

Patricia Mills, of Albany, N.Y., writes: “I’m an English teacher and a stickler. One of the great mysteries in my life is this: How did the word ‘colonel’ come to be pronounced ‘kernel’?”

Dear Patricia: Don’t you find, alas, that the solutions to great mysteries are often disappointing? The answer to your question is a shaggy-dog story. In brief: We got the word in the 16th century from the French. They wrote “coronel” (or “coronnel”), and many English-speakers did too until about 1650, when it began to occur to everybody that the word is actually related to the Latin root for “column” (because the colonel led a column of soldiers), not “crown.” Around that time, too, English-speakers stopped pronouncing the word with all three syllables that we see when the word is written and gave it just two: “col-nel.” But “col-nel” is hard to say. This fact, apparently together with the memory of the earlier spelling, encouraged people to say “cor-nel,” or “kernel.”

P.S.: In last week’s column I said that “W serves as a vowel only in combination with another vowel (in particular, A, E or O).” Theron Downes, of Okemos, Mich., responded: “My dictionary and others I’ve checked include ‘cwm’ variously defined as a mountain lake, valley or cirque.” Theron: Right you are, and thanks for letting Word Court know about this unusual word, in which W is a stand-alone vowel. But what’s a cirque? My favorite dictionary, the American Heritage, which gives only that meaning for “cwm,” defines “cirque” as “a steep bowl-shaped hollow occurring at the upper end of a mountain valley, especially one forming the head of a glacier or stream.”

