Is the finding of James Watson’s black ancestry overblown?

Could it really be true, as reported recently by the Icelandic firm deCODE genetics, that the genetic ancestry of the very aquiline-looking and oddball WASP-ish personality type James Watson, the co-discoverer of the structure of DNA, is 16 percent black and nine percent Asian?

Trying to figure out where the black genetic component in Watson’s DNA comes from, Steve Sailer does some interesting detective work, going back several generations in Watson’s family tree, and finds that there simply is no sign of black ancestry that could account for the 16 percent figure. He concludes:

So, what likely happened is that Watson had a few nonwhite ancestors fairly well back in the past, and their versions of the genes used as genetic markers in deCODE’s analysis, via the luck of the draw in the sexual reproduction shuffle, kept turning up in Watson’s ancestors, greatly exaggerating his overall nonwhite ancestry. But the great majority of his functional genes were inherited from his white ancestors.

Hmm, not exactly a crystal clear explication, is it? All the reader can glean from this is that the sub-Saharan genes found by deCODE aren’t really representative of Watson’s DNA, because of something to do with genetic markers. But what exactly it has to do with genetic markers, and how this mistake could have occurred, and what this says about identifying people’s racial background through their DNA, Sailer barely bothers to tell us. Since he devoted a thousand words to his detective search of Watson’s family which failed to uncover black ancestors (preceded by another thousand words on the one-drop rule and “passing”), couldn’t he at least have devoted a few more words to explaining his alternative theory regarding misleading genetic markers? But such is the art of writing in the Age of the Internet. Everything is just tossed off.

Birch B. writes:

From your post:

“But what exactly it has to do with genetic markers, and how this mistake could have occurred, and what this says about identifying people’s racial background through their DNA, Sailer barely bothers to tell us.”

The big problem with the current ancestry-by-DNA tests is that they test too few DNA variations. A DNA test that could very accurately distinguish a “pure” sub-Saharan African from a “pure” European is not nearly as accurate for telling whether someone is, say, 5, 10, or 15 percent of a given ancestry.

For example, statistical geneticist Neil Risch was able to classify 3500 unrelated black, white, and East Asian subjects with 99.8 percent accuracy, using 350 genetic markers. Achieving the same accuracy when distinguishing someone with about one-sixth (16 percent) ancestry from another group, such as African ancestry in a European, from someone with no such ancestry requires far more markers (roughly 36 times as many, or about 12,600 markers). Even with 12,600 markers as ancestrally informative (in layman’s terms, as effective at determining race) as those used in the Risch study, there would still be a roughly one in 500 chance of a “pure” European coming up as 16 percent African.
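
As a rough check on the arithmetic in the paragraph above, here is a minimal sketch. The "36 times" figure is consistent with the common rule of thumb that sampling error shrinks with the square root of the number of markers, so resolving a signal one-sixth as large needs 6 squared, or 36, times as many markers. That scaling rule is an assumption on my part, not something the comment spells out, and the function names are made up for illustration.

```python
from fractions import Fraction

def markers_needed(base_markers, ancestry_fraction):
    """Markers needed to keep the same accuracy for a smaller ancestry share.

    ASSUMPTION: error scales as 1/sqrt(markers), so the marker count grows
    with the inverse square of the ancestry fraction being detected.
    """
    scale = (1 / Fraction(ancestry_fraction)) ** 2
    return int(base_markers * scale)

def error_odds(accuracy):
    """Turn an accuracy such as 99.8 percent into 'one in N' odds of error."""
    return int(1 / (1 - Fraction(accuracy)))

# Figures quoted in the comment: 350 markers, one-sixth ancestry, 99.8 percent.
print(markers_needed(350, Fraction(1, 6)))      # 12600
print(error_odds(Fraction(998, 1000)))          # 500
```

Exact fractions are used instead of floats so that one-sixth and 99.8 percent do not pick up rounding error on the way to the whole-number results.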

The point of this statistical gobbledygook (which I tried to keep to a bare minimum) is that the problem of false ancestry tests is technological. As genome sequencing becomes cheaper, it will be easier to test more markers, and eventually to test all human DNA variations, which number in the millions. If a test used several million DNA markers, it would be virtually impossible for someone with no sub-Saharan African ancestry to test as even 1 percent sub-Saharan African.

Although a bit off-topic, I’d also like to bring up the issue of “Hispanics.” Since so many of our immigrants are Hispanic, and Hispanics tend to be poor and unsuccessful, and additionally because the “Hispanic” category is a popular target of the “race does not exist” crowd, this is an extremely important issue. If Hispanics don’t represent a group genetically distinct from white Americans, it is easy to argue that their poor performance is due to racism, or at a minimum, that it has nothing to do with genetics and will clear up Real Soon Now. It is true that there are European (Spanish), Asian (Filipino), and African (Dominican) Hispanics. However, most U.S. Hispanics, and Mexicans and Central Americans in particular (who comprise the great majority of Hispanic immigrants, and indeed of American immigrants as a whole) are of mostly American Indian descent. So while “Hispanic” does not necessarily indicate any particular geographic or racial ancestry, it is a very good bet that any given American Hispanic is mostly American Indian.

LA replies:

Thanks for the explanation.

One way of getting around the “Hispanic” problem is by speaking instead of Mestizo immigrants.

EG writes:

The problem is you are assuming that Steve Sailer knows what he’s talking about with respect to the Watson/ancestry testing question, and that we need to therefore “understand” Sailer’s convoluted and misleading writings.

Not so.

“Birch B” is somewhat more on target, but there are some things that need to be added. With respect to Decode’s Watson ancestry “data”: they did not provide any information to give context to the “finding.” For example, when DNAPrint gives data to customers they at least give “confidence intervals,” so one can see the range in which a particular alleged “admixture” may fall, and on their website they give the levels of each type of “admixture” required to be “real” at the 95 percent confidence level. That is, of course, still a one in 20 chance of error, so BB’s comments hold in that regard. However, at the current level of technology, this is the sort of information that must be given along with raw numbers. With Decode and Watson, we know none of this, just a “16 percent African” claim. What’s the range? For Decode, what would be the 95 percent confidence level for African ancestry in Europeans (for DNAPrint, it is about four percent)? Did Decode analyze Watson with all one million markers they use, or just a fraction? Is Watson’s publicized genome sequence really correct? And, what exactly are Decode’s methods?

This is in addition to the ethical issues. Yes, Watson made his genome public, so he doesn’t have a legal basis for complaint, but was it necessary for Decode to reveal information about Watson’s ancestry and disease risk without his permission? It may have been legal, but was it moral and ethical? What does it say about that company that they would do that to a public figure? Or to anyone? John Hawks argues that Decode did damage to the field of personalized genomics in their eagerness to “expose” Watson, and I agree. I suspect that someone like Jared Taylor, or even Sailer or even yourself, would need to carefully consider the issue of getting “watsoned” if using personalized genomic services.

Dan M. writes:

