The Measurement of a “World in Turmoil” : Background to a Terrible Idea.

Flashing Lights: a screenshot of impending doom or liberation? ("Foreign Policy"/John Beieler/Google Maps/CartoDB)

Foreign Policy magazine this week declared that we are living “in a world of turmoil”: that the last decade has seen a dramatic increase in worldwide protest and unrest. In Washington, D.C., it surely must feel that way. Luckily, they have statistics to prove it: a Penn State doctoral candidate has mapped “Every Protest on the Planet Since 1979” and the maps show “what data from a world in turmoil looks like.” Flashing lights begin slowly in 1979 and, as they near our present day, a riot of seizure-inducing flashes spreads across the globe from Europe, North America, and the Middle East to cover every dark corner but the mid-oceans. It shows we are in an unprecedented age of revolution: welcomed by revolutionaries but the worst fear of the powerful.

Picked up by a series of websites, the map is a “Mind-Blowing” “riveting visual of our world’s protests, riots and demonstrations” which “offers historians and social scientists a new …way of understanding how protest behavior and civic unrest changes over time.”

The data behind this flashy map shows nothing of the sort, and the researcher himself makes caveats that largely contradict any such sweeping statements, saying the events are tied to press reporting and that identifying patterns is “tricky”. Yet the best sober takeaway the maker can offer is that “the areas that are ‘bright’ [lit up with flashes that indicate protests] are those that would generally be expected to be so.” So the conclusion is that our conclusions were correct all along. Given that everything is blinking like a slot machine, it seems we’re not really so big on caveats. And that is certainly the inescapable conclusion anyone looking at this will come away with.

It would be nice to be able to explain things like social movements with an animated dataset. Given the rise of machines in every aspect of our lives it must be tempting to put to work their huge potential power, and their cultural cachet as agents of productivity and objectivity. The popular reach of this one video has already far outstripped that of most poli-sci papers the map maker’s academic advisors have ever published.

The idea that human history can be reduced to calculations or mapping of aggregate ranked events or other such statistics is generally greeted with skepticism in academic circles (you know, the ones not mentioned on Reddit). Yet a growing minority of academics, less sober even than those mentioned above, are positively ecstatic about such projects. A cadre of professors from the former Soviet Union and American free-market economists, led rhetorically by one Peter Turchin, have championed a new field of study for themselves which, by using a handful of basic statistical formulas, will not only explain human society but will predict future events. “The only way to do science is to make predictions and then test them with data,” says Turchin. And his is the Science of History. “Cliodynamics,” as they call their discipline, provides such astounding insights as this late 2011 paper which predicts, as the Arab Spring was in full boil, that we might expect more disorder in North Africa and West Asia in the future! Who could have guessed?

The mechanism for this statement of the obvious is no more impressive: a team of writers (it does take a village, after all) concludes that the “Malthusian Trap” causes “disorder” just as nations escape it. This is despite the fact that Malthus was wrong in his predictions, that the Malthusian trap has long been agreed to be bunk, and that its application has frequently been shown to be a way in which subtly racist Westerners worry about loss of control over the ‘irrationally breeding’ ‘darker races’. But we have statistics!

“Cliodynamics,” like the ghost of the worst of the 19th century’s Positivists, has rightly been rubbished, but in terms far too soft for the damage that the dozens of Doctorates of Cliodynamics now being created are likely to do over the coming generation. Yet the same set of tech blogs and news sites pushing this map of world disorder have previously fawned over Cliodynamics as a “scientific theory… that broad trends of history occur in predictable patterns based on common factors” and “a fascinating new area, one that, along with computational social science, could represent the future of both history and social studies in general.” And it makes perfect sense: you marry modern technology to numeric facts, and history is explained. Or perhaps you simply collect MORE data (“long data”). With no guide to how society works except a few popular histories, received wisdom, the experience of American society, and the things you’ve read off the back of sugar packets, we’ll use data (which “data points” are consistent and available from across human history we’re sure will be obvious) to not only explain all human society, but to enable us to see that what is important to a modern American is both relevant and important to her distant relatives or his offspring a thousand years in the future. History is just modern Americans in period costume, after all.

This all returns to the original sin of “statistical” social science: that human society does not appear with numbers attached to it. Someone must apply numbers to people, things, or selected “events”, often across history and cultures. Someone must say that this moment is “an event” while another is “normal”. More than any math they inflict on those numbers in the course of their research, it is how researchers come by their initial numbers that, more often than not, determines their results. Apart from birth and death, there are few aspects of society that come pre-numbered. Might raw demography, then, be the safest place to begin to enumerate social change?

Yet even these run into a thicket of problems of interpretation, making cross-cultural comparisons and meaningful time series almost impossible. In Sahelian West Africa, censuses for a hundred years attempted to measure residents by governing subdivision, and often failed in ways not acknowledged in aggregate statistics. Simple head counts (so useful in taxing the hell out of subject peoples) were confounded by the seasonal migration common among nomadic communities. So the French colonial rulers invented for nomads a parallel system of “tribes” and “fractions” which were roughly equal to communes and villages. Even when this succeeded in gaining an accurate count of nomads, it created self-fulfilling roles for these communities: as being somehow inherently organized by family ties and not shared place. It separated them from non-nomads who might even be members of the same family or ethnic group. It helped deepen the split between these two rather arbitrary divisions of a community. Nomads might spend a season in a particular village every year and become integrated into the web of ethnic/professional groups there, to the point that fields depended upon the nomads’ livestock to provide the manure which enabled crops to grow. The very lives of each family might depend upon the resources of their seasonal neighbors at times of drought. Meanwhile, a fisherman or metalworker ethnic/professional group in the same village might remain legally part of that fixed commune in a way the nomads were suddenly not. Just because they are not in the area when censuses are done, nomads now have a different local governance system, placed at odds with that shared by their neighbors. That the marker of importance is now “tribe” further helps create an ethnic cleavage in this (but only this) part of a multi-ethnic society.

As if this were not enough, our little commune might have a third of its population travel abroad during the dry season every year to do wage work. How shall we even count the population if the census falls in this period?

If we can’t get even an accurate population number without being willing to understand the differences in societies that stymie cross-cultural enumeration (and if even that process might involve creating an “observer effect” worthy of any particle physicist), one can guarantee that academics who try to quantify something as complex as global movements for social change are not only bound to fail, but to create conclusions which often become little more than a cipher for all the biases of the observer. The math professor pushing “Long Data” suggests we might start out with something easy, like analyzing “…data on the number of countries every half-century since the fall of the Roman Empire…” But if we can’t decide what a polity, a community, or a locality is consistently over one century in one place, how could we possibly count “numbers of countries” over twenty centuries? We and our ancestors don’t agree on what “a country” is, even assuming we have undisputed history for every political entity in the last two thousand years. How many “countries” were there in Central America between 500 A.D. and 1500 A.D.? We can only answer such poorly formed questions if we impose our ways of organizing the world on cultures and times to which they have no relevance, and pretend we know things we do not. The conclusions we reap from such science quickly become nonsense. But the column of numbers produced by this mess hides all these flaws.

The British imperial economist Josiah Stamp’s dictum is the usual warning given. In discounting Imperial statistics in India, Stamp noted that the Indian village watchman, the “chowky dar”, had much more to do with the outcome of a statistical study than his British military rulers did.

“The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the chowky dar, who just puts down what he damn pleases.”

That village watchman would have been a fool to go along anyway: no British statistical study was going to bring his family anything but harm. He must have known that there is nothing neutral in statistics: as in the age of imperialism, they are generally part of a project.

Yet the real strength of today’s most outlandish and uncritical projects, as tools for reinforcing ideology, is precisely their weakness as actual “science”. Like skull-measuring, crude statistical social science is most useful in reinforcing prejudice because there is nothing in it to divert its conclusions in unexpected directions. There is no debate, no “on this side” or “on that side”. There is a chart and a column of figures and a firm conclusion, even if constructed on nonsense. We select the facts that “make sense” and the conclusion will “make sense”. It is an empty box, into which we place our unspoken agreements and unchallenged assumptions about what is in fact a very complex world.

There is of course a legendary host of brilliant, careful social scientists who have made statistical science the basis of their research: the great Charles Tilly comes to mind instantly as a writer who revolutionized how we look at 19th century Europe through his careful and clever application of huge data. And like dozens of writers, he warned against equating datasets and regression analysis with knowledge. This he called

“the abuse of the Great Blender, in which we take numerical observations on a hundred-odd national states, made comparable by the magic of appearing in parallel columns of a statistical handbook, and run multiple regressions or factor analyses in order to determine dimensions of development, of modernity, of political instability, or of some other equally ill-defined global concept.”

Of all the lessons of his work, you’d think social scientists would at least have heeded that one.

But these are today not the failings of a few marginal cranks. A few years ago I saw a team of Harvard/NYU/Berkeley development economists produce a paper setting out to prove that Walter Rodney was wrong: Europe had not underdeveloped Africa. They did this in one short article and a 68k data file, by rating the income and “technological development” of dozens of nation states around the world at 1000 B.C., 0 A.D., and 1500 A.D. From this they discovered that Guinea had lower income and technological development than Honduras or Latvia at each stage. Bingo: it wasn’t colonialism’s fault.

If you think I’m exaggerating for effect, you can read this utterly ridiculous paper for yourself. I remember one of the authors telling me at the time “we used the best data available” in response to a rather pointed four-letter critique. Leaving aside the fact that these nation-states didn’t exist at any of these dates and that we have no real knowledge or measure of aggregate income for any of these “places” for any of these dates, how in God’s name do you rank “Technological development”?

The answer is, you cannot. This, like piles of academic work, is a front for confirming prejudices, nothing more. And that these sorts of things are accepted by the top publications in their fields suggests that we should probably just fence off those fields to avoid injury.

We don’t, because such conclusions are useful to power. They enable World Bank directors and Western politicians to get on a stage and tell people that what the rich want to do is approved by science.  There is a powerful lobby — and the psychology of wanting to believe your greed is actually not a choice — at play here.

To expect that this sort of “predictive” social science is not tied to control is naive. From the birth of the statistical sciences, this is how data has been used. The writer Ian Hacking famously charted how the rise of statistics in 19th century Europe was entwined with government control of land and people.

“We obtain data about a governed class whose deportment is offensive, and then attempt to alter what we guess are relevant conditions of that class in order to change the laws of statistics that the class obeys. This is the essence of the style of government that in the United States is called ‘liberal’. …the intentions are benevolent. The ‘we’ who know best change the statistical laws that affect ‘them’.”

And this emphasizes why, despite the warnings of academics like Charles Tilly, such nonsense is still extracted from statistics.

Statistical study of ill-defined and anxiety-producing topics both produces the outcome with which the researchers began and gives that starting point the respectability of science. There will always be funding for such research because there will always be publicity and use for it, while a closely argued text that makes no pretense of finality or objectivity is of no such use.

In defense of the Penn State mapper, the associated actual research is filled with caveats, and spends pages analyzing the frequency of wire service articles. But the conclusions don’t seem to extend beyond technical fixes for the strengths or weaknesses of various codesets or batches of wire service data. That this might not be a question best answered with statistics never seems to come up. And the caveats don’t seem to translate to his maps at all: these problems don’t discredit either mapping as a strategy or the data as a source, because mapping data is what we have.

The tools are new and shiny and not particularly complex: a MySQL database and the Google Maps API are all you need. This is then fed hunks of data extracted from the Texas-based “Global Database of Events, Language, and Tone (GDELT)”, which has daily pulled “events” out of newspapers and other sources around the world, giving them categories and geocodes, but apparently not much context. This is the academic equivalent of giving a toddler a handgun and telling her to be careful. You just know it’s not going to end well, and it’s hard to fault the toddler.

In fact, the context of this academic work has an interesting sidelight on the history — even the purpose — of the GDELT:

“These efforts built a substantial foundation for event data. By the mid-2000s, virtually all refereed articles in political science journals used machine-coded, rather than human-coded, event data. However, the overall investment in machine coding technology remained relatively small. This situation changed dramatically with the DARPA-funded Integrated Conflict Early Warning System project, which invested substantial resources in event data development using automated methods.” (Leetaru and Schrodt 2013, p. 1)

DARPA is of course the United States military’s Defense Advanced Research Projects Agency, those who brought you the Internet, and who currently fund such pure science as the “High Energy Liquid Laser Area Defense System”, “Persistent Close Air Support”, the Boeing X-37 Orbital Test Vehicle, the “ArcLight” ship-based missile system, and the “ACTUV unmanned Anti-submarine warfare vessel”. There are no prizes for guessing why they would want to predict protests and other unrest in the world: to stop them.

Despite the talk and the undoubted money involved, in looking at my area of knowledge (West Africa) GDELT’s data fails quite remarkably.  It not only misses West African history, it wildly overstates recent events, totally skewing any conclusions you can make, like those Foreign Policy appears to have made with talk of “a world of upheaval.”

On this map protest “blips” start appearing regularly in West Africa in the late 90s, and nothing appears (meaning fewer than 10 events a year in a national subdivision) between 1979 and 1991. So for example, the mass student strikes and protests in Bamako in 1979 and 1980 don’t rate a blip. In the 1991 revolution (yes, “revolution”, so a fair few protests before and after across the country) Bamako gets 1 blip, while the equally dramatic events in Niamey and the rest of neighboring Niger get nothing. Years of protests and strikes roiled these two nations in the early to mid 1990s. And yet every demo in support of the 1999 Nigerien coup (somewhat astroturfed) seems to appear, and the impressive (but not 1991 level) 2005 cost-of-living protests in Niamey look like freaking fireworks. From here on a solid array of flashing lights appears in these nations.

So what happened in the late 90s? Easy. Internet reporting. West African newspapers start to go online, slowly and irregularly at first, but between 2004 and 2009 or so in huge numbers. And suddenly, across West Africa, things light up like a Christmas tree. This is a major and incredibly obvious flaw. It suggests among other things that the 2000s were a time of increased protest in these nations, when in fact the early to mid 90s saw much more popular protest. Any Nigerien will tell you this, but no one asked them: they “ran the numbers.”
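The mechanism is easy to sketch in a few lines of Python. This is a toy model, not an analysis of GDELT: every number in it is invented for illustration, and the function `recorded_events` and its `capture_per_outlet` parameter are hypothetical names of my own. It assumes each online outlet independently reports any given protest with a small fixed probability, so that a protest enters the database if at least one outlet covers it.

```python
# Toy model of reporting bias. All figures are invented, not GDELT data.

def recorded_events(true_protests, online_outlets, capture_per_outlet=0.02):
    """Expected number of protests a news-scraping database records in a year.

    Assumption: each online outlet independently reports a given protest
    with probability `capture_per_outlet`; a protest is recorded if at
    least one outlet reports it.
    """
    p_recorded = 1 - (1 - capture_per_outlet) ** online_outlets
    return true_protests * p_recorded

# Assumed "true" protests per year: more in the early 1990s than later.
true_protests = {1991: 300, 1995: 250, 1999: 150, 2005: 150, 2009: 150}

# Assumed number of local outlets publishing online in each year.
online_outlets = {1991: 0, 1995: 1, 1999: 5, 2005: 40, 2009: 120}

recorded = {
    year: recorded_events(true_protests[year], online_outlets[year])
    for year in true_protests
}

for year in sorted(recorded):
    print(f"{year}: true={true_protests[year]:>3}  recorded={recorded[year]:6.1f}")
```

Under these made-up inputs the recorded series climbs every year while the underlying protest count falls and then stays flat. The map would be charting the spread of online newspapers, not the spread of protest.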

I fear this is replicated elsewhere, having already seen Latvian commentators make the same complaint. If so, it renders the conclusion that the world is getting more activist entirely unsupported by this evidence. Something both the “Global Database of Events, Language, and Tone (GDELT)” and the person’s advisor should have cottoned onto long before this. That they didn’t, and that people in the press are making grand self-supporting conclusions from this obviously flawed dataset, is my real worry.

While many folks I know are sharing this report as happy evidence that people around the globe are shaking off oppression, those influential Foreign Policy-reading apparatchiks who see this mean something very different when they say the world is in turmoil. Peter Turchin, like the Penn State map maker and the DARPA scientists, is primarily interested in “Disorder”: the word comes up again and again. I find it hard to ignore that predicting disorder in the world is a decade-long preoccupation of the United States military, and a potential big business for anyone who can provide them such predictions. The most recent public foray into this area was a disaster: in 2011 it was revealed that DARPA and Lockheed Martin had “spent more than $125 million on computer models that are supposed to forecast political unrest” but had failed to predict any of the Arab Spring protests. A lineage decades long, in fact, of US projects, run through the National Defense University, intelligence agencies, and others, had only predicted the blindingly obvious, and yet more were always funded. What DARPA and Lockheed and similar computer science driven “solutions” to explaining history can offer is the glare of very expensive “technology.”

Not only is this a captive ready market, as the United States “Homeland Security” market balloons from $7 Billion annually in 2006 to over $70 Billion in 2011, providing predictable scientific explanations for human society is immensely comforting to a people who feel themselves more threatened by the outside world and less sure of their dominance than any time in the last hundred years.

Social science trying to meet those demands is not unlike the British “bomb detector” maker, who sold 12,000 empty boxes with radio antennas attached to them for $20,000 a pop to governments and militaries, with the promotion of the British defense establishment.  In the midst of the disastrous Iraq War, the British Military and its allies were eager for anything that might stop bombs before they could do harm, and were willing to believe a literal “black box” could do this where no other known strategy could. There is a rush on private military contractors, greater policing and intelligence budgeting, and a stronger hand against disorder.  Those institutions see all these flashing lights not as flickers of resistance but warning lights.  To find a science that can predict them is both a fevered hope and a supremely comforting dream.  But it is not a very accurate view of history.

 

Cited:

  • Beieler, John. Animated Protest Mapping. Jul 31st, 2013
    http://johnbeieler.org/blog/2013/07/31/animated-protest-mapping/
  • Blaut, J.M. The Colonizer’s Model of the World: Geographical Diffusionism and Eurocentric History. Guilford Press, 2012
    pp.65-67
  • Comin, Diego, William Easterly, and Erick Gong. 2010. “Was the Wealth of Nations Determined in 1000 BC?” American Economic Journal: Macroeconomics, 2(3): 65-97.
  • Hacking, Ian. The Taming of Chance. Cambridge University Press, 1990.
    p.119.
  • Korotayev, Andrey; Zinkina, Julia; Kobzeva, Svetlana; Bozhevolnov, Justislav; Khaltourina, Daria; Malkov, Artemy; et al. (2011). A Trap At The Escape From The Trap? Demographic-Structural Factors of Political Instability in Modern Africa and West Asia. Cliodynamics: The Journal of Theoretical and Mathematical History, 2(2). irows_cliodynamics_217. Retrieved from: http://escholarship.org/uc/item/79t737gt
  • Leetaru, Kalev and Schrodt, Philip. (2013). GDELT: Global Data on Events, Language, and Tone, 1979-2012. International Studies Association Annual Conference, April 2013. San Diego, CA. http://gdelt.utdallas.edu/about.html
  • Leibler, Anat. Statisticians’ Ambition: Governmentality, Modernity and National Legibility. Israel Studies. Vol. 9, No. 2, Science, Technology and Israeli Society (Summer, 2004), pp. 121-149 http://www.academia.edu/2464395/Statisticians_Ambition_Governmentality_Modernity_and_National_Legibility
  • Tilly, Charles. Big Structures, Large Processes, Huge Comparisons. Russell Sage Foundation, 1984
    pp.117-118
  • Winckler, Onn. Arab Political Demography: Population growth and natalist policies. Sussex Academic Press, 2005
    p.147
