Are ‘predatory’ publishers an American export?

Research shows that 65% of the so-called predatory publishers named on Beall’s list may be based in the United States and may not, as much of the ‘predatory’ rhetoric would indicate, be a developing-world or offshore phenomenon.

There is no doubt, of course, that true predatory publishers should be named and shamed. However, in my previous post on Open Access vs. Neocolonialism I alluded to a potentially unpalatable subtext of the debate on who or what constitutes a ‘predatory’ publisher: that of western chauvinism or ‘neocolonialism’. Beall’s list of predatory Open Access publishers is an often-cited collection of so-called predatory publishers. There is little specific in Beall’s criteria that directly links a publisher’s undefined (and therefore assumed to be non-first world) location with ‘predatory’ status, but much that hints at it. Take, for example, this comment from a May 2012 post:

“The publisher claims to be based in Cambridge, Massachusetts, but the poor and non-standard English throughout the site belies its true location: I think this is really an Indian operation”

This post was heavily, and rightly, criticised by Karen Coyle in Library Journal as using language that seems “to be bordering on racism”. This, to me, seems too emotional a rebuttal. Instead I feel that Beall’s view expresses not racism but rather an ingrained cultural chauvinism. Indian English (as opposed to ‘Hinglish’) is an established sub-continental language with its own specific set of vocabularies and many tens of millions of speakers. It may be considered ‘poor’ by a speaker of US English but, as we all know, some say the same of US English in relation to Standard ‘English’ English. Beall’s comment exposes his normative world view; to judge quality on a misreading of language is unfair, and his conversion of ‘Indian’ into a pejorative is worrying.

To some, this may seem to be a case of cultural hairsplitting. But some more specific criticisms that move beyond language have been levelled at Beall’s criteria by Coyle:

Some of the criteria seem to make first world assumptions that aren’t valid worldwide, such as “The publishers’ officers use email addresses that end in, or some other free email supplier.” It isn’t difficult to find third-world university web sites with entire faculty lists that use Yahoo! email, undoubtedly because of minimal technical support at their institution.

Having lived and worked in India for two years, I know firsthand that there is still reliance on generic email addresses in the professions (as there was in the developed world until relatively recently). Making any kind of value judgement based exclusively on western assumptions is lazy and, most importantly, fails to spot the iceberg of cultural complexity. This complexity, conveniently elided by a normative view of the world, enables local commentators to say of India, with much good reason, and in a specifically Indian idiom, that ‘we are like that only’.

As such, and considering the rapid growth in scholarly output from the developing world, the onus is surely on the developed world to ensure that lazy cultural chauvinisms do not undermine what should be a serious debate about the ‘predatory’ (or otherwise) practices of the rapidly growing sector of scholarly publishers whose business models are predicated on ‘gold’ open access.

To this end, it’s worth examining one of Beall’s criteria in more detail. This states that a ‘predatory’ publisher ‘may’ list “insufficient contact information, including contact information that does not clearly state the headquarter’s location …”

A quick review of the sites listed by Beall shows a marked lack of clear contact information on the majority of them. Again, if we make a judgement by western standards – i.e. benchmark what we do (or do not) find against our well-developed digital economies with the associated expectations of transparency and service level – this lack can undoubtedly be read as suspicious. Whether these specific suspicions are valid or otherwise is not my point.

What I’m interested in is how the geographic ambiguity created by this lack of concrete contact information fuses with Beall’s cultural misapprehensions to give credence to the assumption that predatory publishers are most likely offshore.

I decided to test this assumption. Using a methodology that we use to help publishers with rights management, we took Beall’s lists of 192 predatory journals and 321 predatory publishers and geo-located their IP addresses with MaxMind’s GeoIP database. Of course, this is not an exact science: just because a business hosts its content in a specific locale does not mean that the business itself is based there. Nevertheless, the results are surprising.

‘Predatory’ journals – geo-located by IP address

[Figure: map of ‘predatory’ journals geo-located by IP address]

[Figure: pie chart of ‘predatory’ journal locations]

‘Predatory’ publishers – geo-located by IP address

[Figure: map of ‘predatory’ publishers geo-located by IP address]

[Figure: pie chart of ‘predatory’ publisher locations]
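For readers who want to reproduce this kind of breakdown, a per-country share can be tallied from a list of country codes, one per geo-located host. The following is only a minimal sketch of that arithmetic: the function name `country_shares` is my own invention for illustration, and the sample codes are made-up placeholders, not the data behind the charts.

```python
from collections import Counter


def country_shares(country_codes):
    """Return {country: percentage of hosts}, largest share first."""
    counts = Counter(country_codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders countries from most to fewest hosts
    return {country: round(100 * n / total, 1)
            for country, n in counts.most_common()}


# Made-up placeholder input, NOT the results of this study.
sample = ["US", "US", "IN", "US", "GB"]
shares = country_shares(sample)
for country, pct in shares.items():
    print(f"{country}: {pct}%")
```

Each host counts once here, so a publisher with many journals on one server weighs the same as a publisher with a single title; whether that is the right unit of counting is a judgement call.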

What does this mean?

Of course, a business may not live in the same location as its digital content. That said, I think that there is enough of a correlation between business location and hosting location to warrant further investigation and discussion of these data.

Here are two interpretations:

1. Taking the data at face value (i.e. assuming that hosting country and business entity are synonymous in terms of location), these results indicate that the ‘predatory’ publishers and their journals are a phenomenon of the developed world in general, and of the US in particular. This contradicts the neocolonial inferences of the ‘predatory’ rhetoric.

2. These data are unreliable, as it may simply be that businesses from the developing world are hosting their content in the developed world in order to give the impression of being from the developed world: a kind of black-hat marketing strategy.

So, where do the majority of ‘predatory’ publishers live? Onshore or offshore? I do not offer these findings as definitive: more as food for thought.

What’s next?

In my next post on this subject I’ll add in another data point – the location of the registrant of the domain name. This will enable us to re-evaluate the data by cross-referencing the geo-located IP address with the location of the registrant: in other words, to explore if developing world businesses are indeed hosting their content in the developed world as mentioned in point (2), above.
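The cross-referencing step could look something like the sketch below. It assumes two lookups that already exist as plain dictionaries, one mapping each domain to its hosting country and one mapping it to its registrant country. The function name, the three labels and the `.example` domains are all hypothetical, and the sample values are invented purely to show the shape of the comparison.

```python
def cross_reference(hosting, registrant):
    """Compare hosting country with registrant country for each domain.

    Both arguments map domain -> country code (or None when unknown).
    Returns domain -> "match", "mismatch" or "unknown".
    """
    result = {}
    for domain, host_country in hosting.items():
        reg_country = registrant.get(domain)
        if host_country is None or reg_country is None:
            result[domain] = "unknown"
        elif host_country == reg_country:
            result[domain] = "match"
        else:
            result[domain] = "mismatch"
    return result


# Invented illustration only: these are not real publishers or findings.
hosting = {"journal-a.example": "US",
           "journal-b.example": "US",
           "journal-c.example": "IN"}
registrant = {"journal-a.example": "US",
              "journal-b.example": "IN"}

labels = cross_reference(hosting, registrant)
for domain, label in labels.items():
    print(domain, label)
```

A ‘mismatch’ would be consistent with interpretation (2) but would not prove it: registrant details can be masked by privacy services, so the ‘unknown’ bucket may turn out to be large.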

A note on methodology

We sent an automated browser (selenium-webdriver + chrome) to Beall’s lists of predatory journals and predatory publishers. We scraped all the links off each of these two pages, threw away any that were not in the main block of content, and then cut the rest down to just the domain part. We then resolved these hostnames to their IP addresses and ran the IP addresses through the GeoIP City and GeoIP ISP databases to locate their geographical coordinates. We wrote that data out to a CSV file and imported it into Google Docs to map it.
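The steps after the scrape can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the script we actually ran: the browser step is left out, every function name is hypothetical, and the GeoIP lookup is replaced by a stub `locator` so that the example runs without a database file or network access. A real run would plug in a lookup against a GeoIP database in its place.

```python
import csv
import io
import socket
from urllib.parse import urlparse


def hostname_of(link):
    """Cut a link down to its host part; None if it has no host."""
    try:
        return urlparse(link).hostname
    except ValueError:
        return None


def resolve(hostname):
    """Resolve a hostname to an IPv4 address, or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except (OSError, UnicodeError):
        return None


def build_rows(links, resolver=resolve,
               locator=lambda ip: (None, None, None)):
    """Yield (hostname, ip, country, latitude, longitude) per unique host."""
    seen = set()
    for link in links:
        host = hostname_of(link)
        if host is None or host in seen:
            continue  # skip mailto:, relative links and repeats
        seen.add(host)
        ip = resolver(host)
        country, lat, lon = locator(ip) if ip else (None, None, None)
        yield (host, ip, country, lat, lon)


def write_csv(rows, out):
    """Write the rows, with a header line, to an open text stream."""
    writer = csv.writer(out)
    writer.writerow(["hostname", "ip", "country", "latitude", "longitude"])
    writer.writerows(rows)


# Stub data for illustration only: placeholder hosts, documentation-range
# IP addresses and a placeholder country code, not real results.
links = ["http://journal.example/issue/1",
         "http://journal.example/about",
         "mailto:editor@press.example",
         "https://press.example/"]
fake_dns = {"journal.example": "192.0.2.10",
            "press.example": "198.51.100.7"}
fake_geo = {"192.0.2.10": ("ZZ", 0.0, 0.0),
            "198.51.100.7": ("ZZ", 1.0, 1.0)}

rows = list(build_rows(
    links,
    resolver=fake_dns.get,
    locator=lambda ip: fake_geo.get(ip, (None, None, None)),
))
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Passing the resolver and locator in as arguments keeps the de-duplication and CSV logic testable offline, and makes it easy to swap in a different geolocation source later.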

