TROLLing: new open data archive

Linguists at the University of Tromsø have released a new repository for language and linguistic data, which is fully open access.

From the archive’s About page:

The Tromsø Repository of Language and Linguistics (TROLLing) is designed as an archive of linguistic data and statistical code. The archive is open access, which means that all information is available to everyone. All postings are accompanied by searchable metadata that identify the researchers, the languages and linguistic phenomena involved, the statistical methods applied, and scholarly publications based on the data (where relevant).

Linguists worldwide are invited to post datasets and statistical models used in linguistic research. The TROLLing Steering Committee is responsible for the scientific content of the archive, whereas the University Library provides quality and relevance control, in addition to user management. The University Library also oversees the technical and legal structure of TROLLing.

You can visit the archive here. There’s also an amusing promotional video:


What happened to eLanguage? 18 months on

eLanguage was a beautiful idea. Founded by Dieter Stein and Stephen Anderson, it was an online platform for linguistics e-journals. The cost of hosting and technical support was borne by the LSA, while other things – such as typesetting, proofing, copy-editing and marketing – had to be paid for by the editors or their institutions. From 2011 to 2013 I was one of those editors, and, since I didn’t have a budget of my own to play with, everything was done on the cheap: I did the copy-editing and marketing (insofar as there was any) myself, and my estimable colleague Moreno Mitrović looked after the typesetting.

For whatever reasons, the LSA decided to discontinue eLanguage – presumably as part of their negotiation to bring Language itself closer to full no-fees OA status. I wasn’t happy with the decision, and neither were many others, but by the time I’d heard about it the decision had already been made. Though you can still find the archives on their website, none of the journals have been accepting new content in this form since the end of 2013. I thought I’d take a quick look at what happened to the “co-journals”, as they were called. These fall into three main categories: some ceased to exist, some were absorbed into Language, and others have struck out on their own.

Ceased to exist

In this category we have:

The former doesn’t really count, since it is the precursor of Pragmatics, and ceased to exist as such long before eLanguage came into being. Its purpose on the site was only ever as an archive. Mesoamerican Languages and Linguistics, on the other hand, was never very active: it only published two articles and one review, between 2008 and 2010. Its demise is thus perhaps unsurprising.

Became part of Language

This category covers:

My own Journal of Historical Syntax published three full papers and three reviews before it merged into Language, not bad considering that it was only set up in mid-2011. After some negotiation, it became an online-only section of Language. The deal is rather similar to eLanguage, with a few crucial differences:

  • We’re paginated as part of Language, and count as such for the purposes of impact factors, etc.
  • We now have the administrative support of the Language team, who deal with the papers once they have been accepted. (This saves me and Moreno a LOT of work.)
  • The copyright agreement is no longer in the CC-BY family, though authors still retain copyright of their work, unlike with most journals.
  • It no longer carries reviews.
  • The “Journal of” part had to be excised from its name, due to its new non-independence.
  • Most importantly, the materials are now not instantaneously open access. Authors can pay $400 (a lot of money, but chicken feed compared to what the for-profit publishers charge) for instant Gold; otherwise, all content is made available through the LSA website after one year. Since the half-life of a linguistics article is presumably much longer than this, it’s hardly a problem to wait that long. In the meantime, it’s behind a paywall at Project Muse.

To date, three full papers have been published in the Historical Syntax section of Language, with several more on the way.

Teaching Linguistics has followed a similar path; they were established even later and hadn’t published anything as part of eLanguage. Now they have four papers out, two in 2013 and two in 2014.

Historical Syntax and Teaching Linguistics are joined by Language and Public Policy, Phonological Analysis, and Perspectives. These have generated three, one, and two papers respectively (not counting the responses and “Short Shots” that Perspectives has also generated). So all the new sections are healthy, though none incredibly prolific.

Struck out on their own

The journals which took their own path are:

The only one of the batch to move to a commercial publisher is Pragmatics, which is now at John Benjamins and has a similar setup in terms of OA to Language and its sections, including a one-year embargo. Unlike many of the other eLanguage co-journals, Pragmatics was big and established before moving to eLanguage; you can view its extensive archives here.

It’s not immediately clear what’s happened to Studies in African Linguistics. Like Pragmatics, it’s a long-established journal that much predates eLanguage. The archive page proclaims in CAPITAL LETTERS that it is NO LONGER PUBLISHING NEW CONTENT (you can view the old stuff here). However, a quick search reveals that it appears to be alive and well. It’s still fully no-fees gold OA online, which seems to be funded through print subscriptions.

Semantics and Pragmatics is the golden boy of eLanguage, its biggest success story. It has simply continued in its original form: no fees to publish or to read. The LSA still support it, though they also receive funding from MIT and the University of Texas at Austin.

Since they have technology on their side, LiLT have done well for themselves, and like Semantics and Pragmatics have hung on to gold no-fees OA. They’re now simply hosted on a server at Stanford and supported by CSLI. Their back catalogue is here.

Dialogue and Discourse ticks along quietly. They don’t publish much outside their special issues (typically only one article a year), and are hosted at Bielefeld, with no-fees gold OA. They’ve kept their identity and migrated all their back issues to the new site, which is nice.

Constructions appear to have a similar arrangement, hosted using blog software at Osnabrück. Their back issues are here. Sometimes they have a lot of content; at other times, not so much. After producing no articles in 2013, they came back in 2014 with a special issue.

(It’s important to emphasize that by highlighting the sporadic nature of these e-journals I’m not making a criticism. The principle of uniformly-sized little issues was really rather specific to print, an artefact of the Gutenberg parenthesis. It’s neither necessarily healthy nor particularly important for a journal to have a consistent volume of content. If there are ten good 100-page articles one year and only a squib the next, then so be it. Quality over quantity.)

So there’s a diversity of outcomes. Of those journals that survived, the LSA has hung onto relatively few; most are still immediate gold OA, and all are gold after no more than a year. Though the LSA’s decision to drop eLanguage was regrettable, the programme as a whole has made a significant and lasting impact on the linguistics publishing landscape.

Open Access Linguistics: You’re Doing It Wrong

Note: this post first appeared on my personal blog in 2014.

Update 16/12/2020: in the last 48 hours I’ve had 15 comments on this, a six-year-old blog post which otherwise hasn’t seen much action since 2014. All the comments are broadly positive about the journal, and at least one is from an email address @scirp.org (the publisher of OJML). I’m loth to shut down debate, but since this has the flavour of an organized attempt to comment-bomb the post, I’m turning off comments at this point.


If you’re a linguist – any kind of linguist – then you, like me, will probably have received an email from the Open Journal of Modern Linguistics, inviting you to submit your work.

I’m extremely committed to open access in linguistics, and in academia more broadly; here’s why. But OJML is doing it wrong, and the rest of this post aims to explain why. The tl;dr list version of this post is as follows:

  • Don’t ever submit your work to OJML.
  • Tell your friends never to submit to OJML.
  • If you know someone who’s on the editorial board, gently ask them not to be.

So, what’s so very wrong with OJML? The short answer is that it is run by the wrong people and threatens to bring the entire, very promising, open access movement into disrepute by charging stupidly high APCs and skimping on quality, both typographical and intellectual.

The “costs” of progress: predatory publishers

Let’s take a look at OJML’s guidelines on Article Processing Charges (APCs). It’s $600 per article, but only if that article is within ten printed pages: in linguistics, that’s barely out of squib status. For each additional page above ten, an extra $50 is whacked on.

This may not seem like much, given that Elsevier charge up to $5000. But for a 20-page article, which is still short by linguistics standards, we’re talking $1100. Moreover, this kind of incremental model penalizes thorough argumentation and, in particular, proper referencing. It might even not be so bad if what you paid for was worth it – but I’ll argue below that it isn’t even close.
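To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the fee structure is exactly as described above: $600 covering the first ten printed pages, plus $50 for each page beyond that. The function and parameter names are my own, chosen for illustration; they are not anything OJML publishes.

```python
def ojml_apc(pages, base_fee=600, included_pages=10, per_extra_page=50):
    """Estimate the APC in US dollars for an article of `pages` printed pages.

    Assumption: a flat base fee covers the first `included_pages` pages,
    and every page beyond that adds `per_extra_page` dollars.
    """
    extra_pages = max(0, pages - included_pages)
    return base_fee + per_extra_page * extra_pages


# A squib-length piece, a short article, and a slightly longer one.
for pages in (10, 20, 23):
    print(f"{pages} pages: ${ojml_apc(pages)}")
```

On these assumptions, the charge grows linearly with length, so every extra page of examples or references costs the author another $50.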

The open access community has a name for this kind of publishing practice: “predatory”. Jeffrey Beall maintains a list of predatory publishers on his website, along with criteria for inclusion. Surprise, surprise: “Scientific Research Publishing” (SCIRP), the publishers of OJML, are on the list at number 206.

What’s in it for them? Large amounts of money, made from academics’ naivety. Last year, journalist John Bohannon conducted a “sting” operation by submitting a series of 304 deliberately deeply flawed manuscripts by fictional authors to gold open access journals, many of them ostensibly peer-reviewed. More than half of them accepted the papers, including many that apparently sent the paper out for review, and 16 journals accepted the papers despite the reviewers spotting their damning flaws.

The journal Science, who hosted Bohannon’s piece, were keen to trumpet the failure of open access (unsurprisingly, as they represent the status quo that open access threatens). However, there are a lot of problems with Bohannon’s approach, which have been ably summarized elsewhere. In particular, since Bohannon didn’t include a “control group” of traditional subscription journals, there’s no evidence that open access peer review practices are any worse than those. And even if they were, the existence of exploitative behaviour within open access of course doesn’t entail that open access itself is a bad thing. But it’s clear from Bohannon’s experiences and those of others that, where there are new ways of making shady money, there will be crooks who leap to seize them, and that gold open access (and OJML) simply illustrates one instance of this general principle.

Bad production standards

One of the areas where any publisher can claim to add value is in ensuring the formal quality of their published submissions: typesetting, copy-editing, proofreading, redrawing complex diagrams or illustrations, etc. If a publisher does this well, they may merit at least some of the fees that they typically charge for open access. However, OJML’s performance in this area shows that they hardly even look at the papers they publish. Here are some examples from Muriungi, Mutegi & Karuri’s 2014 paper on the syntax of wh-questions in Gichuka (which, at 23 pages, must have cost them a pretty penny):

  • Glosses are not aligned (e.g. in (6) on p2).
  • The header refers to the authors, ridiculously, as “M. K. Peter et al”.
  • There are clauses which contain clear typographical errors, e.g. “the particle ni which in Bantu, which is referred to as the focus marker”, on p3.
  • In (17), the proper name “jakob” is not capitalized.
  • There are spelling errors: “Intermadiate”, in table 1, p8.
  • The tree on p14 has been brutally mangled.
  • Some of the references are incomprehensible garbage: “Norberto (2004). Wh-Movement. http.www.quiben.org/wp.content/uploads”

A quick glance through any OJML paper will reveal that these aren’t isolated occurrences, and little of this is likely to be the fault of the authors: at least, any linguistically-informed copy-editor or proofreader should have picked up on all of these points instantly, and any proofreader at all should have picked up on most of them.

Low quality papers

What about the academic quality of the papers accepted? I don’t want to pick on any particular paper: in fact, I’m sure that there are nuggets of gold in there (the Muriungi et al. paper mentioned above, for instance, is a valuable syntactic description of an aspect of an understudied language). But I invite you to skim some of the papers and draw your own conclusions.

In particular, the dates of acceptance and revision of the papers aren’t exactly indicative of a thorough review process. For instance, the paper by Muriungi et al. was “Received 7 June 2013; revised 9 July 2013; accepted 18 July 2013”. Again, this isn’t unusual for the papers in this journal. It’s certainly not impossible for quality peer review to take place at this speed – and it’s certainly desirable to move away from the unacceptable slowness of some of the big-name journals – but it is at least doubtful. And one thing that is extremely eerie is how many of the articles are dated as having been revised exactly one month after receipt, suggesting that the process may have been even shorter and that SCIRP is trying to cover itself, by means of outright lies, against exactly the kind of allegation I’m making.
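For anyone who wants to check the turnaround for themselves, the gaps between those three dates can be worked out in a few lines of Python. This is only a sketch: the dates are the ones quoted above from the Muriungi et al. paper, and the variable names are mine.

```python
from datetime import date

# Dates as printed on the paper.
received = date(2013, 6, 7)
revised = date(2013, 7, 9)
accepted = date(2013, 7, 18)

# Subtracting two dates gives a timedelta; .days is the whole-day gap.
days_to_revision = (revised - received).days
days_to_acceptance = (accepted - received).days

print(f"Receipt to revision:   {days_to_revision} days")
print(f"Receipt to acceptance: {days_to_acceptance} days")
```

That works out at 32 days from receipt to the revised version and 41 days from receipt to acceptance.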

The fields of linguistics given under their Aims & Scope don’t inspire confidence, either, with “Cosmic Linguistics” and “Paralinguistics” among them.

Why is this important?

OJML is symptomatic of exactly the wrong approach to open access. Open access, to me, is about disintermediation, about putting power back into the hands of academics. There are several good open access operators out there: Language Science Press is a prime example in the domain of books, the e-journal Semantics and Pragmatics has been performing a valuable no-fees open access service for years, and the Linguistic Society of America recently took a step in the right direction by making papers in its flagship journal Language openly accessible after a one-year embargo period. These initiatives are all run by researchers, for researchers.

In contrast, OJML is about opportunistic money-making. Here’s a quote from SCIRP’s About page, in relation to why their base of operations is in China while they’re registered as a corporation in Delaware: “What SCIRP does is to seize the current global trade possibilities to ensure its legitimate freedom with regard to where to do what.” If this sort of creepy graspingness doesn’t put you off submitting to OJML, and the problems outlined in the previous sections don’t either, then I don’t know what will.

Unless we nip this problem in the bud, then it threatens to damage the reputation of the Open Access movement more generally. Time to boycott OJML, and to spread the word.

Journal of Historical Syntax: interim report

Note: this post dates from April 2013, before the Journal of Historical Syntax became the Historical Syntax section of Language. It was first posted on my personal blog.


My little Journal of Historical Syntax has been in existence for a year and a half now. The Executive Committee of the LSA has requested some facts and figures on the eLanguage journals, and I thought that readers might be interested to see these numbers as well. Enjoy!

Since its inception in summer 2011, the Journal of Historical Syntax has received 13 submissions: 1 in 2011, 9 in 2012, and 3 so far in 2013.

Of those 13 submissions:

  • 3 were rejected.
  • 4 were advised to revise and resubmit (of which 1 was subsequently accepted).
  • 4 were accepted with changes (plus the 1 mentioned above).
  • 2 are currently under review.

36 individuals have been involved in reviewing. The average time between receipt of the manuscript and date of the decision (not counting papers that were not sent out for review) is 97 days. 2 peer-reviewed papers have so far been published (1 in 2012, 1 in 2013). For these two, the times between receipt of the manuscript and publication were 275 and 187 days respectively. The articles have received 158 and 138 views respectively, and their abstracts have received 454 and 257 views respectively.

2 book reviews have also been published (1 in 2012, 1 in 2013), and a third is in the works. The two reviews have received 200 and 106 views respectively, and their abstracts have received 420 and 184 views respectively.

Many thanks to all our reviewers, authors and readers!

The case for Open Access

Note: this is a modified version of an original which appeared here in 2013. A few things have changed since then, but the rationale behind this post hasn’t.


A large proportion of academic research in the UK is taxpayer-funded. The money comes either via grants from the Research Councils, on which the government spends approximately £3 billion each year, or directly to universities from the Higher Education Funding Council for England (HEFCE), which in 2011-12 distributed £1.6 billion.

The transformative potential of world-class research is pretty clear. In the last few years alone, UK researchers have developed the wonder material graphene and discovered the body of Richard III, among other things. Yet, in a curious and inequitable twist of fate, the results of this research have for the most part never been made available to the taxpayers who funded it.

Instead, research findings are published in peer-reviewed scholarly journals run by private publishing companies. In the modern era, these largely take the form of PDFs behind pay-walls, tantalisingly close and yet inaccessible to those who aren’t willing to fork out $40 per view. Universities and libraries, meanwhile, can buy back-breakingly expensive subscriptions to this content. The net result of all this is that research findings are available only to the wealthy and to research institutions themselves, and even then only at great cost.

It’s hardly surprising, then, that in the last few years people have begun to comment on how deeply perverse and unfair this system is. The culmination of this trend in the UK is a document called the Finch Report, produced by a group of academics, funders and publishers. Published in summer 2012, the report delivers a number of recommendations to all the bodies involved. Its key conclusion is that the UK should abandon the traditional subscription-based model of publication and embrace Open Access (OA).

OA itself is hardly a new idea; in many ways it co-evolved with the digital age. It’s been around since the early 1990s in its current form, and the roots of the movement can be traced back even further. The Budapest Open Access Initiative crystallized its main methods and objectives: “to make research free and available to anyone with a computer and an internet connection”. However, only in the last few years has it entered the mainstream academic consciousness in the UK. The shift has been sudden and dramatic, and the effects of the Finch Report are still making themselves felt. HEFCE and the UK Research Councils have published responses to the report, enshrining OA as a requirement for future taxpayer-funded research outputs.

It might seem as if OA would be welcomed by all involved, but the reality is that reactions among the academic community have been mixed. The reasons aren’t hard to understand: the Finch Report proposes to shift the cost of research publication from the consumer to the producer, via a mechanism of Article Processing Charges (APCs). Under this new business model, research findings will indeed be free for the reader, and accessible to the taxpayer. It’s the researcher who must shell out for the “privilege” of making their findings known to the world. Unsurprisingly, this “pay-to-say” model has been criticised. A comparison with the creative industries may help to indicate why: an industry in which musicians, or authors, must themselves pay through the nose in order to make their work accessible to the world doesn’t exactly inspire artistic confidence. Making APCs the primary route to funding of research findings amounts to encouraging vanity publication, and has the potential to crush independent researchers and smaller research institutions.

The Research Councils have promised to make substantial funding available to universities to enable them to foot the APC bill, but problems remain. For one thing, it’s not clear whether this new funding will in fact cover the costs, which can in some instances be astronomical: I was recently offered the option of paying $3000 to Elsevier in order to make an article that had been accepted to one of their journals freely available. If decisions must be made about which articles get their APCs paid, who will make those decisions, and on what basis?

The key to resolving these issues lies in a more radical rethink of academic publishing than envisaged by the Finch Report. The report rightly identifies the need for a transfer of costs, but implicitly assumes that the costs themselves must remain at the same level as they are at present. Given that the panel responsible for the report included representatives from publishing companies such as Wiley-Blackwell and Springer, this is to be expected. The report mentions the possibility of “disintermediation”, defined as a reduction of the role of intermediaries such as publishers, but the possibility is cursorily skipped over. There is reason to believe, however, that this sort of disintermediation is exactly what academic publishing needs. As George Monbiot has argued, it is questionable whether academic publishers really add value at all – and yet for-profit publishers such as Elsevier operate with seriously substantial profit margins. Meanwhile, in what is perhaps the best-kept non-secret of the business, the real drivers of the process – editors and reviewers – are for the most part paid nothing at all, but assume their roles out of the goodness of their hearts, confident that they are helping to ensure rigour in their chosen field.

What are the real costs of running a journal, then? In the digital age, print editions of journals are at best a quaint reminder of the past and at worst a waste of space; every journal worth its salt is available online. Online publishing is quick and cheap: the only costs are hosting (minimal), typesetting, and marketing (which can largely be carried out on the basis of word-of-mouth networks that already exist within academic disciplines). A perfect case study is provided by the eLanguage programme, a digital publishing platform for academic journals in my own field, linguistics. Hosting here is funded by a learned society, the Linguistic Society of America, leaving journals to meet the costs of typesetting, which in many cases can be carried out on a voluntary basis (just like the more arduous task of peer review).

Faced with this type of business model, the arguments against OA evaporate. The funds provided by the Research Councils for the purposes of paying APCs can, and should, be re-purposed to directly fund the operation of a new generation of free-to-view, free-to-publish academic journals. All that is standing in the way of this position is an attachment on the part of policy-makers to the role of traditional publishing companies – an attachment which can, and should, be questioned.

Why oaling?

This is a blog by George Walkden, of the University of Manchester, about open access in linguistics. Needless to say, all content here reflects my own views and not those of my employers or indeed anyone else.

Since I’ve written about OA in various places, including for the Pirate Party, and on my deeply incoherent personal blog, I thought I’d bring things together here.

Everything I write here is released under a CC-BY license.