I just finished reading the ‘Net Lang’ UNESCO report on ‘the multilingual cyberspace’. I didn’t find the report earth-shattering overall, but I did note a few points of interest, which I’d like to share here.
The first thing I’d like to note is that nobody really seems to know what’s happening on the web, or to be able to measure it. I posted charts from Wikipedia in a previous post, and commented on the discrepancy between the share of English-language content and the share of English speakers online. All the authors agree that English content is disproportionately high, but the exact measure is unclear. In spite of various projects, there seem to be no clear figures about language on the web. Given the abundance of content, including huge quantities of dynamic content (blogs, Facebook pages, Twitter…), it is, at present, impossible to give any clear calculations: with no reliable sense of the whole, it is hard to produce percentages. This initial remark serves as a caveat for what follows.
The second point is something I already noted when reflecting on language policy within Australia: there is remarkably little talk of ‘second-tier’ languages, and of how to engage with them strategically. In rough terms, the situation is as follows: there are about 6000 to 7000 languages spoken on the planet, but a small number of them dominate the offline world – and even more so the online world. Roughly speaking, on the internet, one language (English) accounts for about 50% of all content, and about 60 to 70 languages account for over 90% of all content. Many authors defend ‘minority languages’ – those within the under-represented 10%. But no one really seems to focus on the ‘second-tier’ dominant languages – Chinese, Japanese, Russian, Arabic, Spanish, etc. – which are alternately lumped in with the rest of the under-represented languages, or – more frequently – bundled together with English among the ‘privileged’ languages, which benefit from an established set of standards and are recognised by multilingual browsers and translators.
These ‘second-tier languages’ are precisely the ones I am most interested in. They represent about one third of all content, and two thirds of all users. What will happen to that proportion in the near future? Are they going to challenge the dominance of English? Chinese in particular, but also Spanish, Portuguese, Arabic… Is the web heading towards an equal proportion of English and Chinese? Or is English going to remain the dominant language, a necessary koine for web communication?
While I prepare further reflections on machine-assisted language learning vs automatic translation, and on scenarios for the future of digital multilingualism, I’d love to hear your thoughts on this – and please link to any material on that question!