Publishers and information providers are in danger of cultivating a blind spot to one of the key issues currently inhibiting the growth of online information services: identity management.
The web as it exists today suffers from the lack of a consistent way of managing identity. There are challenges both in identifying myself to the sites I visit and in verifying the identity of those sites. Without any standard mechanism to deal with this, web developers have devised an array of different and incompatible schemes to manage identity. This presents serious challenges, since authenticity and trust are critically important concerns for publishers and information providers.
In the beginning, the internet was a place of implicit trust. Networks were small, and users were trusted not to abuse services. During the rapid growth of the network, many security and identity problems with the underlying network protocols and subsystems were discovered and solved. However, the web itself has followed a different path of evolution.
Hypertext and identity
Early hypertext systems were firmly based on the idea that all users would need to be able to read and write documents. The models built into the first systems, such as Xanadu, allowed users to compose their own documents and to annotate documents authored by others. This functionality naturally required users to authenticate and identify themselves.
However, when Tim Berners-Lee invented the web in the late 1980s, the distributed authentication services needed to support the read/write web did not yet exist. He simplified the original hypertext vision by removing the composition and annotation elements, which made it easier to implement. This in turn removed the need for identity management systems, which significantly reduced the complexity of the software needed. This simplification allowed the web to grow quickly, at the cost of initially making it an essentially read-only environment for most users.
Visitors and residents
Recent JISC research into how people use online services has started to focus on the distinction between ‘visitors’ and ‘residents’ (proposed as a more useful replacement for the previous talk of ‘natives’ and ‘immigrants’). Those who use the web as a tool for specific tasks can be seen as visitors whereas those who spend a large amount of time online can be categorised as residents. Residents clearly have a strong need to maintain their online identity, since their presence on the web is an essential part of their overall social interactions. Visitors still need to manage identity, though their needs are more in the context of being able to authenticate themselves and the sites they visit.
This distinction between visitors and residents is in reality more of a spectrum of different behaviours, but it’s critical that publishers understand the common needs for identity and authentication services. As the participative read/write web returns with the move towards Web 2.0 services, the traditional publishing roles of author and reader are disrupted, and this is exactly the point where online identity, and trust in that identity, becomes centrally important.
Publisher platforms
Current publisher platforms provide a good example of the patchwork of different services and approaches to authentication and identity management developed over the past decade. Most feature a combination of the following approaches:
- IP authentication to identify a user’s home institution. This works well for on-campus users as it allows invisible authentication and thus provides the lowest possible barrier to usage. However, this scheme is useless for remote users.
- Federated authentication to identify an individual user within an institution. This covers both the historical Athens protocol used within the UK and the Shibboleth protocol now gaining significant traction in the UK and America. This works equally well for on- and off-campus users, at the cost of a sometimes difficult user experience.
- Other username/password schemes. Individual identity outside of the institutional context is invariably handled by requiring users to create their own account within a publisher’s system.
- Other remote access schemes. Schemes such as EZproxy or referrer authentication allow remote users to identify their institution to a publisher platform by first logging into their institutional home page. Again, usability is compromised by having to go through the institutional portal to gain access to the resource.
- Personalisation log-in. In order to uniquely identify individual users within an organisation it has often been necessary to have a second level of log-in within a publisher platform. This requires users to create and maintain yet another identity within the publisher platform.
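To make the first item in this list concrete, here is a minimal sketch of how IP authentication might work, and why it fails for remote users. It is an illustration under stated assumptions, not a description of any particular publisher platform: the `INSTITUTION_RANGES` registry, the institution names and the `identify_institution` function are all hypothetical, and the address ranges are ones reserved for documentation.

```python
import ipaddress
from typing import Optional

# Hypothetical registry mapping each subscribing institution to the
# network ranges it has registered with the publisher. The ranges are
# reserved documentation addresses, not real campus networks.
INSTITUTION_RANGES = {
    "Example University": ["192.0.2.0/24"],
    "Example College": ["198.51.100.0/25"],
}


def identify_institution(client_ip: str) -> Optional[str]:
    """Return the institution whose registered range contains client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for name, ranges in INSTITUTION_RANGES.items():
        if any(addr in ipaddress.ip_network(r) for r in ranges):
            return name
    # No match: the request looks like any anonymous visitor, which is
    # exactly what happens to a member of staff working from home.
    return None
```

A request from an on-campus address is recognised without the user doing anything, while the same user on a home connection falls through to `None` and has to use one of the other schemes.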
These approaches cannot usually be combined or aggregated, which acts as a further barrier to usability and usage. For instance, to access individual purchases it may be necessary to log out of the institutional account and log in again as an individual user.
Publishers are increasingly looking for business models which allow them to combine different levels of institutional access rights (e.g. school, department and college) with individual access rights. The existing infrastructure is in many cases not able to support these new models and so new solutions are needed.
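One way to picture the kind of model publishers are asking for is a single session that carries several identities at once, with access calculated as the union of everything those identities are entitled to. The sketch below assumes this design purely for illustration; the `Session` class, the `ENTITLEMENTS` table and the resource names are hypothetical and do not describe an existing platform.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Session:
    """Identities established so far in one browsing session (hypothetical)."""
    institution: Optional[str] = None
    department: Optional[str] = None
    user: Optional[str] = None


# Hypothetical entitlement table keyed by (level, identity).
ENTITLEMENTS = {
    ("institution", "Example University"): {"journal-a"},
    ("department", "Example University/Chemistry"): {"journal-b"},
    ("user", "alice"): {"article-123"},
}


def accessible_resources(session: Session) -> Set[str]:
    """Union the rights granted at every level present in the session."""
    granted: Set[str] = set()
    for level in ("institution", "department", "user"):
        identity = getattr(session, level)
        if identity is not None:
            granted |= ENTITLEMENTS.get((level, identity), set())
    return granted
```

With this approach a user recognised by their institution who then signs in personally keeps the institutional rights and gains their individual purchases, rather than having to swap one identity for the other.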
Contributor identity
As the distinction between reader and author becomes increasingly fluid, both parties need to be able to manage their online identity in order to interact with the read/write web. The CrossRef ContributorID project is focused on providing an identity framework designed to address the issues around knowledge discovery and authentication for the scholarly publishing community.
Conclusion
I’ve attempted to provide a survey of the existing landscape here, but where Online Identity is going next is a much bigger question – not to mention our ideas for where it should end up! Next month I want to discuss the influence of trends associated with Web 2.0 and how they are affecting the field, as we move towards Online Identity 2.0.

I notice you didn’t mention OpenID; was that deliberate? Most large journal publishers seem to offer Athens, Shibboleth, IP recognition and user/pass authentication but not OpenID.
I didn’t mention OpenID as I was saving that for the second part of this post, which I’ve just published here: The Challenge of Online Identity: Part 2
I think it’s key for STM and other publishers to fully understand individual users, and this includes the context of their links to (possibly) multiple institutions.
No mention of OpenID, do you see that playing a part?
Within STM, I certainly see a future where we need to link individuals with institutions more effectively, whether through Shibboleth or another process.