IP address authentication considered harmful

The practice of using IP addresses to identify institutional users is commonplace in most sectors of the online publishing marketplace, as it allows users to seamlessly access resources without needing to go through an explicit log-in process. But this practice has serious pitfalls which lead to significant customer service costs for publishers, and frustration for institutional customers.

Publishers increasingly realise that their users need community and social networking features in publishing platforms, yet have little or no tolerance for additional sign-on procedures. This puts terminal pressure on this once tried and trusted method of authenticating institutional users.

Use and usability

At its best, this method of authenticating users allows seamless access to resources without the need to enter and maintain individual usernames and passwords. This is clearly a big win in terms of usability, as the need for authentication is satisfied without any barrier to use for the end user. If this is properly combined with strategies to drive usage through search engine optimisation, and discoverability through integration with library catalogue systems, then it should be an ideal technique. However, the increasing need to uniquely identify users in order to integrate with community and social networking platforms can be directly at odds with this approach.

Where IP authentication fails

A significant number of customer access problems are caused by data quality issues. Publishers expect librarians to provide and maintain accurate lists of IP addresses for their institutions. In practice these lists are rarely accurate and often include addresses of private networks and duplicated ranges including addresses which overlap with other institutions. Duplicated ranges are a major issue because often different departments or schools within an institution are separate customers, yet their IP ranges overlap or even share the same proxy server address. In these cases it is impossible for the publisher to distinguish between customers. This leads to difficulty accessing online platforms, and makes it impossible to track usage correctly in order to generate COUNTER usage statistics.
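The two data quality problems described above, private network addresses and ranges that overlap between customers, can be detected mechanically before a list is loaded. The following is a minimal sketch using Python's standard `ipaddress` module. The customer names, the ranges and the `audit_ranges` function are invented for illustration, and the addresses come from blocks reserved for documentation. It is not how any particular publisher platform works.

```python
import ipaddress
from itertools import combinations

# RFC 1918 private ranges, listed explicitly. (Python's is_private flag
# also covers other reserved blocks, including the documentation ranges
# used as stand-ins for public addresses below.)
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Hypothetical customer records, as a librarian might submit them.
customer_ranges = {
    "University A (Library)": ["192.0.2.0/24", "10.0.0.0/8"],
    "University A (Medical School)": ["192.0.2.128/25"],
    "College B": ["198.51.100.0/24"],
}

def audit_ranges(ranges_by_customer):
    """Return (private, conflicts) for a mapping of customer -> CIDR strings.

    private   : (customer, network) pairs that fall in RFC 1918 space
    conflicts : (customer1, network1, customer2, network2) tuples where
                ranges belonging to different customers overlap
    """
    parsed = [
        (customer, ipaddress.ip_network(cidr))
        for customer, cidrs in ranges_by_customer.items()
        for cidr in cidrs
    ]
    private = [
        (customer, net)
        for customer, net in parsed
        if any(net.overlaps(p) for p in RFC1918)
    ]
    conflicts = [
        (c1, n1, c2, n2)
        for (c1, n1), (c2, n2) in combinations(parsed, 2)
        if c1 != c2 and n1.overlaps(n2)
    ]
    return private, conflicts

private, conflicts = audit_ranges(customer_ranges)
for customer, net in private:
    print(f"private range submitted by {customer}: {net}")
for c1, n1, c2, n2 in conflicts:
    print(f"overlap: {c1} {n1} <-> {c2} {n2}")
```

A check like this only flags the problem. Deciding which customer a request from an overlapping range belongs to still needs information that the IP address alone does not carry.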

Inevitably this becomes an issue which no one wants to deal with. Publishers cannot effectively debug the problems caused when IP address ranges conflict between institutions, because doing so would require detailed knowledge of the institutions’ technical network infrastructure. Similarly, libraries or administrators of institutional purchases have to rely on the data provided by their own IS departments. Problems arise because network infrastructures rarely stay fixed: proxies are changed, address ranges get re-allocated, firewalls get upgraded and improved. In an ideal world these changes would propagate automatically into publishers’ access management systems, but in reality there is no infrastructure in place to support this. In fact, when these kinds of changes are made, it is often impossible to know how many places need to be notified of IP address changes, and responsibility for this falls outside the IS department.

Effectively representing hierarchies within institutions is another need that cannot be met by IP authentication. Often hierarchies exist where individual departments want separate access to their own resources, but at the same time also need access to institution-wide resources. IP address based methods can’t help here as the low level networking infrastructure is frequently shared in a way which prevents different departments from being distinguished from one another by IP address.

Size matters

Even given the problems above, IP address authentication remains a dominant form of authentication for publishers with large institutional customers. But there are huge B2B and B2C markets which simply cannot be accommodated using this technique. Schools would love to give their students seamless access to resources in this way, but they are hampered by having little or no control over their IP addresses and network infrastructure. Even a huge organisation like the NHS cannot use this technique effectively, as its network infrastructure is delivered by the BT N3 system, leading to dynamic IP addresses which cannot be used to identify individual trusts. And the huge market of B2C broadband-connected customers has exactly the same issues; the network infrastructure is fundamentally not designed to cope with this problem.

The technical background

The IP address system is part of the basic low-level design of the internet. Having a system of unique addresses was a fundamental requirement which allowed computer networks to be built in the first place. In the early days of the internet it was commonplace to use the IP address of a machine as a trusted way of uniquely identifying that machine. But this method of identifying machines and users was quickly recognised to have several security loopholes, as it was possible for anyone to forge IP addresses with ease. These loopholes exist because the IP address mechanism was never designed to be used as a mechanism for identity and authentication; it is an unintended side effect that it works at all.

What we can do about it

Initiatives and standards such as Shibboleth, OpenID and OAuth all have a part to play in meeting the need to uniquely identify individual users. Increasingly, Shibboleth is becoming a mandatory requirement for institutional sales in some markets. But these systems still place a burden on the individual to log in, which in turn is a barrier to use. Furthermore, these systems only work for interactive web page based access; authenticating access to even simple RSS feeds cannot be supported by these standards. In reality the web is still in need of a transparent, trusted and easy-to-use universal authentication mechanism.

6 Responses to IP address authentication considered harmful

  1. Don’t forget the awful workarounds that get put into place because of the desire for IP authentication: the X-Forwarded-For header.

    If you can’t distinguish based on the IP address of the institutional proxy, there’s a desire to use the IP address behind the proxy. But this method is wide open to abuse. There’s no way to stop anyone supplying their own X-Forwarded-For header with your IP address.

  2. Hi Dom,

    Very good point re: the X-Forwarded-For header. Since there are no standards for this header, we’ve seen no end of cruft end up in there from ill-behaved proxy servers.

  3. Interested as to why you would want authenticated access to an RSS feed? Surely such feeds should be openly accessible, with authentication only being required at the last possible moment to access full content. I can see the argument for authenticated RSS feeds from, say, banks and credit card companies, but not sure of a publishing use case?

  4. Hi Nicole,

    There are a number of use cases for authenticated RSS from publishers. One common one for us is to deliver search results as an RSS feed – this allows users to store canned searches in their feed readers and check back regularly to see when new content is added to the system matching their search criteria.

    For a paid-to-access abstracts service, like CAB Direct for example, the search results must be protected, and therefore so must any RSS derived from them.

    The fact that RSS can be used for delivering rich services such as search as well as content syndication means that robust authentication is a key requirement for many publishers.

  5. Another note on X-Forwarded-For: the header can also be changed at other points on the way to the web server, including at the load balancer and through settings in IIS 7 and up, which can likewise result in the wrong address being reported for users behind a proxy.
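
The X-Forwarded-For concerns raised in the comments above can be made concrete with a short sketch. It contrasts taking the header at face value with a more cautious reading that only believes entries added by proxies the publisher itself trusts. The function names, the `TRUSTED_PROXIES` list and all addresses are assumptions made for illustration, and the cautious variant is one commonly used mitigation rather than a fix for the underlying problem.

```python
import ipaddress

# Hypothetical: proxies and load balancers the publisher operates itself.
TRUSTED_PROXIES = [ipaddress.ip_network("203.0.113.0/28")]

def is_trusted(addr):
    """True if addr (an ip_address object) belongs to a trusted proxy range."""
    return any(addr in net for net in TRUSTED_PROXIES)

def naive_client_ip(peer_ip, xff_header):
    """Believe the leftmost X-Forwarded-For entry. Trivially spoofable."""
    if xff_header:
        return xff_header.split(",")[0].strip()
    return peer_ip

def cautious_client_ip(peer_ip, xff_header):
    """Walk the header from the nearest hop outwards.

    An entry is believed only if the hop that reported it is a trusted
    proxy. The walk stops at the first untrusted reporter or at the
    first entry that is not a valid IP address.
    """
    hops = [h.strip() for h in (xff_header or "").split(",") if h.strip()]
    candidate = peer_ip  # the address the TCP connection actually came from
    for hop in reversed(hops):
        if not is_trusted(ipaddress.ip_address(candidate)):
            break
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            break  # cruft from an ill-behaved proxy
        candidate = hop
    return candidate

# Someone at 198.51.100.7 connects directly and claims to be 192.0.2.10.
print(naive_client_ip("198.51.100.7", "192.0.2.10"))
print(cautious_client_ip("198.51.100.7", "192.0.2.10"))
```

Even the cautious version does not recover the address behind an institutional proxy, because the publisher has no basis for trusting what a customer's proxy reports.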
