Debate on the legality of ‘screen scraping’ in Ireland is informed by preliminary findings from litigation involving budget airline Ryanair, but the legal position remains uncertain.
The rise of generative AI and the way data is sourced for use by such systems promises to reignite discussion on the topic.
Screen scraping is the use of automated systems or software by an unrelated third party, such as a meta search site, to extract data from a website for its own commercial purposes. It can be carried out by anyone, from cyber criminals to legitimate businesses such as news sites, financial companies, and online travel agents.
For example, a travel agency may operate a website where consumers can search through the flight data of airlines, compare prices and, on payment of a fee or commission, book a flight. There are various ways the travel agency can obtain the necessary data to respond to an individual’s queries, such as by automated means, which include a dataset linked to the relevant website, or from a third-party provider.
There can be negative consequences for the website owner whose website is scraped, including system overload, loss of advertising revenue, loss of control of the information content, and devaluation of content – particularly if that content was from a premium service. Accordingly, many website owners prohibit data scraping in their website terms and conditions.
Before Irish companies embark on any scraping activity, they should consider the following, otherwise they may find themselves embroiled in a legal dispute:
A website’s ‘terms and conditions’ or ‘terms of use’ typically contain provisions prohibiting screen scraping. In addition to an outright ban on using any data mining, robots or similar data gathering or extraction methods, the terms commonly include the following prohibitions:
Significantly, a provision dealing with jurisdiction will also be included in a website’s terms and conditions. As a result, by entering into a website contract, a user agrees to submit to that contract being governed by the laws in that specified jurisdiction.
Clickwrap contracts are a feature of many online distribution services’ websites. Visitors to such sites accept terms and conditions through a clickwrap contract, which involves the explicit acceptance of those terms through clicks at various pages on a website. Website owners can then rely on these terms and conditions in a claim that the practice of screen scraping is a breach of their terms.
Whilst an Irish court has analysed a claim for breach of a website’s terms and conditions as part of an injunction application, there has been no substantive hearing on whether screen scraping does in fact breach such terms. To date, the Irish courts have accepted that website terms and conditions and clickwrap contracts are capable of giving rise to binding contractual obligations, and that users will be bound by jurisdiction clauses in a website’s terms and conditions. Accordingly, it is possible that a claim for breach of contract might succeed.
Two potential types of protection for databases are provided for in the EU Database Directive. Article 3 provides for protection of a database under copyright. Copyright protection is available for databases which, by reason of the selection and arrangement of their contents, constitute the author’s own intellectual creation. Most websites will, however, struggle to reach this originality threshold.
Instead, website owners may be able to rely on the fact that the data provided on the website is the result of a substantial investment. A database right is provided for in Article 7 of the Database Directive. Article 7 provides that a database right should exist where a substantial investment has been made in either the obtaining, verification, or presentation of the contents.
A database right is infringed when all or a substantial part of a database is extracted or re-utilised without the owner’s consent. The repeated extraction or re-utilisation of insubstantial parts of a database which conflicts with the normal use of the database may also infringe database rights.
In the case of Ryanair Ltd v PR Aviation BV, the Court of Justice of the EU (CJEU) held that the Database Directive only applies to databases protected by copyright under Article 3(1) or by way of the sui generis database right under Article 7(1). The CJEU left it to the national courts to decide whether a particular database qualifies for protection under the Database Directive – for example, in Ryanair Ltd v PR Aviation BV, the Dutch court had previously held that Ryanair’s database did not qualify for protection under the Database Directive. The CJEU also held that the owner of a publicly accessible database is free to impose contractual conditions on the use of its database by third parties, provided those conditions comply with the applicable national law.
As a result, PR Aviation could not rely on the provisions of the Database Directive that render null and void any contractual terms contrary to it. In other words, because the Database Directive did not apply, PR Aviation could not invoke those provisions to avoid a breach of contract claim, and it would be bound by Ryanair’s website terms and conditions, which prohibited screen scraping.
To date, the Irish courts have not decided whether data contained in a database within a website is protected by way of database rights, or whether a website’s terms and conditions prevent screen scraping where the EU Database Directive is not engaged. Accordingly, it is possible that, depending on circumstances, a claim for breach of database rights might succeed.
Getty Images, the renowned media company, launched proceedings in the US and the UK against Stability AI for infringement of its intellectual property. In the US complaint, Getty alleges that the copying is brazen and on a staggering scale and relates to more than 12 million photographs, captions and metadata scraped from Getty’s database. Getty’s key complaints are grounded on copyright infringement and trade mark infringement claims regarding use of its images and trade marks. Apart from the explicit prohibition set out in the website terms and conditions, Getty is relying on statutory protection afforded by copyright and trade mark legislation.
Equivalent statutory provisions are available to website owners under Irish copyright and trade mark law. In the case of copyright, whether such a claim has any merit will depend on the particular circumstances, because not all scraped data qualifies for copyright protection. The originality threshold may be a challenge for website owners, who will have to demonstrate that the content is the author’s own intellectual creation. Works which are copyright protected include original literary and artistic works such as computer programs, website graphics, and photographs.
It may be possible to bring a claim for passing off or trade mark infringement if the website owner’s unregistered or registered trade marks are used without its consent. Passing off prohibits a third party from selling goods or carrying on business under a name, mark, description, or in any other way that is likely to mislead, deceive or confuse the public into believing that the merchandise or business is that of the brand owner.
Given the various protections open to website owners, companies operating a screen scraping model should be cognisant of the following points:
As there is still a question mark in Ireland over whether screen scraping is legal, Irish companies planning to benefit commercially from web scraping should tread softly or not at all.
For the moment, the safest route for meta search sites intending to screen scrape is to seek to negotiate a licence with the owners of the target websites.
A version of this article was first published by Lexology.