What happens when you can search 30 million words of oil contracts?

OpenOil today releases the fourth version of the contracts repository. As my colleague Anton Rühling writes, there are another 83 contracts, from 29 countries, and some interesting material around the Brazilian state oil company Petrobras, currently at the centre of major corruption scandals.

But along with the new contracts comes a feature we hope will be useful to researchers and extractives governance professionals around the planet – the ability to comb all the contracts word by word for phrases, combinations of phrases, date ranges and other functions.

So what can you do with it? A short tour…

Use case 1: royalties for gas

There are many reasons to support the norm of contract transparency. While a lot of attention has been on the ability to scrutinise the terms of particular deals, there is also huge value in having a large body of contracts in the public domain to provide a broad basis for comparison. The repository is still small relative to the total universe of contracts – 800 out of… 50,000? 100,000? Nevertheless…

Type the word “royalties” into the search box and you get 216 results. But you may have a more specific focus of enquiry. Say you’re interested in the royalty rates applied to gas: narrow the search to that and you get 49 contracts from six countries (including India and Israel, as well as the more usual “transparency” jurisdictions). Still a fair piece of work to decide how to approach this document base… but the starting point came quickly.

Use case 2: when is a benchmark a benchmark?

Or, to go a little more oil geek: one of many contentious issues in contracts is how oil and gas are valued and how much they are sold for. Of the thousands of grades of crude oil in the world, only a few dozen have their own standing price. The rest are valued against them, the benchmarks. But which benchmarks, and at what premium? What’s reasonable, what isn’t? How do contracts deal with this?

One important factor is the technical quality of the crude governed by the contract compared to the benchmarks. Are they similar grades, or, in the parlance, do they have similar “gravity”? So, run a search for “API” and “gravity” across the repository and you find there are 32 published contracts which reference this. Many refer to cut-off points along the API scale – above 30 degrees, below 10, above 15 degrees. These trigger specific sales and valuation conditions, using the API index as a proxy for market value.

And among them… two contracts from Sierra Leone which show some interesting language… “the Basket shall differ less than four (4) degrees API”. It suggests an external, perhaps even vaguely objective, parameter to the limits of comparison. Benchmarks shouldn’t be more than four API degrees lighter or heavier than the crude in the contract – an indication of outliers. Insisting on less than one degree would be unnecessarily restrictive. Ten degrees? Exploitative.
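For anyone who wants to apply that four-degree rule of thumb to a shortlist of candidate benchmarks, here is a minimal sketch in Python. The grade names and API values are invented for illustration and the function name is our own; the only thing taken from the contracts is the "less than four degrees" limit.

```python
def comparable_benchmarks(crude_api, benchmarks, max_difference=4.0):
    """Return the benchmark grades whose API gravity differs from the
    contract crude by strictly less than max_difference degrees."""
    return [
        name
        for name, api in benchmarks.items()
        if abs(api - crude_api) < max_difference
    ]


# Hypothetical grades and gravities, not real market data.
candidate_benchmarks = {
    "Grade X": 38.0,
    "Grade Y": 31.5,
    "Grade Z": 24.0,
}

# A contract crude of 34 degrees API, also invented.
print(comparable_benchmarks(34.0, candidate_benchmarks))
```

The comparison is strict because the clause says "less than four"; a contract worded "not more than four" would need `<=` instead.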

But it’s only two contracts from one country, and Sierra Leone is not even a producer. Can we strengthen it? Well, zoom out from just the contracts repository to the whole of the corporate filings database, currently 1.3 million records from half a dozen financial regulators in the major jurisdictions for the industry around the world… and… we find two more descriptions of Kosmos contracts in Suriname, filed with the SEC in the United States, and another from Albania filed with the Canadian authorities. And these also mention four degrees API as the reasonable limit of comparison. Still not definitive… but a starting point to examine any specific set of terms defining benchmarks by technical quality, in less than ten minutes of looking.

Use case 3: local training programs

How oil companies bring on local staff and train their counterparts in state oil companies and even ministries is another issue of keen interest. But what’s the norm? Run a search for just the word “training” and you find 250 contracts which reference it. But maybe that’s too many… Maybe you’re in a country that is debating a training commitment and wants to put a figure on it. But the company is shying away from specifying an amount. So how normal is it for a contract to specify an amount? Adapt the search to say: find the contracts which mention “dollars” within 25 words of “training” and you get 98 results. So, inconclusive. But informed inconclusive.
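To make the "within 25 words" idea concrete, here is a rough sketch of a proximity check in plain Python. It is not how the repository's search engine does it internally, just an illustration of the logic; the function name and the sample clause are made up.

```python
import re


def within_n_words(text, word_a, word_b, max_distance=25):
    """Return True if word_a occurs within max_distance words of word_b."""
    # Lowercase and split into word tokens; punctuation is dropped.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    positions_a = [i for i, t in enumerate(tokens) if t == word_a]
    positions_b = [i for i, t in enumerate(tokens) if t == word_b]
    return any(
        abs(i - j) <= max_distance
        for i in positions_a
        for j in positions_b
    )


# An invented clause, not a quotation from any contract.
sample = (
    "The Contractor shall spend not less than two hundred thousand "
    "dollars per year on the training of national personnel."
)
print(within_n_words(sample, "dollars", "training"))
```

Note that this matches exact words only, so "dollar" or "US$" would slip through; a real search would want to cover those variants too.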

Use case 4: Finding that troublesome phrase

You might actually use the repository to search one contract for a specific phrase. Outsiders would be amazed at how much time lawyers and others spend combing contracts they didn’t write for where the standard provisions governing this or that must be.

Simple example: Ghana’s Deepwater Tano contract does not mention “decommissioning” in its index of articles. But it must be there. “Tano” and “decommissioning” gives you a host of references to it – in the Accounting Procedure, leading back to the clause in the main contract (headed “Taxation and Other Imposts”) which specifies a decommissioning mechanism.

Use case 5: confidentiality

Over 300 contracts (out of 800 remember) reference “confidentiality”. So yes it’s a big deal. But companies and governments often cite just the existence of language relating to confidentiality as the end to a discussion about whether contracts can be published. Does that stand up?

Read through some of the results and you find words like “unless”, “without” and “however” – implying that generic confidentiality clauses have lists of exceptions. The transparency lobby has long claimed that these exceptions often entitle the government to publish contracts under a wide range of circumstances. How common are they? “Without” appears in the same paragraph in 134 contracts, “unless” in 42, and “except” or “exceptions” in 150. Those are pretty high proportions of the 309 total – so clearly the presence of a confidentiality clause is the beginning of a debate about contract transparency, as the advocates say, rather than the end of it.
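If you have the text of a contract to hand, the same-paragraph test is easy to approximate yourself. A minimal sketch, assuming paragraphs are separated by blank lines (which machine-read PDFs do not always respect); the function name and the sample text are invented.

```python
import re

EXCEPTION_WORDS = ("without", "unless", "except", "exceptions")


def exception_words_near_confidentiality(contract_text):
    """Return the exception words that share a paragraph with
    the word 'confidentiality'."""
    found = set()
    # Treat one or more blank lines as a paragraph break.
    for paragraph in re.split(r"\n\s*\n", contract_text.lower()):
        words = set(re.findall(r"[a-z]+", paragraph))
        if "confidentiality" in words:
            found.update(w for w in EXCEPTION_WORDS if w in words)
    return found


# Invented text for illustration, not taken from a real contract.
sample_contract = (
    "Confidentiality. Neither party shall disclose this Agreement "
    "without the consent of the other, unless required by law.\n"
    "\n"
    "Training. The Contractor shall train national personnel, "
    "except where otherwise agreed."
)
print(sorted(exception_words_near_confidentiality(sample_contract)))
```

Run over a folder of contract texts, a count of non-empty results would give the kind of tally quoted above, though a word match alone says nothing about what the exception actually permits.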

Use case 6: Non-English contracts

The search also works on contracts that are not in English: for example, you can look for references to the environment directly in French. Or Spanish.


First, our corporate filings search engine Aleph has taken PDF versions of the contracts (along with hundreds of thousands of other filings) and crunched them through a reader which renders them into text, which is then indexed in the system. Which is what makes the search possible. But it’s far from perfect.

Second, a lot of the power in searching lies in what data geeks call “regular expressions” – give me word A in the same sentence as word B… in a document which does not mention word C. Give me this phrase in contracts signed between 2000 and 2005. Now between 2005 and 2010. This functionality is built into the repository – but we haven’t had the space to design a front end interface which allows it all to be laid out for the casual user intuitively. For now, either read up on regular expressions, or ask us.
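To give a flavour of that kind of query, here is a small sketch using Python's built-in regular expression module. It is not the repository's own query syntax, which we do not document here; the Contract record, the search function and the four sample contracts are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Contract:
    name: str
    year_signed: int
    text: str


def same_sentence(text, word_a, word_b):
    """True if both words appear together in at least one sentence."""
    # Crude sentence split: ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if word_a in words and word_b in words:
            return True
    return False


def search(contracts, word_a, word_b, excluded, first_year, last_year):
    """Word A in the same sentence as word B, in contracts that never
    mention the excluded word, signed between the two years inclusive."""
    hits = []
    for contract in contracts:
        if not first_year <= contract.year_signed <= last_year:
            continue
        pattern = rf"\b{re.escape(excluded)}\b"
        if re.search(pattern, contract.text, re.IGNORECASE):
            continue
        if same_sentence(contract.text, word_a, word_b):
            hits.append(contract.name)
    return hits


# Invented contracts for illustration only.
contracts = [
    Contract("Contract A", 2003, "The royalty on gas shall be ten percent. Oil is treated separately."),
    Contract("Contract B", 2004, "The royalty on gas shall be five percent. Disputes go to arbitration."),
    Contract("Contract C", 2008, "The royalty on gas shall be eight percent."),
    Contract("Contract D", 2002, "A royalty is due on oil. Gas is exempt."),
]

print(search(contracts, "royalty", "gas", "arbitration", 2000, 2005))
print(search(contracts, "royalty", "gas", "arbitration", 2005, 2010))
```

The sentence splitter is deliberately naive and will trip over abbreviations like "U.S." or clause numbers like "12.3", which is one reason a proper front end takes real design work.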
