Open Access, Open Data and Data Licenses

Posted: 16 June 2011 by Alistair Miles in Uncategorized

In the world of open source software, licenses like GPL, LGPL, MIT, etc., are generally viewed as a good thing, as they allow the authors of the software to place limited restrictions on the re-use of software according to their preference, whilst still being able to publish the source code. Similarly, for other creative works and open access publishing, the Creative Commons licenses are generally viewed as beneficial, because they allow authors to protect the integrity of their work if desired along with their right to attribution, but not otherwise limit access to or re-use of their work.

So what about scientific data? In MalariaGEN, we are developing policies for “community projects” where partners from independent research institutions around the world submit samples for sequencing. Ultimately, we would like to make all of the data derived from sequencing those samples available to the scientific research community, but we would also like to protect our partners' investment in collecting those samples by ensuring they are attributed when data are re-used. So, I thought, surely the best way to do this is to publish the data under a CC-like license, right?

It turns out this is not the current consensus. Science Commons have published a Protocol for Implementing Open Access Data, which (in section 5) has a good explanation of why using intellectual property rights (i.e., licenses) to enforce norms of attribution or share-alike is a bad idea. So the protocol states that:

[…] to facilitate data integration and open access data sharing, any implementation of this protocol MUST waive all rights necessary for data extraction and re-use […] and MUST NOT apply any obligations on the user of the data or database such as “copyleft” or “share alike”, or even the legal requirement to provide attribution.

This is consistent with policies adopted by major scientific data publishers like the European Nucleotide Archive (ENA), e.g.:

The INSD will not attach statements to records that restrict access to the data, limit the use of the information in these records, or prohibit certain types of publications based on these records. Specifically, no use restrictions or licensing requirements will be included in any sequence data records, and no restrictions or licensing fees will be placed on the redistribution or use of the database by any party.

However, the Science Commons protocol also says that:

Any implementation SHOULD define a non-legally binding set of citation norms in clear, lay-readable language.

Comments
  1. Here’s the license that OpenStreetMap uses. I think it’s interesting because they are also licensing a database of facts, yet they’ve chosen the copyleft, attribution-required style of license that is so clearly eschewed in this case.

    http://www.opendatacommons.org/licenses/odbl/
