This morning, I went – on behalf of Creative Commons – to the “Open Science Works!” meeting in the European Parliament.
The meeting was organized under the auspices of MEP Marietje Schaake with the support of the TransAtlantic Consumer Dialogue (TACD).
Disclaimer: I am fairly new to this blogging/reporting. If you have any remarks or corrections (especially if you were present at the meeting), please share them in the remarks.
The panel session, hosted by David Hammerstein (@DaHammerstein), featured Alma Swan, Director of SPARC Europe; Tim Hubbard, Wellcome Trust Sanger Institute (@timjph); Cristoph Bruch, Helmholtz Gemeinschaft; and Celina Ramjoué, DG Connect.
Alma Swan, “The EU’s new Open Access Policy: making it happen.”
Alma Swan gave an overview of how the expected Open Access policy in Horizon 2020 differs from its predecessor. From a ‘best effort’ policy during FP7 – only 20% of all research, only in selected areas – the OA policy of the EC will be expanded to all research, and it will become mandatory.
To make it work, Swan stressed the necessity of alignment between Member States’ policies and the European policies:
- alignment makes international collaboration (a reality in most H2020 research projects) easier
- consistency with the idea of the unified European Research Area (ERA)
- consistent policies all over Europe are necessary to change authors’ attitudes vis-à-vis for example self-archiving policies
- allows for the creation of generic infrastructures
Swan mentioned the importance of Europe-wide infrastructure projects such as OpenAIRE, creating a single location where all European research output, together with project information and underlying data can be aggregated (mainly, but not exclusively, from repositories). She also mentioned two interesting, upcoming FP7 projects: FOSTER, OA advocacy based on the ‘train the trainer’ principle, and PASTEUR4OA, about OA policies.
The next speaker was Cristoph Bruch, responsible for the OA policies of the German Helmholtz Association: “Text and Data Mining (TDM): the way forward in the EU.”
‘Data is the new oil’: crucial, but retrieval processes still need to be refined. Bruch talked about how researchers are already constantly mining text and data without realizing it. Most also don’t know that, to conduct TDM, they are making copies of works (researchers don’t associate copying with TDM).
Bruch identified three main hurdles for efficient TDM:
1. European Copyright Directive: does not include TDM. Content industry claims copies of work are necessary for TDM – so TDM violates their rights of reproduction
2. Existing Sui Generis database rights: even if a database is accessible under a liberal license such as CC-BY, these rights ‘protect’ against mass downloads
3. Contractual arrangements between researchers and their publishers might also prevent TDM
Bruch gave two examples of how it can be different: in the USA, TDM is covered under the ‘fair use’ rule, and in Japan, it’s even explicitly allowed by law.
What needs to be done?
- exceptions to copyright law for TDM should not be limited to educational purposes (such an exception is currently under discussion in the UK, but it should not be limited to education & research)
- mandatory collective licensing
- repeal of the EU database protection laws
- H2020: if a publication fee is paid, the use of CC-BY (for publications) or CC0 (for data) should be mandatory!
Tim Hubbard, “The importance and potential of Open Science”.
Hubbard, known for his human genome research – one of the earliest triumphs of open science – discussed some of the remaining issues in Open Science. There are still problems with the (lack of) comprehensiveness of repositories, unique author identifiers (ORCID might be a solution) and restrictions on the reuse of articles (imposed by publishers).
He then talked about the Data Sharing Policy of the Sanger Institute: all data is released, even pre-publication. As a result, in the end all raw data is available, together with the intermediate stages of analysis and the final data analyses with accompanying research output (publications).
On privacy, an especially sensitive subject when working with health data, Hubbard explained that anonymisation is not always sufficient to protect privacy. On the other hand, overly stringent privacy laws (such as EU regulation) can prevent essential health research from being conducted. A proposed policy in the UK (where?) is to keep the data centrally stored in a ‘safe haven’ – and only have the processed data freely distributed. This makes it hard to do ‘the wrong thing’.
Celina Ramjoué closed the panel session by voicing the European Commission’s position:
- The EC is very committed to Open Science, as proven by the new H2020 policy
- TDM: the Licenses4Europe debate proves that this is a tricky discussion, but at least it got the subject on the agenda
- Open Access:
  - in H2020, correct metadata will get even more attention
  - costs for Gold OA will be eligible even after the project ends
  - enriched publications
  - alignment of policies
- Open Research Data: ‘data is indeed the new oil’
  - H2020 will feature a pilot on data, but what will its scope be?
  - opt-out option (for security, privacy or IP reasons)
  - importance of Research Data Management plans
- ‘Open Science is digital science’
- Stakeholder input is necessary
A discussion with the audience then followed, of which I will mainly remember a short exchange between Alma Swan and the representative of Reed Elsevier. As far as I could understand, Elsevier is planning to collect large amounts of data and make it available. Of course – it’s Elsevier! – only after an ‘agreement’. Swan’s reply (loosely quoted): “Why should we need extra agreements? That’s not what open data is about”. It seems that Elsevier is yet again trying to – disguised as ‘opening up’ – lock stuff up and only make it available after individual agreements (cf. Elsevier’s policies about self-archiving when mandated).
To conclude, Tim Hubbard made a great remark about how practice shows that ‘there are no downsides to open data’. The fears of researchers about sharing their data are almost never justified: peers, the public, journalists etc. all show that – when data is properly made available – it is generally used and cited correctly.