Dear Mark
I have not contributed before on the list. My name is Benjamin White, and I am Head of Intellectual Property at the British Library.
I am aware that Creative Commons are working on public statements on CC licences, the requirement for attribution, and how attribution is possible in the context of large-scale text and data mining (TDM). I don't think the copyright / moral right idea of attribution maps well to big data. For example, the requirement to attribute is nonsensical when mining thousands of articles, or indeed when TDM is deriving results from what has NOT been written about in the text.
For this reason my preference would be to say CC0, with a strong statement that attribution should be given as a matter of professional ethics, when at all possible and practical.
I would also like a very strong statement in here that in many jurisdictions (I am thinking of the EU) the creation of nearly all forms of data automatically attracts sui generis database rights and, depending on the contents, copyright also. Researchers should therefore ALWAYS ALWAYS seek to ensure clarity through clear licensing of that data.
I really do not think this can be overstressed. I have seen / heard so many statements from different organisations, and been at EU hearings also, where sharing of data has been discussed ad nauseam and yet there was NO mention of the fact that databases / arrangements of data attract IPR and can therefore be a barrier to sharing.
I would favour a CC0 approach, with soft norms such as professional ethics being used to deal with issues like attribution.
Regards
Ben