Licensing your code for better impact

March 2022

Status of these slides

This is work-in-progress software engineering advice and has not been checked by a legal expert. Feedback welcome!

Hi, I’m Bob!

Career… Software Consultant -> Researcher -> Research Software Engineer

RSE at Sheffield

RSE

13 RSEs, 35 projects / year worth ~£11m total

Operational Model

  • Underwritten by overheads
  • Funded from external sources

Operational Model: Results

  • Financial sustainability
  • “Fair” allocation of staff to projects
  • “Convenient” access to growing pool of expertise and experience
  • (Mainly) academic led
  • Open ended contracts for RSEs

Talk Structure

  • Definitions
  • Open and closed source
  • Practicalities of open sourcing

License

Gives right to use copyright material in specific ways, without changing ownership.

No licence: no right to copy.

In a nutshell

Open Source Benefits

  • Reproducibility is easier
  • Faster impact
  • More eyes, fewer bugs
  • Access for everyone, regardless of ability to pay
  • Encouraged by UK Government

FAIR Principles

…the existing digital ecosystem surrounding scholarly data publication prevents us from extracting maximum benefit from our research investments…

The FAIR Guiding Principles for scientific data management and stewardship

Findability, Accessibility, Interoperability, and Reuse

FAIR4RS, Four freedoms (abridged)

National Policy

https://www.gov.uk/government/publications/uk-research-and-development-roadmap

OGL 3

Open Source Drawbacks

  • Getting “scooped”
  • Being exposed as a bad programmer
  • Misuse (building weapons, rigging elections)
  • Sustainability (Who updates the code? Who pays for this?)

Software is not “just” data. It needs to be updated to remain useful. Here’s the code for Space Invaders.

Open Source Case Study: GPy

Open Source Case Study: GateNLP

  • “Natural Language Processing” (AI text analysis)
  • Impact case study (2014)
  • Used in media (delivery; analysis; journalism); pharmaceuticals; patent search; voice-of-the-customer; brand, product, and reputation management; social media analytics; bioinformatics.
  • Commercial beneficiaries include BT, Elsevier, Yahoo, Atos, Dassault Aviation, MPS Bank, Creditreform, BBC, the Press Association, Euromoney.

Closed Source Benefits

  • May increase the value of closed source licences, or products derived from the software.
  • Resulting revenue might fund:
    • Sustainability and maintenance
    • Further research and development
    • Regulatory compliance
    • Support
    • Your yacht (😂 - but you may benefit financially)

The World Needs Closed Source

  • Open source software tends to offer a more difficult user experience (contrast macOS and Linux on user devices).
  • A good user experience comes from understanding the market and the end user.
  • That understanding comes from market and user research, paid for by software licensing revenue.

Software Licensing Income

  • Over the last six years of available data, the University of Sheffield made about £300,000 per year in software licensing income.
  • The University of Cambridge made the most in the UK, at £1.5 million per year.
  • Total research income for these two institutions was around £175 million and £527 million per year, respectively.
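To put these figures in proportion, the short sketch below expresses licensing income as a percentage of total research income. It uses only the rounded numbers quoted above; the names `income` and `licensing_share` are illustrative, not from any university data source.

```python
# Approximate figures quoted on the slide, in pounds per year.
income = {
    "Sheffield": {"licensing": 300_000, "research": 175_000_000},
    "Cambridge": {"licensing": 1_500_000, "research": 527_000_000},
}


def licensing_share(licensing: float, research: float) -> float:
    """Return licensing income as a percentage of research income."""
    return 100 * licensing / research


shares = {name: licensing_share(**figures) for name, figures in income.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}% of research income")
# Sheffield: 0.17% of research income
# Cambridge: 0.28% of research income
```

On these rounded figures, software licensing brings in well under 1% of research income at both institutions.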

Software licensing income summary

Closed Source Drawbacks

  • Reproducibility is harder, perhaps impossible
  • Transparency / openness is reduced
  • Trust in research outputs is reduced

Closed Source Case Study: Fluent

  • Computational Fluid Dynamics (CFD) models liquid and air flows and is now considered essential to many aspects of engineering
  • Code developed in 1970s
  • Licensing in early 1980s
  • Part of the basis for market leader ANSYS Fluent

Closed Source Case Study:

How to open source

  • Choose a licence, agreed by everyone on the project.
  • Put the licence text in a file alongside your code.
  • Publish your code somewhere such as GitHub or GitLab.
  • Release a package.
  • Reference specific versions using ORDA, Zenodo or equivalent.

Choose a licence

“Copyleft” e.g. GPL3 - better for academic collaboration

More permissive e.g. MIT - better for private sector collaboration

choosealicense.com

Presenter: follow links.

Creative Commons?

  • Not recommended:
    • Lack software-specific terms
    • Licence compatibility problems

https://creativecommons.org/faq/#can-i-apply-a-creative-commons-license-to-software

Apply the licence

  • Most licences require a single file like LICENSE or LICENSE.txt.
  • Some licences advise that you should put a message at the top of every file.
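Where a licence does advise a per-file message, adding it by hand is tedious. The following is a minimal sketch, assuming Python source files and a one-line SPDX tag as the message; the `HEADER` text, the MIT choice, and the helper names `add_header` and `add_headers` are all illustrative assumptions. Use the wording your chosen licence actually recommends.

```python
from pathlib import Path
from typing import List

# Assumption: a one-line SPDX tag stands in for the per-file message.
# Replace it with the notice your chosen licence recommends.
HEADER = "# SPDX-License-Identifier: MIT\n"


def add_header(path: Path, header: str = HEADER) -> bool:
    """Prepend the header to one file. Return True if the file was changed."""
    text = path.read_text(encoding="utf-8")
    if header.strip() in text:
        return False  # already tagged, leave untouched
    if text.startswith("#!"):
        # Keep a shebang on the first line and put the header below it.
        first, _, rest = text.partition("\n")
        text = f"{first}\n{header}{rest}"
    else:
        text = header + text
    path.write_text(text, encoding="utf-8")
    return True


def add_headers(root: Path, pattern: str = "*.py") -> List[Path]:
    """Tag every matching file under root and return the files changed."""
    return [p for p in sorted(root.rglob(pattern)) if add_header(p)]
```

Calling `add_headers(Path("src"))` would tag every `.py` file under a hypothetical `src` directory. Running it a second time changes nothing, so it is safe to repeat.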

Publish

Compendia?


├───Data
├───Code
└───Docs

License individually.
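A quick way to confirm that each part of a compendium carries its own licence is to look for a licence file in every top-level folder. This is a sketch under assumptions: the folder names follow the layout above, and the accepted file names in `LICENCE_NAMES` are a guess at common conventions rather than a standard.

```python
from pathlib import Path
from typing import Dict, Iterable

# Assumption: any of these file names counts as a licence file.
LICENCE_NAMES = ("LICENSE", "LICENSE.txt", "LICENSE.md", "LICENCE", "LICENCE.txt")


def licence_status(
    root: Path, parts: Iterable[str] = ("Data", "Code", "Docs")
) -> Dict[str, bool]:
    """Map each compendium folder to whether it contains a licence file."""
    return {
        part: any((root / part / name).is_file() for name in LICENCE_NAMES)
        for part in parts
    }


def missing_licences(root: Path) -> list:
    """Return the folders that still need a licence file."""
    return [part for part, ok in licence_status(root).items() if not ok]
```

For example, `missing_licences(Path("."))` run at the top of the compendium lists any of `Data`, `Code` or `Docs` that has no licence file yet.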

Encouraging Reuse

Make a package and publish to an appropriate place e.g.

  • PyPI for Python
  • CRAN for R

Citing Software

e.g. (ORDA)

Beton, Joseph; Pyne, Alice; Praveen Joseph, Agnel; Topf, Maya (2020): TopoStats - an automated tracing program for AFM images. The University of Sheffield. Software. https://doi.org/10.15131/shef.data.13103327.v3
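If you keep the citation details as structured metadata, the citation string can be generated rather than typed. The sketch below reproduces the layout of the ORDA example above; the function name `format_citation` and its parameters are illustrative, not part of any ORDA or Zenodo API.

```python
from typing import Sequence


def format_citation(
    authors: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    kind: str,
    doi: str,
) -> str:
    """Build a citation string in the style of the ORDA example."""
    names = "; ".join(authors)
    return f"{names} ({year}): {title}. {publisher}. {kind}. https://doi.org/{doi}"


citation = format_citation(
    authors=["Beton, Joseph", "Pyne, Alice", "Praveen Joseph, Agnel", "Topf, Maya"],
    year=2020,
    title="TopoStats - an automated tracing program for AFM images",
    publisher="The University of Sheffield",
    kind="Software",
    doi="10.15131/shef.data.13103327.v3",
)
print(citation)
```

Note that the DOI ends in `.v3`, so the citation points at one specific version of the software rather than whatever is latest.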

Automation

Contact RSE Team for further technical advice.

A -->|Webhook or action| B(Package Index - PyPI, CRAN, …)
A -->|Webhook or action| C(Archive - ORDA, Zenodo, …)

University of Sheffield advice

Thank you