jag3773 / copenhagen-alliance

Copenhagen Alliance for Open Biblical Language Resources

The Copenhagen Alliance for Open Biblical Language Resources is a diverse coalition of organizations, institutions, and individuals with a common interest in making biblical language data free and openly accessible to anyone for research, language learning, translation, and other uses.

We formed this coalition at a workshop for researchers, language teachers, and translators in Copenhagen in March 2018. As we worked together, we realized that freely licensed resources would allow us to accomplish much more together. Some groups are blocked because they cannot use resources that other groups have. Several groups are creating similar resources, either because they are unaware of the work of other groups or because licensing restrictions prevent them from using it. This duplicated effort is inefficient. Worse, many of these projects are never finished because of insufficient resources, or never reach the level of quality they could reach if we pooled our efforts. We decided to form a coalition to identify existing resources, to promote free licensing for existing resources that can be shared, and to create new resources.

Some resources are created only for commercial use or cannot be freely licensed for other reasons. We recognize that fact, and some of our members produce such resources. Our focus is on resources that can be freely licensed.

Why are free resources important?

Freely licensed resources can be used without contacting and negotiating with the copyright holder. That way, creators can focus on doing their work, using and building on the resources according to the terms of use spelled out in standard licenses.

For instance:

  • A language instructor can incorporate freely licensed text and images into lessons that can also be distributed freely.
  • A researcher can provide an analysis of an existing text together with the text. Other researchers can build on that, adding new analyses and data sets.
  • A researcher can analyze the work of another researcher in new ways, using the same datasets and texts.
  • A researcher can confirm the results of another researcher by running the same test on the same data, noting any discrepancies.
  • An existing text can be enhanced with morphology, syntax trees, annotations, and illustrations.
  • A translator can freely translate a work into new languages (this is not allowed for texts that are not freely licensed).
  • A translation association can freely translate a commentary or translator's notes into another language.

Even when copyright holders are willing to share their work, negotiating licenses and working with lawyers can consume significant time and effort. Standard free licenses spell out the terms in advance, so creators can use a resource without contacting and negotiating with the copyright holder.

What do we mean by free?

There is an open issue on this section - see Copenhagen-Alliance#5.

A resource is freely licensed if the license permits free use, reuse, modification, and sharing with others. A resource in the public domain is free by definition. If you create a work but do not license it, it is copyrighted by default, and nobody else can use it without your permission. The best way to make a resource free is to attach a clear license that identifies you as the copyright holder.

Standard licenses are strongly preferred because the terms are well-understood by a large community and it is easy to get information on them. For content, you can generate a license at Creative Commons by making a few simple choices. For code, choosealicense.com provides an excellent overview of licensing options.

For content, we prefer free culture licenses, including these:

  • CC0 Public Domain Dedication - this is not actually a license, but a way of dedicating a work to the public domain, clearly identifying yourself as the copyright holder who has the right to do so.
  • Attribution 4.0 International (CC BY 4.0) - this license allows anyone to share or adapt your work as long as they attribute the copyright holder. You can specify how your work should be attributed, including a link to the definitive source.
  • Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) - this license allows anyone to share or adapt your work as long as they attribute the copyright holder and share their adaptations of your work on the same terms. You can specify how your work should be attributed, including a link to the definitive source.

Some Creative Commons licenses do not permit commercial use. These are not free culture licenses, but can be used when business models or other considerations prohibit commercial use. For this purpose, Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) or Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) can be used. The copyright holder can offer a separate license for commercial use.

Some Creative Commons licenses do not permit derivatives. These are not free culture licenses, but can be used when business models or other considerations prohibit derivatives. For this purpose, Attribution-NoDerivatives 4.0 International or Attribution-NonCommercial-NoDerivatives 4.0 International can be used. Creators who choose a no-derivatives license should consider granting permissions for translation, reformatting, adding morphology or syntax trees, subsetting the data, and annotation - see the Free Translate 2.0 International Public License for an example of a license that does this. (Creative Commons licenses do not allow new restrictions to be added to a license, but additional permissions can be granted on top of a more restrictive license.)

Source Code Control

Resources should be curated over time by correcting errors, adding new features, and responding to the needs of the community.

We prefer resources that are available in publicly accessible source code control systems like GitHub. These systems make it much easier for communities to access data, suggest corrections by opening issues, compare current versions to previous versions to see what changed, and work together to add new features in branches.

Standard Formats with Metadata and Extensibility

We prefer human-readable standard formats that are also easy for programs to parse. These formats should support metadata and extensibility. Some common formats that satisfy these requirements are:

  • XML and well-formed HTML (often best for documents)
  • JSON (often best for objects)
  • CSV or TSV with descriptive header rows (for tabular data)
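
As a rough illustration of these preferences, the sketch below pairs a small TSV table that has a descriptive header row with a JSON metadata record. Everything in it is hypothetical: the column names, the metadata fields, and the sample rows are assumptions made for this example, not a format the Alliance prescribes.

```python
import csv
import io
import json

# Hypothetical morphology sample; column names and rows are illustrative only.
TSV_DATA = (
    "ref\ttoken\tlemma\tpart_of_speech\n"
    "John 1:1!1\tἐν\tἐν\tpreposition\n"
    "John 1:1!2\tἀρχῇ\tἀρχή\tnoun\n"
)

# Hypothetical metadata record stored alongside the table.
METADATA_JSON = json.dumps(
    {
        "title": "Example morphology sample",
        "license": "CC BY 4.0",
        "columns": {
            "ref": "book chapter:verse!word position",
            "token": "surface form as it appears in the text",
            "lemma": "dictionary form",
            "part_of_speech": "coarse grammatical category",
        },
    },
    ensure_ascii=False,
)


def load_rows(tsv_text):
    """Parse TSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def load_metadata(json_text):
    """Parse the JSON metadata record into a dict."""
    return json.loads(json_text)


rows = load_rows(TSV_DATA)
metadata = load_metadata(METADATA_JSON)

# Check that every column in the data is documented in the metadata.
undocumented = [name for name in rows[0] if name not in metadata["columns"]]
print(metadata["license"], len(rows), undocumented)
```

Because the rows are read by column name rather than by position, a later version of the table could add columns without breaking a program that only uses the existing ones, which is one simple form of extensibility.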

What kinds of resources are we targeting?

We are targeting resources that are useful primarily for people working with texts in the original biblical languages. Here are some kinds of resources we would like to see freely available:

  • Software platforms for working with biblical texts (translation, annotation, queries, research, etc.)
  • Images of manuscripts
  • Transcriptions of manuscripts
  • Critical editions of biblical texts
  • Morphologies
  • Syntactic analyses
  • Headings and passage boundaries
  • Word senses
  • Glosses
  • Discourse analyses
  • Semantic analyses
  • Alignment data for manuscripts and critical editions
  • Lexicons
  • Semantic domains and WordNets
  • Reference grammars
  • Commentaries
  • Translation handbooks and guides
  • Exegetical summaries
  • Catalogues of specific kinds of references - people, places, things, flora, fauna, etc.
  • Catalogues of specific figures of speech and rhetorical devices
  • Catalogues of specific syntactic structures, e.g. articular infinitives, articular participles, circumstantial participles, supplementary participles, etc.
  • Cross-reference tables for versification across manuscripts and critical editions
  • Maps and GIS coordinates
  • Image catalogues keyed to verses or biblical words
