Curatio et Innovatio

08/10/2021 ∙ by Roberto Rossi, et al. ∙ 0

The Middle Ages focused obsessively on the old; our era is totally absorbed with the new. In medio stat virtus. In this short note, I advocate a strategy that blends copyright and copyleft for disseminating research results in the sciences. I argue that such a blend may be beneficial in fields such as mathematics and computer science, and that it may facilitate the evolution and emergence of improved problem descriptions, whilst at the same time preserving authors' rights and easing researchers' work.


1 Introduction

In scientific enquiries, the researcher must acknowledge two essential elements. On the one hand, there is the Known; on the other, there is the Unknown.

The Known is whatever has been investigated for years, decades, and perhaps centuries. People may have devoted their lives and efforts to structuring a problem, demarcating it, dissecting it, and coming up with a polished description of the challenge and of ways of dealing with it.

The Unknown, conversely, is the unexplored. It is the realm of discovery, the force which keeps research going.

In this work, I argue that, while conducting research in the sciences,

  • the Known and the Unknown should be approached in different ways, by leveraging two different tools: copyleft and copyright; and that

  • dissemination of research results in the sciences should leverage a blend of these two strategies that deal with intellectual property.

The rest of this short note is structured as follows. In Section 2, I survey the attitudes that different ages held towards the Known and the Unknown; in particular, the medieval obsession with preserving the Known, and the present-day obsession with discovering the Unknown. In Section 3, I illustrate a possible new approach to the dissemination of research results in the sciences, which leverages a blend of intellectual property protection strategies to strike a better balance between preservation of the Known and discovery of the Unknown. This approach is operationalised in Section 4. In Section 5, I draw some concluding remarks.

2 The Known and the Unknown

The “Known” is the realm of the explored. It is that branch of knowledge that deals with problems that people have considered before, discussed, and perhaps solved, or left open. For instance, a known problem in mathematics is the Knapsack Problem [6], whose origin dates back to the early works of Dantzig.

Medieval authorities tended to advocate a predominantly closed canon of knowledge which made dealing with previously unknown concepts particularly difficult (see “Foreign Knowledge – Medieval Attitudes towards the Unknown”, H-Soz-Kult, 14.02.2018, www.hsozkult.de/event/id/event-86250).

“Curiositas,” the pursuit of new knowledge, was despised in the Middle Ages [4].

At the end of his discussion of a late medieval English guidebook to the Holy Land, Howard mentions the general medieval ambivalence toward curiosity. “All this,” he says, “is curiositas — the traveller’s interest in what he sees and the reader’s in what he hears. But it is exactly this ‘curiosity’ that led pilgrims astray and put the pilgrimage in bad repute.” [1, p. 30]

In a word, pilgrims ought not to be concerned with discovering new lands and customs; they had to focus on the spiritual aim of their journey. As a consequence of this established mindset, the Known ought to be preserved, commented upon, recombined (“varietas” was prized in line with Ciceronian tenets of ancient rhetoric [3]), but seldom modified or expanded. Preserving the Known for future generations became a key aim and duty of medieval scholars, particularly in monastic settings.

However, it is hard to preserve knowledge in its pristine state. When Latin or Greek manuscripts were transcribed by amanuenses, they were often annotated and, out of necessity, modified. Each change was minimal and went unnoticed. The action of the amanuenses was akin to an apprentice stonemason chiselling, or to natural selection: imperceptible in the grand scheme of things. And yet, that chiselling helped knowledge sail the waves of time, and evolve. Words were modified. Sentences were removed, inserted, or changed; and the original text therefore slowly morphed into a different one.

All this changed dramatically at the onset of the Renaissance. As discussed in [11], the change of mindset underpinning the Renaissance did not emerge overnight. The invention of the compass and of cartography, its cognate discipline, represents the prelude to the Age of Discovery. Sailors broke old established taboos and sailed through the “Unknown” to discover new lands and riches. As a consequence of these discoveries, which directly contradicted established dogmas, people grew increasingly suspicious of the closed canon of knowledge that dominated the Middle Ages. These contradictions [8] catalysed a revolution in people’s minds. The time was ripe for a dramatic change.

It is surprising to observe how key developments in human thought seem to have emerged almost synchronously. Within a time span of fifty years, a New World had been discovered (1492) by Christopher Columbus, and a “Nova Scientia” had been discussed by Nicola Tartaglia (1537) [9]; and yet, Tartaglia’s work was still “conformist.” Tartaglia’s aim was to discuss a New Science, what we would now call dynamics, i.e. that branch of physics that deals with time-dependent physical matters. Akin to Euclid’s work, which begins with the definition of a point, Tartaglia’s work begins with the definition of an “instant,” that is, a “point in time,” and then uses the logico-deductive method to derive results. However, Tartaglia did not have the means to go as far as Galileo went. He stopped short of Galileo’s revolutionary claims, and framed his New Science within an Aristotelian framework. Still, Tartaglia was able to obtain new results, and to chart the Unknown. In particular, he focused on a practical military problem, and showed at what angle a cannon should be fired in order to achieve the longest possible shot [10]. It is this charting of the Unknown that made his science “new;” hence the title “Nova Scientia,” the New Science. A close inspection of Tartaglia’s work reveals the influence he exerted on Galileo.

3 Curatio et innovatio

Our age is clearly aligned with Tartaglia’s and Galileo’s mindset. Academics strive to produce new knowledge and to generate so-called “impact.” While doing so, they “stand on the shoulders of giants.” They build upon existing results to produce new knowledge. And yet, despite standing on the shoulders of giants, a boulder blocks their view: copyright.

When working on an existing problem, for example the Knapsack Problem, authors cannot reuse the original problem description verbatim; in the case of the Knapsack Problem, this would be the one presented in section “The Knapsack Problem” of [6, p. 273]. In order not to infringe the original author’s copyright, they must create a new description of the same problem by paraphrasing the original text. This leads to a number of problems: for complex problems, some authors may misunderstand the problem description, and generate wrong, incomplete, or misleading paraphrases; other authors may develop correct but poor descriptions. Science is then caught in an endless cycle in which a problem description is constantly perturbed, constantly re-created afresh in every new paper published, and thus never reaches a steady state. Entropy, rather than stability, is sovereign.

I argue that creation of new knowledge — for instance the discussion of a new algorithm to solve the Knapsack Problem — and preservation of existing knowledge — i.e. the Knapsack Problem definition — should be treated in different ways. In particular, copyright and copyleft should be used in concert while crafting research works.

More specifically, problem definitions should be disseminated under copyleft, e.g. Creative Commons Attribution (CC-BY), so that future authors and researchers may reuse them, and build upon them. I name this the “curatio” part of research, the preservation of knowledge, which is completely overlooked by existing research practices. The problem definition created by the original author should be initially published in a copyleft repository, rather than in a copyrighted work. This very same definition should then be reused in its original form by subsequent authors who aim to build upon the original research. Authors seeking to improve a problem definition should themselves release their improved problem definition in copyleft form. Eventually, this strategy allows the “best” problem definition(s) to emerge spontaneously, by natural selection and popular vote: the best problem definitions are those that appear most frequently in published works, those that are liked the most.

Finally, new research results (for example a new solution method for the Knapsack Problem), new analyses, discussions, or other original findings may be published as usual, subject to copyright, to protect authors’ and publishers’ rights. This is the “innovatio” part of research, which we are all familiar with.

Figure 1: A sample page in a copyrighted work embedding copyleft materials.

How, then, would research be affected by this proposal? In essence, research will turn into a mix of “curatio” and “innovatio,” of copyleft and copyright.

Some researchers will focus on curating existing problems. They will focus on improving problem definitions, perhaps on public repositories (e.g. https://www.github.com), and on releasing copyleft versions of these definitions that will be reused in future research works. This will resemble what already happens in open source software development. These authors will be rewarded by seeing their problem definition emerge as the definition of choice in the literature.

Other researchers will focus on developing new approaches to tackle existing problems, or on developing new problems. While developing new approaches to tackle existing problems, they will publish as usual, in traditional journals, by leveraging and embedding the aforementioned copyleft materials (Fig. 1). While developing new problems, they will take care to first release the problem definition in copyleft open source form, in public repositories such as GitHub, so that other researchers will be able to reuse the original text, embed it in their works, build upon it, modify it, and ideally improve it.

There are further advantages to this approach to disseminating science. The art of memory [5] leverages patterns and repetitions. Parataxis (juxtaposition) of traditional elements has been used since the dawn of mankind as a memory device. Its use is apparent in oral epic poetry, which required aoidoi to memorize entire poems. For instance, a device extensively used in Homer’s works is the “fixed epithet,” a stereotyped descriptive phrase that can be leveraged as necessary to suit the demands of the metre: “fleet-footed” Achilles, “wily” Odysseus, or “rosy-fingered” Dawn. Furthermore, there are stereotyped formulas for going to bed and getting up, putting on and taking off armour, sacrificing and feasting, and launching and beaching ships [7].

Scholars [showed] how the formulas varied with great subtlety and effect in relation to the specific narrative contexts in which they appeared. Often these variations were among different traditional elements, not between a traditional formula and a unique expression, suggesting that oral aesthetics consisted of a skillful use of traditional elements rather than the invention of new material. [2]

I argue, then, that the aforementioned endless cycle, in which a problem description is constantly reinvented in every new publication, has a detrimental effect on memory and on the quality of published material, since scholars must familiarise themselves with endless modelling conventions and choices. It also makes it more difficult to carry out a literature survey and to follow a given research thread across multiple publications, which may leverage different, and in extreme cases even conflicting, problem definitions. The approach proposed here, by promoting standardisation of problem definitions, is likely to ease these problems.

4 Curatio in practice

Having outlined the general research dissemination framework advocated in this manuscript, we now turn to implementation and discuss how the framework can be operationalised in practice. For convenience, we illustrate a possible implementation that blends LaTeX and GitHub. For this purpose, a new LaTeX package defining the environment copyleft has been developed; it is presented in Appendix A. The environment is structured as follows,

\begin{copyleft}{Author}{Title}{Source}{License}
    Copyleft material.
\end{copyleft}

Environment copyleft surrounds a chunk of copyleft material and, in line with copyleft attribution best practices (https://creativecommons.org/use-remix/attribution/), accepts four input parameters: the author, the title of the work, the source of the work, and the licence under which the material is released.

Assume I have recently developed a new definition of the Knapsack Problem, and I have made it available in a GitHub repository under a Creative Commons Attribution 2.0 Generic (CC BY 2.0) license (https://creativecommons.org/licenses/by/2.0/). The repository contains a file knapsack.tex with the LaTeX code shown in Listing 1.

\begin{copyleft}
{Roberto Rossi}                                        % Author
{Knapsack Problem}                                     % Title
{\url{https://github.com/.../knapsack.tex}}            % Source
{Creative Commons Attribution 2.0 Generic (CC BY 2.0)} % License
Given a set of $n$ items numbered from 1 up to $n$,
each with a weight $w_i$ and a value $v_i$,
along with a maximum weight capacity $W$, the problem
is to maximize the knapsack value, subject to the
knapsack's capacity constraint, that is
\[
\begin{array}{ll@{}ll}
\mathrm{max}        & \displaystyle\sum\limits_{i=1}^{n} v_{i}&x_{i} &\\
\mathrm{subject~to} & \displaystyle\sum\limits_{i=1}^{n} w_{i}&x_{i} \leq W\\
                    &
                    &x_{i} \in \{0,1\}, &i=1 ,\dots, n.
\end{array}
\]
where $x_{i}$ represents the number of instances of item $i$
to include in the knapsack.
\end{copyleft}
Listing 1: knapsack.tex; observe how the Knapsack Problem definition is surrounded by the copyleft environment, and therefore annotated with information on author, title, source, and license, in line with copyleft attribution best practices.

The LaTeX material in Listing 1 can be easily embedded into any other LaTeX document as follows,

\input{knapsack.tex}

and this leads to the following result.

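Roberto Rossi Knapsack Problem https://github.com/.../knapsack.tex Creative Commons Attribution 2.0 Generic (CC BY 2.0)
Given a set of $n$ items numbered from 1 up to $n$, each with a weight $w_i$ and a value $v_i$, along with a maximum weight capacity $W$, the problem is to maximize the knapsack value, subject to the knapsack’s capacity constraint, that is

\[
\begin{array}{ll}
\max & \displaystyle\sum_{i=1}^{n} v_{i} x_{i}\\
\mathrm{subject~to} & \displaystyle\sum_{i=1}^{n} w_{i} x_{i} \leq W\\
 & x_{i} \in \{0,1\}, \quad i=1,\dots,n.
\end{array}
\]

where $x_{i}$ represents the number of instances of item $i$ to include in the knapsack.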
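
As a quick sanity check of the formulation in Listing 1, the following is a small worked instance. The data ($n = 3$, $W = 5$, and the weights and values) are hypothetical and chosen only for illustration; they do not come from the original definition.

```latex
% Hypothetical instance (not from the source): n = 3, W = 5,
% weights w = (2, 3, 4), values v = (3, 4, 5).
\[
\begin{array}{ll}
\max & 3x_{1} + 4x_{2} + 5x_{3}\\
\mathrm{subject~to} & 2x_{1} + 3x_{2} + 4x_{3} \leq 5\\
 & x_{i} \in \{0,1\}, \quad i = 1, 2, 3.
\end{array}
\]
% Feasible selections and their values:
%   {1}: 3,  {2}: 4,  {3}: 5,  {1,2}: 7 (total weight 5).
% Infeasible selections: {1,3} (weight 6), {2,3} (weight 7), {1,2,3} (weight 9).
% Optimum: x = (1, 1, 0), with value 7.
```

With only three items the optimum can be found by enumerating all selections; the point is simply that every author who embeds the same definition would read the instance in the same way.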

Assume now that an enhanced problem definition has been recently released by John Doe, and it has been made available as better_knapsack.tex in a new GitHub repository, again under a Creative Commons Attribution 2.0 Generic (CC BY 2.0) license. We can embed this enhanced problem definition in our manuscript via the command

\input{better_knapsack.tex}

and this leads to the following result.

John Doe A Better Knapsack Problem https://github.com/.../better_knapsack.tex Creative Commons Attribution 2.0 Generic (CC BY 2.0)
Given a set of $n$ items numbered from 1 up to $n$, each with a weight $w_i$ and a value $v_i$, along with a maximum weight capacity $W$, the Knapsack Problem (KP) is to maximise the value of a knapsack (i.e. a selection of items), subject to the constraint that items picked must fit into its capacity. The problem can be formulated mathematically as follows,

\[
\begin{array}{ll}
\max & \displaystyle\sum_{i=1}^{n} v_{i} x_{i}\\
\mathrm{subject~to} & \displaystyle\sum_{i=1}^{n} w_{i} x_{i} \leq W\\
 & x_{i} \in \{0,1\}, \quad i=1,\dots,n.
\end{array}
\]

where $x_{i}$ represents the number of instances of item $i$ to include in the knapsack.

The use of package copyleft ensures that the original LaTeX source is duly annotated in line with copyleft attribution best practices. Moreover, these annotations (author, title, source, and license information) are gathered and then compiled into a list of copyleft credits that can be printed after the usual list of references of an article via the command \printcopyleft. (The list of copyleft credits for the present work is printed at the end of this document.) Note that improved versions of this package may compile and display these attributions in different forms, depending on the needs of the publication outlet.
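
To make the workflow concrete, the following is a minimal sketch of a host manuscript. It assumes that copyleft.sty (Listing 2), knapsack.tex, and better_knapsack.tex sit in the same directory as the main file; the file name main.tex, the section titles, and the choice of the url package to provide \url are assumptions made for illustration, not part of the original package.

```latex
% main.tex: a minimal sketch under the assumptions stated above.
\documentclass{article}
\usepackage{url}       % assumption: supplies \url used in the Source argument
\usepackage{copyleft}  % the package presented in Appendix A

\begin{document}

\section{Problem definition}
\input{knapsack.tex}         % copyleft material, reused verbatim

\section{An enhanced definition}
\input{better_knapsack.tex}  % a second copyleft source

\section{A new solution method}
Original, copyrighted contribution goes here.

% The usual list of references would be printed here.

\printcopyleft  % credits for all copyleft material embedded above
\end{document}
```

The copyrighted contribution and the copyleft definitions live in separate files, so the host manuscript never needs to edit the embedded text.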

5 Conclusions

To conclude, in this paper I advocate a departure from existing research publication practices, which mainly rely upon copyrighted work for dissemination, and a move towards a new, more balanced blend of copyright and copyleft for the dissemination of research results. Arguably, such a blend may facilitate the evolution and emergence of improved problem descriptions, whilst at the same time preserving authors’ rights and easing researchers’ work. Finally, I have illustrated a possible strategy to operationalise this framework; this strategy leverages a newly developed LaTeX package (copyleft.sty) and open repositories such as GitHub to distribute copyleft material.

Appendix A The copyleft LaTeX package

The copyleft LaTeX package, presented in Listing 2, can be loaded in a LaTeX manuscript via the command

\usepackage{copyleft}

\NeedsTeXFormat{LaTeX2e}[1994/06/01]
\ProvidesPackage{copyleft}[2021/08/07 Copyleft Package]
\RequirePackage{etoolbox}
\def\copyleftlist{}%
\listadd{\copyleftlist}{}% Initialize list
\newrobustcmd{\myexpandingcommand}[1]{%
\listgadd{\copyleftlist}{#1}% Add an element
}%
% Macro showing the current list element%
\newrobustcmd{\showlist}[1]{%
#1%
}%
\newcommand\copyleftsource{}
\newcommand\copyleftlicence{}
\newenvironment{copyleft}[4]{
% Four arguments: author, title, source, license
% https://creativecommons.org/use-remix/attribution/
\expandafter\myexpandingcommand
\expandafter{#1. #2. #3. #4.}
}{}
\newcommand{\printcopyleft}{%
  \section*{Credits}
  This manuscript embeds copyleft material
  from the following sources.
  \forlistloop{\showlist\\\\}{\copyleftlist}
}
\endinput
Listing 2: copyleft.sty
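
Since the credits may need to be displayed in different forms depending on the publication outlet, the following sketch shows one possible variant that prints them as a bulleted list. The command name \printcopyleftitems is hypothetical and not part of the package in Listing 2; the sketch reuses \copyleftlist from that listing together with \forlistloop and \ifdefempty from etoolbox, and is meant to be placed in the preamble after \usepackage{copyleft}.

```latex
% Hypothetical alternative to \printcopyleft (not part of Listing 2).
% Place after \usepackage{copyleft}, which defines \copyleftlist and loads etoolbox.
\newcommand{\printcopyleftitems}{%
  \ifdefempty{\copyleftlist}{}{% print nothing if no copyleft material was embedded
    \section*{Credits}
    This manuscript embeds copyleft material
    from the following sources.
    \begin{itemize}
      \forlistloop{\item}{\copyleftlist}% one \item per recorded attribution
    \end{itemize}
  }%
}
```

The guard on an empty list is a design choice: an itemize environment without any \item raises a LaTeX error, so the sketch skips the whole section when nothing has been recorded.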

References

  • [1] Clarissa W. Atkinson. Mystic and pilgrim: the Book and the world of Margery Kempe. Cornell University Press, New York, 1983.
  • [2] Deborah Beck. [review of: Rainer Friedrich (2007) Formular economy in Homer: the poetics of the breaches]. Bryn Mawr Classical Review, October 2008. https://bmcr.brynmawr.edu/2008/2008.10.27/.
  • [3] Mary Carruthers. The Experience of Beauty in the Middle Ages. Oxford-Warburg Studies. Oxford University Press, Oxford, 2013.
  • [4] Mary Carruthers. The Craft of Thought: meditation, rhetoric, and the making of images, 400-1200. Cambridge University Press, New York, 1998.
  • [5] Mary Carruthers. The Book of Memory: a study of memory in medieval culture. Cambridge Studies in Medieval Literature. Cambridge University Press, New York, 2008.
  • [6] George B. Dantzig. Discrete-variable extremum problems. Operations Research, 5(2):266–277, 1957.
  • [7] Paul Feyerabend. Against Method: Outline of an Anarchistic Theory of Knowledge. New Left Books, New York, 1975.
  • [8] Thomas S. Kuhn. The Structure of Scientific Revolutions. University of Chicago Press, Chicago, 1970.
  • [9] N. Tartaglia. La Nova Scientia. S. da Sabio, Vinegia, 1537.
  • [10] Matteo Valleriani. Metallurgy, Ballistics and Epistemic Instruments. Max Planck research library for the history and development of knowledge. epubli GmbH, April 2013.
  • [11] D. Wootton. The Invention of Science: A New History of the Scientific Revolution. Penguin Books Limited, Allen Lane, 2015.