Publishing research results is an important way to disseminate new knowledge. “Standing on the shoulders of giants” is a vivid expression of the fact that new discoveries and innovations are usually built on prior work by others [9, 12]. Researchers thrive on the free exchange of information.
I-A Drawbacks and Limitations of Existing Publication Platforms of Publishers
To date, the most successful venues for academic paper publication are journals and magazines owned by large entrenched publishers, such as Nature Publishing Group, the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computing Machinery (ACM) and Elsevier of RELX Group. These publishers publish a huge number of research papers every year, and their journals and magazines are the platforms on which researchers exchange their latest results and where the latest research breakthroughs are announced. Despite their success, these publication platforms have significant drawbacks and limitations from the standpoint of the key players who matter most: authors, reviewers, and readers.
I-A1 Pay Wall
The power to publish, store and share academic literature is concentrated in the hands of a few dominant publishers. These publishers are for-profit outfits. They charge authors for publishing papers on their venues and charge readers for accessing the papers. Conferences organized by some publishers often charge exorbitant registration fees for conference attendance, and outrageous sums of money for page charge for pages that do not incur much additional cost on their electronic platforms. They get away with these exploitations because they can. They have built up their brands over the years.
But who helps them build and maintain their brands? They leverage the free service of editors and reviewers to maintain the quality of the publications. In most businesses, workers who do work receive compensation rather than the other way round. The publication business is an exception: publishers charge both the workers (the authors) and the customers (the readers), and receive free services from both (the authors and readers themselves often serve as the reviewers).
Their charges can be quite expensive to the extent that only large organizations, such as corporations, research institutions and universities, can afford the fees to access their publications. The pay wall put up by the publishers excludes small organizations and individuals from accessing the latest research publications. These publishers stand in the way of knowledge dissemination and the pay wall prevents a level playing field among researchers.
I-A2 Information Island
The authors are forced to transfer the copyrights of their papers to the publishers. The publishers typically do not mutually share their literature resources. This gives rise to information islands with unsynchronized contents. There are many intrinsic disadvantages associated with such isolated information islands. Readers and researchers lacking resources will have difficulty getting a complete set of past papers unless they subscribe to all these publishers. These islands are hurdles to knowledge dissemination.
I-A3 Disintegration of Peer Review Process
Peer reviews of papers should be performed by experts with the same level of competence as the authors of the papers in their particular field. The peer review process is crucial to maintaining paper quality. As a rule of thumb, journals and conferences with a rigorous peer review process and with a low paper acceptance rate are considered to be more prestigious by readers and authors.
With the growth of research participants, research papers are also growing exponentially. It is getting increasingly difficult to find quality reviewers to review the large number of papers. Competent reviewers are researchers themselves. As researchers, they need to balance their time between reviewing others’ papers and doing their own research. Unless these papers are directly related to their current research topics, they have little incentive to do the review, even if they have the technical expertise to do so. Paper review is a form of “technical auditing” as far as scientific papers are concerned.
When accountants perform financial auditing for corporations and organizations, they often charge a large sum of money for their service, and as such they are obligated to do a professional job that meets a certain minimum quality threshold. Otherwise, the accountants would not receive future jobs. When reviewers perform technical auditing, they receive zero compensation, and the quality of review varies widely from reviewer to reviewer. There are no incentives other than the conscience of the reviewers to meet a certain minimum quality target. Arguably, serious technical auditing can be a lot more time-consuming than financial auditing. Why should technical auditing be free? Are scientists worth less than accountants?
Without proper incentives, there is little reason for reviewers to spend time on paper review. As a result of the paper explosion, many reviews are quite shallow, even for prestigious venues such as IEEE. Many senior researchers (e.g., professors) may relegate the responsibility of paper review to junior novice researchers (e.g., their beginning graduate students), who at least have the incentive to review papers as part of their learning process; some of them probably have no choice because their superiors ask them to do the job. Where did the money (page charges, membership fees) go? Did any go to those responsible for quality assurance?
I-B Other Publication Platforms and Services
Literature search and citation index services, such as Web of Science and Google Scholar, can partially overcome the information island effect. Papers from multiple publishers can be listed and their citations can be indexed. Since these services do not actually publish papers, they cannot overcome the handicaps of the pay wall and peer review disintegration.
To destroy the pay wall of publishers, Free Open Access aims to make academic literature a free public resource on a global scale. For example, the arXiv preprint system allows authors to upload their papers for free access by all researchers. By 2014, more than 1 million articles had been uploaded to arXiv. The founder of arXiv, physicist Paul Ginsparg, won a 2002 MacArthur Fellowship for his contribution to Free Open Access. Although Free Open Access platforms allow everyone to access research outputs freely and easily, they still suffer from peer review disintegration. In fact, arXiv does not even have a peer review process. Low-quality papers abound on Free Open Access platforms. As of today, papers published on Free Open Access platforms do not earn the same prestige that they earn on the publication platforms of publishers.
All publication platforms today are centralized – they are owned or managed by a single organization. As a consequence, they are prone to single points of failure – there is no guarantee that the organization will never close the access to the database. The power to publish, store and share academic literature is concentrated in the hands of publishers and owners of open-access platforms.
I-C How PubChain Incentivizes Participants
Publication Chain (PubChain) aims to overcome the limitations of current publication platforms. PubChain is a decentralized publication platform on which authors, readers and reviewers are incentivized to participate in a meaningful and substantive manner. In particular, these key players can earn credits and rewards through self-motivated interactions. The assets of PubChain are owned by these key players, not by a separate profit-focused publisher. PubChain does not own the copyrights of the papers; the authors retain their copyrights. PubChain is not a central authority. The authors do not even need to grant permission to PubChain for PubChain to publish their papers. Papers posted by authors are automatically distributed to the distributed IPFS file sharing system and are registered in a decentralized blockchain.
In the following, we review the status quo of existing publication platforms from the standpoints of the incentives for the authors, the reviewers, and the readers. For this purpose, IEEE is taken as a representative of publisher platforms, and arXiv is taken as a representative of Free Open Access platforms.
Incentives for Authors:
Visibility – The most important motivation for authors is that their papers are downloaded and read by many. This is successfully achieved by IEEE already. It is also achieved to some extent by arXiv given its open access nature.
Prestige and Recognition by Peers – Well written papers with good results are recognized by peers. This is successfully achieved by IEEE already. As of today, the quality of papers in arXiv varies widely because of the lack of a review process. Having an arXiv paper by itself does not command recognition by peers.
Time Stamping – Claiming to be the first to do something. This is achieved by IEEE to some extent; however, the time stamps are not immediate. Time stamping is more immediate with arXiv.
Low Cost – This is not achieved by IEEE. Publishing in IEEE venues is costly. This is achieved by arXiv.
Continuous Improvement of Publications – Authors can submit revised versions based on the feedback and reviews on the platform. If this can be achieved, research publications, like software, will have a life of their own in that they can be continuously improved. Papers published in IEEE go through a few reviewers only, and once accepted and published, the papers are permanent. arXiv allows authors to submit new versions of the same paper. However, there is a lack of reviewer feedback that would add quality to the new versions.
Financial Incentive – To most authors, making money from publications probably ranks low as an incentive. That said, as far as we know, IEEE (and other publishers) does not pay authors of significant papers that add much prestige to their journals and magazines. Occasionally, prize paper awards come with only a small token amount of money as a goodwill gesture. There are no financial incentive schemes on open-access platforms either.
Incentives for Reviewers:
Reward – Good reviews should be rewarded financially or rewarded by other means. Paper review is an “auditing” process. In other economic endeavors, for the audit of the financial health of a company, or the expenditure of an R&D project, the auditing company typically charges a large sum for the effort. Why should the efforts of paper reviewers be free especially if most reviewers do not gain recognition from the efforts? Technical people have been exploited to a large extent in that regard. IEEE certainly does not provide strong incentives for reviewers to do a good job. Reviewers are not participants on the arXiv platform.
Incentives for Readers:
Good and Relevant Papers – Readers, who are often researchers themselves, want to find good and relevant papers quickly. This is achieved by IEEE and arXiv.
Interactions with Authors and Reviewers – Readers can obtain answers from the authors directly on the platform. Each paper could also have an FAQ managed by the author, but with contributions from other readers and reviewers. There are very few open interactions and debates among readers, authors, and reviewers on IEEE and arXiv.
PubChain is designed to achieve the above incentives for all stakeholders through an incentivized ecosystem.
II Solution of PubChain
II-A Design Concept
A central design concept of PubChain is to use blockchain [13, 8] and off-chain peer-to-peer distributed file storage (i.e., the InterPlanetary File System (IPFS)) as building blocks to decentralize the publication platform. Such decentralization also means that there is no single central party that controls the running of the platform. If properly designed, the decentralized system can also be more robust than a centralized system given its replication of data across multiple parties.
PubChain uses the IPFS system as the database system for storing papers. IPFS is a distributed and decentralized storage system consisting of a network of peer-to-peer nodes. The techniques and features of IPFS are described in the IPFS literature. With IPFS, papers are content-addressed in the database. Authors can back up their papers to the network and freely download papers without the risk of a single point of failure. The IPFS repository is physically owned by all users and not by a single entity.
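To make the idea of content addressing concrete, the following is a minimal Python sketch, not the IPFS implementation: a toy in-memory store whose keys are the SHA-256 hashes of the stored bytes. Real IPFS uses multihash-based content identifiers and a peer-to-peer network; the class name `ContentStore` and the example bytes are assumptions made purely for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: each blob is keyed by the hash of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, not from a location.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # A reader can re-hash the bytes to check they match the requested address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
paper_address = store.put(b"%PDF-1.7 example paper bytes")
```

Because the address is a function of the content, storing the same paper twice yields the same address, and any node holding the bytes can serve them without being trusted.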
PubChain exploits blockchain technology to confirm the registration of paper ownership, to track and index citations, and to incentivize participants. Blockchain is a distributed and decentralized append-only ledger for digital assets. Data in a blockchain is replicated and shared among all the participants. Past records are made tamper-resistant through its append-only paradigm. There are many successful existing blockchain systems, e.g., Bitcoin and Ethereum. We can reuse and modify their software code to build the blockchain of PubChain.
The operation of the PubChain blockchain is divided into two consecutive phases, with the first phase being a temporary phase before the final second phase takes over. In the first phase, PubChain operates as a consortium blockchain using the Proof-of-Authority (PoA) consensus protocol. In the second phase, PubChain operates as a public blockchain using the Proof-of-Work (PoW) consensus protocol.
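The tamper resistance of an append-only ledger can be illustrated with a small hash chain. The Python sketch below is a simplification under stated assumptions: it has no consensus, signatures, or networking, and the function names (`block_hash`, `append_block`, `is_valid`) and record strings are hypothetical.

```python
import hashlib
import json


def block_hash(index, prev_hash, records):
    """Hash a block's contents together with the hash of its predecessor."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "records": records}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain, records):
    """Append a new block that commits to the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "records": records,
        "hash": block_hash(index, prev_hash, records),
    })


def is_valid(chain):
    """Recompute every hash; any edit to a past record breaks the chain."""
    prev_hash = "0" * 64
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["prev"], block["records"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, ["register paper A"])
append_block(ledger, ["register paper B", "review of paper A"])
```

Changing a record in an earlier block changes that block's hash, so every later block no longer links to it and `is_valid` returns `False`.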
Fig. 1 gives an overview of the PubChain platform. There are three entities in the platform: a group of publication players, a blockchain system sustained by miners, and an IPFS system with distributed storage nodes. The blockchain system and the IPFS system are the infrastructure of PubChain. A network node can be a blockchain miner, an IPFS storage node, or both. Blockchain miners run the distributed consensus protocol to maintain the data on the blockchain. In the consortium blockchain phase, the miners are the super nodes that are selected to run the PoA protocol. In the public blockchain phase, the miners are the nodes that devote their computing power to solving the hash puzzles of the blockchain. IPFS storage nodes share their storage space for the distributed and persistent storage of PubChain. Through a PubChain interface, the publication players (i.e., authors, reviewers, and readers) interact with the blockchain and IPFS systems in the conduct of their activities on PubChain. We have developed a PubChain system that combines blockchain and IPFS. We will describe the system architecture of PubChain in Section III.
When an author uploads his/her paper to PubChain, the paper is time-stamped and registered on PubChain as a permanent record. The citation index for every paper is also tracked on PubChain. Tokens are used to financially incentivize players to engage in publication activities on PubChain and to incentivize miners to sustain and maintain PubChain. We will elaborate on our proposed incentive mechanism in Section IV.
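The hash puzzle solved by miners in the public blockchain phase can be sketched as a search for a nonce whose hash falls below a target. The Python example below is a minimal illustration, assuming a single SHA-256 over a header and an 8-byte nonce; Bitcoin itself uses double SHA-256 over a structured header, and the header bytes and difficulty chosen here are hypothetical.

```python
import hashlib


def puzzle_hash(header: bytes, nonce: int) -> int:
    """Interpret SHA-256(header || nonce) as a 256-bit integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, difficulty_bits: int) -> int:
    """Try nonces in order until the hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while puzzle_hash(header, nonce) >= target:
        nonce += 1
    return nonce


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification costs one hash, while mining costs many on average."""
    return puzzle_hash(header, nonce) < (1 << (256 - difficulty_bits))


# A low difficulty keeps the example fast (about 2**12 hashes on average).
example_header = b"pubchain-example-block-header"
example_nonce = mine(example_header, 12)
```

The asymmetry between the cost of `mine` and the cost of `verify` is what lets every node cheaply check work that was expensive to produce.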
The tokens issued by PubChain are called PubCoins. PubChain is a non-profit project and we will not sell the issued PubCoins through initial coin offering (ICO) and private placements to any other entity to make money. PubCoins will be distributed to all the participants as the rewards for their contributions to the platform, rather than to the project team or other organizations.
To endow PubCoin with real monetary value, we design PubChain as a side chain of another parent chain whose tokens are in wide circulation and are considered to have real monetary value, e.g., Bitcoin, Ethereum, or Bitcoin Cash. Using the two-way pegging technique of side chains, we can transfer the tokens on the parent chain to PubChain and vice versa. This concept is illustrated in Fig. 2. The technical details of two-way pegging and side chains are given in the side-chain literature. At the beginning stage, PubChain operates separately from the parent chain, and PubCoin has no real monetary value. Donations to PubChain can be injected into PubChain from a parent chain using two-way pegging, and PubChain will operate as a side chain after that. We discuss the details of the financial model of PubChain in Section II-B.
II-B Financial Model
With the crypto-currencies provided by blockchain systems, PubChain aims to establish the following financial model for the world of publications.
A certain amount of PubChain tokens (PubCoins) are issued to the participants on PubChain. Corresponding to the two phases of the blockchain system, the establishment of the value of PubCoins is also divided into the following two phases:
Phase I (Consortium blockchain phase): In this phase, as a bootstrap incentive scheme, a certain amount of new PubCoins is issued to each PubChain user when he/she first registers as a user. To endow PubCoins with real monetary value, we adopt the two-way pegging technique of side chains to transfer the value of other cryptocurrencies (that already have real prices on the market) to PubChain. On one parent chain, we lock a certain amount of the cryptocurrency tokens to a special address, and we also send the simplified payment verification (SPV) proof of this token-locking transaction to PubChain. The cryptocurrency tokens owned by the special address cannot be transferred to other addresses by spending: these cryptocurrency tokens are simply a “reserve” to endow PubChain tokens with real monetary value. On PubChain, the miners will package the transaction sent from the parent chain into a block for broadcast to the whole PubChain network. Then, a block on PubChain issues a number of PubCoins to the users of PubChain (with two-way pegging, these issued PubCoins can be sent to the parent chain to unlock the locked cryptocurrencies on the parent chain). In this manner, the issued PubCoins are linked to the locked cryptocurrencies on the parent chains; the value of the PubCoins is endorsed and determined by the total amount of the locked tokens.
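The bookkeeping behind Phase I can be sketched as a simple reserve model. The Python example below is only an accounting illustration under loud assumptions: it ignores SPV proofs, confirmation delays and all on-chain mechanics, and it assumes that redemption is pro rata against the locked reserve, which the text does not specify. The class `PegState` and its methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PegState:
    """Tracks parent-chain tokens locked as reserve and PubCoins issued against them."""

    locked_parent_tokens: float = 0.0
    issued_pubcoins: float = 0.0

    def lock_and_issue(self, parent_tokens: float, pubcoins: float) -> None:
        # Parent-chain tokens are locked; PubCoins are issued on the side chain.
        self.locked_parent_tokens += parent_tokens
        self.issued_pubcoins += pubcoins

    def backing_per_pubcoin(self) -> float:
        # Implied reserve behind each PubCoin (zero before anything is issued).
        if self.issued_pubcoins == 0:
            return 0.0
        return self.locked_parent_tokens / self.issued_pubcoins

    def redeem(self, pubcoins: float) -> float:
        # ASSUMPTION: redemption unlocks a pro-rata share of the reserve.
        if pubcoins > self.issued_pubcoins:
            raise ValueError("cannot redeem more PubCoins than were issued")
        unlocked = pubcoins * self.backing_per_pubcoin()
        self.issued_pubcoins -= pubcoins
        self.locked_parent_tokens -= unlocked
        return unlocked


peg = PegState()
peg.lock_and_issue(parent_tokens=10.0, pubcoins=1000.0)
```

Under this assumption the backing per PubCoin is unchanged by redemption, which mirrors the statement that the value of the PubCoins is determined by the total amount of locked tokens.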
Phase II (Public chain phase): In this phase, a certain amount of PubCoins are issued in each block. These issued tokens will be given as rewards to the miner as well as to the authors and reviewers that contribute to PubChain. How the rewards are distributed among the players will be explained later. The amount of the rewarded PubCoins in each block is constant and does not vary from block to block. This means that the total amount of tokens issued increases over time and is unlimited. No other cryptocurrency is transferred to PubChain anymore in Phase II. PubChain is operated as a decentralized central bank that constantly issues new tokens to adapt to the expansion of the whole economy on the platform.
Fig. 3 illustrates the above two-phase financial model of the PubChain system. A few remarks are as follows:
There is an important difference between the cryptocurrency endorsement mechanism in Phase I of PubCoin and the ICO activities of other projects. Using the two-way pegging technique, the PubChain cryptocurrency endorsement can guarantee that the cryptocurrencies transferred to PubChain belong to the entire PubChain network rather than to a single entity (i.e., they are not controlled by one entity); nobody can embezzle these locked cryptocurrencies and spend them. For an ICO, there is a high risk that the institution or individual controlling the project will abscond with the raised cryptocurrencies.
Phase I is similar to the Bretton Woods system (the system for monetary and exchange rate management established in 1944), in which the US dollar was linked to gold (all involved countries confirmed the official price of 35 US dollars per ounce of gold in July 1944) and the currencies of other countries maintained fixed exchange rates with the US dollar. Under the Bretton Woods system, the credit and the value of the US dollar were supported by gold.
The purpose of Phase I is to inject real monetary value into PubCoins to incentivize participants to conduct publication activities over PubChain. This phase is very important for cold-starting PubChain. As more active participants join PubChain and as the value of PubChain to publication players is demonstrated, more participants can be brought in, whether they are incentivized by money or by the value of PubChain as a publication venue. At some point, a vibrant ecosystem will be established, and there will be no need to inject the value of other cryptocurrencies into PubChain. PubChain then operates normally with its financial value tied to its use value: PubChain then enters Phase II of its operating model.
In Phase II, the amount of rewarded PubCoins in every block stays constant, which means that the total amount of tokens issued increases over time and is unlimited. This is different from the token issuing mechanism of Bitcoin, which limits its total amount of tokens. This implies that the underlying financial models of PubCoin and Bitcoin are different. Bitcoin is more like gold, and its value comes from its scarcity; PubCoin is more like a currency, and the issuer continues to issue this currency to adapt to the gradual expansion of the entire economy: if more work is done (in this case, more publications, more reviews, more reader participation), more currency will be issued to support the increased scale of the economy. Phase II is similar to the current US dollar system after the collapse of the Bretton Woods system in the early 1970s.
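The contrast between the two issuance schedules can be made concrete with a short calculation. The Python sketch below compares a constant per-block reward with a Bitcoin-style halving schedule (an initial reward of 50 coins halved every 210,000 blocks). The PubCoin reward per block is not given in the text, so the value used in the usage line is a purely hypothetical placeholder, and the halving model uses floating-point rewards rather than Bitcoin's integer satoshi arithmetic.

```python
def constant_supply(blocks: int, reward_per_block: float) -> float:
    """Total issuance under a constant block reward: grows without bound."""
    return blocks * reward_per_block


def halving_supply(blocks: int, initial_reward: float = 50.0,
                   halving_interval: int = 210_000) -> float:
    """Total issuance when the block reward halves every `halving_interval` blocks."""
    total = 0.0
    reward = initial_reward
    remaining = blocks
    while remaining > 0 and reward > 0:
        step = min(remaining, halving_interval)
        total += step * reward
        remaining -= step
        reward /= 2  # geometric decay bounds the total supply
    return total


# Hypothetical reward of 10 PubCoins per block, for illustration only.
pubcoin_after_1m_blocks = constant_supply(1_000_000, 10.0)
bitcoin_like_after_1m_blocks = halving_supply(1_000_000)
```

The halving schedule converges to a finite cap (twice the issuance of the first interval), whereas the constant schedule keeps growing linearly with the number of blocks.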
III System Architecture of PubChain
Recently, blockchain-based decentralized data storing and sharing networks were investigated in [20, 3, 19, 14, 10]. PubChain has many technological similarities with these existing blockchain-based decentralized data systems. One major difference of PubChain, however, is that the data stored and shared on PubChain are very specific, i.e., papers. This difference adds a set of special technical requirements to the design and implementation of PubChain.
PubChain has the following technical requirements: 1) Any node should be able to upload and share papers on the platform. 2) The platform should provide scalable data transmission and storage capabilities. 3) Nodes should be incentivized to upload papers to PubChain to derive benefits from the uploaded papers. 4) The platform should be able to identify and evaluate the quality of papers. 5) The platform does not keep ownership of papers.
In order to realize these technical requirements, PubChain is built on a completely decentralized system architecture. As shown in Fig. 4, the system architecture of PubChain consists of four layers: the blockchain layer, the virtual machine layer, the routing layer and the storage layer. Following a previously proposed design principle, this architecture decouples the control plane (consisting of the blockchain and virtual machine layers) from the data plane (consisting of the routing and storage layers). We describe the functions of the four layers in the following.
III-A Blockchain Layer
The bottom layer is the blockchain layer. The Bitcoin-like blockchain system is a tamper-proof distributed ledger that is suitable for recording small data but not suitable for processing big data. For the design of PubChain, the blockchain systems are exploited to record the metadata of papers; the blockchain systems also record the operation commands sent by the nodes and enable consensus on the execution order of operations. In a nutshell, the blockchain layer realizes a global state recorder for the PubChain platform.
III-B Virtual Machine Layer
Above the blockchain layer is the virtual machine layer. The API provided by the script languages on the Bitcoin-like blockchain systems is not Turing complete. The functionalities provided by them are very limited. For example, the Bitcoin blockchain can only perform simple operations such as issuing and recording transactions. For PubChain, the logic functionalities reside in the more versatile virtual machine layer. We can define new operations in the virtual machine layer without changing the underlying blockchain layer. The virtual machine layer reads the recorded metadata of papers and the operation commands from the blockchain layer and executes these operations accordingly.
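The division of labor between the two layers can be sketched as a replay of an ordered operation log: the blockchain layer fixes the order of operations, and the virtual machine layer interprets them to derive the platform state. The Python example below is a minimal sketch; the operation names (`publish`, `review`) and record fields are assumptions for illustration and are not PubChain's actual operation set.

```python
def apply_operation(state: dict, op: dict) -> None:
    """Interpret one operation read from the blockchain layer."""
    kind = op.get("op")
    if kind == "publish":
        # Register the paper's metadata under its identifier.
        state["papers"][op["paper_id"]] = {"author": op["sender"], "scores": []}
    elif kind == "review":
        paper = state["papers"].get(op["paper_id"])
        if paper is not None:  # reviews of unknown papers are ignored
            paper["scores"].append(op["score"])
    # Unknown operations are skipped, so new ones can be added later
    # without changing how the underlying ledger stores them.


def replay(operations: list) -> dict:
    """Derive the platform state by replaying operations in their agreed order."""
    state = {"papers": {}}
    for op in operations:
        apply_operation(state, op)
    return state


example_log = [
    {"op": "publish", "sender": "addr_alice", "paper_id": "tx01"},
    {"op": "review", "sender": "addr_bob", "paper_id": "tx01", "score": 8},
    {"op": "review", "sender": "addr_carol", "paper_id": "tx99", "score": 3},
]
example_state = replay(example_log)
```

Since every node replays the same log in the same order, all nodes arrive at the same state without the ledger itself needing to understand the operations.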
Our current reference implementation of PubChain uses the Ethereum Virtual Machine (EVM) smart contract mechanism in its virtual machine layer. Since the EVM smart contract is Turing complete, we can realize the functionalities required by the PubChain platform and easily incorporate other new functionalities as they arise. Another choice for the virtual machine layer is the virtual chain mechanism proposed in prior work. We can also realize the logic of PubChain by predefining the required operations using the technology of virtual chain. Although virtual chain is not Turing complete, it is a lightweight design whose advantages include better reliability, security and performance.
III-C Routing Layer
Above the virtual machine layer is the routing layer. The main function of the routing layer is to allow the virtual machine layer to obtain the addresses of papers in the storage system. PubChain separates routing requests (i.e., how to locate papers) from the actual storage of papers. This avoids the need for PubChain to adopt a specific storage backend, allowing the coexistence of multiple forms of storage backends, including centralized database, commercial cloud service, and peer-to-peer distributed file sharing systems (e.g., IPFS).
In the implementation of PubChain, a subset of publication player nodes, blockchain miner nodes, and IPFS storage nodes forms a peer-to-peer network based on a distributed hash table (DHT) for storing the routing files. When the virtual machine layer sends an address resolution request to the routing layer, the routing layer first uses the DHT mechanism to look up the routing file that corresponds to the target hash value of the paper sent by the virtual machine layer. With the routing file, we can then get the URLs of the specific storage locations (such as cloud service: https://, IPFS storage: ipfs://). A data request is then initiated to the corresponding storage system via the URL.
Footnote 1: One paper can be stored in multiple storage systems to ensure its availability. The IPFS system is always used to store papers to ensure free open access. If the URL of a cloud service is used, the access to the paper is via a central server. If the URL of IPFS is used, the access to the paper is via the DHT embedded in the IPFS system.
III-D Storage Layer
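The resolution flow described above (paper hash, then routing file, then storage URL, then data request) can be sketched in a few lines of Python. In this sketch a plain dictionary stands in for the DHT, and the backend handlers are supplied by the caller; the paper hash, the example.org URL, and the function names are all hypothetical.

```python
from typing import Callable, Dict, List

# Stand-in for the DHT: maps a paper hash to its routing file (a list of URLs).
routing_table: Dict[str, List[str]] = {
    "abc123": ["https://papers.example.org/abc123.pdf", "ipfs://abc123"],
}


def resolve(paper_hash: str) -> List[str]:
    """Look up the routing file for a paper; empty if the paper is unknown."""
    return routing_table.get(paper_hash, [])


def fetch(paper_hash: str, backends: Dict[str, Callable[[str], bytes]]) -> bytes:
    """Try each storage location in turn until one backend returns the data."""
    for url in resolve(paper_hash):
        scheme = url.split("://", 1)[0]
        handler = backends.get(scheme)
        if handler is None:
            continue  # no client available for this kind of storage
        try:
            return handler(url)
        except OSError:
            continue  # this location is unavailable; fall back to the next one
    raise LookupError(f"no reachable storage location for {paper_hash}")
```

Keeping resolution separate from retrieval is what allows several storage backends to coexist: adding a new backend only requires a new URL scheme and handler, not a change to the lookup logic.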
The top layer of PubChain is the storage layer, which stores the actual data of papers. Each stored paper is signed with the owner’s private key to claim its ownership. Because papers are stored outside of the blockchain, PubChain can support the storage of papers of any size and can use a variety of storage backends. Nodes do not need to trust the storage systems because they can verify the integrity of a downloaded paper in the virtual machine layer. Multiple forms of data storage can be mounted to the storage layer. For example, some large institutions (such as universities and companies) can set up their own centralized databases to back up the papers on PubChain for their own use. However, the decentralized peer-to-peer IPFS storage system is always included to ensure the accessibility of the papers.
IV Incentive Mechanisms of PubChain
Papers on PubChain will attract a large readership only if it is a reputable publication platform. Authors and reviewers contribute toward making PubChain a quality publication platform by submitting high-quality papers and reviews. In that light, effective incentive mechanisms to encourage substantive and meaningful participation by authors and reviewers are a core part of PubChain. In this section, we describe the incentive mechanisms of PubChain.
Footnote 2: We focus on the incentive mechanism for publication players here. The incentives for blockchain miners are the mining rewards and transaction fees. Filecoins are the incentive for IPFS storage nodes that share their storage space.
IV-A Incentive Mechanism to Authors
In PubChain, an author submits her/his paper via a transaction. The transaction includes meta-information associated with the submitted paper, such as the author’s address on the blockchain, the paper’s IPFS address, title, keywords, and the transaction hashes of papers cited by the paper.
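The meta-information carried by a submission transaction can be pictured as a small record. The Python sketch below is an assumption-laden illustration: the field names, the JSON serialization, and the use of SHA-256 for the transaction hash are hypothetical choices, not PubChain's actual transaction format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class PaperSubmission:
    """Meta-information of a paper submission (illustrative field names)."""

    author_address: str               # author's address on the blockchain
    ipfs_address: str                 # content address of the paper on IPFS
    title: str
    keywords: Tuple[str, ...]
    cited_tx_hashes: Tuple[str, ...] = ()  # hashes of the cited papers' transactions

    def tx_hash(self) -> str:
        # Deterministic serialization so every node derives the same hash.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()


first = PaperSubmission(
    author_address="addr_alice",
    ipfs_address="ipfs://example-paper-1",
    title="An Example Paper",
    keywords=("blockchain", "publication"),
)
second = PaperSubmission(
    author_address="addr_bob",
    ipfs_address="ipfs://example-paper-2",
    title="A Follow-up Paper",
    keywords=("incentives",),
    cited_tx_hashes=(first.tx_hash(),),
)
```

Referring to cited papers by transaction hash, as in `second`, is what lets citations be tracked on-chain without a central index.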
The author pays a fixed number of PubCoins in the transaction of each submitted paper. A fraction of these tokens is allocated to a reviewer bonus pool. The tokens in the reviewer bonus pool are distributed to reviewers according to the mechanism presented in Section IV-B. Another fraction of the tokens is given to the papers cited by the submitted paper. The remaining tokens are taken as the transaction fee given to the miner that records the transaction onto the blockchain.
Footnote 3: Only authors of cited papers having a registered account with PubChain will be rewarded. Each paper is identified by the hash of the transaction that registered the paper on the blockchain.
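The three-way split of the submission payment can be sketched as follows. This Python example is a sketch under stated assumptions: the parameter names are hypothetical, the citation share is assumed to be divided equally among the registered cited papers, and when no cited paper is registered the citation share is assumed to stay with the miner. None of these choices is confirmed by the text.

```python
from typing import Dict, List, Tuple


def split_submission_fee(
    fee: float,
    reviewer_fraction: float,
    citation_fraction: float,
    cited_papers: List[str],
) -> Tuple[float, Dict[str, float], float]:
    """Split a submission fee into (reviewer pool share, per-citation shares, miner fee)."""
    if reviewer_fraction < 0 or citation_fraction < 0:
        raise ValueError("fractions must be non-negative")
    if reviewer_fraction + citation_fraction > 1:
        raise ValueError("fractions must sum to at most 1")

    to_reviewer_pool = fee * reviewer_fraction
    # ASSUMPTION: equal split among registered cited papers.
    citation_total = fee * citation_fraction if cited_papers else 0.0
    per_cited = {paper: citation_total / len(cited_papers) for paper in cited_papers}
    # Whatever is left is the transaction fee for the miner.
    to_miner = fee - to_reviewer_pool - citation_total
    return to_reviewer_pool, per_cited, to_miner
```

For example, `split_submission_fee(100.0, 0.5, 0.25, ["txA", "txB"])` sends 50 tokens to the reviewer bonus pool, 12.5 tokens to each cited paper, and 25 tokens to the miner.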
In the public-chain phase of PubChain, every mined block contains a coinbase transaction that mints new PubCoin tokens. Of these minted tokens, one fraction is released to the reviewer bonus pool, another fraction is released to authors as rewards according to the following reward distribution mechanism, and the remaining tokens are released to the miner of that block.
To incentivize authors to submit quality papers, rewards are distributed to authors according to the review scores of their papers. Specifically, when a new block is mined, the author of each paper receives a reward of PubCoins from the new block. The reward is determined by the current review score of the paper (the computation of the current review score will be presented in Section V-A), by a quality threshold for papers, and by a summation over all papers that were published on PubChain during a fixed number of previous block intervals. In other words, a paper is rewarded within a reward window of a fixed number of blocks after it is published on PubChain.
The review score of a paper is initialized to zero, and thus its reward is also zero initially. With the quality threshold, posting and overwhelming PubChain with low-quality papers whose review scores do not pick up over time will not be rewarding. Furthermore, since PubCoins are charged for each posted paper, there is a disincentive for authors to submit low-quality papers.
Besides receiving rewards from good reviews, a paper can also receive rewards when it is cited by another paper. Specifically, if a paper cites other papers that are also posted on PubChain, a fraction of the tokens paid by the citing paper will be given to the authors of the cited papers, with each cited paper receiving a share of these tokens. In this way, if a paper has a long-lasting influence on other papers, it may continue to receive rewards through the citation mechanism (i.e., citation rewards may continue long after the review reward window has passed).
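One way to realize the properties described above (zero reward at a zero score, no reward below the threshold, and normalization over the papers in the reward window) is a proportional-share rule on the score in excess of the threshold. The Python sketch below implements that rule as an assumption; it is not claimed to be the paper's exact formula, and the names `author_rewards`, `threshold`, and `author_budget` are hypothetical.

```python
from typing import Dict


def author_rewards(
    scores: Dict[str, float],
    threshold: float,
    author_budget: float,
) -> Dict[str, float]:
    """Share the per-block author budget among papers in the reward window.

    `scores` maps each paper in the window to its current review score.
    ASSUMPTION: a paper's share is proportional to its score above the threshold.
    """
    excess = {paper: max(score - threshold, 0.0) for paper, score in scores.items()}
    total_excess = sum(excess.values())
    if total_excess == 0:
        # No paper clears the threshold, so nothing is paid out this block.
        return {paper: 0.0 for paper in scores}
    return {paper: author_budget * e / total_excess for paper, e in excess.items()}
```

For instance, with scores of 8, 6 and 0, a threshold of 4 and a budget of 12 tokens, the three papers receive 8, 4 and 0 tokens respectively, so a newly posted paper with no reviews earns nothing until its score rises above the threshold.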
IV-B Incentive Mechanism to Reviewers
In PubChain, a reviewer submits the review of a paper by sending a transaction to the blockchain. The transaction includes the hash of the transaction that published the paper, the numerical score given to the paper by the reviewer, and the blockchain address of the reviewer. In addition to the numerical score, reviewers can also write comments on papers. Insightful comments are useful to the authors for improving future versions of their papers; they also help readers identify high-quality papers. The comments on papers are stored on IPFS, and their IPFS addresses are included in the transaction that is sent to PubChain for record keeping.
We denote the review of paper by reviewer by where is the numerical score and is the comments. PubChain treats the comments by reviewers as some sort of a “special paper” that are reviewed by readers – paper reviews are also reviewed, but with a numerical score only. The score of review depends on its review numerical scores given by readers. Readers will not give high scores to a paper review with only a numerical score without insightful comments. Review of paper receives a reward of PubCoins computed as follows:
where is the current average score of comment , is the total reward in the review bonus pool during the current block interval (see footnote 4), is a ratio () that governs how much of the bonus in the current pool is distributed to reviewers during this block interval, and the summation of and is over the comments recorded onto PubChain during the previous blocks. The bonus not used in the current block, , will be kept in the pool for release in subsequent blocks.
Footnote 4: In the consortium-chain phase of PubChain, , where is the total reward in the review bonus pool during the last block interval, is the number of the published papers and is the total amount of the tokens paid by the published papers during this block interval. In the public-chain phase of PubChain, , where is the amount of tokens released to the review bonus pool by the current block interval.
To incentivize miners to include review transactions in their blocks, a fraction of the reward obtained by a review (i.e., tokens) is released to the miner who included this review transaction in its block. Therefore, during each block interval, tokens from the review bonus pool are released to miners who included review transactions associated with all reviews in the past blocks.
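The reward flow described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: the released part of the pool is shared among reviews in proportion to their current average scores, and the miner's cut is deducted from each review's reward. The function and parameter names (`release_ratio`, `miner_fraction`) are hypothetical and do not come from the paper.

```python
def distribute_review_rewards(avg_scores, pool, release_ratio, miner_fraction):
    """Split part of the review bonus pool among reviews (assumed proportional rule).

    avg_scores:     dict mapping a review id to its current average reader score
    pool:           total tokens in the review bonus pool for this block interval
    release_ratio:  share of the pool released in this interval, between 0 and 1
    miner_fraction: share of each review's reward passed on to the including miner
    Returns (reviewer_rewards, miner_total, remaining_pool).
    """
    if not 0.0 <= release_ratio <= 1.0:
        raise ValueError("release_ratio must lie in [0, 1]")
    if not 0.0 <= miner_fraction <= 1.0:
        raise ValueError("miner_fraction must lie in [0, 1]")

    total_score = sum(avg_scores.values())
    if total_score <= 0:
        # Nothing to reward: the whole pool carries over to later blocks.
        return {}, 0.0, pool

    released = release_ratio * pool
    reviewer_rewards = {}
    miner_total = 0.0
    for review_id, score in avg_scores.items():
        gross = released * score / total_score   # proportional share of the release
        miner_cut = miner_fraction * gross       # part handed to the miner
        reviewer_rewards[review_id] = gross - miner_cut
        miner_total += miner_cut

    # The unreleased part stays in the pool for subsequent blocks.
    return reviewer_rewards, miner_total, pool - released
```

For instance, with a pool of 100 tokens, a release ratio of 0.5, and a miner fraction of 0.1, two reviews with average scores 3 and 1 would receive 33.75 and 11.25 tokens, miners would receive 5 tokens in total, and 50 tokens would remain in the pool.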
Fig. 5 illustrates the token flows associated with our incentive mechanism. The incentive mechanism relies on the scores that can objectively reflect the qualities of papers and reviews. The next section will present a decentralized scoring system that can prevent malicious nodes from tampering with scores.
V Decentralized Scoring System of PubChain
The financial rewards of PubChain are issued to authors and reviewers according to the scores of their papers and their reviews. To earn more rewards, malicious nodes may deliberately give scores that deviate from the true quality of papers and reviews. Therefore, a decentralized scoring system that can ensure objective scores in the presence of malicious nodes is very important. In this section, we first propose a decentralized scoring system to compute the scores of papers and reviews. We then perform simulations to investigate the integrity of the proposed decentralized scoring system.
V-A Decentralized Scoring System
The effective score of review is computed by averaging readers’ scores on . If review has received scores from fewer than readers, its effective score is fixed to ; otherwise, the effective score of review is obtained by excluding the highest and lowest 10% of readers’ scores and then averaging the remaining scores. To avoid conflicts of interest, a participant who has submitted a review of a paper cannot score the other reviews of the same paper as a reader. In addition, to avoid score flooding, a reader can at most score review comments of the same paper (see footnote 5). The effective score of review serves two purposes. First, it is used in the reviewer reward formula of Section IV-B to incentivize reviewers to perform high-quality reviews. Second, it is used to compute the effective score of paper .
Footnote 5: For implementation, we need a way to identify participants on PubChain and associate each participant ID with a unique address on the blockchain. To achieve this, we can use the affiliation emails or ORCID IDs of the participants as their IDs on PubChain.
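The trimmed averaging rule above can be illustrated with a short Python sketch. It assumes that 10% of the scores are dropped from each end of the sorted list; the minimum reader count and the default score are passed in as parameters (`min_readers`, `default_score`), since their concrete values are not assumed here.

```python
def effective_review_score(reader_scores, min_readers, default_score, trim_fraction=0.10):
    """Trimmed mean of reader scores for one review (a sketch, not the prototype's code).

    Returns default_score when fewer than min_readers scores are available.
    Otherwise drops the lowest and highest trim_fraction of the scores
    (rounded down, at each end) and averages what is left.
    """
    n = len(reader_scores)
    if n == 0 or n < min_readers:
        return default_score

    k = int(n * trim_fraction)       # number of scores to drop at each end
    ordered = sorted(reader_scores)
    kept = ordered[k:n - k]          # non-empty for any trim_fraction below 0.5
    return sum(kept) / len(kept)
```

Trimming both tails limits the influence of a small group of readers who submit extreme scores, which is the purpose the text attributes to this step.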
We employ the review results of paper , encoded in the form of to compute the review score of paper . First, we normalize the scores of review given by the readers as:
for all . The normalized score takes value between 0 and 1. Then, we compute as a weighted sum of scores given by reviewers to paper using the normalized scores as their weights:
The computed is an evaluation on the quality of paper and is used to reward the author by the reward distribution mechanism. In essence, the effective score made by readers to review reflects the quality of that review and is an indication of the extent to which readers agree with the score by reviewer on paper .
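The weighting step can be sketched in Python as follows. The normalization shown here, dividing each review's effective score by the sum of the effective scores of all reviews of the paper, is an assumption chosen so that the weights lie between 0 and 1; the function name and the input layout are likewise illustrative.

```python
def weighted_paper_score(reviews):
    """Combine reviewer scores into one paper score (assumed normalization).

    reviews: list of (reviewer_score, effective_review_score) pairs, where the
             second entry is the reader-assigned effective score of that review.
    Each review's weight is its effective score divided by the sum of all
    effective scores, so the weights are non-negative and sum to 1.
    """
    total_weight = sum(effective for _, effective in reviews)
    if total_weight <= 0:
        return 0.0  # no usable review weights yet
    return sum(score * effective / total_weight for score, effective in reviews)
```

A review that readers rate poorly therefore contributes little to the paper's score, even if the reviewer assigned the paper an extreme value.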
V-B Simulation Investigations
We next present simulation results to validate that our proposed decentralized scoring system can ensure fair reviews of papers, even in the presence of adversarial reviewers with a biased interest.
Consider one poor-quality paper, paper with a ground-truth score of . The author of this paper is an attacker who wants to gain more rewards by controlling a set of malicious nodes that pose as reviewers and readers so that the paper can obtain a much higher score on the platform. We assume the scores for a paper given by honest reviewers are Gaussian distributed with mean and variance. The scores given by honest readers to a particular review of paper are Gaussian distributed with mean and variance , where is the mean score for a “perfect review” that assigns a score equal to the ground-truth score (i.e., ).
We consider two strategies for the attacker. The first strategy is to have all malicious nodes serve as reviewers of the paper. All malicious nodes will give a high review score , for all , where is the set of the malicious nodes controlled by the attacker. In our simulations, we assume there are a total of 1000 review scores given by reviewers to paper , among which scores are given by the malicious nodes. We assume each review is scored by honest readers. Then, the effective score of review is obtained by first excluding the highest and lowest 10% of the readers’ scores and then averaging the remaining scores. Finally, we compute the final score of paper according to (3) and (4). The results are shown in Fig. 6 and Fig. 7, where the final scores are evaluated with respect to different numbers of malicious nodes . We treat the simple average of the review scores, i.e., , as our benchmark. In the simulations, we set , , and . Fig. 6 and Fig. 7 show the results for , and , respectively. As we can see, our scoring method is robust to the attacker’s fake reviews. When more and more malicious nodes are involved (large ), the attacker becomes more successful in biasing the score toward the fake score. Large readership on the PubChain platform means large , and large makes the system more robust against large .
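The first attack strategy can be reproduced in spirit with a small Monte Carlo sketch. All numerical values, parameter names, and the reader model below are assumptions made for illustration: in particular, the mean reader score of a review is taken to fall linearly with the distance between the review's score and the ground truth, and the weights are normalized by their sum. None of these choices is claimed to match the settings behind Fig. 6 and Fig. 7.

```python
import random


def trimmed_mean(values, trim_fraction=0.10):
    """Average after dropping the lowest and highest trim_fraction of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def simulate_strategy_one(num_reviews, num_malicious, readers_per_review,
                          truth, fake_score, reviewer_sd,
                          perfect_review_mean, reader_sd, seed=0):
    """Return (weighted_score, simple_average) for one simulated paper."""
    rng = random.Random(seed)

    # Honest reviewers score around the ground truth; malicious ones give fake_score.
    honest = [rng.gauss(truth, reviewer_sd) for _ in range(num_reviews - num_malicious)]
    review_scores = honest + [fake_score] * num_malicious

    weights = []
    for score in review_scores:
        # Assumed reader model: reviews far from the truth earn lower reader scores.
        mean_reader_score = max(0.0, perfect_review_mean - abs(score - truth))
        reader_scores = [max(0.0, rng.gauss(mean_reader_score, reader_sd))
                         for _ in range(readers_per_review)]
        weights.append(trimmed_mean(reader_scores))

    total_weight = sum(weights)
    simple_average = sum(review_scores) / len(review_scores)
    if total_weight <= 0:
        return simple_average, simple_average  # degenerate case: no usable weights
    weighted = sum(s * w for s, w in zip(review_scores, weights)) / total_weight
    return weighted, simple_average
```

Calling, for example, `simulate_strategy_one(200, 60, 20, truth=3.0, fake_score=9.0, reviewer_sd=1.0, perfect_review_mean=8.0, reader_sd=1.0)` returns both scores, so the weighted score can be compared against the simple-average benchmark for different numbers of malicious reviewers.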
The second strategy is to have a fraction of the malicious nodes act as fake reviewers and the rest as fake readers. Half of the fake readers support the fake reviews by giving them high scores, and the other half attack the honest reviews by giving them low scores. For example, suppose that there are 100 malicious nodes and that 10% of them act as fake reviewers. Then, 10 of the malicious nodes are fake reviewers that give review score to paper , and 90 malicious nodes are fake readers that can give a total of fake scores to all reviews of paper . Among the fake scores to reviews, scores of are given to the fake reviews put up by the attacker (each of the 10 fake reviews is assigned scores of ), where is a very high score used to support the fake reviews; scores of are given to the honest reviews (each honest review is assigned scores of ), where is a very low score used to attack these honest reviews.
In the simulation, we set , , , , , , . Fig. 8 and Fig. 9 show the results for , and , respectively. From the results, we can observe that with large , our scoring method is still robust to this attack strategy.
VI System Implementation
We have implemented a proof-of-concept prototype for the PubChain platform. The implementation of the blockchain reuses Ethereum, which means we can realize the virtual machine layer of PubChain using the EVM smart contract mechanism. The prototype uses IPFS for the storage layer. We have deployed the PubChain interface to a network node with address http://220.127.116.11:3000/. Users (i.e., publication players) can use the JSON-RPC protocol to remotely deploy and invoke smart contracts via this network node to conduct their activities on PubChain.
With smart contracts, we have implemented the functions of paper posting, paper reviewing, and review scoring. The script code of the smart contracts is stored on the blockchain. The smart contracts are triggered by transactions sent to their addresses on the blockchain. Algorithm 1 shows the script code of the smart contract that implements the function of paper posting. To post a paper on PubChain, an author needs to carry out the following procedure: 1) upload the paper (possibly including some program code and multimedia materials) with her/his signature to the IPFS system and obtain the IPFS address of this paper (i.e., the paper hash); 2) include the publication information about the paper (i.e., its ownership (the address of the author on the blockchain), IPFS address, paper title, keywords, etc.) into a paper metadata record; 3) pack the metadata of the paper into a transaction; 4) issue the transaction to the blockchain system. After the smart contract receives the transaction, it can be executed by some miner to write the metadata of the paper to the blockchain. Fig. 10 shows the window of the Remix Ethereum IDE [1] after the paper posting smart contract is triggered by a transaction that posts our paper to the deployed Ethereum testnet. The procedures and smart contracts for other functions are designed and implemented in similar ways.
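The four-step posting procedure can be sketched on the client side as follows. This is a hedged illustration only: a SHA-256 digest stands in for the IPFS address (real IPFS content identifiers use a different encoding), the author's signature is omitted, and the contract label, function name, and field names are all hypothetical rather than taken from Algorithm 1.

```python
import hashlib
import json


def build_posting_transaction(paper_bytes, author_address, title, keywords,
                              contract_address="paper-posting-contract"):
    """Assemble the data an author would send to post a paper (illustrative only)."""
    # Step 1: obtain a content address for the paper. A SHA-256 digest is used
    # here as a stand-in for the address returned by an IPFS upload.
    content_address = hashlib.sha256(paper_bytes).hexdigest()

    # Step 2: collect the publication information into a metadata record.
    metadata = {
        "owner": author_address,
        "ipfs_address": content_address,
        "title": title,
        "keywords": sorted(keywords),
    }

    # Step 3: pack the metadata into a transaction addressed to the contract.
    # Step 4 (issuing it to the blockchain, e.g. over JSON-RPC) is left to the caller.
    return {"to": contract_address, "data": json.dumps(metadata, sort_keys=True)}
```

Because the record carries only the content address and descriptive fields, the paper itself never needs to be stored on the blockchain.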
Currently, we have not implemented the proposed incentive mechanism, which requires extensive modifications to the blockchain program code. This is the most important part of our follow-up work.
To overcome the drawbacks and limitations of existing publication platforms for research papers, we exploit recent advances in decentralized technologies (i.e., blockchain, IPFS) to design a decentralized open-access publication platform named PubChain. Compared with the existing centralized publication platforms, PubChain has several advantages: (i) PubChain breaks the pay wall imposed by publishers so that everybody can enjoy free access to papers. (ii) PubChain eliminates undesired effects of information islands and has the potential to become a unified database for global sharing and recording of papers. (iii) PubChain, as a decentralized system, provides uninterrupted service without single points of failure. (iv) PubChain incentivizes participants to make positive contributions to the platform with an incentive scheme implemented over blockchain technology.
Importantly, unlike many other publication platforms, PubChain is not meant to be a profit-oriented platform. The donation of cryptocurrency injects initial financial value into PubChain. We propose to use a two-way pegging technique to lock donated cryptocurrency to a special address of the parent chain that cannot be spent by any individual address. The project development team, as volunteers, will not receive any cryptocurrency.
This project will be successful only if it can recruit the participation of authors, reviewers, and readers who believe in the tenet of free dissemination and free open access to timely research results. We invite more volunteers to join the project and work with us to improve the design of PubChain, and to serve as advocates for the new way of knowledge dissemination for the benefit of humanity.
[1] Note: https://remix.ethereum.org/
[2] (2016) Blockstack: a global naming and storage system secured by blockchains. In Proc. 2016 USENIX Annual Technical Conference (USENIX ATC), pp. 181–194.
[3] (2016) Medrec: using blockchain for medical data access and permission management. In Proc. 2016 2nd International Conference on Open and Big Data (OBD), pp. 25–30.
[4] (2014) Enabling blockchain innovations with pegged sidechains.
[5] (2018) Filecoin: A decentralized storage network. Protoc. Labs.
[6] (2014) IPFS-content addressed, versioned, p2p file system. arXiv preprint arXiv:1407.3561.
[7] (2013) Ethereum: a next-generation smart contract and decentralized application platform.
[8] (2016) Blockchains and smart contracts for the internet of things. IEEE Access 4, pp. 2292–2303.
[9] (2015) Strategic sourcing in the new economy: harnessing the potential of sourcing business models for modern procurement. Springer.
[10] (2017) Blockchain based data integrity service framework for iot data. In Proc. 2017 IEEE International Conference on Web Services (ICWS), pp. 468–475.
[11] (2008) Bitcoin: a peer-to-peer electronic cash system.
[12] (Retrieved 7 June 2018) Letter from Sir Isaac Newton to Robert Hooke. Historical Society of Pennsylvania.
[13] (2018) Everything you wanted to know about the blockchain: Its promise, components, processes, and problems. IEEE Consumer Electronics Magazine 7 (4), pp. 6–14.
[14] (2017) Towards blockchain-based auditable storage and sharing of iot data. In Proc. of the 2017 on Cloud Computing Security Workshop, pp. 45–50.
[15] (2001) Chord: a scalable peer-to-peer lookup service for internet applications. ACM SIGCOMM Computer Communication Review 31 (4), pp. 149–160.
[16] (retrieved 12 Dec. 2017) Parity: Fast, light, robust ethereum implementation.
[17] (1978) Bretton woods: birth of a monetary system. Springer.
[18] (2014) One million preprints and counting: A conversation with arxiv founder Paul Ginsparg. The Scientists.
[19] (2018) TSAR: a fully-distributed trustless data sharing platform. In Proc. 2018 IEEE International Conference on Smart Computing (SMARTCOMP), pp. 350–355.
[20] (2015) Decentralizing privacy: using blockchain to protect personal data. In Proc. 2015 IEEE Security and Privacy Workshops, pp. 180–184.