Michael Soprano

Ph.D. XXXV

Supervisor: Stefano Mizzaro

+39 0432 558457

Room: L2-06-ND (DMIF, NS3)

michael.soprano@uniud.it

Research Project

Readersourcing 2.0: A Practical Solution for Crowdsourcing Peer Review

The main mechanism to spread scientific knowledge is the scholarly publishing process, which is based on the peer review activity: a scientific article written by some authors is judged and rated by colleagues with a comparable degree of competence. Although peer review is a reasonable and well-established a priori mechanism to ensure the quality of scientific publications, it is not free from problems; indeed, it is affected by various issues related both to the process itself and to the malicious behaviour of some stakeholders.

To cite just one example, in some cases reviewers cannot properly evaluate a publication, e.g., when the paper reports data from an experiment so long and complex that it cannot be replicated by the reviewers themselves; an act of faith (that the author is honest) is thus sometimes required [3]. Moreover, it is reported that some stakeholders of the scholarly publishing process behave maliciously. In particular, a recent article [21] reports that a growing number of journals do not actually perform peer review, contrary to what their publishers state. This leads scholars to inflate their publication lists with articles that, most likely, would not pass peer review.

The world of scholarly publishing has been discussing for many years how to improve the peer review activity [22], and alternative approaches have been proposed. In my research project I will focus on an alternative approach to peer review based on a particular form of crowdsourcing, which is defined as the outsourcing of a task traditionally performed by a few experts to an undefined, generally large group of people in the form of an open call [7].

Background and Related Work

Among the various alternative approaches to peer review proposed in recent years, some aim to take advantage of information that is otherwise lost. When a scientific article is published, its readers form their own opinions about it, but such opinions usually remain private or are spread informally among a few collaborators. At best, they remain “encoded” in citations inserted into other articles. It would be useful to have an approach to peer review that also considers such opinions.

In the literature there are two proposals, by Mizzaro [10, 11] and by De Alfaro and Faella [4], which aim to take advantage of readers’ opinions by outsourcing the peer review activity to their community. Mizzaro [11] calls his approach Readersourcing, a portmanteau of “readers” and “crowdsourcing”. The shared idea consists in asking readers to quantitatively rate the articles they read; these ratings are used to measure the overall quality of the articles as well as the reputation of each reader as an assessor and, moreover, to derive the reputation of a scholar as an author.

In other terms, the main issue that the two models of Mizzaro [10, 11] and De Alfaro and Faella [4] have to deal with is how the ratings received by the assessed entities (i.e., the publications) should be aggregated into indexes of quality and, from these indexes, how to compute indexes of reputation for the assessors (i.e., the readers) and, eventually, indexes of how “skilled” an author is (i.e., a measure of their ability to publish papers that are positively rated by their readers). Such models are based on co-determination algorithms [8]. As argued by Mizzaro [11], the aforementioned proposals can be seen as a particular form of crowdsourcing, which has been used and evaluated [6] and is regarded as an effective approach [5].
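
To make the co-determination idea concrete, the following is a minimal Python sketch of a possible rating/reputation fixed point, and not the actual models of Mizzaro [10, 11] or De Alfaro and Faella [4]: the quality of a paper is the reputation-weighted mean of the ratings it received, the reputation of a reader is the agreement between their ratings and the current quality estimates, and the two quantities are iterated until they stabilize.

    def co_determination(ratings, n_iter=100):
        """Toy co-determination loop; NOT one of the published models.

        ratings: dict mapping (reader, paper) -> rating in [0, 1].
        Returns (quality, reputation) dicts for papers and readers.
        """
        readers = {r for r, _ in ratings}
        papers = {p for _, p in ratings}
        reputation = {r: 1.0 for r in readers}  # start from uniform reputation
        quality = {p: 0.5 for p in papers}
        for _ in range(n_iter):
            # Paper quality: reputation-weighted mean of its ratings.
            for p in papers:
                received = [(reputation[r], v)
                            for (r, q), v in ratings.items() if q == p]
                total = sum(w for w, _ in received)
                quality[p] = (sum(w * v for w, v in received) / total
                              if total > 0 else 0.5)
            # Reader reputation: 1 minus the mean distance between the
            # reader's ratings and the current quality estimates.
            for r in readers:
                errors = [abs(v - quality[q])
                          for (s, q), v in ratings.items() if s == r]
                reputation[r] = (1.0 - sum(errors) / len(errors)
                                 if errors else 1.0)
        return quality, reputation

    # Three readers rate two papers; carol disagrees with the consensus,
    # so her reputation (and her weight on the quality scores) decreases.
    ratings = {("alice", "p1"): 0.9, ("bob", "p1"): 0.8, ("carol", "p1"): 0.2,
               ("alice", "p2"): 0.4, ("bob", "p2"): 0.5, ("carol", "p2"): 0.9}
    quality, reputation = co_determination(ratings)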

Although this might seem a radical solution, it is important to remark that: (i) similar approaches, suggesting variants of and changes to peer review, including collaborative reviews and/or a more distributed peer review practice, have already been proposed in the past [2, 9, 1]; and (ii) the usage of crowdsourcing in scholarly publishing is being proposed and analyzed for even more radical approaches, for example to outsource some steps of the writing of scientific publications [20].

I have been able to provide a working implementation of the models proposed by Mizzaro [10, 11] and De Alfaro and Faella [4] by building the Readersourcing 2.0 ecosystem within a research project co-funded by SISSA Medialab and the University of Udine. This ecosystem has been presented recently [12] and is composed of four software applications. One (RS Server [17]) acts as a server that gathers all the ratings given by readers, and one (RS Rate [16]) acts as a client that allows readers to rate publications, although every operation can also be carried out directly through a web interface provided by RS Server. There is also a component (RS PDF [14]) whose task is to annotate files in PDF format by taking advantage of an ad hoc software library; this component is exploited by the server-side application. Lastly, there is an additional component (RS Py [15]) which provides a fully working implementation of the Readersourcing models proposed by Mizzaro [10, 11] and De Alfaro and Faella [4] and can be used as standalone software. The code of every software component and the overall documentation are available on Zenodo and GitHub [18, 17, 14, 16, 15].
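
To illustrate how the pieces fit together, the following Python sketch shows what submitting a rating from a client to the server could look like; the endpoint URL, payload, and field names are purely hypothetical assumptions for illustration and do not reflect the actual API exposed by RS Server.

    import requests  # third-party HTTP client library

    # Hypothetical endpoint: illustrative only, not RS Server's real API.
    SERVER_URL = "https://readersourcing.example.org/api/ratings"

    def send_rating(reader_id: str, paper_doi: str, score: float) -> None:
        """Post a single reader rating to the (hypothetical) endpoint."""
        payload = {"reader": reader_id, "paper": paper_doi, "score": score}
        response = requests.post(SERVER_URL, json=payload, timeout=10)
        response.raise_for_status()  # fail loudly on HTTP errors

    send_rating("reader-42", "10.1000/sample-doi", 0.8)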

Aims

The aim of my research project is fourfold:

  1. Stage 1: Formal Study: a study of the properties of the state-of-the-art Readersourcing models proposed by Mizzaro [10, 11] and De Alfaro and Faella [4] must be performed. Formal properties such as a matrix form can be derived, while properties such as model bias, fairness, robustness, etc. can be investigated. Network analysis techniques such as the HITS and PageRank algorithms can be applied to study such properties and to predict their temporal evolution (see the sketch after this list).
  2. Stage 2: Implementation Extensions: the current implementation of Readersourcing 2.0 [13, 18, 17, 14, 16] must be extended and strengthened; to this end, mobile applications and browser plugins can be developed to provide readers with more interaction modalities, while a system of incentives should be used to encourage and retain such readers.
  3. Stage 3: Testing, Evaluation and Refinement: the Readersourcing models proposed by Mizzaro [10, 11] and De Alfaro and Faella [4] must be evaluated by means of an adequate simulation model, defined using statistical techniques such as Bayesian modelling and probabilistic programming, and the current implementation of Readersourcing 2.0 must be refined.
  4. Stage 4: Deployment and Usage: the final implementation of Readersourcing 2.0 should be deployed in a resilient environment. SISSA Medialab could provide help and cooperate to bootstrap and spread its usage.
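
As a concrete illustration of the matrix form and of the HITS connection mentioned in Stage 1, here is a sketch under simplifying assumptions, not a property derived from the published models: if the ratings are arranged in a reader × paper matrix R, papers can be treated as HITS “authorities” and readers as “hubs”, and quality and reputation scores can be computed by power iteration.

    import numpy as np

    def hits_on_ratings(R, n_iter=50):
        """HITS-style power iteration on a reader x paper rating matrix R.

        Papers play the role of "authorities" (quality scores q), readers
        the role of "hubs" (reputation scores w). Purely illustrative.
        """
        w = np.ones(R.shape[0])     # reader reputation (hub scores)
        q = np.ones(R.shape[1])     # paper quality (authority scores)
        for _ in range(n_iter):
            q = R.T @ w
            q /= np.linalg.norm(q)  # normalize to keep scores bounded
            w = R @ q
            w /= np.linalg.norm(w)
        return q, w

    # Rows are readers, columns are papers; entries are ratings in [0, 1].
    R = np.array([[0.9, 0.4],
                  [0.8, 0.5],
                  [0.2, 0.9]])
    quality, reputation = hits_on_ratings(R)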

Working Program

The working program of my research project is distributed over a total of 36 months (M1-M36):

  • Stage 1: Formal Study (M1-M12)
    • Definition of a matrix form of the Readersourcing models;
    • Effects of the temporal order in rating aggregation;
    • Rating cancellation in the model proposed by Mizzaro [10, 11];
    • Efficient computation of the scores in the model proposed by De Alfaro and Faella [4];
    • Bias of Readersourcing models [19].
  • Stage 2: Implementation Extensions (M6-M12)
    • Definition of a system of incentives;
    • Mobile apps and browser plugins development.
  • Stage 3: Testing, Evaluation and Refinement (M12-M24)
    • Definition of a simulation model (see the sketch after this program);
    • Experimental evaluation of the Readersourcing models;
    • Refinement of the current implementation of Readersourcing 2.0;
    • Wrap-up analysis.
  • Stage 4: Deployment and Usage (M24-M36)
    • Deployment of Readersourcing 2.0 in a resilient environment;
    • Usage of Readersourcing 2.0 to gather fresh data;
    • Spreading of Readersourcing 2.0 within one or more communities;
    • Cooperation with SISSA Medialab.
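
As a possible starting point for the simulation model of Stage 3, here is a minimal generative sketch; the actual model, distributions, and parameters are still to be defined, possibly with probabilistic programming tools. Each paper has a latent true quality, each reader a latent noise level, and the observed ratings are noisy draws around the true quality.

    import numpy as np

    rng = np.random.default_rng(42)

    n_readers, n_papers = 50, 20
    true_quality = rng.beta(2, 2, size=n_papers)         # latent paper quality
    reader_noise = rng.gamma(2.0, 0.05, size=n_readers)  # per-reader noise

    # Observed ratings: the true quality of each paper perturbed by
    # reader-specific Gaussian noise, clipped to the [0, 1] rating scale.
    ratings = np.clip(
        true_quality[None, :]
        + rng.normal(0.0, reader_noise[:, None], size=(n_readers, n_papers)),
        0.0, 1.0)

    # The synthetic ratings can then be fed to a Readersourcing model and
    # the recovered quality scores compared against true_quality.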

References

[1] OpenReview (2016), https://openreview.net/
[2] Akst, J.: I Hate Your Paper: Many say the peer review system is broken. Here’s how some journals are trying to fix it. The Scientist 24, 36 (2010)
[3] Arms, W.Y.: What are the alternatives to peer review? Quality control in scholarly publishing on the web. JEP 8(1) (2002)
[4] De Alfaro, L., Faella, M.: TrueReview: A Platform for Post-Publication Peer Review. CoRR (2016), http://arxiv.org/abs/1608.07878
[5] Dow, S., Kulkarni, A., Klemmer, S., Hartmann, B.: Shepherding the crowd yields better work. In: Proceedings of ACM 2012 CSCW. pp. 1013–1022. ACM (2012)
[6] Hirth, M., Hoßfeld, T., Tran-Gia, P.: Analyzing costs and accuracy of validation mechanisms for crowdsourcing platforms. Mathematical and Computer Modelling 57(11-12), 2918–2932 (2013)
[7] Howe, J.: Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business (2009)
[8] Medo, M., Rushton Wakeling, J.: The effect of discrete vs. continuous-valued ratings on reputation and ranking systems. EPL 91(4), 48004 (2010)
[9] Meyer, B.: Fixing the Process of Computer Science Refereeing (2010)
[10] Mizzaro, S.: Quality control in scholarly publishing: A new proposal. JASIST 54(11), 989–1005 (2003)
[11] Mizzaro, S.: Readersourcing – A Manifesto. JASIST 63(8), 1666–1672 (2012), https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.22668
[12] Soprano, M., Mizzaro, S.: Crowdsourcing peer review: As we may do. CCIS 988, 259–273 (2019)
[13] Soprano, M., Mizzaro, S.: Crowdsourcing Peer Review: As We May Do. In: Manghi, P., Candela, L., Silvello, G. (eds.) Digital Libraries: Supporting Open Science. pp. 259–273. Springer International Publishing, Cham (2019), https://doi.org/10.1007/978-3-030-11226-4_21
[14] Soprano, M., Mizzaro, S.: Readersourcing 2.0: RS PDF (2019), https://doi.org/10.5281/zenodo.1442597
[15] Soprano, M., Mizzaro, S.: Readersourcing 2.0: RS Py (2019), https://doi.org/10.5281/zenodo.3245208
[16] Soprano, M., Mizzaro, S.: Readersourcing 2.0: RS Rate (2019), https://doi.org/10.5281/zenodo.1442599
[17] Soprano, M., Mizzaro, S.: Readersourcing 2.0: RS Server (2019), https://doi.org/10.5281/zenodo.1442630
[18] Soprano, M., Mizzaro, S.: Readersourcing 2.0: Technical documentation (2019), https://doi.org/10.5281/zenodo.1443371
[19] Soprano, M., Roitero, K., Mizzaro, S.: HITS Hits Readersourcing: Validating Peer Review Alternatives Using Network Analysis. In: BIRNDL 2019. ACM (2019)