Creating a sample of articles

As a second round to bring the analysis up-to-date, we elected to sample 100 articles from the same 21 journals for 2017-2018.

Aside from the different years, the journals and the process remain the same as described previously.

Step 1

R setup

library("dplyr")
library("readr")
library("bib2df")
library("googlesheets4")
library("stringr")
library("Reproducibility.in.Plant.Pathology")

set.seed(23) # note that the seed value differs in each of the four rounds

# For printing tibble in total
options(tibble.print_max = 21, tibble.print_min = 21)

Create list of journals

We hand-picked a list of 21 journals that we felt represented plant pathology research. In this step, we will create a tibble in R of these journals, assigning them a number so that we can randomise them.

journal_list <- tibble(
  1:21,
  c("Australasian Plant Pathology",
    "Canadian Journal of Plant Pathology",
    "Crop Protection",
    "European Journal of Plant Pathology",
    "Forest Pathology",
    "Journal of General Plant Pathology",
    "Journal of Phytopathology",
    "Journal of Plant Pathology",
    "Virology Journal (Plant Viruses Section)",
    "Molecular Plant-Microbe Interactions",
    "Molecular Plant Pathology",
    "Nematology",
    "Physiological and Molecular Plant Pathology",
    "Phytoparasitica",
    "Phytopathologia Mediterranea",
    "Phytopathology",
    "Plant Disease",
    "Plant Health Progress",
    "Plant Pathology",
    "Revista Mexicana de Fitopatología",
    "Tropical Plant Pathology"))

names(journal_list) <- c("number", "journal")

Create list of evaluators for second round of reviews

These articles were added at a later date to look for possible trends in the published plant pathology literature. The same journal titles were sampled for 2017-2018, following the same protocol as the original set of 200 articles, to give 100 articles over two years.

In response to reviewers, three more years of articles were added. That work is detailed in this third document.

assignees <- rep(c("Emerson", "Kaique"), 50)
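
Because `rep()` repeats the two names in alternating order, each evaluator receives exactly 50 of the 100 rows. A quick check, as a sketch using base R only (`assignee_counts` is an illustrative name, not part of the original workflow):

```r
# Same call as above: alternate the two evaluator names 50 times
assignees <- rep(c("Emerson", "Kaique"), 50)

# Count how many rows each evaluator receives
assignee_counts <- table(assignees)
print(assignee_counts)
```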

Create randomised list

In this round only two individuals evaluated the articles, so we will create a randomised list of journal articles to assign to these two authors.

Create a randomised list of the journals

journals <- tibble(sample(1:21, 100, replace = TRUE))

names(journals) <- "number"

journals <- left_join(journals, journal_list, by = "number")

Randomly select articles

Generate a random list of years between 2017 and 2018 and a random list of start pages between 1 and 150, since some journals start numbering at 1 with every issue. Then bind the columns of the randomised list of journals with the randomised years and start pages. This assumes that there is no temporal effect, i.e., the time of year an article is published does not affect whether or not it is reproducible.

year <- sample(2017:2018, 100, replace = TRUE)

contains_page <- sample.int(150, 100, replace = TRUE)

journals <- cbind(journals[, -1], year, contains_page, assignees)

journals <- arrange(.data = journals, assignees, journal, year, contains_page)
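
Because journals, years and start pages are all drawn with `replace = TRUE`, the same journal/year/page combination can in principle be drawn more than once. The following is a minimal sketch, not part of the original workflow, of how such repeats could be flagged before the articles are looked up. The toy data frame `draws` is hypothetical and stands in for `journals`.

```r
# Hypothetical toy draws standing in for the `journals` data frame
draws <- data.frame(
  journal = c("Plant Disease", "Plant Disease", "Nematology"),
  year = c(2017, 2017, 2018),
  contains_page = c(42, 42, 7)
)

# Flag rows that repeat an earlier journal/year/page combination
key_cols <- c("journal", "year", "contains_page")
repeats <- draws[duplicated(draws[key_cols]), ]
print(repeats)
```

Any flagged row could then be handled in the same way as an unsuitable article during the manual screening.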

Check the number of articles per journal

journals %>%
  group_by(journal) %>%
  tally(sort = TRUE)
## # A tibble: 21 × 2
##    journal                                         n
##    <chr>                                       <int>
##  1 Plant Pathology                                10
##  2 Journal of Phytopathology                       7
##  3 Physiological and Molecular Plant Pathology     7
##  4 Canadian Journal of Plant Pathology             6
##  5 Molecular Plant-Microbe Interactions            6
##  6 Phytoparasitica                                 6
##  7 Plant Disease                                   6
##  8 Crop Protection                                 5
##  9 Forest Pathology                                5
## 10 Journal of General Plant Pathology              5
## 11 Virology Journal (Plant Viruses Section)        5
## 12 Australasian Plant Pathology                    4
## 13 Journal of Plant Pathology                      4
## 14 Nematology                                      4
## 15 Plant Health Progress                           4
## 16 European Journal of Plant Pathology             3
## 17 Molecular Plant Pathology                       3
## 18 Phytopathology                                  3
## 19 Tropical Plant Pathology                        3
## 20 Phytopathologia Mediterranea                    2
## 21 Revista Mexicana de Fitopatología               2
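
With 100 draws spread uniformly over 21 journals, each journal is expected to appear about 100/21 times, or roughly 4.8. As a sketch, the tallies printed above can be re-entered by hand to confirm that they sum to 100 and to compare them with that expectation (the vector `n` is copied from the output, not recomputed):

```r
# Tallies copied from the output above, in the order printed
n <- c(10, 7, 7, 6, 6, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 3, 2, 2)

total <- sum(n)                # total number of sampled articles
expected <- total / length(n)  # expected count per journal under uniform sampling

print(total)
print(round(expected, 2))
```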

Once this is done, the articles are manually examined for suitability. Reference articles or off-topic articles are not included. Notes are provided regarding these cases in assigned_article_notes. If the selected page number/article was not suitable, the next sequential article was selected manually.

Add empty columns for evaluations

A variety of information will be collected for each article for use in the later analysis. This is easiest to enter in a spreadsheet application, so we add a column for each item we want to record and save the table as a Google Sheet for manual editing.

to_record <- c(
  "doi",
  "IF_5year",
  "country",
  "open",
  "repro_inst",
  "iss_per_year",
  "art_class",
  "supl_mats",
  "comp_mthds_avail",
  "software_avail",
  "software_cite",
  "analysis_auto",
  "data_avail",
  "data_annot",
  "data_tidy",
  "reproducibility_score"
)

journals[to_record] <- ""

template <-
  journals %>%
  group_by(assignees) %>%
  as_tibble()
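
The assignment `journals[to_record] <- ""` relies on base R creating any columns named in `to_record` that do not yet exist and filling them with the empty string. A small sketch on a hypothetical two-row data frame shows the effect; `toy` and the shortened `to_record` are illustrative only.

```r
# Hypothetical two-row data frame standing in for `journals`
toy <- data.frame(
  journal = c("Nematology", "Phytopathology"),
  year = c(2017, 2018)
)

# Shortened, illustrative version of the column list
to_record <- c("doi", "country")

# Create the new columns and fill every cell with an empty string
toy[to_record] <- ""

print(names(toy))
str(toy)
```

The new columns are character columns, so they arrive in the spreadsheet as blank cells ready for manual entry.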

Write to Google Sheets

We decided to use Google Sheets so that we could edit the file concurrently. Once we have filled in our evaluations, we will import the data back into R. Jenny Bryan has created a handy package, googlesheets4, that we use here. Note that gs4_create() must be run interactively: knitting this .Rmd file will not generate any Google Sheets.

Create a Google Sheets workbook to hold worksheets for this project. This first sheet will serve as a template and is thus named “template”.

# Give googlesheets4 permission to access spreadsheets and Google Drive
gs4_auth()

# create Google Sheet for concurrent edits, first sheet: article notes template
gs4_create(
  "article_notes_2017-2018",
  sheets = template
  )

Step 2

Add article DOIs

This step is completed in Google’s online Sheets app. Adam looked up the articles and added notes and DOIs to a sheet, “article_notes_2017-2018”, which will be imported into R in the next step.

The Google Sheets file can be found here: https://docs.google.com/spreadsheets/d/19gXobV4oPZeWZiQJAPNIrmqpfGQtpapXWcSxaXRw1-M/edit#gid=1699540381.
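
For orientation, importing the completed sheet back into R could look like the sketch below. It assumes the workbook URL given above and the worksheet name "template" created in Step 1. The helper name `import_article_notes` is hypothetical, and the call needs an interactive, authenticated session, so the helper is defined here but only run interactively.

```r
# Hypothetical helper: read the completed evaluations back into R.
# `ss` is the Sheets URL given above; `sheet` is assumed to be "template".
import_article_notes <- function(ss, sheet = "template") {
  googlesheets4::read_sheet(ss = ss, sheet = sheet)
}

notes_url <- "https://docs.google.com/spreadsheets/d/19gXobV4oPZeWZiQJAPNIrmqpfGQtpapXWcSxaXRw1-M/edit#gid=1699540381"

# Run only in an interactive session, after gs4_auth()
if (interactive()) {
  article_notes <- import_article_notes(notes_url)
}
```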

Colophon

sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.2.2 (2022-10-31)
##  os       macOS Ventura 13.1
##  system   aarch64, darwin20
##  ui       X11
##  language en
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Australia/Perth
##  date     2023-01-04
##  pandoc   2.19.2 @ /opt/homebrew/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package                            * version date (UTC) lib source
##  assertthat                           0.2.1   2019-03-21 [2] CRAN (R 4.2.2)
##  bib2df                             * 1.1.1   2019-05-22 [2] CRAN (R 4.2.0)
##  bslib                                0.4.2   2022-12-16 [2] CRAN (R 4.2.2)
##  cachem                               1.0.6   2021-08-19 [2] CRAN (R 4.2.2)
##  cellranger                           1.1.0   2016-07-27 [2] CRAN (R 4.2.2)
##  cli                                  3.5.0   2022-12-20 [2] CRAN (R 4.2.2)
##  curl                                 4.3.3   2022-10-06 [2] CRAN (R 4.2.2)
##  DBI                                  1.1.3   2022-06-18 [2] CRAN (R 4.2.2)
##  desc                                 1.4.2   2022-09-08 [2] CRAN (R 4.2.2)
##  digest                               0.6.31  2022-12-11 [2] CRAN (R 4.2.2)
##  dplyr                              * 1.0.10  2022-09-01 [2] CRAN (R 4.2.0)
##  ellipsis                             0.3.2   2021-04-29 [2] CRAN (R 4.2.2)
##  evaluate                             0.19    2022-12-13 [2] CRAN (R 4.2.2)
##  fansi                                1.0.3   2022-03-24 [2] CRAN (R 4.2.2)
##  fastmap                              1.1.0   2021-01-25 [2] CRAN (R 4.2.2)
##  fs                                   1.5.2   2021-12-08 [2] CRAN (R 4.2.2)
##  gargle                               1.2.1   2022-09-08 [2] CRAN (R 4.2.0)
##  generics                             0.1.3   2022-07-05 [2] CRAN (R 4.2.2)
##  glue                                 1.6.2   2022-02-24 [2] CRAN (R 4.2.2)
##  googledrive                          2.0.0   2021-07-08 [2] CRAN (R 4.2.0)
##  googlesheets4                      * 1.0.1   2022-08-13 [2] CRAN (R 4.2.0)
##  hms                                  1.1.2   2022-08-19 [2] CRAN (R 4.2.2)
##  htmltools                            0.5.4   2022-12-07 [2] CRAN (R 4.2.2)
##  httr                                 1.4.4   2022-08-17 [2] CRAN (R 4.2.0)
##  humaniformat                         0.6.0   2016-04-24 [2] CRAN (R 4.2.2)
##  jquerylib                            0.1.4   2021-04-26 [2] CRAN (R 4.2.2)
##  jsonlite                             1.8.4   2022-12-06 [2] CRAN (R 4.2.2)
##  knitr                                1.41    2022-11-18 [2] CRAN (R 4.2.0)
##  lifecycle                            1.0.3   2022-10-07 [2] CRAN (R 4.2.2)
##  magrittr                             2.0.3   2022-03-30 [2] CRAN (R 4.2.2)
##  memoise                              2.0.1   2021-11-26 [2] CRAN (R 4.2.2)
##  pillar                               1.8.1   2022-08-19 [2] CRAN (R 4.2.2)
##  pkgconfig                            2.0.3   2019-09-22 [2] CRAN (R 4.2.2)
##  pkgdown                              2.0.7   2022-12-14 [2] CRAN (R 4.2.2)
##  purrr                                1.0.0   2022-12-20 [2] CRAN (R 4.2.2)
##  R6                                   2.5.1   2021-08-19 [2] CRAN (R 4.2.2)
##  ragg                                 1.2.4   2022-10-24 [2] CRAN (R 4.2.2)
##  Rcpp                                 1.0.9   2022-07-08 [2] CRAN (R 4.2.2)
##  readr                              * 2.1.3   2022-10-01 [2] CRAN (R 4.2.2)
##  Reproducibility.in.Plant.Pathology * 1.0.0   2023-01-04 [1] local
##  rlang                                1.0.6   2022-09-24 [2] CRAN (R 4.2.2)
##  rmarkdown                            2.19    2022-12-15 [2] CRAN (R 4.2.2)
##  rprojroot                            2.0.3   2022-04-02 [2] CRAN (R 4.2.2)
##  rstudioapi                           0.14    2022-08-22 [2] CRAN (R 4.2.2)
##  sass                                 0.4.4   2022-11-24 [2] CRAN (R 4.2.0)
##  sessioninfo                          1.2.2   2021-12-06 [2] CRAN (R 4.2.2)
##  sodium                               1.2.1   2022-06-11 [2] CRAN (R 4.2.2)
##  stringi                              1.7.8   2022-07-11 [2] CRAN (R 4.2.2)
##  stringr                            * 1.5.0   2022-12-02 [2] CRAN (R 4.2.2)
##  systemfonts                          1.0.4   2022-02-11 [2] CRAN (R 4.2.2)
##  textshaping                          0.3.6   2021-10-13 [2] CRAN (R 4.2.2)
##  tibble                               3.1.8   2022-07-22 [2] CRAN (R 4.2.2)
##  tidyselect                           1.2.0   2022-10-10 [2] CRAN (R 4.2.2)
##  tzdb                                 0.3.0   2022-03-28 [2] CRAN (R 4.2.2)
##  utf8                                 1.2.2   2021-07-24 [2] CRAN (R 4.2.2)
##  vctrs                                0.5.1   2022-11-16 [2] CRAN (R 4.2.2)
##  xfun                                 0.36    2022-12-21 [2] CRAN (R 4.2.2)
##  yaml                                 2.3.6   2022-10-18 [2] CRAN (R 4.2.2)
## 
##  [1] /private/var/folders/hc/tft3s5bn48gb81cs99mycyf00000gn/T/RtmpwnOTN6/temp_libpath4da5ab32fec
##  [2] /Users/adamsparks/Library/R/arm64/4.2/library
##  [3] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────