Key for data collection

This vignette documents the fields and values found in the “article_evaluations” sheet in the “article_notes” Google Sheets workbook, https://docs.google.com/spreadsheets/d/19gXobV4oPZeWZiQJAPNIrmqpfGQtpapXWcSxaXRw1-M/edit#gid=1699540381

Autogenerated values

The reproducibility score has two parts. One part is generated automatically, as far as possible, by using the article DOI to retrieve the reference.

Adam looks up each article’s DOI manually and enters it in the doi column of the “article_notes” sheet in the Google Sheets document.

The five-year impact factor, IF_5year, for each journal was recorded in a separate sheet in the workbook.

Other items, including journal, year, and contains_page, are auto-generated by the “Assigning Articles” vignette.

“journal”

Auto-generated in “Assigning Articles”

Name of journal

Entered as character string in title case

“year”

Auto-generated in Assigning Articles

Year the article was published

Entered as a 4-digit number.

“contains_page”

Auto-generated in Assigning Articles

Randomly selected page number used to select an article for evaluation. An article is selected if its page range contains this page number.
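
The selection rule can be sketched as follows. This is only an illustration, not the code used in “Assigning Articles”: it assumes the article’s page range is available as a BibTeX-style string such as "101--118", and the helper name page_in_range() is hypothetical.

```r
# Hypothetical helper: does a page range contain the randomly drawn page?
# Assumes `pages` is a string such as "101--118" (a single hyphen also works).
page_in_range <- function(pages, contains_page) {
  # Split on one or more hyphens to get the first and last page
  bounds <- as.integer(strsplit(pages, "-+")[[1]])
  # A single page such as "101" is treated as a one-page range
  start <- bounds[1]
  end <- bounds[length(bounds)]
  contains_page >= start && contains_page <= end
}

page_in_range("101--118", 110) # page 110 falls inside the range
page_in_range("101--118", 120) # page 120 falls outside the range
```

Page ranges that are not plain numbers, e.g., article identifiers, would need separate handling.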

“assignee”

Auto-generated in Assigning Articles

Evaluator randomly selected from Adam, Emerson, Nik or Zach and assigned to a particular article (row).
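
A random assignment of this kind could look like the sketch below. The actual procedure lives in the “Assigning Articles” vignette and may differ; the function name assign_evaluators(), the use of sampling with replacement, and the optional seed are all assumptions made for illustration.

```r
# Hypothetical sketch: randomly assign one evaluator to each article (row).
evaluators <- c("Adam", "Emerson", "Nik", "Zach")

assign_evaluators <- function(n_articles, seed = NULL) {
  # Setting a seed is an assumption made here so the draw can be repeated
  if (!is.null(seed)) set.seed(seed)
  # Sampling with replacement: each row gets an independent random draw
  sample(evaluators, size = n_articles, replace = TRUE)
}

assignee <- assign_evaluators(10, seed = 42)
table(assignee) # how many articles each evaluator received
```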

Manually entered metadata

“doi”

Adam looks up each article and enters its DOI in the “article_notes” sheet in Google Sheets. The value is left as NA if no DOI is provided.

Entered as a character string.

“comments”

Comments column. Adam has used this to record comments on article selection in cases where the contains_page number does not align with our criteria. Evaluators can also use it for other comments on the article during evaluation.

Entered as a character string.

“IF_5year”

Nik looked up the five-year impact factor for each journal and entered it in “article_notes”.

Entered as a decimal value.

Evaluation of reproducibility

“open”

Note that this was collected but not used in the final analysis because it was likely to change over the time period examined.

Whether the journal is open-access or not

Entered as TRUE/FALSE/BOTH

“repro_inst”

Note that this was collected but not used in the final analysis because it was likely to change over the time period examined.

Does the journal have guidelines for reproducibility?

  • 0 - Not mentioned

  • 1 - At least a suggestion to put data online, but not details on how/where

  • 2 - Detailed instructions on where to place data

  • 3 - Detailed instructions on where to place data and code

“art_class”

Note that this was collected but not used in final analysis

What kind of research the article describes

One or more of the following:

  • Fundamental

  • Applied

Entered as an R vector object in a comma-separated list with spaces.

For example: Applied or Fundamental

“molecular”

Note that this was collected but not used in final analysis

Was the article primarily focused on using or developing molecular methods?

Entered as TRUE or FALSE.

“comp_mthds_avail”

Score of computational methods documentation.

Are the computational methods used readily available, e.g., R, SAS, Python scripts are shared?

Entered as a number between 0 and 3. Assign the lowest score possible, i.e., if some methods are available but not all, then the score is 0.

  • 0 - Not available and not mentioned in publication.

  • 1 - Available upon request to author.

  • 2 - Online, but inconvenient/non-permanent, e.g., login necessary, pay wall, FTP server, personal lab website.

  • 3 - Freely available online to anonymous users for foreseeable future, e.g. archived using Zenodo, dataverse or university library or some other proper archiving system.

  • NA - No computational methods were used that can be shared, i.e., no statistical analyses were run and no figures were prepared. For example, if a molecular study used proprietary software for experimental imaging and only gel images are included in the article, then there are no computational methods to be shared.

In the case of multiple methods where some are available and some are not, score using the lowest value.
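
The lowest-value rule can be sketched in R as below. The helper name lowest_score() is hypothetical, and ignoring NA items when at least one item has a score is an assumption that the text above does not spell out.

```r
# Hypothetical helper: combine per-item scores into one article-level score
# by taking the lowest value, as the scoring rule describes.
lowest_score <- function(scores) {
  # Only the rubric values 0 to 3, or NA, are allowed
  stopifnot(all(scores %in% c(0:3, NA)))
  # If nothing could be scored, the article-level score is NA
  if (all(is.na(scores))) {
    return(NA_integer_)
  }
  # Assumption: NA items are ignored when other items have a score
  as.integer(min(scores, na.rm = TRUE))
}

lowest_score(c(3, 0))    # one method archived, one not available
lowest_score(c(3, 2, 3)) # all online, one of them inconveniently
lowest_score(c(NA, NA))  # no computational methods to share
```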

“software_avail”

Note that this was collected but not used in final analysis

Score of software availability.

Entered as a number between 0 and 3. The score should be assigned for the paper on the basis of the lowest-scoring software that was used/cited.

  • 0 - Not available or not mentioned in the publication.

  • 1 - Uses expensive proprietary software or requires getting a quote, e.g., ArcGIS standard is 7000 USD, Minitab is 1610 USD.

  • 2 - Uses proprietary software that most individuals can afford, e.g., Excel as a part of Microsoft365 is <100USD/year or freely available non-OSS software. SAS is rated at 2 due to the “SAS on Demand for Academics” cloud-hosted version. MEGA is free to download but is not open-source and cannot be redistributed, so is rated at 2.

  • 3 - Uses entirely free and open source software (FOSS), e.g., R, Julia, Python, QGIS.

  • NA - No software was used in the research that can be determined as the article is written.
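
As an illustration of scoring a paper by its lowest-scoring software, the sketch below uses only the example ratings listed above. The lookup vector software_scores and the function score_software_avail() are hypothetical names, and any software not in the lookup raises an error rather than being guessed.

```r
# Example ratings taken from the rubric above; not an exhaustive list.
software_scores <- c(
  ArcGIS = 1L, Minitab = 1L,
  Excel = 2L, SAS = 2L, MEGA = 2L,
  R = 3L, Julia = 3L, Python = 3L, QGIS = 3L
)

# Hypothetical helper: score a paper by its lowest-scoring software.
score_software_avail <- function(software) {
  # No software that can be determined: the score is NA
  if (length(software) == 0 || all(is.na(software))) {
    return(NA_integer_)
  }
  scores <- software_scores[software]
  # Refuse to guess a score for software that has not been rated
  if (anyNA(scores)) {
    stop("No rubric score recorded for: ",
         paste(software[is.na(scores)], collapse = ", "))
  }
  min(scores)
}

score_software_avail(c("R", "Excel")) # Excel is the lowest-scoring item
score_software_avail(c("R", "QGIS"))  # entirely FOSS
```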

“software_cite”

Note that this was collected but not used in final analysis

Score of citations for software used.

Entered as a number between 0 and 3. Assign the lowest score possible, i.e., if some software was cited but not all, then the score is 0.

  • 0 - Not mentioned.

  • 1 - Software mentioned by name only.

  • 2 - Software cited with version number.

  • 3 - All software components, e.g. SAS PROCs, R, Julia or Python packages, etc. properly cited.

  • NA - No software was used in the research that can be determined as the article is written.

“data_avail”

Raw data availability score.

Entered as a number between 0 and 3. Assign the lowest score possible, i.e., if some data are available but not all, then the score is 0.

  • 0 - Not available or not mentioned in the publication.

  • 1 - Available upon request to author.

  • 2 - Online, but inconvenient or non-permanent, e.g., login needed, pay wall, FTP server, personal lab website that may disappear, or may have already disappeared.

  • 3 - Freely available online to anonymous users for foreseeable future, e.g., archived using Zenodo, dataverse or university library or some other proper archiving system including Genbank or other similar databases.

“software_used”

Entered as an R vector object in a comma-separated list with spaces, or NA if no software can be determined.

For example: Excel, SAS, TableTool or R
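
When reading the sheet back into R, a cell in this format can be split into a character vector. The following is a minimal sketch; the helper name parse_list_field() is hypothetical, and treating the literal text "NA" as missing is an assumption about how the cell is read.

```r
# Hypothetical helper: turn a comma-separated cell into a character vector.
parse_list_field <- function(x) {
  # Both a true missing value and the literal text "NA" count as missing
  if (is.na(x) || x == "NA") {
    return(NA_character_)
  }
  # Split on commas, then strip the spaces that follow them
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

parse_list_field("Excel, SAS, TableTool")
parse_list_field("R")
parse_list_field("NA")
```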

Bibliography fields

The following fields are automatically filled by searching for the DOI and retrieving a BibTeX entry, where applicable. In cases where a DOI is not present, or the data are incomplete or incorrect, the evaluator should enter the necessary data for the article assigned to them. In these cases, important fields to enter are:

“author”

Entered as a character string in R vector format with spaces, following BibTeX style, e.g., A H Sparks, E Del Ponte, Z Foster, N Grünwald

“pages”

Entered as a character string in BibTeX format, numbers separated by a double “-”, e.g., 1--18.

“publisher”

Name of journal publisher.

Entered as character string, e.g., Springer or Wiley-Blackwell or Scientific Societies

“month”

Month in which the article was published

Entered as character string, e.g., jan, feb, mar…

“number”

Issue number of the journal

Entered as an integer

“title”

Article title

Entered as a character string

“volume”

Volume number of the journal

Entered as an integer
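
Manually entered values can be checked against the formats described above. The sketch below is illustrative only: the function names is_whole_number() and check_bib_fields() are hypothetical, the entry is assumed to be a named list, and articles without a conventional numeric page range would fail the pages check.

```r
# Hypothetical helper: TRUE for a single whole number such as 2 or 10L.
is_whole_number <- function(x) {
  is.numeric(x) && length(x) == 1 && !is.na(x) && x %% 1 == 0
}

# Hypothetical check of the manually entered bibliography fields.
check_bib_fields <- function(entry) {
  c(
    # Page range written with a double hyphen, e.g. "1--18"
    pages = grepl("^[0-9]+--[0-9]+$", entry$pages),
    # Lower-case three-letter month abbreviation, e.g. "jan"
    month = entry$month %in% tolower(month.abb),
    # Issue number and volume are whole numbers
    number = is_whole_number(entry$number),
    volume = is_whole_number(entry$volume)
  )
}

entry <- list(pages = "1--18", month = "jan", number = 2, volume = 10)
check_bib_fields(entry) # named logical vector, one element per field
```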

The following fields are automatically created by BibTeX and are optional

“category”

“bibtexkey”

“address”

“annote”

“booktitle”

“chapter”

“crossref”

“edition”

“editor”

“howpublished”

“institution”

“key”

“note”

“organizations”

“school”

“series”

“type”

“url”

“x.article.pm..title”

Colophon

sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.1 (2024-06-14)
##  os       macOS Sonoma 14.6
##  system   aarch64, darwin20
##  ui       X11
##  language en
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Australia/Perth
##  date     2024-08-07
##  pandoc   3.3 @ /opt/homebrew/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  bslib         0.8.0   2024-07-29 [1] CRAN (R 4.4.0)
##  cachem        1.1.0   2024-05-16 [1] CRAN (R 4.4.0)
##  cli           3.6.3   2024-06-21 [1] CRAN (R 4.4.0)
##  desc          1.4.3   2023-12-10 [1] CRAN (R 4.4.0)
##  digest        0.6.36  2024-06-23 [1] CRAN (R 4.4.0)
##  evaluate      0.24.0  2024-06-10 [1] CRAN (R 4.4.0)
##  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
##  fs            1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
##  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
##  htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.4.0)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.4.0)
##  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.4.0)
##  knitr         1.48    2024-07-07 [1] CRAN (R 4.4.0)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
##  pkgdown       2.1.0   2024-07-06 [1] CRAN (R 4.4.0)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
##  ragg          1.3.2   2024-05-15 [1] CRAN (R 4.4.0)
##  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.4.0)
##  rmarkdown     2.27    2024-05-17 [1] CRAN (R 4.4.0)
##  rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.4.0)
##  sass          0.4.9   2024-03-15 [1] CRAN (R 4.4.0)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
##  systemfonts   1.1.0   2024-05-15 [1] CRAN (R 4.4.0)
##  textshaping   0.4.0   2024-05-24 [1] CRAN (R 4.4.0)
##  xfun          0.46    2024-07-18 [1] CRAN (R 4.4.0)
##  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.4.0)
## 
##  [1] /Users/283204f/Library/R/arm64/4.4/library
##  [2] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────