class: center, middle, inverse, title-slide # Reproducible Research with RMarkdown ### Dr. Priyanga D. Talagala & Dr. Thiyanga S. Talagala ### Research Lounge Meet, University of Moratuwa ### 18-10-2022
Workshop Materials available at hellormd.netlify.app
<i class="fas fa-globe fa-2x faa-ring animated faa-fast " style=" color:white;"></i>
--- class: inverse, middle, center # Who are we? --- class: center .pull-left[ <img src="fig/thiyanga2.png" width="40%" style="display: block; margin: auto;" /> Thiyanga S. Talagala PhD, Monash University, Australia Senior Lecturer, University of Sri Jayewardenepura
thiyanga.netlify.app
thiyangt ].pull-right[ <img src="fig/priyanga2.png" width="40%" style="display: block; margin: auto;" /> Priyanga D. Talagala PhD, Monash University, Australia Senior Lecturer, University of Moratuwa
prital.netlify.app
pridiltal ] -- - Associate Investigator of the Australian Research Council (ARC) Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS). -- - Associate Editor for The R Journal (Scimago Journal & Country Rank Q1, ERA ranking A*) --- class: inverse, middle, center ### Co-founders and co-organizers of R-Ladies Colombo ## R-Ladies Global <img src="fig/globalhexSticker.png" width="20%" style="display: block; margin: auto;" />
https://rladies.org/
RLadiesGlobal --- background-image:url('fig/talks.jpg') background-position: 60% 100% background-size: 100% class: top, center .pull-left[ ## R-Ladies Global ### <span style="color:black"> 216 Chapters</span> ### <span style="color:black"> 100267 Members </span> ### <span style="color:black"> 61 Countries</span> ] .pull-right[ <img src="fig/RL.png" width="100%" style="display: block; margin: auto;" /> <font size="2"> Source: https://benubah.github.io/r-community-explorer/rladies.html</font> ] --- background-image:url('fig/Rjourney2.png') background-position: 50% 50% background-size: 110% class: top, center, inverse --- background-image:url('fig/Rjourney1.png') background-position: 50% 50% background-size: 110% class: top, center, inverse --- class: middle, center # Why R-Ladies Colombo? <div class="figure" style="text-align: center"> <img src="fig/RL.png" alt="R-Ladies Across Regions" width="80%" /> <p class="caption">R-Ladies Across Regions</p> </div> <font size="2"> Source: https://benubah.github.io/r-community-explorer/rladies.html</font> --- .pull-left[ # R-Ladies Colombo ] .pull-right[
rladiescolombo.netlify.app
RLadiesColombo
www.meetup.com/rladies-colombo/ ] <div class="figure" style="text-align: center"> <img src="fig/RLcol.png" alt="R-Ladies Colombo" width="70%" /> <p class="caption">R-Ladies Colombo</p> </div> --- class: inverse, middle, center ## Main developers and maintainers of several R packages on CRAN <div class="figure" style="text-align: center"> <img src="fig/cran.png" alt="R packages on CRAN" width="80%" /> <p class="caption">R packages on CRAN</p> </div> --- class: inverse, center, middle # Code of conduct This workshop series is dedicated to providing a harassment-free experience for <span style="color:#8dcefc"> EVERYONE</span>. To ensure a <span style="color:#8dcefc"> safe, enjoyable</span>, and <span style="color:#8dcefc"> friendly</span> experience for everyone who participates, we follow the [Workshop Code of Conduct](https://hellormd.netlify.app/2022/10/code-of-conduct/). This code of conduct applies to all workshop spaces, both online and offline, including online workshops, Twitter, and mailing lists. --- class: inverse, middle, center # Why Learn R? <img src="fig/Rlogo.png" width="30%" style="display: block; margin: auto;" /> --- .pull-left[ #### `DSjobtracker` R Package (on CRAN, 2020) DSjobtracker: What Skills and Qualifications are Required for Data Science Related Jobs? 
<font size="4">by Statistical Consultancy Service, <br/> University of Sri Jayewardenepura, 2020 https://thiyangt.github.io/DSjobtracker/ </font> <img src="fig/DSjobtrackerhexsticker.png" width="50%" style="display: block; margin: auto;" /> ].pull-right[ #### Top twenty skills required for data science jobs <img src="2022_uom_Research_Lounge_RMD_files/figure-html/unnamed-chunk-19-1.png" width="100%" style="display: block; margin: auto;" /> ] <!--Both of these datasets contain information about job vacancies related to data science--> --- class: inverse, middle, center # R Vs Python --- class: inverse, middle, center # ~~R Vs Python~~ # R AND Python -- <img src="2022_uom_Research_Lounge_RMD_files/figure-html/unnamed-chunk-20-1.png" width="40%" style="display: block; margin: auto;" /> --- class: inverse, middle, center # ~~R Vs Python~~ # R AND Python # <span style="color:SkyBlue">Stay</span> <span style="color:orange">TUNED !!</span> <!--
<i class="fas fa-bell faa-ring animated faa-fast "></i>
--> .pull-left[ <img src="fig/Rlogo.png" width="25%" style="display: block; margin: auto;" /> + <img src="fig/python.png" width="35%" style="display: block; margin: auto;" /> ].pull-right[
<img src="fig/quarto.png" style="height:3.5em; width:auto; " align="middle"/>
] --- background-image:url('fig/tidyworkflow1.png') background-position: 60% 80% background-size: 85% class: top, center # Tidy Workflow --- background-image:url('fig/tidyworkflow2.png') background-position: 60% 80% background-size: 85% class: top, center # Tidy Workflow --- class: inverse, middle, center ## Reproducible Research <img src="fig/reprod.png" width="90%" style="display: block; margin: auto;" /> --- class: inverse, middle, center # Reproducible Research "Reproducibility refers to the ability of a researcher to duplicate the results of a prior study using the <span style="color:red"> same materials </span> (data, software code, etc.) as were used by the original investigator. That is, a second researcher might use the same raw data to build the same analysis files and implement the same statistical analysis in an attempt to yield the same results.... <span style="color:red"> Reproducibility is a minimum necessary condition for a finding to be believable and informative. </span>" - U.S. National Science Foundation (NSF) subcommittee on replicability in science - <font size="4">Source: Goodman, Steven N, Daniele Fanelli, and John P A Ioannidis. 2016. “What does research reproducibility mean?” Science Translational Medicine 8 (341): 1–6. https://doi.org/10.1126/scitranslmed.aaf5027.</font> --- # Traditional approach to writing reports* - Import the data set into a statistical software package - Run the procedure to get results - Copy & paste appropriate pieces from the analysis into an editor - Add descriptions - Finish/submit report .footnote[ Source: https://ismayc.github.io/talks/thesisdown17/slides.html#1] --- ## Disadvantages of this process* - Lots of manual work (error-prone) -- - Tedious (who likes to carefully copy-and-paste?) -- - Likely not recorded (did you write down all the steps you followed to get your analysis?), which hurts when an article requires revisions -- - What if you made an error at the beginning of your analysis? 
What if your data had an error? -- - Tech companies are moving more and more towards reproducible research/analysis to help with employee turnover .footnote[ Source: https://ismayc.github.io/talks/thesisdown17/slides.html#1] --- class: inverse, middle, center .pull-left[ # Reproducibility crisis "...Reproducibility crisis is an ongoing methodological crisis in which it has been found that the results of many scientific studies are difficult or impossible to reproduce. ...Because the reproducibility of empirical results is an essential part of the scientific method, such failures undermine the credibility of theories building on them and potentially call into question substantial parts of scientific knowledge..." - Wikipedia, 2022 <font size="4"> Ioannidis, J. P. (2005). Why most published research findings are false. PLoS medicine, 2(8), e124.</font> ].pull-right[ <img src="fig/reprod_crisis.png" width="90%" style="display: block; margin: auto;" /> ] --- ## Reproducible Research ### Methods Reproducibility The provision of enough detail about **study procedures** and **data** so the same procedures could be exactly repeated with the same data ### Results Reproducibility Obtaining the same results from the conduct of an independent study whose **procedures are as closely matched** to the original experiment as possible with **independent data** ### Inferential Reproducibility Drawing of qualitatively **similar conclusions** from either an independent replication of a study or a reanalysis of the original study <!-- Of these, we are most interested in methods reproducibility for the purposes of this chapter. That is, we will discuss how R can help you, as a researcher, improve in this aspect of reproducibility. --> .footnote[https://tysonbarrett.com/Rstats/chapter-9-reproducible-workflow-with-rmarkdown.html] --- class: inverse, middle, center <img src="fig/rmarkdown.png" width="30%" style="display: block; margin: auto;" /> --- class: inverse, middle, center
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">I haven’t even heard of latex or markdown, makes me feel a tad old 😢</p>— Oscar Jonsson (@OAJonsson) <a href="https://twitter.com/OAJonsson/status/1162473463174635520?ref_src=twsrc%5Etfw">August 16, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> --- class: inverse, middle, center
.pull-left[ <blockquote class="twitter-tweet"><p lang="en" dir="ltr">I also wrote my recent <a href="https://twitter.com/OReillyMedia?ref_src=twsrc%5Etfw">@OReillyMedia</a> book in Jupyter notebooks: <a href="https://t.co/YOvGLm9dxu">https://t.co/YOvGLm9dxu</a><br><br>I wouldn't say it's the best option for "formal" publishing—I'd prefer something closer to rmarkdown / bookdown—but Jupyter certainly worked well enough for me!</p>— Jake VanderPlas (@jakevdp) <a href="https://twitter.com/jakevdp/status/948943224390889473?ref_src=twsrc%5Etfw">January 4, 2018</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> ] .pull-right[ <img src="fig/PDSH-cover.png" width="80%" style="display: block; margin: auto;" /> ] --- class: inverse, middle, center
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Plus you can mix up multiple languages in single Rmd if you need to. I don't think that is possible with Jupyter. You can even exchange data between python and R using the reticulate package.</p>— Jerry Thomas (@jerrythomas_in) <a href="https://twitter.com/jerrythomas_in/status/1030691102624382976?ref_src=twsrc%5Etfw">August 18, 2018</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script> --- class: inverse, middle, center # Buckle Up! <img src="fig/rmarkdown.png" width="23%" style="display: block; margin: auto;" /> -- ## Take down, Note down, Write down, Jot down -- ## R Markdown