getTBinR: an R package for accessing and summarising the World Health Organisation Tuberculosis data


Developing tools for rapidly accessing and exploring data sets benefits the public health research community by enabling multiple data sets to be combined in a consistent manner, increasing the visibility of key data sets, and providing a framework that can be used to explore questions of interest. Tooling also reduces the barriers to entry, allowing non-specialists to explore data sets that would otherwise be inaccessible. This widening of access may also lead to new insights and wider interest in key public health issues.

getTBinR is an R package (R Core Team, 2019) that facilitates working with the data (World Health Organisation, 2018) collected by the World Health Organisation (WHO) on the country-level epidemiology of tuberculosis (TB). All data are freely available from the WHO, and the package code is archived on Zenodo (Abbott, 2019) and available on GitHub.

The aim of getTBinR is to allow researchers, and other interested individuals, to quickly and easily gain access to a detailed TB data set and to start using it to derive key insights. It provides a consistent set of tools that can be used to rapidly evaluate hypotheses on a widely used data set before they are explored further using more complex methods or more detailed data. The functions provided in this package were developed with sensible defaults, so that those new to the field can quickly gain key insights, while allowing sufficient customisation for experienced users to rapidly prototype new ideas.

In The Journal of Open Source Software (JOSS).