getTBinR 0.5.5 now on CRAN - 2017 data.

getTBinR 0.5.5 is now on CRAN and should be available on a mirror near you shortly! This update is mainly about highlighting the availability of TB data for 2017, although some small behind the scenes changes were required to get the code set up going forward for yearly updates. A few more plotting options have been added, along with the corresponding tests (definitely the most exciting news). The full changelog is below along with a short example highlighting some of the changes in the 2017 data.

The main message from the 2017 data is that in 2017 there were again over 10 million estimated TB cases globally with only a 1.8% decrease in incidence rates compared to 2016. Over the last 10 years progress has been made with an average of a 1.9% decrease in TB incidence rates year on year. However there is little evidence of an increase in the rate that TB incidence rates are falling, with incidence rates forecast to remain at over 100 per 100,000 for the next 10 years if progress is made at the same rate as in the last decade.

Feature updates

  • Added a years filter to plot_tb_burden and plot_tb_burden_overview. This allows a range of years to be plotted. The default is all years which was the previous de facto default.

Package updates

  • Updated docs to reflect new year of data.
  • Updated examples to use the new year of data as standard.
  • Updated README to always use the current year of data.
  • Updated all vignettes to reflect new data or be fixed to historic data as appropriate.
  • Update site with links out to blog posts using the newest version of pkgdown.

Example: Changes in TB incidence rates in 2017

The code below quickly explores the updated data by first estimating global incidence rates and the annual change between years. The country level annual changes in TB incidence rates are then plotted, first globally and then by region. Finally, the trend in incidence rates is explored using country, regional and global level TB incidence rates. See here for a full size version of the storyboard.

## Get required packages - managed using pacman
if (!require(pacman)) install.packages("pacman"); library(pacman)
p_load("getTBinR")
p_load("ggplot2")
p_load("ggrepel")
p_load("scales")
p_load("viridis")
p_load("dplyr")
p_load("forcats")
p_load("ggridges")
p_load_gh("thomasp85/patchwork")

##Pull TB data 
tb_burden <- get_tb_burden() 


## Summarise global changes
global_tb <- tb %>% 
  group_by(year) %>% 
  summarise_at(.vars = c("e_inc_num", "e_pop_num"), ~ sum(., na.rm = TRUE)) %>% 
  mutate(inc_rate = e_inc_num / e_pop_num * 1e5,
         per_change_inc = (inc_rate - lag(inc_rate)) / lag(inc_rate)) %>% 
  mutate(g_whoregion = "Global",
         label = ifelse(year == max(year), g_whoregion, ""))



global_tb

## TB in 2017
tb_2017 <- global_tb %>% 
  filter(year == 2017)

tb_2017

## Global annual change
global_annual_change <- ggplot(global_tb, aes(year, per_change_inc)) +
  geom_smooth(se = FALSE, col = "black", size = 1.2, alpha.line = 0.7) +
  geom_point(size = 1.2, alpha = 0.8, col = "black") +
  scale_y_continuous(label = scales::percent, minor_breaks = NULL, breaks = seq(-0.025, 0, 0.0025)) +
  theme_minimal() +
  labs(
    y = "Annual Percentage Change",
    x = "Year",
    title = "Global Annual Percentage Change in Tuberculosis Incidence Rates",
    caption = ""
    
  )

global_annual_change

## Remove countries with incidence below 1000 or incidence rates below 10 per 100,000 to reduce noise and cal country level annual change.
countries_with_tb_burden <- tb_burden %>% 
  filter(year == 2017,
         e_inc_100k > 10,
         e_inc_num > 1000)

tb_annual_change <- tb_burden %>% 
  semi_join(countries_with_tb_burden, by = "country") %>% 
  group_by(country) %>% 
  arrange(year) %>% 
  select(year, g_whoregion, country, e_inc_100k, e_inc_num, e_pop_num) %>% 
  mutate(annual_change = (e_inc_100k - lag(e_inc_100k)) / lag(e_inc_100k)) %>% 
  ungroup

## Function to plot annual change
plot_annual_change <- function(df, strat = NULL, subtitle = NULL, years = 2000:2017) {
  dist <- df %>% 
    filter(year %in% years) %>% 
    rename(Region = g_whoregion) %>% 
    mutate(year = year %>% 
             factor(ordered = TRUE) %>% 
             fct_rev) %>% 
    ggplot(aes_string(x = "annual_change", y = "year", col = strat, fill = strat)) +
    geom_density_ridges(quantile_lines = TRUE, quantiles = 2, alpha = 0.6) +
    scale_color_viridis(discrete = TRUE, end = 0.9) +
    scale_fill_viridis(discrete = TRUE, end = 0.9) +
    geom_vline(xintercept = 0, linetype = 2, alpha = 0.6) +
    scale_x_continuous(labels = scales::percent, breaks = seq(-0.4, 0.4, 0.1),
                       limits = c(-0.4, 0.4), minor_breaks = NULL) +
    theme_minimal() +
    theme(legend.position = "none") +
    labs(x = paste0("Annual Change in ", search_data_dict("e_inc_100k")$definition),
         y = "Year",
         title = "Annual Percentage Change in Tuberculosis Incidence Rates",
         subtitle = subtitle,
         caption = "")
  
  return(dist)
}

## Overall country level annual change
overall <- plot_annual_change(tb_annual_change, NULL,
                              years = seq(2001, 2017, 2), subtitle = "By Country") 

overall

## Regional country level annual change
region <-  plot_annual_change(tb_annual_change, "Region",
                              subtitle = "By Region", 
                              years = seq(2001, 2017, 2)) + 
  facet_wrap(~Region) +
  labs(caption = "")

region

## Regional and Global TB incidence rates over time
regional_incidence <- tb_burden %>% 
  group_by(g_whoregion, year) %>% 
  summarise(inc = sum(e_inc_num, na.rm = TRUE), pop = sum(e_pop_num, na.rm = TRUE)) %>% 
  ungroup %>% 
  mutate(inc_rate = inc / pop * 1e5) %>% 
  mutate(label = ifelse(year == max(year), g_whoregion, "")) %>% 
  ggplot(aes(year, inc_rate, col = g_whoregion)) +
  geom_line(alpha = 0.8, size = 1.2) +
  scale_color_viridis(discrete = TRUE, end = 0.9) +
  theme_minimal() +
  theme(legend.position = "none") +
  scale_y_continuous(breaks = seq(0, 400, 25), minor_breaks = NULL, limits = c(0, NA)) +
  scale_x_continuous(breaks = seq(2000, 2017, 1), minor_breaks = NULL) +
  geom_text_repel(aes(label = label),
                  nudge_x = 4,
                  force = 10,
                  na.rm = TRUE,
                  min.segment.length = 10) +
  labs(x = search_data_dict("e_inc_100k")$definition,
       y = "Year",
       title = "Tuberculosis Incidence Rates",
       subtitle = "By Region, per 100,000 population") +
  geom_line(data = global_tb, aes(col = NULL), alpha = 0.8, size = 1.2) +
  geom_text_repel(data = global_tb,
                  aes(label = label, col = NULL),
                  nudge_x = 2,
                  nudge_y = 8,
                  na.rm = TRUE)

## Map global TB incidence rates for 2017 using getTBinR
map <- map_tb_burden(year = c(2005, 2009, 2013, 2017), facet = "year") +
  theme(strip.background = element_blank()) +
  labs(caption = "",
       title = "Tuberculosis Incidence Rates",
       subtitle = "By Country, per 100,000 population")

## Compose storyboard
storyboard <- (map + regional_incidence + plot_layout(widths = c(2, 1))) /
  (region + (global_annual_change /
               overall + labs(caption = "For country level annual percentages change countries with incidence above 1000 and an incidence rate above 10 per 100,000 are shown.
                    The global annual percentage change is shown with a LOESS fit. 
                    By @seabbs | Made with getTBinR | Source: World Health Organisation")) + plot_layout(widths = c(2, 1))) +
  plot_layout(widths = c(1, 1))

## Save storyboard
ggsave("storyboard.png",
       storyboard, width = 20, height = 15, dpi = 330)

getTBinR 0.5.5 storyboard

For other examples of using getTBinR to visualise the WHO TB data see my gists, previous blog posts, and the getTBinR website.

Related

comments powered by Disqus