Population estimates comparison
Source:vignettes/articles/Population-estimates-comparison.Rmd
Population-estimates-comparison.Rmd
On this article, we will compare the population estimates available
at the brpop
package.
Currently, the package present population estimates by municipalities, sex and age groups computed by the DataSUS (Brazilian Health Ministry) and by the UFRN-PPGDEM-LEPP laboratory. The package also have the total estimates computed by IBGE for inter-census years, and population data for Census and population inquiries years.
names_helper <- tibble(
uf_code = c("11","12","13","14","15","16","17","21","22","23","24","25","26","27","28","29","31","32","33","35","41","42","43","50","51","52","53"),
uf_name = c("Rondônia","Acre","Amazonas","Roraima","Pará","Amapá","Tocantins","Maranhão","Piauí","Ceará","Rio Grande do Norte","Paraíba","Pernambuco","Alagoas","Sergipe","Bahia","Minas Gerais","Espírito Santo","Rio de Janeiro","São Paulo","Paraná","Santa Catarina","Rio Grande do Sul","Mato Grosso do Sul","Mato Grosso","Goiás","Distrito Federal"),
cap_code7 = c(1100205,1200401,1302603,1400100,1501402,1600303,1721000,2111300,2211001,2304400,2408102,2507507,2611606,2704302,2800308,2927408,3106200,3205309,3304557,3550308,4106902,4205407,4314902,5002704,5103403,5208707,5300108),
cap_code6 = c(110020,120040,130260,140010,150140,160030,172100,211130,221100,230440,240810,250750,261160,270430,280030,292740,310620,320530,330455,355030,410690,420540,431490,500270,510340,520870,530010),
cap_name = c("Porto Velho","Rio Branco","Manaus","Boa Vista","Belém","Macapá","Palmas","São Luís","Teresina","Fortaleza","Natal","João Pessoa","Recife","Maceió","Aracaju","Salvador","Belo Horizonte","Vitória","Rio de Janeiro","São Paulo","Curitiba","Florianópolis","Porto Alegre","Campo Grande","Cuiabá","Goiânia","Brasília")
)
Total population per UF
First, we will compare the total population estimates, using the
brpop
functions to aggregate the estimates per UF.
datasus_pop_uf <- uf_pop_totals(source = "datasus") |>
mutate(source = "DataSUS") |>
filter(uf != "5e")
ufrn_pop_uf <- uf_pop_totals(source = "ufrn") |>
mutate(source = "UFRN")
ibge_pop_uf <- uf_pop_totals(source = "ibge") |>
mutate(source = "IBGE")
bind_rows(datasus_pop_uf, ufrn_pop_uf, ibge_pop_uf) |>
left_join(names_helper, by = c("uf" = "uf_code")) |>
ggplot(aes(x = year, y = pop, color = source, group = source)) +
geom_line(stat = "identity", alpha = .7, lwd = 1) +
geom_vline(xintercept = c(2000, 2010, 2022), alpha = .5, linetype = "longdash") +
scale_y_continuous(labels = unit_format(accuracy = 1, scale = 0.00001, unit = NULL)) +
facet_wrap(~uf_name, scales = "free_y", ncol = 3) +
theme_bw() +
theme(legend.position = "bottom", legend.direction = "horizontal") +
labs(title = "Population estimates per UF from different sources",
subtitle = "in 100,000 units",
color = "Source", x = "Year", y = "Population estimate")
It can be observed that the estimates from DataSUS and UFRN generally agrees on tendency and present an overlapping period (2010 - 2021) with similar values. The IBGE population estimates present a more erratic pattern at some UFs.
Total population per capitals
Now let’s look at capitals data.
datasus_pop_mun <- mun_pop_totals(source = "datasus") |>
mutate(source = "DataSUS") |>
right_join(names_helper, by = c("code_muni" = "cap_code6"))
ufrn_pop_mun <- mun_pop_totals(source = "ufrn") |>
mutate(source = "UFRN") |>
right_join(names_helper, by = c("code_muni" = "cap_code7"))
ibge_pop_mun <- mun_pop_totals(source = "ibge") |>
mutate(source = "IBGE") |>
right_join(names_helper, by = c("code_muni" = "cap_code7"))
bind_rows(datasus_pop_mun, ufrn_pop_mun, ibge_pop_mun) |>
ggplot(aes(x = year, y = pop, color = source, group = source)) +
geom_line(stat = "identity", alpha = .7, lwd = 1) +
geom_vline(xintercept = c(2000, 2010, 2022), alpha = .5, linetype = "longdash") +
scale_y_continuous(labels = unit_format(accuracy = 1, scale = 0.00001, unit = NULL)) +
facet_wrap(~cap_name, scales = "free_y", ncol = 3) +
theme_bw() +
theme(legend.position = "bottom", legend.direction = "horizontal") +
labs(title = "Population estimates per capital from different sources",
subtitle = "in 100,000 units",
color = "Source", x = "Year", y = "Population estimate")
At the municipal level, the population estimates for the capitals presents more variability. At some capitals, the estimates from DataSUS and UFRN does not converge on the overlapping period. The IBGE estimates also present a more erratic pattern on some capitals.
Session info
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] brpop_0.5.0 scales_1.3.0 ggplot2_3.5.1 tibble_3.2.1 dplyr_1.1.4
#>
#> loaded via a namespace (and not attached):
#> [1] rappdirs_0.3.3 sass_0.4.9 utf8_1.2.4 generics_0.1.3
#> [5] tidyr_1.3.1 bitops_1.0-9 digest_0.6.37 magrittr_2.0.3
#> [9] evaluate_1.0.1 grid_4.4.1 fastmap_1.2.0 jsonlite_1.8.9
#> [13] backports_1.5.0 purrr_1.0.2 fansi_1.0.6 httr2_1.0.5
#> [17] textshaping_0.4.0 jquerylib_0.1.4 cli_3.6.3 rlang_1.1.4
#> [21] munsell_0.5.1 withr_3.0.1 cachem_1.1.0 yaml_2.3.10
#> [25] tools_4.4.1 checkmate_2.3.2 colorspace_2.1-1 curl_5.2.3
#> [29] vctrs_0.6.5 R6_2.5.1 lifecycle_1.0.4 fs_1.6.4
#> [33] ragg_1.3.3 pkgconfig_2.0.3 desc_1.4.3 pkgdown_2.1.1
#> [37] pillar_1.9.0 bslib_0.8.0 gtable_0.3.5 glue_1.8.0
#> [41] data.table_1.16.2 systemfonts_1.1.0 zendown_0.1.0 highr_0.11
#> [45] xfun_0.48 tidyselect_1.2.1 knitr_1.48 farver_2.1.2
#> [49] htmltools_0.5.8.1 labeling_0.4.3 rmarkdown_2.28 compiler_4.4.1
#> [53] RCurl_1.98-1.16 dtplyr_1.3.1