38 Building a Bibliometric Shiny App
38.1 Learning objectives
After completing this chapter, you will be able to:
- Design a Shiny app architecture for bibliometric analysis
- Implement reactive data fetching from OpenAlex
- Build interactive UI components: dropdowns, sliders, and tabbed panels
- Create server-side logic for citation analysis and network visualisation
- Deploy a Shiny app to shinyapps.io or a self-hosted server
38.3 Conceptual background
Shiny transforms R scripts into web applications. Users interact through widgets (dropdowns, sliders, buttons) and see results update in real time. For bibliometrics, Shiny apps serve two audiences: researchers who want to explore data without writing code, and decision-makers who need accessible analytics dashboards.
A Shiny app has two components: the UI (user interface, defining layout and inputs) and the server (the R logic that reacts to user input and generates output). Reactive programming connects inputs to outputs: when a user changes a dropdown, only the computations that depend on that input are re-executed.
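The selective re-execution described above can be observed without a browser. The sketch below is illustrative only: the input names `x` and `name` and the `counter` environment are assumptions introduced here, and `testServer()` from shiny is used to drive a mock session.

```r
library(shiny)

# Plain environment used only to count how often the reactive body runs.
counter <- new.env()
counter$n <- 0

server <- function(input, output, session) {
  doubled <- reactive({
    req(input$x)                  # wait until the input has a value
    counter$n <- counter$n + 1    # side effect for illustration only
    input$x * 2
  })
  output$result <- renderText(doubled())
  output$label <- renderText({
    req(input$name)
    toupper(input$name)
  })
}

# testServer() runs the server function against a mock session.
testServer(server, {
  session$setInputs(x = 1, name = "a")  # doubled() is computed
  session$setInputs(name = "b")         # only output$label depends on this
  session$setInputs(x = 2)              # doubled() is invalidated and recomputed
})

counter$n  # number of times the reactive body ran
```

Changing `name` leaves `doubled()` untouched because nothing in its body reads `input$name`; Shiny tracks these dependencies automatically.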
For bibliometric apps, the typical architecture involves:
- Input panel: Select a journal, institution, or search query; choose date range and sample size.
- Data fetching: Reactively query OpenAlex based on user selections.
- Analysis tabs: Trend charts, citation distributions, author rankings, and optionally network visualisation.
- Export: Download buttons for data (CSV) and figures (PNG).
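The figure export in the last item can be sketched as follows. This is a minimal illustration rather than the app's actual code: `make_trend_plot()` and `write_trend_png()` are hypothetical helpers, the image dimensions are arbitrary, and `works_data()` in the commented wiring stands for whatever reactive holds the fetched records.

```r
library(ggplot2)

# Hypothetical helper: bar chart of publications per year.
# Assumes `works` has a `year` column.
make_trend_plot <- function(works) {
  ggplot(works, aes(x = factor(year))) +
    geom_bar() +
    labs(x = "Year", y = "Publications")
}

# Write the chart to `file` as a PNG; width, height and dpi are arbitrary choices.
write_trend_png <- function(works, file) {
  ggsave(file, plot = make_trend_plot(works), device = "png",
         width = 7, height = 4, dpi = 150)
  invisible(file)
}

# Inside a server function this could be wired to a download button:
# output$download_png <- downloadHandler(
#   filename = function() paste0("trend_", Sys.Date(), ".png"),
#   content  = function(file) write_trend_png(works_data(), file)
# )
```

Keeping the plotting code in a plain function means the same chart can be shown on screen with `renderPlot()` and written to disk for download.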
biblioshiny() from the bibliometrix package (Aria and Cuccurullo 2017) provides a ready-made Shiny interface for many standard analyses. Building a custom app gives full control over the analysis pipeline and user experience.
Deployment options include shinyapps.io (hosted by Posit, free tier available), Posit Connect (enterprise), and self-hosted via Shiny Server (open source).
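For the shinyapps.io route, deployment is usually done with the rsconnect package. The wrapper below is a sketch under stated assumptions: `deploy_explorer()` and the default app name are hypothetical, and the credentials come from your own shinyapps.io account. Nothing is uploaded until the function is called.

```r
# Deployment sketch for shinyapps.io; defining the function deploys nothing.
deploy_explorer <- function(app_dir, account, token, secret,
                            app_name = "bibliometric-explorer") {
  if (!requireNamespace("rsconnect", quietly = TRUE)) {
    stop("Install the 'rsconnect' package first.")
  }
  # Register the credentials shown in the shinyapps.io dashboard.
  rsconnect::setAccountInfo(name = account, token = token, secret = secret)
  # Upload the directory that contains app.R (or ui.R and server.R).
  rsconnect::deployApp(appDir = app_dir, appName = app_name, account = account)
}

# Example call (environment variable names are assumptions, not a convention):
# deploy_explorer(
#   app_dir = "path/to/app",
#   account = "my-account",
#   token   = Sys.getenv("SHINYAPPS_TOKEN"),
#   secret  = Sys.getenv("SHINYAPPS_SECRET")
# )
```

Reading the token and secret from environment variables keeps them out of the script and out of version control.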
38.4 Worked example
38.4.1 App structure
We demonstrate the key components of a bibliometric Shiny app. The code below shows the UI and server logic separately for clarity.
38.4.2 UI definition
# Packages used by the UI and server code in this example
library(shiny)
library(tidyverse)  # dplyr, ggplot2, readr, lubridate
library(openalexR)  # oa_fetch()

ui <- fluidPage(
  titlePanel("Bibliometric Explorer"),
  sidebarLayout(
    sidebarPanel(
      textInput("journal_id", "OpenAlex Source ID:",
                value = "S148561398"),
      # sep = "" stops the slider from showing years as "2,020"
      sliderInput("year_range", "Publication Years:",
                  min = 2015, max = 2023, value = c(2020, 2023), sep = ""),
      numericInput("sample_size", "Sample Size:", value = 200,
                   min = 50, max = 1000),
      actionButton("fetch", "Fetch Data", class = "btn-primary"),
      hr(),
      downloadButton("download_csv", "Download CSV")
    ),
    mainPanel(
      tabsetPanel(
        tabPanel("Trends",
                 plotOutput("trend_plot"),
                 tableOutput("summary_table")),
        tabPanel("Citations",
                 plotOutput("cite_dist"),
                 plotOutput("cite_boxplot")),
        tabPanel("Data",
                 DT::dataTableOutput("data_table"))
      )
    )
  )
)
38.4.3 Server logic
# Note: palette_sci() and theme_sci() are assumed to be the book's plotting
# helpers, defined elsewhere; they are not part of ggplot2.
server <- function(input, output, session) {

  # Fetch only when the button is clicked, not on every input change
  works_data <- eventReactive(input$fetch, {
    withProgress(message = "Fetching from OpenAlex...", {
      oa_fetch(
        entity = "works",
        primary_location.source.id = input$journal_id,
        from_publication_date = paste0(input$year_range[1], "-01-01"),
        to_publication_date = paste0(input$year_range[2], "-12-31"),
        type = "article",
        options = list(sample = input$sample_size, seed = 42)
      ) |>
        transmute(
          id, title = display_name,
          year = year(publication_date),
          cited_by_count, oa_status
        )
    })
  })

  # Publications per year
  output$trend_plot <- renderPlot({
    req(works_data())
    works_data() |>
      count(year) |>
      ggplot(aes(x = factor(year), y = n)) +
      geom_col(fill = palette_sci(1)) +
      labs(x = "Year", y = "Publications") +
      theme_sci()
  })

  # Distribution of citation counts
  output$cite_dist <- renderPlot({
    req(works_data())
    ggplot(works_data(), aes(x = cited_by_count)) +
      geom_histogram(binwidth = 5, fill = palette_sci(1), colour = "white") +
      labs(x = "Citations", y = "Count") +
      theme_sci()
  })

  # Citations by open-access status; +1 keeps zero-citation papers on the log scale
  output$cite_boxplot <- renderPlot({
    req(works_data())
    works_data() |>
      filter(!is.na(oa_status)) |>
      ggplot(aes(x = oa_status, y = cited_by_count + 1)) +
      geom_boxplot(fill = palette_sci(1), alpha = 0.7) +
      scale_y_log10() +
      labs(x = "OA Status", y = "Citations + 1 (log scale)") +
      theme_sci()
  })

  # One-row summary of the fetched sample
  output$summary_table <- renderTable({
    req(works_data())
    works_data() |>
      summarise(
        Articles = n(),
        `Mean Citations` = round(mean(cited_by_count), 1),
        `Median Citations` = median(cited_by_count),
        `OA Rate` = scales::percent(mean(oa_status != "closed", na.rm = TRUE))
      )
  })

  # Browsable table, most cited first
  output$data_table <- DT::renderDataTable({
    req(works_data())
    works_data() |>
      select(title, year, cited_by_count, oa_status) |>
      arrange(desc(cited_by_count))
  })

  # CSV export of the fetched records
  output$download_csv <- downloadHandler(
    filename = function() paste0("biblio_data_", Sys.Date(), ".csv"),
    content = function(file) write_csv(works_data(), file)
  )
}
38.4.4 Running the app
shinyApp(ui = ui, server = server)
38.5 Diagnostics and interpretation
- Responsiveness: API calls can take several seconds. Use withProgress() to show loading indicators.
- Error handling: Wrap API calls in tryCatch() to display user-friendly error messages instead of stack traces.
- Memory: Each user session loads data into R memory. For apps with many concurrent users, pre-aggregate data or use database backends.
- Testing: Use shinytest2 to automate testing of Shiny apps.
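The error-handling advice above can be sketched as a small wrapper around the fetch step. `safe_fetch()` and `failing_fetch()` are hypothetical names introduced for illustration; the failing function simulates an API error so the example runs without network access.

```r
# Run a fetch function and capture any error as a message, not a stack trace.
safe_fetch <- function(fetch_fun, ...) {
  tryCatch(
    list(data = fetch_fun(...), error = NULL),
    error = function(e) list(data = NULL, error = conditionMessage(e))
  )
}

# Stand-in for an API call that fails (hypothetical; no network needed).
failing_fetch <- function() stop("OpenAlex request timed out")

result <- safe_fetch(failing_fetch)
result$error  # the captured message; result$data is NULL

# In a server function, the message can be passed to shiny's validate()/need(),
# where fetch_works is a hypothetical wrapper around the API call:
# res <- safe_fetch(fetch_works, input$journal_id)
# validate(need(is.null(res$error), paste("Could not fetch data:", res$error)))
```

Returning both `data` and `error` lets the calling code decide how to present the failure, for example as a notification or as a message in place of the plot.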
38.7 Limitations and responsible use
- Shiny apps are not analyses. They enable exploration but do not substitute for rigorous, documented analytical pipelines. Use Shiny for discovery, not for producing final results.
- API rate limits. Live OpenAlex queries from a Shiny app can hit rate limits if many users fetch data simultaneously. Use caching or pre-fetched data for production apps.
- Maintenance. Shiny apps require ongoing maintenance: API changes, package updates, and server costs. Plan for long-term sustainability before deploying.
- Interpretive guardrails. Non-technical users may misinterpret metrics. Build in tooltips, descriptions, and warnings — especially around citation-based indicators (Hicks et al. 2015; American Society for Cell Biology 2012).
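The caching suggested under API rate limits could, in its simplest form, look like the sketch below. It is an assumption-laden illustration, not a production design: `query_cache`, `cached_fetch()`, and `slow_fetch()` are hypothetical, the cache lives in memory and never expires, and `slow_fetch()` stands in for a real OpenAlex call.

```r
# In-memory cache; defined outside server() it is shared by all sessions
# running in the same R process.
query_cache <- new.env(parent = emptyenv())

cached_fetch <- function(fetch_fun, ...) {
  # Build a key from the argument names and values.
  key <- paste(deparse(list(...)), collapse = "")
  if (!exists(key, envir = query_cache, inherits = FALSE)) {
    assign(key, fetch_fun(...), envir = query_cache)
  }
  get(key, envir = query_cache, inherits = FALSE)
}

# Stand-in for an API call; counts how often it is actually executed.
calls <- 0
slow_fetch <- function(journal_id, from_year) {
  calls <<- calls + 1
  data.frame(journal_id = journal_id, year = from_year)
}

first  <- cached_fetch(slow_fetch, journal_id = "S148561398", from_year = 2020)
second <- cached_fetch(slow_fetch, journal_id = "S148561398", from_year = 2020)
calls  # slow_fetch() ran only for the first request
```

Shiny also provides `bindCache()` for caching reactives and render functions, which may be preferable to a hand-rolled cache in a real app.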
38.9 Common pitfalls
- Fetching data on every input change. Use eventReactive() with an action button instead of reactive() to prevent re-fetching on every slider adjustment.
- Not validating inputs. Check that the journal ID exists and the date range is valid before making API calls.
- Monolithic server functions. Break the server logic into modules for maintainability as the app grows.
- Deploying without testing. Test with realistic data volumes before deploying. An app that works with 100 records may fail with 10,000.
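Input validation, as recommended above, can start with cheap local checks before any request is sent. The sketch below is hedged accordingly: `check_inputs()` is a hypothetical helper, and the pattern assumes OpenAlex source IDs consist of "S" followed by digits. It checks the format only; confirming that the ID exists still requires an API call.

```r
# Return NULL when the inputs look valid, otherwise a message for the user.
check_inputs <- function(journal_id, year_range) {
  # Assumption: source IDs look like "S148561398".
  if (!grepl("^S[0-9]+$", journal_id)) {
    return("Source ID should look like 'S148561398'.")
  }
  if (length(year_range) != 2 || anyNA(year_range) ||
      year_range[1] > year_range[2]) {
    return("Choose a start year that is not after the end year.")
  }
  NULL
}

check_inputs("S148561398", c(2020, 2023))  # valid inputs
check_inputs("journal-of-x", c(2020, 2023))  # malformed ID

# In a server function, before fetching:
# msg <- check_inputs(input$journal_id, input$year_range)
# validate(need(is.null(msg), msg))
```

Because the check is a plain function, it can be unit-tested without starting the app.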
38.10 Exercises
1. Add network tab. Extend the app with a tab that builds and displays a co-authorship network using visNetwork.
2. Caching. Implement server-side caching so that repeated queries with the same parameters return instantly.
3. Deployment. Deploy the app to shinyapps.io. Test with different journal IDs and date ranges.
4. biblioshiny comparison. Run biblioshiny() and compare its features with your custom app. What does biblioshiny offer that your app does not?
38.11 Solutions
Solutions are provided in 2.11.
38.12 Further reading
- Aria and Cuccurullo (2017): bibliometrix and biblioshiny() for turnkey bibliometric exploration.
- Priem et al. (2022): OpenAlex API for reactive data fetching.
- Hicks et al. (2015): The Leiden Manifesto: designing interfaces that prevent metric misuse.
- American Society for Cell Biology (2012): DORA: responsible assessment in interactive tools.
38.13 Session info
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] shiny_1.13.0 DT_0.34.0
#> [3] plotly_4.12.0 uwot_0.2.4
#> [5] Matrix_1.7-0 word2vec_0.4.1
#> [7] stm_1.3.8 topicmodels_0.2-17
#> [9] quanteda.textstats_0.97.2 visNetwork_2.1.4
#> [11] ggraph_2.2.2 tidygraph_1.3.1
#> [13] igraph_2.3.1 quanteda_4.4
#> [15] pdftools_3.9.0 arrow_24.0.0
#> [17] bibliometrix_5.4.0 RefManageR_1.4.0
#> [19] bib2df_1.1.2.0 rcrossref_1.2.1
#> [21] gt_1.3.0 tidytext_0.4.3
#> [23] glue_1.8.1 openalexR_3.0.1
#> [25] lubridate_1.9.5 forcats_1.0.1
#> [27] stringr_1.6.0 dplyr_1.2.1
#> [29] purrr_1.2.2 readr_2.2.0
#> [31] tidyr_1.3.2 tibble_3.3.1
#> [33] ggplot2_4.0.3 tidyverse_2.0.0
#>
#> loaded via a namespace (and not attached):
#> [1] splines_4.4.1 later_1.4.8 urltools_1.7.3.1
#> [4] cellranger_1.1.0 triebeard_0.4.1 polyclip_1.10-7
#> [7] XML_3.99-0.23 lifecycle_1.0.5 httr2_1.2.2
#> [10] rprojroot_2.1.1 NLP_0.3-2 lattice_0.22-6
#> [13] brand.yml_0.1.0 vroom_1.7.1 MASS_7.3-60.2
#> [16] crosstalk_1.2.2 backports_1.5.1 SnowballC_0.7.1
#> [19] magrittr_2.0.5 openxlsx_4.2.8.1 sass_0.4.10
#> [22] rmarkdown_2.31 jquerylib_0.1.4 yaml_2.3.12
#> [25] httpuv_1.6.17 otel_0.2.0 zip_2.3.3
#> [28] askpass_1.2.1 RColorBrewer_1.1-3 downlit_0.4.5
#> [31] contentanalysis_1.0.0 tweenr_2.0.3 rappdirs_0.3.4
#> [34] tm_0.7-18 ggrepel_0.9.8 tokenizers_0.3.0
#> [37] crul_1.6.0 rentrez_1.2.4 RSpectra_0.16-2
#> [40] codetools_0.2-20 xml2_1.5.2 ggforce_0.5.0
#> [43] tidyselect_1.2.1 rscopus_0.9.0 httpcode_0.3.0
#> [46] farver_2.1.2 viridis_0.6.5 matrixStats_1.5.0
#> [49] stats4_4.4.1 base64enc_0.1-6 jsonlite_2.0.0
#> [52] tools_4.4.1 stringdist_0.9.17 Rcpp_1.1.1-1.1
#> [55] gridExtra_2.3 xfun_0.57 here_1.0.2
#> [58] mgcv_1.9-1 ca_0.71.1 withr_3.0.2
#> [61] fastmap_1.2.0 digest_0.6.39 timechange_0.4.0
#> [64] R6_2.6.1 mime_0.13 qpdf_1.4.1
#> [67] dichromat_2.0-0.1 utf8_1.2.6 generics_0.1.4
#> [70] data.table_1.18.4 FNN_1.1.4.1 graphlayouts_1.2.3
#> [73] stopwords_2.3 httr_1.4.8 htmlwidgets_1.6.4
#> [76] pkgconfig_2.0.3 gtable_0.3.6 modeltools_0.2-24
#> [79] S7_0.2.2 janeaustenr_1.0.0 htmltools_0.5.9
#> [82] bookdown_0.46 scales_1.4.0 knitr_1.51
#> [85] rstudioapi_0.18.0 tzdb_0.5.0 reshape2_1.4.5
#> [88] nlme_3.1-164 curl_7.1.0 cachem_1.1.0
#> [91] parallel_4.4.1 miniUI_0.1.2 shinycssloaders_1.1.0
#> [94] pubmedR_1.0.2 pillar_1.11.1 grid_4.4.1
#> [97] vctrs_0.7.3 slam_0.1-55 promises_1.5.0
#> [100] xtable_1.8-8 evaluate_1.0.5 cli_3.6.6
#> [103] compiler_4.4.1 rlang_1.2.0 crayon_1.5.3
#> [106] labeling_0.4.3 dimensionsR_0.0.3 plyr_1.8.9
#> [109] fs_2.1.0 stringi_1.8.7 viridisLite_0.4.3
#> [112] assertthat_0.2.1 lazyeval_0.2.3 bibliometrixData_0.3.0
#> [115] hms_1.1.4 patchwork_1.3.2 bit64_4.8.0
#> [118] humaniformat_0.6.0 broom_1.0.12 memoise_2.0.1
#> [121] bslib_0.11.0 bibtex_0.5.2 fastmatch_1.1-8
#> [124] bit_4.6.0 nsyllable_1.0.1 readxl_1.4.5