Chapter 4 Obtaining Data from Database
4.1 Duckplyr
If you are new to database queries but familiar with dplyr
and tidyverse
, we recommend using duckplyr
. You can find more information about duckplyr
here but the main takeaway is that you can query your database using dplyr phrases and pipes, while also saving memory on loading data.
library(celltracktech)
con <- DBI::dbConnect(duckdb::duckdb(),
dbdir = "./data/Meadows V2/meadows.duckdb",
read_only = FALSE)
# load raw data table and find unique tags
unique_tags = tbl(con, 'raw') |>
group_by(tag_id) |>
summarize(num_detect = n()) |>
select(tag_id, num_detect) |>
arrange(desc(num_detect)) |>
collect()
DBI::dbDisconnect(con)
4.2 SQL Queries
If you are more comfortable with the SQL syntax, you can use SQL queries to get data from different tables.
This chapter is a quick summary on how to use SQL in R. If you would like more info on how to run different queries, use this tutorial: https://solutions.posit.co/connections/db/getting-started/database-queries/