The bulk of the data access protocol involves converting the
.accdb salvage database file on the remote FTP server to a local set of
.csv files named by the tables in the database. We accomplish this in two lines of code by pulling and then running a stable
Docker software container that contains a set of
bash scripts designed specifically for this task. The specific image used for data access is called
accessor, is freely available on Docker Hub, and has default settings configured for the salvage database. Code for the construction of the
accessor image is available in its repository.
For accessibility and reproducibility, we provide an up-to-date version of the salvage data as
.csvs from the “current” (1993 - Present) salvage database file (
Salvage_data_FTP.accdb). The data can be downloaded via various methods from the repository, including from the website.
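For example, the files can be retrieved from the command line; the repository URL below is a placeholder, not the actual location:

```shell
# <repository-url> is a placeholder; substitute the repository's actual URL.
# Clone the full repository, including the current .csv files:
git clone <repository-url>
```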
To use the current image to generate an up-to-date container with data for yourself:
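A minimal sketch of that two-line workflow follows; the Docker Hub namespace (`<user>`) and the in-container data path are assumptions, since only the accessor image name is given above:

```shell
# Pull the stable accessor image from Docker Hub and run it with its
# default salvage-database settings (<user> is a placeholder namespace).
docker pull <user>/accessor
docker run --name accessor <user>/accessor

# The container's top layer now holds the tables as .csv files; copy them
# to the host (the in-container path here is an assumption):
docker cp accessor:/salvage ./salvage_data
```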
An additional conversion makes the data available in
R as a set of
data.frames directly analogous to the
.accdb database of tables.
Building on the list above, you can leverage the
r_script.R script included in the image, which sources the
r_functions.R script and loads the database in as an
R object named
database. Docker provides ample avenues for running R within the container. For example, the
docker exec command provides access to the top (read/write) layer of the container, allowing arbitrary R code to be passed as a single character string to
Rscript from the command line:
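For instance (a hedged sketch: the container name accessor is assumed to match your docker run --name, and the database object is loaded by r_script.R as described above):

```shell
# Run R inside the running container's top (read/write) layer, passing the
# R code as a single character string to Rscript.
docker exec accessor Rscript -e 'source("r_script.R"); names(database)'
```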
You can copy your own scripts into the image and then run them from that environment:
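A sketch of that copy-and-run pattern (the container name and in-container destination path are assumptions; my_r_script.R stands for any script of your own):

```shell
# Copy a local script into the running container's top layer...
docker cp my_r_script.R accessor:/my_r_script.R
# ...then run it after the main script, which loads the database object:
docker exec accessor Rscript -e 'source("r_script.R"); source("my_r_script.R")'
```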
Note that we still run the main
r_script.R first here; that script does not save any files externally, so the R session from step 6 is gone by the time we run step 8. To simplify the command-line call, it is therefore recommended that expanded uses follow step 8 and use
my_r_script.R as a hub file that directs all of your specific functions, including files saved out from R. For simplicity, saving all output files into a single folder, e.g.
output, allows a single docker command to retrieve the results from the top layer of the container:
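A sketch of that retrieval (the container name accessor and the in-container location of the output folder are assumptions):

```shell
# Copy the whole output folder from the container's top layer to the host.
docker cp accessor:/output ./output
```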
Alternatively, the functions are written in base R only, so they should also work reproducibly outside the image (in an open R session).
Within an instance of
R, navigate to where you have read the data out from the container (or to where this code repository is located) and source
r_script.R. The resulting
database object is a named
list of the database's tables, ready for analyses.
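From the command line, the same open-session load can be sketched as follows, assuming the working directory contains r_script.R and the exported .csv files:

```shell
# Load the database in a fresh R session outside the image (base R only)
# and inspect the top level of the resulting named list of tables.
Rscript -e 'source("r_script.R"); str(database, max.level = 1)'
```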
Data preparation code is in development!
Having brought the data into R as-is, we can now prepare them for summaries and analyses. We use the functions in the
salvage_functions.R script, included within the
salvage Docker image, which provides a stable runtime environment for the analyses and output generation (including website rendering).