Managing Data Products for the Canadian Hydrogen Intensity Mapping Experiment
The Canadian Hydrogen Intensity Mapping Experiment (CHIME) is a revolutionary new radio telescope designed to answer major questions in astrophysics and cosmology. Its data rate and remote location make it a major challenge to manage the collected data and make it available for scientific analysis. As a solution, the CHIME team has written Alpenhorn, a set of tools for managing an archive of scientific data across multiple sites.
Alpenhorn tracks data as files are added and indexes their contents in its database according to their type. An extensible framework allows projects to define custom data types, associated metadata, and indexing logic. Cron scripts and human operators can request that those files be synced to other locations; Alpenhorn instances running at the destination then manage the transfers automatically, using a method appropriate to the host (e.g., bbcp, rsync, or hpss). The database tracks all copies of every file, manages the available disk storage at the destination, and ensures file integrity and sufficient replication.
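The bookkeeping described above (tracking every copy, checking integrity, and flagging files with too few replicas) can be illustrated with a minimal sketch. This is not Alpenhorn's code or API: the class `CopyRegistry`, its methods, the `min_copies` threshold, the choice of MD5 as the checksum, and the node names are all hypothetical, and an in-memory dictionary stands in for the real database.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class CopyRegistry:
    """Hypothetical in-memory stand-in for an archive index database."""

    min_copies: int = 2  # assumed replication target, not a documented value
    checksums: dict = field(default_factory=dict)  # file name -> expected MD5 hex digest
    copies: dict = field(default_factory=dict)  # file name -> set of node names

    def register(self, name, data, node):
        """Record a new file, first seen on `node`, with its checksum."""
        self.checksums[name] = hashlib.md5(data).hexdigest()
        self.copies.setdefault(name, set()).add(node)

    def record_copy(self, name, data, node):
        """Accept a transferred copy only if its checksum matches the original."""
        if hashlib.md5(data).hexdigest() != self.checksums[name]:
            return False  # corrupt transfer: do not count it as a copy
        self.copies[name].add(node)
        return True

    def under_replicated(self):
        """Return the names of files held on fewer than `min_copies` nodes."""
        return sorted(
            name for name, nodes in self.copies.items() if len(nodes) < self.min_copies
        )


registry = CopyRegistry(min_copies=2)
registry.register("acq_0001.h5", b"payload", "site_a")
needs_sync = registry.under_replicated()  # the file exists on one node only
rejected = registry.record_copy("acq_0001.h5", b"corrupt", "site_b")
accepted = registry.record_copy("acq_0001.h5", b"payload", "site_b")
```

In this sketch a copy counts toward replication only after its checksum has been verified, so a corrupted transfer leaves the file on the list of files that still need syncing.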
Software Developer, Department of Physics and Astronomy, University of British Columbia