Data Products
The chief scientist of each cruise will receive one copy of all data from the cruise. This copy may also include intermediate products, nascent products, mission planning files, and other elements, which are described in a “Data Deliverables Summary” included on the drive.
Storage Media Formatting
Media formatting is a difficult issue given the variety of platforms. All of our data storage uses Ubuntu Linux and the EXT4 filesystem for internal purposes. We strongly recommend that you do the same if you are able, but we recognize that other formats may be necessary. We can provide a freeware plug-in to allow easy access to EXT4 from a Windows PC, but we have not yet found a viable way to enable this for Mac. We are also able to write NTFS-formatted drives for Windows. During the dive planning meeting, the team will ask you to confirm which media format you would like.
Additional Copies
As a part of your cruise, the Chief Scientist will receive one copy of the data in the form of one or more external hard drives or RAID arrays. We are happy to write additional copies, but you must provide the media in a format compatible with our systems. The exact media needs will vary by cruise, but typically will be either:
- 5 - 8TB USB 3.0 external hard drives (typically $150 - $200 per drive)
- 1 - ProRaid 4bay, USB 3.0 (USB 2.0 or eSATA are not acceptable) with 4 - 5TB drives installed. Expect these to cost $1200 - $1500 each with drives.
- In extreme cases two or more of the ProRaid boxes may be required per cruise.
If you contact us ahead of time, we should be able to give you a good idea of what will be needed and can point you to specific links for purchasing the correct items online. We currently use either 5TB or 8TB USB3.0 “Expansion Desktop Drives” from Seagate and recommend users provide the same.
Organization of Delivered Data
Sentry cruise data has been organized in several different ways since the start of operations. Since January 2014, data has generally been organized as described below. However, because we are continuously seeking improvement and because we often add custom systems, small additions or variations still occur, particularly in directories used for intermediate products. The specific data directory organization for a cruise is typically described in the cruise report. Please contact us directly with any questions. Cruise data is organized into a number of directories. The top-level directory structure contains the following directories:
- Dives - All raw and processed data from individual dives
- Docs - Documents pertaining to the cruise such as launch positions and dive statistic summaries
- Planning - Files pertaining to mission planning. These are not generally needed by science
- Planning-bathy - This is the bathymetry provided by science for planning purposes
- Plots - Auto-generated plots from the post processing pipeline
- Products - The best at sea derived data products from the cruise
- Raw-usbl - Log and configuration files from the Sonardyne USBL system
- Svp - Sound velocity profiles used during the cruise
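As a quick sanity check on a delivered drive, the top-level layout listed above can be compared against what is actually present. The following is only a minimal sketch: the function name is hypothetical, the comparison is case-insensitive on the assumption that capitalization may vary between cruises, and a reported directory may simply reflect the cruise-specific variations mentioned above.

```python
from pathlib import Path

# Top-level directory names as listed above. Capitalization on a delivered
# drive may differ, so names are compared case-insensitively (assumption).
EXPECTED_TOP_LEVEL = [
    "Dives", "Docs", "Planning", "Planning-bathy",
    "Plots", "Products", "Raw-usbl", "Svp",
]


def missing_top_level_dirs(cruise_root):
    """Return the expected top-level directories not found under cruise_root."""
    present = {p.name.lower() for p in Path(cruise_root).iterdir() if p.is_dir()}
    return [name for name in EXPECTED_TOP_LEVEL if name.lower() not in present]
```

Calling `missing_top_level_dirs` with the mount point of the drive returns a list of the names that were not found, which is empty when the layout matches the list above.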
At-Sea Processed Data Products - Products Directory
The products directory contains a directory for each dive in the format sentry<xxx>. Most data products include a date and time stamp in the file name. For images, this is the time the image was taken; for all other products, it is the time of the renavigation process, so files created with the same navigation can be matched by their time stamps.
Within each dive directory the following directories are included:
- hf-sss - This directory contains data products generated from the 410kHz sidescan sonar system. Note that for a particular survey it is typical to have only HF or LF products, not both.
- lf-sss - This directory contains data products generated from the 120kHz sidescan sonar system. Note that for a particular survey it is typical to have only HF or LF products, not both.
- Multibeam - This directory contains the data products from Sentry's multibeam sonar, including grd and pdf files. Most users will want to use the file sentryxxx_yyyymmdd_hhmm_nav_tide_xxx.grd, where the trailing xxx is the grid size. If _nav_ is included in the file name, mbnavadjust was applied. This is not common, but if available these files are probably preferable to the others.
- Photos - This directory contains thumbnails and movies of the photos collected by Sentry. Full resolution photos can be found in the dives directory.
- Sbp - This directory contains the products from the sub-bottom profiler.
- Scc - SCCs are 1Hz ASCII files containing post processed navigation and selected other science data. The timestamps on the SCCs can be matched to other data products. This flat ASCII file contains the date, time, latitude, longitude, depth, pressure, heading, altitude, data from one of the magnetometers, optical backscatter (if available), Oxygen Redox Probe, dissolved oxygen (if available), conductivity, temperature, and sound velocity. The file name contains both the dive number and the date on which the scc file was generated. If there are multiple scc files for a single dive, use the file with the most recent date. All fields in the scc file have been interpolated onto a 1 second time base. Users wanting to load the data into Matlab should use the mat files in the nav-sci directory.
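Because product file names carry the dive number and a date and time stamp, choosing the most recent file can be scripted. The sketch below assumes the stamp appears as sentry<dive>_<yyyymmdd>_<hhmm> at the start of the name, as in the multibeam grid example above; other products, including the scc files, may use a different layout, so the pattern should be checked against the actual files. The function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern: "sentry<dive>_<yyyymmdd>_<hhmm>" at the start of the name,
# taken from the multibeam grid example. Other products may differ.
STAMP_RE = re.compile(r"^sentry(?P<dive>\d+)_(?P<date>\d{8})_(?P<time>\d{4})")


def parse_product_name(filename):
    """Return (dive_number, timestamp, nav_adjusted), or None if no match."""
    match = STAMP_RE.match(filename)
    if match is None:
        return None
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M")
    # "_nav_" in the name indicates that mbnavadjust was applied.
    return int(match["dive"]), stamp, "_nav_" in filename


def most_recent(filenames):
    """Return the parseable file name with the latest time stamp, or None."""
    parsed = [(parse_product_name(name), name) for name in filenames]
    dated = [(info[1], name) for info, name in parsed if info is not None]
    return max(dated)[1] if dated else None
```

Files whose names do not match the assumed pattern are ignored rather than treated as errors, since the products directories can contain cruise-specific additions.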
Raw and Intermediate Data - Dives Directory
The dives directory contains the raw and intermediate data for each dive. Within the dives directory there will be a directory for each dive labeled as sentry<xxx>. Typically there will also be a directory labeled pre-cruise that contains assorted data from tests conducted prior to the first dive.
Within each dive directory the following directories exist:
- multibeam
  - raw - raw s7k files. Does not include navigation data
  - proc - all mbsystem files and inputs
  - nav - vehicle navigation data
  - log - real-time driver output. Not generally useful for data processing
  - timing_test (optional) - separate directory used to compute or check timing offsets if relevant
- nav-sci - This directory contains all of the navigation, science, and engineering data logged by the vehicle during the dive. Most of this data is provided for archival purposes only. The scc files provide all standard sensor and vehicle navigation data. Users wishing to load data into Matlab can use the mat files in /proc. The structure of this directory is:
  - nav-sci
    - proc - Matlab and other files containing processed data extracted from the rosbag files. Files with the dive number contain data from the dive. Files with a date and time in the name include data from the post-processed navigation solution generated at the date and time indicated. If multiple such files exist, use the most recent. The contents of these files are described in more detail in the Appendix.
    - raw
      - topside-nav - topside tracking data
      - mc - mission controller files
      - rosbag - raw vehicle science and engineering data. Much of this, including all science data, is converted to Matlab files in the proc folder during post-processing.
- photos - We provide images in several formats with different levels of processing. These include the raw Bayer-encoded (color) TIFF files directly from the camera real-time software, should users choose to reprocess those images. We also provide automated processing for color compensation and equalization. Filenames include the date and time and can be used in conjunction with the SCC to obtain information on vehicle state and scientific sensors. Presently, Sentry takes photos during the planned camera surveys and, if the dive ends with a photo survey, also during the ascent, so there may be photos of the water column. The photos are stored in the following directory structure:
  - photos
    - raw - Bayer-encoded original images (not useful without reprocessing)
    - proc - color corrected and smoothed color TIFF photos
    - labeled - color corrected and smoothed color TIFF photos with navigation and some science data interpolated and rendered at the top of the image
    - thumbnails - smaller, jpeg-encoded versions of the labeled photos
- sss-sbp - All of the data from Sentry's sub-bottom profiler and sidescan sonar. We provide the raw and processed Edgetech sonar data. These data are processed using the commercial software SonarWiz5, developed by Chesapeake Technology Inc., into which the raw sonar files (.jsf) are imported. The software generates a project directory structure and associated files, and populates the directories for each sonar data set processed. For each dive, there is a folder containing the raw data (JSF) files, a folder containing the navigated data files, and a SonarWiz project sub-directory for each processed sonar (LF=120kHz, HF=410kHz, SBP=chirp sub-bottom). The structure of this directory is:
  - sss-sbp
    - raw - raw JSF files as recorded by the sonar. These do not include navigation
    - navigated - The raw JSF files after navigation has been injected
    - lf-sss
      - *** SonarWiz Project Directories for low-frequency (120kHz) sidescan ***
    - hf-sss
      - *** SonarWiz Project Directories for high-frequency (410kHz) sidescan ***
    - sbp
      - *** SonarWiz Project Directories for subbottom profiler ***
- Blueview – All data from Sentry’s BlueView P900 multibeam imaging sonar. This data is typically only used for collision-avoidance in real-time. The structure of the directory is:
  - *.DAT – Engineering data from the real-time collision avoidance software
  - son – Raw BlueView son files. Can be viewed using BlueView’s ProViewer (3.6 or later) software
- subsea-acomm - log files from Sentry’s WHOI micro-modem if installed
- topside-comms – log files from various topside communications links
  - acomm – log files from the WHOI micro-modem installed on the ship
  - sdyne – log files from interacting with the Sonardyne Ranger or Ranger 2 system installed on the ship
  - iridium – log files from the Iridium satellite modem installed on the ship
- metadata – Metadata generated by Sentry’s predive, recording sensor configuration, serial numbers, etc. All data is provided in an ini-file format. Each file is marked with the date and time it was generated. If multiple copies of a file are present, use the most recent one from before the start of the dive.
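The rule for metadata files (use the most recent copy generated before the start of the dive) can be sketched in a few lines. This is an illustration under stated assumptions, not a description of the actual file naming: it assumes the generation time appears somewhere in the file name as yyyymmdd_hhmm, and the function names are hypothetical. Reading the ini-format contents uses Python's standard configparser module.

```python
import configparser
import re
from datetime import datetime
from pathlib import Path

# Assumption: the generation time appears in the file name as yyyymmdd_hhmm.
STAMP_RE = re.compile(r"(\d{8})_(\d{4})")


def file_stamp(path):
    """Return the time stamp embedded in a file name, or None if absent."""
    match = STAMP_RE.search(Path(path).name)
    if match is None:
        return None
    return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M")


def latest_before(paths, dive_start):
    """Pick the most recently generated file stamped before dive_start."""
    stamped = [(file_stamp(p), str(p)) for p in paths]
    eligible = [(t, p) for t, p in stamped if t is not None and t < dive_start]
    return max(eligible)[1] if eligible else None


def load_metadata(path):
    """Read an ini-format metadata file into a dict of sections."""
    # Interpolation is disabled so that literal '%' characters are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    return {name: dict(parser[name]) for name in parser.sections()}
```

The dive start time is not derived here; it would need to come from the cruise documents or the navigation data, and the copies passed to `latest_before` should all be versions of the same metadata file.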
Data Delivery
When the data package is delivered at the end of the cruise, the chief scientist will be asked to:
(1) review a form that describes the package inventory and provides details for returning the drives to WHOI, and acknowledge delivery of the package.
(2) assign an embargo duration to the various data components (NSF policy states a maximum of 2 years).
(3) name the institutions to which acknowledgement is due when cruise data is used for outreach or commercial purposes.
Please review the updated NSF and NDSF data policies for more information.