Data Deliverables Summary for HOV Alvin
Overview
Introduction
The data package received at the conclusion of an Alvin cruise is a collection of real-time automated logs, observer logs and photos, raw data collections, vehicle video, and post-processed products. This document is intended to describe the contents and organization of the package so groups can process and use it more efficiently after the cruise.
Data Deliverables
General organization of the data package
The Alvin data package will be delivered on two or more portable USB hard drives to the Chief Scientist. This is a first-generation copy of the complete data package provided for transport back to the Chief Scientist’s home institution. Long-term and online preservation of the data package will be managed by WHOI in accordance with the NDSF data policy.
When the data package is delivered at the end of the cruise, the Chief Scientist will be asked to:
- Review a form that describes the package inventory and acknowledges delivery of the package.
- Assign embargo duration on the various data components (NSF policy states a maximum of 2 years).
- Name institutions to which acknowledgement is due when cruise data is used for outreach or commercial purposes.
Non-video vehicle data collected throughout the cruise will be provided on a single hard drive; the raw and proxy video packages generated by the imaging system will be provided on one or more additional drives. For planning purposes, current volume estimates are approximately 25-50GB of non-video data per dive and 2-3TB of video data.
The content on the hard drives will resemble the following, though variations may occur:
Drive 1: Vehicle Data Deliverables
Vehicle Data Hard Drive
AT##-## {cruise ID}
|-----Documentation
|-----Sealog
|-----AL#### {dive ID}
| |-----Audio
| |-----C+C
| |-----ExtPhotos
| |-----IntPhotos
| |-----Nav
| |-----Proc
| |-----Science
| |-----Toplab_Navest
| |-----USBL
| |-----Videos
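As a starting point for working with this layout, the following minimal Python sketch lists the dive directories under a cruise directory and reports which of the per-dive subdirectories shown above are absent. The function names are illustrative only, and the four-digit dive number is an assumption based on the AL#### placeholder. Because variations in the layout may occur, a missing subdirectory is not necessarily an error.

```python
import re
from pathlib import Path

# Dive directories are named AL#### in the layout above; exactly four digits
# is an assumption based on that placeholder.
DIVE_DIR = re.compile(r"^AL\d{4}$")

# Per-dive subdirectories as listed in the tree above.
EXPECTED_SUBDIRS = ("Audio", "C+C", "ExtPhotos", "IntPhotos", "Nav",
                    "Proc", "Science", "Toplab_Navest", "USBL", "Videos")


def list_dive_dirs(cruise_root):
    """Return dive directories found directly under the cruise directory, sorted by name."""
    root = Path(cruise_root)
    return sorted(p for p in root.iterdir()
                  if p.is_dir() and DIVE_DIR.match(p.name))


def missing_subdirs(dive_dir, expected=EXPECTED_SUBDIRS):
    """Return the names of expected subdirectories that are absent from dive_dir."""
    return [name for name in expected if not (Path(dive_dir) / name).is_dir()]
```

Calling list_dive_dirs on the mounted cruise directory and then missing_subdirs on each result gives a quick inventory check before any further processing.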
1. Documentation
Location: Data drive in /AT##-##/Documentation
Description: This directory contains all documentation and metadata necessary for interpretation of the contents of the complete Alvin data package. The contents are described below.
Contents:
- This summary
- Vehicle data formats definitions file: Alvin_Data_Formats2022.xlsx
- USBL system calibration records
- Automated dive and cruise summary reports
- Imagery viewing and mission planning primer documents
- Archive media disposition log
- Copy of the data archive acknowledgement and assignment form
2. Sealog
Location: Data drive in /AT##-##/sealog/
Description: Sealog is an augmented event logger. During dives, observers use a multipurpose user interface to make event entries using either free text or hot buttons. Sensor data and framegrabs are co-registered at the time of each event. Periodic autosnaps trigger to fill in the narrative between manual user events. The result is a web browsable review of the dive, available onboard during the cruise and ashore following the cruise. The Sealog content available in the cruise data package serves two purposes. Some content, such as the .csv file, provides a table of events along with sensor measurements and the names of the co-registered framegrabs found in the “images” subdirectory.
Shoreside access: After the cruise the full narration will be hosted on a shoreside server, which can be reached at https://sealog.whoi.edu/sealog-alvin. Viewing of the shoreside server content can be password protected for up to two years.
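The Sealog .csv file described above can be read with standard tools. The sketch below is a minimal example in Python and makes no claim about the actual column names: rows are keyed by whatever header row the file contains, and the column holding the framegrab file name is passed in by the caller after inspecting that header. The function names are illustrative only.

```python
import csv
from pathlib import Path


def load_sealog_events(csv_path):
    """Read a Sealog event table into a list of dicts keyed by the file's own header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def events_with_framegrabs(events, image_dir, filename_column):
    """Pair each event with its framegrab path, keeping only events whose image exists.

    filename_column is whichever header holds the framegrab file name. The
    real header name is not assumed here and must be read from the file.
    """
    image_dir = Path(image_dir)
    pairs = []
    for event in events:
        name = (event.get(filename_column) or "").strip()
        if name and (image_dir / name).is_file():
            pairs.append((event, image_dir / name))
    return pairs
```

Printing the keys of the first row returned by load_sealog_events shows the available columns, from which the framegrab column can be chosen.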
3. Internal Sphere and Vehicle Deliverables, by Dive
3.1. Audio:
Location: Data drive in /AT##-##/AL####/Audio/
Description: Each observer is provided with a handheld digital audio recorder to use for personal note logging inside the sphere. Audio is downloaded from each recorder as part of the standard post-dive procedures. Files are archived in the native .mp3 format of the recorders with the naming convention yymmdd_(sequencenumber).mp3.
3.2. Command and Control:
Location: Data drive in /AT##-##/AL####/C+C/
Description: The contents of the command and control directory (C+C) are sourced from the Pilot’s command, control, and navigation computer located inside the sphere. When possible, all raw non-navigational sensor records are logged and organized into a single hourly ASCII text file. This aggregate file is delivered with the data package and is also parsed by sensor type during the data offload process, creating records for individual sensor types. Instruments that require manufacturer-specific software, such as multibeam and other sonars, are not logged by this computer; they are run from the in-sphere Science computer and logged in their native file formats.
Contents:
- VDAT: Raw vehicle sensor data is captured in sensors’ native format and native sampling rate and logged as hourly yyyymmdd_hhmm.VDAT files over a calendar day (UTC). After a dive, the Alvin data processor concatenates the hourly VDAT files into record type-specific files covering the entire range of the dive, and compresses the original .VDAT records into gzip format.
- Reference Documentation:
See /AT##-##/Documentation/Alvin_Data_Formats2022.xlsx
- ICL:
- When utilized, 1Hz data from the inductively-coupled link (ICL) temperature probe is captured as hourly .ICL files over a calendar day (UTC).
- Additional Data are subdivided into the following groupings and subdirectories:
- /sms/: Hourly logs of sms (Sonardyne messaging system) messages exchanged during the dive, both those sent by the vehicle to the surface and those received by the vehicle from the surface.
- /rosbag/ and /log/: New in 2021, these subdirectories contain system message logs produced following the integration of the Robot Operating System (ROS).
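The hourly naming convention makes it straightforward to read the raw records in time order. The following Python sketch orders hourly VDAT files by the UTC time in their names and yields their lines one by one. It assumes that the gzip-compressed originals carry a .gz suffix appended to the original name, which is not stated above, and the function names are illustrative only. The content of each line is not interpreted; record layouts are defined in the data formats file in the Documentation directory.

```python
import gzip
import re
from datetime import datetime, timezone
from pathlib import Path

# yyyymmdd_hhmm.VDAT, optionally followed by ".gz".
# The ".gz" suffix on compressed files is an assumption.
HOURLY_NAME = re.compile(r"^(\d{8}_\d{4})\.VDAT(\.gz)?$")


def hourly_start(path):
    """Return the UTC start time encoded in an hourly file name, or None if the name does not match."""
    m = HOURLY_NAME.match(Path(path).name)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y%m%d_%H%M").replace(tzinfo=timezone.utc)


def iter_vdat_lines(directory):
    """Yield text lines from all hourly VDAT files in a directory, in chronological order."""
    files = [p for p in Path(directory).iterdir() if hourly_start(p) is not None]
    for path in sorted(files, key=hourly_start):
        # Compressed and uncompressed files are read the same way in text mode.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", encoding="ascii", errors="replace") as f:
            for line in f:
                yield line.rstrip("\r\n")
```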
3.3. External Photos:
Location: Data drive in /AT##-##/AL####/ExtPhotos/
Description: A 23MP 5.3k GoPro camera mounted on the vehicle’s brow provides a quiescent fixed focus timelapse photo overview of the dive.
Contents:
- Photos are shot in .jpg format and renamed post-dive to include dive number and date information with convention AL####_yyyymmddThhmmssZ-BrowCam.JPG but have not otherwise been altered.
- Timelapse: Photos from the dive are compiled into two MP4 timelapse videos, one composed of the raw images, and a second with time stamp information overlaid in the upper left corner of the images.
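Because the renamed photo files carry the dive number and a UTC timestamp, they can be matched against other time-stamped records from the dive. The Python sketch below extracts both fields from a file name that follows the convention given above. The function name is illustrative, and the four-digit dive number is an assumption based on the AL#### placeholder.

```python
import re
from datetime import datetime, timezone

# AL####_yyyymmddThhmmssZ-BrowCam.JPG, matched case-insensitively.
BROWCAM_NAME = re.compile(r"^(AL\d{4})_(\d{8}T\d{6})Z-BrowCam\.JPG$", re.IGNORECASE)


def parse_browcam_name(filename):
    """Return (dive_id, UTC datetime) for a renamed brow camera photo, or None if the name does not match."""
    m = BROWCAM_NAME.match(filename)
    if m is None:
        return None
    stamp = datetime.strptime(m.group(2), "%Y%m%dT%H%M%S").replace(tzinfo=timezone.utc)
    return m.group(1).upper(), stamp
```

Sorting a directory listing by the returned datetime reproduces the order in which the photos were taken.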
3.4. Internal Photos:
Location: Data drive in /AT##-##/AL####/IntPhotos/
Description: Photos captured with hand-held digital cameras during the dive by the pilot and observers in the sphere. Original photos are renamed with the convention AL####_yyyy_mm_dd_hh_mm_ss.JPG.
Contents: Still photos are shot as .jpg images. The cameras may also be used on occasion to shoot short .mp4 video clips. The still photos directory for each camera will additionally contain the file index.html, which provides a web browsable thumbnail index of all photos located within that directory. The video files recorded during the dive are in the “videos” subdirectory of each respective camera.
3.5. Navigation:
3.5.1 Sphere Navigation:
Location: Data drive in /AT##-##/AL####/Nav/
Description: The contents of the navigation directory (nav) are sourced from the Pilot’s command, control, and navigation computer, located inside the sphere. Alvin’s primary inertial navigation sensors consist of a Doppler velocity log (DVL) paired with a fiber optic gyroscope (FOG). These and other navigation sensors are logged to an hourly ASCII text file. Georeferenced vehicle navigation information is produced by the USBL system. At the conclusion of every dive, Alvin-specific navigational and sensor data is transferred from the submersible to the support ship’s network for further processing. The aggregate navigation ASCII files are parsed by sensor type during the data offload process and files for individual sensors are created. These sensor-specific files can be viewed and processed using programs like a text editor, Matlab, or Excel. Technicians on board the ship also execute a dedicated navigation post-processing script, discussed further in the Processed data products section.
Contents:
- DAT files:
- Description: Raw vehicle navigation data is captured in sensors’ native format and native sampling rate and logged as hourly yyyymmdd_hhmm.DAT files over a calendar day (UTC). After a dive, the Alvin data processor concatenates the hourly *.DAT files into record type-specific files covering the duration of the dive, and compresses the original .DAT records into gzip format.
- Reference Documentation:
See /AT##-##/Documentation/Alvin_Data_Formats2022.xlsx
- Additional Data is subdivided into the following groupings and subdirectories:
- /Inifiles/: Configuration files used by the vehicle’s navigation software, specifying information such as dive origin, vehicle and map display preferences, and navigation source messages.
- /ScreenGrabs/: Manually initiated screen captures of the navigation interface taken inside the sphere during the course of the dive.
- /Targets/: Labeled latitude and longitude vehicle position targets displayed in the navigation user interface. These targets include both those created during the pre-dive planning process, and those created manually by the pilot during the course of the dive.
- /Underlays/: The Underlays subdirectory may contain georeferenced .grd files providing a bathymetric underlay used in the vehicle’s navigation user interface, or simple image files in .png, .tif, or .bmp format along with a corresponding text file geolocating the boundaries of the image file.
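The parsing of an aggregate navigation file into sensor-specific files, as described for the DAT files above, can be illustrated with a short Python sketch. It is not the Alvin data processor's implementation. It assumes that each record is a single line and that the first whitespace-delimited token identifies the record type; the actual record layouts are defined in the data formats file in the Documentation directory. The function name is illustrative only.

```python
from collections import defaultdict


def split_by_record_type(lines):
    """Group raw log lines by their leading token, treated here as the record type.

    Assumption: one record per line, with the record type as the first
    whitespace-delimited token. Blank lines are skipped.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record_type = line.split(None, 1)[0]
        groups[record_type].append(line)
    return dict(groups)
```

Each list in the returned dictionary can then be written to its own file or loaded into a text editor, Matlab, or Excel for inspection.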
3.5.2 Surface Navigation:
Location: Data drive in /AT##-##/AL####/Toplab_Navest/
Description: The contents of the surface navigation directory (Toplab_Navest) are sourced from the Surface Controller’s navigation computer, located in the Atlantis Top Lab control station. This is a complementary navigation data set to the in-sphere navigation computer. This directory contains shipboard navigation sensor data as it was collected in real time, concatenated and sorted presentations of this data, and subdirectories containing other relevant files used by the surface navigation application.
Contents:
- DAT files: Raw surface navigation data is captured in sensors’ native format and native sampling rate and logged as hourly yyyymmdd_hhmm.DAT files over a calendar day (UTC). After a dive, the Alvin data processor concatenates the hourly .DAT files into record type-specific files covering the entire range of the dive, and compresses the original .DAT records into gzip format.
- Reference Documentation: /AT##-##/Documentation/Alvin_Data_Formats2022.xlsx
- /Inifiles/: Configuration files used by the surface instance of the navigation software, specifying information such as dive origin, vehicle and map display preferences, and navigation source messages.
- /ScreenGrabs/: Manually initiated screen captures of the navigation interface taken by the Surface Controller during the course of the dive.
- /Targets/: Labeled latitude and longitude vehicle position targets displayed in the surface navigation user interface. These targets include both those created during the pre-dive planning process, and those created manually by the Surface Controller during the course of the dive.
- /Underlays/: The Underlays subdirectory may contain georeferenced .grd files providing a bathymetric underlay used in the surface navigation user interface, or simple image files in .png, .tif, or .bmp format along with a corresponding text file geolocating the boundaries of the image file.
- /sms/: Hourly logs of sms (Sonardyne messaging system) messages exchanged during the dive, both those sent by the vehicle to the surface and those sent from the surface to the vehicle.
3.6. Processed data products
Location: Data drive in /AT##-##/AL####/Proc/
Description: After a dive, the onboard data processor executes a dedicated post-processing script known as renavigation that performs an initial ‘grooming’ of the navigation data and generates an output file that merges inertial and USBL navigation data. Real-time position estimates are improved as a result of the post-processing. The primary product of interest produced by this process is a 1Hz ASCII .SCC file containing post-processed navigation data and a selection of other sensor data during the dive. Summary plots offering status validation and visual comparisons of the various navigation solutions are also provided. More detailed mapping requirements associated with specialized equipment (e.g. multibeam, photo mosaics, etc.) may require processing expertise that is not routinely provided by the Alvin team.
Contents:
- Main directory: A collection of Matlab processing and output files resulting from the renavigation process. The resulting .SCC file is also located in this directory. Typical output from this post-processed navigation file includes date, time, latitude, longitude, depth, pressure, heading, and altitude. Additional sensor data from a magnetometer, CTD, or other resident sensor may also be available.
- /ins_nav/: This subdirectory provides comparison plots which illustrate the quality of renavigated vehicle navigation sensors and solutions.
- /data_reports/: A series of CSV files providing quality analysis for data collected by the vehicle’s various depth, temperature, and CTD sensors.
- /plots/: A series of plots providing visual quality analysis data of the vehicle’s various depth, temperature, and CTD sensors.
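A 1Hz ASCII product such as the .SCC file can be loaded with a few lines of Python. The sketch below is an illustration under stated assumptions: it treats each record as whitespace-delimited and uses a hypothetical column order taken from the list of typical output fields above. The true delimiter, column order, and units must be taken from the data formats file in the Documentation directory and passed in through the columns argument. The function names are illustrative only.

```python
import math

# Hypothetical column order. Replace with the layout given in the formats file.
DEFAULT_COLUMNS = ("date", "time", "latitude", "longitude", "depth",
                   "pressure", "heading", "altitude")


def parse_scc_line(line, columns=DEFAULT_COLUMNS):
    """Split one whitespace-delimited record into a dict.

    Date and time are kept as strings. Other fields become floats, or None
    when the token is not a number or is NaN. Short records return None.
    """
    tokens = line.split()
    if len(tokens) < len(columns):
        return None
    record = {}
    for name, token in zip(columns, tokens):
        if name in ("date", "time"):
            record[name] = token
            continue
        try:
            value = float(token)
        except ValueError:
            value = None
        if value is not None and math.isnan(value):
            value = None
        record[name] = value
    return record


def depth_range(records):
    """Return (min, max) depth over the parsed records, or None if no depth values are present."""
    depths = [r["depth"] for r in records if r and r.get("depth") is not None]
    return (min(depths), max(depths)) if depths else None
```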
3.7. Science:
Location: Data drive in /AT##-##/AL####/Science/
Description: The contents of the science directory are sourced from the vehicle’s science computer. This directory contains any facility-standard instrument data logged outside of the vehicle’s command, control, and navigation environment. Instrumentation is divided by subdirectory. If an instrument was not used during a dive, that directory will be empty.
Contents:
- Temperature Logs: Text files produced from temperature probes located on the port window, starboard window, and science basket.
- Device Type/Make/Model: WHOI RTD probe
- File Naming convention: *Window Temp-ddmmmyy .csv
- Format details: 1Hz CSV labeled with column headers
- Data processing status: Raw
- Forward_Sonar:
- Device: Kongsberg Mesotech 1171
- Use: search and obstacle avoidance sonar
- Format: proprietary Kongsberg file
- Naming Convention: *00*.smb
- Data processing status: Raw
- Heatflow_Probe: Temperature logs produced from Alvin’s 0.6-meter long, five-element temperature sensor, used to measure temperature gradients in soft sediments.
- Device Type/Make/Model: WHOI
- File Naming convention: HeatFlow_yyyymmdd_hhmm.dat
- Format details: 1 Hz ASCII
- Data processing status: Raw
- Scanning Sonar:
- Device Type/Make/Model: Tritech SeaKing sonar
- File Naming convention: Ddd_dd__Mmm_hh_mm.V4LOG
- Format details: Proprietary Tritech file format
- Data processing status: Raw
- CTD: 20Hz CTD records produced by the SBE49 CTD. These logs originate in the command and control .VDAT files; the parsed and consolidated data product is provided here in the CTD directory for convenience.
- Device Type/Make/Model: Seabird SBE49
- File Naming convention: CTD_AL####.csv
- Format details: 20Hz ASCII log file
- Data processing status: Raw
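Several of the logs above are recorded at different rates, for example the 20Hz CTD log and the 1Hz temperature logs. When comparing them, one simple approach is to average the faster record into one-second bins. The Python sketch below shows this under the assumption that each sample has already been reduced to a (time in seconds, value) pair; how timestamps are represented in the delivered files is not assumed here. The function name is illustrative only.

```python
import math


def bin_to_one_second(samples):
    """Average (time_seconds, value) samples into one-second bins.

    Returns a list of (bin_start_second, mean_value) tuples sorted by time.
    Bins that received no samples are simply absent from the result.
    """
    sums = {}
    for t, value in samples:
        key = math.floor(t)
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + value, count + 1)
    return [(key, total / count) for key, (total, count) in sorted(sums.items())]
```

Averaging is only one option; taking the median or the sample nearest the whole second may be more appropriate for some sensors.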
3.8. USBL:
Location: Data drive in /AT##-##/AL####/USBL/
Description: USBL (Ultra-Short BaseLine) navigation data is sourced directly from the Sonardyne Ranger2 USBL system located in the Atlantis TopLab control station. All raw system log files are exported in the manufacturer’s fixed .csv file format.
3.9. Videos:
Location: Data drive in /AT##-##/AL####/Videos
Description: For convenience of access during the cruise and on WHOI shoreside servers, a subset of proxy video files including .mov and renavigated subtitle files are available in the Videos section of the main Alvin data drive. During post-dive data processing, the .srt and .ass subtitle files associated with proxy video files are re-created utilizing the renavigated vehicle dive information. See Section 3.6 for a detailed description of the renavigation process.
Still images captured by the imaging system during the dive are also available in this location.
Drive 2-N: Video Imaging System Deliverables
AT##-## {cruise ID}
|-----AL#### {dive ID}
| |-----AL####_ProRes
| |-----AL####_Proxies
| |-----AL####_ExternalStills
| |-----AL####_GoPro
1. ProRes Video Recordings
Location: Video drive in /AT##-##/AL####/AL####_ProRes/
Description: Two streams of video, one controlled by each observer in the sphere, are recorded to solid state drive during the bottom time portion of each dive. These are lightly compressed ProRes LT .mov files, in formats up to 2160p60, playable using QuickTime, Windows Media Player, and VLC. ProRes video contains embedded timecode that is synchronized to the in-sphere NTP time server. File names are created in the format: AL####_yyyymmddThhmmssZ_*_prores0000.mov. Automated clip duration is set at 15 minutes.
Storage volumes for these video files can approach several terabytes per dive.
2. Proxy Video Recordings
Location: Video drive in /AT##-##/AL####/AL####_Proxies/
Description: The same two observer video streams are recorded and stored as H.264 encoded .mp4 proxy video files in 1080p30 format during the dive. The in-sphere imaging server also generates real-time copies of the proxy video data, while capturing real time navigation metadata and embedding the metadata in a series of subtitle files for each video file. Each video clip folder will contain a copy of the raw, unprocessed .mp4 proxy video, a copy of the recontainerized .mov proxy video, an ffmpeg log file, an Advanced SubStation Alpha subtitle file, and two subtitle file subfolders. The /srt_subtitles and /renav_srt_subtitles subfolders contain “soft embedded” SubRip .srt subtitle files that were generated from either real-time or post-processed (renav) metadata streams, respectively. The original video, audio, and timecode data are not re-encoded or altered in any way. During post-dive data processing, the .srt and .ass subtitle files associated with proxy video files are re-created utilizing the renavigated vehicle dive information. See Section 3.6 for a detailed description of the renavigation process.
There are six .srt files to choose from in each subfolder, along with the Advanced SubStation Alpha subtitle files, each of which can produce a line of text overlain on the video with the following information:
- Subtitle file null: Cruise ID, vehicle ID, dive ID, date, UTC time, longitude, latitude, origin longitude, origin latitude, UTM zone, X & Y coordinates, depth, altitude, heading, pitch, roll
- Subtitle file 1: Cruise ID, vehicle ID, dive ID, date, UTC time
- Subtitle file 2: UTC time, origin longitude, origin latitude, UTM zone
- Subtitle file 3: UTC time, X & Y coordinates, heading, depth, altitude
- Subtitle file 4: UTC time, latitude, longitude, pitch, roll
- Adv subs alpha file: Cruise ID, vehicle ID, dive ID, date, UTC time, heading, altitude, depth, pitch, roll, UTM zone, X & Y coordinates, longitude, latitude, origin longitude, origin latitude
Proxy videos and subtitle files are best merged in playback using VLC Media Player.
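Because SubRip .srt files are plain text, the metadata they carry can also be read outside a video player. The Python sketch below parses the standard SubRip cue structure (an index line, a timing line of the form hh:mm:ss,mmm --> hh:mm:ss,mmm, and one or more caption lines) and looks up the caption in effect at a given offset into a clip. It does not interpret the caption text itself, since the exact layout of the fields within each caption is not assumed here. The function names are illustrative only.

```python
import re
from datetime import timedelta

# Standard SubRip timing line, e.g. 00:01:02,500 --> 00:01:03,500
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Parse SubRip text into a list of (start, end, caption) tuples with timedelta offsets."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\r?\n\s*\r?\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = TIMING.search(line)
            if m:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in m.groups())
                start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
                end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
                # Everything after the timing line is the caption text.
                cues.append((start, end, "\n".join(lines[i + 1:])))
                break
    return cues


def caption_at(cues, offset):
    """Return the caption active at a timedelta offset into the clip, or None if no cue covers it."""
    for start, end, caption in cues:
        if start <= offset < end:
            return caption
    return None
```

For example, caption_at(parse_srt(text), timedelta(minutes=5)) returns the metadata line that would be displayed five minutes into the clip.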
Storage volumes for these video files can approach several hundred gigabytes per dive.
3. Video System Still Captures
Location: Video drive in /AT##-##/AL####/AL####_ExternalStills/
Description: Observers are able to manually capture still images from their active user interface video feed in the sphere. The images are generated by capturing a single frame from the viewed camera’s video stream. The still image capture resolution will be either HD (1920x1080) or 4K (3840x2160), depending on the video resolution of the selected camera. The captured image will also contain embedded metadata in the EXIF “comment” field, including various dive, camera, navigation, and temperature sensor data associated with the moment of capture, along with appropriate copyright information.
4. GoPro Video Recordings
Location: Video drive in /AT##-##/AL####/AL####_GoPro
Description: A 24MP 5.3k GoPro camera mounted on the vehicle basket or arm provides a quiescent fixed focus video imaging overview of the dive. Video files are provided in the native GoPro .mp4 format. Files have been renamed to include dive number and date information in the format AL####-yyyymmddThhmmssZ-camera.MP4, but have not otherwise been altered.
Storage volumes for this collection can approach several hundred gigabytes per dive.
NDSF contacts
- Alvin Manager
- Bruce Strickrott, strickrott@whoi.edu, 508-289-3860
- Alvin Data Engineer
- Joseph Garcia, jogarcia@whoi.edu, 508-289-2670
- NDSF Associate Director for Data and Science Ops
- Christina Haskins, haskins@whoi.edu, 508-289-3920
- NDSF Director
- Andrew Bowen, abowen@whoi.edu, 508-289-2643
- NDSF Chief Scientist
- Anna Michel, amichel@whoi.edu