Data Resources in the Health Sciences
Storage Services :: Local
UW Data Hosting and Storage Services
- ResearchWorks Archive (UW Libraries) - deposit datasets up to 10 GB; contact them for larger datasets.
- UW IT Service Catalog - view a comprehensive list of self-sustaining services including data hosting and archiving.
Storage Services :: General
NCBI provides several repository services for genomic and genetic data.
- Sequence Data
- Microarray Data
- Bioassay Data, Substance or Sequence-Based Reagents
- Human Clinical Data and Genetic Tests
- Table of NIH-supported data repositories that accept submissions of appropriate data from NIH-funded investigators (and others).
- Includes instructions for submitting and accessing data.
Dryad - international, nonprofit organization that accepts data for storage and reuse. Free for UW researchers.
ICPSR (Inter-university Consortium for Political and Social Research)
UK Data Service - acquires data from academic, public sector and commercial sources within the United Kingdom and abroad.
figshare - allows researchers to publish all of their research outputs in an easily citable, sharable and discoverable manner. All file formats can be published, including videos and datasets that are often demoted to the supplemental materials section in current publishing models.
Open Science Framework - open source scholarly commons for the entire research cycle including data deposit; from the Center for Open Science
Zenodo - CERN initiative; free deposit for the long tail of science
Common Data Formats
The file format to use depends on several factors, including:
- the data itself
- how it was collected
- repository capabilities and preferences
- preservation considerations
- compatibility with re-use
The UK Data Archive currently recommends the following file formats for long-term preservation of research data. Other data centres or digital archives may recommend different formats.
Type of Data | Recommended File Formats for Sharing, Re-Use and Preservation |
---|---|
Quantitative tabular data with extensive metadata: a dataset with variable labels, code labels, and defined missing values, in addition to the matrix of data | SPSS portable format (.por); delimited text and command ('setup') file (SPSS, Stata, SAS, etc.) containing metadata information; some structured text or mark-up file containing metadata information, e.g. DDI XML file |
Quantitative tabular data with minimal metadata: a matrix of data with or without column headings or variable names, but no other metadata or labelling | comma-separated values file (.csv); tab-delimited file (.tab) |
Geospatial data: vector and raster data | ESRI Shapefile (essential: .shp, .shx, .dbf; optional: .prj, .sbx, .sbn); geo-referenced TIFF (.tif, .tfw); CAD data (.dwg); tabular GIS attribute data |
Qualitative data: textual | eXtensible Mark-up Language (XML) text according to an appropriate Document Type Definition (DTD) or schema (.xml); Rich Text Format (.rtf); plain text data, ASCII (.txt) |
Digital image data | TIFF version 6 uncompressed (.tif) |
Digital audio data | Free Lossless Audio Codec (FLAC) (.flac) |
Digital video data | MPEG-4 (.mp4); motion JPEG 2000 (.jp2) |
Documentation | Rich Text Format (.rtf); PDF/A or PDF (.pdf); OpenDocument Text (.odt) |
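
For tabular data with minimal metadata, the table recommends comma-separated (.csv) or tab-delimited (.tab) files. The following is a minimal sketch of producing both from the same rows using Python's standard `csv` module. The function name `write_delimited`, the column names, and the sample values are all hypothetical and only serve as illustration; real datasets may need additional handling for encodings and missing values.

```python
import csv
import io


def write_delimited(rows, header, delimiter=","):
    """Serialize a header and data rows to delimited text and return it as a string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)   # column headings first
    writer.writerows(rows)    # then the matrix of data
    return buffer.getvalue()


# Hypothetical example data (not from any real study)
header = ["participant_id", "age", "site"]
rows = [[1, 34, "Seattle"], [2, 51, "Tacoma, WA"]]

csv_text = write_delimited(rows, header)                  # content for a .csv file
tab_text = write_delimited(rows, header, delimiter="\t")  # content for a .tab file
```

Fields that contain the delimiter are quoted automatically by the `csv` writer, so a value such as "Tacoma, WA" survives a round trip through the .csv form. The returned text can then be written to disk, for example with `open("data.csv", "w", encoding="utf-8", newline="")`.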