Data Services

The Data Center has established data submission and data sharing policies to maximize both the value and security of study data.

The following sections explain the Data Center’s services in more detail.

1. Data Repository

The Data Center repository holds all data generated by the HHEAR Lab Hubs along with pre-existing epidemiologic data provided by HHEAR study Principal Investigators (PIs). After an embargo period, the HHEAR Data Center will provide publicly accessible de-identified datasets.

These data include, but are not limited to:

  • The environmental exposure biomarkers and environmental exposure data generated by the HHEAR Lab Hubs
  • Previously collected epidemiologic data such as key covariates (e.g., sex, race/ethnicity, age, education)
  • Previously measured biomarkers (including clinical biomarkers)
  • Genetic data
  • Health outcome information

In addition to managing the repository, the Data Center provides the following services.

2. Data Analysis Plan Development

The Data Center offers consultations with investigators to develop a data analysis plan that includes:

  • Assessment of appropriate study design and sample size
  • A statement of analysis objectives
  • The type of dependent variable (e.g., continuous scale, binary, time-to-event with censoring)
  • Identification of appropriate covariates/confounders
  • Examination of the Data Dictionary and Codebook to:
    • Understand the variables and their characteristics
    • Link the PI’s epidemiologic data to biomarker results produced by the HHEAR Lab Hubs
  • Consideration of complex correlations among environmental chemicals, and of whether to focus on select chemicals or evaluate mixture effects
  • A description of how planned analyses will deal with missing data
  • Evaluation of a modeling strategy for:
    • Fitting the available data (e.g., nonlinear or linear regression, logistic regression)
    • Evaluating how well the estimated model fits the data (goodness of fit)
    • Selecting statistics to test for association (e.g., likelihood ratio test, Wald test)
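
As an illustration of the association tests named above, the following is a minimal sketch, not the Data Center's own procedure. It assumes the simplest possible case: a binary outcome and a single binary exposure, where the logistic regression estimates have a closed form. The function names and the example counts are hypothetical.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def binomial_loglik(events, total):
    """Maximized Bernoulli log-likelihood for `events` out of `total`."""
    p = events / total
    ll = 0.0
    if events > 0:
        ll += events * math.log(p)
    if total - events > 0:
        ll += (total - events) * math.log(1.0 - p)
    return ll

def association_tests(a, b, c, d):
    """Wald and likelihood ratio tests for a 2x2 table with nonzero cells.

    a = exposed cases,   b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    Equivalent to logistic regression of outcome on one binary exposure.
    """
    # Wald test: squared ratio of the log odds ratio to its standard error.
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    wald = (log_or / se) ** 2

    # Likelihood ratio test: twice the gain in log-likelihood from letting
    # the outcome probability differ between exposure groups.
    ll_full = binomial_loglik(a, a + b) + binomial_loglik(c, c + d)
    ll_null = binomial_loglik(a + c, a + b + c + d)
    lrt = max(0.0, 2.0 * (ll_full - ll_null))

    return {
        "odds_ratio": math.exp(log_or),
        "wald": wald,
        "wald_p": chi2_sf_1df(wald),
        "lrt": lrt,
        "lrt_p": chi2_sf_1df(lrt),
    }

# Hypothetical counts, for illustration only.
result = association_tests(30, 70, 15, 85)
```

Both statistics are compared against a chi-square distribution with one degree of freedom. They usually agree closely in large samples and can diverge in small ones, which is one reason an analysis plan states the choice in advance.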

3. Statistical Analysis

The Data Center offers statistical analyses according to the data analysis plan developed in collaboration with the PI. Services include:

  • Performing or assisting in the proposed statistical analyses based on the modeling strategy chosen in the data analysis plan
  • Designing and populating tables and figures

4. HHEAR Ontology Services

The Data Center is responsible for expanding and maintaining the HHEAR Ontology, a common vocabulary for use in the HHEAR program. The Ontology evolves with the program and will connect to established existing vocabularies, which facilitates the integration of data from multiple studies. The Data Center assists PIs in applying the Ontology to their studies. Services include:

  • Facilitating the mapping of variables from data dictionaries into terms consistent with the HHEAR Ontology
  • Incorporating the study's data into the HHEAR Ontology to support collaborative research across the HHEAR consortium, including pooled analyses from cohort studies participating in HHEAR
  • Developing methods and services for comparing similar variables from different data dictionaries, starting with very basic mappings of equivalent terms and moving into more sophisticated analyses of relationships among variables
  • Providing tools and services to manage the HHEAR Ontology evolution
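
The most basic kind of mapping described above, matching equivalent terms across data dictionaries, can be sketched as follows. This is an illustrative example only: the variable names and the `ex:` term identifiers are hypothetical and are not actual HHEAR Ontology terms.

```python
# Hypothetical mappings from each study's data dictionary variables
# to shared vocabulary terms (the "ex:" identifiers are made up).
STUDY_A = {"child_sex": "ex:Sex", "age_yrs": "ex:Age", "mat_edu": "ex:EducationLevel"}
STUDY_B = {"SEX": "ex:Sex", "AGE": "ex:Age", "bmi": "ex:BodyMassIndex"}

def equivalent_variables(dict_a, dict_b):
    """Pair variables from two dictionaries that map to the same term."""
    # Index the second dictionary by term so lookups are direct.
    by_term = {}
    for var_b, term in dict_b.items():
        by_term.setdefault(term, []).append(var_b)

    pairs = []
    for var_a, term in dict_a.items():
        for var_b in by_term.get(term, []):
            pairs.append((term, var_a, var_b))
    return sorted(pairs)

def unmatched_variables(dict_a, dict_b):
    """Variables in dict_a whose term has no counterpart in dict_b."""
    terms_b = set(dict_b.values())
    return sorted(var for var, term in dict_a.items() if term not in terms_b)

pairs = equivalent_variables(STUDY_A, STUDY_B)
only_in_a = unmatched_variables(STUDY_A, STUDY_B)
```

Once variables from different studies share a term, they can be treated as candidates for pooling. Checking whether their units and coding actually agree is a separate step that this sketch does not cover.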

5. HHEAR Knowledge Graph Services

A knowledge graph is a network that connects a wide range of information types relevant to a specific domain, such as children’s environmental health. The HHEAR Knowledge Graph will:

  • Make it possible to query all types of information defined in the HHEAR Ontology
  • Enable users to browse and query the HHEAR Data Center Repository utilizing the vocabulary from the HHEAR Ontology
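
To make the idea of querying by ontology vocabulary concrete, here is a minimal sketch of a knowledge graph stored as subject-predicate-object triples. It does not represent the HHEAR Knowledge Graph's actual technology or schema. The study names, predicates, and `ex:` terms are all hypothetical.

```python
# A tiny graph of (subject, predicate, object) triples; all names are made up.
TRIPLES = [
    ("study_1", "measures", "ex:Lead"),
    ("study_1", "records", "ex:Age"),
    ("study_2", "measures", "ex:Lead"),
    ("study_2", "measures", "ex:Cadmium"),
    ("study_3", "measures", "ex:Cotinine"),
    ("ex:Lead", "is_a", "ex:Metal"),
    ("ex:Cadmium", "is_a", "ex:Metal"),
]

def query(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the given pattern; None matches anything."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

def studies_measuring_class(triples, cls):
    """Find studies that measure any term classified under `cls`."""
    members = {s for (s, _, _) in query(triples, predicate="is_a", obj=cls)}
    return sorted(
        {s for (s, _, o) in query(triples, predicate="measures") if o in members}
    )

metal_studies = studies_measuring_class(TRIPLES, "ex:Metal")
```

The second function shows why a shared vocabulary matters: a user can ask about a broad class and retrieve studies that recorded only specific members of it.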