Environmental Data Science

Environmental Data Science comprises Bioinformatics Services, Data Visualization, Methods Development, Data Management, and the Comparative Toxicogenomics Database.


Center members can access four bioinformaticians/biostatisticians (Drs. Reif, Wright, and Zhou, and Mr. Jima) with expertise in EHS data generated from in silico, in vitro, in vivo, and population-based studies. All are co-located as resident BRC members, along with the BRC Computing Cluster system administrators.

Mr. Jima also has an office in the CHHE office suite, where he spends 75% of his time, making him readily accessible to CHHE members for consultation.

Mr. Jima offers support for genetic, genomic, proteomic, epigenomic, and pathway data analysis. To facilitate reproducibility and increase the efficiency of future analyses, we maintain a shared, active library of analysis scripts written in R and Python, as well as documented routines for analyses implemented using CHHE-licensed Ingenuity Pathway Analysis and CLC Genomics Workbench.

Mr. Jima also offers statistical support, as well as support for pathway analysis and the analysis of metabolomics data.

Statistical Consulting Support

Dr. Griffith offers support for data analysis that largely uses existing methods. She has broad experience with applications in toxicology, chemistry, veterinary medicine, and horticulture.

She can provide guidance on experimental design and power calculations; general statistical analysis, including ANOVA, linear models, and generalized linear models; survival analysis; and some spatial and time series analysis techniques.

Because of Dr. Griffith’s work with the Department of Statistics and the Data Science Academy, she is also able to connect CHHE members with needed expertise across campus.

Data Visualization

As a transdisciplinary center, the CHHE makes substantial use of data graphics and visualization to communicate results. Such visualization is an effective translational device, and the IHSFC will direct CHHE members to the personnel and software to generate appropriate data graphics.

Mara Blake, Department Head for Data & Visualization Services in the NCSU Libraries, will offer support for design and implementation of visualization solutions, taking advantage of state-of-the-art, immersive visualization environments available at the libraries.

Methods Development

Many CHHE projects involve the generation or recombination of novel data types, for which new analysis methods must be developed. For these situations, we draw upon expertise in developing statistical and machine learning methods for integrated analysis of EHS data.

Dr. Wright offers support for omics-scale data analysis and the design of studies that leverage omics data resources (especially gene expression).

Dr. Zhou offers support for experiments requiring the development of new analytical methods, such as those aiming to integrate high-dimensional genomic, metabolomic, and transcriptomic data, including data from single-cell analysis.

Data Management Support

Nathan offers comprehensive data management support, with expertise in NIH policy compliance, data management, and analysis.

He has wide-ranging experience with database development, data visualization, and reporting, and with helping researchers stay in compliance with funder public access and data requirements.

He will also serve as a liaison between CHHE and data management and storage resources across the university. His office is in the CHHE office suite, and he is happy to help with any of these topics.

Comparative Toxicogenomics Database

The Comparative Toxicogenomics Database (CTD) is a resource that facilitates experimentally testable discoveries about the mechanisms underlying environmentally influenced diseases.

CHHE partnered with CTD to enhance the ability of members to translate mechanistic discoveries made using model systems, while also providing insights into mechanisms that may underlie human health endpoints in population-based studies. CTD is a freely available database developed to promote understanding of the effects of the environment on human health and the etiologies of environmentally influenced diseases.

Dr. Carolyn Mattingly will provide consultation to CHHE members. CTD contains manually curated data describing:

  • chemical-gene/protein interactions in vertebrates and invertebrates,
  • chemical-disease relationships,
  • gene/protein-disease relationships, and
  • comprehensive exposure information.

In addition to directly curated disease relationships, data integration in CTD allows users to computationally infer relationships among chemicals, genes, and diseases that can then be tested experimentally. Data curation is typically prioritized by chemical according to the foci of the NIEHS, EPA, FDA, and other groups. Consequently, CTD contains abundant data for diverse chemical compounds and drugs. CTD has also been expanded to include curated exposure data. This expansion will centralize exposure data and place it in a broader biological framework, while grounding CTD’s experimental data in a “real-world” human exposure context.

Partnering with CTD will provide specific advantages to CHHE members through several mechanisms. First, CTD will facilitate translation of basic science findings by virtue of the unique integration of cross-species chemical-gene interaction data with human disease information. Second, CTD will provide mechanism-based insights for population-based or clinical studies.