Job description

School of Medicine:

Established in 1930, Duke University School of Medicine is the youngest of the nation’s top medical schools. Ranked tenth among its peers, the School takes pride in being an inclusive community of outstanding learners, investigators, clinicians, and staff where traditional barriers are low, interdisciplinary collaboration is embraced, and great ideas accelerate translation of fundamental scientific discoveries to improve human health locally and around the globe.

Comprising 2,400 faculty physicians and researchers, the Duke University School of Medicine, along with the Duke University School of Nursing and Duke University Health System, forms Duke Health. Duke Health is a world-class health care network. Founded in 1998 to provide efficient, responsive care, the health system offers a full network of health services and encompasses Duke University Hospital, Duke Regional Hospital, Duke Raleigh Hospital, Duke Primary Care, Private Diagnostic Clinic, Duke Home and Hospice, Duke Health and Wellness, and multiple affiliations.


The DCI Bioinformatics team members are extensively engaged with the research and educational initiatives and programs of the Duke Department of Biostatistics and Bioinformatics, including the Duke Center of Statistical Genetics and Genomics. The team includes faculty, staff, and graduate student interns. The team fosters a highly collaborative environment. Team members are expected and encouraged to actively seek guidance from DCI Bioinformatics faculty and senior staff.

Occupational Summary

This position is a key member of the Duke Cancer Institute (DCI) Bioinformatics team. Provide support for the statistical and computational considerations of research projects led by basic, translational, and clinical scientists from the DCI. These projects aim to address key questions in cancer biology, pharmacology, pharmacogenomics, and immunology utilizing data generated from high-throughput genomic sequencing assays. Provide support for data quality control and assessment, analysis, and reporting for multiple research projects within the framework of strict adherence to the principles of reproducible analysis and literate programming. Participate as a key member of each project to which he or she is assigned, including developing a solid understanding of the relevant scientific hypotheses and considerations. While being heavily engaged in data analysis, the expectation is to actively participate in and contribute to the development of statistical methods and computational tools needed to address the scientific considerations of each project.

Work Performed

Data Analysis and Programming (50%)

Provide support for analysis of genomic data from array and high-throughput sequencing assays. This will include quality assessment analysis, downstream statistical analysis, and genomic annotation. The downstream analyses will include association analyses as well as supervised (e.g., machine learning) and unsupervised (e.g., class discovery) learning.

Contribute to statistical study design, including power and sample size calculations using existing software or by conducting simulation studies.

Understand the scientific objectives and statistical methodology of each project to which he or she is assigned.

Critically review study documents (e.g., protocols), and the relevant biology and medical literature.

Independently validate analysis data sets and analysis results programmatically.

Documentation (software, analysis, website) (20%)

Use systems for dynamic report generation (e.g., knitr or Jupyter) to generate reproducible and literate reports.

Contribute to the development, testing, documentation, and deployment of data analysis pipelines, designed for use on local, cloud server, and cluster resources, for pre-processing of genomic data.

Contributions to methods research and tools development (5%)

Possess a solid understanding, including strengths and limitations, of any method or tool used for the analysis of the data. Conduct critical reviews of any relevant technical documentation.

Use modern programming tools and frameworks for data science, including the R tidyverse and python pandas ecosystems, to conduct elegant, efficient and reproducible data programming tasks.

Contribute to the writing of methods and software papers. Conduct simulation studies, conduct data analysis, and contribute to programming and documentation of code.

Software & paper review (5%)

Continually extend his or her knowledge and expertise in statistical methods (e.g., competing risks and cause-specific hazard analysis) and computational algorithms, and successfully apply them to the projects.

Conduct critical reviews of existing analysis methods and tools and present the findings to the team so as to help with the assessment of the feasibility and appropriateness for adopting new methods and tools.

Administrative tasks and meetings (20%)

Prepare material, including figures, listings, and tables, for team, department, and national presentations, scientific meetings, abstracts, and papers. Critically review reports to ensure that the methodology and results are accurately reported.

Prepare preliminary statistical reports and contribute to final study reports as needed. Work closely with investigators to ensure the project results and conclusions are presented accurately.

Manage multiple competing deadlines and coordinate the needs for each project.

Actively provide project updates using the team’s project management system, and be prepared to report on the progress of the projects in team meetings.

All other duties as assigned.

The above statements describe the general nature and level of work being performed. This is not intended to be an exhaustive list of all responsibilities and duties required. Employees may be directed to perform job-related tasks other than those specifically presented in this description.

The intent of this job description is to be representative of the level and the types of duties and responsibilities that will be required of this position and shall not be construed as a declaration of the total specific duties and responsibilities.


Demonstrable training and skills in data analysis, scientific and data programming, mathematics (linear algebra, real analysis, numerical optimization, discrete mathematics), computer science, and statistics; or a bachelor’s degree in quantitative sciences (e.g., statistics, mathematics, physics, or theoretical computer science) with formal training or experience in molecular biology and genetics.

A master’s degree or formal graduate training in statistics, biostatistics, mathematics, physics, or theoretical computer science is strongly preferred

Minimum of three years working as scientific programmer or as quantitative researcher (e.g., biostatistician) in an academic or research setting

Experience analyzing genomic data (e.g., from high throughput sequencing or high-dimensional array platforms) is strongly preferred

Solid understanding of the key elements of molecular biology and population genetics

Working experience with the GNU/Linux operating system and using UNIX tools (e.g., sed, awk)

Experience using programming languages (e.g., C/C++, python, scala) for scientific computing

Experience using and embedding scientific libraries (e.g., Eigen, Cuba, GSL, NLopt, numpy)

Experience using R and its extension packages for programming and data analysis

Experience using software frameworks for machine learning (e.g., scikit-learn, tensorflow)

Experience with programming frameworks for genomic data (e.g., biopython, htsjdk)

Experience with programming tools for data science (e.g., R tidyverse, python pandas)

Experience with software for dynamic report generation (e.g., knitr, sphinx, Jupyter notebooks)

Experience using distributed source code management systems (e.g., mercurial or git)

Experience drafting formal statistical reports

To be reviewed, applications must include a proper cover letter containing a statement of purpose, along with a summary of the candidate's technical and educational credentials relevant to this position.

Qualifications Required At This Level


Work requires a B.S. in the Biological Sciences with demonstrable computational skills, or a B.S. in Computer Science with a strong interest in Biology/Genomics. An M.S. is preferred.


Or an equivalent combination of relevant education and/or experience.

Duke is an Affirmative Action/Equal Opportunity Employer committed to providing employment opportunity without regard to an individual's age, color, disability, gender, gender expression, gender identity, genetic information, national origin, race, religion, sex, sexual orientation, or veteran status.

Duke aspires to create a community built on collaboration, innovation, creativity, and belonging. Our collective success depends on the robust exchange of ideas—an exchange that is best when the rich diversity of our perspectives, backgrounds, and experiences flourishes. To achieve this exchange, it is essential that all members of the community feel secure and welcome, that the contributions of all individuals are respected, and that all voices are heard. All members of our community have a responsibility to uphold these values.

Essential Physical Job Functions: Certain jobs at Duke University and Duke University Health System may include essential job functions that require specific physical and/or mental abilities. Additional information and provision for requests for reasonable accommodation will be provided by each hiring department.




Posted: 2/19/2021
Application Due: 5/20/2021
Work Type: Full Time