Arrayo

Senior Computational Biology Director

Arrayo, Boston, MA, United States


We are excited to be expanding our scientific Data Science group in our Boston and Cambridge offices. We are looking for a Senior Computational Biology Developer to join or lead our Scientific Data Analytics Group.

Our Scientific Data Analytics Group is in charge of:

  • Understanding chemical proteomics workflows and proteomics data structures in order to develop automated data processing, analysis, and visualization tools
  • Identifying and integrating relevant internal and external data and knowledge resources to enable data mining in a biological context
  • Working with in-house, open-source, and/or commercially available software platforms to enable target annotation and enrichment infrastructure and pipelines
  • Combining target annotation and analysis with chemical proteomics data to enable a deep understanding of compound-target affinity relationships and their functions at the systems level across cell lines and tissues, in the context of diseased and healthy states
  • Enabling intra- and inter-experiment knowledge integration: defining pathways, networks, co-regulators, and upstream regulators, performing substrate analysis, and others
  • Building molecular phenotyping and compound connectivity maps at the protein/phosphoprotein level, and integrating compound-protein binding profiles with cellular functional profiles
  • Staying abreast of the most current systems analysis methodologies and algorithms, and introducing them in-house when appropriate

Qualifications
Education

  • Ph.D. or Master's degree with 5+ years of experience focused on computational biology, bioinformatics, computer science, or a related field, or equivalent experience, preferred.

Preferred Experience:

  • Knowledge of human genome annotation and biological pathway resources and tools, e.g. StringDB, IPA, MetaCore, GO, GSEA, DAVID, and KEGG, and familiarity with systems biology concepts and software development best practices.
  • Understanding of SQL/NoSQL database schemas and development; familiarity with MongoDB, XML, RDF, neo4j, or equivalent is desirable.
  • Proficiency in programming languages (Python and Java/JavaScript) in a Unix/Linux environment.
  • Experience with high-performance Linux clusters and cloud computing.
  • Experience in statistics, including familiarity with mathematics and statistics packages such as R/Bioconductor.
  • Highly self-motivated, with excellent attention to detail and strong organizational and communication (oral and written) skills.
  • Creative and independent scientific thinking to solve complex technical and scientific problems. Capable of integrating information generated from multiple sources to shape and strengthen research hypotheses.
  • Ability to work independently while remaining team-oriented. Capable of building strong relationships with peers, partners outside of the function, and customers.
  • Knowledge of a wide range of biological research processes and of the data management tools/applications used in biological research, especially proteomics research.
  • In-depth knowledge of the frontier of systems chemical biology; expertise in network analysis, data mining, visualization tools, and new machine learning techniques is preferred.
  • Previous experience in the analysis, visualization, and interpretation of large “omics” data sets, as well as their integration with chemical and biological data.
  • Previous experience with Pipeline Pilot and Spotfire is a plus.
  • Experience with deep learning (TensorFlow, PyTorch) and/or Bayesian (PyMC3) frameworks.
  • Familiarity with RDF, Linked Data, and the Semantic Web.

Arrayo is an Equal Employment Opportunity employer and as such does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.