Excella

Data Scientist

Excella, Arlington, Virginia, United States, 22201

Overview

The Data Scientist is responsible for using advanced statistical, algorithmic, machine learning, data mining, and visualization techniques to help advance and complete client projects. The Data Scientist must also be able to communicate complex quantitative analyses in a clear, precise, and actionable manner to management and executive level audiences.

Responsibilities

Working directly with client stakeholders to understand and define analysis objectives and then translate these into actionable results.

Obtaining data from multiple, disparate data sources including structured, semi-structured, and unstructured data.

Using machine learning and data mining techniques to understand the patterns in large volumes of data, identify relationships, detect data anomalies, and classify data sets.

Working with data integration developers to assess data quality and define data processing business rules for cleansing, aggregation, enhancement, etc. to support analysis and predictive modeling activities.

Designing and building algorithms and predictive models using techniques such as linear and logistic regression, support vector machines, ensemble models (random forest and/or gradient boosted trees), neural networks, and clustering techniques.

Deploying predictive models and integrating them into business processes and applications.

Validating and optimizing model performance upon deployment and tracking it over time as necessary.

Presenting complex analysis results tailored to different audiences (e.g. technical, manager, executive) in a highly consumable and actionable form including the use of data visualizations.

Qualifications

Technical:

3+ years in a hands-on role performing advanced predictive analytics using tools like Python, R, or Scala.

3+ years writing simple to complex SQL queries to obtain data from multiple source systems.

3+ years using data mining methods, such as clustering analysis and anomaly detection, to understand data patterns and select appropriate predictive techniques.

Experience with applied machine learning (tree-based methods, ensemble methods, neural networks/deep learning).

Proficient understanding of relational databases (e.g. Oracle, SQL Server, PostgreSQL) and Big Data distributed structures (Hadoop/Spark) in order to source data effectively.

Experience using natural language processing techniques preferred.

Experience using advanced analytics techniques for fraud detection and prevention preferred.

Experience building machine learning models for production environments preferred.

Non-Technical:

Excellent communication skills, with the ability to interact directly with non-technical client stakeholders and act in a business-to-technical translation role.

Experience working in an onsite client technical consulting environment preferred.

Experience working within the Agile Scrum Framework.

Self-motivated and self-managing.

Proficient in creating reasonable and accurate time estimates for assigned tasks.

Understanding of DevOps Research and Assessment (DORA) and the capabilities within the DORA capability catalog is encouraged.
