Tbwa Chiat/Day Inc

Principal Data Scientist
New York, New York, United States

Tbwa Chiat/Day Inc, New York, New York, US, 10261


Octus is a leading global provider of credit intelligence, data, and analytics. Since 2013, tens of thousands of professionals across hedge fund, investment banking, management consulting, and law firm verticals have come to rely on Octus to make better, faster, and more confident decisions in pace with the fast-moving credit markets. For more information, visit: https://octus.com/

Working at Octus

Octus hires growth-minded innovators and trailblazers across the globe to drive our business and culture. Our core values – Action Oriented, Customer First Mindset, Effective Team Players, and Driven to Excel – define an organizational ethos that’s as high-performing as it is human. Among other perks, Octus employees enjoy competitive health benefits, matched 401k and pension plans, PTO, generous parental leave, gym subsidies, educational reimbursements for career development, recognition programs, pet-friendly offices (US only), and much more.

Role

Job Description:

Build pipelines for data acquisition by writing code for querying huge amounts of unstructured textual data from a variety of data sources like Financial SEC filings of publicly traded companies, private & public company press releases, co. transcripts, bond offering memorandum docs, etc. using Elasticsearch & RMySQL frameworks in R & database software like HeidiSQL to query databases like MySQL & MongoDB.
Utilize frameworks like XML, rjson, pdftools in R to parse & process data from different sources & formats incl. .pdf, XML, json, csv etc. & store it in a structured & organized format for data processing, analysis, & modeling.
Devise & implement processes to perform data preprocessing & assessment of data quality text processing & statistical techniques incl. imputation to handle missing data, data type conversions to maintain consistency in data integration, dimensionality reduction, normalization, feature aggregation, encoding, etc.
Leverage frameworks in R incl. OpenNLP, Quanteda, tm, text2vec to provide comprehensive functionality for text analysis & natural language processing.
Utilize frameworks daily for a variety of tasks incl. corpus creation & management, tokenization, formulation of doc. feature matrices, parts-of-speech tagging, entity extraction, etc. to generate analysis for data exploration, engineer features, formulate details of the model, & overall build robust frameworks for projects.
Execute defined frameworks for projects that require data-driven solutions by building, executing, & testing various data science models or enhancing existing models using text mining & machine learning algorithms.
Track & monitor model’s performance by testing & debugging when required.
Incorporate feedback, business requests from stakeholders to continually improve & enhance workflow & performance.
Conceptualize & build supervised &/or unsupervised models from structured &/or unstructured text data.
Generate static & interactive data visualizations using frameworks & tools incl. ggplot, Shiny, d3.js to share & present complex ideas, results, project takeaways with technical & non-technical stakeholders.
Review, evaluate, & communicate recommendations on modeling techniques & results to team, leadership, & stakeholders.
Develop case studies using model output & suggest ways insights might be used.
Deploy models in real-time by writing production-level code for scalable models & integrating it within the company’s data infrastructure.
Collaborate & participate with different business units across the company to identify areas where data science can be used to automate manual processes.
Mentor & lead new & junior members of the team.
Formulate & implement ideas at the intersection of distressed debt investing & data science, develop credit-risk models & transform them into data products.

Education and Experience:

Requires a Master’s degree in Data Science and 4 years of experience in job offered or 4 years of experience in the Related Occupation. Experience can be pre or post degree.
Related Occupation: 2 years of experience as a Data Scientist or any other job title performing the following job duties:
Build pipelines for data acquisition by writing code for querying huge amounts of unstructured textual data from a variety of data sources like Financial SEC filings of publicly traded companies, private & public company press releases, co. transcripts, bond offering memorandum docs, etc. using Elasticsearch & RMySQL frameworks in R & database software like HeidiSQL to query databases like MySQL & MongoDB.
Utilize frameworks like XML, rjson, pdftools in R to parse & process data from different sources & formats incl. .pdf, XML, json, csv etc. & store it in a structured & organized format for data processing, analysis, & modeling.
Devise & implement processes to perform data preprocessing & assessment of data quality text processing & statistical techniques incl. imputation to handle missing data, data type conversions to maintain consistency in data integration, dimensionality reduction, normalization, feature aggregation, encoding, etc.
Leverage frameworks in R incl. OpenNLP, Quanteda, tm, text2vec to provide comprehensive functionality for text analysis & natural language processing.
Utilize frameworks daily for a variety of tasks incl. corpus creation & management, tokenization, formulation of doc. feature matrices, parts-of-speech tagging, entity extraction, etc. to generate analysis for data exploration, engineer features, formulate details of the model, & overall build robust frameworks for projects.
Execute defined frameworks for projects that require data-driven solutions by building, executing, & testing various data science models or enhancing existing models using text mining & machine learning algorithms.
Track & monitor model’s performance by testing & debugging when required.
Incorporate feedback, business requests from stakeholders to continually improve & enhance workflow & performance.
and 2 years of experience as an Analyst or any other job title performing the following job duties:
Evaluating alternative datasets like – Consumer Transactional, Email Receipt, URL Clickstream, OTA Pricing/Booking, Import/Export Shipments, Geolocation data & performing analysis to study & track market shifts, industry trends, & user dynamics.
Generating actionable insights used in developing the investment thesis for the Long Short Equity Strategy.
Performing predictive analytics on big data which includes - data munging, data validation, normalization, regression analysis, back testing, data visualization using tools like Python, SQL, Excel, & Tableau.
Identifying abnormalities & opportunities & recommending trades to the trading desk.
Building predictive models to forecast company KPIs like “Revenue”, “Orders/Transaction Volume”, “Attendance”, etc. for publicly traded companies in the US Consumer sector.
Developing techniques to analyze user retention & churn & building models that predict subscribers for companies in the OTT streaming & Cable sector.
Developing novel data-driven processes using natural language processing techniques onto alternative data to drive qualitative analyses & building KPI prediction models - like development.
Leveraging natural language processing & machine learning algorithms.

The salary range estimate for this position is $170,373 to $210,000.

Octus is committed to providing equal employment opportunities to all employees and applicants for employment without regard to race, colour, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, marital status, pregnancy, veteran status, or any other legally protected status. We strive to create an inclusive and diverse work environment where all individuals are valued, respected, and treated fairly. We believe that diversity enriches our workplace and enhances our ability to innovate and succeed.

