Top Secret Clearance Jobs
SME Data Engineer
Top Secret Clearance Jobs, Ashburn, Virginia, 22011
About the job
Top Secret Clearance Jobs is dedicated to helping those with the most exclusive security clearances find their next career opportunity and get interviews within 48 hours.
Each day, U.S. Customs and Border Protection (CBP) oversees the massive flow of people, capital, and products that enter and depart the United States via air, land, sea, and cyberspace. The volume and complexity of both physical and virtual border crossings require solutions that promote efficient trade and travel. Effective solutions also help CBP ensure that the movement of people, capital, and products is legal, safe, and secure. In response to this challenge, ManTech, as a trusted mission partner of CBP, seeks capable, qualified, and versatile SME Data Engineers to help develop complex data analytics solutions that law enforcement personnel use to assess the risk of potential threats entering the country.
Responsibilities include, but are not limited to:
- Developing large-volume data sets sourced from a multitude of relational (Oracle) tables so that data analysts and scientists can construct training sets for machine learning models as well as recurring reports and dashboards (an illustrative sketch of this kind of extract follows the Minimum Qualifications list).
- Working closely with the government client and technical teams (DBAs, etc.) to create, manage, and optimize scheduled Extract, Transform, and Load (ETL) jobs and workflows.
- Applying data analysis, problem solving, investigation, and creative thinking to manage very large datasets used in a variety of formats for varying analytical products.
- Assisting with the implementation of data migrations/pipelines from on-prem to cloud/non-relational storage platforms.
- Responding to data queries and analysis requests from various groups within the organization.
- Creating and publishing regularly scheduled and/or ad hoc reports as needed.
- Researching and documenting data definitions and provenance for all subject areas and primary data sets supporting the core business applications.
- Managing data engineering source code control using GitLab.
Minimum Qualifications:
- Experience with relational databases and knowledge of query tools and/or BI tools such as Power BI or OBIEE, as well as data analysis tools.
- Experience with the Hadoop ecosystem, including HDFS, YARN, Hive, and Pig, and with batch-oriented and streaming distributed processing frameworks such as Spark, Kafka, or Storm.
- Strong experience automating ETL jobs via UNIX/Linux shell scripts and cron jobs.
- A strong practical understanding of data warehousing in a production relational database environment.
- Strong experience using analytic functions within Oracle or similar tools within non-relational database systems (MongoDB, Cassandra, etc.).
- Experience with the Atlassian suite of tools, such as Jira and Confluence.
- Knowledge of Continuous Integration and Continuous Delivery (CI/CD) tools.
- Ability to multitask efficiently and to work comfortably in an ever-changing data environment.
- Ability to work well in a team environment as well as independently.
- Excellent verbal/written communication and problem-solving skills, with the ability to communicate information to a variety of groups at different technical skill levels.
- A high school diploma and 20 years of experience, an Associate's degree and 18 years of experience, a Bachelor's degree and 12 years of experience, a Master's degree and 9 years of experience, or a PhD and 7 years of experience.
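As an illustration of the kind of Oracle-to-training-set extract described in the first responsibility above, the minimal Python sketch below aggregates rows from a hypothetical transactional table and writes them to Parquet using the python-oracledb and pandas libraries. The table name, columns, and connection details are assumptions for illustration only, not details taken from this posting.

    # Illustrative sketch only: table, columns, and connection values are hypothetical.
    import oracledb              # python-oracledb driver
    import pandas as pd          # to_parquet requires pyarrow (or fastparquet) installed

    def extract_training_set(dsn: str, user: str, password: str, out_path: str) -> None:
        """Pull an aggregated slice of transactional data and write it as Parquet."""
        query = """
            SELECT account_id,
                   TRUNC(event_date) AS event_day,
                   COUNT(*)          AS event_count,
                   SUM(amount)       AS total_amount
            FROM   events                          -- hypothetical source table
            WHERE  event_date >= TRUNC(SYSDATE) - 30
            GROUP  BY account_id, TRUNC(event_date)
        """
        with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
            with conn.cursor() as cur:
                cur.execute(query)
                columns = [d[0].lower() for d in cur.description]
                df = pd.DataFrame(cur.fetchall(), columns=columns)
        # Parquet keeps the extract compact and easy for analysts and data scientists to load.
        df.to_parquet(out_path, index=False)

    if __name__ == "__main__":
        extract_training_set("db-host/ORCLPDB1", "etl_user", "changeme", "training_set.parquet")

A script like this could then be invoked from a UNIX/Linux shell script scheduled via cron, in line with the ETL automation called for in the Minimum Qualifications.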
Preferred Qualifications:
- 5 years of experience developing, maintaining, and optimizing complex Oracle PL/SQL packages that aggregate transactional data for consumption by data science/machine learning applications.
- 10 years of experience working in a large (80 TB), complex data warehousing environment, with full life cycle experience in design, development, deployment, and monitoring.
- Experience with one or more relational database systems such as Oracle, MySQL, PostgreSQL, or SQL Server, with a heavy emphasis on Oracle.
- Experience architecting data engineering pipelines and data lakes within cloud services (AWS, GCP, etc.).
- Experience with Amazon S3, Redshift, EMR, and Scala.
- Experience migrating on-prem legacy database objects and data to the Amazon S3 cloud environment.
- Strong experience converting JSON documents to targets such as Parquet, Postgres, and Redshift (see the sketch at the end of this posting).
- Experience or familiarity with data science/machine learning, including development experience with supervised and unsupervised learning on structured and unstructured datasets.
Security Clearance Requirements:
- Must be a U.S. Citizen with the ability to obtain CBP Suitability and a Top Secret clearance.
Physical Requirements:
- The person in this position needs to occasionally move about inside the office to access file cabinets and office machinery, and to communicate with co-workers, management, and customers, which may involve delivering presentations.
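As a minimal, non-authoritative sketch of the JSON-to-Parquet conversion named in the Preferred Qualifications, the Python example below assumes newline-delimited JSON input and the pandas/pyarrow libraries; the file names and field layout are illustrative assumptions rather than details from this posting.

    # Illustrative sketch only: file paths and field names are hypothetical.
    import pandas as pd

    def json_to_parquet(json_path: str, parquet_path: str) -> None:
        """Convert a newline-delimited JSON file into a columnar Parquet file."""
        # lines=True treats each input line as one JSON document (JSON Lines format).
        df = pd.read_json(json_path, lines=True)
        # Nested objects can be flattened with pd.json_normalize before writing, if needed.
        df.to_parquet(parquet_path, engine="pyarrow", index=False)

    if __name__ == "__main__":
        json_to_parquet("documents.jsonl", "documents.parquet")

A similar DataFrame could instead be written to a Postgres or Redshift target over a SQL connection rather than to a Parquet file, depending on the downstream consumer.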