AlienVault

Senior-Big Data Software Engineer

AlienVault, Dallas, Texas, United States, 75215

Job Overview

Job Description:

POSITION: Senior-Big Data Software Engineer

JOB LOCATION: 208 S. Akard St. Dallas, TX 75202 [and various unanticipated locations throughout the U.S.; may work from home]

Senior-Big Data Software Engineer needed by AT&T Services, Inc. in Dallas, Texas [and various unanticipated locations throughout the U.S.; may work from home] to be responsible for the development of high-performance, distributed computing tasks using Big Data technologies such as Hadoop, NoSQL, text mining and other distributed environment technologies. Utilize JVM-based functional languages including Scala and Clojure; Hadoop query languages including Pig, Hive, Scalding, Cascalog, and PyCascading; and alternative HDFS-based computing frameworks including Spark and Storm. Utilize big data programming languages and technology, write code, complete programming and documentation, and perform testing and debugging of applications. Analyze, design, program, debug and modify software enhancements and/or new products used in distributed, large-scale analytics and visualization solutions. Interact with data scientists and industry experts to understand how data needs to be converted, loaded and presented.

Develop Python Spark (PySpark) ETL and ELT pipelines to perform distributed big data processing in both batch and real-time streaming from different data sources. The sources will be Snowflake, MongoDB, Redis cache, Microsoft Event Hubs and Kafka topics for real-time machine model training and service. Integrate multiple cloud data SaaS technologies with cloud provider services like Azure Function App. Create serverless framework asynchronous APIs for distributed data processing and computing tasks. Develop and maintain different high-performance Kubernetes Spark application microservices and RESTful APIs for distributed big data processing and computation tasks.
Perform analytics on structured and semi-structured data using PySpark, Spark SQL queries, operations, joins, query tuning and UDFs. Integrate the cloud data lakes with downstream cloud provider visualization and analytics tools such as Power BI. Develop robust and highly scalable code by implementing several design patterns. Ensure the application performs at high speed by applying high-performance data structures. Interact with other external interfaces to exchange different forms of data, using CLI and RESTful serverless framework. Handle different file formats such as CSV, JSON, multiline JSON, nested JSON, text, Avro, and Parquet, along with Snappy, bz2, and gzip compression. Develop different Spark pipelines in Databricks and schedule the notebook jobs for daily sync jobs. Monitor the feature engineering pipelines. Develop scripts, using PySpark and Spark SQL, to convert schema-less data into more structured files for further analysis. Convert SQL queries to Spark using Spark RDDs, Scala, and Python.
Participate in creating combiners, partitions, and cache distribution to improve the performance of Spark jobs.

MINIMUM REQUIREMENTS: Requires a Master's degree, or foreign equivalent degree, in Computer Science or Information Studies and 3 years of experience in the job offered or 3 years in a related occupation developing Python Spark (PySpark) ETL and ELT pipelines, distributing big data processing in both batch and real-time streaming from different data sources; performing analytics on structured and semi-structured data using PySpark, Spark SQL queries, operations, joins, query tuning and UDFs; developing robust and highly scalable code by implementing several design patterns; developing scripts, using PySpark and Spark SQL, to convert schema-less data into more structured files for further analysis; converting SQL queries to Spark using Spark RDDs, Scala, and Python; utilizing big data programming languages and technology, writing code, completing programming and documentation, and performing testing and debugging of applications; analyzing, designing, programming, debugging and modifying software enhancements and/or new products used in distributed, large-scale analytics and visualization solutions; interacting with data scientists and industry experts to understand how data needs to be converted, loaded and presented.

Our Senior-Big Data Software Engineers earn between $157,322 and $231,100 yearly.
Not to mention all the other amazing rewards that working at AT&T offers.

Joining our team comes with amazing perks and benefits:

Medical/Dental/Vision coverage
401(k) plan
Tuition reimbursement program
Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
Paid Parental Leave
Paid Caregiver Leave
Additional sick leave beyond what state and local law require may be available but is unprotected
Adoption Reimbursement
Disability Benefits (short term and long term)
Life and Accidental Death Insurance
Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
Employee Assistance Programs (EAP)
Extensive employee wellness programs
Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available) and AT&T phone

Weekly Hours: 40
Time Type: Regular
Location: Dallas, Texas

It is the policy of AT&T to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, AT&T will provide reasonable accommodations for qualified individuals with disabilities.

Job ID: R-33385
Date posted: 09/26/2024