System One

Engineer, Systems Lead

System One, Reston, Virginia, United States, 22090


ALTA IT has a Contract position open for a Datadog Systems Engineer.

Citizen or Green Card Holder Only

100% remote (EST). Candidates in the DMV area are preferred, in case occasional onsite presence in Reston, VA is needed.

Seeking a Lead Systems Engineer to support the Systems Monitoring initiatives.

Responsible for administering systems and applications monitoring tools. Expertise with at least one monitoring tool such as Datadog is required.

Datadog administration experience on the Linux platform, instrumenting Java-based applications running on the Tomcat application server.

Configuration experience in Infrastructure Monitoring, Network Monitoring and Centralized Logging.

Or similar Administration experience with ELK Stack - Elasticsearch (search and analytics engine), Logstash (ingest pipeline) and Kibana (visualization and creating dashboards).

Strong Linux platform (Red Hat) background.

Automation experience with scripting (Python, Shell, Ansible) preferred.

Understanding of SSL setup on Linux servers, including installing CA certificates.

Experience with network monitoring and knowledge of network components such as switches, routers, Palo Alto Networks devices, SNMP-based utilization monitoring, F5 load balancers, WebSEAL, Infoblox, Gigamon, and network mapping is a plus.

Working knowledge of other monitoring tools such as BigPanda and CloudBeat (Synthetic Monitoring) is desired.

Responsibilities include writing scripts; installing, managing, and maintaining the monitoring tools as needed; integrating them with other tools; and collaborating with other groups on their tools.

Tasks:

Manages, configures, and maintains the Datadog tool on the Linux platform.

Responsible for network, infrastructure/server (Linux, Windows, AIX), application, SNMP, and log monitoring using Datadog.

Configures centralized logging from sources such as WebSphere, Tomcat, and IHS web servers on AIX servers to Datadog on Linux.

Creates required dashboards with data visualization in Datadog.

Manages, configures, and maintains the Datadog APM tool on the Linux platform.

Responsible for instrumenting Java applications with Datadog, setting up health rules, and fine-tuning monitoring in Datadog.

Sets up End User Monitoring / Browser Real User Monitoring in Datadog for applications, using JavaScript injection.

Creates Selenium scripts to monitor business transactions using CloudBeat Synthetic Monitoring.

Provides support for all significant production issues.

Creates documentation to support the management and maintenance of Datadog and its related tools.

Analyzes tool data and usage.

Works with different Systems and Application Architecture teams to ensure that systems monitoring requirements are addressed early in the development process.

Assists in reviewing and analyzing business & system requirements and specifications for systems monitoring tool protocols and future tool usage.

Competencies:

Effective organizational, interpersonal, analytical, and communication skills, plus hands-on technical experience.

Self-motivated, adaptable to change, forward-thinking.

Must be able to prioritize and manage time under tight deadlines.

Enthusiasm to engage in continuous learning.

Strong technical skills and ability to work proactively.

Comfortable working under Project Manager supervision.

Specific Required Skills:

5-8 years of strong IT experience and good working knowledge of a variety of technology platforms.

A minimum of 3 years of hands-on experience installing, integrating, managing, maintaining, and supporting monitoring tools such as Datadog.

Or similar Log Management experience with ELK Stack.

Experience writing Shell, Python, Selenium, and VuGen scripts.

Experience with SSL certificates and encryption methods on Linux.

Experience in developing and implementing systems monitoring and alerting strategies.

Experience developing and documenting processes, procedures, and policies for tool usage and integration.

Knowledge of and experience with configuring alerts, dashboards, and ad-hoc reports.

Strong understanding of service level management (SLAs, SLRs, etc.).

Experience with data management tools and databases.

Experience troubleshooting systems and Java applications using monitoring tools such as Datadog.

Understanding and experience with both waterfall and agile Software Development Life Cycles (SDLC).

Bachelor of Science in Computer Science or related field or equivalent experience.

Experience with SAFe agile methodologies.
