Cloud Monitoring/Observability Engineer

Equifax

Alpharetta, GA, USA
Posted on Sep 12, 2025

Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you.

We are looking for an experienced professional to join the Operational Resilience organization and work on the Global Monitoring and Observability team. In this exciting role you will assist Equifax SREs and Developers with ensuring the stability of our critical applications and infrastructure, set global standards, and develop related automation and compliance reporting. The ideal candidate will have a strong background in observability and a passion for building robust, automated and proactive monitoring capabilities.

This role requires participation in a two-week 24/7 on-call rotation every 6-8 weeks.

Equifax has a hybrid work schedule that allows for 2 days of remote work (Monday and Friday), with 3 required onsite days (Tuesday, Wednesday, Thursday) every week.

This role will work the required onsite days at our Equifax office in Alpharetta, Georgia.

This position does not offer immigration sponsorship (current or future) including F-1 STEM OPT extension support.

This position is not open to third-party vendors or C2C.

What you will do

  • Design and implement monitoring solutions for GCP environments, including Compute Engine, GKE, Cloud Functions and Cloud Storage.

  • Configure and maintain monitoring tools (Datadog, GCP Cloud Monitoring, CloudWatch) for comprehensive application and infrastructure monitoring, including metrics, logs and traces.

  • Establish and improve operational resilience by creating, implementing and maintaining governance processes for on-call rotations, monitoring and alerting, and by participating in those rotations. This includes developing reporting processes to track adherence to policies across our tooling, such as PagerDuty, Datadog and other cloud monitoring platforms.

  • Enforce Site Reliability Engineering (SRE) principles, focusing on governance of reliability, observability and performance.

  • Automate monitoring tasks, alerting configurations and incident response workflows.

  • Collaborate with development, operations and security teams to improve system reliability and availability.

  • Participate in a two-week 24/7 on-call rotation every 6-8 weeks.

What experience you need

  • BS degree in Computer Science or related technical field.

  • 5-7 years of related experience in Site Reliability Engineering or general monitoring and observability.

  • Cloud Expertise: Strong experience with GCP Cloud Services and architecture.

  • Monitoring & Observability: Expertise with observability tools (Datadog, GCP Cloud Monitoring) for full-stack monitoring, including hands-on experience with PagerDuty for incident management.

  • SRE & Automation: Solid understanding of SRE principles (SLOs, SLIs) and a track record of automating tasks with scripting (Python, Go) and CI/CD tools (Terraform, Jenkins).

  • Visualization & Reporting Tools: Experience with visualization tools (Grafana, Looker Studio).

What could set you apart

  • Experience with ITIL processes, particularly in incident, problem and change management.

  • Experience working in a regulated industry, such as financial services.

  • Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes).

  • Cloud certification strongly preferred.

We offer comprehensive compensation and healthcare packages, 401k matching, paid time off, and organizational growth potential through our online learning platform with guided career tracks.

Are you ready to power your possible? Apply today, and get started on a path toward an exciting new career at Equifax, where you can make a difference!

Primary Location: USA-GA-Alpharetta-JVW3

Function: Function - Tech Dev and Client Services

Schedule: Full time