I'm a Senior Site Reliability Engineer at Cisco (Splunk) who sits at the intersection of DevOps and platform engineering. I build the tools and infrastructure that make other engineers' lives easier -- from CI/CD pipelines to internal developer platforms. When something breaks at 2 a.m., I'm probably already looking at it.
Building and maintaining internal developer platforms that streamline deployments, reduce toil, and empower engineering teams to ship faster with self-service tooling.
Designing and implementing infrastructure as code with Terraform, Puppet, and CI/CD pipelines to minimize manual intervention and increase deployment reliability.
Monitoring systems and infrastructure to ensure production operability, serving on-call for critical environments, and driving rapid incident resolution to minimize downtime.
Running production workloads on Kubernetes and Docker, working with Kubernetes operators for Splunk Cloud, maintaining Git repositories and Dockerfiles, and promoting container-first workflows across engineering teams.
Working within AWS and FedRAMP/GovCloud environments, managing secrets and access policies with HashiCorp Vault, and collaborating with security teams to address vulnerabilities in regulated infrastructure.
Developing custom tools in Python, Go, and Bash to automate operational tasks -- from Slack bots for on-call reminders to bulk Jira ticket creation. Focused on reducing human error and building documented, repeatable processes for team adoption.
Build and maintain internal developer platforms and infrastructure tooling for FedRAMP environments. Manage secrets infrastructure with HashiCorp Vault, including policy authoring and role generation. Author reusable Terraform modules for core and network infrastructure, and maintain Puppet hieradata across GovCloud and FedRAMP stacks. Develop custom Go services and Kubernetes operator workflows. Build internal automation tools including Slack bots and bulk Jira integrations to streamline team operations.
Provided engineering support for production and staging environments to maintain 100% operability. Developed automation tools in Python, Go, and JavaScript with GitLab and Docker integration. Created training curricula and onboarded new hires through daily one-on-one shadowing. Built Splunk dashboards to track SLA metrics and delegated workloads across the team. Collaborated with security teams to address vulnerabilities in GovCloud environments.
Provided technical support and troubleshooting for enterprise customers, using Splunk SPL to investigate issues on customer stacks. Served on-call for high-priority cases and resolved them with quick turnaround. Filed thoroughly documented internal bug tickets for dev teams and collaborated with Account and Sales teams to ensure customer success.
Monitored transactions for virtual and bare-metal server provisions and reloads. Served as an escalation point for Systems Administrators and Engineers. Deployed and maintained international server environments for 24/7 critical uptime in a mixed Windows/Linux environment. Leveraged automation tools to decrease deployment times and increase reliability. Managed on-call support for critical business applications and maintained a complete system inventory.