Hello! I’m an experienced SRE/DevOps Engineer who loves learning and working with new tech. I got my start in Higher Ed. in Research Computing/HPC, doing the DevOps end of things for (mostly) a few High Performance Computing clusters for general use on campus. It was through this work that I was exposed to basically all parts of modern tech infrastructure, from networking/DNS, to config mgmt and CI/CD, to distributed storage and large-scale metrics collection/alerting. I’ve built these environments and their supporting infra from the ground up, and have also joined existing teams to help level up the tech stack and keep the lights on. While happy in my current role, I’m always looking for an interesting and meaningful place to contribute and grow.
Part of a cross-site, cross-functional team owning Reliability and Observability. Develop and operate an in-house metrics pipeline ingesting ~1.6M msgs/sec, site-to-site failover strategy and tooling, an SLI/SLO system, a service scorecard, centralized logging, etc. In this role I’ve had extensive exposure to Google Cloud, Docker, and Kubernetes.
Lead Engineer helping improve an existing HPC cluster while leading the design and build-out of an all-new cluster environment and all supporting infrastructure (such as config mgmt, bare-metal provisioning, virtualization, metrics/monitoring/alerting, etc.).
Helped bootstrap an OpenStack-based HPC computing environment to supplement the existing one, and formed the first “DevOps Team,” assisting developers with a Docker-based CI/CD environment for quick execution on a high-exposure project.
Grew from a junior position working directly with researchers/faculty to solve small-scale computing issues into a lead role running, as well as building, the “2.0” centralized HPC environment on campus, topping out at ~60k compute cores and 5PB of active storage. Was responsible for everything from DNS/DHCP infrastructure, config management, distributed/NFS storage, provisioning, and metrics and monitoring to the virtualization platform and visualization lab.