DevOps Support Engineer
Platform support at CELUM, homelab tinkerer, previously shipped an AI assistant at Steelcase.
Experience
CELUM (Contractor via Weasweb) – DevOps Support Engineer
L3 platform operations for a Digital Asset Management platform across Azure/AKS and on-prem. Root-cause isolation using Grafana + centralized logs, structured bug/feature escalation, and operational lifecycle management.
- Troubleshoot service health, configuration, and connectivity across AKS/Kubernetes, Docker, Linux, and Azure SQL.
- Root-cause analysis using Grafana dashboards, centralized logging, and API validation (Swagger/Postman) to inform escalation decisions.
- Create structured engineering items: bugs with repro steps, feature requests with acceptance criteria, and support requests with evidence + impact assessment.
- Handle operational requests: access provisioning, environment lifecycle ops, and data/integration triage with SQL + API validation.
Steelcase – Service Desk Analyst L2
Shipped an AI chatbot with zero dedicated allocation, led security incident response, and drove process automation across the NA support team.
- Proposed and delivered AI chatbot MVP (RAG-based ServiceNow assistant in Teams): designed dual-agent architecture, ran structured testing rounds, analyzed failure modes (traced to KB gaps, not retrieval logic), and produced rollout plan with automation roadmap (Graph Connector → Power Automate ticket creation).
- Led security incident response: investigated phishing campaigns, stopped a live intrusion attempt, and improved escalation workflows.
- Proposed and delivered process improvements: phishing workflow changes, DL/SG provisioning automation, alert/incident management optimization.
- Daily exposure to enterprise tooling: AD, ServiceNow, SCCM/Intune, Exchange admin, Cisco ISE, Darktrace, Fortimail.
- Participated in monthly on-call rotation for critical incidents.
Electronic Arts – BioWare – QA Tester
Quality assurance for Star Wars: The Old Republic; built testing discipline, issue triage, and crisp reporting practices.
- Supported 3-month release cycles through regression, feature, black-box, A/B, and compliance testing.
- Automated repetitive test setup tasks with batch scripts (hours → minutes).
- Collaborated with distributed teams across Disney, EA, and BioWare.
Projects
tresor homelab
Self-hosted production platform managed entirely as infrastructure-as-code. Three hosts (prod, QA VM, Hetzner VPS edge), five isolated Docker networks, zero-trust ingress, and a full observability stack. Provisioned end-to-end with 22 Ansible roles and 120+ playbooks. No manual steps after SSH is up.
- Hetzner VPS as public edge (Velocity for Minecraft, nginx for Jellyfin Music + FileBrowser over WireGuard); tresor as private backend with no exposed ports. All public traffic enters through Cloudflare Tunnel or WireGuard.
- Five Docker networks enforcing strict traffic boundaries: public_net (CF Tunnel + Traefik), internal_net (LAN-only), mc_net/mc_pub (Minecraft backend + WG bridge), lan_pub. LAN services bind to 192.168.0.42, WG services to 10.66.66.2, never 0.0.0.0.
- Version pinning centralized in group_vars/prod/versions.yml, Vault per environment (prod/qa/vps), consistent 8-playbook lifecycle per service (deploy, remove, backup, start, stop, restart, status, update), infra-wide backup-all.yml and status-all.yml rollups.
- tresor-ctl: Python TUI (Rich + Questionary + Paramiko) that auto-discovers services from the playbook directory tree and runs lifecycle actions over live SSH.
- QA environment (tresor-vm KVM sandbox) for testing roles before prod. verify-*.yml playbooks confirm base, Docker, networks, and WireGuard state post-deploy.
- Services: Jellyfin, Jellyfin Music, FileBrowser (personal + 1TB friends drop), Grafana + Prometheus + Node Exporter + cAdvisor, Uptime Kuma, Kiwix (110GB offline Wikipedia), PaperMC, three Discord bots.
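A minimal sketch of how service auto-discovery from a playbook tree could work. This is not the actual tresor-ctl code: the layout <root>/<service>/<action>.yml and the function names discover_services and missing_actions are assumptions made for illustration. Only the eight lifecycle action names come from the description above.

```python
from pathlib import Path

# The eight lifecycle actions listed above. The on-disk layout is assumed.
LIFECYCLE_ACTIONS = (
    "deploy", "remove", "backup", "start",
    "stop", "restart", "status", "update",
)


def discover_services(playbook_root):
    """Map each service directory to the lifecycle actions it provides.

    Assumes a hypothetical layout of <playbook_root>/<service>/<action>.yml.
    Directories without any lifecycle playbook are ignored.
    """
    services = {}
    for service_dir in sorted(Path(playbook_root).iterdir()):
        if not service_dir.is_dir():
            continue
        present = {playbook.stem for playbook in service_dir.glob("*.yml")}
        actions = [a for a in LIFECYCLE_ACTIONS if a in present]
        if actions:
            services[service_dir.name] = actions
    return services


def missing_actions(services):
    """Return, per service, the lifecycle actions that have no playbook."""
    return {
        name: [a for a in LIFECYCLE_ACTIONS if a not in actions]
        for name, actions in services.items()
        if len(actions) < len(LIFECYCLE_ACTIONS)
    }
```

Deriving the menu from the directory tree means a new service appears in the TUI as soon as its playbooks exist, and missing_actions gives a quick check that every service keeps the full 8-playbook lifecycle.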
nursing pas cu pas platform
Took over a rough handoff and turned it into something that feels maintainable: clearer repo boundaries, a sane path into QA, a proper image pipeline, and enough cleanup in the app itself that it reads like a real platform instead of a stitched-together demo.
- Pulled the work apart into frontend, backend, and infra repos so ownership was obvious and deployment concerns stopped leaking through the app.
- Defined the runtime layout: Angular on the public side, NestJS behind /api, Redis for cached quiz state, and Postgres as the source of truth.
- Replaced the old image handling path with signed uploads plus CDN delivery, so question media stopped bloating the normal request flow.
- Wired app changes into a repeatable QA release path through GitHub Actions and Ansible instead of manual VPS updates.
- Helped steady the product itself too: question and category management, image-backed quizzes, Romanian copy, and cleaner theme behavior.
- Captured the architecture, delivery flow, and sanitized examples so the platform could be understood without tribal knowledge.
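The split between a cache for quiz state and a database as the source of truth can be illustrated with a cache-aside sketch. The real backend is NestJS with Redis and Postgres. The version below is Python with plain dicts standing in for both stores, and the class name QuizStateStore and the quiz:<id> key format are assumptions, not the platform's actual code.

```python
import json


class QuizStateStore:
    """Cache-aside access: read the cache first, fall back to the database."""

    def __init__(self, cache, database):
        self.cache = cache        # stand-in for Redis (any dict-like mapping)
        self.database = database  # stand-in for Postgres (any dict-like mapping)

    def get(self, quiz_id):
        key = f"quiz:{quiz_id}"   # hypothetical key format
        cached = self.cache.get(key)
        if cached is not None:
            return json.loads(cached)
        # Cache miss: load from the source of truth and populate the cache.
        state = self.database.get(quiz_id)
        if state is not None:
            self.cache[key] = json.dumps(state)
        return state

    def save(self, quiz_id, state):
        # Write to the source of truth first, then drop the stale cache entry.
        self.database[quiz_id] = state
        self.cache.pop(f"quiz:{quiz_id}", None)
```

Invalidating on write rather than updating the cache in place keeps the database authoritative: the next read repopulates the cache from whatever was actually stored.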
service desk ai chatbot
RAG-based assistant for Steelcase's service desk team to surface ServiceNow documentation directly in Teams. Built the MVP end-to-end with zero dedicated allocation, from architecture to testing to rollout planning.
- Dual-agent architecture: one bot for end users, one for the service desk team, both backed by the same knowledge source.
- Ran structured testing rounds with the team and validated answer consistency across both agents.
- Analyzed response gaps: most failures traced to missing or outdated KB articles, not retrieval logic. Documented remediation list.
- KB ingestion via manual PDF export from ServiceNow into SharePoint (documented automation path via Graph Connector).
- Designed phased roadmap: Azure AI Search indexing → Power Automate ticket creation from chat → auto-sync when new KBs are published.
- Produced full demo deck and rollout documentation for leadership review.
other selected projects
portfolio website (this site)
End-to-end static site build: domain registration, DNS config, responsive frontend, CF Pages deployment, Worker backend, Turnstile integration, and AWS SES in sandbox with verified addresses.
batch-yt-downloader
Bash script using yt-dlp to mirror YouTube "Liked Videos" playlists into local MP3 files with thumbnails and metadata. Music synced to Jellyfin for streaming across devices.
life-scheduler
Trello-based automation combining Butler rules with a Python script (run via GitHub Actions) to keep daily rituals and recurring tasks self-managing.
email spam detection
End-to-end KDD pipeline on classic spam datasets: cleaning, feature engineering, model training and evaluation (LR, RF, GB).
Education & Learning Path
Informatics & Economics (FSEGA)
Prioritized full-time roles at EA and Steelcase.
Contact
Open to platform, DevOps, and infrastructure roles. Drop a message or reach out directly.