Conference Talk Proposals for HPC + DevOps + Kubernetes
Overview
Your unique positioning (10+ years HPC + modern DevOps + Kubernetes expertise) opens doors to several high-impact conference venues. These proposals are designed to establish you as a recognized expert in the HPC+Kubernetes intersection while building authority in the technical community.
Proposal 1: “Running Genomic Workloads on Kubernetes: From Local Dev to Multi-Cloud Production”
Target Conferences:
- KubeCon Europe/NA (primary - 2,000+ attendees)
- Bioinformatics Open Source Conference (BOSC) (secondary - 500-800 attendees)
- ISMB/ECCB (tertiary - 2,000+ life sciences researchers)
Duration: 45 minutes (presentation) + 15 minutes (Q&A)
Abstract:
Genomic analysis has entered the cloud era, but the tooling hasn’t caught up. Researchers need pipelines that run identically on laptops, local clusters, and cloud infrastructure—without rebuilding for each platform.
In this talk, I’ll share how we built a production genomic analysis platform on Kubernetes that handles 15,000+ samples/month while reducing infrastructure costs by 60% and enabling researchers to deploy their own pipelines in minutes.
Learn:
- Why Kubernetes is the right abstraction layer for compute-intensive scientific workloads
- A complete architecture using Argo Workflows, Kueue, and Karpenter
- Real metrics from deploying across Azure, AWS, and on-premises
- How to onboard researchers without requiring DevOps expertise
- Common pitfalls and how we solved them (memory management, storage I/O, cost optimization)
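To make the architecture concrete in the talk, a minimal Kueue setup could anchor the deep-dive section. This is an illustrative sketch only (queue names, quotas, and the namespace are invented, not from the actual platform), assuming a `ResourceFlavor` named `default-flavor` already exists:

```yaml
# Hypothetical Kueue setup: a ClusterQueue caps total CPU/memory for
# genomic batch jobs, and a namespaced LocalQueue feeds into it.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: genomics
spec:
  namespaceSelector: {}            # admit workloads from any namespace
  resourceGroups:
    - coveredResources: ["cpu", "memory"]
      flavors:
        - name: default-flavor     # assumes this ResourceFlavor exists
          resources:
            - name: cpu
              nominalQuota: 512    # cluster-wide CPU budget
            - name: memory
              nominalQuota: 2Ti
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: pipelines
  namespace: research
spec:
  clusterQueue: genomics
```

Argo Workflows submitted with a `kueue.x-k8s.io/queue-name: pipelines` label would then queue against this budget instead of landing on the cluster unthrottled.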
Who should attend:
- Bioinformaticians and research engineers adopting Kubernetes
- DevOps teams supporting research institutions or biotech companies
- Platform engineers building infrastructure for scientific computing
- Anyone running compute-intensive workloads (ML training, simulations, data processing)
Talk Outline (60 minutes total):
- 0-5 min: Problem statement: genomic workflows at scale
- 5-15 min: Why Kubernetes, and why Slurm isn't enough
- 15-30 min: Architecture deep-dive (Argo + Kueue + Karpenter)
- 30-40 min: Live demo or walkthrough (submit a workflow, watch it scale)
- 40-50 min: Real-world results and lessons learned
- 50-60 min: Q&A
Key Takeaway: Kubernetes isn’t just for microservices—it’s the platform for reproducible, portable scientific computing at scale.
Proposal 2: “Taming the Cost Beast: From $50k to $5k—Cost Optimization Patterns for HPC in the Cloud”
Target Conferences:
- KubeCon Europe/NA (primary - cost optimization track)
- DevOps Days (regional conferences - 500-1,000 attendees each)
- Open Infrastructure Summit (secondary - 2,000+ attendees)
- AWS re:Invent (tertiary - massive reach but harder to get accepted)
Duration: 45 minutes + 15 minutes Q&A
Abstract:
Cloud is expensive. HPC workloads are doubly expensive. Most organizations over-provision infrastructure, pay for unused capacity during off-peak hours, and have no visibility into cost attribution.
I’ll show you how we reduced monthly infrastructure costs from $50k to $5k (a 90% reduction) while maintaining 99.5% availability and improving utilization from 18% to 72%.
Learn a multi-layer optimization strategy:
- Layer 1: Right-sizing compute resources
- Layer 2: Dynamic workload distribution (Kueue + priority classes)
- Layer 3: Auto-scaling patterns (HPA, cluster autoscaler, Karpenter)
- Layer 4: Reserved capacity strategies (mixing on-demand and spot)
- Layer 5: Multi-cloud arbitrage and zone optimization
I’ll share the exact tools, YAML configurations, and governance patterns that made this possible—plus the hard-won lessons about tradeoffs between cost and latency.
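As a flavor of the configurations covered in the talk, a Karpenter NodePool that mixes spot and on-demand capacity (Layers 3 and 4) might look like the following. This is a hedged sketch, not the production config; the pool name, CPU limit, and `EC2NodeClass` name are placeholders:

```yaml
# Illustrative Karpenter NodePool: allows both spot and on-demand
# capacity (Karpenter prefers spot when available) and consolidates
# idle nodes to avoid paying for unused capacity.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: batch
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]   # spot first, on-demand fallback
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                      # assumed to exist
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # scale down idle capacity
  limits:
    cpu: "1000"                            # hard ceiling on pool size
```

The `limits` block doubles as a governance guardrail: even a runaway batch queue cannot provision past the pool's CPU ceiling.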
Who should attend:
- Platform engineers managing cloud infrastructure budgets
- DevOps teams responsible for cost governance
- Engineering managers wondering why their AWS bill is $200k/month
- Anyone running batch workloads (genomics, ML training, simulations, data processing)
Talk Outline (60 minutes total):
- 0-5 min: Cost problem statement (metrics from a real organization)
- 5-15 min: Cost structure breakdown (what's actually expensive)
- 15-30 min: Multi-layer optimization strategy walkthrough
- 30-45 min: Real configurations and tool recommendations (Karpenter, Kueue, Kubecost)
- 45-55 min: Results, metrics, and lessons learned
- 55-60 min: Q&A
Key Takeaway: A 90% cloud cost reduction is achievable without sacrificing reliability—you just need the right architecture and governance.
Proposal 3: “Infrastructure-as-Code for HPC: Building Reproducible, Multi-Cloud Deployments”
Target Conferences:
- DevOps Days (regional focus - excellent fit)
- HashiConf (HashiCorp's Terraform-focused conference)
- Open Source Summit (primary)
- KubeCon (secondary)
Duration: 45 minutes + 15 minutes Q&A
Abstract:
Infrastructure-as-Code (IaC) is non-negotiable for modern DevOps. But applying IaC principles to HPC environments is non-trivial: multiple clouds, heterogeneous workloads, complex networking requirements.
In this talk, I’ll share a production IaC architecture that enables:
- Single codebase, multi-cloud deployment (AWS, Azure, on-premises)
- Reproducible infrastructure: destroy and recreate in < 30 minutes
- GitOps workflows: infrastructure changes via pull requests
- Cost governance: parameterized infrastructure for different performance/cost tradeoffs
Learn:
- Terraform module design patterns for HPC environments
- Managing secrets and state securely at scale
- Testing infrastructure changes before deployment
- GitOps workflows with ArgoCD
- Disaster recovery and rollback strategies
- Real metrics: from manual deployments (3 weeks) to IaC deployments (30 minutes)
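The GitOps piece of this architecture can be illustrated with a single Argo CD `Application` manifest. This is a sketch with invented names (the repo URL, path, and namespaces are placeholders), showing the pattern where every infrastructure change flows through a pull request and Argo CD reconciles the cluster to match Git:

```yaml
# Illustrative Argo CD Application: points the cluster at a Git repo
# and keeps it continuously synced to the declared state.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-infra
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/platform-infra.git   # placeholder repo
    targetRevision: main
    path: clusters/production
  destination:
    server: https://kubernetes.default.svc
    namespace: platform
  syncPolicy:
    automated:
      prune: true       # delete resources removed from Git
      selfHeal: true    # revert out-of-band manual changes
```

With `prune` and `selfHeal` enabled, rollback becomes `git revert` plus a sync—which is the disaster-recovery story in miniature.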
Who should attend:
- Infrastructure engineers building platforms
- DevOps teams managing multiple environments
- SREs responsible for reliability and reproducibility
- Anyone managing infrastructure across multiple clouds
Talk Outline (60 minutes total):
- 0-5 min: Problem: manual infrastructure deployments (pain points)
- 5-15 min: IaC philosophy and benefits for HPC
- 15-30 min: Terraform design patterns (modules, variables, outputs)
- 30-45 min: Multi-cloud architecture walkthrough + live demo
- 45-55 min: GitOps workflows and disaster recovery
- 55-60 min: Q&A
Key Takeaway: Infrastructure-as-Code isn’t just about automation—it’s about reproducibility, auditability, and the ability to scale from one datacenter to many without doubling your operational burden.
Proposal 4: “Migrating from Slurm to Kubernetes: A Practical Guide for HPC Administrators”
Target Conferences:
- SC (Supercomputing) Conference (primary - 10,000+ HPC professionals)
- ISC High Performance (secondary - 3,000+ international HPC community)
- HPC Systems Symposium (tertiary)
Duration: 90 minutes (workshop format fits this material better than a standard talk)
Abstract:
Kubernetes is coming to HPC. The question isn’t “if” but “when” and “how.”
This workshop demystifies the Kubernetes migration path for HPC teams. We’ll cover practical aspects: how existing Slurm workflows map to Kubernetes, what you gain and lose, and realistic migration timelines.
Topics:
- Kubernetes concepts for HPC admins (pods, operators, resource management)
- Slurm → Kubernetes comparison (schedulers, workflows, resource allocation)
- Real-world migration strategies (phased approach, parallel operation)
- Tools for HPC-style workloads (Kueue for job queuing, Argo for workflows)
- Case study: migrating 5,000+ jobs/month from Slurm to Kubernetes
- Cost implications and ROI analysis
By the end, attendees will have a clear migration roadmap for their own organizations.
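A small example helps HPC admins see the Slurm → Kubernetes mapping directly. The sketch below shows a rough Kubernetes `Job` equivalent of a simple `sbatch` submission; the image name, script, and resource figures are illustrative, and the mapping is approximate (Kubernetes has no exact analogue of every Slurm flag):

```yaml
# Rough Kubernetes equivalent of:  sbatch --cpus-per-task=8 --mem=32G run.sh
apiVersion: batch/v1
kind: Job
metadata:
  name: align-sample
spec:
  backoffLimit: 2                    # bounded retries, loosely ~ --requeue
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: align
          image: example/aligner:1.0   # placeholder container image
          command: ["bash", "run.sh"]
          resources:
            requests:
              cpu: "8"                 # ~ --cpus-per-task=8
              memory: 32Gi             # ~ --mem=32G
            limits:
              memory: 32Gi             # hard memory ceiling (OOM-kill above this)
```

The key conceptual shift is that resource requests are declared per container and enforced by the kubelet, rather than negotiated with a central scheduler daemon at submit time.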
Who should attend:
- HPC cluster administrators and systems engineers
- Research computing teams considering cloud/Kubernetes adoption
- DevOps engineers supporting HPC users
- Anyone managing Slurm clusters thinking about modernization
Proposal 5: “Observability for HPC: Monitoring and Debugging Compute Workloads at Scale”
Target Conferences:
- Observability-focused conferences (e.g., CloudConf, Grafana's ObservabilityCON)
- KubeCon (reliability/observability track)
- DevOps Days
Duration: 45 minutes + 15 minutes Q&A
Abstract:
You deployed a Kubernetes cluster for HPC workloads. Great! Now why is your job taking 2x longer than expected? Which node has the bottleneck? Where did your 100 jobs disappear to?
Observability for HPC is different from observability for microservices. You care about:
- Job execution time and resource utilization
- Network I/O patterns between nodes
- Storage bottlenecks and queue depth
- Cluster-wide trends, not individual pod metrics
Learn:
- Prometheus metrics that matter for HPC workloads
- Custom metrics for job performance tracking
- Debugging techniques for containerized scientific computing
- Cost tracking and attribution per team/project
- Building dashboards that HPC users actually use
I’ll share observability patterns from monitoring 15,000+ monthly job runs across multiple clusters.
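The "metrics that matter" section could be grounded with a couple of Prometheus recording rules built on standard cAdvisor and kube-state-metrics series. The rule names below are invented for illustration; the underlying metric names are standard, but the exact label filters would need tuning per cluster:

```yaml
# Illustrative Prometheus recording rules for cluster-wide job trends.
groups:
  - name: hpc-jobs
    rules:
      # Fraction of requested CPU actually used, cluster-wide —
      # the utilization number HPC teams care about most.
      - record: hpc:cpu_utilization:ratio
        expr: |
          sum(rate(container_cpu_usage_seconds_total{container!=""}[5m]))
            /
          sum(kube_pod_container_resource_requests{resource="cpu"})
      # Failed jobs currently reported by kube-state-metrics.
      - record: hpc:jobs_failed:sum
        expr: sum(kube_job_status_failed)
```

Precomputing these ratios keeps dashboards fast even across tens of thousands of monthly job runs, and gives alerting a stable series to threshold against.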
Talk Outline (60 minutes):
- 0-5 min: Problem statement (why HPC observability is hard)
- 5-15 min: Metrics that matter (throughput, latency, utilization, cost)
- 15-30 min: Prometheus setup for HPC workloads
- 30-45 min: Custom metrics and dashboards
- 45-55 min: Debugging real HPC performance issues
- 55-60 min: Q&A
Submission Strategy & Timeline
Phase 1: Immediate (Next 2-4 weeks)
- KubeCon Europe 2026: Deadline likely Nov 2025 (submit ASAP)
  - Proposals 1, 2, 3 are strong fits
  - Target: get at least one accepted
- DevOps Days (regional): Rolling submissions
  - Proposals 2, 3 are strong fits
  - Multiple opportunities per year
Phase 2: Medium-term (2-3 months)
- Open Infrastructure Summit 2026: CFP typically Feb-Mar 2026
  - Proposal 2 (cost optimization)
  - Proposal 3 (IaC)
- SC (Supercomputing) Conference 2026: CFP typically opens early 2026 (for the Nov conference)
  - Proposal 4 (Slurm → Kubernetes)
  - Proposal 5 (Observability)
Phase 3: Long-term (6+ months)
- ISC High Performance 2026: CFP typically Dec 2025/Jan 2026
  - Proposal 4 (Slurm migration)
  - Proposal 1 (Genomic workloads)
How to Strengthen Proposals
Each proposal should include:
- Speaker Bio (50 words): "Senior HPC Architect & DevOps Engineer with 10+ years' experience scaling high-performance computing infrastructure. Currently leading a genomic analysis platform on Kubernetes processing 15,000+ samples/month across multiple clouds. Open-source contributor (Kubernetes, Kueue). AWS Certified Solutions Architect."
- Talk Type: Technical deep-dive (not introductory)
- Target Audience: Be specific (bioinformaticians, HPC admins, DevOps engineers)
- Learning Outcomes: What will attendees know or be able to do after your talk?
  - Understand architecture decisions and tradeoffs
  - Know specific tools and configurations to use
  - Avoid common pitfalls based on real-world experience
- Difficulty Level: Intermediate to Advanced (not beginner-friendly)
Success Metrics
For 2026 (Year 1):
- Goal: Submit to 5+ conferences
- Goal: Get 1-2 talks accepted
- Goal: Deliver at least 1 talk (reach 500+ attendees)
For 2027 (Year 2):
- Goal: 3-4 talks delivered
- Goal: 2,000+ combined attendees reached
- Goal: Become recognized speaker in HPC+Kubernetes space
Next Steps
- Choose 2-3 proposals that align with your interests
- Research deadlines for target conferences
- Customize abstracts for each venue (KubeCon abstract ≠ DevOps Days abstract)
- Submit early (deadlines are hard stops)
- Track submissions in a spreadsheet (conference, deadline, submitted date, status)
- Prepare backup plans (if one doesn’t get accepted, you have others in pipeline)
Pro tip: Don’t wait for “perfect readiness.” Submit to multiple venues. The acceptance rate is typically 20-30%, so you need volume.
Good luck! The HPC + Kubernetes community needs more voices like yours.