Location: Onsite (Houston)
Employment Type: Full-Time
About the Role
We are seeking a highly skilled and hands-on Databricks Lead with deep expertise in Azure Databricks, PySpark, and Structured Streaming. This role is ideal for a senior engineer who brings a programming-first mindset, a deep understanding of distributed systems, and the ability to build high-performance, cost-optimized data solutions.
This is not a traditional ETL role — it requires someone with a strong grasp of Spark internals, declarative pipeline development, and real-time data processing. You’ll also provide technical leadership and collaborate closely with cross-functional teams to deliver robust, scalable data solutions.
Key Responsibilities
- Lead the design and development of scalable, high-performance data pipelines, streaming tables, and Delta Live Tables (DLT) using Azure Databricks and PySpark
- Drive Spark performance tuning and implement cost optimization strategies within the Databricks environment
- Build and manage real-time and batch workflows using Structured Streaming
- Leverage Delta Live Tables (DLT) and Lakehouse Declarative Pipelines (LDP) to build scalable, reliable, and maintainable data pipelines
- Collaborate with data architects, analysts, and business stakeholders to deliver robust data solutions
- Provide technical leadership through mentorship, code reviews, and architectural guidance
- Ensure system reliability and performance through proactive monitoring and best practices
Must-Have Qualifications
- Extensive hands-on experience with Azure Databricks and PySpark
- Strong programming background and in-depth knowledge of distributed data processing (beyond ETL tools)
- Proven expertise in Spark performance tuning and optimization; cost management within Databricks; Structured Streaming for real-time data processing; and Lakehouse Declarative Pipelines (LDP) and Delta Live Tables (DLT)
- Familiarity with the Azure data ecosystem (e.g., ADLS, Azure Data Factory, Synapse)
- Excellent communication skills and the ability to work collaboratively with cross-functional teams
- Willingness and ability to work onsite
Nice-to-Have
- Industry experience in energy, utilities, or heavy industry
- Knowledge of data governance, security, and monitoring
Job Types: Full-time, Contract, Permanent
Application Question(s):
- How many years of hands-on experience do you have with Databricks?
- How many years have you worked with Azure data services (e.g., ADLS, ADF, Synapse)?
- How many years of experience do you have working with PySpark?
- How many years of experience do you have with Structured Streaming for real-time data processing? (Ideal answer: 2)
- How many years of hands-on experience do you have with either Delta Live Tables (DLT) or Lakehouse Declarative Pipelines (LDP)?
Work Location: In person