We are an East Coast contracting firm dedicated to helping our clients achieve excellence. Join a mission-critical team supporting a major U.S. Government organization in modernizing its data architecture. As a Senior Data Engineer, you will design, build, and optimize scalable cloud-based data platforms, with a strong focus on Databricks, to enable secure, reliable data delivery for analytics, AI/ML, and congressionally mandated reporting. This is a high-impact role driving data modernization in a regulated federal environment.
Key Responsibilities
- Design and implement robust ETL/ELT pipelines using AWS services (S3, Glue, Lambda) and Databricks (Delta Lake, Unity Catalog, Auto Loader, Delta Live Tables) to ingest, transform, and publish data from diverse structured and unstructured sources (see the first sketch after this list).
- Develop scalable batch and real-time data processing workflows leveraging Apache Spark, Python, and streaming frameworks such as Kafka or Spark Structured Streaming (second sketch below).
- Collaborate closely with data scientists and analysts to support ML pipelines, feature engineering, model training data preparation, and BI dashboards.
- Apply best practices in data governance, schema evolution, metadata management, and data lineage using Unity Catalog and AWS Lake Formation.
- Ensure compliance with federal security standards (FISMA, NIST 800-53, Section 508) through encryption, access controls, auditing, and secure data handling.
- Manage infrastructure-as-code (IaC) deployments using Terraform or AWS CDK, and partner with DevOps teams for CI/CD, monitoring, and pipeline reliability.
- Optimize performance and cost through partitioning, Z-ordering, clustering, auto-scaling, and table maintenance in Databricks (third sketch below).
- Document workflows, technical designs, and data lineage to support auditability, transparency, and knowledge transfer.
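For context, a minimal sketch of the kind of Auto Loader ingestion pipeline the first bullet describes; the S3 paths and the Unity Catalog table name (main.analytics.events_bronze) are hypothetical placeholders, not project specifics:

```python
# Sketch: incremental JSON ingestion from S3 with Databricks Auto Loader,
# published to a Unity Catalog Delta table. All paths and names below are
# illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

raw = (
    spark.readStream.format("cloudFiles")               # Auto Loader source
    .option("cloudFiles.format", "json")                # incoming file format
    .option("cloudFiles.schemaLocation",                # schema inference/evolution state
            "s3://example-bucket/_schemas/events")
    .load("s3://example-bucket/raw/events/")
)

query = (
    raw.writeStream
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/events")
    .trigger(availableNow=True)                         # drain pending files, then stop
    .toTable("main.analytics.events_bronze")            # Unity Catalog bronze table
)
```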
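The second bullet's real-time path might look like the following Structured Streaming sketch; the Kafka broker, topic, message schema, and target table are all assumptions for illustration:

```python
# Sketch: streaming Kafka events into a Delta table with Spark Structured
# Streaming. Broker, topic, schema, and table names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_time", TimestampType()),
    StructField("payload", StringType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker.example.com:9092")
    .option("subscribe", "events")
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")                                      # flatten the parsed struct
)

query = (
    events.writeStream
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/kafka_events")
    .toTable("main.analytics.events_stream")
)
```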
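And the optimization bullet often reduces in practice to scheduled Delta table maintenance along these lines; again, the table and column names are placeholders:

```python
# Sketch: routine Delta table maintenance in Databricks. Table and column
# names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows by a commonly filtered column.
spark.sql("OPTIMIZE main.analytics.events_bronze ZORDER BY (event_time)")

# Remove data files no longer referenced by the table (7-day retention).
spark.sql("VACUUM main.analytics.events_bronze RETAIN 168 HOURS")
```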
Required Qualifications
- 5+ years of hands-on data engineering experience building production-grade pipelines on cloud platforms (AWS preferred).
- Strong expertise with Databricks (Delta Lake, Unity Catalog, Spark, Workflows, Auto Loader, Delta Live Tables).
- Proficiency in Python (and/or Scala) for data processing and scripting; familiarity with AWS Lambda and serverless architectures.
- Solid experience with AWS core services (S3, Glue, EMR, Lambda, IAM, CloudWatch).
- Deep knowledge of SQL, data modeling, and distributed processing with Apache Spark.
- Experience with data governance, security, and compliance in regulated environments (e.g., FISMA, NIST frameworks).
- Bachelor's degree in Computer Science, Engineering, or a related discipline.
- U.S. citizenship required.
- Principals only; no recruiting firms.
Preferred Qualifications
- Experience with Snowflake or other cloud data warehouses.
- Familiarity with streaming technologies (Kafka, Flink, or Databricks Structured Streaming).
- Certifications: Databricks Certified Data Engineer Associate/Professional, AWS Certified Data Analytics – Specialty, or the legacy AWS Certified Big Data – Specialty.
- Previous work in federal/government contracting environments.
Why Join?
- Contribute to high-visibility, congressionally mandated initiatives that directly impact national priorities.
- Work with cutting-edge tools in a modern data lakehouse environment.
- Competitive compensation, benefits, and opportunities for professional growth in a stable, mission-driven organization.
Job Type: Full-time
Pay: $170,000.00 - $210,000.00 per year
Benefits:
- 401(k)
- 401(k) matching
- Dental insurance
- Health insurance
- Paid time off
Education:
- Bachelor's (Required)
Experience:
- Software development: 7 years (Required)
Work Location: Hybrid remote in Washington, DC 20003