A Data Engineer is needed to perform the following duties:
- Engineer and maintain Python-based ETL pipelines to automate ingestion, transformation, and processing of consumer datasets, improving operational efficiency and enabling scalable analytics workflows.
- Develop and optimize complex SQL queries in Amazon Athena using regex, window functions, CTEs, and advanced join operations to analyze consumer purchase patterns and support targeted marketing strategies.
- Design and build interactive dashboards in Amazon QuickSight and Apache Superset to deliver real-time consumer engagement insights and enable faster, data-driven decision-making.
- Support the development and orchestration of machine learning pipelines using AWS Step Functions and AWS Batch to enhance extended audience targeting accuracy across industries such as CPG, Healthcare, and Beauty.
- Build LLM-based prompt solutions to automate extraction and enrichment of product metadata, improving product classification accuracy and catalog quality.
- Manage API integrations with external data partners to automate metadata ingestion, increase dataset completeness, and enhance fill rates across ingestion pipelines.
- Implement data validation checks and quality assurance processes to maintain accuracy, consistency, and reliability across ingestion and transformation workflows.
- Document data pipelines, ETL processes, and data models to promote transparency, collaboration, and knowledge sharing across teams.
Bachelor's degree in Information Management, Computer Science, Computer Engineering, or Information Technology
Job Type: Full-time
Pay: $90,000.00 - $100,000.00 per year
Work Location: Hybrid remote in New York, NY 10016