Job Summary
We are seeking a dynamic and innovative Data Engineer to join our team and drive the development of scalable, efficient data solutions. In this role, you will harness your expertise in big data technologies, cloud platforms, and data modeling to build robust data pipelines that empower insightful analysis and strategic decision-making. Your contributions will enable our organization to leverage complex datasets, optimize data workflows, and support advanced analytics initiatives. If you thrive in a fast-paced environment where your technical skills can make a tangible impact, this opportunity is for you!
Duties
- Design, develop, and maintain scalable data pipelines using ETL (Extract, Transform, Load) processes to ensure reliable data flow across systems.
- Implement and manage cloud-based data storage solutions on platforms such as AWS and Azure Data Lake, as well as Hadoop ecosystems, to support large-scale data processing.
- Collaborate with cross-functional teams to gather requirements and translate them into efficient database designs utilizing Microsoft SQL Server, Oracle, and other relational databases.
- Develop and optimize SQL queries for data extraction, analysis, and reporting; use Python, Bash/shell scripting, and VBA for automation tasks.
- Build, train, and deploy machine learning models; work with Looker and other visualization tools to create insightful dashboards.
- Integrate linked data sources via RESTful APIs and ensure seamless connectivity between diverse systems.
- Conduct performance tuning of big data frameworks such as Apache Hive, Spark, and Hadoop, and of ETL tools such as Talend, to maximize efficiency.
- Support agile development practices by participating in sprint planning, daily stand-ups, and continuous improvement initiatives.
Experience
- Proven experience as a Data Engineer or in a similar role with a strong understanding of big data architectures and cloud platforms like AWS or Azure.
- Hands-on expertise with Java programming for building scalable applications; proficiency in Python for scripting and automation tasks.
- Extensive knowledge of ETL tools such as Informatica or Talend; experience designing data warehouses using SQL Server or Oracle.
- Familiarity with Hadoop ecosystem components including Apache Hive, Spark, and Hadoop Distributed File System (HDFS).
- Strong understanding of database design principles and modeling techniques; experience with linked data concepts is a plus.
- Experience working within Agile methodologies to deliver iterative solutions efficiently.
- Knowledge of RESTful API integration for connecting disparate systems; ability to perform model training for predictive analytics is desirable.
- Excellent analysis skills with the ability to interpret complex datasets; proficiency in Bash shell scripting for system administration tasks.
Join us if you're passionate about transforming raw data into powerful insights! We value innovation, collaboration, and continuous learning, empowering you to grow your skills while making a meaningful impact through cutting-edge data solutions.
Pay: $116,839.47 - $140,709.89 per year
Work Location: Remote