We are looking for:

Senior Data Engineer

We're Helium10, the leading software company for Amazon sellers. We're dedicated to leveraging large volumes of data to generate insights that improve our customers' ability to start a business, improve their profit margins, and scale to new heights.

We are seeking a highly skilled and experienced Senior Data Engineer to join our growing data team. In this role, you will be responsible for designing, developing, and maintaining our data architecture and infrastructure, ensuring data quality and accessibility across the organization. You'll be part of the larger data team, which also includes Data Analysts and Data Scientists. Everyone's goal is to move the business forward by creating internal and external-facing data products and solutions.

Responsibilities:

  • Design, develop, and maintain scalable and efficient data pipelines, ETL processes, and workflows to handle large volumes of structured and unstructured data.
  • Collaborate with data scientists, data analysts, and business stakeholders to gather requirements and develop data solutions that support advanced analytics, machine learning, and data-driven decision-making.
  • Optimize and improve existing data infrastructure to ensure data quality, integrity, availability, and performance.
  • Implement and enforce data security and compliance standards to protect sensitive information.
  • Develop and maintain documentation on data models, data dictionaries, data lineage, and data flow diagrams.
  • Mentor and provide guidance to junior data engineering team members on best practices and emerging technologies.
  • Continuously research and adopt new tools, technologies, and methodologies to enhance the data engineering process.

Requirements:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience in data engineering, ETL development, and big data processing.
  • Strong expertise in SQL and proficiency with Python.
  • Extensive experience with data pipeline tools and frameworks such as Apache Kafka, Apache Flink, or AWS DMS.
  • Experience with cloud-based data storage and processing technologies (e.g., AWS, Azure, or Google Cloud Platform).
  • Familiarity with data warehouse and data lake architectures, as well as various data storage formats (e.g., Parquet).
  • Knowledge of data modeling techniques and proficiency in working with relational, columnar, and NoSQL databases.
  • Knowledge of containerization technologies (e.g., Docker, Kubernetes).
  • Excellent problem-solving skills, strong attention to detail, and the ability to work independently or collaboratively in a fast-paced environment.
  • Strong communication and interpersonal skills to effectively collaborate with cross-functional teams.

Nice to have:

  • Experience with data visualization tools (e.g., Tableau, Power BI, Looker)
  • Familiarity with machine learning frameworks and libraries (e.g., TensorFlow, PyTorch, scikit-learn)
  • Familiarity with vector databases

What we offer:

  • Hybrid work model
  • Flexible work schedules
  • Modern equipment
  • Tons of educational opportunities
  • Public recognition platform
  • Business trips to the headquarters in Los Angeles, USA