The Data team within Doctor Anywhere oversees all of the company’s data-related requirements. The goal is to use data to assist the company in making product and business decisions, and to measure their effectiveness. The team works closely with our Product, Engineering, and Business teams to develop a deep understanding of how our products are used and to influence company strategy.
We are seeking an enthusiastic and self-motivated Senior Data Engineer who will ensure the smooth operation of high-volume data pipeline solutions and architecture. You will work in a cross-functional environment, collaborating with Data Scientists, Product Managers, Engineering teams, and Business Stakeholders to provide the organization with analytical data.
This role reports directly to the Head of Data.
Responsibilities
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional/non-functional business requirements.
- Identify, design and implement internal process improvements: automate manual processes, optimize data delivery, re-design infrastructure for greater scalability.
- Build the infrastructure required for optimal extraction, transformation and loading of data from a wide variety of data sources using SQL, AWS & GCP ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secured across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members to assist them in building and optimizing our product into an innovative industry leader.
Requirements
- Degree in Computer Science/Statistics/Information Systems with a minimum of 5 years of experience in a Data Engineer role
- Advanced SQL knowledge and experience with relational databases, including query authoring and working familiarity with a variety of databases
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift
- Experience with GCP cloud services
- Experience with stream-processing systems: Storm, Spark Streaming, etc.
- Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
- Strong project management and organizational skills
- Proven ability to build strong working relationships, both internally and externally to the organization
- Strong analytic skills related to working with unstructured datasets
- Experience working in a tech startup and hyper-growth environment
- Comfortable with fast-paced operations and not afraid to "roll up your sleeves" to get things done
- Strong written and verbal communication skills