Educational requirements: Bachelor's degree
English requirements: Competent English
Skilled employment experience required: 5-8 years
Required residence status: Temporary visa, Permanent resident, Citizen
Remote work: not accepted
Required Skills
- At least 6 years of experience in a data engineering role
- Hands-on experience with at least two of the following technologies: Python, Scala, Spark, Hadoop, SQL, Java
- Big Data development experience using Hive, Impala, Spark, and/or Kafka, or Azure, or Redshift
- Experience in coaching and training developers is preferred
- Exposure to deploying applications into production environments (e.g. code packaging, integration testing, monitoring, release management)
- Must have experience working in DevOps, Agile, Scrum, Continuous Delivery, and/or Rapid Application Development environments
- Exposure to building mocks and proofs-of-concept across different capabilities/toolsets
- Experience working with structured, semi-structured, and unstructured data, extracting information, and identifying linkages across disparate data sets
- Experience with data pipelines, workflow management tools, and cloud platforms (AWS, Azure, Google Cloud Platform, or Databricks); see the sketch after this list
- Experience with traditional data warehousing / ETL tools and multiple database technologies (distributed processing, traditional RDBMS, MPP, NoSQL)
- Experience with AWS cloud services: S3, EC2, EMR, RDS, Redshift, and Kinesis
- Knowledge of Data Governance and a strong understanding of data lineage and data quality
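As a loose illustration of the kind of hands-on Python/Spark pipeline work this role involves, the sketch below reads a raw CSV extract, applies a simple cleaning transform, and writes a partitioned Parquet output. It is a minimal example only; the paths, column names, and schema are hypothetical and not part of the job description.

```python
# Minimal illustrative PySpark batch pipeline (paths and columns are hypothetical).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-ingest-example").getOrCreate()

# Read a raw, semi-structured CSV extract from a landing zone.
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3a://landing-zone/orders/2024-01-01/")
)

# Basic cleaning: normalise column names, cast types, drop obviously bad rows.
cleaned = (
    raw.withColumnRenamed("Order Date", "order_date")
       .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("order_id").isNotNull())
)

# Write to a curated zone, partitioned by date, for downstream consumers.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3a://curated-zone/orders/")
)

spark.stop()
```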
Roles and Responsibilities:
- Apply strong expertise in data mining and information retrieval to design, develop, optimize, and maintain data architecture and pipelines
- Work with product teams to identify the data / information required for their solutions and the existing sources of that data
- Contribute to the design and development of data integration and data pipelines
- Design a robust and integrated data governance structure and processes to ensure data quality is understood, continuously monitored, and improved (see the sketch following this list)
- Work with Product Teams and the Data Integration Engineer to ensure that new data sources deployed for DF products meet data quality requirements
- Monitor and manage metadata and ensure it remains useful for improving the effectiveness and quality of data
- Acquire, ingest, and process data from multiple sources and systems into Big Data platforms
- Design data pipelines / integration flows to transfer data from existing source systems to the designated target system(s), adhering to integration standards and existing frameworks
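To make the data quality monitoring responsibility concrete, here is a minimal sketch of the sort of automated check a pipeline might run before publishing a dataset. The dataset path, column names, rules, and thresholds are assumptions for illustration only, not requirements from this posting.

```python
# Minimal illustrative data quality gate (path, columns, and thresholds are hypothetical).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-check-example").getOrCreate()

df = spark.read.parquet("s3a://curated-zone/orders/")
total = df.count()

# Rule 1: the primary key must be unique.
duplicate_keys = total - df.select("order_id").distinct().count()

# Rule 2: no more than 1% of rows may have a null amount.
null_amounts = df.filter(F.col("amount").isNull()).count()
null_ratio = null_amounts / total if total else 1.0

failures = []
if duplicate_keys > 0:
    failures.append(f"{duplicate_keys} duplicate order_id values")
if null_ratio > 0.01:
    failures.append(f"null amount ratio {null_ratio:.2%} exceeds 1% threshold")

# Fail the pipeline run (and surface the reasons) if any rule is violated.
if failures:
    raise ValueError("Data quality checks failed: " + "; ".join(failures))

print(f"Data quality checks passed for {total} rows.")
spark.stop()
```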