Educational requirements: Bachelor's degree
English requirements: Competent English
Skilled employment experience required: 1-3 years
Required residence status: Temporary visa, Permanent resident, Citizen
Remote work: not accepted
About the role
We are looking for a full-time Data Scientist to join our Modelling (Data Science) team.
The Modelling Team is responsible for producing the highest-quality solar data possible for our customers. We build, deploy, and monitor predictive models. These models take satellite imagery, surface measurements and numerical weather prediction (NWP) data as inputs to predict irradiance and solar power for the past, present and future.
Experience we are looking for
Predictive model development (e.g. ML, statistical, mathematical-physical). You've collected data from disparate sources. You've cleaned that data. You've cleaned it again. And again. You've built models that turn that data into predictions. You've wrangled with the intangibles of what you actually want to optimise vs what simple error statistics optimise.
Predictive model deployment. You've had your models running in operations with real data. You've dealt with out-of-bounds data, NaN data, late data, missing data, too-slow-to-query data, too-slow-to-process data, etc.
Predictive model monitoring. You've discovered that you've overfitted your model, that the model has drifted, or that a data feed has changed on you. You've caught bad output before a customer did, or maybe you've even reassured yourself that, for today, all systems are nominal!
Physical sciences knowledge. Any prior exposure to meteorology, solar radiation and photovoltaic power modelling is a plus, but more importantly, you have a physical sciences background and the ability to learn a new scientific domain on the job with the support of your colleagues.
Scientific computing. You have worked with numerical data and algorithms in Python (our workhorse, with heavy use of NumPy and Pandas), or in Julia, Fortran, Scala, MATLAB, etc.
Software development. We build models that we deploy to production, and we work closely with our software developers to ensure that our data can be delivered to customers. We use Git, CI/CD, unit tests, Docker, Linux and AWS (S3, Batch and others).