Senior Data Scientist

Ref: 254
Monday 11 July 2022

Sorry, this advert is now closed.

Senior Data Scientist - Remote

Help us build powerful, easy-to-use tools to collect, format, and deliver web data quickly, dependably, and at scale. The data we extract helps thousands of organizations make smarter business decisions, secure competitive advantage, and drive sustainable growth. Today, over 2,000 companies and 1 million developers rely on our tools and services to get the data they need from the web.

About the job:

You are passionate about data science and machine learning, and enjoy working in a collaborative environment. You will get the chance to work with one of the most advanced and comprehensive web crawling and scraping infrastructures in the world, leveraging massive data sets with cutting-edge technology.

Roles & Responsibilities:

  • You will apply your data science and engineering skills to create products based on machine learning, analyze large volumes of complex data, model challenging problems, and develop algorithms that meet internal and client needs.
  • You will work on research and development: formulating hypotheses, performing experiments, writing up results in an internal wiki, and implementing successful approaches in the products.
  • You will work and experiment with state-of-the-art web crawling, machine learning, and data processing technologies. Some of the problems you’ll be working on include object detection, text classification, named entity recognition, and crawling algorithms.
  • You will work in collaboration with other data scientists and engineers to design and build creative solutions to challenging problems.
  • You will work on projects that span the whole organization, including areas such as Product and Professional Services.

Requirements:

  • Strong machine learning background (natural language processing, computer vision, deep learning, “classical” methods)
  • Hands-on experience in data science projects (data preparation, target metrics, model evaluation, validation, etc.)
  • Strong software development skills, ideally in Python.
  • Experience with any of these tools is a plus: PyTorch, scikit-learn, TensorFlow, pandas, Jupyter, spaCy, Gensim, Vowpal Wabbit, CRFsuite, Scrapy, Spark, Docker, GCP, AWS, Kafka.
  • Please send source code that demonstrates your programming ability. If you have many projects on GitHub (or similar), please tell us which ones we should look at.