Data Science Lead
• 5+ years of experience with the Apache Hadoop and Spark ecosystems of open-source tools and ML packages. Our data processing and modeling pipelines are built using Spark, MapReduce, Pig, Hive, Kafka, Elasticsearch, HBase, Cassandra, and other open-source platforms. Our team develops the platforms that analyze petabytes of data, develop attributes, and deploy models to production, so efficient implementation and elegant architecture are essential.
• Solid understanding of the algorithms used to build recommendation systems, interest graphs, ad targeting models, trend analysis, and fraud/anomaly detection using online and offline features. A big part of the role is asking open-ended questions, exploring new ideas, and choosing appropriate techniques for a given problem, rather than applying packages as a black box to a known problem.
• Must be able to write clean, concise code in at least two of the following: Python, Java, and Scala. Our interview process includes solving a coding problem on the whiteboard.
New York, NY
Fri, 21 Jul 2017 10:33:22 PDT