Job Summary
Job Description
DUTIES: Leverage internal and external data to provide insights and information which supports a facts-based decision-making process; analyze data for trends, correlations, underlying causes, and drivers using analysis tools including Excel, Python, Tableau, R, and similar analysis tools; use R and Python to apply machine learning techniques including Linear Regression, Classification and Regression Tree, Random Forest, XGBoost, Neural Network, Clustering algorithms, and Natural Language Processing; identify opportunities to conduct statistical analysis or apply machine learning techniques, in conjunction with internal or external stakeholders, to drive data-driven decision making; prepare and present documents summarizing findings, insights, and actions from statistical analysis or machine learning to enable data-driven machine learning; use R and Python to perform statistical analysis using techniques including Match Pairing, Causal Inference, and A/B test design; use programming and database languages including SQL, Python, Scala, R, and PySpark; use the following libraries for joining large, disparate datasets, feature engineering, and modeling: Pandas, Scikit-learn, Dplyr, and similar libraries; visualize data using the following: Tableau, ggplot2, and Matplotlib; interpret problems and handle short-term tasks using existing procedures and frameworks; provide solutions to business problems by leveraging data analysis, data mining, optimization tools, machine learning techniques, and statistical methods; work with large, noisy, and complex datasets to produce meaningful analysis of historical patterns of customer behaviors and product performance; apply scientific techniques to data evaluation, performing statistical inference and data mining; use analytical rigor and statistical methods to analyze large amounts of data, extracting actionable insights using advanced statistical techniques such as data analysis, data mining, optimization tools, and machine learning techniques and statistics (e.g., predictive models, LTV, propensity models); build customer-centric models and optimization tools to support large-scale projects that utilize online and offline data, structured and unstructured data, set-top box data, and media, behavioral, and attitudinal data; research and implement algorithms and data structures for our platform; work closely with data warehouse architects and software developers to generate seamless data science solutions for deployment; document and present the data analysis and its conclusions for assessment by full-performance analysts, developers, and their managers; interact with product and service teams to identify questions and issues for data analysis and experiments; and develop and code software programs, algorithms, and automated processes to cleanse, integrate, and evaluate large datasets from multiple disparate sources. Position is eligible to work remotely one or more days per week, per company policy.
REQUIREMENTS: Bachelor’s degree, or foreign equivalent, in Computer Science, any Engineering, or related technical field, and two (2) years of experience analyzing data for trends, correlations, underlying causes, and drivers using analysis tools including any of the following: Excel, Python, Tableau, R, or similar analysis tools; using R or Python to apply machine learning techniques including Linear Regression, Classification and Regression Tree, Random Forest, XGBoost, and Clustering algorithms; identifying opportunities to conduct statistical analysis or apply machine learning techniques, in conjunction with internal or external stakeholders, to drive data-driven decision making; preparing and presenting documents summarizing findings, insights, and actions from statistical analysis or machine learning to enable data-driven machine learning; using R or Python to perform statistical analysis using techniques including Match Pairing, Causal Inference, or A/B test design; using programming and database languages including SQL, Python, Scala, R, or PySpark; using any of the following libraries for joining large, disparate datasets, feature engineering, and modeling: Pandas, Scikit-learn, Dplyr, or a similar library; of which one (1) year of experience includes visualizing data using any of the following: Tableau, ggplot2, or Matplotlib; and using R or Python to apply Natural Language Processing.
Disclaimer: This information has been designed to indicate the general nature and level of work performed by employees in this role. It is not designed to contain or be interpreted as a comprehensive inventory of all duties, responsibilities and qualifications.
Skills
PySpark, Python (Programming Language), Random Forest
We believe that benefits should connect you to the support you need when it matters most, and should help you care for those who matter most. That's why we provide an array of options, expert guidance and always-on tools that are personalized to meet the needs of your reality, to help support you physically, financially and emotionally through the big milestones and in your everyday life.
Please visit the benefits summary on our careers site for more details.