Alexander Palensky’s Portfolio Website
Recent projects
Project 1: National Lacrosse League box scores
Tools: Python (Jupyter), R (RStudio), Markdown
Technical skills: Web Scraping, Data Wrangling
- Made all National Lacrosse League (NLL) box score data publicly available in one source for the first time, enabling easy online access and empowering community analysis.
- Scraped and cleaned all publicly available box scores from 1993 through 2020, totaling nearly 55,000 records for floor players and nearly 6,000 records for goalies.
- Sorted records by season and added key performance metrics expressed per game, per 60 minutes of play, as ratios, and as percentages.
- This project is ongoing, with data files downloadable from this Kaggle page. I will continue to add insights, applications, and images to this project so that others can learn more about box lacrosse trends.
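As a rough illustration of the rate metrics described above, here is a minimal Python sketch. The field names (`goals`, `assists`, `shots`, `minutes`) and the sample record are assumptions made for the example; they are not taken from the published data files.

```python
def add_rate_metrics(record):
    """Return a copy of a box score record with per-60-minute and percentage metrics."""
    out = dict(record)
    minutes = record["minutes"]
    shots = record["shots"]
    points = record["goals"] + record["assists"]
    out["points"] = points
    # Scale a counting stat to a full 60 minutes of play; skip players with no floor time.
    out["points_per_60"] = points * 60 / minutes if minutes else None
    # Shooting percentage: goals as a share of shots taken.
    out["shooting_pct"] = 100 * record["goals"] / shots if shots else None
    return out


# Hypothetical single-game record for one floor player.
row = {"player": "Example Player", "season": 2019,
       "goals": 2, "assists": 3, "shots": 8, "minutes": 20.0}
result = add_rate_metrics(row)
print(result["points_per_60"], result["shooting_pct"])
```

Returning `None` for players with zero minutes or zero shots keeps undefined rates out of later sorts and averages.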
Dig Deeper: NLL GitHub Repository
Project 2: Similarity scoring college pitchers
Tools: R (RStudio), RShiny, SQL, Trackman
Technical skills: Data Wrangling, Similarity Learning, Technical Writing
- Imported and transformed over 150,000 plate appearance events.
- Generated novel pitch similarity scoring using R and Euclidean Distance.
- Applied pitch usage weighting with Earth Mover’s Distance to score arsenals.
- Built a Shiny application for staff to use the model in offseason pitch design.
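The pitch-level step above can be sketched in Python as a Euclidean distance over standardized pitch features. The project itself was built in R, so this is only an illustration: the feature set (velocity, spin rate, vertical break, horizontal break) and the values are hypothetical, and the usage-weighted Earth Mover's Distance step for whole arsenals is not covered here.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each feature column so no single unit dominates the distance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 when a column has no spread, to avoid dividing by zero.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, spreads)]
        for row in rows
    ]


def most_similar(target, pitches):
    """Rank every other pitch by Euclidean distance to the target, closest first."""
    names = list(pitches)
    scaled = dict(zip(names, standardize([pitches[n] for n in names])))
    distances = {
        name: math.dist(scaled[target], scaled[name])
        for name in names
        if name != target
    }
    return sorted(distances.items(), key=lambda item: item[1])


# Hypothetical features: [velocity mph, spin rate rpm, vertical break in, horizontal break in]
pitches = {
    "pitcher_a_fastball": [92.0, 2200.0, 16.0, 8.0],
    "pitcher_b_fastball": [93.0, 2300.0, 17.0, 7.0],
    "pitcher_c_curveball": [78.0, 2700.0, -10.0, -6.0],
}
ranked = most_similar("pitcher_a_fastball", pitches)
print(ranked[0][0])
```

Standardizing first matters because spin rate is measured in thousands while break is measured in inches; without it, spin rate would dominate the distance.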
Dig Deeper: Article published on Medium
Project 3: Product recommendation engine
Tools: R (RStudio), Rattle GUI, PowerPoint
Technical skills: Data Wrangling, Association Rule Mining, Data Visualization, Technical Consulting
- Consulting project for a beverage distribution company covering 1,500 products, close to 2,500 customers, and more than 8250,000 transactions over the past four years.
- Removed sparse records and top nationally selling brands from analysis to focus on marketing new and niche products, broken down by beverage category.
- Used insights from association rule mining to build a recommendation engine that returned a requested number of top products a customer did not yet carry, drawn from products sold at similar businesses or identified as emerging trends in the beverage industry.
- Presented the company with the recommendation engine and a report covering commonly recommended products and how these recommendations varied by customer business category and size.
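To show the general idea behind the engine, here is a minimal Python sketch of pairwise association rules scored by support, confidence, and lift, followed by a lookup of uncarried products. The original work used R and Rattle; the baskets, thresholds, and function names below are all hypothetical, and a real engine would likely mine larger itemsets.

```python
from collections import Counter
from itertools import permutations


def pair_rules(baskets, min_support=0.3, min_confidence=0.5):
    """Mine single-item rules (a -> b) with support, confidence, and lift."""
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = set(basket)
        item_count.update(items)
        # Count each ordered pair so a -> b and b -> a are scored separately.
        pair_count.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), together in pair_count.items():
        support = together / n
        confidence = together / item_count[a]
        lift = confidence / (item_count[b] / n)
        if support >= min_support and confidence >= min_confidence:
            rules.append({"if": a, "then": b, "support": support,
                          "confidence": confidence, "lift": lift})
    return rules


def recommend(rules, carried, top_n=3):
    """Return up to top_n products the customer does not carry, ranked by lift."""
    best = {}
    for rule in rules:
        if rule["if"] in carried and rule["then"] not in carried:
            best[rule["then"]] = max(best.get(rule["then"], 0.0), rule["lift"])
    return sorted(best, key=lambda product: (-best[product], product))[:top_n]


# Hypothetical baskets: the set of products each customer has purchased.
baskets = [
    {"cola", "seltzer", "kombucha"},
    {"cola", "seltzer"},
    {"seltzer", "kombucha"},
    {"cola", "cold_brew"},
    {"seltzer", "kombucha", "cold_brew"},
]
rules = pair_rules(baskets)
print(recommend(rules, carried={"seltzer"}, top_n=1))
```

Ranking by lift rather than raw confidence favors products that co-occur more often than their overall popularity would predict, which fits the goal of surfacing niche products instead of top sellers.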
Project 4: National Park invasive fish species
Tools: R (RStudio), PowerPoint
Technical skills: Web Scraping, Data Wrangling, Data Visualization
- Imported nearly 120,000 known National Park wildlife species from the National Park Service Kaggle dataset and reduced the data to extant fish species.
- Scraped additional National Park data from the web, along with the fish species known to be non-endemic to each state from the United States Geological Survey (USGS).
- Cleaned and joined our scraped data to perform feature exploration.
- Built a choropleth map of invasive fish species as a percentage of all fish species present in states with National Parks.
- Gave a presentation to undergraduate departmental professors on wildlife resource management effectiveness using research on states identified with high invasive ratios.
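The percentage behind the choropleth can be sketched as a per-state set intersection. This Python example is only an illustration of the calculation (the project itself used R); the state labels and species names are placeholders, not real survey data.

```python
from collections import defaultdict


def invasive_share_by_state(park_fish, non_native_by_state):
    """Percent of each state's park fish species that appear on its non-native list."""
    species_by_state = defaultdict(set)
    for state, species in park_fish:
        # A set removes duplicates when a species is recorded in several parks.
        species_by_state[state].add(species)
    shares = {}
    for state, species in species_by_state.items():
        invasive = species & non_native_by_state.get(state, set())
        shares[state] = 100 * len(invasive) / len(species)
    return shares


# Placeholder (state, species) records from park inventories.
park_fish = [
    ("State A", "species_1"), ("State A", "species_2"),
    ("State A", "species_3"), ("State A", "species_4"),
    ("State A", "species_1"),  # same species recorded in a second park
    ("State B", "species_2"), ("State B", "species_5"),
]
# Placeholder per-state lists of non-native species.
non_native_by_state = {"State A": {"species_3"}, "State B": {"species_5", "species_9"}}

shares = invasive_share_by_state(park_fish, non_native_by_state)
print(shares)
```

The resulting dictionary of state-level percentages is the kind of input a choropleth plotting function would take.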
Dig Deeper: Invasive Fish Species GitHub Repository
Recent reads
- A Short History of Nearly Everything, Bill Bryson
- Freakonomics, Steven Levitt & Stephen Dubner
- Genghis Khan and the Making of the Modern World, Jack Weatherford
- Range, David Epstein
Learning & certifications
- Master of Science in Business Analytics - University of Iowa, 2021
- Structured Query Language Certificate as part of the University of Michigan’s Web Applications for Everybody Specialization
- IBM Data Science Professional Certificate (in progress)
  - Tools for Data Science Certificate (complete)
  - Data Science Methodology Certificate (complete)
  - Python for Data Science, AI & Development Certificate (complete)
  - Databases and SQL for Data Science with Python Certificate (in progress)
- Machine Learning Certificate by Andrew Ng and Stanford University (in progress)
- Kaggle micro-courses
  - Python
  - Pandas
  - Intro to Machine Learning
  - Data Visualization