<feed xmlns="http://www.w3.org/2005/Atom"> <id>https://ndjstn.github.io/</id><title>Justin Stone</title><subtitle>I build practical systems across software, automation, local AI tooling, data workflows, and technical problem-solving.</subtitle> <updated>2026-04-24T04:03:24-05:00</updated> <author> <name>Justin Stone</name> <uri>https://ndjstn.github.io/</uri> </author><link rel="self" type="application/atom+xml" href="https://ndjstn.github.io/feed.xml"/><link rel="alternate" type="text/html" hreflang="en" href="https://ndjstn.github.io/"/> <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator> <rights> © 2026 Justin Stone </rights> <icon>/assets/img/favicons/favicon.ico</icon> <logo>/assets/img/favicons/favicon-96x96.png</logo> <entry><title>Five Archetypes at the Top of YouTube: Creator Segmentation on the Global Top-995</title><link href="https://ndjstn.github.io/posts/youtube-top995-five-archetypes/" rel="alternate" type="text/html" title="Five Archetypes at the Top of YouTube: Creator Segmentation on the Global Top-995" /><published>2026-04-24T00:00:09-05:00</published> <updated>2026-04-24T03:38:22-05:00</updated> <id>https://ndjstn.github.io/posts/youtube-top995-five-archetypes/</id> <content type="text/html" src="https://ndjstn.github.io/posts/youtube-top995-five-archetypes/" /> <author> <name>Justin Stone</name> </author> <category term="Data Science" /> <summary>KMeans on the Global YouTube Statistics 2023 dataset produces five distinct creator archetypes: mega-scale, mainstream, low-engagement, music-video, and upload machines.</summary> </entry> <entry><title>93,000 Customers and No Repeats: Olist Brazilian E-Commerce Analytics</title><link href="https://ndjstn.github.io/posts/olist-ecommerce-retention/" rel="alternate" type="text/html" title="93,000 Customers and No Repeats: Olist Brazilian E-Commerce Analytics" /><published>2026-04-24T00:00:08-05:00</published> <updated>2026-04-24T04:02:59-05:00</updated> <id>https://ndjstn.github.io/posts/olist-ecommerce-retention/</id> <content type="text/html" src="https://ndjstn.github.io/posts/olist-ecommerce-retention/" /> <author> <name>Justin Stone</name> </author> <category term="Data Science" /> <summary>Olist's 2016-2018 marketplace data shows 97 percent single-purchase customers. Revenue growth is all acquisition; retention is near-zero by construction.</summary> </entry> <entry><title>When a Naive Baseline Beats LightGBM: Bike-Sharing Demand with Proper Time-Series CV</title><link href="https://ndjstn.github.io/posts/bike-sharing-timeseries-cv/" rel="alternate" type="text/html" title="When a Naive Baseline Beats LightGBM: Bike-Sharing Demand with Proper Time-Series CV" /><published>2026-04-24T00:00:07-05:00</published> <updated>2026-04-24T03:38:22-05:00</updated> <id>https://ndjstn.github.io/posts/bike-sharing-timeseries-cv/</id> <content type="text/html" src="https://ndjstn.github.io/posts/bike-sharing-timeseries-cv/" /> <author> <name>Justin Stone</name> </author> <category term="Data Science" /> <summary>Random 5-fold CV makes LightGBM look like the winner on UCI hourly bike-sharing data. Time-series CV says a naive seasonal-mean baseline beats it.</summary> </entry> <entry><title>Geography Beats the Odometer: NYC Taxi Trip Duration Prediction</title><link href="https://ndjstn.github.io/posts/nyc-taxi-geography-beats-odometer/" rel="alternate" type="text/html" title="Geography Beats the Odometer: NYC Taxi Trip Duration Prediction" /><published>2026-04-24T00:00:06-05:00</published> <updated>2026-04-24T03:38:22-05:00</updated> <id>https://ndjstn.github.io/posts/nyc-taxi-geography-beats-odometer/</id> <content type="text/html" src="https://ndjstn.github.io/posts/nyc-taxi-geography-beats-odometer/" /> <author> <name>Justin Stone</name> </author> <category term="Data Science" /> <summary>Distance alone predicts 64 percent of trip duration. Adding geography and time of day gets you to 80 percent. The heatmap shows where the missing signal lives.</summary> </entry> <entry><title>Quality, Then Area, Then Everything Else: Ames Housing with Stacked Ensembles</title><link href="https://ndjstn.github.io/posts/house-prices-ames-stacked-ensembles/" rel="alternate" type="text/html" title="Quality, Then Area, Then Everything Else: Ames Housing with Stacked Ensembles" /><published>2026-04-24T00:00:05-05:00</published> <updated>2026-04-24T02:34:45-05:00</updated> <id>https://ndjstn.github.io/posts/house-prices-ames-stacked-ensembles/</id> <content type="text/html" src="https://ndjstn.github.io/posts/house-prices-ames-stacked-ensembles/" /> <author> <name>Justin Stone</name> </author> <category term="Data Science" /> <summary>A stacked Ridge + XGBoost + LightGBM ensemble on Ames houses reaches 0.126 RMSLE, beating a Ridge baseline by about 8 percent. The stack weights are unbalanced, and that's fine.</summary> </entry> </feed>
