Case study
Rubin Variable Star Workflow
A reproducible, end-to-end pipeline that transforms raw Rubin catalog data into a ranked shortlist of candidate variable stars using engineered variability metrics and an interpretable composite index.
Outcomes
- Delivered an end-to-end, reproducible pipeline spanning exploratory analysis, feature engineering, ranking, and machine learning–based classification of candidate variable stars.
- Produced decision-ready outputs, including a ranked candidate shortlist and an interpretable classifier trained on the first wave of photometric data.
- Initial models achieve ~94% classification accuracy, with strong performance on the dominant variability classes: long-period variables (LPVs) and RR Lyrae.
- Designed to be repeatable and interpretable, and to remain robust to noisy observations and terabyte-scale catalog updates.
- Documented for accessibility by both STEM and non-STEM users, enabling reuse by future researchers and serving as a foundation for variable star classification workflows.
Problem
Time-domain astronomical catalogs are large, noisy, and difficult to translate into concrete decisions. The goal was to design a transparent workflow that converts catalog-level variability signals into a ranked candidate list suitable for inspection, comparison, and follow-up.
Approach
- Performed exploratory data analysis (EDA) on large-scale observational data to separate meaningful variability signal from noise.
- Engineered complementary variability metrics and standardized them to enable consistent comparison.
- Combined metrics into an interpretable composite index to support ranking and prioritization.
- Implemented a top-N selection step that produces inspectable outputs (tables and saved artifacts).
- Packaged the workflow as a runnable demo with documented assumptions and clear scope boundaries.
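The scoring and selection steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the three metrics (amplitude, standard deviation, median absolute deviation), the equal default weights, the function names, and the toy magnitudes are all hypothetical, chosen only to show how standardized metrics can be combined into a composite index and cut to a top-N shortlist.

```python
from statistics import mean, median, pstdev

# Hypothetical metric names; the real pipeline's metrics are not specified here.
METRIC_NAMES = ["amplitude", "std", "mad"]


def variability_metrics(mags):
    """Compute simple variability metrics from one object's magnitudes."""
    med = median(mags)
    return {
        "amplitude": max(mags) - min(mags),
        "std": pstdev(mags),
        # Median absolute deviation: less sensitive to single outliers.
        "mad": median([abs(m - med) for m in mags]),
    }


def standardize(values):
    """Z-score a list of values so metrics share a common scale."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_candidates(light_curves, top_n=3, weights=None):
    """Return the top_n (object_id, composite_score) pairs, highest first."""
    ids = list(light_curves)
    metrics = {obj: variability_metrics(light_curves[obj]) for obj in ids}

    # Assumption: equal weights unless the caller supplies others.
    weights = weights or {name: 1.0 for name in METRIC_NAMES}
    total_weight = sum(weights[name] for name in METRIC_NAMES)

    # Standardize each metric across all objects.
    z = {
        name: standardize([metrics[obj][name] for obj in ids])
        for name in METRIC_NAMES
    }

    # Composite index: weighted mean of the standardized metrics.
    scores = {
        obj: sum(weights[name] * z[name][i] for name in METRIC_NAMES) / total_weight
        for i, obj in enumerate(ids)
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Toy magnitudes for three made-up objects (not real Rubin data).
light_curves = {
    "obj_a": [15.0, 15.01, 14.99, 15.0, 15.02],
    "obj_b": [16.0, 16.6, 15.5, 16.4, 15.7],
    "obj_c": [14.0, 14.1, 13.9, 14.05, 13.95],
}

shortlist = rank_candidates(light_curves, top_n=2)
for obj_id, score in shortlist:
    print(f"{obj_id}: composite index = {score:.3f}")
```

Because each metric is standardized before it is combined, no single metric dominates the index purely because of its units or scale, and the weights stay directly interpretable as relative importance.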
Why this matters
Rather than stopping at exploratory analysis, this project produces decision-ready outputs: a ranked shortlist with inspectable artifacts and documented assumptions. This pattern of transparent scoring, prioritization, and repeatability translates directly to strategy analytics, operations, and decision science contexts.