The creation and analysis of a
(periodic) variable star catalogue
“PRIMVS”
Niall Miller
Prof. Phil Lucas, Dr. Yi Sun
University of Hertfordshire
VVV(X)
(Figure: wavelength scale running from red to mid-IR)
- Started in 2010
- 4-meter VISTA telescope in the Z, Y, J, H, & Ks filters.
- ~50 - 80 epochs (~200 measurements*)
- ~700 million stars
- ~10 million variable stars
*this is highly variable
What motivates these tools? - How to reliably extract info
What we don't know:
- Light Curve shape
- Light Curve photometric accuracy
- Light Curve contamination
- Other sources of perturbation
What we know:
- Hopefully a star?
- Probably variable
- Phase folding techniques
- Finds the period that produces the ‘cleanest’ phase fold
- Nothing assumed about the structure of the data*
- Computationally expensive
- Not practical to perfectly tune
Lomb-Scargle
- The most common
- Fourier based
- Is a fitting technique appropriate?
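Why "fitting technique": a Lomb-Scargle-style periodogram amounts to a least-squares sinusoid fit at each trial frequency. The sketch below is a minimal NumPy version of that idea (floating-mean variant), not any particular library's implementation. The function name, the simulated light curve and the frequency grid are all assumptions made for illustration.

```python
import numpy as np

def sinusoid_fit_power(t, y, freqs):
    """Fraction of variance explained by a best-fit sinusoid at each trial frequency."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    resid0 = np.sum((y - y.mean()) ** 2)  # residual of a constant (no-signal) model
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        arg = 2.0 * np.pi * f * t
        # Design matrix: offset + sine + cosine, linear in the coefficients
        A = np.column_stack([np.ones_like(t), np.sin(arg), np.cos(arg)])
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = np.sum((y - A @ coeffs) ** 2)
        power[i] = 1.0 - resid / resid0
    return power

# Invented test case: 80 irregularly sampled epochs of a sinusoidal variable
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 80))
true_period = 2.5
y = 12.0 + 0.3 * np.sin(2 * np.pi * t / true_period) + rng.normal(0, 0.05, t.size)

freqs = np.linspace(0.05, 2.0, 4000)  # trial frequencies in cycles per day
power = sinusoid_fit_power(t, y, freqs)
best_period = 1.0 / freqs[np.argmax(power)]
```

Because the model is a sinusoid, the power is highest for sinusoidal light curves, which is the concern behind the question above.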
Phase Dispersion Minimisation & Conditional Entropy
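A minimal sketch of the phase folding idea, using a Phase Dispersion Minimisation style statistic: fold at each trial period, bin in phase, and prefer the period with the least scatter inside the bins. The function names, bin count, simulated sawtooth light curve and period grid are assumptions for illustration, not the pipeline used for PRIMVS.

```python
import numpy as np

def phase_fold(t, period):
    """Map observation times onto phase in [0, 1)."""
    return (np.asarray(t, dtype=float) / period) % 1.0

def pdm_theta(t, mag, period, n_bins=10):
    """Pooled within-bin variance over total variance (lower = cleaner fold)."""
    mag = np.asarray(mag, dtype=float)
    phase = phase_fold(t, period)
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    num, dof = 0.0, 0
    for b in range(n_bins):
        m = mag[bins == b]
        if m.size > 1:  # a bin needs at least two points to have a variance
            num += np.sum((m - m.mean()) ** 2)
            dof += m.size - 1
    # Assumes enough points that at least one bin holds two or more
    return (num / dof) / np.var(mag, ddof=1)

# Invented test case: a sawtooth light curve, i.e. clearly non-sinusoidal
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 200.0, 200))
true_period = 3.7
mag = 14.0 + 0.5 * ((t / true_period) % 1.0) + rng.normal(0, 0.02, t.size)

trial_periods = np.linspace(1.0, 6.0, 5000)
theta = np.array([pdm_theta(t, mag, p) for p in trial_periods])
best_period = trial_periods[np.argmin(theta)]
```

Nothing about the light curve shape is assumed, but every trial period needs its own fold, which is where the computational cost comes from.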
But how do we actually know?
How do we know if a star is periodic?
How do we know if we have extracted the right period?
Manual inspection can't be the only way?!?
Checking 200 took me ~1 hr; all 10 million would take ~7 years, if only I were funded for that long…
- Compare against other things?
- Analyse the light curve?
- Analyse the periodogram?
Analysis of the periodogram?
There are many methods which try to verify periodicity by analysis of the light
curve.
- A significance test
- A comparison of values to known periodic/aperiodic sources
- A more sophisticated statistical approach (e.g. Baluev)
All assume noise is correctly modelled/accounted for/ignored
Can an RNN solve this?
For each example of X and Y I have…
Given X, what does the RNN think Y is?
loss = how wrong it was
We can use the loss to calculate the direction to take for the next guess (à la gradient descent)
Inform the network on how wrong it was
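The guess, loss, update loop above can be sketched in a few lines. To keep it short, a one-feature logistic model stands in for the RNN here; the toy data, learning rate and step count are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy data (invented): one feature X per example, Y = 1 (periodic) or 0 (aperiodic)
X = np.concatenate([rng.normal(-1.0, 0.5, 100), rng.normal(1.0, 0.5, 100)])
Y = np.concatenate([np.zeros(100), np.ones(100)])

w, b, lr = 0.0, 0.0, 0.5  # weight, bias, learning rate
eps = 1e-12               # keeps log() away from zero
losses = []
for step in range(200):
    # Given X, what does the model think Y is?
    pred = 1.0 / (1.0 + np.exp(-(w * X + b)))
    # loss = how wrong it was (binary cross-entropy)
    loss = -np.mean(Y * np.log(pred + eps) + (1 - Y) * np.log(1 - pred + eps))
    losses.append(loss)
    # Gradient of the loss: the direction in which the loss increases
    grad_w = np.mean((pred - Y) * X)
    grad_b = np.mean(pred - Y)
    # Step the other way for the next guess
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean((pred > 0.5) == (Y == 1))
```

An RNN follows the same loop; the difference is that the gradients are propagated back through the sequence rather than computed by hand.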
Can we show an RNN enough examples of labelled light curves such that it can identify periodic variables?
The RNN is given:
- Phase
- Mag
- Interpolated Mag
AKA:
- X
- Y
- Smoothed Y
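A sketch of how such a three-column input could be assembled. How the interpolated magnitude is actually computed is not stated here, so the cyclic running median below is an assumption, as are the function name `make_rnn_input` and the simulated data.

```python
import numpy as np

def make_rnn_input(t, mag, period, window=5):
    """Return an (N, 3) array of [phase, mag, smoothed mag], ordered by phase."""
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    t = np.asarray(t, dtype=float)
    mag = np.asarray(mag, dtype=float)
    phase = (t / period) % 1.0
    order = np.argsort(phase)
    phase, mag = phase[order], mag[order]
    half = window // 2
    # Pad cyclically so the smoothing wraps around phase 0/1
    padded = np.concatenate([mag[-half:], mag, mag[:half]])
    # Running median (assumed stand-in for the interpolated/smoothed magnitude)
    smooth = np.array([np.median(padded[i:i + window]) for i in range(mag.size)])
    return np.column_stack([phase, mag, smooth])

# Invented light curve folded at a candidate period
rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0.0, 300.0, 120))
mag = 15.0 + 0.4 * np.sin(2 * np.pi * t / 1.7) + rng.normal(0, 0.05, t.size)
seq = make_rnn_input(t, mag, period=1.7)  # columns: X, Y, smoothed Y
```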
Machine learning is good at lying, so we need to check
Same star with 4 different periods identified with different
methods
Each method has produced a visually different phase
folded light curve
Can it definitely identify periodic variables?
This NN method and the previous Baluev
method are tested for specificity and
sensitivity on multiple simulated
and real datasets.
The NN method performed better than
the Baluev method, and it does not
require a Lomb-Scargle
periodogram
ROC curve
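For reference, a minimal sketch of how sensitivity, specificity and a ROC curve follow from classifier scores and known labels. The scores are simulated for illustration and are not results from either method; the function names are assumptions.

```python
import numpy as np

def roc_curve(scores, labels):
    """Return (false positive rate, true positive rate) as the threshold is lowered.

    Tied scores are not grouped, which is fine for continuous scores.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-scores)  # highest score first
    labels = labels[order]
    tp = np.cumsum(labels)       # true positives accepted so far
    fp = np.cumsum(~labels)      # false positives accepted so far
    tpr = np.concatenate([[0.0], tp / labels.sum()])     # sensitivity
    fpr = np.concatenate([[0.0], fp / (~labels).sum()])  # 1 - specificity
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Simulated scores: 500 periodic (label 1) and 500 aperiodic (label 0) sources
rng = np.random.default_rng(4)
labels = np.concatenate([np.ones(500), np.zeros(500)]).astype(bool)
scores = np.concatenate([rng.normal(0.7, 0.15, 500), rng.normal(0.3, 0.15, 500)])

fpr, tpr = roc_curve(scores, labels)
area = auc(fpr, tpr)

threshold = 0.5
sensitivity = np.mean(scores[labels] >= threshold)  # periodic sources recovered
specificity = np.mean(scores[~labels] < threshold)  # aperiodic sources rejected
```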
It’s more than just better, it’s unbiased.
If we calculate a false alarm probability (FAP) for multiple
simulated periodic light curves at
different SNR, are we biased towards more
sinusoidal light curves?
What do we get from all of this?
We have 100,000* confident periodic variables
Of which 35,000 are in Gaia, 9,000 in WISE, 40,000 in TESS and 25,000 in 2MASS, i.e. there are likely new things here!
Period range: 0.001 - ~1500 days
Multiple time-series statistics separate from periodicity - lots of features
*likely to increase
Example: finding more Young Stellar Objects (YSOs) with SPICY
Can we use the carefully selected YSOs from the SPICY catalogue to
parameterise VVV YSOs?
Example to find more YSOs with SPICY
Possibly Potentially Candidate YSOs
553 potentially new objects
But finding more of the same is a little boring… What can
we do to further probe this?
We have a bunch of confident periodic variables… now what?
- Feature classification (period, amplitude, skew…)
- Do our features form a unique set for each class of star? …no
We have an unknown amount of unknown unknowns and we would like to identify
new/interesting/understudied things.
“unknown amount of unknown unknowns” - Stable Diffusion 2.1
Un/Self-supervised learning - SimCLR - ML holy grail
By applying various modifications to a light curve without compromising its
inherent meaning, we aim to categorize light curves that a human observer would
perceive as similar.
These are all images of a dog
See upcoming Miller - Smith
paper for inference and
explanation on this!
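A sketch of what "modifications without compromising inherent meaning" could look like for a phase-folded light curve. Two augmented views of the same star form the positive pair in a SimCLR-style setup. The three augmentations chosen here (cyclic phase shift, jitter within the uncertainties, random point dropping) and the 20% drop rate are assumptions for illustration, not necessarily those used in this work.

```python
import numpy as np

def augment(phase, mag, err, rng):
    """Return one randomly modified view of a phase-folded light curve."""
    # 1. Cyclic phase shift: the zero point of phase is arbitrary
    phase = (phase + rng.uniform(0.0, 1.0)) % 1.0
    # 2. Jitter each magnitude within its quoted uncertainty
    mag = mag + rng.normal(0.0, 1.0, mag.size) * err
    # 3. Drop roughly 20% of points: the sampling is not a property of the star
    keep = rng.random(mag.size) > 0.2
    phase, mag, err = phase[keep], mag[keep], err[keep]
    order = np.argsort(phase)
    return phase[order], mag[order], err[order]

# Invented phase-folded light curve
rng = np.random.default_rng(5)
phase = np.sort(rng.random(150))
err = np.full(150, 0.03)
mag = 13.0 + 0.3 * np.sin(2 * np.pi * phase) + rng.normal(0, 1, 150) * err

view_a = augment(phase, mag, err, rng)
view_b = augment(phase, mag, err, rng)  # (view_a, view_b) is a positive pair
```

The contrastive loss then pulls the embeddings of `view_a` and `view_b` together while pushing apart views of different stars.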
Gaia is optical, VVV is near-IR, what can we do with that?
Can we make fake light curves for better training?
Slightly different to standard image diffusion
We instead train with magnitude,
uncertainty and time*.
Period is never explicitly given!
*Time = Time/Period (not really a
phase, as the integer part remains)
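The difference between this scaled time and a true phase, as a short sketch with invented values:

```python
import numpy as np

period = 2.0  # days; used only to rescale time, never given to the model directly
time = np.array([0.0, 0.5, 3.0, 7.25, 10.5])  # days (invented epochs)

scaled_time = time / period             # cycles elapsed; the integer part is kept
cycle_number = np.floor(scaled_time)    # which cycle each epoch falls in
phase = scaled_time % 1.0               # a true phase would discard the cycle number
```

Here `scaled_time` is `[0, 0.25, 1.5, 3.625, 5.25]`, while `phase` is `[0, 0.25, 0.5, 0.625, 0.25]`: the second and last epochs share a phase but remain distinguishable in scaled time.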
1 million epochs
3 months
3 x 32GB Tesla V100
Thank you for listening
n.miller4@herts.ac.uk