
The creation and analysis of a (periodic) variable star catalogue, “PRIMVS”

Niall Miller

Prof. Phil Lucas, Dr. Yi Sun

University Of Hertfordshire

n.miller4@herts.ac.uk

VVV(X)

[Figure: wavelength scale running from red to mid IR]

- Started in 2010

- 4-meter VISTA telescope in the Z, Y, J, H, & Ks filters.

- ~50 - 80 epochs (~200 measurements*)

- ~700 million stars

- ~10 million variable stars

*this is highly variable

What motivates these tools? - How to reliably extract info

What we don't know:

- Light Curve shape

- Light Curve photometric accuracy

- Light Curve contamination

- Other sources of perturbation

What we know:

- Hopefully a star?

- Probably variable

- Phase folding techniques

- Finds the period that produces the ‘cleanest’ phase fold

- Nothing assumed about the structure of the data*

- Computationally expensive

- Not practical to perfectly tune
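As a minimal sketch of what "phase folding" means in practice, the snippet below folds a synthetic light curve on a trial period. The function name `phase_fold` and the toy sinusoid are illustrative assumptions, not part of the PRIMVS pipeline.

```python
import math

def phase_fold(times, period):
    """Return the phase (0 <= phase < 1) of each observation for a trial period."""
    return [(t / period) % 1.0 for t in times]

# Synthetic example (assumed data): a sinusoid with a 2.5-day period,
# sampled at irregular times.
true_period = 2.5
times = [0.0, 0.7, 3.1, 4.9, 8.2, 11.6, 12.4, 17.3]
mags = [math.sin(2 * math.pi * t / true_period) for t in times]

phases = phase_fold(times, true_period)
folded = sorted(zip(phases, mags))  # light curve ordered by phase
```

Folding on the correct period collapses every cycle onto one, so `folded` traces a single clean sinusoid; folding on a wrong period scatters the points.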

Lomb Scargle

- The most common

- Fourier based

- Is a fitting technique appropriate?
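For reference, a hedged sketch of the classical (Scargle 1982) periodogram evaluated on a frequency grid. This is a plain-Python illustration on assumed synthetic data, not the implementation used for the catalogue; in practice a library routine would normally be used.

```python
import math
import random

def lomb_scargle_power(times, mags, frequency):
    """Classical Lomb-Scargle power at one frequency (cycles per unit time)."""
    mean = sum(mags) / len(mags)
    y = [m - mean for m in mags]
    omega = 2.0 * math.pi * frequency
    # The time offset tau makes the sine and cosine terms orthogonal.
    s2 = sum(math.sin(2 * omega * t) for t in times)
    c2 = sum(math.cos(2 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(v * c for v, c in zip(y, cos_terms))
    ys = sum(v * s for v, s in zip(y, sin_terms))
    cc = sum(c * c for c in cos_terms)
    ss = sum(s * s for s in sin_terms)
    return 0.5 * (yc * yc / cc + ys * ys / ss)

# Assumed test data: noisy sinusoid, 80 irregular epochs over 100 days.
rng = random.Random(1)
true_period = 2.5
times = sorted(rng.uniform(0.0, 100.0) for _ in range(80))
mags = [math.sin(2 * math.pi * t / true_period) + rng.gauss(0.0, 0.2) for t in times]

# Trial frequencies from 0.05 to 1.0 cycles/day in steps of 0.001.
frequencies = [0.05 + 0.001 * i for i in range(951)]
powers = [lomb_scargle_power(times, mags, f) for f in frequencies]
best_period = 1.0 / frequencies[powers.index(max(powers))]
```

Because the method is effectively a least-squares sinusoid fit at each frequency, it works best when the light curve is close to sinusoidal, which is the concern raised in the last bullet.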

Phase Dispersion Minimisation & Conditional Entropy
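A minimal sketch of the Phase Dispersion Minimisation statistic, assuming simple fixed-width phase bins: theta is the pooled within-bin variance of the folded light curve divided by the total variance, so a good period gives theta well below 1. The bin count and the sawtooth test signal are assumptions chosen for illustration.

```python
import random

def pdm_theta(times, mags, period, n_bins=10):
    """Ratio of pooled within-bin variance to total variance (small = clean fold)."""
    n = len(mags)
    mean = sum(mags) / n
    total_var = sum((m - mean) ** 2 for m in mags) / (n - 1)
    bins = [[] for _ in range(n_bins)]
    for t, m in zip(times, mags):
        phase = (t / period) % 1.0
        bins[min(int(phase * n_bins), n_bins - 1)].append(m)
    squared_dev = 0.0
    dof = 0
    for members in bins:
        if len(members) > 1:  # bins with 0 or 1 points carry no variance information
            bin_mean = sum(members) / len(members)
            squared_dev += sum((m - bin_mean) ** 2 for m in members)
            dof += len(members) - 1
    return (squared_dev / dof) / total_var

# Assumed test data: a noisy sawtooth, i.e. a clearly non-sinusoidal shape.
rng = random.Random(0)
true_period = 2.5
times = [rng.uniform(0.0, 100.0) for _ in range(80)]
mags = [(t / true_period) % 1.0 + rng.gauss(0.0, 0.05) for t in times]

theta_true = pdm_theta(times, mags, true_period)  # close to 0
theta_wrong = pdm_theta(times, mags, 3.1)         # close to 1
```

No light curve shape is assumed here, which is why binning methods like this cope with non-sinusoidal variables; the cost is one full pass over the data per trial period.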

But how do we actually know?

How do we know if a star is periodic?

How do we know if we have extracted the right period?

Manual inspection can't be the only way?!?

Checking 200 took me ~1 hr, ∴ all 10 million would take ~50,000 hr, i.e. ~6 years non-stop. If only I were funded for that long…

- Compare against other things?

- Analyse the light curve?

- Analyse the periodogram?

Analysis of the periodogram?

There are many methods which try to verify periodicity by analysis of the light curve.

- A significance test

- A comparison of values to known periodic/aperiodic sources

- More sophisticated statistical approaches (e.g. Baluev)

All assume noise is correctly modelled/accounted for/ignored

Can an RNN solve this?

For each example of X and Y I have…

Given X, what does the RNN think Y is?

loss = how wrong it was

We can use the loss to calculate the direction to take for the next guess (à la gradient descent)

Inform the network on how wrong it was
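The guess, loss, update loop above can be sketched with a single-neuron classifier standing in for the RNN. Everything here is an illustrative assumption: `xs` is a made-up one-number feature per light curve, `ys` is a made-up periodic (1) or aperiodic (0) label, and the learning rate and step count are arbitrary.

```python
import math

def predict(weight, bias, x):
    """The model's guess for Y given X, as a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

def mean_loss(weight, bias, xs, ys):
    """Binary cross-entropy: how wrong the guesses are on average."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(weight, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)

def train(xs, ys, learning_rate=0.5, steps=2000):
    """Plain gradient descent: step against the gradient of the loss."""
    weight, bias = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict(weight, bias, x) - y  # d(loss)/d(logit) for cross-entropy
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / len(xs)
        bias -= learning_rate * grad_b / len(xs)
    return weight, bias

xs = [0.1, 0.4, 0.8, 1.5, 2.0, 2.6, 3.1, 3.5]  # hypothetical feature values
ys = [0, 0, 0, 0, 1, 1, 1, 1]                  # hypothetical labels

loss_before = mean_loss(0.0, 0.0, xs, ys)
weight, bias = train(xs, ys)
loss_after = mean_loss(weight, bias, xs, ys)
```

A real RNN has many more parameters and gets its gradients from backpropagation through time, but the update rule is the same idea.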

Can we show an RNN enough examples of labelled light curves such that it can identify periodic variables?

RNN is given:

- Phase

- Mag

- Interpolated Mag

AKA:

- X

- Y

- Smoothed Y
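A hedged sketch of how the three input channels could be assembled for one light curve. The slides do not say how the interpolated magnitude is produced, so the circular running mean used here, the `window` parameter and the function name `build_rnn_input` are all assumptions.

```python
import math

def build_rnn_input(times, mags, period, window=5):
    """Return rows of (phase, mag, smoothed_mag), ordered by phase."""
    folded = sorted(((t / period) % 1.0, m) for t, m in zip(times, mags))
    n = len(folded)
    half = window // 2
    rows = []
    for i, (phase, mag) in enumerate(folded):
        # Wrap around the ends of the list because phase is periodic.
        neighbours = [folded[(i + k) % n][1] for k in range(-half, half + 1)]
        rows.append((phase, mag, sum(neighbours) / len(neighbours)))
    return rows

# Assumed example data: a sinusoid folded on its true period.
period = 2.5
times = [0.3 * i + 0.01 * i * i for i in range(30)]
mags = [math.sin(2 * math.pi * t / period) for t in times]
rows = build_rnn_input(times, mags, period)
```

Each row is one time step of the sequence fed to the network, so a well-folded periodic star shows the raw and smoothed channels tracking each other closely.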

Machine learning is good at lying so we need to check

Same star with 4 different periods identified with different methods

Each method has produced a visually different phase folded light curve

Can it definitely identify periodic variables?

This NN method and the previous Baluev method are both tested on multiple simulated and real datasets for specificity and sensitivity.

The NN method performed better than the Baluev method, without requiring a Lomb-Scargle periodogram.

ROC curve
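For readers unfamiliar with the terms, a minimal sketch of how sensitivity, specificity and a ROC curve are computed from classifier scores. The scores and labels are made-up numbers for illustration, not results from the NN or Baluev tests.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call a source periodic when its score >= threshold, then count outcomes."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def roc_points(scores, labels):
    """(false positive rate, true positive rate) at every distinct threshold."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        points.append((1.0 - spec, sens))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

labels = [1, 1, 1, 1, 0, 0, 0, 0]                  # 1 = truly periodic (made up)
scores = [0.95, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]  # classifier output (made up)

sens, spec = sensitivity_specificity(scores, labels, 0.5)
auc = area_under_curve(roc_points(scores, labels))
```

An area of 1.0 means perfect separation of periodic from aperiodic sources and 0.5 means no better than chance, which is how two methods can be compared on one plot.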

It’s more than just better, it’s unbiased.

If we calculate a FAP for multiple simulated periodic light curves at different SNR, are we biased towards more sinusoidal light curves?


What do we get from all of this?

We have 100,000* confident periodic variables

Of which 35,000 are in Gaia, 9,000 in WISE, 40,000 in TESS and 25,000 in 2MASS, i.e. likely new things here!

Period range: 0.001 - ~1500 days

Multiple time-series statistics separate to periodicity - lots of features

*likely to increase

Example to find more Young Stellar Objects with SPICY

Can we use the carefully selected YSOs from the SPICY catalogue to parameterise VVV YSOs?

Example to find more YSOs with SPICY

Possibly Potentially Candidate YSOs

553 potentially new objects

But finding more of the same is a little boring… What can we do to further probe this?

We have a bunch of confident periodic variables… now what?

- Feature classification (period, amplitude, skew…)

- Do our features form a unique set for each class of star? …no

We have an unknown amount of unknown unknowns and we would like to identify new/interesting/understudied things.

“unknown amount of unknown unknowns” - Stable Diffusion 2.1

Un/Self-supervised learning - SimCLR - ML holy grail

By applying various modifications to a light curve without compromising its inherent meaning, we aim to categorize light curves that a human observer would perceive as similar.
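A hedged sketch of what such meaning-preserving modifications could look like for a folded light curve. The slides do not list the augmentations actually used, so the three chosen here (random phase shift, jitter within the quoted uncertainty, dropping about 10% of points) are assumptions, as is the function name `augment`.

```python
import math
import random

def augment(phases, mags, errors, rng, drop_fraction=0.1):
    """Return one randomly modified view of a folded light curve."""
    shift = rng.random()  # a periodic signal has no preferred phase zero point
    view = []
    for phase, mag, err in zip(phases, mags, errors):
        if rng.random() < drop_fraction:
            continue  # mimic a missing epoch
        new_phase = (phase + shift) % 1.0
        new_mag = mag + rng.gauss(0.0, err)  # jitter within the measurement error
        view.append((new_phase, new_mag, err))
    return sorted(view)

# Assumed example data: 20 evenly phased points of a sinusoid.
phases = [i / 20 for i in range(20)]
mags = [math.sin(2 * math.pi * p) for p in phases]
errors = [0.05] * 20

rng = random.Random(42)
view_a = augment(phases, mags, errors, rng)
view_b = augment(phases, mags, errors, rng)  # a second view of the same star
```

In a SimCLR-style setup, `view_a` and `view_b` form a positive pair that the encoder is trained to place close together, while views of other stars are pushed apart.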

These are all images of a dog >

See upcoming Miller - Smith paper for inference and explanation on this!

Gaia is optical, VVV is near-IR, what can we do with that?

Can we make fake light curves for better training?

Slightly different to standard image diffusion

We instead train with magnitude, uncertainty and time*.

Period is never explicitly given!

*Time = Time/Period (not really a phase, as the integer part remains)
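A small sketch of the time scaling described above, using made-up observation times. Dividing by the period gives time in units of cycles: the fractional part is the usual phase, and the integer part (the cycle number) is kept rather than discarded.

```python
import math

def scale_time(times, period):
    """Express observation times in units of the period (cycles)."""
    return [t / period for t in times]

times = [0.0, 1.0, 2.5, 6.0, 13.75]  # hypothetical observation times in days
period = 2.5                         # hypothetical period in days

scaled = scale_time(times, period)   # what the model sees as "time"
cycles = [math.floor(s) for s in scaled]          # integer part: which cycle
phases = [s - c for s, c in zip(scaled, cycles)]  # fractional part: ordinary phase
```

Because the cycle number survives, the model can in principle learn cycle-to-cycle changes that an ordinary phase fold would average away.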

1 million epochs

3 months

3 x 32GB Tesla V100

Thank you for listening

n.miller4@herts.ac.uk
