# Auxiliary Samplers

Along with sampling the spatial and luminosity distributions, auxiliary properties and be sampled that both depend on and/or influence the luminosity as well each other. This allows you to build up arbitrailiy complex dependencies between parameters which can lead to diverse populations.

[1]:

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline
from jupyterthemes import jtplot

jtplot.style(context="notebook", fscale=1, grid=False)

purple = "#B833FF"
yellow = "#F6EF5B"

import warnings

warnings.simplefilter("ignore")

import popsynth

popsynth.update_logging_level("INFO")


## Built in auxiliary samplers

There are several built in auxiliary samplers that allow you to quickly add on auxiliary parameters.

[2]:

popsynth.list_available_auxiliary_samplers()

DeltaAuxSampler
LogNormalAuxSampler
Log10NormalAuxSampler
NormalAuxSampler
ParetoAuxSampler
PowerLawAuxSampler
BrokenPowerLawAuxSampler
TruncatedNormalAuxSampler
ViewingAngleSampler


We can add these on to the populations, but let’s have a look at how to use them.

[3]:

x = popsynth.NormalAuxSampler(name="aux_param", observed=True)

x.mu = 0
x.sigma = 1

# draws the observed values from normal distribution with std equal to tau
x.tau = 1


If value of x is observed (generates data), then we can set the width of the normal distribtuion from which the observed values are sampled from the latent values. Otherwise, only the latent values are stored. This applies to any of the built in auxiliary samplers. However, this can all be customized by adding our own:

## Creating a custom auxiliary sampler

Let’s create two auxiliary samplers that sample values from normal distributions with some dependency on each other.

First, we specify the main population. This time, we will chose a SFR-like redshift distribution and a Schecter luminosity function

[4]:

pop_gen = popsynth.populations.SchechterSFRPopulation(
r0=100,a=0.0157, rise=1.0, decay=1.0, peak=1.0, Lmin=1e50, alpha=2.0
)


Suppose we have a property “demo” that we want to sample as well. For this property, we do not observe it directly. We will get to that. This means that our property latent and could influence other parameters but we can not measure it directly. If you are familiar with Bayesian hierarchical models, this concept may be more familiar to you. As an example, this could be the temperature of a star, which influences its spectrum. The spectrum creates an observable, but the tempreature is imply a random latent variable sampled from a distribution.

We create an AuxiliarySampler child class, and define the true_sampler for the latent values:

[5]:

class DemoSampler(popsynth.AuxiliarySampler):
_auxiliary_sampler_name = "DemoSampler"

mu = popsynth.auxiliary_sampler.AuxiliaryParameter(default=2)
tau = popsynth.auxiliary_sampler.AuxiliaryParameter(default=1, vmin=0)

def __init__(self):

# pass up to the super class
super(DemoSampler, self).__init__("demo", observed=False)

def true_sampler(self, size):

# sample the latent values for this property

self._true_values = np.random.normal(self.mu, self.tau, size=size)


Now we instantiate it and then assign it our pop_gen object. Then we draw out survey

[6]:

demo1 = DemoSampler()

flux_selector = popsynth.HardFluxSelection()
flux_selector.boundary = 1e-9

pop_gen.set_flux_selection(flux_selector)

population = pop_gen.draw_survey()

## plot it
options = {"node_color": purple, "node_size": 3000, "width": 0.5}
pos = nx.drawing.nx_agraph.graphviz_layout(population.graph, prog="dot")
nx.draw(population.graph, with_labels=True, pos=pos, **options)

 INFO     |  registering auxilary sampler: demo
INFO     |  The volume integral is 4760.754853877658

 INFO     |  Expecting 4693 total objects
INFO     |  Sampling: demo
INFO     |  applying selection to fluxes
INFO     |  Detected 3223 distances
INFO     |  Detected 3223 objects out to a distance of 8.44

[7]:

fig = population.display_fluxes(obs_color=purple, true_color=yellow, s=15)


We can see that the population has stored out demo auxiliary property.

[8]:

all_demo = population.demo

selected_demo = population.demo.selected


We can also see that our demo sampler is now known which is important when creating populations from YAML files. This registering happens when we add the property _auxiliary_sampler_name = "DemoSampler" which must be name of the class!

[9]:

popsynth.list_available_auxiliary_samplers()

DeltaAuxSampler
LogNormalAuxSampler
Log10NormalAuxSampler
NormalAuxSampler
ParetoAuxSampler
PowerLawAuxSampler
BrokenPowerLawAuxSampler
TruncatedNormalAuxSampler
ViewingAngleSampler
DemoSampler


## Observed auxiliary properties and dependent parameters

Suppose now we want to simulate a property that is observed by an instrument but depends on latent parameters.

We will create a second demo sampler and tell it what the observational error is as well as how to read from a secondary sampler:

[10]:

class DemoSampler2(popsynth.AuxiliarySampler):
_auxiliary_sampler_name = "DemoSampler2"
mu = popsynth.auxiliary_sampler.AuxiliaryParameter(default=2)
tau = popsynth.auxiliary_sampler.AuxiliaryParameter(default=1, vmin=0)
sigma = popsynth.auxiliary_sampler.AuxiliaryParameter(default=1, vmin=0)

def __init__(
self,
):

# this time set observed=True
super(DemoSampler2, self).__init__("demo2", observed=True, uses_distance=True)

def true_sampler(self, size):

# we access the secondary sampler dictionary. In this
# case "demo". This itself is a sampler with
# <>.true_values as a parameter
secondary = self._secondary_samplers['demo']

# now we sample the demo2 latent values and add on the dependence of "demo"

tmp = np.random.normal(self.mu, self.tau, size=size)

# for fun, we can substract the log of the distance as all
# auxiliary samples know about their distances

self._true_values = tmp + secondary.true_values - np.log10(1 + self._distance)

def observation_sampler(self, size):

# here we define the "observed" values, i.e., the latened values
# with observational error

self._obs_values = self._true_values + np.random.normal(
0, self.sigma, size=size
)


We recreate our base sampler:

[11]:

pop_gen = popsynth.populations.SchechterSFRPopulation(
r0=100, a=0.0157, rise=1.0, decay=1.0, peak=1.0, Lmin=1e50, alpha=2.0
)


Now, make a new demo1, but this time we do not have to attach it to the base sampler. Instead, we will assign it as a secondary sampler to demo2 and popsynth is smart enough to search for it when it draws a survey.

[12]:

demo1 = DemoSampler()

demo2 = DemoSampler2()

demo2.set_secondary_sampler(demo1)

# attach to the base sampler

 INFO     |  registering auxilary sampler: demo2

[13]:

pos = nx.drawing.nx_agraph.graphviz_layout(pop_gen.graph, prog="dot")

fig, ax = plt.subplots()

nx.draw(pop_gen.graph, with_labels=True, pos=pos, ax=ax, **options)

[14]:

flux_selector = popsynth.HardFluxSelection()
flux_selector.boundary = 1e-8

pop_gen.set_flux_selection(flux_selector)
population = pop_gen.draw_survey(flux_sigma=0.1)

 INFO     |  The volume integral is 4760.754853877658

 INFO     |  Expecting 4693 total objects
INFO     |  Sampling: demo2
INFO     |  demo2 is sampling its secondary quantities
INFO     |  Sampling: demo
INFO     |  applying selection to fluxes
INFO     |  Detected 1124 distances
INFO     |  Detected 1124 objects out to a distance of 3.18

[15]:

fig, ax = plt.subplots()

ax.scatter(population.demo2.selected, population.demo.selected, c=purple, s=40)

ax.scatter(population.demo2, population.demo, c=yellow, s=20)

[15]:

<matplotlib.collections.PathCollection at 0x7f33439c6c70>


## Derived Luminosity sampler

Sometimes, the luminosity does not come directly from a distribution. Rather, it is computed from other quantities. In these cases, we want to use the DerivedLumAuxSampler class.

This allows you to sample auxiliary parameters and compute a luminosity from those.

[16]:

class DemoSampler3(popsynth.DerivedLumAuxSampler):
_auxiliary_sampler_name = "DemoSampler3"
mu = popsynth.auxiliary_sampler.AuxiliaryParameter(default=1)
tau = popsynth.auxiliary_sampler.AuxiliaryParameter(default=1, vmin=0)

def __init__(self, mu=2, tau=1.0, sigma=1):

# this time set observed=True
super(DemoSampler3, self).__init__("demo3", uses_distance=False)

def true_sampler(self, size):

# draw a random number
tmp = np.random.normal(self.mu, self.tau, size=size)

self._true_values = tmp

def compute_luminosity(self):

# compute the luminosity
secondary = self._secondary_samplers["demo"]

return (10 ** (self._true_values + 54)) + secondary.true_values

[17]:

pop_gen = popsynth.populations.SchechterSFRPopulation(
r0=100,a=0.0157, rise=1.0, decay=1.0, peak=1.0, Lmin=1e50, alpha=2.0
)

[18]:

demo1 = DemoSampler()

demo3 = DemoSampler3()

demo3.set_secondary_sampler(demo1)

# attach to the base sampler

pos = nx.drawing.nx_agraph.graphviz_layout(pop_gen.graph, prog="dot")

fig, ax = plt.subplots()

nx.draw(pop_gen.graph, with_labels=True, pos=pos, **options, ax=ax)

 INFO     |  registering derived luminosity sampler: demo3

[19]:

flux_selector = popsynth.HardFluxSelection()
flux_selector.boundary = 1e-5
pop_gen.set_flux_selection(flux_selector)
population = pop_gen.draw_survey(flux_sigma=0.1)

 INFO     |  The volume integral is 4760.754853877658

 INFO     |  Expecting 4693 total objects
INFO     |  Sampling: demo3
INFO     |  demo3 is sampling its secondary quantities
INFO     |  Sampling: demo
INFO     |  Getting luminosity from derived sampler
INFO     |  applying selection to fluxes
INFO     |  Detected 3752 distances
INFO     |  Detected 3752 objects out to a distance of 9.99

[20]:

population.display_fluxes(obs_color=purple, true_color=yellow, s=15)

[20]: