-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathmodel.tex
243 lines (227 loc) · 11.8 KB
/
model.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
\section{Radial Velocity Models in the Search for Exoplanets}
\label{sec:exo}
In a two-body system such as a star and planet, the pair rotate
together about a point lying somewhere on the line connecting their
centers of mass. If one of the bodies (the star) radiates light, the
frequency of this light measured by a distant observer will
vary cyclically with a period equal to the orbital period. This
Doppler effect is understood well enough that astronomers can
translate between the frequency shift and the star's velocity toward or
away from earth providing indirect measurements of a star's velocity.
The radial velocity (RV) measurement of a star is the component of velocity
along the line of sight from the earth to the star.
In additional to calculated radial velocities of a star at
time $t_i$, $v_i$, an associated estimate of uncertainty $\s_i$ is
reported which accounts for the propogation of errors introduced
by the telescope and measurement process \uhoh{give REF}.
If a star does not host any orbiting planets, then the radial velocity
measurements $v_i$ will be roughly constant over any period of time,
varying only due to the ``stellar jitter'', $s^2$, the random
fluctuations in a star's luminosity due to eddies, upwellings and downwellings
on the surface of the star. Under the zero-planet
model, $\M_0$, the RV measurements are assumed to have Gaussian distributions
\begin{equation}\label{0p_Model}
v_i \mid \M_0\ind \mathcal{N}\left(\C_0,\s^2_i + \sj_0^2 \right).
\end{equation}
with mean $\C_0$, the constant center-of-mass velocity of the star
relative to the earth, and variances $\sj^2_0 + \s^2_i$. The
parameters $\C_0$ and $\sj_0$ are both in the same units as the RV
measurments. \uhoh{more justification for additive form of the variances, ref?}.
If the star does host planets, the gravitational pull of the orbiting
planets induces a small wobble in the observed radial velocities,
which may be modelled using Keplar's laws.
For a single planet model, $\M_1$, the RV measurements are also
assumed to be Gaussian distributed with
\begin{equation}\label{Velocity_Model}
v_i \mid \M_1 \ind \N \left(\C_1 +\bigtriangleup
V(t_i|\phi_1),\s_i^2+\sj_1^2\right),
\end{equation}
where the velocity shift $\bigtriangleup
V(t_i|\phi_1)$ due to the presence of a single planet is
a family of curves parametrized by
$\phi_1 \equiv (\K_1,\P_1,\e_1,\omega_1,\mu_1)$
\begin{equation}\label{Velocity_1p_Model}
\bigtriangleup V(t|\phi_1)=\K_1[\cos(\omega_1+T(t))+\e_1 \cos(\omega_1)]
\end{equation}
where $T(t)$ is the ``true anomaly at time $t$'' given by
\begin{equation}\label{true_anomaly}
T(t)=2\arctan\left[\tan\left(\frac{E(t)}{2}\right)\sqrt{\frac{1+\e_1}{1-\e_1}}\right].
\end{equation}
and $E(t)$ is called the ``eccentric anomaly at time $t$'', which is the
solution to the transcendental equation
\begin{equation}\label{transcendental_equation}
E(t)-\e_1\sin(E(t))=\mbox{mod}\left(\frac{2\pi}{\P_1}t+\mu_1,2\pi\right).
\end{equation}
The five orbital parameters that comprise $\phi_1$ are the velocity
semi-amplitude $\K_1$, the orbital period $\P_1$, the eccentricity
$\e_1$, $(0\leq \e_1 \le 1)$, the argument of periastron $\omega_1$,
$(0\le \omega_1 \le 2\pi)$ and the mean anomaly at time $t=0$,
$\mu_1$, $(0\le \mu_1 \le 2\pi)$. The parameters $\C_1$ $\K_1$ and
$\sj_1$ have units $m/s$; the velocity semi-amplitude $\K_1$ is
usually restricted to be non-negative to avoid identification
problems, while the velocity offset $\C_1$ may be positive or
negative. The eccentricity parameter $\e_1$ is unit-less, with $\e_1
= 0$ corresponding to a circular orbit, and larger $\e_1$ leading to
more eccentric orbits. Periastron is the point at which the planet is
closest to the star and the argument of periastron $\omega_1$,
measures the angle at which we observe the elliptical orbit. The mean
anomaly $\mu_1$ is an angular distance of a planet from periastron.
\uhoh{Add figure illustrating?}
If there are $P \ge 1$ planets, the expected velocity is $\C_p
+ \bigtriangleup V(t_i|\phi_1,\ldots,\phi_p)$ with the overall velocity shift
$\bigtriangleup V$ approximated as the sum of the velocity
shifts of the individual planets:
\begin{equation}\label{Velocity_2p_Model}
\bigtriangleup V(t_i|\phi_1,\ldots,\phi_p)=\sum_{j=1}^p
\K_j[\cos(\omega_j+T_i(t_i))+\e_j\cos(\omega_j)]
\end{equation}
where the planets' mutual gravitational interactions are assumed to be
negligible. With $p$ planets, there are a total of $2+5p$ parameters,
denoted as
$\theta_p = \{ \mathcal{C}_p,\sj_p^2,\phi_1,\ldots,\phi_p\}$ for each of
the models $\M_p$, $p \in \{0, 1, \ldots P\}$. Of course, we do not
know how many planets there are \emph{a priori} -- indeed, finding the
number of planets $p$ and characterizing their orbital parameters is
a major aim.
\subsection{Bayesian Methods for Identifying the Number of Planets}
Determining the number of planets in a
system is, from a statistical point of view, a model choice
problem. Bayesian model selection requires calculation of marginal
likelihoods of models or ``evidence'' provided by the data for each model:
\begin{equation}\label{marginal_lik}
m( \M_p) \propto p(\v \mid \M_p) = \int_{\Theta_p}
p(\v \mid \theta_p,\M_p) p(\theta_p \mid \M_p) d\theta_p
\end{equation}
which entails integrating the sampling model of the
data $\v = (v_1, \ldots v_n)^T$ with respect to the prior distribution
of model specific parameters $\theta_p$ to obtain the marginal
density $p(\v \mid \M_p)$ of the data under model $\M_p$.
Bayes Factors for comparing a $p$ planet model to the $0$ planet model
may be expressed as
\begin{equation}
\BF(\M_p:\M_0)=\frac{m( \v \mid \M_p)}{m(\v \mid \M_0)}
\end{equation}
where the Bayes factor $\BF(\M_0,\M_0) = 1$,
while the posterior probability of the $p$ planet model is of the
form
\begin{equation}
\label{eq:post-prob}
p(\M_p \mid \v) = \frac{\BF(\M_p:\M_o) O(\M_p: \M_0)}
{ \sum_{ j= 1}^{P} \BF(\M_j:\M_o) O(\M_j: \M_0)}
\end{equation}
where $O(\M_p: \M_0)$ is the prior odds of $p$ planets to $0$
planets and $P$ is the maximum number of planets for the
system. This requires specifying a prior distribution on $\theta_p$
for each of the models in order to obtain marginal likelihoods and
Bayes factors.
\subsection{Priors Distributions}
Many of the parameters in $\theta_p$ allow informative marginal prior
distributions to be specified, although specifying joint distributions
is more difficult. While distributions may be tuned for specific
applications, \citeauthor{ford2006bms} and \citep{SAMSI:2006}
recommended the following independent ``reference'' priors as a starting
point for Bayesian analyses of radial velocity models.
In all models, the velocity offset (intercept) parameter
$\C_p$ and stellar jitter parameter $\sj_p$, are taken as
being {\it a priori} independent, where $\C_p$ is uniform over a finite
set $[\C_{\min}, \C_{\max} ] $
\begin{subequations}
\begin{align}
\label{prior_C}
p_C(C) & =\left\{\begin{aligned}
\frac{1}{\C_{\max}-\C_{\min}} &~ \qquad\qquad\qquad\mbox{for } ~\C_{\min}\leq C\leq \C_{\max}\\
0\quad\quad &~ \qquad\qquad\qquad\mbox{otherwise} \\
\end{aligned}
\right.
\intertext{and $\log(\sj_{\min} + \sj_p)$ has a uniform distribution on
the interval $(\log(\sj_{\min}), \log(\sj_{\max})]$, }
\label{prior_sigma}
p_{\sj}(\sj) & = \left\{\begin{aligned}
\frac{1}{ \log \left(1+\frac{\sj_{\max}}{\sj_{\min}}\right)} \cdot
\frac{1}{\sj_{\min}+ \sj}
&~
\quad\quad\mbox{for } ~0 < \sj \leq \sj_{\max} \\
0 \quad\quad &~ \quad\quad\mbox{otherwise.} \\
\end{aligned} \right.
\end{align}
\end{subequations}
The joint prior on $(\C_p, \sj_p)$ may be viewed as a
modified independent Jeffrey's prior as $\C_{\min} \to -\infty$,
$\C_{\max} \to \infty$, $\sj_{\min} \to 0$, $\sj_{\max} \to \infty$, which leads
to well defined posterior distributions and Bayes Factors even in the limit.
However, informative upper limits on the stellar jitter are
often available, which may aid in the rejection of models that are too
simple to capture the observed periodicities in the data. This is a
particular advantage of the Bayesian approach over frequentist methods.
For components of $\phi_p$, the recommended priors are $\e_p \sim \U(0,1)$, $\omega_p \sim \U(0, 2
\pi)$ and $\mu_p \sim \U(0, 2 \pi)$, which take into account the known
marginal constraints. For period, they recommend that log period have
a uniform distribution over the range $1$ day to $1,000$ years. The
lower limit corresponds to the smallest oribital period of known
exoplanets (which is lightly larger than the theoretical limit for
certain stars), while the upper limit exceeds the period of known
exoplanets, while still within the limits where the planet's orbit would not be
disrupted by outside perturbations to the system. Our prior for $\K_p$
is based on the SAMSI Exoplanet Working Group recommendation, where
$\K_p + \K_{\min} \sim \U(\K_{\min}, \K_{\min} + \K_{\max})$ and
$\K_{\max} = 2128$ corresponds to a maximum
planet-star mass ratio of 0.01; the lower bound corresponds to the
current minimum detection bounds ($1$ m/s); the prior in
\citet{ford2006bms} is of the same form, although they allow the upper
bound to depend on period. Table \ref{tab:hyper} summarizes the
constants used in the prior distributions.
\begin{table}[h]
\begin{center}
\begin{tabular} {|ll|lr|lr|} \hline \hline
\multicolumn{2}{|c}{Model Parameter} &
\multicolumn{2}{|c|}{Hyperparameters} \\ \hline
Period & $\P_p$ & $\P_{\min}$ & $1$ day & $\P_{\max}$ & $1,000$ years \\
Velocity Semi-Amplitude &$\K_p$ & $\K_{\min}$ & $ 1$ m/s & $\K_{\max}$ & $2128$ m/s \\
Velocity Offset & $\C_p$ & $\C_{min}$ & $-2128$ m/s & $\C_{max}$ & $2128$ m/s \\
Steller Jitter & $\sj_p$ &$\sj_{\min}$ & $1$ m/s & $\sj_{\max}$ & $100$ m/s \\ \hline
\end{tabular}
\end{center}
\caption{Hyperparameters for the the refernce prior distributions of
$\phi_p$ from \citet{ford2006bms}.}
\label{tab:hyper}
\end{table}
The nonlinear relationships induce strong correlations among
many of the parameters, which leads to a difficulties for posterior
inference. For circular orbits ($e = 0$), $\omega$ and $\mu_0$ are in
fact unidentifiable. To simplify posterior inference using
stochastic sampling, we work in a transformed parameter space:
$$
\begin{aligned}
x_p & = \e_p\cos\omega_p \label{eq:poincare-x} & \quad & y_p & =
\e_p\sin\omega_p \label{eq:poincare-y} & \quad & z_p & = (\omega_p+\mu_p)
\mod 2\pi \\
\dot{\P}_p & = \log \P_p & \quad & \dot{\K}_p & = \log \K & \quad &
\end{aligned}
$$
leading to $\dot{\phi}_p \equiv (\dot{\K}_p, \dot{\P}_p, x_p, y_p,
z_p)^T$. The Poincar\'e variables $x_p$ and $y_p$ greatly reduce the
very strong correlations between $\mu_p$ and $\omega_p$, which is
particularly important for low eccentricity orbits, where the
parameters are nearly unidentifiable. The use of $z_p$ further
reduces correlations between the parameters $\omega_p$ and $\mu_p$
when $\e_p \ll 1$, but has little effect for large $\e_p$.
Posterior distributions for were more Gaussian in these coordinates, which led to
improved posterior simulation. In the transformed parameter space,
the prior distribution for $\phi_p$ is
\begin{subequations}
\begin{equation}\label{prior_1p}
p_{\dot{\phi}_p}(\dot{\phi}_p)=c_\phi \cdot\exp\dot{\K}_p\cdot\frac{1}{1+\frac{\exp\dot{\K}_p}{\K_{\min}}}\cdot\frac{1}{\sqrt{x_p^2+y_p^2}}
\end{equation}
for $\log(\K_{\min}) < \dot{\K}_p \leq\log(\K_{\max})$,
$\log(\P_{\min})\leq\dot{P}\leq\log(\P_{\max})$, $x^2+y^2<1$, and $0\leq
z\leq2\pi$,
where the normalizing constant is
\begin{equation}\label{kappa_in_prior_1p}
c_\phi
=\frac{1}{\log\left(1+\frac{\K_{max}}{\K_{\min}}\right)}\cdot\frac{1}{\K_{\min}}\cdot\frac{1}{\log\left(\frac{\P_{\max}}{\P_{\min}}\right)}\cdot\left(\frac{1}{2\pi}\right)^2.
\end{equation}
\end{subequations}
For any of the $p > 1$ planet models, there is an inherent difficulty due
to the arbitrary ``labeling'' of planets in the model. To address this lack of
identification, we {\it a priori} restrict the periods so that
$\P_{\min} \le \P_1 \le P_2 \le \ldots \le P_p \le \P_{\max}$.