-
Notifications
You must be signed in to change notification settings - Fork 31
/
Copy pathMake_Plots.py
240 lines (188 loc) · 8.73 KB
/
Make_Plots.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
'''
Usage: python Make_Plots.py
This is meant to be a general use script to fit a demographic model and create
plots of the data and model sfs. The user will have to edit information about
their allele frequency spectrum and provide a custom model. The instructions
are annotated below, with a #************** marking sections that will have to
be edited.
This script must be in the same working directory as Plotting_Functions.py, which
contains all the functions necessary for generating simulations and optimizing the model.
General workflow:
The user provides a model and the previously optimized parameters for their empirical
data. The model is fit using these parameters, and the resulting model SFS is used to
create the plot comparing the data to the model.
Outputs:
A summary file of the model fit will be created, along with a pdf of the plot.
Notes/Caveats:
Sometimes you may see the following error when plotting 2D or 3D:
"ValueError: Data has no positive values, and therefore can not be log-scaled."
To deal with this, you can change the optional argument vmin_val to a value
between 0 and 0.05 (0.05 is the default). I have used values from 0.005-0.01
with good visual results.
Citations:
If you use these scripts for your work, please cite the following publications.
For the general optimization routine:
Portik, D.M., Leache, A.D., Rivera, D., Blackburn, D.C., Rodel, M.-O.,
Barej, M.F., Hirschfeld, M., Burger, M., and M.K.Fujita. 2017.
Evaluating mechanisms of diversification in a Guineo-Congolian forest
frog using demographic model selection. Molecular Ecology 26: 52455263.
doi: 10.1111/mec.14266
-------------------------
Written for Python 2.7 and 3.7
Python modules required:
-Numpy
-Scipy
-dadi
-------------------------
Daniel Portik
https://github.com/dportik
Updated September 2019
'''
import os
import numpy
import dadi
import Plotting_Functions
from dadi import Numerics, PhiManip, Integration, Misc
from dadi.Spectrum_mod import Spectrum
#===========================================================================
# Import data to create joint-site frequency spectrum
#===========================================================================
#**************
#path to your input file
snps = "/Users/portik/Documents/GitHub/dadi_pipeline/Two_Population_Pipeline/Example_Data/dadi_2pops_North_South_snps.txt"
#Create python dictionary from snps file
dd = Misc.make_data_dict(snps)
#**************
#pop_ids is a list which should match the populations headers of your SNPs file columns
pop_ids=["North", "South"]
#**************
#projection sizes, in ALLELES not individuals
proj = [16, 32]
#Convert this dictionary into folded AFS object
#[polarized = False] creates folded spectrum object
fs = Spectrum.from_data_dict(dd, pop_ids=pop_ids, projections = proj, polarized = False)
#print some useful information about the afs or jsfs
print("\n\n============================================================================")
print("\nData for site frequency spectrum:\n")
print("Projection: {}".format(proj))
print("Sample sizes: {}".format(fs.sample_sizes))
print("Sum of SFS: {}".format(numpy.around(fs.S(), 2)))
print("\n============================================================================\n")
#================================================================================
# Fit the empirical data based on prior optimization results, obtain model SFS
#================================================================================
'''
We will use a function from the Plotting_Functions.py script:
Fit_Empirical(fs, pts, outfile, model_name, func, in_params, fs_folded)
Mandatory Arguments =
fs: spectrum object name
pts: grid size for extrapolation, list of three values
outfile: prefix for output naming
model_name: a label to help name the output files; ex. "sym_mig"
func: access the model function from within script
in_params: the previously optimized parameters to use
fs_folded: A Boolean value indicating whether the empirical fs is folded (True) or not (False).
'''
#**************
#COPY AND PASTE YOUR MODEL HERE, do not use this one!
#you can copy/paste directly from the Models_2D.py or Models_3D.py scripts
def sym_mig(params, ns, pts):
"""
Split into two populations, with symmetric migration.
nu1: Size of population 1 after split.
nu2: Size of population 2 after split.
T: Time in the past of split (in units of 2*Na generations)
m: Migration rate between populations (2*Na*m)
"""
nu1, nu2, m, T = params
xx = Numerics.default_grid(pts)
phi = PhiManip.phi_1D(xx)
phi = PhiManip.phi_1D_to_2D(xx, phi)
phi = Integration.two_pops(phi, xx, T, nu1, nu2, m12=m, m21=m)
fs = Spectrum.from_phi(phi, ns, (xx,xx))
return fs
#create a prefix based on the population names to label the output files
#ex. Pop1_Pop2
#DO NOT EDIT THIS
prefix = "_".join(pop_ids)
#**************
#Make sure to define your extrapolation grid size.
pts = [50,60,70]
#**************
#Provide best optimized parameter set for empirical data.
#These will come from previous analyses you have already completed.
emp_params = [0.1487,0.1352,0.2477,0.1877]
#**************
#Fit the model using these parameters and return the model SFS.
#Here, you will want to change the "sym_mig" and sym_mig arguments to match your model, but
#everything else can stay as it is. See above for argument explanations.
model_fit = Plotting_Functions.Fit_Empirical(fs, pts, prefix, "sym_mig", sym_mig, emp_params, fs_folded=True)
#================================================================================
# Plotting a 2D spectrum
#================================================================================
'''
We will use a function from the Plotting_Functions.py script to plot:
Plot_2D(fs, model_fit, outfile, model_name, vmin_val=None)
Mandatory Arguments =
fs: spectrum object name
model_fit: the model spectrum object name
outfile: prefix for output naming
model_name: a label to help name the output files; ex. "sym_mig"
Optional Arguments =
vmin_val: Minimum values plotted for sfs, default is 0.05 and to fix the common error this should
be changed to something between 0 and 0.05.
'''
#**************
#Now we simply call the function with the correct arguments (notice many are the same from the
#empirical fit).
Plotting_Functions.Plot_2D(fs, model_fit, prefix, "sym_mig")
#**************
#Although the above function does not produce an error, let's pretend it did and
#change the vmin value. You can see the effect on the colors in the plot. We will
#edit the "model_name" string so the output file will be called something different.
vmin = float(0.01)
Plotting_Functions.Plot_2D(fs, model_fit, prefix, "sym_mig_vmin", vmin_val = vmin)
#================================================================================
# Plotting a 1D spectrum
#================================================================================
#Although we've used the script to plot a 2D jsfs, it can also be used to create
#plots for a 1D spectrum in much the same way.
'''
For a 1D sfs:
We will use a function from the Plotting_Functions.py script to plot:
Plot_1D(fs, model_fit, outfile, model_name)
Mandatory Arguments =
fs: spectrum object name
model_fit: the model spectrum object name
outfile: prefix for output naming
model_name: a label to help name the output files; ex. "sym_mig"
'''
#Example 1D plot call (will not work with the example 2D sfs!):
#**************
#Unhash the command below to run the 1D plot, make sure to change
#the 'prefix' and 'something' args to fit your dataset
#Plotting_Functions.Plot_1D(fs, model_fit, prefix, "something")
#================================================================================
# Plotting a 3D spectrum
#================================================================================
#Although we've used the script to plot a 2D jsfs, it can also be used to create
#plots a 3D spectrum in much the same way.
'''
For a 3D jsfs:
We will use a function from the Plotting_Functions.py script to plot:
Plot_3D(fs, model_fit, outfile, model_name, vmin_val=None)
Mandatory Arguments =
fs: spectrum object name
model_fit: the model spectrum object name
outfile: prefix for output naming
model_name: a label to help name the output files; ex. "sym_mig"
Optional Arguments =
vmin_val: Minimum values plotted for sfs, default is 0.05 and to fix the common error this should
be changed to something between 0 and 0.05.
'''
#Example 3D plot call (will not work with the example 2D sfs!):
#**************
#Unhash the command below to run the 1D plot, make sure to change
#the 'prefix' and 'something' args to fit your dataset
#Plotting_Functions.Plot_3D(fs, model_fit, prefix, "something")