get_bio_fmt_specimens improvements (#9)
* get_bio_fmt_specimens checks out the latest tag by default.

* Add fresh option

* Update docstring

* Provide slightly more convenience for bio_fmt_specimen selection

Without breaking setups already extant in packages.

* Added docstring

* Update documentation
Ben J. Ward committed Feb 15, 2018
1 parent 9bf3acc commit 913b25f
Showing 6 changed files with 111 additions and 16 deletions.
4 changes: 2 additions & 2 deletions .appveyor.yml
@@ -1,6 +1,6 @@
environment:
matrix:
- JULIA_URL: "https://julialang-s3.julialang.org/bin/winnt/x64/0.5/julia-0.5-latest-win64.exe"
- JULIA_URL: "https://julialang-s3.julialang.org/bin/winnt/x64/0.6/julia-0.6-latest-win64.exe"

branches:
only:
@@ -10,7 +10,7 @@ notifications:
- provider: Email
on_build_success: false
on_build_failure: false
on_build_status_changed: false
on_build_status_changed: true

install:
- ps: "[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls12"
1 change: 0 additions & 1 deletion .travis.yml
@@ -3,7 +3,6 @@ os:
- linux
- osx
julia:
- 0.5
- 0.6
- nightly
matrix:
3 changes: 1 addition & 2 deletions REQUIRE
@@ -1,4 +1,3 @@
julia 0.5
Compat 0.17
julia 0.6
Automa 0.3
BufferedStreams
19 changes: 15 additions & 4 deletions docs/make.jl
@@ -1,10 +1,21 @@
using Documenter, BioCore

makedocs()
makedocs(
format = :html,
sitename = "BioCore.jl",
pages = [
"Home" => "index.md",
"Using file format specimens" => "testing.md",
"Contributing" => "contributing.md"
],
authors = "The BioJulia Organisation and other contributors."
)

deploydocs(
deps = Deps.pip("mkdocs", "pygments", "mkdocs-material"),
repo = "github.com/BioJulia/BioCore.jl.git",
julia = "0.5",
julia = "0.6",
osname = "linux",
latest = "master"
target = "build",
deps = nothing,
make = nothing
)
16 changes: 16 additions & 0 deletions docs/src/testing.md
@@ -0,0 +1,16 @@
# Using the BioJulia format specimen archive

BioJulia maintains an archive of specimen files in different file formats
[here](https://github.com/BioJulia/BioFmtSpecimens).

To use these file specimens when creating unit tests for a package,
use the `bio_fmt_specimens` method to download / update the datasets, and return
the paths of specimen files that match some conditions (specified using
a function).

```@docs
bio_fmt_specimens
```

Then you can iterate through the returned paths and use them with whatever IO
methods you like.
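
As a rough illustration of the filter argument, the sketch below applies predicates to a small in-memory list of specimen entries. The entries, filenames, and tag values are made up for the example, and the syntax targets current Julia releases; real entries come from each format's `index.yml` in the archive.

```julia
# Hypothetical stand-in for a parsed `index.yml`: one Dict per specimen.
specimens = [
    Dict{Any,Any}("filename" => "good.fasta", "valid" => true,
                  "origin" => "example", "tags" => "dna"),
    Dict{Any,Any}("filename" => "bad.fasta", "valid" => false,
                  "origin" => "example", "tags" => "dna",
                  "comments" => "truncated record"),
    Dict{Any,Any}("filename" => "prot.fasta", "valid" => true,
                  "origin" => "example", "tags" => "protein"),
]

# A filter takes one specimen Dict and returns a Bool.
is_valid(x) = x["valid"] == true
# "comments" is optional, so check for the key instead of indexing it.
has_comment(x) = haskey(x, "comments")

valid_files = [x["filename"] for x in specimens if is_valid(x)]
commented_files = [x["filename"] for x in specimens if has_comment(x)]
```

Any function of this shape, for example `is_valid`, can be passed as the `fn` argument of `bio_fmt_specimens`.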
84 changes: 77 additions & 7 deletions src/Testing.jl
@@ -8,16 +8,86 @@

module Testing

function get_bio_fmt_specimens(commit="222f58c8ef3e3480f26515d99d3784b8cfcca046")
path = joinpath(dirname(dirname(@__FILE__)), "BioFmtSpecimens")
if !isdir(path)
run(`git clone https://github.com/BioJulia/BioFmtSpecimens.git $(path)`)
const FMT_SPECIMEN_PATH = joinpath(dirname(dirname(@__FILE__)), "BioFmtSpecimens")

"""
get_bio_fmt_specimens(checkout = "master", auto_checkout = true, fresh = false)
Install/update and return the path of BioJulia's biological data format
specimen archive.
When the BioFmtSpecimens archive is fetched from the web, the branch or tag
specified by `checkout` is checked out for use, unless `auto_checkout` is
true, in which case the latest tagged release of the BioFmtSpecimens archive
is checked out instead.
If `fresh` is set to true, this will force a deletion of any currently installed
BioFmtSpecimens archive repository, and fetch it from the web again. This may
be useful if updating the installed BioFmtSpecimens archive is problematic.
"""
function get_bio_fmt_specimens(checkout = "master", auto_checkout = true, fresh = false)
if fresh
rm(FMT_SPECIMEN_PATH, force = true, recursive = true)
end
cd(path) do
if !isdir(FMT_SPECIMEN_PATH)
run(`git clone https://github.com/BioJulia/BioFmtSpecimens.git $(FMT_SPECIMEN_PATH)`)
end
cd(FMT_SPECIMEN_PATH) do
if auto_checkout
(so, si, pr) = readandwrite(`git describe --tags`)
checkout = readline(so)
end
run(`git fetch origin`)
run(`git checkout $(commit)`)
run(`git checkout $(checkout)`)
end
return FMT_SPECIMEN_PATH
end

"""
bio_fmt_specimens(format::String, fn::Function, checkout = "master", auto_checkout = true, fresh = false)
Get the paths for file format specimen files from BioJulia's biological data
format specimen archive.
Returns a vector of paths for specimen files of a given `format` (e.g. FASTA)
that satisfy the filter `fn`. `fn` should accept a single argument,
which is a `Dict{Any, Any}`. Each `Dict{Any, Any}` represents a format file
specimen from BioJulia's biological data format specimen archive, and has the
following fields:
* **"filename"**: Specimen filename.
* **"valid"**: `true` or `false`, indicates whether the example conforms to the format.
* **"origin"**: The contributor or source from which a specimen was taken.
* **"tags"**: Zero or more words used to group specimens by shared features.
* **"comments"**: (Optional) Any additional information that might be of interest.
When the BioFmtSpecimens archive is fetched from the web or updated, the branch or tag
specified by `checkout` is checked out for use, unless `auto_checkout` is
true, in which case the latest tagged release of the BioFmtSpecimens archive
is checked out instead.
If `fresh` is set to true, this will force a deletion of any currently installed
BioFmtSpecimens archive repository, and fetch it from the web again. This may
be useful if updating the installed BioFmtSpecimens archive is problematic.
```@example
# Get paths for FASTA format specimens which are examples of a valid file.
bio_fmt_specimens("FASTA", (x) -> x["valid"] == true)
```
"""
function bio_fmt_specimens(format::String, fn::Function, checkout = "master", auto_checkout = true, fresh = false)
get_bio_fmt_specimens(checkout, auto_checkout, fresh)
specimens = YAML.load_file(joinpath(FMT_SPECIMEN_PATH, format, "index.yml"))
filtered_specimens = Vector{String}(length(specimens))
fsi = 0
for specimen in specimens
if fn(specimen)
fsi += 1
filtered_specimens[fsi] = joinpath(FMT_SPECIMEN_PATH, specimen["filename"])
end
end
return path
resize!(filtered_specimens, fsi)
return filtered_specimens
end

function random_array(n::Integer, elements, probs)
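
The interaction between `checkout` and `auto_checkout` described in the docstrings above can be sketched as a small pure function. This is only an illustration written for current Julia releases, not the code in this commit: `resolve_ref` and `latest_tag` are hypothetical names, and the `--abbrev=0` flag is an assumption added here so that `git describe` prints a bare tag name.

```julia
# Hypothetical helper: decide which ref gets checked out.
# `tag` is passed in so the decision can be checked without running git.
resolve_ref(checkout, auto_checkout, tag) = auto_checkout ? tag : checkout

# Hypothetical helper: name of the nearest tag reachable from HEAD in `repo`.
# Needs `git` on the PATH and an existing clone; it is defined but not called here.
latest_tag(repo) = readchomp(Cmd(`git describe --tags --abbrev=0`; dir = repo))

ref_auto = resolve_ref("master", true, "v1.0.0")     # the tag wins
ref_manual = resolve_ref("master", false, "v1.0.0")  # the explicit ref wins
```

Keeping the decision separate from the git call makes the default behaviour (latest tag unless `auto_checkout` is false) easy to test in isolation.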
