Overhead in vrt() #1258

Closed
kadyb opened this issue Aug 11, 2023 · 4 comments

Comments

@kadyb
Contributor

kadyb commented Aug 11, 2023

Maybe there is room for optimizing the vrt() function?

library("sf")
library("terra") # 1.7.39
gdal() # 3.6.2

n = 4000
r = rast(ncols = n, nrows = n, vals = rnorm(n ^ 2))
filename = tempfile(fileext = "_.tif")
ff = makeTiles(r, 100, filename) # 1600 tiles
vrt = tempfile(fileext = ".vrt")

system.time({ terra::vrt(ff, filename = vrt) })
#> user  system elapsed 
#> 8.66    7.55   16.21

system.time({ sf::gdal_utils(util = "buildvrt", source = ff, destination = vrt) })
#> user  system elapsed
#> 2.54    2.84    5.39
@dfriend21
Contributor

I've also noticed that reading in a ".vrt" that's already been created and that references many rasters takes a while - at least it takes longer than I expect, although I'm not familiar with what's going on under the hood, so I'm not sure what a reasonable expectation is.

library(terra)
#> terra 1.7.39

r <- rast(system.file("ex/elev.tif", package = "terra"))
paths <- makeTiles(r, 3, tempfile(fileext = ".tif"))
length(paths)
#> [1] 960
system.time(r_vrt <- vrt(paths))
#>    user  system elapsed 
#>    3.44    2.66    6.36
system.time(rast(sources(r_vrt)))
#>    user  system elapsed 
#>    2.11    1.65    4.06

Created on 2023-08-24 with reprex v2.0.2

@rhijmans
Member

I think the difference is because terra::vrt returns a SpatRaster:

system.time(x <- terra::vrt(ff, filename = vrt) )
#   user  system elapsed 
#  14.94   21.25   36.25 

Which is about the same duration as

system.time({ sf::gdal_utils(util = "buildvrt", source = ff, destination = vrt) })
#   user  system elapsed 
#   4.94    7.47   12.44 
system.time(y <- terra::rast(vrt) )
#   user  system elapsed 
#   9.47   15.17   24.64 

There could be an option to only create the .vrt file and not return a SpatRaster.
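As a quick check of that comparison, the elapsed times reported above can be added up. This is only arithmetic on the numbers already posted in this comment, not a new benchmark; the variable names are made up for illustration.

```r
# Elapsed times (seconds) copied from the timings in this comment
buildvrt_elapsed <- 12.44  # sf::gdal_utils(util = "buildvrt", ...)
rast_elapsed     <- 24.64  # terra::rast(vrt)
vrt_elapsed      <- 36.25  # terra::vrt(ff, filename = vrt)

# Writing the .vrt and then opening it, done as two separate steps
two_step <- buildvrt_elapsed + rast_elapsed
two_step                # 37.08
two_step - vrt_elapsed  # 0.83
```

The two-step total is within about a second of the single `terra::vrt()` call, which is consistent with most of the extra time being spent opening the SpatRaster rather than writing the file.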

rhijmans added a commit that referenced this issue Aug 26, 2023
@rhijmans
Member

You can now request that the filename is returned instead of a SpatRaster:

f <- terra::vrt(ff, filename = vrt, return_filename = TRUE)
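A minimal sketch of how this option might be used to postpone opening the mosaic, assuming a terra version that has `return_filename` (the changelog lists it under 1.7-46). The raster size, tile size, and variable names are arbitrary choices for illustration.

```r
library(terra)

# Small illustrative raster; size and tile layout are arbitrary
r <- rast(ncols = 100, nrows = 100, vals = 1:10000)
tiles <- makeTiles(r, 50, tempfile(fileext = "_.tif"))

vrt_file <- tempfile(fileext = ".vrt")

# Only write the .vrt file; the file name is returned, not a SpatRaster
f <- vrt(tiles, filename = vrt_file, return_filename = TRUE)

# Open the mosaic later, only when it is actually needed
x <- rast(f)
```

This keeps the cost of building the file separate from the cost of opening it, which matters mostly when the VRT references many tiles.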

@kadyb
Contributor Author

kadyb commented Aug 27, 2023

Thanks! Now I get:

system.time({ terra::vrt(ff, filename = vrt) })
#> user  system elapsed 
#> 8.17    7.70   15.87

system.time({ terra::vrt(ff, filename = vrt, return_filename = TRUE) })
#> user  system elapsed 
#> 3.30    2.45    5.75

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Dec 10, 2024
# version 1.7-83

## bug fixes

- `flip(direction="vertical")` failed in some cases
  [#1518](rspatial/terra#1518) by Ed Carnell

- `zonal(as.raster=TRUE)` failed when the zonal raster was categorical
  [#1514](rspatial/terra#1514) by Jessi L
  Brown

- `distance<data.frame,data.frame>` and `<matrix,matrix>` ignored the
  unit
  argument. [#1545](rspatial/terra#1545) by
  Wencheng Lau-Medrano

- NetCDF files with month time-step encoded from 0-11 made R crash
  [#1544](rspatial/terra#1544) by Martin
  Holdrege

- `split<SpatVector>` only worked well if the split field was of type
  character. [#1530](rspatial/terra#1530) by
  Igor Graczykowski

- `gridDist` (and probably some other methods) emitted a "cannot
  overwrite existing file" error when processing large datasets
  [#1522](rspatial/terra#1522) by Clare
  Pearson

- `terrain` did not accept multiple variables
  [#1561](rspatial/terra#1561) by Michael
  Mahoney

- `rotate` was vulnerable to an integer overflow
  [#1562](rspatial/terra#1562) by Sacha
  Ruzzante

- `getTileExtents` could return overlapping tiles or tiles with gaps
  due to floating point
  imprecision. [#1564](rspatial/terra#1564)
  by Michael Sumner


## enhancements

- `as.list<SpatRasterDataset>` sets the names of the list
  [#1513](rspatial/terra#1513)

- a SpatVectorCollection can now be subset with its names; and if made
  from a list it takes the names from the list.
  [#1515](rspatial/terra#1515) by jedgroev

- argument `fill_range` to `plot<SpatRaster>` and `plot<SpatVector>` to
  use the color of the extreme values of the specified range
  [#1553](rspatial/terra#1553) by Mike
  Koontz

- `plet<SpatRaster>` can now handle rasters with a "local" (Cartesian)
  CRS. [#1570](rspatial/terra#1570) by
  Augustin Lobo.

## new

- `map-region` returns the coordinates of the axes position of a map
  created with `plot<Spat*>`
  [https://github.com/rspatial/terra/issues/1517](https://github.com/rspatial/terra/issues/1517)
  by Daniel Schuch

- `polys<leaflet>` method
  [#1543](rspatial/terra#1543) by Márcia
  Barbosa

- `plot<SpatVectorCollection>` method
  [#1532](rspatial/terra#1532) by jedgroev

- `add_mtext` to add text around the margins of a
  map. [#1567](rspatial/terra#1567) by
  Daniel Schuch

# version 1.7-78

Released 2024-05-22

## bug fixes

- `writeVector` and `readVector` better handle empty geopackage layers
  [#1426](rspatial/terra#1426) by Andrew
  Gene Brown.

- `writeCDF` only wrote global variables if there was more than one
  [#1443](rspatial/terra#1443) by Daniel
  Schlaepfer

- `rasterize` with "by" returned odd layernames
  [#1435](rspatial/terra#1435) by Philippe
  Massicotte

- `convHull`, `minCircle` and `minRect` with a zero-row SpatVector
  crashed R [#1445](rspatial/terra#1445) by
  Andrew Gene Brown

- `rangeFill` with argument `circular=TRUE` did not work properly
  [#1460](rspatial/terra#1460) by Alice

- `crs(describe = TRUE)` returned a mis-ordered extent
  [#1485](rspatial/terra#1485) by Dimitri
  Falk

- `tapp` with a custom function and an index like "yearmonths" could
  shift time for not considering the time
  zone. [#1483](rspatial/terra#1483) by Finn
  Roberts

- `plot<SpatRaster>` could fail when there were multiple values with
  very small differences
  [#1491](rspatial/terra#1491) by srfall

- `as.data.frame<SpatRaster>` with "xy=TRUE" and "wide=FALSE" could
  fail if coordinates were very similar
  [#1476](rspatial/terra#1476) by Pascal
  Oettli

- `rasterizeGeom` now returns the correct layer name
  [#1472](rspatial/terra#1472) by
  HRodenhizer

- `cellSize` with "mask=TRUE" failed if the output was to be written
  to a temp file
  [#1496](rspatial/terra#1496) by Pascal
  Sauer

- `ext<SpatVectorProxy>` did not return the full extent
  [#1501](rspatial/terra#1501) by
  erkent-carb


## enhancements

- `extract` has new argument "small=TRUE" to allow for strict use of
  "touches=FALSE"
  [#1419](rspatial/terra#1419) by Floris
  Vanderhaeghe.

- `as.list<SpatRaster>` has new argument "geom=NULL"

- `rast<list>` now recognizes (x, y, z) base R "image" structures
  [stackoverflow](https://stackoverflow.com/questions/77949551/rspatial-convert-a-grid-list-to-a-raster-using-terra)
  by Ignacio Marzan.

- `inset` has new arguments "offset" and "add"
  [#1422](rspatial/terra#1422) by Armand-CT

- `expanse<SpatRaster>` has argument `usenames`
  [#1446](rspatial/terra#1446) by Bappa Das

- the default color palette is now `terra::map.pal("viridis")` instead
  of `terrain.colors`. The default can be changed with
  `options(terra.pal=...)`
  [#1474](rspatial/terra#1474) by Derek
  Friend

- `as.list<SpatRasterDataset>` now returns a named
  list. [#1513](rspatial/terra#1513) by Eric
  R. Scott


## new

- `bestMatch<SpatRaster>` method

- argument "pairs=TRUE" to `cells` [https://github.com/rspatial/terra/issues/1487](https://github.com/rspatial/terra/issues/1487) by Floris Vanderhaeghe

- `add_grid` to add a grid to a map


# version 1.7-71

Released 2024-01-31

## bug fixes

- `k_means` did not work if there were NAs
  [#1314](rspatial/terra#1314) by Jakub
  Nowosad

- `layerCor` with a custom function did not work anymore
  [#1387](rspatial/terra#1387) by Jakub
  Nowosad

- `plet` broke when using "panel=TRUE"
  [#1384](rspatial/terra#1384) by Elise
  Hellwig

- using /vsis3/ to open a SpatRaster did not work
  [#1382](rspatial/terra#1382) by Mike
  Koontz

- `plot<SpatRaster>(add=TRUE)` sampled the raster data without
  considering the extent of the
  map. [#1394](rspatial/terra#1394) by
  Márcia Barbosa

- `plot<SpatRaster>(add=TRUE)` now only considers the first layer of a
  multi-layer SpatRaster
  [#1395](rspatial/terra#1395) by Márcia
  Barbosa

- `set.cats` failed when a tibble was used instead of a data.frame
  [#1406](rspatial/terra#1406) by Mike
  Koontz

- `polys` argument "alpha" was ignored if a single color was
  used. [#1413](rspatial/terra#1413) by
  Derek Friend

- `query` ignored the "vars" argument if all rows were
  selected. [#1398](rspatial/terra#1398) by
  erkent-carb.

- `spatSample` ignored "replace=TRUE" with random sampling,
  na.rm=TRUE, and a sample size larger than the non NA
  cells. [#1411](rspatial/terra#1411) by
  Babak Naimi

- `spatSample` sometimes returned fewer values than requested and
  available for lonlat
  rasters. [#1396](rspatial/terra#1396) by
  Márcia Barbosa.


## enhancements

- `vect<character>` now has argument "opts" for GDAL open options,
  e.g. to declare a file
  encoding. [#1389](rspatial/terra#1389) by
  Mats Blomqvist

- `plot(plg=list(tic=""))` now allows choosing alternative continuous
  legend tic-mark styles ("in", "out", "through" or "none")

- `makeTiles` has new argument "buffer"
  [#1408](rspatial/terra#1408) by Joy
  Flowers.


## new

- `prcomp<SpatRaster>` method
  [#1361](rspatial/terra#1361 (comment))
  by Jakub Nowosad

- `add_box` to add a box around the map. The box is drawn where the
  axes are, not around the plotting region.

- `getTileExtents` provides the extents of tiles. These may be used in
  parallelization. See [#1391](https://github.com/rspatial/terra/issues/1391)
  by Alex Ilich.


# version 1.7-65

Released 2023-12-15

## bug fixes

- `flip` with argument `direction="vertical"` failed in some cases with
   large rasters processed in chunks
   [0b714b0](rspatial/terra@0b714b0)
   by Dulci on
   [stackoverflow](https://stackoverflow.com/questions/77304534/rspatial-terraflip-error-when-flipping-a-multi-layer-spatrast-object)

- SpatRaster now correctly handles `NA & FALSE` and `NA | TRUE`
  [#1316](rspatial/terra#1316) by John Baums

- `set.names` wasn't working properly for SpatRasterDataset or
  SpatRasterCollection
  [#1333](rspatial/terra#1333) by Derek Friend

- `extract` with argument "layer" not NULL shifted the layers
  [#1332](rspatial/terra#1332) by Ewan
  Wakefield

- `terraOptions` did not capture "memmin" on
  [stackoverflow](https://stackoverflow.com/questions/77552234/controlling-chunk-size-in-terra)
  by dww

- `rasterize` with points and a built-in function could crash if no
  field was used
  [#1369](rspatial/terra#1369) by
  anjelinejeline


## enhancements

- `mosaic` can now use `fun="modal"`

- `rast<matrix>` and `rast<data.frame>` now have option `type="xylz"`
  [#1318](rspatial/terra#1318) by Agustin
  Lobo

- `extract<SpatRaster,SpatVector>` can now use multiple summarizing
  functions [#1335](rspatial/terra#1335) by
  Derek Friend

- `disagg` and `focal` have more optimistic memory requirement
  estimation [#1334](rspatial/terra#1334) by
  Mikko Kuronen

## new

- `k_means<SpatRaster>` method
  [#1314](rspatial/terra#1314) by Agustin
  Lobo

- `princomp<SpatRaster>` method
  [#1361](rspatial/terra#1361) by Alex Ilich

- `has.time<SpatRaster>` method

- new argument "raw=FALSE" to `rast`, `sds`, and `sprc` to allow
  ignoring scale and offset
  [#1354](rspatial/terra#1354) by Insang Song


# version 1.7-55

Released 2023-10-14

## bug fixes

- `mosaic` ignored the filename argument if the SpatRasterCollection
  only had a single SpatRaster
  [#1267](rspatial/terra#1267) by Michael
  Mahoney

- Attempting to use `extract` with a raster file that had been deleted
  crashed R. [#1268](rspatial/terra#1268) by
  Derek Friend

- `split<SpatVector,SpatVector>` did not work well in all
  cases. [#1256](rspatial/terra#1256) by
  Derek Corcoran Barrios

- `intersect` with two SpatVectors crashed R if there was a date/time
variable [#1273](rspatial/terra#1273) by
Dave Dixon

- "values=FALSE" was ignored by
  `spatSample<SpatRaster>(method="weights")`
  [#1275](rspatial/terra#1275) by François
  Rousseu

- `coltab<-` again works with a list as value
[#1280](rspatial/terra#1280) by Diego
Hernangómez

- `stretch` with histogram equalization was not memory-safe
  [#1305](rspatial/terra#1305) by Evan Hersh

- `plot` now resets the "mar" parameter
  [#1297](rspatial/terra#1297) by Márcia
  Barbosa

- `plotRGB` ignored the "smooth" argument
  [#1307](rspatial/terra#1307) by Timothée
  Giraud


## enhancements

- argument "gdal" in `project` was renamed to "use_gdal"
  [#1269](rspatial/terra#1269) by Stuart
  Brown.

- SpatVector attributes can now be stored as an ordered factor
  [#1277](rspatial/terra#1277) by Ben Notkin

- `plot<SpatVector>` now uses an "interval" legend when breaks are
  supplied [#1303](rspatial/terra#1303) by
  Gonzalo Rizzo

- `crop<SpatRaster>` now keeps more metadata, including variable names
  [#1302](rspatial/terra#1302) by rhgof

- `extract(fun="table")` now returns an easier to use data.frame
[#1294](rspatial/terra#1294) by Fernando
Aramburu.


## new
- `metags<-` and `metags` to set arbitrary SpatRaster/file level
   metadata [#1304](https://github.com/rspatial/terra/issues/1304) by
   Francesco Chianucci

# version 1.7-46

Released 2023-09-06

## bug fixes

- `plot<SpatVector>` used the wrong main label in some cases
  [#1210](rspatial/terra#1210) by Márcia
  Barbosa

- `plotRGB` failed with an "ext=" argument
  [#1228](rspatial/terra#1228) by Dave Edge

- `rast<array>` failed badly when the array had less than three
  dimensions. [#1254](rspatial/terra#1254)
  by andreimirt.

- `all.equal` for a SpatRaster with multiple layers
[#1236](rspatial/terra#1236) by Sarah
Endicott

- `zonal(wide=FALSE)` could give wrong results if the zonal SpatRaster
  had "layer" as
  layername. [#1251](rspatial/terra#1251) by
  Jeff Hanson

- `panel` now supports argument "range"
  [#1241](rspatial/terra#1241) by Jakub
  Nowosad

- `rasterize` with `by=` returned wrong layernames if the by field was
  not sorted [#1266](rspatial/terra#1266) by
  Sebastian Dunnett

- `mosaic` with multiple layers was not correct
  [#1262](rspatial/terra#1262) by
  Jean-Romain


## enhancements

- `wrap<SpatRaster>` now stores color tables
  [#1215](rspatial/terra#1215) by Patrick
  Brown

- `global` now has a "maxcell" argument
  [#1213](rspatial/terra#1213) by Alex Ilich

- `layerCor` with fun='pearson' now returns output with the layer
  names [#1206](rspatial/terra#1206)

- `vrt` now has argument "set_names"
  [#1244](rspatial/terra#1244) by sam-a-levy

- `vrt` now has argument "return_filename"
  [#1258](rspatial/terra#1258) by Krzysztof
  Dyba

- `project<SpatRaster>` has new argument "by_util" exposing the GDAL
  warp utility [#1222](rspatial/terra#1222) by
  Michael Sumner.


## new
- `compareGeom` for list and SpatRasterCollection
  [#1207](rspatial/terra#1207) by Sarah
  Endicott

- `is.rotated<SpatRaster>` method
  [#1229](rspatial/terra#1229) by Andy Lyons

- `forceCCW<SpatVector>` method to force counter-clockwise orientation
  of polygons [#1249](rspatial/terra#1249)
  by srfall.

- `vrt_tiles` returns the filenames of the tiles in a vrt file
  [#1261](rspatial/terra#1261) by Derek
  Friend

- `extractAlong` to extract raster cell values for a line that are
  ordered along the
  line. [#1257](rspatial/terra#1257) by
  adamkc.