Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve cellsize and switch from Archgdal to Proj #768

Merged
merged 45 commits into from
Oct 9, 2024

Conversation

tiemvanderdeure
Copy link
Contributor

Solves #764 and makes cellsize calculation much faster for lon-lat projections.

I'm considering getting rid of the ArchGDAL requirement for these projections as well.

Any ideas for further improvements @rafaqz @alex-s-gardner?

@rafaqz
Copy link
Owner

rafaqz commented Sep 27, 2024

How do we get rid of ArchGDAL?

@tiemvanderdeure
Copy link
Contributor Author

I was thinking we could just make a _crs2transform that just calls ArchGDAL.crs2transform if ArchGDAL is loaded and otherwise tells the user to load ArchGDAL. If the CRS is WGS84 then cellsize would never have to call _crs2transform

@rafaqz
Copy link
Owner

rafaqz commented Sep 27, 2024

But how do we deal with Proj/Wkt, a lot of different text strings can be wgs84

@tiemvanderdeure
Copy link
Contributor Author

But how do we deal with Proj/Wkt, a lot of different text strings can be wgs84

The way we already do it - I don't think this line of code needs ArchGDAL

if convert(CoordSys, crs(dims)) == CoordSys("Earth Projection 1, 104") # check if need to reproject

@rafaqz
Copy link
Owner

rafaqz commented Sep 27, 2024

Yeah that's ArchGDAL doing that conversion ;)

@tiemvanderdeure
Copy link
Contributor Author

tiemvanderdeure commented Sep 27, 2024

Yeah that's ArchGDAL doing that conversion ;)

I was thrown off by the GeoFormatTypes wrapper, but makes sense. Let's just leave it as it is, then.

@rafaqz
Copy link
Owner

rafaqz commented Sep 27, 2024

Yes ArchGDAL actually pirates Base.convert on GeoFormat, so the confusion is understandable. We could put it in a GeoFormatTypesArchGDAL extension now, but that wasn't possible at the time. We could also switch to using Proj.jl directly.

@alex-s-gardner
Copy link
Contributor

Can we take this opportunity to rename cellsize to cellarea and to return area in units of meters #747, also adding a units kwarg?

@alex-s-gardner
Copy link
Contributor

alex-s-gardner commented Sep 27, 2024

There is room for significant speed gains for small small grid spacing:

#For lat/lon aligned rasters with small grid spacing, this will be much faster and just as accurate:

# load packages
import Rasters: EPSG
using Rasters
using Rasters.Lookups
using DimensionalData


# define meters to lat/lon function
#"""
    meters2lonlat_distance(distance_meters, latitude_degrees)

Returns the decimal degree distance along latitude and longitude lines given a distance in 
meters and a latitude in decimal degrees.

# Example usage:
#```julia-repl
julia> distance_meters = 1000.0;  
julia> latitude_degrees = 45.0;  

julia> lat, lon = Altim.meters2lonlat_distance(distance_meters, latitude_degrees)
(0.008997741566866717, 0.012718328120254203)
#```
#"""
function meters2lonlat_distance(distance_meters, latitude_degrees)
    # Radius of the Earth in meters
    earth_radius = 6371000

    # Calculate the angular distance in radians
    angular_distance = distance_meters / earth_radius

    # Calculate the longitude distance using the Haversine formula
    longitude_distance = angular_distance * (180.0 / π) / cosd(latitude_degrees)

    latitude_distance = distance_meters / 111139

    return latitude_distance, longitude_distance
end

# build an example raster
dX = 0.1
dY = -0.1
lon = X(Projected(166.:dX:168.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-80.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY ), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))

# given a lat/lon raster with small grid spacing caclculate area
dlon = dims(ras, :X)
dlat = dims(ras, :Y)
lonlat_per_meter = meters2lonlat_distance.(Ref(1), dlat)

dist_lon = step(dlon) ./ getindex.(lonlat_per_meter, 1)
dist_lat = step(dlat) ./ getindex.(lonlat_per_meter, 2)
area = ones(dlon) * DimArray(dist_lon .* dist_lat, dlat)'

@rafaqz
Copy link
Owner

rafaqz commented Sep 27, 2024

Can we take this opportunity to rename cellsize to cellarea and to return area in units of meters #747, also adding a units kwarg?

Let's wait for a complete Unitful extension to add the units keyword to everything all at once. (in a few months when my PhD is done 😅)

I'm happy to rename but we should @deprecate the current name so it still works for a while.

@alex-s-gardner
Copy link
Contributor

I'm happy to rename but we should @deprecate the current name so it still works for a while.

given the incorrectness of cellsize it might actually be better to break it

@tiemvanderdeure
Copy link
Contributor Author

There is room for significant speed gains for small small grid spacing:

Have you compared to the implementation in this PR? For me it is giving similar speeds.

I can see the point of your implementation, but using the haversine formula also has downsides, since it doesn't account for the curvature of the earth. I know this isn't a big deal in most cases, but the error increases with the size of gridcells. So after (dis)aggregating a raster, the total area returned by cellarea would be different, which is not ideal.

Maybe we should implement it for other projections than lon-lat, but then we might need to do some more geometry.

@tiemvanderdeure
Copy link
Contributor Author

Can we take this opportunity to rename cellsize to cellarea and to return area in units of meters #747, also adding a units kwarg?

Let's wait for a complete Unitful extension to add the units keyword to everything all at once. (in a few months when my PhD is done 😅)

Maybe we should already start returning the result in metres though, to avoid another breaking change down the line. If we add a units keyword later then it would default to m^2.

@rafaqz
Copy link
Owner

rafaqz commented Sep 28, 2024

Yeah result in metres is fine, just not the units kw

@alex-s-gardner
Copy link
Contributor

Have you compared to the implementation in this PR? For me it is giving similar speeds.

Your implementation knocks the socks off of my proposal (2x faster) and is more accurate so ignore my recommendation. Excited to see this progressing... fast, useful and intuitive... couldn't ask for anything more

@rafaqz
Copy link
Owner

rafaqz commented Oct 1, 2024

Is this good to go?

@tiemvanderdeure
Copy link
Contributor Author

I think so! Can you or @asinghvi17 just nod at the logic in the line of code that reprojects to mappedcrs? I don't think we need an option to turn that behaviour off, right? I guess users vouch for the accuracy of that transformation by providing a mappedcrs.

@asinghvi17
Copy link
Collaborator

asinghvi17 commented Oct 5, 2024

Happy to report that this solves #764 in web mercator as well!
hello

asinghvi17 and others added 8 commits October 7, 2024 22:00
This commit creates a new `reproject` method that uses Proj to transform an array of values in place using `proj_trans_generic`, only allocating a Float64 output array.  It also performs the reproject check before actually transforming the array.

This decreases runtime on my machine from 4.168 ms (ArchGDAL) to 665.187 μs (Proj), and allocations from 35167 allocs: 1.254 MiB (ArchGDAL) to 108 allocs: 42.859 KiB (Proj).
This is a weakness of this method compared to the old ArchGDAL method.  Does this need a fix in Proj?
[WIP] switch cellarea and reproject to use Proj
@tiemvanderdeure tiemvanderdeure changed the title optimize cellsize for lon-lat projections improve cellsize and switch from Archgdal to Proj Oct 8, 2024
@asinghvi17
Copy link
Collaborator

Full changelog:

  • Switch cellarea and reproject to use Proj instead of ArchGDAL, increasing speed and decreasing the number of allocations.
  • Cellarea changes:
    • Rename cellsize to cellarea
    • Return output in meters by default (can be overridden by the radius in the manifold, so Spherical(radius_of_earth_in_km) gives you are in km, for example. This could be made more user friendly in the future.
    • Allow "algorithm choice" using GeometryOpsCore Manifold types (planar / spherical)
    • Use Eriksson's formula for spherical triangle area to compute the area of a grid cell using Cartesian (x, y, z) coordinates for spherical cellarea, increasing precision and providing correct cell area on at least 5mx5m grid cells (could be smaller, haven't tested)
    • Switch to using sind and cosd rather than deg2rad, since they use extended-precision deg2rad internally
    • Use a direct method to get cell area for lat-long (geographic) projections, greatly simplifying the process
  • Reproject changes:
    • Switch from transforming vectors of points out-of-place with ArchGDAL to transforming a single array of x or y values in-place with Proj's proj_trans_generic. This drastically reduces allocations and memory usage, and improves runtime by 6x.
    • Use Proj's geographic / degree-unit checks to figure out if the CRS is long-lat. This currently checks explicitly for the "degree" unit name, but this could be expanded in the future.

@tiemvanderdeure
Copy link
Contributor Author

The test failures are unrelated and already fixed in #780. There is still room for improvement performance-wise but I think we should merge this now. We can make a different PR for further improvements.

src/methods/reproject.jl Outdated Show resolved Hide resolved
@rafaqz
Copy link
Owner

rafaqz commented Oct 9, 2024

Thanks everyone!

@rafaqz rafaqz merged commit 1f8d0e6 into rafaqz:main Oct 9, 2024
1 check passed
@alex-s-gardner
Copy link
Contributor

Woot woot ! Fantastic effort.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants