cellsize returns erroneous results for grids with fine spacing [no reprojeciton] #764

alex-s-gardner · 2024-09-25T20:23:50Z

This makes sense as gridsize is increasing linearly with latitude

using Rasters
using Rasters.Lookups
import GeoFormatTypes as GFT
using Plots

dX = 0.1
dY = -0.1
lon = X(Projected(166.:dX:168.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-80.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY ), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])

if we decrease the step size in X we start to see some jutter

dX = 0.0006
dY = -0.1
lon = X(Projected(166.0:dX:168.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-80.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])

and things really fall apart when we decrease step size in X and Y

dX = 0.0006
dY = -0.0003
lon = X(Projected(166.0:dX:167.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-79.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])

The text was updated successfully, but these errors were encountered:

asinghvi17 · 2024-09-26T21:19:50Z

@alex-s-gardner can you confirm this is on v0.11.8 of Rasters.jl?

alex-s-gardner · 2024-09-26T21:28:07Z

Yes, it is v0.11.8

asinghvi17 · 2024-09-26T22:41:07Z

~~I wonder if we need to densify and subsample points on the grid? I would expect that to have an effect as we increase dx and dy though, not as we decrease...~~

Rasters.jl/ext/RastersArchGDALExt/cellsize.jl

Lines 71 to 77 in 8d13c18

    
           GI.LinearRing([ 
        
               (xb[1], yb[1]),  
        
               (xb[2], yb[1]),  
        
               (xb[2], yb[2]),  
        
               (xb[1], yb[2]), 
        
               (xb[1], yb[1]) 
        
           ]))

nevermind, I just saw that the crs there is in long-lat. It might be something with the circular area calculation then?

alex-s-gardner · 2024-09-26T23:13:43Z

Seems like a precision issue to me... is Float64 geometry being downgraded anywhere?

asinghvi17 · 2024-09-26T23:19:09Z

No, but it is being converted to radians and then multiplied by pi

rafaqz · 2024-09-27T06:21:04Z

Any ideas @tiemvanderdeure ?

tiemvanderdeure · 2024-09-27T07:53:49Z

Thanks for pointing this out. I don't know exactly but it seems like something is being rounded somewhere?

Anyway, it's been on my list for a little while to make a specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells. That should be much faster and avoids this type of error.

The current implementation just treats each cell as a linearring, which makes it super generic but slow and (apparently) introduces some precision issues.

asinghvi17 · 2024-09-27T21:36:05Z

specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells

Could that also apply to any CRS that can map to lat/long via mapcrs?

rafaqz · 2024-09-27T22:10:01Z

Yes we should be able to do the conversion for any grid aligned crs

tiemvanderdeure · 2024-09-28T07:26:58Z

specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells

Could that also apply to any CRS that can map to lat/long via mapcrs?

It depends on this line, which (I think) should return true for any lat/long crs:

Rasters.jl/ext/RastersArchGDALExt/cellsize.jl

Line 57 in 8d13c18

    
           if convert(CoordSys, crs(dims)) == CoordSys("Earth Projection 1, 104") # check if need to reproject

But there may be a better way to test this?

rafaqz · 2024-09-28T08:10:33Z

For mappedcrs we test if changing the projection of two points with the same X and different Y gets points with the same X. But then the user has specified the conversion explicitly and we are just checking it's not really wrong, so its not the same situation.

tiemvanderdeure · 2024-09-28T08:41:53Z

Okay, I've never used mappedcrs and don't think I've ever seen it in the wild, so just trying to wrap my head around this.

For mappedcrs we test if changing the projection of two points with the same X and different Y gets points with the same X.

Are there ever cases where that holds for two projections with a different CoordSys?

I know that there are some projections which use lat/lon but are shifted/rotated compared to EPSG:4326, but I think all of those have Earth Projection 1. My thinking was that, since we're assuming the world is spherical anyway, we don't need to do any reprojecting in those cases.

rafaqz · 2024-09-28T09:03:57Z

I don't really know CoordSys well. But I imagine yes? Anything grid aligned can be converted to another with independent transformations of X and Y. So e.g. a cylindrical equal area projection can be converted to lat/lon by independently transforming it's xs and ys. In lat/lon X will have regular steps, Y will be irregular. Rasters.reproject should try to do this for you.

But it means you can use the same math you are using for lat/lon after the transformation of the lookups.

tiemvanderdeure · 2024-09-28T09:33:11Z

I don't really know CoordSys well.

I don't either and I can't really find much about it, so if you know a better check I'd be happy.

So basically we can just check if the mappedcrs is lon-lat and if it reproject the lookups and calculate the cellsize for those. That makes sense to me.

rafaqz · 2024-09-28T09:39:39Z

Yeah. There is probably a more correct way... We need Julia crs parsers that give you more useful information more easily.

tiemvanderdeure · 2024-09-28T09:43:43Z

So e.g. a cylindrical equal area projection can be converted to lat/lon by independently transforming it's xs and ys.

Do you have an example of a raster like this, so I can add a test? I can't find any in the docs.

rafaqz · 2024-09-28T10:32:20Z

You can resample to it?

asinghvi17 · 2024-10-03T00:55:34Z

I just came across google/s2geometry#190, it seems pretty insightful - maybe we should use one of the approaches there, or convert early to completely spherical (x, y, z) Cartesian points?

asinghvi17 · 2024-10-03T01:36:28Z

With the PR, this works correctly for lat/long. But if I resample the last example raster to web mercator, I get this:

tiemvanderdeure · 2024-10-03T09:10:35Z

I just came across google/s2geometry#190, it seems pretty insightful - maybe we should use one of the approaches there, or convert early to completely spherical (x, y, z) Cartesian points?

But even in that thread they mention their fallback is Girard's theorem, which is exactly what we use.

In this blogpost: https://www.johndcook.com/blog/2021/11/29/area-of-spherical-triangle/, Erikson's formula was basically equal to the estimate using just planar geometry for the area between some cities in Texas: "The triangle is so flat that numerical error has a bigger effect than the curvature of the earth." And here we're talking about the error that becomes relevant when gridcells are well below 1 km^2.

We actually already test for an equal-area projection with grid spacing of 100x100m (the test is if all grid cells are within 1% of 0.1 km^2).

My vote is for assuming flat space whenever grid cells get much smaller than that and just using the haversine formula.

With the PR, this works correctly for lat/long. But if I resample the last example raster to web mercator, I get this:

Could you copy the code?

tiemvanderdeure · 2024-10-03T09:15:29Z

Okay thinking about this some more, I'm not sure if the Haversine formula is numerically stable at this grid spacing either. It also involves a bunch of square roots and sin and cos

tiemvanderdeure · 2024-10-03T12:46:26Z

Did some more testing and I can report back that

yes, this is a numeric stability issue
yes, Eriksson's formula performs better in this particular example

I did this (_linearring_area is the current implementation):

xb = (166.0, 166.0006)
yb = (-78.0, -78.0003)

ring = [
    (xb[1], yb[1]), 
    (xb[2], yb[1]), 
    (xb[2], yb[2]), 
    (xb[1], yb[2]),
    (xb[1], yb[1])
]

ring_bf = map(p -> BigFloat.(p), ring)

eriksson(GI.LinearRing(ring))
_linearring_area(GI.LinearRing(ring); radius = 1)

eriksson(GI.LinearRing(ring_bf))
_linearring_area(GI.LinearRing(ring_bf); radius = 1)

which returns

julia> eriksson(GI.LinearRing(ring))
1.1399895306967292e-11

julia> _linearring_area(GI.LinearRing(ring); radius = 1)
2.2859936166241823e-11

julia> eriksson(GI.LinearRing(ring_bf))
1.139989369278327047449715369542151261981068071545916771493908664977035933833957e-11

julia> _linearring_area(GI.LinearRing(ring_bf); radius = 1)
1.140013862214309994513259890860796762002232266534835233057189872420615878310853e-11

So: when we use big floats both methods return the same answer, and this answer corresponds to what Eriksson's formula returns using Float64.

I implemented Eriksson's formula like this. It has all these dot products and cross products so I think it's going to allocate no matter what.

function eriksson(ring)
    points = GI.getpoint(ring)
    cartesian_points = lonlat_to_cartesian.(points)
    area = 0.0
    area += eriksson_triangle(cartesian_points[1], cartesian_points[2], cartesian_points[3])
    area += eriksson_triangle(cartesian_points[3], cartesian_points[4], cartesian_points[1])
    return area
end

function eriksson_triangle(a, b, c)
    t = abs(dot(a, cross(b, c)))
    t /= 1 + dot(b,c) + dot(c, a) + dot(a, b)

    # alternative that could be more precise but allocates more
    # t = abs(dot(a, (cross(b .- a, c .- a))) / dot(b .+ a, c .+ a))

    2*atan(t)
end

lonlat_to_cartesian(args) = lonlat_to_cartesian(args...)
function lonlat_to_cartesian(lon, lat)
    x = cosd(lat) * cosd(lon)
    y = cosd(lat) * sind(lon)
    z = sind(lat)
    return [x, y, z]
end

tiemvanderdeure · 2024-10-03T12:58:20Z

Just some observations from testing in this way:

calculation with Girard's theorem starts misbehaving when the height is lower than 0.001
it is more unstable for high/low latitudes

So the options are:

always use eriksson
implement both and let users decide
implement both and automatically switch when the area is less than let's say 1e-6 square degrees.

asinghvi17 · 2024-10-03T14:23:32Z

I implemented Eriksson's formula like this. It has all these dot products and cross products so I think it's going to allocate no matter what.

I usually use static vectors or tuples to get around that, you could probably try that, but StaticArrays is a pretty heavy dependency, and tuples don't have all the nice LinearAlgebra dispatches you'd need.

Although you could define a SphericalPoint type here that encodes those specific dispatches in a way where size is known at compile-time...that sounds like a lot of work :D

tiemvanderdeure · 2024-10-03T14:39:21Z

Yes or we can define a _dot and _cross that work on tuples of size 3 :)

asinghvi17 · 2024-10-03T14:45:42Z

Here's a super basic implementation that should get you what you need:

using LinearAlgebra

struct SphericalPoint{T <: Real}
	data::NTuple{T, 3}
end

SphericalPoint(x, y, z) = SphericalPoint((x, y, z))

# define the 4 basic mathematical operators elementwise on the data tuple
Base.:+(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .+ q.data)
Base.:-(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .- q.data)
Base.:*(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .* q.data)
Base.:/(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data ./ q.data)
# Define sum on a SphericalPoint to sum across its data
Base.sum(p::SphericalPoint) = sum(p.data)

# define dot and cross products
LinearAlgebra.dot(p::SphericalPoint, q::SphericalPoint) = sum(p * q)
function LinearAlgebra.cross(a::SphericalPoint, b::SphericalPoint)
	a1, a2, a3 = a
    b1, b2, b3 = b
	SphericalPoint((a2*b3-a3*b2, a3*b1-a1*b3, a1*b2-a2*b1))
end

tiemvanderdeure · 2024-10-03T14:48:48Z

okay let me benchmark this!

EDIT:
not bad!

julia> @btime eriksson($ring)
  1.045 μs (1 allocation: 176 bytes)
1.1399895306958715e-11

julia> @btime _linearring_area($ring; radius = 1)
  2.050 μs (0 allocations: 0 bytes)
2.2859936166241823e-11

EDIT2:
even better after I got rid of the final allocation

julia> @btime eriksson($ring)
  678.571 ns (0 allocations: 0 bytes)
1.1399895306958715e-11

asinghvi17 · 2024-10-03T14:57:24Z

Hmm, curious why there's still an allocation though

tiemvanderdeure · 2024-10-03T15:03:43Z

Hmm, curious why there's still an allocation though

because of this line, which isn't so hard to get rid of.

    cartesian_points = lonlat_to_cartesian.(points)

tiemvanderdeure · 2024-10-03T15:10:24Z

Also just reading back in google/s2geometry#190, and the reason they cite for needing to keep Girard's Theorem around is for points that are antipodal, which I don't think is ever going to happen with a Raster, so I think we can safely go for the option to just always use Eriksson's formula

tiemvanderdeure · 2024-10-03T15:45:57Z

I now implemented this in the same PR: #768

Let's continue the discussion there.

asinghvi17 · 2024-10-08T18:03:08Z

Switching cellsize to use Proj.jl instead of ArchGDAL gives us a 3x speedup in the spherical case, there's also a lot of low hanging fruit. (Just to keep the thread updated)

asinghvi17 · 2024-10-11T23:23:09Z

This can be closed now.

tiemvanderdeure mentioned this issue Sep 27, 2024

improve cellsize and switch from Archgdal to Proj #768

Merged

alex-s-gardner closed this as completed Oct 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cellsize returns erroneous results for grids with fine spacing [no reprojeciton] #764

cellsize returns erroneous results for grids with fine spacing [no reprojeciton] #764

alex-s-gardner commented Sep 25, 2024 •

edited

Loading

asinghvi17 commented Sep 26, 2024

alex-s-gardner commented Sep 26, 2024

asinghvi17 commented Sep 26, 2024 •

edited

Loading

alex-s-gardner commented Sep 26, 2024

asinghvi17 commented Sep 26, 2024

rafaqz commented Sep 27, 2024

tiemvanderdeure commented Sep 27, 2024

asinghvi17 commented Sep 27, 2024 •

edited

Loading

rafaqz commented Sep 27, 2024

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024 •

edited

Loading

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024 •

edited

Loading

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024

asinghvi17 commented Oct 3, 2024

asinghvi17 commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024 •

edited

Loading

tiemvanderdeure commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024 •

edited

Loading

tiemvanderdeure commented Oct 3, 2024

asinghvi17 commented Oct 3, 2024 •

edited

Loading

tiemvanderdeure commented Oct 3, 2024

asinghvi17 commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024 •

edited

Loading

asinghvi17 commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024

asinghvi17 commented Oct 8, 2024

asinghvi17 commented Oct 11, 2024

cellsize returns erroneous results for grids with fine spacing [no reprojeciton] #764

cellsize returns erroneous results for grids with fine spacing [no reprojeciton] #764

Comments

alex-s-gardner commented Sep 25, 2024 • edited Loading

asinghvi17 commented Sep 26, 2024

alex-s-gardner commented Sep 26, 2024

asinghvi17 commented Sep 26, 2024 • edited Loading

alex-s-gardner commented Sep 26, 2024

asinghvi17 commented Sep 26, 2024

rafaqz commented Sep 27, 2024

tiemvanderdeure commented Sep 27, 2024

asinghvi17 commented Sep 27, 2024 • edited Loading

rafaqz commented Sep 27, 2024

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024 • edited Loading

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024 • edited Loading

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024

tiemvanderdeure commented Sep 28, 2024

rafaqz commented Sep 28, 2024

asinghvi17 commented Oct 3, 2024

asinghvi17 commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024 • edited Loading

tiemvanderdeure commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024 • edited Loading

tiemvanderdeure commented Oct 3, 2024

asinghvi17 commented Oct 3, 2024 • edited Loading

tiemvanderdeure commented Oct 3, 2024

asinghvi17 commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024 • edited Loading

asinghvi17 commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024

tiemvanderdeure commented Oct 3, 2024

asinghvi17 commented Oct 8, 2024

asinghvi17 commented Oct 11, 2024

alex-s-gardner commented Sep 25, 2024 •

edited

Loading

asinghvi17 commented Sep 26, 2024 •

edited

Loading

asinghvi17 commented Sep 27, 2024 •

edited

Loading

rafaqz commented Sep 28, 2024 •

edited

Loading

rafaqz commented Sep 28, 2024 •

edited

Loading

tiemvanderdeure commented Oct 3, 2024 •

edited

Loading

tiemvanderdeure commented Oct 3, 2024 •

edited

Loading

asinghvi17 commented Oct 3, 2024 •

edited

Loading

tiemvanderdeure commented Oct 3, 2024 •

edited

Loading