Seasonal subsets of the cube

Another, probably quite simple, question:

Is there an uncomplicated way in Julia for subsetting the cube so that only data from certain months of each year remain in the data cube?

With the normal

d = subsetcube(c, Lon = (73.5, 104.75),
                  Lat = (25.92, 39.83),
                  time = (Date(2010,1,1), Date(2012,12,31)))

I can define a start and end date for subsetting the cube, but it is not possible to extract, for example, only the summer months June, July and August of each year.


No, currently there is no simple way to do this. There are 2 strategies to deal with this:

  1. If you do operations on time series and are only interested in summer months, just do the filtering inside the mapslices function. E.g.:
using ESDL
using Dates
c = Cube()

d = subsetcube(c, region="Germany", var="gross", time=2001:2010)
timeaxis = getAxis("Time", d)
summerindices = findall(i->month(i) in (6,7,8), timeaxis.values);

using Statistics

# Mean over the summer time steps of a single time series
function summermean(gpp_all, summerindices)
    gpp_summeronly = gpp_all[summerindices]
    mean(gpp_summeronly)
end

mapslices(summermean, d, summerindices, dims="Time")

Here you pre-compute the indices of the time steps that fall into the summer months and pass them to mapslices as an additional argument.
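The index pre-computation can be tried without a data cube. The following is only a minimal sketch using the Dates and Statistics standard libraries: the 8-daily `times` vector and the synthetic `series` are assumptions standing in for `timeaxis.values` and one time series of the cube.

```julia
using Dates
using Statistics

# Hypothetical 8-daily time axis for 2001-2002 (stand-in for timeaxis.values)
times = collect(Date(2001, 1, 1):Day(8):Date(2002, 12, 31))

# Synthetic series: each value is the month number of its time step
series = Float64.(month.(times))

# Positions of all time steps that fall into June, July or August
summerindices = findall(t -> month(t) in (6, 7, 8), times)

# Same shape as the function handed to mapslices: series first, indices second
summermean(x, idx) = mean(x[idx])

result = summermean(series, summerindices)
```

Because the indices depend only on the time axis, they are computed once and reused for every grid cell.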

  2. Another way to filter by season would be to use the tabular cube interface:

c = Cube()

gpp = subsetcube(c, region="Germany", var="gross", time=2001:2010)
nee = subsetcube(c, region="Germany", var="net_ecosystem", time=2001:2010)

tab = CubeTable(gpp = gpp, nee=nee, include_axes=("Time","Lon","Lat"))

This returns an iterable table. This means the data is not loaded into memory, but you can loop over the rows of the table and do aggregations and other statistics. By the way, since the data is small in this case, you can simply convert it into a DataFrame:

using DataFrames
df = DataFrame(tab)

However, you can still do statistics if your data is too large to fit in memory. There is the very nice OnlineStats package and we have written a small function that helps you fit these to Datacube tables.
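To illustrate why such statistics work on data that does not fit in memory, here is a minimal sketch of a running mean in plain Julia. The `RunningMean` type and `update!` function are hypothetical names invented for this illustration; they are not the OnlineStats API, only the same idea of updating a statistic one value at a time.

```julia
# Running mean that sees one value at a time and keeps only two numbers in memory.
# Hypothetical illustration type, not part of OnlineStats or ESDL.
mutable struct RunningMean
    n::Int          # number of values seen so far
    value::Float64  # current mean
end
RunningMean() = RunningMean(0, 0.0)

# Incremental update: new_mean = old_mean + (x - old_mean) / n
function update!(m::RunningMean, x::Real)
    m.n += 1
    m.value += (x - m.value) / m.n
    return m
end

m = RunningMean()
for x in (2.0, 4.0, 6.0, 8.0)
    update!(m, x)
end
```

The values never need to be stored together, which is what makes the approach suitable for looping over the rows of a large table.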

First we define a function that defines the splitting criterion and then we apply the cubefittable function:

getseason(row) = ("Winter","Spring","Summer","Autumn","Winter")[month(row.Time) ÷ 3 + 1]

using OnlineStats
m = cubefittable(tab,Mean,:gpp, by=(:Lon,:Lat,getseason))

Note that the Mean here comes from the OnlineStats package. You might also want to have a look at the WeightedOnlineStats package in case spatial weights become important.
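The season lookup can be checked on its own, without ESDL or OnlineStats. The sketch below assumes rows are NamedTuples with a `Time` and a `gpp` field; the `rows` vector and the `seasonalmeans` helper are made up for this illustration, and `div(x, 3)` is the same integer division as `x ÷ 3`.

```julia
using Dates

const SEASONS = ("Winter", "Spring", "Summer", "Autumn", "Winter")

# Integer division maps months 1-2 to 1, 3-5 to 2, 6-8 to 3, 9-11 to 4 and 12 to 5
getseason(row) = SEASONS[div(month(row.Time), 3) + 1]

# Hypothetical rows: one per month of 2005, with gpp equal to the month number
rows = [(Time = Date(2005, mo, 15), gpp = Float64(mo)) for mo in 1:12]

# Group the rows by season and average gpp within each group
function seasonalmeans(rows)
    sums = Dict{String,Float64}()
    counts = Dict{String,Int}()
    for r in rows
        s = getseason(r)
        sums[s] = get(sums, s, 0.0) + r.gpp
        counts[s] = get(counts, s, 0) + 1
    end
    return Dict(s => sums[s] / counts[s] for s in keys(sums))
end

means = seasonalmeans(rows)
```

The fifth tuple entry is what lets December share the "Winter" label with January and February.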