Next release

Aug 10, 2011 at 2:05 PM

We like the SDS library very much and are using it in a project with great success. However, a few things keep troubling us and are critical for the success of our project:

* Memory leak in the version of netcdf used by sds (newer versions of netcdf have this fixed) --> makes it impossible to use large datasets.

* Only unlimited (infinite-sized) dimensions can be defined --> results in slow reads, as all dimensions are unlimited.

* No user interface to chunk size --> results in slow reads in many cases

* No user interface to define chunk cache
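
To illustrate why chunk size matters for read speed, here is a minimal sketch in plain Python. It does not use SDS or netcdf; it only counts how many chunks a rectangular read would have to touch. The function name `chunks_touched`, the dimension order (time, y, x) and the chunk shapes are hypothetical, chosen purely for illustration.

```python
from math import prod

def chunks_touched(start, count, chunk_shape):
    """Number of chunks intersected by a read of `count` elements from `start`."""
    per_dim = []
    for s, c, k in zip(start, count, chunk_shape):
        first = s // k            # index of the first chunk touched in this dimension
        last = (s + c - 1) // k   # index of the last chunk touched in this dimension
        per_dim.append(last - first + 1)
    return prod(per_dim)

# Hypothetical variable with dimensions (time, y, x): read one full 64x64 frame.
start, count = (10, 0, 0), (1, 64, 64)

# Assumed chunk shapes, for comparison only.
tiny_chunks = chunks_touched(start, count, chunk_shape=(1, 1, 1))
frame_chunks = chunks_touched(start, count, chunk_shape=(1, 64, 64))

print(tiny_chunks, frame_chunks)
```

Under these assumptions, the same read touches thousands of chunks when every chunk holds a single value, but only one chunk when the chunk shape matches a frame. That is why control over chunk size (and the chunk cache) matters to us.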

In addition there are a few features we would very much like to have if possible:

* support for groups in netcdf

* support for multiple threads reading from same dataset (this is possible with the basic netcdf library)

I hope the first set of issues will be fixed in the next version of SDS.

I was also wondering whether there is an approximate timeframe for the next release, and which features will be included?

Aug 11, 2011 at 8:42 AM

Thank you for your interest in our library!

We are planning to publish the next SDS release in September. However, it is still unknown whether we will succeed in building NetCDF 4.1.3 for Windows, since there is no official Windows build (or at least none known to us). We have already found that the new version of NetCDF has no memory leak, so we are working hard to build it.

In any case, the next release will allow setting chunk sizes for NetCDF variables. Some other capabilities will probably be added too, but that depends on how quickly we move to the next version of NetCDF. We will consider your feature requests.

Regards,

Dmitry Voytsekhovskiy.

Aug 11, 2011 at 11:52 AM

jardar: can you please explain your interest in group support? Do you think groups have any advantage over plain naming conventions? So far I haven't seen any meaningful use of that feature.

Aug 12, 2011 at 2:07 PM

Hi Vassilyl

Thanks for the quick response! We are looking forward to the next release.

We use sds/netcdf to store oceanographic data. We typically have sensors that provide different kinds of data. As you say, everything could be implemented with naming conventions. However, this could easily become quite messy. That would not be much of a problem if the files were only used internally, but when sharing data files it makes sense to be able to group data.

I have two scenarios for a typical use of groups:

1)

When storing this data we typically have a time coordinate that is shared by many of our variables (we currently have about 20 variables). However, some of the data use different time bases, in which case it would make sense to have a group for each time base instead of a lot of different time coordinates.

2)

We also process the data, in which case we want to store the original data alongside the processed data. Here it would make sense to have one group for the unprocessed data and one for the processed data.
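
For comparison, the naming-convention alternative could look roughly like the sketch below, in plain Python. The `__` separator, the variable names and the helper `group_by_prefix` are all hypothetical; nothing here is part of SDS or netcdf. Every reader of the file would need to know and apply the same convention, which is the part that becomes messy when sharing files.

```python
def group_by_prefix(variable_names, sep="__"):
    """Build a nested dict of pseudo-groups from flat, separator-delimited names."""
    root = {}
    for name in variable_names:
        *groups, leaf = name.split(sep)
        node = root
        for g in groups:
            node = node.setdefault(g, {})
        node[leaf] = name  # the leaf maps back to the flat variable name
    return root

# Hypothetical flat variable names emulating a "raw" and a "processed" group.
flat = ["raw__time", "raw__velocity", "processed__time", "processed__velocity_avg"]
tree = group_by_prefix(flat)

print(sorted(tree))         # pseudo-group names
print(sorted(tree["raw"]))  # variables inside the "raw" pseudo-group
```

With real groups, this structure would be stored in the file itself instead of being reconstructed from names by each consumer.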

In addition, we try to comply with the CF conventions, in which naming conventions are used implicitly to display data automatically.

Best regards

Jardar

Aug 12, 2011 at 2:07 PM

Hi Dmitry

Thanks for the reply. We are looking forward to the next release.

Best regards

Jardar

Aug 16, 2011 at 9:46 AM

Hi

Another feature request is a speedup when appending data. Right now, committing our data takes a long time.

What we have is a 3-dimensional variable in which only one dimension (time) needs to be unlimited. The other two dimensions are known when the variable is created.

So, from my understanding of netcdf, it should be possible to get very high write speeds if we only append data along the time dimension. However, it seems that sds is not able to take advantage of this, perhaps because sds currently supports only unlimited dimensions?

Is this something that will be improved in the next version?

Jardar
