DataSet URI

A DataSet URI completely defines the source of data and its access mode. The DataSet URI uses the scheme "msds".
The main identification string has the following syntax:

msds:provider?param1=value1&param2=value2&...

For example:
msds:csv?file=output.csv&culture=en-US&appendMetadata=true
More formally, the URI format is:

DataSetURI ::= msds:provider[?parameters]
provider ::= provider-identifier
parameters ::= parameters&parameters | parameter-name=parameter-value
parameter-name ::= identifier
parameter-value ::= character-string

The provider-identifier and the set of parameter-names depend on the particular DataSet provider. A DataSet URI is case-sensitive.
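The grammar above can be illustrated with a small parser. This is only a sketch: MsdsUriSketch and its Parse method are hypothetical names that are not part of the library (the library's own DataSetUri class is the real implementation), and the sketch assumes that parameter values contain no "&" characters, because the grammar defines no escaping rules.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the DataSet library: splits a DataSet URI
// into its provider identifier and its parameters, following the grammar above.
public static class MsdsUriSketch
{
    private const string Scheme = "msds:";

    public static (string Provider, Dictionary<string, string> Parameters) Parse(string uri)
    {
        // The URI is case-sensitive, so the scheme is compared ordinally.
        if (uri == null || !uri.StartsWith(Scheme, StringComparison.Ordinal))
            throw new FormatException("A DataSet URI must start with \"msds:\".");

        string rest = uri.Substring(Scheme.Length);
        int question = rest.IndexOf('?');
        string provider = question < 0 ? rest : rest.Substring(0, question);
        if (provider.Length == 0)
            throw new FormatException("The provider identifier is missing.");

        // Parameter names are case-sensitive as well, hence the ordinal comparer.
        var parameters = new Dictionary<string, string>(StringComparer.Ordinal);
        if (question >= 0)
        {
            foreach (string pair in rest.Substring(question + 1).Split('&'))
            {
                // Split at the first '=' only, so a value may itself contain '='.
                int equals = pair.IndexOf('=');
                if (equals <= 0)
                    throw new FormatException("Expected name=value but found \"" + pair + "\".");
                parameters[pair.Substring(0, equals)] = pair.Substring(equals + 1);
            }
        }
        return (provider, parameters);
    }
}
```

Applied to the example URI msds:csv?file=output.csv&culture=en-US&appendMetadata=true, the sketch yields the provider "csv" and the three parameters file, culture, and appendMetadata.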

Opening a DataSet from a URI

DataSet constructors accept a string that is either a file path or a correct URI, that is, a URI that has the correct provider name
and set of parameters. For instance, CsvDataSet accepts both of the following strings:

c:\data\test.csv
msds:csv?file=c:\data\test.csv

However, the string "msds:nc?file=c:\data\test.csv" would cause an error because it specifies "nc" as the provider name instead of "csv".

The provider should have the DataSetProviderNameAttribute specifying its name. The DataSetProviderFileExtensionAttribute associates extensions with the provider. For example:

[DataSetProviderName("csv")]
[DataSetProviderFileExtension(".csv")] 
public class CsvDataSet : DataSet
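The equivalence between a file path and a URI can be sketched as a lookup from file extension to provider name. This is an illustration under stated assumptions, not the library's code: PathOrUriSketch, ToUri, and the lookup table are hypothetical, the table stands in for the associations declared with DataSetProviderFileExtensionAttribute, and matching extensions without regard to case is an assumption as well.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: turns a constructor argument (file path or DataSet URI)
// into a DataSet URI, choosing the provider by file extension.
public static class PathOrUriSketch
{
    // Stand-in for the associations declared with DataSetProviderFileExtensionAttribute.
    // Only ".csv" -> "csv" comes from the text; the library builds the real table.
    private static readonly Dictionary<string, string> ProviderByExtension =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".csv", "csv" },
        };

    public static string ToUri(string pathOrUri)
    {
        if (pathOrUri == null)
            throw new ArgumentNullException(nameof(pathOrUri));
        if (pathOrUri.StartsWith("msds:", StringComparison.Ordinal))
            return pathOrUri; // already a DataSet URI

        string extension = Path.GetExtension(pathOrUri);
        if (!ProviderByExtension.TryGetValue(extension, out string provider))
            throw new NotSupportedException("No provider is associated with \"" + extension + "\".");
        return "msds:" + provider + "?file=" + pathOrUri;
    }
}
```

With this table, the path c:\data\test.csv maps to msds:csv?file=c:\data\test.csv, and a string that already starts with "msds:" is returned unchanged.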

Every DataSet has the property DataSet.URI, which returns a URI of the DataSet instance.

Note to developers: The DataSetUri class implements basic functionality for parsing and verifying DataSet URIs. You should use
this class to embed support for URIs into a DataSet provider. Some standard flags that often appear in DataSet URIs are
described below in the Standard parameters section.

DataSet factory

The DataSet factory enables the creation of dataset instances by providing only a URI, without an explicit constructor call. For example:

// Each call is an independent example; the variable names differ so that the lines compile together.
DataSet csvFromPath = DataSet.Open("csv_for_autotypes.csv");
DataSet csvFromUri = DataSet.Open("msds:csv?file=csv_for_autotypes.csv");
DataSet ncFromPath = DataSet.Open(@"c:\data\ncfile.nc");
DataSet ncFromUri = DataSet.Open("msds:nc?file=ncfile.nc");
DataSet activeStorage = DataSet.Open("msds:as?server=(local)&database=ActiveStorage" +
                                     "&integrated security=true&GroupName=mm5&UseNetcdfConventions=true");

Before you can create instances with the DataSet.Open method, the provider must be registered in the factory. The installer automatically registers all the installed providers in the Machine.config file.

The factory control class is the static class Microsoft.Research.Science.Data.Factory.DataSetFactory. The class allows you to register providers and create new instances. The class is used by the DataSet.Open method.

There are several ways to register providers:
  • The DataSetFactory.Register group of methods.
  • The DataSetFactory.SearchFolder method.
  • A special section in the application or machine configuration file.

These methods allow you to both register provider names and associate extensions with a particular provider. The methods use the previously mentioned attributes DataSetProviderNameAttribute and DataSetProviderFileExtensionAttribute in the process.
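How a registration step might read these attributes can be sketched with reflection. Everything in the block is a stand-in so that it compiles without the library: ProviderNameAttribute and ProviderFileExtensionAttribute mimic the two real attributes, CsvProviderStub mimics a provider class, and RegistrySketch is a hypothetical registry. The actual DataSetFactory may work differently.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for DataSetProviderNameAttribute and DataSetProviderFileExtensionAttribute,
// declared here only so that the sketch compiles without the library.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ProviderNameAttribute : Attribute
{
    public ProviderNameAttribute(string name) { Name = name; }
    public string Name { get; }
}

[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class ProviderFileExtensionAttribute : Attribute
{
    public ProviderFileExtensionAttribute(string extension) { Extension = extension; }
    public string Extension { get; }
}

// Stand-in for a provider class such as CsvDataSet.
[ProviderName("csv")]
[ProviderFileExtension(".csv")]
public class CsvProviderStub { }

// Hypothetical registry: records provider types by name and by file extension.
public static class RegistrySketch
{
    public static readonly Dictionary<string, Type> ByName = new Dictionary<string, Type>();
    public static readonly Dictionary<string, Type> ByExtension = new Dictionary<string, Type>();

    public static void Register(Type providerType)
    {
        // The provider name is required; a type without it cannot be registered.
        var name = (ProviderNameAttribute)Attribute.GetCustomAttribute(
            providerType, typeof(ProviderNameAttribute));
        if (name == null)
            throw new ArgumentException(providerType.Name + " declares no provider name.");
        ByName[name.Name] = providerType;

        // A provider may declare zero or more file extensions.
        foreach (ProviderFileExtensionAttribute ext in Attribute.GetCustomAttributes(
                     providerType, typeof(ProviderFileExtensionAttribute)))
            ByExtension[ext.Extension] = providerType;
    }
}
```

After RegistrySketch.Register(typeof(CsvProviderStub)), both the name "csv" and the extension ".csv" resolve to CsvProviderStub.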

For example, the NetCDFDataSet provider can be registered by using the DataSetFactory.Register methods in the following two ways:

DataSetFactory.Register(typeof(NetCDFDataSet));

or this:

DataSetFactory.RegisterAssembly("Microsoft.Research.Science.Data.NetCDF4.dll");

The second way to register providers is the method DataSetFactory.SearchFolder. This method accepts a path, searches in that folder for all assemblies that contain providers, and registers the providers that it finds. In the following example, the method registers all the providers found in assemblies in the current directory:

DataSetFactory.SearchFolder(Environment.CurrentDirectory);

Remark: The SearchFolder method is intended mostly for developers and is marked with the DEBUG conditional compilation symbol. Release builds should rely on correct configuration files or manual provider registration.

Standard parameters

Most providers accept these parameters. Note to developers: support for these flags is built into the DataSetUri class.

The "openMode" flag specifies how the dataset should open a file, database, or whatever resource it uses to store the data.

Possible values for the flag are:
  • createNew: The data set creates a new resource. If the resource already exists, an IOException is thrown.
  • open: The data set opens an existing resource. If the resource does not exist, a ResourceNotFoundException is thrown.
  • create: The data set creates a new resource. If the resource already exists, it is created anew.
  • openOrCreate: The data set opens an existing resource. If the resource does not exist, the data set creates a new resource.
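For a file-backed provider, the four values line up with members of the .NET System.IO.FileMode enumeration. The mapping below is a sketch of that analogy, not the library's implementation: OpenModeSketch and ToFileMode are hypothetical names, and the exceptions differ in one case (the text above names ResourceNotFoundException, whereas opening a missing file with FileMode.Open raises FileNotFoundException).

```csharp
using System;
using System.IO;

// Hypothetical sketch for a file-backed provider: maps the "openMode" value
// to the System.IO.FileMode member with the closest semantics.
public static class OpenModeSketch
{
    public static FileMode ToFileMode(string openMode)
    {
        // Values are matched case-sensitively, as DataSet URIs are case-sensitive.
        switch (openMode)
        {
            case "createNew":    return FileMode.CreateNew;    // fails if the file exists
            case "open":         return FileMode.Open;         // fails if the file is missing
            case "create":       return FileMode.Create;       // replaces an existing file
            case "openOrCreate": return FileMode.OpenOrCreate; // opens, or creates if missing
            default:
                throw new ArgumentException("Unknown openMode value: " + openMode);
        }
    }
}
```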


Examples:
DataSet ncDataSet = DataSet.Open("msds:nc?file=data.nc&openMode=open");
DataSet asDataSet = DataSet.Open("msds:as?server=(local)&database=ActiveStorage&" +
                       "integrated security=true&GroupName=mm5&UseNetcdfConventions=true&openMode=createNew");

Last edited Jun 16, 2010 at 1:12 PM by dvoits, version 11
