CSV with missing values « ,, »

Jul 15, 2010 at 11:19 AM
I have a CSV file, sds.csv:

    A,B,C,D
    1,2,,4
    5,6,,8
    9,10,11,12

where « ,, » indicates a missing value: column C is empty in rows 1 and 2. How do you read this CSV file with SDS? I get this error:

    ---------------------------------------------------------------
    FAILED: Failed to create DataSet instance from uri sds.csv: Variables of value types without defined missing value cannot have gaps.
    Variables of value types without defined missing value cannot have gaps.

Thanks
Coordinator
Jul 15, 2010 at 12:18 PM

Use a CSV provider parameter to indicate that the file has missing values. Something like:

DataSet.Open("sds.csv?fillUpMissingValues=true")
or
DataSet.Open("msds:csv?file=sds.csv&fillUpMissingValues=true")
More parameters are documented in the SDS reference for the CsvDataSet class. The reference documentation is installed with the package.
Alternatively, you may use properties of the CsvUri class to create the URI string programmatically.
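
Editor's note: the two URI forms in the reply above differ only in where the file name goes. The following is a minimal sketch of that string layout, using plain string formatting rather than the SDS CsvUri class. ShortFormUri and ProviderFormUri are hypothetical helper names and are not part of SDS; only the parameter name fillUpMissingValues and the two URI shapes come from the reply.

```vbnet
Imports System

Module CsvUriSketch

    ' Hypothetical helper (not part of SDS): short form, "<file>?<name>=<value>".
    Function ShortFormUri(ByVal file As String, ByVal name As String, ByVal value As String) As String
        Return String.Format("{0}?{1}={2}", file, name, value)
    End Function

    ' Hypothetical helper (not part of SDS): provider form, "msds:csv?file=<file>&<name>=<value>".
    Function ProviderFormUri(ByVal file As String, ByVal name As String, ByVal value As String) As String
        Return String.Format("msds:csv?file={0}&{1}={2}", file, name, value)
    End Function

    Sub Main()
        ' Each string has the same shape as the argument passed to DataSet.Open in the reply.
        Console.WriteLine(ShortFormUri("sds.csv", "fillUpMissingValues", "true"))
        Console.WriteLine(ProviderFormUri("sds.csv", "fillUpMissingValues", "true"))
    End Sub

End Module
```

This only illustrates the string layout; for real projects the CsvUri class mentioned in the reply is presumably the safer route, since it knows which parameters the provider accepts.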
Jul 16, 2010 at 2:50 AM
Edited Jul 16, 2010 at 11:06 AM

Thanks

Now I am looking for the add-in for Excel 2007. Where exactly is it?

I have no DataSet ribbon (?)

Also, I would like to create a VB.NET (2010) WPF project, but I cannot find DataSetViewerControl (?)

Coordinator
Jul 16, 2010 at 12:20 PM
Edited Jul 16, 2010 at 12:21 PM

About Excel add-in:

First, launch the add-in installation using the "Start Menu\Scientific DataSet 1.2\Install DataSetEditor Excel Addin" shortcut. This needs to be done once, after the SDS package is installed.

Then look for DataSet Editor in the "Add-Ins" ribbon.

About DataSetViewerControl:

Add references in your project to the assemblies DataSetViewer*.dll, DynamicDataDisplay*.dll and TableView.dll from the "%PROGRAM FILES%\Microsoft Research\Scientific DataSet 1.2" folder. Instructions for using DataSetViewerControl can also be found in the Getting Started document (http://sds.codeplex.com/Project/Download/FileDownload.aspx?DownloadId=127282), starting on page 12.

One very important note: to get the Excel add-in and DataSetViewer, you need to install the SDS package from the Microsoft Research site: http://research.microsoft.com/en-us/downloads/ccf905f6-34c6-4845-892e-a5715a508fa3/, because the SDS package from the CodePlex site contains only the components that are available as source code.

 

Jul 16, 2010 at 11:13 PM

 

OK

   1) Uninstall: Scientific_DataSet_CodePlex_1.2.6754.0

   2) Install distribution: Scientific_DataSet_Public_1.2.6754.0.msi

   3) Excel Add-In: OK

   4) DataSetViewer with VB 2010 WPF: OK. I am still working on it, but I hope it will be fine

   5) ... afterwards, I would like to use it with F#

 

Thanks

 

Jul 17, 2010 at 11:44 PM
Edited Jul 18, 2010 at 2:45 AM

 

If you are interested in VB.NET, here is my test (VB.NET 2010). Reference: Introduction to Scientific DataSet, Listing 2, page 15

    Private Sub Window_Loaded(ByVal sender As System.Object, ByVal e As System.Windows.RoutedEventArgs) Handles MyBase.Loaded

        ' Convert the C# example to VB.NET. Reference: Introduction to Scientific DataSet, Listing 2, page 15
        ' Remarks:
        '    - replace ds.Add() with ds.AddVariable()          (?)
        '    - replace ds.GetData("X") with ds("X").GetData()  (?)

        Dim ds = DataSet.Open("Tutorial1.csv")
        'Dim ds = DataSet.Open("Tutorial1.csv?inferDims=true")   <= page 17

        If Not ds.Any(Function(var) var.Name = "Model") Then

            ' If the Model column does not exist, add it

            Dim x() As Double = ds("X").GetData()
            Dim y() As Double = ds("Observation").GetData()

            Dim xm As Double
            Dim ym As Double
            Dim xy As Double = 0
            Dim xx As Double = 0

            xm = x.Sum / x.Length
            ym = y.Sum / y.Length

            For i As Integer = 0 To x.Length - 1
                xy += (x(i) - xm) * (y(i) - ym)
            Next
            For i As Integer = 0 To x.Length - 1
                xx += (x(i) - xm) * (x(i) - xm)
            Next

            Dim a = xy / xx
            Dim b = ym - a * xm
            Dim model = New Double(x.Length - 1) {}
            For i As Integer = 0 To x.Length - 1
                model(i) = a * x(i) + b
            Next

            ' write output data
            ' --------------------
            ds.AddVariable(Of Double)("Model", model)

        End If

        DataSetViewerControl1.DataSet = ds

    End Sub

Comments:
   I just found the reason for the difference between my VB.NET code and the C# example: the C# example uses DataSetExtension.
   See this address: http://sds.codeplex.com/wikipage?title=DataSet&referringTitle=Documentation
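
Editor's note: the numeric part of the listing above is an ordinary least-squares line fit (slope a = sum of (x - xm)(y - ym) divided by sum of (x - xm)^2, intercept b = ym - a * xm). The sketch below isolates that calculation so it can be checked without SDS or a CSV file. FitLine is a hypothetical helper name, and the input checks are additions that the original listing does not have.

```vbnet
Imports System
Imports System.Linq

Module LeastSquaresSketch

    ' Hypothetical helper: returns (a, b) for the least-squares line y = a * x + b.
    Function FitLine(ByVal x() As Double, ByVal y() As Double) As Tuple(Of Double, Double)
        If x.Length <> y.Length OrElse x.Length < 2 Then
            Throw New ArgumentException("x and y must have the same length (at least 2).")
        End If

        ' Means of both series.
        Dim xm As Double = x.Average()
        Dim ym As Double = y.Average()

        ' Accumulate both sums in a single pass.
        Dim xy As Double = 0
        Dim xx As Double = 0
        For i As Integer = 0 To x.Length - 1
            xy += (x(i) - xm) * (y(i) - ym)
            xx += (x(i) - xm) * (x(i) - xm)
        Next

        If xx = 0 Then
            Throw New ArgumentException("All x values are equal; the slope is undefined.")
        End If

        Dim a As Double = xy / xx
        Dim b As Double = ym - a * xm
        Return Tuple.Create(a, b)
    End Function

    Sub Main()
        ' Sample points chosen to lie exactly on y = 2x + 1.
        Dim x() As Double = {1, 2, 3, 4}
        Dim y() As Double = {3, 5, 7, 9}
        Dim fit = FitLine(x, y)
        Console.WriteLine("a = {0}, b = {1}", fit.Item1, fit.Item2)
    End Sub

End Module
```

Because the sample points lie on y = 2x + 1, the fit should return slope 2 and intercept 1. In the listing above, the same a and b are then used to fill the Model column.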

Coordinator
Jul 18, 2010 at 8:43 AM

Thank you for the sample. I hope the extension methods (Imperative API) will make it a little shorter. A note about the appropriate using directive is included on the Imperative API wiki page.

One of our goals is to make the SDS library convenient to use from any .NET language, and we are very interested in an F# interface. We would be very grateful if you shared your experience of using SDS from F#!

We'll consider extending our tutorials for other programming languages.