How to append 2 CSV files with a semicolon separator?

Dec 26, 2010 at 2:52 PM

 

After multiple tests, I have not found how to append data from 2 CSV files.

 

File: test1.csv
X;Y;Z
1;2;3
4;5;6
7;8;9 

File: test2.csv
X;Y;Z
11;15;19
12;16;20
13;17;21
14;18;22

With this command:
    sds copy msds:csv?file=test2.csv&separator=semicolon msds:csv?file=test1.csv&separator=semicolon

Not enough of arguments.
'separator' is not recognized as an internal or external command,
operable program or batch file.
'separator' is not recognized as an internal or external command,
operable program or batch file.

 

Do you have hints?

Thanks

Dec 27, 2010 at 4:19 PM
Edited Dec 27, 2010 at 4:20 PM

Hi,

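"sds copy" prints an error because the Windows command prompt treats an unquoted "&" as a command separator, so it tries to run the text after it as a separate command. Place the arguments in quotes: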

sds copy "msds:csv?file=test2.csv&separator=semicolon" "msds:csv?file=test1.csv&separator=semicolon"

 

An important question here is what you mean by "append data". If you mean "put data from array X of the first file at the end of array X of the second file", then sds copy won't help you: it just puts the content of the first dataset into the other dataset as it is, without regard even to the names of the arrays, and it requires the resulting dataset to be consistent, i.e. arrays depending on the same dimension must have the same length.

In "msds:csv?file=test1.csv&separator=semicolon" there are 3 arrays that depend on 3 different dimensions by default:

[3] Z of type Double (csv_2:3)

[2] Y of type Double (csv_1:3)

[1] X of type Double (csv_0:3)

The arrays in "msds:csv?file=test2.csv&separator=semicolon" depend on the same dimensions but have a different length: 4 instead of 3.

[3] Z of type Double (csv_2:4)

[2] Y of type Double (csv_1:4)

[1] X of type Double (csv_0:4)

So, because of the different lengths, these datasets cannot be merged by sds copy.
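
To illustrate the consistency rule, here is a minimal Python sketch. It is a toy model, not SDS code: the function name `find_conflicts` and the tuple layout are assumptions made for illustration. It takes (variable, dimension, length) triples like the ones in the listings above and reports every dimension that would end up with two different lengths if both datasets were put into one.

```python
def find_conflicts(variables):
    """Return {dimension: sorted lengths} for dimensions seen with more than one length.

    `variables` is an iterable of (name, dimension, length) tuples.
    This is an illustrative model only, not part of the SDS library.
    """
    lengths = {}
    for _name, dim, length in variables:
        lengths.setdefault(dim, set()).add(length)
    return {dim: sorted(ls) for dim, ls in lengths.items() if len(ls) > 1}


# Dimensions and lengths taken from the two listings above.
test1 = [("X", "csv_0", 3), ("Y", "csv_1", 3), ("Z", "csv_2", 3)]
test2 = [("X", "csv_0", 4), ("Y", "csv_1", 4), ("Z", "csv_2", 4)]

print(find_conflicts(test1))          # {}
print(find_conflicts(test1 + test2))  # every dimension has lengths [3, 4]
```

Each dataset is consistent on its own, but the combination is not, which matches the reason given for why the merge is rejected.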

 

If you clarify your task, I can try to help you solve it.

 

Regards,

Dmitry.

 

 

Dec 27, 2010 at 6:10 PM

 

By "append" I mean adding rows to the file test1.csv:
test1.csv 3 rows x 3 columns
test2.csv 4 rows x 3 columns

Append just adds the rows of test2.csv to test1.csv:
test1.csv 7 rows x 3 columns

File: test1.csv
X;Y;Z
1;2;3
4;5;6
7;8;9 
11;15;19
12;16;20
13;17;21
14;18;22

 

thanks

Dec 29, 2010 at 12:21 PM
Edited Dec 29, 2010 at 12:24 PM

sds copy cannot do that operation, but I've written a sample program, "sds_append", which appends to each variable of the target dataset the data of the variable with the same name from the source dataset, provided they depend on the same dimensions.

In your case, it works as you described. To do this, use the following command:

sds_append "file2.csv?inferDims=true&separator=semicolon" "file1.csv?inferDims=true&separator=semicolon"

 

One more comment: the URI parameter "inferDims=true" for a CSV file automatically infers the dimensions of variables from the data itself, so variables with the same length depend on the same dimension. So, for "file1.csv", instead of

[3] Z of type Double (csv_2:3)

[2] Y of type Double (csv_1:3)

[1] X of type Double (csv_0:3)

you get

[3] Z of type Double (csv_0:3)

[2] Y of type Double (csv_0:3)

[1] X of type Double (csv_0:3)

 

Hope this helps,

Dmitry.

 

Dec 30, 2010 at 7:17 AM

 

I have downloaded your "sds_append" and it compiles perfectly in VS 2010, thank you.

I tested it with file1.csv and file2.csv: OK.

Now, this code automatically detects the type of each variable.
With this input:

test1.csv - an empty file / no data
parc;EO;date;heure;U;V;W

test2.csv  - one record / small demo
parc;EO;date;heure;U;V;W
BDS;A1;2010-09-04;00:10:00;0.454;0.5561;0.686

sds_append "test2.csv?inferDims=true&separator=semicolon" "test1.csv?inferDims=true&separator=semicolon"
Result is msds:csv?inferDims=true&separator=semicolon&file=C:\eolien\2010\data2IREQ\test1.csv
[2]
DSID: e3746b40-b5ad-4915-b0a5-5c35c796d929
[7] parc of type String (csv_0:1)
[6] EO of type String (csv_0:1)
[5] date of type DateTime (csv_0:1)
[4] heure of type DateTime (csv_0:1)
[3] U of type Double (csv_0:1)
[2] V of type Double (csv_0:1)
[1] W of type Double (csv_0:1)

Done.

Now test1.csv contains:
parc;EO;date;heure;U;V;W
BDS;A1;09/04/2010 00:00:00;12/30/2010 00:10:00;0.454;0.5561;0.686

The automatic detection of the DateTime type is a problem.
In my "real" example, the column "date" is of type Date and the column "heure" is of type Time (because in Excel a cell is of type Date or Time, but cannot be DateTime).

Otherwise your application "sds_append" works very well.
I'm going to write myself a script (VBScript).

Thank you again.

 

Dec 30, 2010 at 12:32 PM

Unfortunately, there is no way to influence type detection, but I have noted this issue for a future release.

For now you can use a trick: remove the separator between the "date" and "heure" columns, then append the files and split the column again (for example, using an editor that supports regular expressions):

parc;EO;date heure;U;V;W
BDS;A1;09/04/2010 00:10:00;0.454;0.5561;0.686

[6] parc of type String (csv_0:1)
[5] EO of type String (csv_0:1)
[4] date heure of type DateTime (csv_0:1)
[3] U of type Double (csv_0:1)
[2] V of type Double (csv_0:1)
[1] W of type Double (csv_0:1)
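
The final split step can also be scripted instead of done in an editor. Below is a minimal Python sketch, assuming the combined column is named "date heure" and each value holds a date and a time separated by a single space. The function name `split_date_heure` is hypothetical and not part of SDS.

```python
import csv
import io


def split_date_heure(text, combined="date heure", sep=";"):
    """Split the combined column back into separate "date" and "heure" columns.

    `text` is the whole CSV content as a string; the result is returned as a string.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    header, data = rows[0], rows[1:]
    i = header.index(combined)  # raises ValueError if the column is missing
    out = io.StringIO()
    writer = csv.writer(out, delimiter=sep, lineterminator="\n")
    writer.writerow(header[:i] + combined.split(" ") + header[i + 1:])
    for row in data:
        if not row:  # skip blank lines
            continue
        date, heure = row[i].split(" ", 1)
        writer.writerow(row[:i] + [date, heure] + row[i + 1:])
    return out.getvalue()


sample = "parc;EO;date heure;U;V;W\nBDS;A1;09/04/2010 00:10:00;0.454;0.5561;0.686\n"
print(split_date_heure(sample), end="")
# parc;EO;date;heure;U;V;W
# BDS;A1;09/04/2010;00:10:00;0.454;0.5561;0.686
```

Because the values are split as plain text, no type detection is involved and the date and time keep exactly the form they had in the combined column.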

Nevertheless, if the only problem with handling CSV files is appending, it can be done very simply by copying the content of one file, without its header, to the end of the other file.
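
A minimal Python sketch of that manual approach, assuming both files start with the same header line and use the same text encoding. The function name `append_csv` is hypothetical; it works on the files as plain text and does not use SDS at all.

```python
from pathlib import Path


def append_csv(source, target):
    """Append the data rows of `source` (everything after its header) to `target`.

    Returns the number of rows appended.
    """
    src_lines = Path(source).read_text().splitlines()
    tgt_text = Path(target).read_text()
    tgt_lines = tgt_text.splitlines()
    if not src_lines or not tgt_lines:
        raise ValueError("both files need a header line")
    if src_lines[0].strip() != tgt_lines[0].strip():
        raise ValueError("headers differ; refusing to append")
    # Keep only non-blank data rows from the source file.
    rows = [line for line in src_lines[1:] if line.strip()]
    with open(target, "a") as f:
        # Make sure the first new row starts on its own line.
        if not tgt_text.endswith("\n"):
            f.write("\n")
        f.writelines(row + "\n" for row in rows)
    return len(rows)
```

For example, `append_csv("test2.csv", "test1.csv")` on the files from the first post would add the 4 data rows of test2.csv to test1.csv, giving 7 data rows under a single header. Since the rows are copied as text, the "date" and "heure" values are left exactly as written.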

 

Jan 1, 2011 at 3:42 PM
Edited Jan 1, 2011 at 4:14 PM

 

As an exercise, I have translated your C# sds_append to VB.NET as sds_append_vb.

I uploaded it to http://www.mediafire.com/myfiles.php as sds_append_vb.zip (it contains the source and the solution for VS 2010). MediaFire is new to me...

 

Oops. Use this link instead: http://www.mediafire.com/?6okvvtqmzmv4mli