


CSV processing idea

I think this ought to be possible with REBOL but I don't know where to begin. I have looked at CSV samples in the library and I don't think they do this (mainly because I can't understand them).
    
I want to start with a CSV file with a row of column headings, for example:
    
name,address,birthdate
"Steven","100 1ST AVE",01-JAN-1940
"John","200 2ND ST",02-FEB-1950
...etc.
    
Then I want to write procedures to read that file one record at a time, starting with the second line, because the first line only gives names to the data items on the other lines. When I read the first data record, I want "name" to be available as a REBOL word with the value "Steven", "address" to be a REBOL word with the value "100 1ST AVE", and so on. Then, when I read the next record, I want "name" to be available as a REBOL word with the value "John", and so on.
    
I thought for a few minutes that I knew how to do this, and that the trick was in the "bind" function, but when I started writing a demo I realized I don't get it at all.
    
I wonder if anyone could point me in the right direction. Thank you.

posted by:   Steven White       16-Oct-2013/17:03:23-7:00



I have to go home for the day, but I am in the process of answering my own question. It appears that the trick is the "set" function. I will know more tomorrow.
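
A minimal sketch of the "set" idea, assuming REBOL 2 (where parse with a string delimiter splits a line into fields). The in-memory "lines" block is a stand-in for what read/lines would return, and "headings" is a hypothetical name used only for illustration:

```rebol
REBOL [title: "CSV header words (sketch)"]

; Stand-in for: lines: read/lines %file.csv
lines: [
    {name,address,birthdate}
    {"Steven","100 1ST AVE",01-JAN-1940}
    {"John","200 2ND ST",02-FEB-1950}
]

; Turn the header line into a block of words: [name address birthdate]
headings: copy []
foreach heading parse first lines "," [
    append headings to-word heading
]

; For each data line, set every heading word to the matching field.
foreach line next lines [
    set headings parse line ","
    print [name "|" address "|" birthdate]
]
```

Note that without the /all refinement, parse also splits on whitespace, so fields containing spaces need to be quoted as they are in the sample data. The values are stored as strings; something like to-date birthdate would be needed to get a date! value.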
    
Thank you.

posted by:   Steven White       16-Oct-2013/17:52:38-7:00



I included a bunch about that in the business tutorial. Start with these:
    
http://business-programming.com/business_programming.html#section-3.11
    
http://business-programming.com/business_programming.html#section-6.16
    
http://business-programming.com/business_programming.html#section-17.8

posted by:   Nick       16-Oct-2013/21:09:39-7:00



The basic procedure is to:
    
1) Define an empty block.
    
2) Use read/lines to read the CSV data into another new block.
    
3) Use a foreach loop to parse (split apart) each line of that block at the delimiter character (use "at <blockname> 2" to skip the header line and refer to all the remaining lines). During this step, append each individual data item on the line to the predefined empty block (from step 1).
    
4) At this point, the CSV data has been converted into a REBOL data block/series. You can use another foreach loop to iterate through the data block, and assign word labels (local 'variables' in each iteration) to refer to each group (line) of data values.
    
If you don't need to use word labels, you can skip the first step, the second half of the third step, and the 4th step. Just use indexes to pull out column data from each value in the block created by the parse function.

posted by:   Nick       16-Oct-2013/21:16:59-7:00



Take a look at this basic example that computes a sum of items in the 8th column. It starts with the 2nd line in the CSV file, to avoid the headers and only deal with the data:
    
sum: $0
foreach line at read/lines http://re-bol.com/Download.csv 2 [
     sum: sum + to-money pick (parse/all line ",") 8
]
alert join "Total Gross Sales: " sum
    
This example from the tutorial link above does exactly what you asked. It creates a block of data from the CSV data, and assigns 'name, 'address, and 'phone variable labels to each group of values:
    
REBOL [title: "Load CSV - Flat"]
block: copy []
csv: read/lines %users.csv
foreach line csv [
     data: parse line ","
     append block data
]
; probe block
foreach [name address phone] block [
     alert rejoin [name ": " address " " phone]
]
halt

posted by:   Nick       16-Oct-2013/21:23:40-7:00



The same examples are all covered in the R3 tutorial at http://learnrebol.com (there are some trivial differences).

posted by:   Nick       16-Oct-2013/21:30:18-7:00



I saved your data example above at re-bol.com/stevenwhite.csv . Here's the simple code example above applied to your data, to demonstrate what you asked. Notice that the CSV data is first converted to a REBOL block (steps 1, 2, and 3 above), using a foreach loop. The second foreach loop is used to assign variable labels to each group of name, address, and date values in the block (step 4 above).
    
REBOL []
block: copy []
csv: at read/lines http://re-bol.com/stevenwhite.csv 2
foreach line csv [
     data: parse line ","
     append block data
]
foreach [name address date] block [
     alert rejoin [name ": " address " " date]
]

posted by:   Nick       17-Oct-2013/5:30:17-7:00



For quickie scripts, you could shorten that to:
    
REBOL []
foreach line at read/lines http://re-bol.com/stevenwhite.csv 2 [
     foreach [name address date] parse line "," [
         alert rejoin [name ": " address " " date]
     ]
]

posted by:   Nick       17-Oct-2013/8:12:31-7:00



If you want to reuse the read data, assign it a label when it's read:
    
REBOL []
foreach line data: at read/lines http://re-bol.com/stevenwhite.csv 2 [
     foreach [name address date] parse line "," [
         alert rejoin [name ": " address " " date]
     ]
]
foreach line data [
     foreach [name address date] parse line "," [
         print (to-date date) + 1
     ]
]
halt

posted by:   Nick       17-Oct-2013/8:18:28-7:00



By the time you have to do that even once, however, it's probably better to just use the original approach (create a new REBOL data block containing all the individual values in the CSV file and deal with it using all of REBOL's native block/series capabilities).
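
As a rough sketch of what "native block/series capabilities" can look like once the values are in a flat block (assuming REBOL 2; the literal block below is a stand-in for the one built by the first foreach loop, and "names" is a hypothetical label):

```rebol
REBOL [title: "Block/series sketch"]

; Flat block of values, three per record, as built by the first approach.
block: copy [
    "Steven" "100 1ST AVE" "01-JAN-1940"
    "John" "200 2ND ST" "02-FEB-1950"
]

; Pull out one column: every third value, starting with the first.
names: extract block 3
probe names

; Sort whole records by their first value (the name).
sort/skip block 3

; Walk the sorted records with word labels.
foreach [name address date] block [
    print [name "was born on" to-date date]
]
```

The /skip refinement keeps each group of three values together, so the block can be reordered or searched as records without re-parsing the CSV lines.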

posted by:   Nick       17-Oct-2013/10:39:36-7:00



Here is a quick example (you need BrianH's load-csv function):
    
     ;create an object from the first line of the file
     row: context append map-each v parse first b: read/lines %file.csv none [to-set-word v] none
    
     ;fill the object with the values
     foreach line load-csv next b [
         set words-of row line
         probe row
     ]


posted by:   Endo       17-Oct-2013/17:06:23-7:00