COBOL relative files
The COBOL programming language has the concept of a "relative" file (some vendors called it a "random" file, IIRC). The data lives in one physical file, which can be big: thousands of "records," each a fixed number of bytes. So, if a record is defined as being 100 bytes long, the first record is bytes 1-100, the second is bytes 101-200, and so on. To read or write a record, you provide a value 100 bytes long plus the "relative key," which is just a number from 1 through n (n being the maximum number of records possible), and those 100 bytes are read or written at the appropriate spot.

What is relevant for this question is that the whole file is NOT read into memory, because it could be huge. Only one record at a time is read or written (logically speaking; there was probably buffering activity in the background).

Can REBOL do something like that? I am only familiar with reading the whole file into memory at one time. I did look at the documentation and saw things like READ/DIRECT/PART/SKIP, but it is not clear to me whether those features accomplish the COBOL-like relative file.

I do realize that my description above is a "logical" one, in that I don't actually KNOW how the relative file was implemented. Maybe record number 2 was NOT stored at bytes 101-200 and something else was going on "under the hood." What I really want is the same logical result, not necessarily the physical layout above, for files that are too big to bring into memory all at once. Thank you.
posted by: Steven White 30-Mar-2018/18:18:26-7:00
You can use 'open. This page has some useful code: https://stackoverflow.com/questions/27939492/in-rebol-what-is-the-idiomatic-way-to-read-a-text-file-line-by-line
posted by: Nick 30-Mar-2018/23:50:48-7:00
This article has the code you want: http://www.rebol.com/article/0199.html
posted by: Nick 30-Mar-2018/23:52:55-7:00
Try this:

REBOL []
random/seed now
; write %relative-file.dat ""  ; uncomment to create a fresh new file

; open file:
p: open/seek/binary %relative-file.dat

; create 10000 records of 100 characters:
repeat j 10000 [
    n: copy ""
    repeat i 100 [append n form first random "qwertyuiopasdfghjklzxcvbnm"]
    insert tail p n
]

; read and print record 2000
; (AT is 1-based, so record 2000 occupies bytes 199901-200000):
probe to-string copy/part at p 199901 100

; read all sequential records into memory
; (to demonstrate how to loop through each record in the raw file data):
x: copy ""
forskip p 100 [append x copy/part p 100]
probe (length? x) / 100

; change record 2000 (copy contents of record 2001, which starts at byte 200001):
change at p 199901 copy/part at p 200001 100
probe to-string copy/part at p 199901 100
probe to-string copy/part at p 200001 100

close p
halt
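
To get closer to the COBOL "relative key" style of access, the same port operations could be wrapped in two small functions. This is only a sketch for R2, not tested against a large file: the names record-size, read-record and write-record are made up for illustration, and it assumes the file was opened with open/seek/binary as above and that every record is exactly 100 bytes.

```rebol
REBOL []

record-size: 100   ; assumed fixed record length, as in the example above

; Hypothetical helper: return record number n (1-based) as a string.
; AT is 1-based, so record n starts at index (n - 1) * record-size + 1.
read-record: func [port [port!] n [integer!]] [
    to-string copy/part at port (n - 1) * record-size + 1 record-size
]

; Hypothetical helper: overwrite record number n in place.
; Refuses data that is not exactly one record long, so later
; records never shift position.
write-record: func [port [port!] n [integer!] data [string! binary!]] [
    if record-size <> length? data [
        print "write-record: data is not exactly one record long"
        exit
    ]
    change at port (n - 1) * record-size + 1 to-binary data
]

p: open/seek/binary %relative-file.dat
probe read-record p 2000                 ; fetch one record by its key
write-record p 2000 read-record p 2001   ; copy record 2001 over record 2000
close p
```

Only the requested record is copied into memory on each call; the record number is the only "key" the caller has to supply.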
posted by: Nick 31-Mar-2018/0:56:12-7:00
I will mention this in case others follow this trail. This is from: https://forum.rebol.info/t/what-is-a-port/617

Explicit ports give you full control over each I/O action. For example, let’s say you want to read a large file in small 20000 byte chunks. You might use these steps:

file: open %bigdata.dat
while [not zero? data: read/part file 20000] [
    process data
]
close file

This common method will be familiar to most programmers. The file is opened, reads are done, and the file is closed. Each action is done separately.

This type of explicit I/O is common for large files that would consume a lot of memory if you read them with implicit I/O. For example, if the bigdata.dat file is 10 GB, you would not be able to read it all into memory at one time.

Explicit I/O is also used when you need strict control over each action. This is often done if you need to seek to different locations within a file or write your own network protocol. For example, let’s say you need to read data from three different parts of a large file. In that case you would use read to seek to each part of the file to do the read:

file: open %bigdata.dat
da-head: read/part file 4000
da-body: read/seek/part file 12000 10000
da-tail: read/seek/part file 56000 4000
close file
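
Applied to fixed-length records, the read/seek/part call quoted above might be used roughly like this. This is a sketch, not something from the quoted page: record-size and read-record are made-up names, and it assumes that /seek takes a zero-based byte offset and that the interpreter in use supports read/seek on a file port.

```rebol
REBOL []

record-size: 100   ; assumed fixed record length

; Hypothetical helper: fetch record number n (1-based) as binary.
; Assumes /seek is a zero-based offset, so record n starts
; at byte offset (n - 1) * record-size.
read-record: func [file [port!] n [integer!] /local offset] [
    offset: (n - 1) * record-size
    read/seek/part file offset record-size
]

file: open %bigdata.dat
rec: read-record file 2000   ; only these 100 bytes are read
close file
```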
posted by: Steven White 15-May-2018/9:10:31-7:00
Steve, I think you're using R2, and there is no /seek refinement for the read function in R2. For that specific 'explicit' approach, R3 is required. The code I provided above works in R2.
posted by: Nick 16-May-2018/21:41:02-7:00