
Best way to make search keywords

I am in need of a simple program to create 'readme' files in folders, with some REBOL-readable data in those files that can be used to identify and search for those folders. I am thinking of something like:
DESCRIPTION: ' <short text description> '
CREATED-BY: ' <maybe my own name> '
That SEARCH-WORDS concept is troubling me a bit. I would like a block of words that I could use to make an index. I know enough REBOL to use it, but not enough to have a really deep understanding. I can imagine two ways to do a block of searchable words. For example, if I had a folder of historical pictures of the office, I could do:
SEARCH-WORDS: [ 'pictures' 'history' 'office' ]
SEARCH-WORDS: [ 'pictures 'history 'office ]
I am inclined to do the first way because I understand it. But I am wondering if there is something about the second way that would give some advantage. I have seen words used that way in examples on rebol.org, although at this time I couldn't put my hand on one.
If anyone has any thoughts on the matter I would be happy to hear them. Not an urgent problem. I am just cleaning up after myself so I don't leave a mess.

posted by:   Steven White       18-May-2015/15:17:29-7:00

Though the post uses single quotes, I'm going to assume that you meant to ask about strings in case 1, e.g.:
SEARCH-WORDS: [{pictures} {history} {office}]
...vs. words in case 2:
SEARCH-WORDS: [pictures history office]
Because the philosophy of Rebol is to remove as much "cruft" as possible, you would likely want to lean toward the second approach. Note, though, that your example used LIT-WORD! values rather than plain WORD! values. Quoting a word that way is only necessary in an evaluative context; inside a block that is simply held as data, plain words are enough.
For instance, I can write:
>> data: [reverse print copy]
But I cannot write just:
>> data: reverse
...without getting an error because reverse has no argument. Words bound to functions will evaluate unless suppressed by some quoting mechanism. That is when you want to use a LIT-WORD! or QUOTE.
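To make the difference concrete, here is a minimal sketch of how words behave inside a data block versus in evaluated code. The variable names `w` and `search-words` are only for illustration, and the output comments assume PROBE's usual molded output.

```rebol
REBOL [Title: "Words in blocks versus evaluated code (illustrative sketch)"]

; Inside a block nothing is evaluated, so plain words are stored as words.
data: [reverse print copy]
probe type? first data                            ; word!

; In evaluated code, a lit-word yields the word itself instead of calling REVERSE.
w: 'reverse
probe type? w                                     ; word!

; A lit-word written inside a block is not evaluated, so it stays a LIT-WORD! value.
probe type? first ['pictures 'history 'office]    ; lit-word!

; Searching a block of plain words. The lit-word is needed on this line
; only because the line itself is evaluated.
search-words: [pictures history office]
probe find search-words 'history                  ; [history office]
```

So the block can hold plain words, and the quote mark only shows up at the point where you name a word in code that is being run.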
(Note: The module header provides somewhat poor guidance on this issue, as it "constructs" an object "safely"... so because it's not going through the usual evaluator it's willing to tolerate plain words as if they had been lit-words. I consider this to be bad, myself.)
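As a rough illustration of that "safe" construction applied to the readme idea, the sketch below loads some text and builds an object from it with CONSTRUCT, which does not evaluate the values. The field names, the `kind: photo` entry, and the inlined `readme-text` string are all hypothetical; a real program would read the text from a file instead.

```rebol
REBOL [Title: "Readme sketch (illustrative, field names are assumptions)"]

; Hypothetical readme contents, inlined as a string so the sketch is self-contained.
readme-text: {
    description: "Historical pictures of the office"
    created-by: "someone"
    search-words: [pictures history office]
    kind: photo
}

; LOAD turns the text into a block of values; CONSTRUCT builds an object
; from that block without evaluating anything in it.
info: construct load readme-text

probe info/search-words          ; [pictures history office]
probe type? info/kind            ; word! (PHOTO was stored, never looked up)

if find info/search-words 'office [print "this folder matches office"]
```

Because nothing is evaluated, a readme written by someone else cannot run code when it is loaded this way, which is one reason to prefer it over DO for data files.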
In the present world, the main reasons to choose a string over a word for this kind of thing are the following:
* Some legal strings are not legal words, e.g. {4chan} is an okay string but not '4chan the word. (However, there are escaping mechanisms in the proposal queue to solve this.)
* Words are immutable and cannot be passed by reference. You cannot write `append first [pictures] {-of-my-cat}`. (Because each spelling is stored once as a symbol, two words can be compared by symbol ID rather than character by character.)
* Words are never garbage collected the way strings and other series are. If you ever create `foo-baz-bar` as a symbol or `blah-blah:`, then it will chew up a permanent symbol ID and string data. A program that keeps creating new words can therefore eventually run out of memory, even when no references to those words remain. (There is a GC mechanism in the works to solve this.)
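The first two points in the list above can be checked at the console. This is a small sketch rather than a complete test; the names `s` and `w` are illustrative, and the symbol-table point is not demonstrated because it only shows up over a long-running session.

```rebol
REBOL [Title: "Strings versus words (illustrative sketch)"]

; {4chan} is a fine string, but the same characters are not expected
; to load as a word, so this checks whether LOAD raises an error.
probe string? "4chan"                       ; true
probe error? try [load "4chan"]

; A string is a series and can be modified in place...
s: copy "pictures"
append s "-of-my-cat"
probe s                                     ; "pictures-of-my-cat"

; ...whereas a word is not a series, so APPEND rejects it.
probe error? try [append first [pictures] "-of-my-cat"]    ; true

; Deriving a related word means building a string and converting it,
; which creates a brand-new symbol.
w: to word! join form 'pictures "-of-my-cat"
probe w                                     ; pictures-of-my-cat
```

If the search terms are a small, fixed vocabulary, words stay cheap; if they are generated from arbitrary user text, strings avoid both the spelling restrictions and the growing symbol table.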
Fast interactive answers and feedback are available on StackOverflow, so I'll do the usual plug. :-)

posted by:   Fork       19-May-2015/18:52:38-7:00