


Testing a character in a string

This must be something dumb, but I just don't see it. I want to go through a string of characters, and when I find an 'X' I want to do something. I never find an X, as shown in the test case below. Can someone point out what I am not understanding?    
    
Thank you.
    
The script:
R E B O L [
]
    
XTEST: '123X123X'
XSUB: 1
XMAX: 8
    
print ['XTEST is a ' type? XTEST]
print ['XSUB is a ' type? XSUB]
print ['XMAX is a ' type? XMAX]
    
while [(XSUB <= XMAX)] [
     print rejoin ['Testing '' XTEST/:XSUB ''']
     either (XTEST/:XSUB = 'X') [
         print [XTEST/:XSUB ' equal X']
     ] [
         print [XTEST/:XSUB ' NOT equal X']
     ]
     XSUB: XSUB + 1
]
    
halt
    
The result:
XTEST is a string
XSUB is a integer
XMAX is a integer
Testing '1'
1 NOT equal X
Testing '2'
2 NOT equal X
Testing '3'
3 NOT equal X
Testing 'X'
X NOT equal X
Testing '1'
1 NOT equal X
Testing '2'
2 NOT equal X
Testing '3'
3 NOT equal X
Testing 'X'
X NOT equal X
>>
    


posted by:   Steven White     25-Feb-2015/18:20:14-8:00



print [ "X is a " type? "X"]    
print ["but" XTEST/:XSUB " is" type? XTEST/:XSUB]
    
print [type? "X" "is not the same type as " type? XTEST/:XSUB]

posted by:   stever     25-Feb-2015/19:02:42-8:00



When you select a single character from a string, you get a value of type char!. Each of your comparisons returns false because you are comparing a char! to a string!.
A simple fix is to change this line
either (XTEST/:XSUB = "X") [
to
either (XTEST/:XSUB = #"X") [
    
#"X" is the notation for a char!.
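
A minimal sketch of the difference, assuming Rebol 2 (the word `s` is just an illustrative name, not from the original script):

```rebol
REBOL []

s: "123X123X"

probe type? s/4                  ; char!
probe s/4 = "X"                  ; false: a char! compared with a string!
probe s/4 = #"X"                 ; true: a char! compared with a char!
probe (to string! s/4) = "X"     ; true: convert first, then compare strings
```

Converting the character with `to string!` is an alternative when the value to compare against has to stay a string.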
Cheers, J

posted by:   johnk     25-Feb-2015/19:08:09-8:00



This is making an interesting learning opportunity.
    
Yesterday I was playing with a: 1 type? a and then forcing it to decimal, but I can't remember now the words to do that. (Can someone remind me?) That would be more useful in general as you could search for a variable set to a char against the embedded char.
    
Also, I just noticed that = is NOT the same as ==, as the latter checks type as well as value. But how can you possibly be equal in value but not in type ... and how would you leverage the distinction?
    
As long as I'm asking questions, is there a reference sheet for r2 words as there is for r3? I know I can keep doing the ? thing, but it would be nice to have a table I could print which would also have the refinements. Something like this would be cool for red, too.
    
    
    
    


posted by:   stever     25-Feb-2015/19:45:26-8:00



The whole issue of pattern matching in strings is interesting to me. While some disdain regex utilities such as grep/sed I like them. Yesterday it occurred to me that a flexible search/replace string editor might be fun to try and write in rebol. Has that been done already? If not, why not...can a dialect to do the same thing possibly be hacked together that trivially?

posted by:   stever     25-Feb-2015/19:53:19-8:00



Ok. This works:
    
R E B O L [
]
        
XTEST: "123X123X"
XSUB: 1
XMAX: 8
X: make char! input
        
print ["XTEST is a " type? XTEST]
print ["XSUB is a " type? XSUB]
print ["XMAX is a " type? XMAX]
        
while [(XSUB <= XMAX)] [
     print rejoin ["Testing " XTEST/:XSUB ]
     either (XTEST/:XSUB = X) [
         print [XTEST/:XSUB " equal " X]
     ] [
         print [XTEST/:XSUB " NOT equal " X]
     ]
     XSUB: XSUB + 1
]
        
halt
--
But I had a number of problems, which makes me wonder whether you were running Rebol 3. To begin with, your header is R E B O L, which I had to change to REBOL. Then I got errors from every use of single quotes instead of double quotes (' vs "). And I still have open questions about the differences between:
to vs make
= vs ==
It is not at all evident to me from the documentation and experimentation. I probably have some other questions, but can't remember them just yet;-/

posted by:   stever     25-Feb-2015/22:55-8:00



Maybe this piece of code can be of help:
XTEST: "123X123X"
anX: "X"
print found? find XTEST anX
print index? find XTEST anX
print XTEST: find XTEST anx
XTEST: next XTEST
print XTEST: find XTEST anx
print index? find XTEST anX
    
XTEST: head XTEST
forall XTEST [
     print ["position: " index? XTEST " has a " first XTEST]
]


posted by:   iArnold     26-Feb-2015/2:20:14-8:00



> And I still have the open questions of what are the differences between:
> to vs make
> = vs ==
    
Excellent questions! They have been asked before, and people have tried to give them sensible answers. It's difficult to effect change in the projects, so the best thinking is frequently not reflected in the state of any Rebol/Red binaries at the moment. The theory is that good plans, lobbied for strongly enough, will be instituted when the time comes to go over things.
    
(Red is argued to be following Rebol2 in order to ease development and avoid having to come up with new documentation.)
    
Here is some thinking for you, and perhaps you can be a sounding board for it:
    
----
    
Let's call `=` "natural equality", and `==` "strict equality". The argument is that in Rebol, there is value to having a notion of "natural equality" take the simpler-to-type form, so that (for instance):
    
if 1 = 1.0 [
     print "This condition will run"
]
    
By a similar token, Red has implemented in its floating point a method for allowing close float values to compare as equal, without the complexity of looking for a difference less than a certain epsilon.
    
Many have strongly suggested that #"a" = "a" is a good idea to include in this "natural" equality. We already take this for granted in PARSE:
    
if parse "aa" ["a" #"a"] [
     print "We take this for granted in parse."
]
    
In my view this very helpful form of lax equality was skipped, while very UN-helpful forms of equality were left in:
    
if "fOo" = <FoO> [
     print "All ANY-STRING! compare naturally equal, case-insensitively"
]
    
if 'FoO = quote :fOo [
     print "All ANY-WORD! compare naturally equal, case-insensitively"
]
    
While case-insensitive comparison as a default for ANY-WORD! is a fundamental of the language, some would like to turn back the clock on the case-insensitive comparison for ANY-STRING!. I don't know about the benefits vs. tradeoffs there.
    
But I do feel that ignoring the types in the ANY-WORD! cases creates far more bugs than value.
    
I suggested there is a missing operator, "approximately equal" (perhaps ~=) which would be a complement to "strictly equal". Then = would just be a matrix of the common case... acting like ~= for some (like INTEGER! vs. FLOAT!/DECIMAL!) and like == for others (like WORD! and STRING!). I also suggest same-spelling? and same-spelling?/case for testing when non-equal types have the same "spelling", and an operator "spelling-of" to extract that property.
    
(Further, I offer that TO STRING! of something like FOO: actually produce "FOO:", and of <foo> actually produce "<foo>"... in contrast to SPELLING-OF which would produce "FOO" in both cases. This would make TO STRING! a logical replacement for the strangely named FORM, and offer a helpful alternative. Having used these in practice a bit I can say they are already proving to be *much* better ideas.)
    
So under this Matrix, a smattering of samples:
    
1 = 1.0 ; true
1 ~= 1.0 ; true
1 == 1.0 ; false
    
"FoO" = "fOo" ; most likely true
"FoO" == "fOo" ; likely false
"foo" = <foo> ; definitely false
"foo" ~= <foo> ; false
"<foo>" ~= <foo> ; likely desirable as true
"<foo>" = <foo> ; possibly true
same-spelling? "foo" <foo> ; true
same-spelling? "foo" 'foo ; true as well
    
'foo = quote :foo ; false
'foo ~= quote :foo ; probably true
'foo == quote :foo ; false
'FoO ~= 'fOo ; true
'FoO == 'fOo ; likely false
'FoO = 'fOo ; true
    
#"a" = "a" ; true
#"a" = "A" ; true
#"a" == "a" ; false
#"a" == #"A" ; false
#"a" ~= #"A" ; true
    
The idea that two ANY-STRING! compare as ~= if they have the same TO-STRING conversions is a useful concept, and it would continue the pattern of PARSE mentioned in character equality above.
    
if parse "<foo>bar" [<foo> "bar"] [
     print {This prints, so <foo> = "foo" helps with consistency.}
]
    
None of this touches on the "exactly equal" notion (`===` perhaps, or just stick with `same?`), required to see if two objects or series actually represent *the same node in memory* vs. just having comparably equal contents:
    
x: "abc"
x === "abc" ; false
x === x ; true
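
For comparison with the proposal, here is a small sketch of what the existing operators do, assuming a stock Rebol 2 console. The `===` operator above is hypothetical; `same?` is the existing word for that test.

```rebol
REBOL []

probe 1 = 1.0             ; true: = ignores the integer!/decimal! difference
probe 1 == 1.0            ; false: == also requires the same datatype
probe "abc" = "ABC"       ; true: = compares strings case-insensitively
probe "abc" == "ABC"      ; false: == is case-sensitive

x: "abc"
probe same? x x           ; true: the very same series
probe same? x "abc"       ; false: equal contents, different series
probe same? x copy x      ; false: a copy is a new series
```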
    
TO vs. MAKE is another conversation entirely, that I have opinions about. (Rarely am I short on them.)
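
As a starting point only, a rough sketch of the usual rule of thumb in Rebol 2 (an assumption offered for illustration, not a full answer): TO converts an existing value to another datatype, while MAKE constructs a new value from a spec.

```rebol
REBOL []

probe to decimal! 1       ; 1.0  (converts the integer)
probe to integer! "42"    ; 42   (converts the string)
probe to string! 10       ; "10" (converts the number to text)
probe make string! 10     ; ""   (new empty string, space reserved for 10 chars)
probe make block! 5       ; []   (new empty block, space reserved for 5 values)
```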
    
May I suggest, if you haven't yet, joining the StackOverflow chat room? It's always nice for more people to talk about the language design, as there is still time to influence it--hopefully:
    
http://rebolsource.net/go/chat-faq

posted by:   Fork     26-Feb-2015/3:14:49-8:00



correction: that should read `<foo> ~= "<foo>"` helps with consistency (and therefore perhaps `<foo> = "<foo>"`). It may be a good "natural" tradeoff for the non-strict notion of equality.
    
For @earl's counter-opinions and some ensuing debate, see:
    
http://chat.stackoverflow.com/transcript/message/21784676#21784676

posted by:   Fork     26-Feb-2015/10:18:06-8:00



Fork, Thanks for all the time you put into trying to explain. I try to catch all your posts here and when browsing stackoverflow. I'll be thinking of what questions I might redeem for the 20 points. :-/
    
There are so many little details in Rebol and I'm not always the most patient in trying to figure out things for myself before I ask. I appreciate everyone in the community being so tolerant when I sometimes ask something which might seem self-evident. Very little about Rebol is self-evident to me. But the core concepts and potential seem insanely great. Once again, I appreciate everyone's help and tolerance. I aim to be a solid, contributing member of the community someday, even if I'm not the fastest learner.

posted by:   stever     26-Feb-2015/13:11:50-8:00



iArnold,
    
re "Maybe this piece of code can be of help:"
    
That is just brilliant! I don't really grok the details of your snippet just yet, but I am playing with it and will continue to play until I DO get it fully. That sort of cogency in solving a real programming issue is most of what is getting me so excited. The helpfulness of the community and the opportunity to be a (relatively) early adopter of something which I believe can rebolutionize the world is so exciting!

posted by:   stever     26-Feb-2015/13:18:33-8:00



I've been playing with found? find index? and some other things. I'd like to be able to do a string search (and ultimately a replace) while also being secure that the script wouldn't crash if there is no match; this ain't it.
    
>> a: "abcd12345xyz" index? find a "w"
** Script Error: index? expected series argument of type: series port
** Where: halt-view
** Near: index? find a "w"
    
Works fine as long as I get a match. I can obviously use found? in place of index? and test for a match before I proceed with the index? but that seems ridiculously redundant (wasteful to search the string twice).
    
It's hard to believe that found? has to be run separately, as a graceful error/exit built into index? would have combined the two.
    
How would a wizard do this? While we are at it, how would a rebol wizard probably do the replace?
    
Nick and others have written many tutorials. If someone would prefer to just point me to a good, thorough tutorial on string manipulation that would be fine with me.
    
Thx.

posted by:   stever     26-Feb-2015/19:56:30-8:00



find returns a none! when it is not found, that is why you test it first using 'FOUND?
So
if found? find blabla [index? find blabla]
    
Unfortunately 'FOUND? is not in Red, so there I tested using TYPE? FIND blabla
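
A small sketch of what FIND hands back, assuming Rebol 2 (the names `s` and `pos` are illustrative): the result is either the series positioned at the match, or none.

```rebol
REBOL []

s: "123X123X"

pos: find s "X"
probe pos              ; "X123X"  (same string, positioned at the first match)
probe index? pos       ; 4
probe head pos         ; "123X123X"
probe find s "Q"       ; none     (no match, so INDEX? would raise an error)
```

Because the result is an ordinary value, it can be stored in a word and tested before INDEX? is applied to it.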


posted by:   iArnold     27-Feb-2015/11:01:57-8:00



iArnold,
But doesn't that mean the same string has to be searched (with find) TWICE for exactly the same match?? From a language design standpoint that seems grossly...ummm...ahhh...errr....suboptimal :-} I mean, null is a character, too, so why not return null or something at least testable and persistent? Is there some reason why rebol COULDN'T have a hardy and efficient string/series search?
    
Or is there one, but it isn't this one (which still doesn't explain, to my mind, why this one can't behave a bit more civilized)? Or is there something else I am missing--maybe the find doesn't actually do the same search twice (even though it looks like you are asking it to)??
    
Imagine you're writing something like a grep (which is what I am thinking of doing) and you are dealing with LOTS of strings and some of them could be near infinite in length....

posted by:   stever     27-Feb-2015/11:44:28-8:00



Well, that is true, and yes, it searches the string twice. Most of the time I need searches in small series, so this overhead is a small price. Besides, if you want speed, why use an interpreter? Rebol reduces the time you spend programming, and often the time saved by that is far larger than the extra milliseconds of CPU time spent.
There are other ways to traverse the string. Try FOREACH or PARSE.
Try to find more resources on rebol.org and the docs on rebol.com for R2 and R3. There is also a sticky subject in this forum with links to Rebol sources.

posted by:   iArnold     27-Feb-2015/15:00:09-8:00



Stever,
    
An obvious quick fix is:
    
a: "abcd12345xyz"
if error? try [y: index? find a #"w"] [y: none]
probe y
    
Here's another (better) option:
    
a: "abcd12345xyz"
either x: find a "w" [y: index? x] [y: none]
probe y
    
And actually, the example above can be written more succinctly like this (my preferred solution to your question):
    
a: "abcd12w345xywz"
y: either x: find a "w" [index? x] [none]
probe y
    
...and of course you don't need the 'y word unless you want to save the result for later use. Here's a one line solution to your question:
    
probe either x: find "abcd12w345xywz" "w" [index? x] [none]
    
iArnold mentioned that you could use 'parse - getting good with 'parse is extremely valuable in any text search/replace situation:
    
a: "abcd12345xyz"
y: copy []
parse a [any [to "w" mark: thru "w" (append y index? mark)]]
probe y
    
And just for fun, since iArnold also mentioned that you could use foreach - here's one solution which is not recommended here, but possibly may shed some light as a demonstration:
    
a: "abcd12345xyz"
y: 1 foreach c a [if c = #"w" [break] y: y + 1]
if y > length? a [y: none]
probe y
    
And, if you're going to do some operation like above (using indexes on an iterated list), 'repeat usually works nicely:
    
y: none
repeat i length? a [
     ; the loop counter is local to REPEAT, so record the match in y
     if a/:i = #"w" [y: i break]
]
probe y
    
HTH

posted by:   Nick     1-Mar-2015/9:14:05-8:00



The parse code above is a variation of the table example from Carl's basic parse documentation:
    
http://www.rebol.com/docs/core23/rebolcore-15.html#section-7.4
    
There are many more 'parse links in the "Useful Rebol Documents and Resources" sticky.

posted by:   Nick     1-Mar-2015/9:26:46-8:00



Aside from being fast, 'parse is also much more capable. For example, if you want to find multiple indexes of found characters:
    
a: "abcd12w345xywz"
y: copy []
parse a [any [to "w" m: thru "w" (append y index? m)]]
probe y
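
On the replace half of the question raised earlier in the thread, a minimal sketch using the standard REPLACE function in Rebol 2 (the word `b` is illustrative). It works on a copy so the original string is left alone, and a search with no match simply returns the string unchanged rather than raising an error.

```rebol
REBOL []

a: "abcd12w345xywz"

b: replace/all copy a "w" "W"
probe b                           ; "abcd12W345xyWz"  (every match replaced)
probe replace copy a "w" "W"      ; "abcd12W345xywz"  (first match only)
probe replace copy a "q" "Q"      ; "abcd12w345xywz"  (no match, unchanged)
probe a                           ; "abcd12w345xywz"  (original untouched)
```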

posted by:   Nick     1-Mar-2015/9:44:54-8:00



And for a little more clarification, you could do something similar to the multiple index 'parse example, using 'foreach and 'repeat loops. Repeat is generally a nice succinct (and fast) loop when working with series indexes:
    
    
a: "abcd12w345xywz"
    
    
y: copy []
i: 1
foreach c a [
     if c = #"w" [append y i]
     i: i + 1
]
probe y
    
    
y: copy []
repeat i length? a [
     if a/:i = #"w" [append y i]
]
probe y

posted by:   Nick     1-Mar-2015/17:37:52-8:00



BTW, this is really important to see - try this:
    
a: copy ""
insert/dup a "a" 10000000
insert tail a "w"
    
That creates a string 10 million + 1 characters long, with a "w" at the end. Try each of the solutions above on that string. You'll see that my preferred 'find solution and the 'parse solution each perform instantaneously (in a small fraction of a second). All of the loop solutions will keep you waiting for at least several seconds. That's really important if you're working with bigger strings, or any kind of large series data set. Any time you can hand the whole job to a single native function in Rebol, performance will be much faster than a loop that evaluates interpreted code for every element. Luckily, Rebol has a rich and well thought out collection of fast native functions which can handle all sorts of series data processing needs.
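
One way to measure this yourself, sketched under the assumption of Rebol 2 (the words `start`, `x`, `y` and `i` are illustrative, and the timings will vary by machine, so none are shown):

```rebol
REBOL []

; build the 10,000,001-character test string
a: copy ""
insert/dup a "a" 10000000
insert tail a "w"

; single native search
start: now/precise
y: either x: find a "w" [index? x] [none]
print ["find:  " difference now/precise start]

; interpreted loop over every character
start: now/precise
y: none
repeat i length? a [if a/:i = #"w" [y: i break]]
print ["repeat:" difference now/precise start]
```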

posted by:   Nick     1-Mar-2015/17:51:41-8:00



Notice that even iArnold's idea of performing the 'find operation twice is still several orders of magnitude faster than any of the loop examples:
    
a: copy "" insert/dup a "a" 10000000 insert tail a "w"
y: if find a "w" [probe index? find a "w"]

posted by:   Nick     1-Mar-2015/18:09-8:00



Thank you Nick. I will be trying each and every suggested option, probably in multiple ways, to help lodge the options in my brain.
    
I am also going through your tutorials for info on parse.
    
I think I have said this before and will surely say it again: Thank you!    
    
I am in awe of your work. You are an exceptional teacher and I can scarcely imagine attempting to find my way to competence in rebol without your numerous, fascinating, working examples.    
    
I have a funny feeling rebol/red is finally going to see its potential reached. It may take a little while to reach that tipping point but when it comes -- watch out! And when that happens you are going to be instrumental, for nearly everyone coming up, to reach comprehension. And so, I think YOU will have been instrumental in changing the world for the better.
    
I hope to meet you and thank you in person one day.

posted by:   stever     1-Mar-2015/22:18:34-8:00