AppletTalk.com Java discussions newsgroups
Anand Narasimhan Guest
Posted: Tue Apr 26, 2005 5:48 pm Post subject: StringTokenizer/StreamTokenizer.
Hi,
I want to tokenize a string like '/Device/Interface?NAME=Serial1/0' into
the following tokens.
Device
Interface?NAME=Serial1/0
If I use StringTokenizer with '/' as the separator character I get the
following, which is not what I want.
Device
Interface?NAME=Serial1
0
I tried using StreamTokenizer.
StreamTokenizer st = new StreamTokenizer(
new StringReader( "/Device/Interface?NAME=Serial1/0" ) );
st.whitespaceChars( '/', '/' );
st.nextToken();
The result is
Device
INTERFACE
null
NAME
null
Serial1
null
I tried to call resetSyntax
st.resetSyntax();
st.wordChars(0, 255);
st.whitespaceChars( '/', '/');
st.quoteChar('"');
st.quoteChar('\'');
st.parseNumbers();
The result I got was
Device
INTERFACE?NAME=Serial1
null
I tried quoting 'Serial1/0' like /Device/Interface?NAME='Serial1/0' The
result I got was
Device
INTERFACE?NAME=
Serial1/0
Is there any way with StringTokenizer, StreamTokenizer or any other
means (without actually having to write a tokenizer on my own) to get the
result I want, which is
Device
Interface?NAME=Serial1/0
Thanks
Anand
Virgil Green Guest
Posted: Tue Apr 26, 2005 7:07 pm Post subject: Re: StringTokenizer/StreamTokenizer.
Not without defining rules for when a '/' should be treated as a separator
and when it should be treated as a character within a token.
--
Virgil
Anand Narasimhan Guest
Posted: Tue Apr 26, 2005 7:17 pm Post subject: Re: StringTokenizer/StreamTokenizer.
Thanks.
Setting whitespaceChars to '/' seems to work, except that when the
tokenizer sees a quote character, it returns everything within the quotes
as a separate token.
eg. /Device/Interface?NAME='Serial1/0' results in
Device
Interface?NAME=
Serial1/0
But I did not set the quote character as a whitespace character.
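For anyone trying to reproduce this: as far as the StreamTokenizer documentation describes it, a character registered with quoteChar() always starts a string token of its own, regardless of what counts as whitespace. The sketch below is a minimal, self-contained setup along the lines of the one in this thread; the class name QuoteTokenDemo and its tokenize helper are made up for illustration, and parseNumbers() is deliberately left out.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class QuoteTokenDemo {

    static List<String> tokenize(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.resetSyntax();               // start with every character "ordinary"
        st.wordChars(0, 255);           // then make every character a word character
        st.whitespaceChars('/', '/');   // '/' now separates words
        st.quoteChar('\'');             // a single quote now delimits a string token

        List<String> tokens = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                // sval holds the text for both word tokens and quoted strings
                tokens.add(st.sval);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("/Device/Interface?NAME='Serial1/0'"));
        // prints [Device, Interface?NAME=, Serial1/0]
    }
}
```

The word "Interface?NAME=" ends at the quote because the quote character is no longer a word character once quoteChar() has been called, so the quoted part comes back as the next token.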
Anand
Oscar Kind Guest
Posted: Tue Apr 26, 2005 8:40 pm Post subject: Re: StringTokenizer/StreamTokenizer.
Anand Narasimhan <anandn (AT) cisco (DOT) com> wrote:
| Quote: | I want to tokenize a string like '/Device/Interface?NAME=Serial1/0' into
the following tokens.
Device
Interface?NAME=Serial1/0
If I use StringTokenizer with '/' as the separator character I get the
following, which is not what I want.
Device
Interface?NAME=Serial1
0
[...]
Is there any way with StringTokenizer, StreamTokenizer or any other
means (without actually having to write a tokenizer on my own) to get the
result I want, which is
Device
Interface?NAME=Serial1/0
|
How is "/Device/Interface?NAME=Serial1/0".split("/", 3) insufficient?
I get {"", "Device", "Interface?NAME=Serial1/0"}, which is not exactly what
you want (there is a leading empty string), but quite close.
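A minimal sketch of the split-with-limit idea, assuming the input always starts with '/'. Per the String.split documentation, a positive limit caps the number of array elements and the last element keeps the rest of the input, separators included. The class name SplitWithLimit and the helper firstTwoSegments are invented for this example.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration only.
final class SplitWithLimit {

    /** Returns the first segment and the untouched remainder of a '/'-prefixed path. */
    static String[] firstTwoSegments(String path) {
        // Limit 3: at most three elements, the third holds everything after the second '/'.
        String[] parts = path.split("/", 3);
        // parts[0] is the empty string in front of the leading '/', so drop it.
        return Arrays.copyOfRange(parts, 1, parts.length);
    }

    public static void main(String[] args) {
        String[] tokens = firstTwoSegments("/Device/Interface?NAME=Serial1/0");
        System.out.println(Arrays.toString(tokens));
        // prints [Device, Interface?NAME=Serial1/0]
    }
}
```

This only works when the number of leading segments is known in advance; a path with a different depth would need a different limit.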
--
Oscar Kind http://home.hccnet.nl/okind/
Software Developer for contact information, see website
PGP Key fingerprint: 91F3 6C72 F465 5E98 C246 61D9 2C32 8E24 097B B4E2
Tor Iver Wilhelmsen Guest
Posted: Wed Apr 27, 2005 1:01 pm Post subject: Re: StringTokenizer/StreamTokenizer.
Anand Narasimhan <anandn (AT) cisco (DOT) com> writes:
| Quote: | I want to tokenize a string like '/Device/Interface?NAME=Serial1/0'
into the following tokens.
Device
Interface?NAME=Serial1/0
If I use StringTokenizer with '/' as the separator character I get the
following, which is not what I want.
Device
Interface?NAME=Serial1
0
|
You want to look into using regular expressions instead (present in
1.4 or later, separate install prior to that).
E.g.
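// needs java.util.regex.Pattern and java.util.regex.Matcher
Pattern p = Pattern.compile("/(\\w+)/(.*)");
Matcher m = p.matcher("/Device/Interface?NAME=Serial1/0");
String[] tokens = null;
if (m.matches()) {
    // group(1) is the first segment, group(2) is everything after the second '/'
    tokens = new String[] { m.group(1), m.group(2) };
}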
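For readers who want to run the regular-expression approach as is, here is a self-contained sketch. The class name RegexSplit and the tokenize helper are assumptions made for this example; the pattern matches a leading '/', one segment of word characters, a second '/', and then captures the rest of the string unchanged.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the regex idea, for illustration only.
final class RegexSplit {

    // Group 1: first segment (word characters only); group 2: everything after the second '/'.
    private static final Pattern PATH = Pattern.compile("/(\\w+)/(.*)");

    /** Returns the two tokens, or an empty array if the input does not have the expected shape. */
    static String[] tokenize(String path) {
        Matcher m = PATH.matcher(path);
        if (!m.matches()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("/Device/Interface?NAME=Serial1/0")));
        // prints [Device, Interface?NAME=Serial1/0]
    }
}
```

Compiling the Pattern once and reusing it avoids recompiling the expression on every call.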
| Quote: | I tried using StreamTokenizer.
|
StreamTokenizer is a very basic C lexer. It, like StringTokenizer,
should be discarded in modern code in favor of regular
expressions or a lexer/parser (google for them, there are quite a few
variants).
Virgil Green Guest
Posted: Thu Apr 28, 2005 5:20 pm Post subject: Re: StringTokenizer/StreamTokenizer.
Anand Narasimhan wrote:
| Quote: | Thanks.
Setting whitespaceChars to '/' seems to work, except that when the
tokenizer sees a quote character, it returns everything within the
quotes as a separate token.
eg. /Device/Interface?NAME='Serial1/0' results in
Device
Interface?NAME=
Serial1/0
But I did not set the quote character as a whitespace character.
Anand
|
Still, no rules. What are the rules for when a '/' is considered a separator
and when it is considered a valid character?
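To make the point concrete, here is a sketch under one possible rule, which is purely an assumption and not something the original poster stated: a '/' is a separator only before the first '?', and everything from the '?' onward belongs to the last token. The class name PathTokenizer is invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tokenizer implementing an ASSUMED rule:
// '/' separates segments only before the first '?'.
final class PathTokenizer {

    static List<String> tokenize(String input) {
        int q = input.indexOf('?');
        String path = (q < 0) ? input : input.substring(0, q);
        String query = (q < 0) ? "" : input.substring(q); // includes the '?'

        List<String> tokens = new ArrayList<>();
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {   // skip the empty string before a leading '/'
                tokens.add(segment);
            }
        }
        if (!query.isEmpty()) {
            if (tokens.isEmpty() || path.endsWith("/")) {
                tokens.add(query);      // no segment to attach the query to
            } else {
                int last = tokens.size() - 1;
                tokens.set(last, tokens.get(last) + query);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("/Device/Interface?NAME=Serial1/0"));
        // prints [Device, Interface?NAME=Serial1/0]
    }
}
```

A different rule (for example, "only the first two slashes separate") would lead to different code, which is why the rule has to be pinned down first.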
--
Virgil
Ross Bamford Guest
Posted: Fri Apr 29, 2005 1:07 pm Post subject: Re: StringTokenizer/StreamTokenizer.
On Wed, 2005-04-27 at 15:01 +0200, Tor Iver Wilhelmsen wrote:
| Quote: | You want to look into using regular expressions instead (present in
1.4 or later, separate install prior to that).
[...]
StreamTokenizer is a very basic C lexer. It, like StringTokenizer,
should be discarded in modern code in favor of regular
expressions or a lexer/parser (google for them, there are quite a few
variants).
|
Although I'd stick with the tokenizer for simple use cases or tight
code, like splitting into words - regexps are more expensive.
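As an illustration of the simple use case meant here, a sketch of splitting a line into words with StringTokenizer and its default delimiters (space, tab, newline, carriage return, form feed). The class name WordSplit is made up for the example, and no performance measurement is implied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical example class for illustration only.
final class WordSplit {

    static List<String> words(String text) {
        // The one-argument constructor uses the default whitespace delimiters.
        StringTokenizer st = new StringTokenizer(text);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("splitting  into\twords"));
        // prints [splitting, into, words]
    }
}
```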
Ross
--
[Ross A. Bamford] [ross AT the.website.domain]
Roscopeco Open Tech ++ Open Source + Java + Apache + CMF
http://www.roscopec0.f9.co.uk/ + info (AT) the (DOT) website.domain