
IRC data parsing

Started by ecube, March 21, 2007, 05:55:30 PM


ecube

Well, I'm having a little trouble parsing IRC data. Right now I parse line by line, which works fine for most things, but I've noticed IRC servers are real good at rushing out information, so I may end up with 3 messages in one 1024-byte buffer, one of which is only half complete. This then throws off the next read, as it has the rest of that incomplete message prefixing it. The only time I've had this problem is when getting a massive nick list on joining a channel. I'm not asking for code, but rather for an approach to solving this.

example

my username being ecube and the msg code 353 signifying the channel nick list


I receive this in my 1024-byte buffer
:tolkien.freenode.net 353 ecube = #Perll :ecube  EricL cyth goraxe__ zograk mmk[null] shaggyoaf Gigsaw nazty scud anno_ mkomitee axiom pelai miguelanxo fowlduck Apachez snoopy noganex_ mot michoelc work_donato f3ew DJTrey archpollux koye hvx Casan


then I recv this
blastura UncleRemus norc Southron shucks Twi discordja Rene_J olivierk_ Southen werkRelated Slaughter CPAN JiBEsH glockglock35 imcsk8 saorge kalila dmorar carson p4tux Stx_ Sir_J_ Sephiroth iratik dazjorz Winkie yell0w cjeris buetow jetscreamer




Of course my code fails, as it's again parsing line by line and looking for a msg code (like the one in the first recv) to know how to parse the line. Any help would be much appreciated, thanks.

Tedd

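The server shouldn't be sending you a message longer than 512 bytes, and every message is terminated by a newline (carriage-return + line-feed). What recv hands you, though, is just whatever bytes have arrived so far, which needn't line up with message boundaries.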

So, the way I would approach the whole message-receiving problem is with two buffers:
- the first is your data buffer - this is where you recv data as it comes in, in network chunks (which is why you sometimes get 'half' messages)
- the other is the message buffer - you parse the bytes from the data buffer, and copy a whole line into it, up to the newline characters

Now, there are a few finer details.

The first is that the data buffer is circular: whether reading or writing, when you get to the end you continue from the start again. You have two pointers, one to read and one to write. When you recv, only recv up to the number of bytes remaining to the end of the data buffer (counting from the current position of the write pointer; the 'end' is either the byte before the current read pointer, or the actual end of the buffer if read_pointer <= write_pointer). If there's still data waiting you'll get another FD_READ, and at that point you'll be recving into the start of the buffer again.
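As a rough sketch of the write side in C: the thread gives no code, so ring_t, ring_writable and ring_commit are made-up names, and keeping one slot free (so that read == write always means "empty") is an assumption of this sketch, not something stated above.

```c
#include <stddef.h>

#define DATA_BUF_SIZE 4096   /* "4k or so", as suggested in the post */

/* Hypothetical ring-buffer state; the names are illustrative only. */
typedef struct {
    char   data[DATA_BUF_SIZE];
    size_t rd;   /* index of the next byte to read  */
    size_t wr;   /* index of the next byte to write */
} ring_t;

/* How many bytes can be received in ONE contiguous chunk starting at wr.
   One slot is always left unused, so rd == wr can only mean "empty". */
size_t ring_writable(const ring_t *rb)
{
    if (rb->wr < rb->rd)
        return rb->rd - rb->wr - 1;          /* up to the byte before rd */

    /* wr >= rd: free up to the physical end of the buffer; if rd is 0,
       the last byte must stay free or wr would wrap onto rd. */
    return DATA_BUF_SIZE - rb->wr - (rb->rd == 0 ? 1 : 0);
}

/* Call after recv() has stored n bytes at &rb->data[rb->wr]. */
void ring_commit(ring_t *rb, size_t n)
{
    rb->wr = (rb->wr + n) % DATA_BUF_SIZE;
}
```

On FD_READ the call would then be something like `n = recv(sock, rb.data + rb.wr, (int)ring_writable(&rb), 0);` followed by `ring_commit(&rb, n)` when n > 0. If ring_writable returns 0 the buffer is full, so drain some lines before calling recv again.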

As for reading, make a read_line function which copies characters from the current read pointer until the next newline is found (remembering that the buffer is circular, and that the 'end' is the current write pointer). It returns the number of characters copied, or zero if a newline wasn't met before the end of the data (which means you have an incomplete line and have to wait for the rest of it to arrive.)
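One way read_line could look in C, again only as a sketch: the ring_t layout is hypothetical (repeated here so the snippet compiles on its own), the CR+LF is stripped from the copy, and empty lines are skipped so that a zero return always means "no complete line yet". Those last two choices are assumptions, not part of the description above.

```c
#include <stddef.h>

#define DATA_BUF_SIZE 4096

/* Hypothetical ring-buffer state: rd == wr means "empty". */
typedef struct {
    char   data[DATA_BUF_SIZE];
    size_t rd;   /* index of the next byte to read  */
    size_t wr;   /* index of the next byte to write */
} ring_t;

/* Copy one complete line (without its CR+LF) into out, NUL-terminated.
   Returns the number of characters copied, or 0 if no complete line has
   arrived yet; in that case rd is left where the partial line starts. */
size_t read_line(ring_t *rb, char *out, size_t out_size)
{
    size_t pos = rb->rd;
    size_t len = 0;

    if (out_size == 0)
        return 0;

    while (pos != rb->wr) {                    /* the 'end' is the write pointer */
        char c = rb->data[pos];
        pos = (pos + 1) % DATA_BUF_SIZE;       /* circular advance */

        if (c != '\n') {
            if (len + 1 < out_size)            /* overflow guard: extra bytes are dropped */
                out[len++] = c;
            continue;
        }

        if (len > 0 && out[len - 1] == '\r')   /* strip the carriage return */
            len--;
        rb->rd = pos;                          /* consume the line, newline included */
        if (len > 0) {
            out[len] = '\0';
            return len;
        }
        /* empty line: skip it and keep scanning */
    }
    return 0;                                  /* incomplete line: wait for more data */
}
```

After each recv you would call it in a loop, e.g. `while (read_line(&rb, msg, sizeof msg) > 0) { /* parse msg */ }`, so every complete message in the buffer gets handled and a trailing half message just waits for the next recv.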

Make your data buffer fairly large (4k or so); your message buffer can be 512 bytes (but read_line should still check for buffer overflow, just to be safe :wink)

Kablam! :green


..alternatively, you can be 'hardcore' and just use one buffer, treat it circularly, and interpret the message directly from it (you will have fun when a nick crosses the end of the buffer though.)
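To give an idea of what the one-buffer route involves, here is a small C sketch of comparing a word (a nick, say) against bytes that may wrap around the end of the buffer. ring_match and its parameters are invented for illustration; avail is assumed to be the number of valid bytes between pos and the write pointer.

```c
#include <stddef.h>

/* Returns 1 if the bytes starting at index pos of the circular buffer
   buf (of the given size) spell out word, 0 otherwise. avail limits the
   comparison to bytes that have actually been received. */
int ring_match(const char *buf, size_t size, size_t pos,
               size_t avail, const char *word)
{
    while (*word) {
        if (avail == 0 || buf[pos] != *word)
            return 0;
        pos = (pos + 1) % size;   /* wrap from the last byte to the first */
        avail--;
        word++;
    }
    return 1;
}
```

Every comparison, search and copy needs the same wrap handling, which is why copying a whole line out into a flat message buffer first tends to be the easier option.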

ecube

Alright, thanks a lot Tedd, you're very clever :D Have a great day.