The MASM Forum Archive 2004 to 2012

General Forums => The Workshop => Topic started by: joemc on March 02, 2010, 08:02:38 PM

Title: Developing a Winsock read procedure
Post by: joemc on March 02, 2010, 08:02:38 PM
I have always wondered about the best way to buffer an incoming TCP Winsock stream.
I have seen several different methods. What I used to do was receive until I had one full "data packet", then process it, then shuffle the whole buffer around for each one. That does not seem quite right. A circular buffer would be awesome, except when it overlaps :) so I have come up with what I have below. It is just in a made-up pseudo language, and it is designed around a 6-byte header: the first byte is 0x2a, the second byte is the channel, and the length is a WORD that comes after the first DWORD (OSCAR :). I know this is a MASM forum and not a Winsock forum, but does anyone have advice on how they manage their buffers with minimal shuffling?

GlobalBuffer
GBSize
GBLength
-----
Receive Procedure

  LocalBuffer; Begin
  LBSize     ; End
  LBPos      ; Position
  LBLength   ; Position filled to

  COPY GlobalBuffer -> LocalBuffer
  LBPos=0;
  LBLength=GBLength

  Received = Recv(LocalBuffer+LBLength,LBSize-LBLength)  ; Append after the saved data

  IF (Received < 1)  ; Winsock Error or Disconnect
    QUIT

  LBLength += Received

  WHILE TRUE

    IF (LBLength-LBPos) < 6  ; Not a Full HEADER
      STORE remainder IN GlobalBuffer  ; GBLength = LBLength-LBPos
      BREAK

    IF (BYTE PTR[LocalBuffer+LBPos] != 0x2a) ; Not a Valid HEADER
      QUIT

    CHANNEL = BYTE PTR[LocalBuffer+LBPos+1]
    DataLength = WORD PTR[LocalBuffer+LBPos+4]

    IF (DataLength > LBLength-LBPos-6) ; Not a Full Packet
      STORE remainder IN GlobalBuffer  ; GBLength = LBLength-LBPos
      BREAK

    IF (CHANNEL ==2)
      Process2(LocalBuffer+LBPos+6,DataLength)  ; Process Without HEADER
    ELSEIF (CHANNEL ==1)
      Process1(LocalBuffer+LBPos+6,DataLength)  ; Process Without HEADER
    ELSEIF (CHANNEL ==3)
      Process3(LocalBuffer+LBPos+6,DataLength)  ; Process Without HEADER
    ELSEIF (CHANNEL ==4)
      Process4(LocalBuffer+LBPos+6,DataLength)  ; Process Without HEADER

    INCREASE LBPos by DataLength + 6

  ENDWHILE
END PROCEDURE
Title: Re: Developing a Winsock read procedure
Post by: clive on March 02, 2010, 09:16:27 PM
Use a single buffer into which you accumulate incoming data; it needs to be large enough to hold the largest chunk of data you need. You can't control the size of the data arriving over the socket, because the TCP/IP stack will packetize and fragment it.

Use a sliding window within that buffer to access the data. When you reach a threshold, pull the remaining data down to the beginning of the buffer with a single memmove (memcpy is not safe here because the source and destination can overlap), adjust the end-of-buffer pointer, and start accumulating data from the socket again.
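
For illustration, the pull-down scheme could be sketched in C roughly as follows. This is only a sketch under assumptions: the 6-byte header layout comes from the first post, while the names Window, win_next and win_compact, the 4096-byte capacity, and the big-endian reading of the length field are all made up here. It pulls the leftover bytes down after every parsing pass rather than at a tuned threshold, which keeps the "will the next packet fit" check trivial.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_LEN 6      /* 0x2a, channel, 2 bytes not used here, 2-byte length */
#define WIN_CAP 4096   /* assumed large enough for the largest packet */

typedef struct {
    unsigned char buf[WIN_CAP];
    size_t begin;      /* start of unprocessed data */
    size_t end;        /* end of received data; recv() appends here */
} Window;

/* Returns 1 and describes the next complete packet, 0 if more data is
   needed, -1 if the header is invalid or the packet can never fit.
   The payload pointer stays valid until the next win_compact(). */
static int win_next(Window *w, int *channel,
                    const unsigned char **data, size_t *len)
{
    assert(w->begin <= w->end && w->end <= WIN_CAP);
    size_t have = w->end - w->begin;
    const unsigned char *p = w->buf + w->begin;

    if (have < HDR_LEN) return 0;                 /* not a full header */
    if (p[0] != 0x2a) return -1;                  /* not a valid header */
    size_t n = ((size_t)p[4] << 8) | p[5];        /* assumption: big-endian */
    if (n > WIN_CAP - HDR_LEN) return -1;         /* cannot fit in buffer */
    if (have < HDR_LEN + n) return 0;             /* not a full packet */

    *channel = p[1];
    *data = p + HDR_LEN;
    *len = n;
    w->begin += HDR_LEN + n;                      /* advance the window */
    return 1;
}

/* Pull the unprocessed bytes down to the start with one memmove. */
static void win_compact(Window *w)
{
    size_t left = w->end - w->begin;
    if (w->begin != 0 && left != 0)
        memmove(w->buf, w->buf + w->begin, left);
    w->begin = 0;
    w->end = left;
}
```

A receive loop would call recv(s, w.buf + w.end, WIN_CAP - w.end, 0), add the result to w.end, call win_next until it stops returning 1, and then call win_compact. Because the leftover is always less than one packet, only a partial packet is ever moved.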

Alternatively, use a state machine to track whether you are waiting for header data or processing a chunk of channel data. When you have accumulated enough data for the header, process it; then start filling the channel buffer; once you have all the channel data, hand it off to the processing routine and go back to collecting a header. Repeat. State-specific data request sizes would minimize the amount of unnecessary copying required.
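
The state-machine variant might look something like the C sketch below. Everything named here (Parser, sm_dest, sm_want, sm_advance, the 8192-byte payload limit) is hypothetical, and the length field is again assumed to be big-endian. The idea it tries to show is the "state-specific request size": recv is asked for exactly what the current state still needs, and writes straight into that state's buffer.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_LEN  6
#define BODY_MAX 8192   /* assumed upper bound on the payload size */

typedef enum { WANT_HEADER, WANT_BODY } State;

typedef struct {
    State state;
    size_t got;                    /* bytes collected in the current state */
    size_t need;                   /* bytes the current state requires */
    int channel;                   /* valid once a packet is complete */
    size_t len;                    /* payload length of the packet */
    unsigned char hdr[HDR_LEN];
    unsigned char body[BODY_MAX];
} Parser;

static void sm_init(Parser *p)
{
    p->state = WANT_HEADER;
    p->got = 0;
    p->need = HDR_LEN;
}

/* Where the next recv() should write ... */
static unsigned char *sm_dest(Parser *p)
{
    return (p->state == WANT_HEADER ? p->hdr : p->body) + p->got;
}

/* ... and how many bytes to ask it for. */
static size_t sm_want(const Parser *p) { return p->need - p->got; }

/* Call after recv() stored n bytes at sm_dest(). Returns 1 when a packet
   is complete (channel, len, body are valid until the next call),
   0 when more data is needed, -1 on a bad header. */
static int sm_advance(Parser *p, size_t n)
{
    assert(n <= sm_want(p));
    p->got += n;
    if (p->got < p->need) return 0;

    if (p->state == WANT_HEADER) {
        if (p->hdr[0] != 0x2a) return -1;
        p->channel = p->hdr[1];
        p->len = ((size_t)p->hdr[4] << 8) | p->hdr[5];  /* assumption: big-endian */
        if (p->len > BODY_MAX) return -1;
        p->got = 0;
        if (p->len != 0) {             /* go collect the channel data */
            p->state = WANT_BODY;
            p->need = p->len;
            return 0;
        }
        p->need = HDR_LEN;             /* empty payload: already complete */
        return 1;
    }

    sm_init(p);                        /* body complete: back to the header */
    return 1;
}
```

The caller would loop on n = recv(s, sm_dest(&p), sm_want(&p), 0) followed by sm_advance(&p, n). The price of never copying is more recv calls, at least two per packet.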

-Clive
Title: Re: Developing a Winsock read procedure
Post by: joemc on March 03, 2010, 02:24:47 AM
So maybe more like this. I could move a few more things inside the loop instead of after it, but then it is not as clear to read.


BuffBegin
BuffEnd
WindBegin
WindEnd


ReceiveProc

  size = Recv(WindEnd,BuffEnd - WindEnd);

  if (size < 1)
    Quit ;Winsock Error or Disconnect
  endif
 
  WindEnd +=size  ; Advance end of window

  while(true)

    need = 6  ; Until the header is complete we only know we need a header
    if (WindEnd - WindBegin) < 6
      break ; Not a Full Header
    endif

    if (BYTE PTR[WindBegin] != 0x2a)
      Quit  ;Not a Valid Header
    endif
 
    need = 6 + WORD PTR[WindBegin+4]  ; Header plus data
    have = WindEnd - WindBegin
    if ( have < need)
      break  ;Not Enough Data
    endif
   
    Process(WindBegin + 6, need - 6)  ; Skip Header
   
    WindBegin += need  ; Advance Begin of window

  endwhile
 
  if ( WindBegin == WindEnd)
    WindBegin = BuffBegin  ; empty window so reset to
    WindEnd = BuffBegin    ; beginning of buffer
    return
  endif

  if (WindBegin + need > BuffEnd)
    if (WindBegin == BuffBegin)
      Quit                                  ; Data cannot fit in buffer
    endif
    moving = WindEnd-WindBegin              ; Get size of window
    memmove(BuffBegin,WindBegin,moving)     ; Slide Data to beginning of buffer
    WindBegin=BuffBegin                     ; Reset pointers to beginning
    WindEnd=BuffBegin+moving                ;   of buffer
  endif 
End ReceiveProc
Title: Re: Developing a Winsock read procedure
Post by: zemtex on November 13, 2011, 12:54:59 AM
Like clive said, cut and slice, which can be done in many ways.

If you need a senddata routine I have made one that you can use:


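; Returns SOCKET_ERROR on failure and 0 on success
TransferData PROC TheSock:DWORD, DataPtr:DWORD, DataLen:DWORD

push esi
push ebx
push edi
push ebp          ; save the frame pointer set up by the PROC prologue
xor esi, esi      ; esi = 0: flags argument for send and the success return value
mov ebx, DataLen  ; ebx = bytes still to send
mov edi, DataPtr  ; edi = next byte to send
mov ebp, TheSock  ; load last: the parameters are addressed through ebp
jmp again

ALIGN 16
pre:
add edi, eax      ; skip the bytes that were just sent
again:
INVOKE send, ebp, edi, ebx, esi
test eax, eax
js abandon        ; negative result: SOCKET_ERROR is returned in eax
sub ebx, eax      ; partial send: loop until nothing is left
jnz pre
mov eax, esi      ; everything sent: return 0

abandon:
pop ebp           ; restore the frame pointer before the epilogue runs
pop edi
pop ebx
pop esi
ret

TransferData ENDP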



Title: Re: Developing a Winsock read procedure
Post by: Tedd on November 18, 2011, 01:26:46 PM
A sliding window offers no benefit over a circular buffer with reference to avoiding overlapping. The solution is simply to ensure that your buffer is big enough (in both cases.)
The sliding window avoids a little complexity at the cost of repeatedly copying the data (to push it back to the start of the buffer.)
For the circular buffer you avoid the copying, but need to deal with wrapping which results in an extra recv call - though it can often be avoided by resetting the read & write pointers when the buffer becomes empty (which should happen regularly, as network data comes in bursts.)
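
As a rough C sketch of that idea (the Ring type and ring_* helpers are hypothetical names, and the capacity is deliberately tiny so the wrap is easy to follow): the size offered to recv is limited to the contiguous free space, which is why a wrap costs an extra recv call, and the indexes are reset whenever the buffer drains.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAP 16   /* deliberately tiny; a real buffer would be much larger */

typedef struct {
    unsigned char buf[RING_CAP];
    size_t rd;     /* read index */
    size_t wr;     /* write index */
    size_t used;   /* bytes currently stored */
} Ring;

/* Contiguous free space at the write position: the length to pass to
   recv(). If the free space wraps, a second recv() fills the rest. */
static size_t ring_write_span(const Ring *r)
{
    size_t free_total = RING_CAP - r->used;
    size_t to_end = RING_CAP - r->wr;
    return free_total < to_end ? free_total : to_end;
}

static unsigned char *ring_write_ptr(Ring *r) { return r->buf + r->wr; }

/* Call after recv() stored n bytes at ring_write_ptr(). */
static void ring_commit(Ring *r, size_t n)
{
    assert(n <= ring_write_span(r));
    r->wr = (r->wr + n) % RING_CAP;
    r->used += n;
}

/* Read byte i of the stored data; the index wraps transparently. */
static unsigned char ring_peek(const Ring *r, size_t i)
{
    assert(i < r->used);
    return r->buf[(r->rd + i) % RING_CAP];
}

/* Discard n processed bytes. */
static void ring_consume(Ring *r, size_t n)
{
    assert(n <= r->used);
    r->rd = (r->rd + n) % RING_CAP;
    r->used -= n;
    if (r->used == 0)
        r->rd = r->wr = 0;   /* empty: reset so the next recv() does not wrap */
}
```

Note that every read goes through ring_peek and pays for the modulo, so a payload that spans the wrap cannot be handed to a routine expecting one contiguous block.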
Title: Re: Developing a Winsock read procedure
Post by: clive on November 18, 2011, 07:42:22 PM
Well, it all depends. The pull-down method works exceedingly well in situations where you consume the majority of the buffer, and for that matter if you have to process the data in place using other functions that can't handle the fragmentation. The code to handle the roll-over point in a circular buffer can be quite significant and pervasive, and frankly I've seen far too many examples where coders simply fail to handle buffer boundary spanning conditions properly. The cost of moving a few hundred bytes, once every few dozen KB, can in many circumstances be far cheaper than checking/handling the boundary cases for every byte processed. It is certainly far cheaper than having to double-buffer the data all the time.

People should pick a method based on the most effective one for the data they are handling, and the nature of that data and its flow. That and what they can reliably/robustly implement and test. In many cases it's far more important that code functions correctly and is delivered on time than that it is the optimum solution.

Circular buffers do work exceedingly well in hardware designs where the wrapping can be totally transparent.

Another trick with circular buffers is to have a slightly larger buffer that extends beyond the natural wrap point, and copy data from the front to the extended portion. The trade-off is the cost of copying a few dozen bytes from known locations vs. masking the address/offset on every access. Or one can simply handle the wrap/span case by copying data, in that situation, to a secondary linear buffer.
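
A minimal C sketch of the extended-buffer trick, under stated assumptions: CAP, EXT and ring_view are hypothetical names, and EXT is assumed to be at least as large as the biggest record that must be read contiguously (and no larger than CAP).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CAP 16   /* logical ring size; tiny here for illustration */
#define EXT 8    /* extension past the wrap point, assumed >= largest record */

/* buf must have room for CAP + EXT bytes. Returns a pointer to len
   contiguous bytes starting at logical offset off. If the span crosses
   the wrap point, the bytes stored at the front of the ring are first
   copied into the extension so the caller never sees the wrap. */
static const unsigned char *ring_view(unsigned char *buf, size_t off, size_t len)
{
    assert(off < CAP && len <= EXT);
    if (off + len > CAP)
        memcpy(buf + CAP, buf, off + len - CAP);   /* mirror the wrapped part */
    return buf + off;
}
```

With this, a record that starts 2 bytes before the wrap point and is 5 bytes long costs one 3-byte copy, after which it can be parsed in place with ordinary pointer arithmetic.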

Title: Re: Developing a Winsock read procedure
Post by: zemtex on November 18, 2011, 10:20:08 PM
When a burst stops, the ping time between your client and the server can be 1 ms for example. In that short amount of time you can probably copy a Megabyte of data around in memory while waiting for the next burst. You just have to know when a stream is completed (not when a burst within a stream is completed)

Like it was said above, it depends what you are going to use it for. Some protocols work best with small chunks of data; the IRC protocol, for example, interprets messages of up to 512 characters. I'm not sure about telnet or FTP, but taking small chunks of data and interpreting them in a fixed manner and delay can give better responsiveness on the client.

EDIT: 100 MB of data using masm32 MemCopy in 78 ms, which is 1.28 MB of data per ms or 1282 bytes per microsecond, certainly plenty enough time to parse IRC messages of 512 characters  :naughty: