Hello group. Attached is a RadASM test piece for analyzing text-format conversion. Below is the algorithm for converting spaced text into tabbed text. Sorry, they are both a bit dodgy. I just can't seem to get the algorithm to work properly. Maybe it qualifies for the "Pearl Code Awards", but it's the best I can do. This one is very close but not exact. This is my fourth attempt and I'm really losing patience with it. :( Could anyone suggest something to try, or a better approach?
SpacesToTabs1 proc private uses esi edi lpDest:DWORD,lpSource:DWORD
        ;LOCAL inChrPos:DWORD   ; input line chr position counter
        ;LOCAL outChrPos:DWORD  ; output line chr pos counter
        cld                     ; always copy forwards
        mov     esi,lpSource
        mov     edi,lpDest
loopit: lodsd                   ; 4 bytes EAX<--[ESI] and ADD esi,4
        mov     cl,1            ; flag as "loop"
                                ; always start after LF!
findLF: mov     edx,eax
        cmp     dl,0Ah          ; search for linefeeds,
        jne     @F
        sub     esi,3           ; copy any found and restart after
        jmp     Out1
@@:     cmp     dh,0Ah
        jne     @F
        sub     esi,2
        jmp     Out2
@@:     ror     edx,16          ; swap words
        cmp     dl,0Ah
        jne     @F
        sub     esi,1
        jmp     Out3
@@:     cmp     dh,0Ah
        je      Out4            ; still a problem here?
        ror     edx,16          ; LF not found, look for spaces
findSP: mov     cl,0            ; flag as "tab"
        cmp     eax,20202020h   ; "    "? (four spaces)
        je      OutTab
        and     edx,0FFFFFF00h  ; strip off LSB
        cmp     edx,20202000h   ; "x   "? (one char, three spaces)
        je      Out1
        and     edx,0FFFF0000h
        cmp     edx,20200000h   ; "xx  "? (two chars, two spaces)
        je      Out2
        and     edx,0FF000000h
        cmp     edx,20000000h   ; "xxx "? (three chars, one space)
        je      Out3
        mov     cl,1            ; cannot tab, flag as "loop"
Out4:   test    al,al           ; copy all four bytes
        je      OutNul          ; check each for null
        stosb                   ; write to EDI and increment it
        ror     eax,8           ; rotate next byte into AL
Out3:   test    al,al           ; three bytes
        je      OutNul
        stosb
        ror     eax,8
Out2:   test    al,al           ; two bytes
        je      OutNul
        stosb
        ror     eax,8
Out1:   test    al,al           ; one byte
        je      OutNul
        stosb
        cmp     cl,1
        je      loopit          ; loop? or
OutTab: mov     al,09h          ; output tab
        stosb
        jmp     loopit
OutNul: stosb
        ret
SpacesToTabs1 endp
I think I understand the problem now: the algorithm treats a tab (09h) it reads as a single character wide, whereas a tab can actually span 1-4 columns depending on its position in the line. Well, back to the drawing board. :)
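To make that concrete, here is a tiny reference model in C (not assembly, just something to check the arithmetic against). It assumes tab stops every 4 columns, matching the 4-byte groups in the routine above; `TAB_SIZE` and `tab_span` are made-up names for illustration.

```c
/* Assumption: tab stops every 4 columns, as in the routine above. */
#define TAB_SIZE 4u

/* Number of columns a tab occupies when it starts at zero-based
   column `col` of the current line. Always in the range 1..TAB_SIZE. */
unsigned tab_span(unsigned col)
{
    return TAB_SIZE - (col % TAB_SIZE);
}
```

So a tab at column 0 covers four columns, at column 1 three, at column 3 only one, and at column 4 four again. A routine that counts every tab as one column loses track of where the 4-column groups start.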
EDIT: Adding a "tabs to spaces" conversion at the beginning of this algo fixes the issue. Now maybe this amalgamation can be simplified.
        cld                     ; always copy forwards
        mov     esi,lpSource
        mov     edi,lpDest
        nop
        mov     cl,4            ; preset chr position counter+1
remTab: lodsb                   ; replace any tabs with spaces
        dec     cl
        and     cl,3            ; make CL always 0-3
        cmp     al,0Ah          ; linefeed?
        jnz     @F
        mov     cl,4            ; set CL to 3 on next char
@@:     cmp     al,09h          ; tab?
        jnz     putchr
        mov     al,20h
@@:     stosb                   ; write CL+1 spaces (up to the next tab stop)
        dec     cl
        jge     @B
        inc     cl              ; CL back to 0 so the next char starts a new group
        jmp     remTab
putchr: stosb                   ; else write it
        cmp     al,00h          ; was it a null?
        jne     remTab          ; loop until null
                                ; fall through when null copied
        ; copy the expanded text back into the source buffer
        ; (which must be big enough to hold it) for the spaces-to-tabs pass
        invoke  lstrcpy,lpSource,lpDest
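One way the two passes might be folded into a single pass is to track the visual column and hold back spaces until it is known whether they reach a tab stop. The sketch below is a C reference model, not a drop-in replacement for the assembly. It assumes a tab size of 4 and a destination buffer at least as large as the source, and `spaces_to_tabs` is a hypothetical name.

```c
#include <stddef.h>

#define TAB_SIZE 4u   /* assumption: tab stops every 4 columns */

/* Convert runs of spaces that end on a tab stop into a single tab.
   Tabs already in the input advance the column to the next stop.
   dst must hold at least strlen(src) + 1 bytes (output never grows).
   Returns the number of bytes written, not counting the NUL. */
size_t spaces_to_tabs(char *dst, const char *src)
{
    size_t col = 0;       /* visual column within the current line */
    size_t pending = 0;   /* spaces read but not yet written */
    char *out = dst;

    for (; *src != '\0'; ++src) {
        char c = *src;
        if (c == ' ') {
            ++pending;
            ++col;
            if (col % TAB_SIZE == 0) {   /* run reached a tab stop */
                *out++ = '\t';
                pending = 0;
            }
        } else if (c == '\t') {
            pending = 0;                 /* the tab covers held spaces */
            col += TAB_SIZE - (col % TAB_SIZE);
            *out++ = '\t';
        } else {
            while (pending > 0) {        /* run stopped short: keep spaces */
                *out++ = ' ';
                --pending;
            }
            *out++ = c;
            col = (c == '\n') ? 0 : col + 1;
        }
    }
    while (pending > 0) {                /* trailing spaces before the NUL */
        *out++ = ' ';
        --pending;
    }
    *out = '\0';
    return (size_t)(out - dst);
}
```

Like the assembly, this turns even a single space into a tab when that space ends on a tab stop ("xxx " becomes "xxx" plus a tab). Feeding the same test text to both should show where the outputs differ.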
How about moving the 'and cl,3' to just before writing the spaces?
Increment ecx for each character to keep track of the line offset, and only do 'and ecx,3' when you need to write the spaces for a tab (and don't forget to re-adjust ecx after writing the spaces!)
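As a rough sketch of that suggestion, here is the same idea in C as a reference model: `col` plays the role of ECX, and the mask is only applied when a tab is seen. It assumes a tab size of 4 and a destination buffer of up to 4 times the source length plus 1; `tabs_to_spaces` is a made-up name.

```c
#include <stddef.h>

#define TAB_SIZE 4u   /* assumption: tab stops every 4 columns */

/* Expand each tab in src to the spaces needed to reach the next tab stop.
   dst must hold up to TAB_SIZE * strlen(src) + 1 bytes.
   Returns the number of bytes written, not counting the NUL. */
size_t tabs_to_spaces(char *dst, const char *src)
{
    size_t col = 0;   /* line offset, counted up like ECX */
    char *out = dst;

    for (; *src != '\0'; ++src) {
        if (*src == '\t') {
            /* mask only here: the equivalent of "and ecx,3" */
            size_t pad = TAB_SIZE - (col & (TAB_SIZE - 1));
            col += pad;               /* re-adjust the offset for the spaces */
            while (pad-- > 0)
                *out++ = ' ';
        } else if (*src == '\n') {
            *out++ = *src;
            col = 0;                  /* new line starts at offset 0 */
        } else {
            *out++ = *src;
            ++col;
        }
    }
    *out = '\0';
    return (size_t)(out - dst);
}
```

For example, "a" followed by a tab and "b" expands to "a", three spaces, "b". The tab starts at offset 1, so it only needs three spaces to reach offset 4.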