Don't understand CMPS instruction.

Started by David, March 24, 2010, 03:53:36 AM

David

Hi guys, I don't understand the cmps instruction. This is how it is explained in my tutorial, but I still don't get it.

"The cmps instruction compares two strings. The CPU compares the string referenced by es:di to the string pointed at by ds:si. Cx contains the length of the two strings (when using the rep prefix). Like the movs instruction, the MASM assembler allows several different forms of this instruction:

Can anybody clear this up for me, and possibly give me a website with very detailed descriptions of the 80186+ instruction set?

Thanks, David.

BogdanOntanu

The most important instruction descriptions are to be found inside the INTEL CPU manuals. You can find the INTEL CPU manuals on the Intel web site. The description for CMPS is in the manual named: IA-32 Intel Architecture Software Developer's Manual; Volume 2A Instruction Set Reference, A-M.

In summary, the CMPS instruction compares the element at ESI with the element at EDI (in 32-bit code), one byte, word, or dword at a time, sets EFLAGS accordingly, and then advances both pointers. Usually you must take some action based on the flags after each comparison, and because of this the use of the REP prefix is possible but not recommended.

It is a shortcut for some pseudo code like this:

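mov al,[esi]    ; load the byte at DS:ESI
mov dl,[edi]    ; load the byte at ES:EDI
inc esi         ; advance both pointers (assumes DF = 0)
inc edi
cmp al, dl      ; set the flags from [esi] - [edi]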


But it uses internal temporary registers and not AL and DL.
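
To make the flags-based usage concrete, here is a minimal sketch, assuming a MASM32 console build with masm32rt.inc (as in the other listing in this thread). The names bufA, bufB, differ and done and the test data are made up for illustration. It uses REPE CMPSB to scan two equal-length buffers and then branches on ZF:

```asm
    include \masm32\include\masm32rt.inc

    .data
        bufA db "abcde"          ; hypothetical test data
        bufB db "abxde"
    .code
start:
    cld                          ; DF = 0: ESI and EDI count upward
    mov esi, OFFSET bufA         ; first source operand  (DS:ESI)
    mov edi, OFFSET bufB         ; second source operand (ES:EDI)
    mov ecx, SIZEOF bufA         ; maximum number of byte pairs to compare
    repe cmpsb                   ; repeat while the bytes are equal and ECX != 0
    jne differ                   ; ZF = 0: the last pair compared was different
    print "buffers are equal",13,10
    jmp done
differ:
    mov ebx, esi                 ; ESI points one past the mismatching byte
    sub ebx, OFFSET bufA
    dec ebx                      ; EBX = zero-based index of the mismatch
    print "first difference at index "
    print str$(ebx),13,10
done:
    exit
end start
```

With this data the bytes at index 2 ('c' and 'x') differ, so it should report index 2. If ECX is 0 on entry, REPE CMPSB performs no comparison and leaves the flags unchanged, so real code would need to handle empty buffers separately.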

For a more detailed explanation you can find the following text on page 3-133 of 25366620.pdf (Intel CPU Manual):

Quote
Description
Compares the byte, word, doubleword, or quadword specified with the first source operand with
the byte, word, doubleword, or quadword specified with the second source operand and sets the
status flags in the EFLAGS register according to the results.
Both source operands are located in memory. The address of the first source operand is read
from DS:SI, DS:ESI or RSI (depending on the address-size attribute of the instruction is 16, 32,
or 64, respectively). The address of the second source operand is read from ES:DI, ES:EDI or
RDI (again depending on the address-size attribute of the instruction is 16, 32, or 64). The DS
segment may be overridden with a segment override prefix, but the ES segment cannot be
overridden.
At the assembly-code level, two forms of this instruction are allowed: the "explicit-operands"
form and the "no-operands" form. The explicit-operands form (specified with the CMPS
mnemonic) allows the two source operands to be specified explicitly. Here, the source operands
should be symbols that indicate the size and location of the source values. This explicit-operand
form is provided to allow documentation. However, note that the documentation provided by
this form can be misleading. That is, the source operand symbols must specify the correct type
(size) of the operands (bytes, words, or doublewords, quadwords), but they do not have to
specify the correct location. Locations of the source operands are always specified by the
DS:(E)SI (or RSI) and ES:(E)DI (or RDI) registers, which must be loaded correctly before the
compare string instruction is executed.
The no-operands form provides "short forms" of the byte, word, and doubleword versions of the
CMPS instructions. Here also the DS:(E)SI (or RSI) and ES:(E)DI (or RDI) registers are
assumed by the processor to specify the location of the source operands. The size of the source
operands is selected with the mnemonic: CMPSB (byte comparison), CMPSW (word
comparison), CMPSD (doubleword comparison), or CMPSQ (quadword comparison using REX.W).
After the comparison, the (E/R)SI and (E/R)DI registers increment or decrement automatically
according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E/R)SI
and (E/R)DI register increment; if the DF flag is 1, the registers decrement.) The registers
increment or decrement by 1 for byte operations, by 2 for word operations, 4 for doubleword
operations. If operand size is 64, RSI and RDI registers increment by 8 for quadword operations.
The CMPS, CMPSB, CMPSW, CMPSD, and CMPSQ instructions can be preceded by the REP
prefix for block comparisons. More often, however, these instructions will be used in a LOOP
construct that takes some action based on the setting of the status flags before the next
comparison is made. See "REP/REPE/REPZ/REPNE/REPNZ—Repeat String Operation Prefix" in
Chapter 4, in the IA-32 Intel® Architecture Software Developer's Manual, Volume 2B, for a
description of the REP prefix.
In 64-bit mode, the instruction's default address size is 64 bits, 32 bit address size is supported
using the prefix 67H. Use of the REX.W prefix promotes doubleword operation to 64 bits (see
CMPSQ). See the summary chart at the beginning of this section for encoding data and limits.
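
(End of quote.) The increment sizes described in the quoted text can be checked with a short sketch, again assuming a MASM32 console build with masm32rt.inc. The names arrA and arrB and the test data are hypothetical. It runs REPE CMPSD over two dword arrays and prints how far ESI moved:

```asm
    include \masm32\include\masm32rt.inc

    .data
        arrA dd 10, 20, 30, 40   ; hypothetical test data
        arrB dd 10, 20, 99, 40
    .code
start:
    cld                          ; DF = 0: ESI and EDI count upward
    mov esi, OFFSET arrA
    mov edi, OFFSET arrB
    mov ecx, LENGTHOF arrA       ; the count is in elements (4), not bytes
    repe cmpsd                   ; each step compares 4 bytes and adds 4 to ESI/EDI
    mov ebx, esi
    sub ebx, OFFSET arrA         ; EBX = number of bytes ESI advanced
    print "ESI advanced by "
    print str$(ebx)
    print " bytes",13,10
    exit
end start
```

The third pair (30 and 99) differs, so three dword comparisons are made and ESI should end up 12 bytes past the start of arrA.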

Any assembler programmer should download the Intel CPU manuals and then read them with dedication at least 10 times. They are of the essence and mandatory.
Ambition is a lame excuse for the ones not brave enough to be lazy.
http://www.oby.ro

theunknownguy

http://www.intel.com/products/processor/manuals/

And if you want to read some other interesting things:

http://www.agner.org/ (Pretty much all the articles there rule)

MichaelW

This source implements some simple macros that duplicate the function of CMPSB, alone and in combination with the REPE and REPNE prefixes, hopefully to demonstrate how the instructions work.

Note that under Windows DS and ES contain the same selector, so they both "point" to the data segment. Note also that the direction flag is normally cleared, and that the Windows API and CRT (and a lot of other code) depend on it being cleared.


;====================================================================

cmps_b MACRO
    mov al, [esi]
    mov bl, [edi]
    pushfd          ;; get EFLAGS...
    pop edx         ;; into EDX
    .IF edx & 400h  ;; if direction flag set
        dec edi
        dec esi
    .ELSE           ;; direction flag clear
        inc edi
        inc esi
    .ENDIF
    cmp al, bl      ;; do comparison and set status flags
ENDM

;====================================================================

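repe_cmps_b MACRO
    LOCAL again, done
    jecxz done          ;; ECX = 0: no comparison, flags unchanged
  again:
    cmps_b              ;; compare one byte pair, set status flags
    lea ecx, [ecx-1]    ;; decrement ECX without touching the flags
    jnz done            ;; stop at the first mismatch (ZF = 0)
    jecxz done          ;; stop when the count is exhausted
    jmp again
  done:
ENDM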

;====================================================================

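repne_cmps_b MACRO
    LOCAL again, done
    jecxz done          ;; ECX = 0: no comparison, flags unchanged
  again:
    cmps_b              ;; compare one byte pair, set status flags
    lea ecx, [ecx-1]    ;; decrement ECX without touching the flags
    jz  done            ;; stop at the first match (ZF = 1)
    jecxz done          ;; stop when the count is exhausted
    jmp again
  done:
ENDM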

;====================================================================
.NOLIST
;====================================================================
    include \masm32\include\masm32rt.inc
;====================================================================
.LISTALL
;====================================================================
    .data
        str1 db "abcde"
        str2 db "abxdZ"
    .code
;====================================================================
; ------------------------------------------------------------
; This proc displays the value of the Overflow, Direction,
; Interrupt Enable, Sign, Zero, Auxiliary Carry, Parity, and
; Carry flags, in a DEBUG-style format.
;
;   Flag          position  flag set flag clear
;   ----          --------  -------- ----------
; Overflow        (bit 11)     OV        NV
; Direction       (bit 10)     DN        UP
; Interrupt       (bit 9)      EI        DI
; Sign            (bit 7)      NG        PL
; Zero            (bit 6)      ZR        NZ
; Auxiliary Carry (bit 4)      AC        NA
; Parity          (bit 2)      PE        PO
; Carry           (bit 0)      CY        NC
; ------------------------------------------------------------

dumpflags proc
    pushfd
    pushad
    pushfd
    pop ebx
    .IF bx & 1 SHL 11         ; Overflow (bit 11)
      print "OV "
    .ELSE
      print "NV "
    .ENDIF
    .IF bx & 1 SHL 10         ; Direction (bit 10)
      print "DN "
    .ELSE
      print "UP "
    .ENDIF
    .IF bx & 1 SHL 9          ; Interrupt (bit 9)
      print "EI "
    .ELSE
      print "DI "
    .ENDIF
    .IF bx & 1 SHL 7          ; Sign (bit 7)
      print "NG "
    .ELSE
      print "PL "
    .ENDIF
    .IF bx & 1 SHL 6          ; Zero (bit 6)
      print "ZR "
    .ELSE
      print "NZ "
    .ENDIF
    .IF bx & 1 SHL 4          ; Auxiliary Carry (bit 4)
      print "AC "
    .ELSE
      print "NA "
    .ENDIF
    .IF bx & 1 SHL 2          ; Parity (bit 2)
      print "PE "
    .ELSE
      print "PO "
    .ENDIF
    .IF bx & 1 SHL 0          ; Carry (bit 0)
      print "CY ",13,10
    .ELSE
      print "NC ",13,10
    .ENDIF
    popad
    popfd
    ret
dumpflags endp

;====================================================================
start:
;====================================================================

    mov edi, OFFSET str1
    mov esi, OFFSET str2
    REPEAT 5
      cmpsb
      call dumpflags
    ENDM

    print chr$(13,10)

    mov edi, OFFSET str1
    mov esi, OFFSET str2
    REPEAT 5
      cmps_b
      call dumpflags
    ENDM

    print chr$(13,10)

    mov edi, OFFSET str1+4
    mov esi, OFFSET str2+4
    REPEAT 5
      std
      cmpsb
      cld
      call dumpflags
    ENDM

    print chr$(13,10)

    mov edi, OFFSET str1+4
    mov esi, OFFSET str2+4
    REPEAT 5
      std
      cmps_b
      cld
      call dumpflags
    ENDM

    print chr$(13,10,13,10)

    mov edi, OFFSET str1
    mov esi, OFFSET str2
    mov ecx, 5
    repe cmpsb
    call dumpflags
    print str$(ecx),13,10
    mov edi, OFFSET str1
    mov esi, OFFSET str2
    mov ecx, 5
    repe_cmps_b
    call dumpflags
    print str$(ecx),13,10,13,10

    mov edi, OFFSET str1
    mov esi, OFFSET str2
    mov ecx, 5
    repne cmpsb
    call dumpflags
    print str$(ecx),13,10
    mov edi, OFFSET str1
    mov esi, OFFSET str2
    mov ecx, 5
    repne_cmps_b
    call dumpflags
    print str$(ecx),13,10,13,10

    mov edi, OFFSET str1+4
    mov esi, OFFSET str2+4
    mov ecx, 5
    std
    repe cmpsb
    cld
    call dumpflags
    print str$(ecx),13,10
    mov edi, OFFSET str1+4
    mov esi, OFFSET str2+4
    mov ecx, 5
    std
    repe_cmps_b
    cld
    call dumpflags
    print str$(ecx),13,10,13,10

    mov edi, OFFSET str1+4
    mov esi, OFFSET str2+4
    mov ecx, 5
    std
    repne cmpsb
    cld
    call dumpflags
    print str$(ecx),13,10
    mov edi, OFFSET str1+4
    mov esi, OFFSET str2+4
    mov ecx, 5
    std
    repne_cmps_b
    cld
    call dumpflags
    print str$(ecx),13,10,13,10

    inkey "Press any key to exit..."
    exit

;====================================================================
end start
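
On the direction flag note above: a minimal sketch of a backward compare that sets DF only around the string instruction and clears it again before calling anything else. It assumes a MASM32 console build with masm32rt.inc, and the names s1, s2, differ and done and the test data are hypothetical:

```asm
    include \masm32\include\masm32rt.inc

    .data
        s1 db "abcde"            ; hypothetical test data
        s2 db "abxde"
    .code
start:
    mov esi, OFFSET s1 + SIZEOF s1 - 1   ; last byte of s1
    mov edi, OFFSET s2 + SIZEOF s2 - 1   ; last byte of s2
    mov ecx, SIZEOF s1
    std                          ; DF = 1: ESI and EDI count downward
    repe cmpsb                   ; compare from the end toward the start
    cld                          ; restore DF = 0 before any API or CRT call
    mov ebx, ecx                 ; bytes at indices 0..ECX-1 were not compared
    jne differ                   ; CLD and MOV leave ZF as CMPSB set it
    print "buffers are equal",13,10
    jmp done
differ:
    print "last difference at index "
    print str$(ebx),13,10
done:
    exit
end start
```

Scanning from the end, the bytes at indices 4 and 3 match and index 2 differs, so ECX is left at 2 and the sketch should report index 2.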

eschew obfuscation