polyalphabetic substitution cipher... help!

Started by gusto, November 24, 2005, 07:18:31 PM

gusto

thanks aero, yeah I'm trying to use the Vigenere method... and I'm not confused about how it works, just the actual coding. My professor loves to give projects without explaining the calls or procedures we can use, or how to use them within the project. For example, he said to use tables and repeats to make the alphabet A-Z two times, and to turn all the plaintext into uppercase. He wrote down roughly how to do it, but doesn't explain what each line does, so my whole class is just confused. I approach him so he can try to explain, and he just brushes me off pretty much..  :'(

MichaelW

Depending on how you do the conversion, if the alphabets are uppercase then the ciphertext will be uppercase. If not, the Irvine16 library includes a str_ucase procedure that converts a null-terminated string to uppercase.

If you are supposed to use just two alphabets, then are you supposed to use the case of the key characters to select the alphabet?

As I stated previously, your alphabets would be easier to access if you placed them in the data segment. You could do this by expanding a macro in the data segment, but with just two alphabets you could do it manually in far less time than it would take to write the macro. For example:

.data
alpha_lc db "ABCDEFGHIJKLMNOPQRSTUVWXYZ",0
alpha_uc db "BCDEFGHIJKLMNOPQRSTUVWXYZA",0

eschew obfuscation

gusto

prof. said not to use anything from the libraries, to do everything by hand.. hence my 'uppercase' procedure, which I'm not sure how to call so it'll work.  ::)...  :dazzled:

MichaelW

Your mytbl and uppercase procedures are not functional procedures; they are data definitions encased in procedure definitions, in the code segment. As a table, uppercase seems partially correct, but for index values 0-96 and 123-255 (everything outside the lowercase letters 97-122) I would use the value zero. You could then use the character code of the character to be converted as an index into the table, and if the indexed value were non-zero, substitute it for the character code. In pseudocode, and assuming the table and string were both in the data segment:

bx = offset table
di = offset string
looper:
  al = [di]
  if al == 0 then finished (end of string)
  xlat
  if al != 0 then [di] = al (substitute upper case for lower case)
  di = di + 1
  goto looper
finished:
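
The pseudocode above could be turned into 16-bit MASM roughly as follows. This is only a sketch: the names uctable and msg are made up for illustration, the include path is the one used elsewhere in this thread, and the program skeleton assumes the Irvine16 setup that the other listings in this thread rely on.

```asm
include \masm615\include\irvine16.inc

.data
  uctable byte 97 dup(0)        ; codes 0..96: zero means "leave unchanged"
  i = 'A'
  repeat 26                     ; codes 97..122 ('a'..'z') map to 'A'..'Z'
    byte i
    i = i + 1
  endm
  byte 133 dup(0)               ; codes 123..255: zero means "leave unchanged"

  msg byte "uppercase!!!",0     ; hypothetical test string

.code
.STARTUP
    mov  bx,offset uctable      ; XLAT reads the byte at DS:[BX+AL]
    mov  di,offset msg
looper:
    mov  al,[di]                ; fetch the next character
    cmp  al,0
    je   finished               ; zero terminator: end of string
    xlat                        ; AL = uctable[AL]
    cmp  al,0
    je   skip                   ; zero entry: keep the original character
    mov  [di],al                ; substitute upper case for lower case
skip:
    inc  di
    jmp  looper
finished:
    mov  dx,offset msg
    call WriteString            ; should display UPPERCASE!!!
.EXIT
END
```

The three pieces of the table add up to 97 + 26 + 133 = 256 entries, so any byte value can be used as an index without a range check.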

tenkey

One possible reason for placing the alphabet twice in a row is to avoid the modulus operation. Vigenere is cyclic, and just repeating the alphabet allows you to use simple indexing, or XLAT. You will need to use Aero's suggestion of translating, at least, the keyword letters so that 'A' --> 0, 'B' --> 1, etc. Or use all of Aero's suggestion and translate both plaintext and keyword. In the second case, you can index (or XLAT) the dual alphabet directly.
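
To make the dual-alphabet idea concrete, here is a minimal 16-bit sketch. It is not anyone's posted routine: the names alpha2, plain, keyrep and cipher are invented for the example, the key is assumed to be already repeated to the length of the plaintext, both strings are assumed to be uppercase letters only, and the skeleton assumes the Irvine16 setup used elsewhere in this thread.

```asm
include \masm615\include\irvine16.inc

.data
  alpha2 byte "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
         byte "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  ; second copy replaces the mod 26
  plain  byte "ATTACKATDAWN",0              ; uppercase plaintext
  keyrep byte "LEMONLEMONLE",0              ; key repeated to the same length
  cipher byte sizeof plain dup(0)           ; output, zero-terminated

.code
.STARTUP
    xor  si,si                  ; SI = position in the strings
next_char:
    mov  al,plain[si]
    cmp  al,0
    je   done                   ; stop at the terminating zero
    sub  al,'A'                 ; plaintext letter -> 0..25
    mov  bl,keyrep[si]
    sub  bl,'A'                 ; key letter -> 0..25
    add  bl,al                  ; sum is 0..50, always inside the 52-byte table
    xor  bh,bh                  ; BX = zero-extended sum
    mov  al,alpha2[bx]          ; look up the ciphertext letter
    mov  cipher[si],al
    inc  si
    jmp  next_char
done:
    mov  dx,offset cipher
    call WriteString            ; should display LXFOPVEFRNHR
    call Crlf
.EXIT
END
```

Because the largest possible sum is 25 + 25 = 50, the index never runs past the second copy of the alphabet, which is why no modulus (or compare-and-subtract) step is needed.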

The XOR code is puzzling. XOR-encryption was another popular substitution cipher that was usable with multibyte keys. However, it will not produce the "add 1, add 2" effect.

A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8

MichaelW

Aha, an explanation that I can understand. Starting with an already repeated key, my initial coding of a 32-bit encryption routine required just 13 instructions (and it worked, ATTACKATDAWN and LEMONLEMONLE produced LXFOPVEFRNHR, duplicating the results from the Wikipedia Vigenère cipher page).

Make that 12 instructions.


gusto

indexing.. is that taking one letter at a time and adding it to my key? I wasn't taught how to do that... the professor wrote '    txt[ebx]   ' on an example as a way to get each individual letter one by one, but when I tried to assemble it, it kept giving me an error.. Michael, lol so you did the encryption program just to try it out? hehe you guys are too funny

MichaelW

This would be an example of indexing (taken from a 16-bit version of my encryption routine):

mov   bl, plaintext[si]


The plaintext[si] operand addresses a byte in the data segment at an offset address that is calculated (at runtime) by adding the offset address of the plaintext label to the value in SI. The instruction copies the addressed byte into BL.
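
As a small illustration of indexing in a loop, the sketch below walks a string one byte at a time and counts its characters. The string contents and the labels scan and done are placeholders, and the skeleton assumes the Irvine16 setup used elsewhere in this thread. In 16-bit addressing only BX, SI, DI and BP can appear inside the brackets, which is why SI is used here.

```asm
include \masm615\include\irvine16.inc

.data
  plaintext byte "ATTACKATDAWN",0   ; hypothetical sample string

.code
.STARTUP
    xor  si,si                  ; start at index 0
scan:
    mov  bl,plaintext[si]       ; BL = byte at offset(plaintext) + SI
    cmp  bl,0
    je   done                   ; zero terminator reached
    inc  si                     ; advance to the next character
    jmp  scan
done:
    movzx eax,si                ; SI now holds the string length
    call WriteDec               ; should display 12
    call Crlf
.EXIT
END
```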

For more information start at Direct Memory Operands here:

http://webster.cs.ucr.edu/Page_TechDocs/MASMDoc/ProgrammersGuide/Chap_03.htm

And continue through the Indirect Memory Operands section.

My code was based on this part of tenkey's post:
Quote:
Or use all of Aero's suggestion and translate both plaintext and keyword. In the second case, you can index (or XLAT) the dual alphabet directly.

I used indexing instead of XLAT, but I see no reason why you could not use XLAT. I used uppercase for everything. For the dual alphabet I used two identical copies of the alphabet.

See the section that starts with "Vigenère can also be viewed algebraically." Here:

http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher

And the page at the modulo link to understand how the dual alphabet can eliminate the modulus operation.

gusto

me and a friend came up with a convert-to-uppercase program using xlat... but how do I incorporate it into my encryption program? I have been trying but don't know where to place it and what to use for 'text'...

include \masm615\include\irvine16.inc

.data

table byte 0
i=1
repeat 96
byte i
i=i+1
endm

repeat 26
byte i-32
i=i+1
endm

repeat 133
byte i
i=i+1
endm

abece byte 41h
i=42h
repeat 25
byte i
i=i+1
endm

i=42h
repeat 25
byte i
i=i+1
endm

text byte "uppercase!!!",0

.code

main proc
startup

mov ebx,offset table
mov ecx,sizeof text

ol:
mov al,text[ecx]
xlat
mov text[ecx],al
loop ol

mov al,text[0]
xlat
mov text[0],al

mov edx,offset text
call writestring

exit
main endp

end main

what do I use as 'text' when I have read the string entered by the user? Here text is determined by me; in my encryption it's based on what the user inputs... what register would it be stored in?

MichaelW

I see now what you are doing with the uppercase conversion table, and it's cleaner than what I had in mind. Irvine16 is, obviously, a 16-bit library. In your code the only 32-bit register that you actually need to use is ECX (in text[ecx]), because CX will not work as an index. You can use a 32-bit register to pass offset addresses and counts, but the library procedures will use only the lower-order 16-bits because in a 16-bit app offset addresses are 16-bit values, and the loop instruction actually uses CX as the loop counter. I edited your code so I could test it, cleaned it up and corrected a few problems, and demonstrated one method of getting a string from the user. The call to ReadChar is so I could see the output without having to modify the program properties or run it from a batch file or the command line.

include \masm615\include\irvine16.inc

.data

  table byte 0
  i=1
  repeat 96
    byte i
    i=i+1
  endm
  repeat 26
    byte i-32
    i=i+1
  endm
  repeat 133
    byte i
    i=i+1
  endm

  abece byte 41h
  i=42h
  repeat 25
    byte i
    i=i+1
  endm
  ;i=42h
  I=41H                 ; NEED TWO IDENTICAL ALPHABETS
  ;repeat 25
  REPEAT 26 
    byte i
    i=i+1
  endm
  BYTE 0                ; SO CAN DISPLAY AS STRING

  ;text byte "uppercase!!!",0
  TEXT DB 50 DUP(0)

.code
;main proc
;startup
.STARTUP
 
    MOV DX,OFFSET ABECE
    CALL WRITESTRING
    CALL CRLF

    MOV DX,OFFSET TEXT
    MOV CX,SIZEOF TEXT
    CALL READSTRING
    CALL CRLF

    mov ebx,offset table
    mov ecx,sizeof text
    CALL UCASE

    mov edx,offset text
    call writestring
    CALL READCHAR

.EXIT

UCASE PROC
  ol:
    mov al,text[ecx]
    xlat
    mov text[ecx],al
    loop ol
    mov al,text[0]
    xlat
    mov text[0],al
    RET
UCASE ENDP
;exit

;main endp

;end main
END

gusto


include \masm615\include\irvine16.inc

key = 239
bufmax = 35

.data

sprompt byte "Enter the plain text: ",0
sencrypt byte "CIPHERTEXT: ",0
skey byte "KEY: ",0
sdecrypt byte "Decrypted: ",0
theend byte "END OF DATA",0
lengthbuf byte "LENGTH = ",0
buffer byte bufmax dup(0)
bufsize dword ?
.code

main proc
startup

table byte 0
i=1
repeat 96
byte i
i=i+1
endm

repeat 26
byte i-32
i=i+1
endm

repeat 133
byte i
i=i+1
endm

abece byte 41h
i=42h
repeat 25
byte i
i=i+1
endm

i=42h
repeat 25
byte i
i=i+1
endm

start:
mov edx,offset skey
call writestring
call crlf

call inputthekey
mov edx,offset lengthbuf
call writestring
mov edx,sizeof bufsize      ; trying to ge LENGTH of KEY, keep getting 4...?
call writedec
call crlf

mov edx,offset sprompt
call writestring
call crlf
call inputthestring

call translatebuffer
mov edx,offset sencrypt
call displaymessage

mov edx,offset theend
call writestring


exit
main endp




inputthekey proc

call readchar
cmp al,'!'
je quit

pushad
mov ecx,bufmax
mov edx,offset buffer
call readstring
mov bufsize,eax
call crlf
popad
ret
inputthekey endp

inputthestring proc

call readchar
  mov ebx,offset table
  mov ecx,sizeof buffer
call UCASE


pushad
mov ecx,bufmax
mov edx,offset buffer
call readstring



mov bufsize,eax
call crlf
popad
ret
inputthestring endp


displaymessage proc
pushad
call writestring
call crlf
mov edx,offset buffer
call writestring
call crlf
popad
ret
displaymessage endp


translatebuffer proc
;pushad
mov ecx,bufsize
mov esi,0
L1:
mov al,buffer[esi]
xlat
add buffer[esi],abece[esi]               ; getting error here, what should i use?
inc esi
loop l1
mov al,buffer[0]
xlat
mov buffer[0],al
;popad
ret
translatebuffer endp



quit: exit

UCASE PROC
  ml1:
    mov al,buffer[ecx]
    xlat
    mov buffer[ecx],al
    loop ml1
    mov al,buffer[0]
    xlat
    mov buffer[0],al
    RET
UCASE ENDP
end main



that's what I have so far.....
trying to work with the xlat, still confused, am I going about it the right way? it's not converting to uppercase, and I'm not getting the encryption to work  ::) . . . also I'm getting LENGTH: 321 or big numbers like that for my LENGTH even if I just enter 3 letters... aye what a headache..

MichaelW

The problem with:

add  buffer[esi],abece[esi]

Is that the instruction contains two memory operands. As stated here under General-Purpose Registers:

http://webster.cs.ucr.edu/Page_TechDocs/MASMDoc/ProgrammersGuide/Chap_01.htm

Quote:
The 8086-based processors do not perform memory-to-memory operations. For example, the processor cannot directly copy a variable from one location in memory to another. You must first copy from memory to a register, then from the register to the new memory location. Similarly, to add two variables in memory, you must first copy one variable to a register, then add the contents of the register to the other variable in memory.

So in this case you could replace the instruction with something like this:

mov  dl,abece[esi]
add  buffer[esi],dl


Your code has a major problem in that the tables are in the code segment, directly in the execution path. As a result the processor is attempting to execute the table data as if it were a valid instruction sequence, which it obviously is not. The tables must be outside the execution path, and should preferably be in the data segment. A program can access data in the code segment, but because most instructions that access data use DS by default, and for the small memory model that the Irvine16 libraries use CS != DS, each memory access would require a CS segment override.

In case you defined the table macros in the code segment thinking that the macros must "execute" to build the tables, you should recognize that macro expansion takes place when the source file is assembled. MASM will expand the macros as they are encountered, regardless of which segment they are defined in.

I see two problems with getting the length of the key that was input. First, I don't understand what this is for:

mov  edx,sizeof bufsize

WriteDec expects the number in EAX, not EDX, and SIZEOF returns the total number of bytes allocated for a variable, in this case 4.

Your inputthekey procedure is loading the length of the string returned by ReadString into bufsize, but the code that calls ReadChar is discarding the first character that is input. If you need to provide the user with a way to abort, a better choice would probably be to detect a zero length input (i.e. detect that the user just pressed enter without typing any characters).
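
A zero-length check could look something like the sketch below. It assumes, as in the Irvine16 documentation, that ReadString takes the buffer offset in DX and the maximum count in CX and returns the number of characters read in AX; keylen and the quit label are names chosen for this example only.

```asm
include \masm615\include\irvine16.inc

bufmax = 35

.data
  skey   byte "KEY: ",0
  buffer byte bufmax+1 dup(0)   ; one spare byte for the zero terminator
  keylen word ?                 ; hypothetical variable for the key length

.code
.STARTUP
    mov  dx,offset skey
    call WriteString

    mov  dx,offset buffer
    mov  cx,bufmax
    call ReadString             ; assumed to return the length in AX
    cmp  ax,0
    je   quit                   ; user pressed Enter only: abort

    mov  keylen,ax              ; keep the length for later use
    movzx eax,ax                ; WriteDec expects the number in EAX
    call WriteDec
    call Crlf
quit:
.EXIT
END
```

Because no character is read separately beforehand, the whole key ends up in buffer and the reported length needs no adjustment.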

gusto

tables before the execution! ah no wonder, it's things like this that get me mad that the prof. doesn't explain.. well that fixes that error, and I changed the line from edx to eax for LENGTH, but now all I keep getting is 4 no matter what length KEY I input.., maybe it's reading the actual words "KEY:"?... also I have the ReadChar instruction because, according to the prof., I need to abort only when the KEY is = ' ! '. you showed me before how to do this by using readchar and comparing my input to '!' and, if equal, jumping to exit...

current code:



include \masm615\include\irvine16.inc

key = 239
bufmax = 35

.data

sprompt byte "Enter the plain text: ",0
sencrypt byte "CIPHERTEXT: ",0
skey byte "KEY: ",0
sdecrypt byte "Decrypted: ",0
theend byte "END OF DATA",0
lengthbuf byte "LENGTH = ",0
buffer byte bufmax dup(0)
bufsize dword ?
.code


table byte 0
i=1
repeat 96
byte i
i=i+1
endm

repeat 26
byte i-32
i=i+1
endm

repeat 133
byte i
i=i+1
endm

abece byte 41h
i=42h
repeat 25
byte i
i=i+1
endm

i=42h
repeat 25
byte i
i=i+1
endm


main proc
startup



start:
mov edx,offset skey
call writestring
call crlf

call inputthekey
mov edx,offset lengthbuf
call writestring
mov eax, bufsize    ; edit! i re-read what you write 4 times and understood this part :)
call writedec
call crlf

mov edx,offset sprompt
call writestring
call crlf
call inputthestring

call translatebuffer
mov edx,offset sencrypt
call displaymessage

mov edx,offset theend
call writestring


exit
main endp




inputthekey proc

call readchar
cmp al,'!'
je quit

pushad
mov ecx,bufmax
mov edx,offset buffer
call readstring
mov bufsize,eax
call crlf
popad
ret
inputthekey endp

inputthestring proc


  mov ebx,offset table
  mov ecx,sizeof buffer
call UCASE                                ; trying to make PLAINTEXT all be uppercase, not being converted though.?


pushad
mov ecx,bufmax
mov edx,offset buffer
call readstring



mov bufsize,eax
call crlf
popad
ret
inputthestring endp


displaymessage proc
pushad
call writestring
call crlf
mov edx,offset buffer
call writestring
call crlf
popad
ret
displaymessage endp


translatebuffer proc
;pushad
mov ecx,bufsize
mov esi,0                     ; think this is wrong, but following example in my book :(
L1:
mov  dl,abece[esi]                    ; encryption code, adding alphabet[index] to plaintext[index]..
add  buffer[esi],dl
xlat
add buffer[esi],al
inc esi
loop l1
mov al,buffer[0]
xlat
mov buffer[0],al
;popad
ret
translatebuffer endp

UCASE PROC
  ml1:
    mov al,buffer[ecx]
    xlat
    mov buffer[ecx],al
    loop ml1
    mov al,buffer[0]
    xlat
    mov buffer[0],al
    RET
UCASE ENDP

quit: exit


end main

MichaelW

You moved table and abece outside the execution path, so the program no longer hangs, but they are still in the code segment. If you move table and abece into the data segment then your code will be able to access them without using a CS override. As the code is right now:

CS = 55Eh
Offset address of table (in the segment where it is defined) = 0
Offset address of abece (in the segment where it is defined) = 100h
DS = 5F1h
Absolute address of table = 55Eh*16+0 = 55E0h
Absolute address of abece = 55Eh*16+100h = 56E0h.

Without a CS override MASM assumes that table and abece are in the data segment, so the instructions are trying to access table at absolute address 5F1h*16+0=5F10h and abece at absolute address 5F1h*16+100h=6010h, and failing because the addresses are not correct. You could add a CS override to each instruction that accesses a table (e.g. mov dl,cs:abece[esi]), and this would correct the problem, but it would be easier to just place the tables in the data segment and leave the instructions that access the tables as they currently are.

Your inputthekey procedure is still discarding the first character input. The simplest method that I can think of to get around this would be to move the returned character into the buffer, add one to the offset address passed to ReadString (so it will not overwrite the first character), and add one to the length returned.

inputthekey proc
  call readchar
  cmp al,'!'
  je  quit
  mov buffer,al   ;<<
  pushad
  mov ecx,bufmax
  mov edx,offset buffer+1  ;<<
  call readstring
  add eax,1  ;<<
  mov bufsize,eax
  call crlf
  popad
  ret
inputthekey endp


And you will need two full alphabets. The macros are currently expanding to:

ABCDEFGHIJKLMNOPQRSTUVWXYZBCDEFGHIJKLMNOPQRSTUVWXYZ
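
One way to get two full alphabets from macros, offered only as a sketch, is to nest the REPEAT blocks so that the counter is reset to 'A' at the start of each copy. The label form "abece label byte" and the trailing zero (so the table can be displayed as a string) are choices made for this example, and the skeleton assumes the Irvine16 setup used elsewhere in this thread.

```asm
include \masm615\include\irvine16.inc

.data
  abece label byte              ; names the first byte generated below
  repeat 2                      ; two identical copies of the alphabet
    i = 'A'                     ; restart at 'A' for each copy
    repeat 26
      byte i
      i = i + 1
    endm
  endm
  byte 0                        ; terminator so it can be shown as a string

.code
.STARTUP
    mov  dx,offset abece
    call WriteString            ; should display A..Z followed by A..Z
    call Crlf
.EXIT
END
```

The expansion happens at assembly time, so the table costs nothing at run time; displaying it once is just a quick way to confirm that both copies start with A.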

gusto

fixed the LENGTH with what you put, and fixed the position of my tables, now it's reading the LENGTH correctly. also I used this code for Alphabet 2x.


abece byte 0
i=1
while i LE 26
byte 40h+1
i=i+1
endm

while i LE 26
byte 40h+i
i=i+1
endm


I have my program encrypting... just not exactly the way I want it to... it's giving me characters, not just letters... I'm guessing my TranslateBuffer code is not correct  ::)...