Hi all,
this is my first post in the Lab since normally i don't really think about optimizations and alghoritms.
Here's my question:
I'm writing a program which makes use of a random table to use as a base for encrypting files.
There will be an alghoritm used to select bytes from the table by the entered password and use them to encrypt the file.
Purpose:
- without the password and the correct table, the file should be totally safe.
- with the password, but without the table the file should still be totally safe
- with the table but without the password the file should also be totally safe :bg
The table will be stored in a file which can be distributed apart from the file / password.
This also makes it harder for eavedroppers to obtain all 3 variables (file, password, tablefile).
At first i was planning on making a 00 - FF table which would be randomly sorted, but when i come to think of it, an attacker would already know in advance that in the table no byte will ever occur twice or more.
This of course would be already the first point of failure.
So i decided to make a table of 1024 bytes which have random bytes in there. This way, no one can tell which bytes are in that table, let alone on which location they are (at least that's the idea ;)).
The following code should generate the table, but i'm still not convinced it is completely safe.
So here's where you guys come in (i hope).
.686
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc
include \masm32\include\user32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib
includelib \masm32\lib\user32.lib
.data
FileBuffer db "test.txt",0
AppName db "Test",0
CreateOK db "File created successfully!",0
.data?
hFile dd ?
BytesWr dd ?
TableBuf db 1024 dup (?)
.code
start:
invoke GetTickCount
push eax
invoke GetCurrentThreadId
pop ebx
xor eax,ebx
invoke nseed,eax
lea esi,TableBuf
mov ecx,esi
add ecx,sizeof TableBuf
@@:
push ecx
invoke nrandom,256
mov byte ptr [esi],al
inc esi
pop ecx
cmp esi,ecx
jle @B
invoke CreateFile,addr FileBuffer,GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,\
FILE_ATTRIBUTE_NORMAL, NULL
.if eax!=INVALID_HANDLE_VALUE
mov hFile,eax
invoke WriteFile,hFile,addr TableBuf,sizeof TableBuf,addr BytesWr,NULL
invoke CloseHandle,hFile
invoke MessageBoxA,NULL,addr CreateOK,addr AppName,MB_OK
.endif
invoke ExitProcess,0
end start
- Would this be sufficient or should i add even more variables to create the seed? (harddisk serial, free disk space and perhaps username / computername come to mind).
- Would it be better to change the seed in the generation process several times, or should i leave the seed the same?
- Is generating byte per byte sufficient, or should i use dwords at a time?
Thanks for your time!
Let me just check that I have what you're trying to do straight:
1) there is a password and a table of numbers
2) the password is used to select bytes from the table for the key
3) the file is encrypted using some encryption algorithm
OK... now for the security. The encryption still relies on the key being unknown and the encryption algorithm being secure (so I'm asuming that you're using a standard one RC6, IDEA, or something). Therefore we must only make sure that the key is unkown.
I can't really say whether this will be true without knowing how the bytes are chosen from the table. However, here are some thoughts:
1) If Oscar/Eve (our opponent/eavesdropper) gets hold of the table, Oscar can determine the frequency of the bytes in the file... therefore, a key search can be made to search for the more probable keys first (however likely this is) and therefore speed up the key search (but hopefully this should be computationally impossible as the key space should be too large)
2) If bytes can only be chosen once from the table, or if the byte selector doesn't pick from the whole file, or if... then the keyspace may be reduced in some other way, that could lead to a computationally feasible key seach being possible.
It would help if you stated the byte selection algo.
Ossa
I understand, but i don't have the alghoritm yet :bg
I think a simple XOR would be the most effective, since that way counting bytes won't help, and since they do not know the password it is (should be) impossible to tell which byte is used to XOR which byte.
So what i want to create / implement:
- XOR one byte from the file with one byte from the table ( choosen by password specific properties (length, character, calculation between 2 characters or more, etc)
- after (or before) the XOR-ring use kind of a block alghoritm to rotate bytes in the file
- use the password as a resource to move characters in the file
Biggest Problem:
- The password will never be as long as the filesize, so i have to find a way to keep it decryptable without compromizing security.
... i don't understand, why there is a table ?
use a "pseudo" random number generator... the starting seed will be the password (a single 32 bits value), and all the following generated numbers will be the values to applie with xor... it should be safety enought for any use...
Well, that should be safe enough, but i want to be totally sure ;)
The table is just another variable you need to decrypt the file.
Think of it as the something you know + the something you have strategy. You must know the password and you must have the table.
It's just another layer of security.
If you are using XOR type things, it may well be susceptable to a (partially) known plaintext attack... for instance, if you encrypted something with a known header(e.g. EXE, ZIP), it might well be possible to derive the key from just the encrypted data (you may not need the password OR table to break this encryption). If you post the detailed algorithm, it should be possible for someone here to tell you if it is secure (but from what you've said so far, I suspect that it may well be quite weak).
Ossa
scorpion,
There is an algo in the masm32 library to stream xor a random pad against a file and while the algo does not have to be anything fancy, the level of encryption is as good as the randomness of the pad.
XorData proc lpSource:DWORD,ln:DWORD,lpKey:DWORD,lnKey:DWORD
If you want to use very high quality random pads you will get very high quality encryption for it but it works reasonably well on shorter keys as well. Note that if you store the pad in the file someone can usually find it in a hex editor or perhaps a debugger.
A one-time pad is the most secure type of encryption, but this must have an infinite length key... using shorter keys introduces quite severe problems in terms of strength - keys at around 16 bytes I could easily decrypt by hand and ones with longer keys produced from non-cryptographically-secure pseudo-random generators (assuming that I have the program that encrypted the file or know the algorithm - no key needed), I could write a program to break (quite fast - an hour or two running time at most).
Ossa
I have the following protection in mind:
- You have two 32-bit passwords (so you have the two variables you wanted). These can be transmitted when required.
- In order to obtain the final password, you have to multiply the two values. (and wipe away the high half of the result)
Of course you can choose to use even more values if you want, and multiply all of them.
Now, to encrypt a file, you do the following:
- Compress the file. (then you have high entrophy - and plain text attacks can't be performed anymore)
- Use the final password as seed for a random number generator
- Encrypt every byte in the file by xoring it with a random number.
Decrypting it means xoring the encrypted file with the random numbers, and decompressing it.
For compression/decompression you can use the zlib library. http://www.zlib.net/
Stan
stanhebben - sorry to be really quite negative in this thread but - that won't be at all strong unless you modify it significantly. You have compressed your key to 32-bits at one point in your algorithm - this means that a key search only has to try 4 billion or so combinations (sounds like a lot, but it really isn't). As you will (in your program) require something to tell you if the key is correct, I assume that you'll have a hash of the unencrypted file (or even of the compressed file, as it will stop the compression algo breaking then). This means that all I (if I'm trying to break it) would have to do is write a little program to run through all 4 billion keys, decrypt the file using them and check if the answer is correct. Depending how exactly you implement it, the running time of the algorithm will be between 10 minutes and 3 days (10 mins = 8 000 000 keys/sec; 3 days = approx 16000 keys/sec - easy to do on a modern computer).
You must never allow the key to go below 64 bits (56 bits has been cracked... although not by a normal computer). This is even before I begin looking for mathematical shortcuts to cut the time down (and from flimsy encryption like that, there will be some). If you want to make a decent encryption system, you must think like the opposition and try to break it... I used to be an admin on a site where we posted challenges to do with codes... one of the hardest challenges I ever posted used a 32-byte (thats a 256-bit key) with a deliberate flaw in the key translation algo. I only gave out an encrypted message and the assembled exe (no source) - one of our members got the solution in 1 day (they had to reverse the whole exe, find the weakness, rewrite the whole thing and do a key search over a 32-bit value to find the solution). There are people out there much much better at this type of thing than me - if you want secure encryption, use a well known block scheme such as RC6 or something, I find it great fun breaking encryption schemes, as it's a good mental challenge, but if you want a secure system, use something that has been properly peer reviewed.
Ossa
Of course you can choose larger passwords as well. You could also choose not to strip off the high bits of the multiplication.
To check if the file is decrypted correctly, a CRC could be used. The CRC is taken of the original file, so in order to check if you used the right key, you have to decrypt the file fully, decompress it and then calculate the CRC, which takes quite some time, and makes guessing the key unpractical.
And yead, instead of simply xoring, you could use RC6 as well, but I'm also wondering how secure our system would be with this encryption scheme (and with the mentioned improvements)
This is becoming a really interesting thread :U
@Stan>
- using 2 passwords and multiplying them would be the same as using 1 password since that will be the result.
- even worse would be hashing the password since then you would have a predefined password length.
- a check to see if the file has been decrypted successfully is something i definitely will not implement since it will allow an attacker to write a program that can check if the check is correct and therefore bruteforce the file.
@hutch>
- i think this would only be secure if the pad (or table) would be at least as large as the file, otherwise one could detect a pattern and use it to break the file.
@Ossa> a onetimepad is almost impossible to implement with large files in mind.
True, if you have a plaintext file of 1KB and create a (pseudo) random table with the same size and XOR every byte it would be completely secure, but what if the plaintext file is 100MB's?
Implementing an already existing alghoritm would be the easiest solution, but i want to create something unique, even if it means that i will have to spend weeks or months optimizing it.
Of course i will post the alghoritm once it is there and ask people to crack it.
I'm still very much thinking about the alghoritm, and for almost every way i can think of, i can think of a way to crack it.
So this will be an huge task, but i'm sure it is possible.
Quote from: White Scorpion on May 28, 2006, 12:35:25 PM
- a check to see if the file has been decrypted successfully is something i definitely will not implement since it will allow an attacker to write a program that can check if the check is correct and therefore bruteforce the file.
Two problems with this:
1) It makes usability worse. Suppose I accidently type in the wrong password, it will accept it and I will have to check and... just messy. A good hash (CRC is a bad bad choice for a cryptographic program) such as SHA-1 or even a weaker hash like MD5 will not give away info directly and your encryption should make it impossible to ever decrypt the file correctly without the password.
2) There are other test that can provide the same info. Generally the "bad guys" will know what sort of file they should end up with - text, exe, whatever - each of these has identifying traits - text should end up with valid ascii characters only (or mostly), exes have the MZ and PE headers with known bytes for checking purposes.
I really do reccommend using a hash of the original file to check that it is OK.
Quote from: White Scorpion on May 28, 2006, 12:35:25 PM
@Ossa> a onetimepad is almost impossible to implement with large files in mind.
True, if you have a plaintext file of 1KB and create a (pseudo) random table with the same size and XOR every byte it would be completely secure, but what if the plaintext file is 100MB's?
OK, maybe I wasn't being clear back there... suppose you use the nrandom function to do your encryption, you set the seed at the start with a
32-bit value and then use the random numbers generated to XOR the bytes. I'd just run through all 4 billion possible seeds and use one of the checks that I stated above to check if I'd got the right one... hey presto, I have a decrypted file (after perhaps an hour in this case... it's unlikely I would have needed to decrypt the whole file for each key to check if it decrypted OK).
My point? using a random number generator that uses a seed less than 64 bits in size destroys your scheme's security.
Quote from: White Scorpion on May 28, 2006, 12:35:25 PM
Implementing an already existing alghoritm would be the easiest solution, but i want to create something unique, even if it means that i will have to spend weeks or months optimizing it.
Nice to see someone using their creativity rather than being what one of my friends calls a "code monkey" and just writing code for someone else's algorithm.
Ossa
You could also encrypt the file twice using different passwords. Even if someone lucked out and got the right password on the first round, they would never know they had. :-)
The problem with pads is their reuse on other data. The longer the pad is to reduce or remove any looping of the pad, the stronger the encryption is but if you reuse the pad on other data it gets progressively weaker. The other historical weakness of pads is occasionally you get a bit of plain text through it.
In the context of a password for an app a single pad for a single password has no known weakness if the pad is truly random, its something like this,
password
xorred to
xxxxxxxx
Where each "x" is a random character in the range of 0 to 255.
Quote from: White Scorpion on May 28, 2006, 12:35:25 PM
- using 2 passwords and multiplying them would be the same as using 1 password since that will be the result.
Your goal was to make sure that you need two passwords in order to decrypt the file. Just make sure you never send the final password. Then if somebody can obtain one of the passwords, he doesn't know the final password.
Quote from: Ossa on May 28, 2006, 12:57:27 PM
2) There are other test that can provide the same info. Generally the "bad guys" will know what sort of file they should end up with - text, exe, whatever - each of these has identifying traits - text should end up with valid ascii characters only (or mostly), exes have the MZ and PE headers with known bytes for checking purposes.
That's why I recommend compressing the file. Then you don't know it's contents, or at least it's much harder to guess.
And 32- bit seeds are indeed too short. You should have a random number generator which accepts seeds of any size.
QuoteThat's why I recommend compressing the file. Then you don't know it's contents, or at least it's much harder to guess.
Compressing might be useful to implement, but nevertheless, an attacker does usually know what kind of file he is searching for.
Think of filename, location etc etc
This of course would not be the case if it would have been an uploaded file abc.abc which would have no reference on what it should contain.
Unfortunately most files used are Office documents (.xls, .doc, .ppt, .pst) ,executables and textfiles.
If you think of it like this it's a pretty good guess to try one of the six for a plaintext attack.
Ok, to make a long story short, this is what i need (at least):
- compress the file
- create a random generator which can use seeds of every length, or at least much longer then 4 bytes since otherwise i would have 4,294,967,296 possible seeds which of course isn't that much when talking in encryption ways.
- use a checksum to check if the file is decrypted successfully. (i'm still not convinced about this).
for the alghoritm:
- use all possible ways of linking every byte in the password so that even if one bit would change the password wouldn't work anymore. (no collisions).
examples could be:
A+B / 2 = ?
A*A+B*B = C*C --> use C (pythagoras)
A-B=?
A+B=?
A%B=?
and a whole lot (more complicated) ones.
every result could be used to pick a byte from the table and encrypt a byte from the plaintext
- take a predefined block size and start rotating and swapping bytes in the plaintext. move up one byte and do it again, etc, etc
Sorry, you lost me there... what are A, B and C? and how are they used to pick the random byte?
Ossa
sorry..
A,B,C are characters from the password, location doesn't really matter much.
How they pick... That's a good one, still thinking about that.
If i would say A*B/2= 0x45 (for example) and use the 0x45'th place in the table, i would only have a select range of numbers to pick from, since people almost always use the standard keyboard characters as a password.
So i must implement this differently, but how is still the question.
Maybe you could also be interested in rsa encryption.
I don't know if you are familiar with it, but you can use it as follows:
- You have a private key, consisting of two numbers
- Multiplying the two numbers gives you the public key. You can transmit that one. The receiver can then encrypt the file.
- The file can be transmitted
- To decrypt the file, you need the private key. (the public key won't work)
Just an idea I wanted to throw in. I can explain how it operates, if you are interested.
Stan, lets just throw in Diffie-Hellman key exchange and make a full protocol while we're at it... :bg
I think what White Scorpion is after (and correct me if I'm wrong) is to create his own encryption algo (not something that I think is the most sensible way to go, but I have to admire his determination).
Ossa
(RSA is also slow... much better to use a good block cipher for speed).
I don't know exactly what White Scorpion wants, but I wanted to throw it in anyway, as it could be useful.
But I'm also glad to help inventing a new algorithm, if that's what white scorpion is after.
public key encryption is not meant for files, because of it's speed like Ossa said, but it should be very safe.
Yeah, i do want to create my own algo, even though i know it will be a painful task.
I've already read several books on encryption methods, and i still have a few to go, but yes, i'm determined to get this to work.
Unfortunately i don't know most of the math symbols used, so that makes it harder for me to understand the formulas used.
I'm better in understanding the formula once it is written in assembly.
IMO not understanding most of the symbols used in the formulas is terrible, so i'm currently upgrading my math knowledge as well.
Like i said, it will take some time, but i'm positive it is worth it when it's finished.
white_scorpion
What about mixing two vastly different types of encryption? Use IDEA, twofish, blowfish or something similar for the first encryption. For the second one use a book encryption or vignette. So that the two different encryption schemes are really different.
It's a very good idea, i will definitely keep it in mind.
It will then become a combination of my own algo with an already existing one.