Hacking the Crypto.c Algorithm
by Jitsu-Disk
Object
------
This paper presents (briefly) how the Novell one-way hash function operates,
and how the speed of routines may be increased.
Contents : Credits
Foreword
The one-way hash
About optimization
Useless and Over-used routines
Recursion vs. Iteration
*Added : cipher/decipher algorithm of Remote NLM
Credits
-------
Credits go to : Simple Nomad, for being Simply Elite
Itsme, for putting together the first Crypto routine
Rich, for teaching me Novell in the first place!
TheRuiner ('dreamer'), for remote NLM cipher alorithm
Foreword
--------
Sometimes I wonder why computers are drawing so many people to them,
people that often have very different jobs and inclinations when it comes to
leisure or intellectual orientation. Moreover, many of the people who teach
computer-related stuff or simply design them, come from mathematical fields
and believe that you just can't understand a thing if you can't solve an
integral. To me this is all bullshit. Computers are built by dreamers and are
incredible tools for expressing your artistic feelings;
Computer science is like a sphere : you can start by exploring any part
of that sphere, and by going deeper you will very soon reach the
core-algorithms. You can then bounce around to any other facet of that sphere,
your learning is not limited by any academic idea of how one should learn and what
vocabulary should be used in this process. By comparison, math and physics
are like a pyramidal : you must learn (painfully) each layer before moving on
to the next one.
Regarding the encryption scheme used by Novell, I want to say that I'm not
in any way good at CryptoAnalysis and that the Crypto.c routine is being
approached from an algorithmic standpoint only.
Now lets get dirty...
The one-way hash
----------------
All information provided below is assumed by regarding at the resulting
crypto.c and only this.
A one-way-hash function is a formula where you input a cleartext word or
phrase and the output is gibberish with two interesting properties : There is
no way to reverse the process (inputting the hash and obtaining the cleartext),
and the odds that two different cleartexts will give the same hash is almost
nonexistent. Therefore a hash is a safe way to store a password, the computer
processes the password when entered using the function and compares it to
the stored hash.
The one-way-hash scheme we're interested in, here, processes the cleartext
password in four steps.
0) On a NetWare server you can enter passwords up to 128 characters in
length, however the process used to generate the password-hash takes up
to 32 characters in input and gives 16 bytes as the output. Therefore,
if a password is longer than 32 characters, it is divided in chunks of
32 and chunk[0] is XORed against chunk[1] the result is XORed against
chunk[2] and so on until we've reached the end of the password. This
process IS NOT implemented nor further discussed here. The reason is
simple : no one holds enough computer power to brute force such long
passwords (given the algorithm below).
1) The password string is repeated enough times to feed a 32-byte length
string (crypt2data[0->31]) and between each sequence of the password in
crypt2data[] a byte is inserted from a fixed table called crypTab[].
2) Each byte in buf[32] is XORed with the unique IDentification number
attached to the user who owns that password, giving crypt2data[32].
note : At end of step_2 the ciphered crypt2data[] is still reversible to
give the cleartext by XORing it again with the ID
3) A special non-linear function lengthens crypt2data[32] to crypt2data[64]
4) Finally the Password hash is 16 byte length, assuming Z[] is crypt2data[],
nybTab a fixed table of nibbles, we have :
hash[0]= nybTab[Z[64]] OR nybTab[Z[65]<<4]
hash[1]= nybTab[Z[66]] OR nybTab[Z[67]<<4]
.
.
.
hash[15]= nybTab[Z[94]] OR nybTab[Z[95]<<4]
note : Z[65]->Z[95] are processed using the same formula
as in step_3
In a minute we're gonna go deeper into those four steps, but 2 very ESSENTIAL
points must be stated :
* The hash is a 16byte word where each byte is built upon a crypt2data[]
pair because the nybTab is made of bytes with all the 4left-bits set to 0:
nybTab = {0x7, 0x8, 0x0, 0x8, 0x6 ...}
nybTab[high] nybTab[low]
high=Z[64+2*i] low=Z[64+2*i+1]
hash[i]= |____4 left bits_______||____4 right bits______|
ex : if i=0 and Z[64]=0 and Z[65]=4 then hash[i]=0x76
* If the non-linear function is reversed, the "nybTab" would render this,
break-through almost useless: if you study its contents you'll see that a
set of 16 bytes are repeated various times (about 9 to 20) in a 265 byte
length table, so that if you tried to reverse the hash starting from this
table, you obviously had to perform the test at most 32power16 (rough
average) Therefore, if the user used a password longer than 9-11
characters (rough average) you'd be better off trying to reverse the hash
(if you could)... Besides, if the inventors of the Algorithm wanted to
introduce some "BackDoor" (statistical trend known only to themselves) the
nybTab would be a very good spot...
(Note : unless you've got the computing power of the NSA, trying to brute
force or reverse a password longer than 8 is useless).
Step_1) There isn't much more to say about it, just to note that the purpose
of this lengthening thing is to provide enough "space" for the formula of
step_3 to play with. Besides the introduction of bytes from a fixed table
throws in a little more entropy, and finally the longer a password is the
less time it is repeated in crypt2data.
Step_2) You may by asking yourself "What if some people with computers
holding enough RAM (or organizing their data in B-Tree) generated all
the hashes possible so that they would only have to compute them once".
Step_2 is the reason why this can't happen : XORing with the ID
generates a unique hash to this user since ID's are unique. You have to bear
in mind that ID's are fixed-length (4Bytes) but passwords have variable
lengths therefore, if you analyze crypto.c you will notice a little formula
to match the XOR between the correct bytes of crypt2data[] and the ID.
ex : Say we initialized the brute force with a 5-character length password,
AAAAA. On the first turn is AAAAA processed to feed the 32 first bytes
of crypt2data (Z), on the second turn we try AAAAB, therefore we only
need to process the last byte of the password "B" in every occurrence of
Z :
{ Byte introduced form crypTab (cTb[5], cTb[11]...)
{ AAAA XORed on turn_1 and also XORed on turn_1
{ ___|___ |
_1_{ |_|_|_|_|B|x|_|_|_|_|B|x|_|_|_|_|B|x|_|_|_|_|B|x|_|_|_|_|B|x|_|_|
BYT{ 0 1.....4...........10..........16..........22..........28....31
{
{ |_ _ _ _|_ _ _ _|_ _ _ _|_ _ _ _|_ _ _ _|_ _ _ _|_ _ _ _|_ _ _ _|
BYT{4ID_Bytes 0 2 0 2 0
{ | |
_2_{ --> Z[4]="B"XorID[0]................................Z[28]="B"XorID[0]
Step_3) Here comes the fun, the non-linear formula is used to generate a non-
reversible hash. This formula must have a 'good' property : it must be
strong against CryptoAnalysis (i.e. you can't use some kind of statistical
trend against the hash to sufficiently narrow the search upon a limited
key-space). From the analysis of encryptp here's what comes out :
We have two series U and Z, where U[]=Crypt_2_c_val[] (64 byte lengh)
and Z[]=Crypt2data[] (96 byte lengh)
We have one table of fix data D[]=CrypTab[] (32 byte lengh)
U is defined by U[i]=U[i-1]+Z[i+32]
Z is defined by { let i'=i&0x1f
{ let k =(U[i-33]+i')&0x1f
{
{ if k*31] in step_2
So this boils down to (understanding i&0x1f <=> i modulo 32) :
{ let i be the current iteration, i=0..63
{
{ { Z[i+32]=( Z(U[i]&0x1f+i) - D[i&0x1f] ) XOR ( Z[i] + U[i] )
{ {
{ { U[i+1]=U[i]+Z[i+32]
{
{ with U[0] = Z[32] = (Z[0]-D[0]) XOR Z[0]
{
{
{ Note : We don't need the "memory" provided by the use of a table for the
{ series U, therefore we could replace U[] by a variable U, and
{ U[i+1]=U[i]+Z[i+32] by U += Z[i+32]
A few things have to be said here, first we can see that Z[] depends on U[],
and U[] depends on Z[]. Therefore it is not possible to come up with a formula
that could give us U[i] (and therefore Z[i]) without calculating U[i-1] first.
This idea is reinforced by the use of a XOR in the formula, since the XOR
operation has no "good" arithmetical properties allowing any simplifications.
Second, Z[] is not linear and therefore U[] isn't either, and from some simple
tests performed, it doesn’t look like i' or k can be easily guessed. Finally,
the formula relies on a fixed table (CrypTab) to throw in a little more
entropy, so two passwords right next to each other (say AAAA and AAAB) give
a very different hash. It is to be said about this last point, that if the
inventors of this algorithm wanted to introduce some kind of "BackDoor",
the fixed table would certainly be another good spot, this would result in a
thing called "collision" where two different passwords give the same hash.
** Added note : well, a collision exists in this algorithm !!! This was proven
and an exploit coded by P.Semjanov. You can get the source code to this in the
Novell section of : http://194.85.96.197/psw/crack.html. Yet they are better
means to defeat the cryptic mechanism : we can use the hash directly, see the
"PanMount" for Linux on Pandora's download page and details in the "Novell Cries
Pandora" paper.
Step_4) The aim of this last step is to shrink the 4Bits pairs of nybTab[Z[]]
to give 8Bits words (so it ain't ugly like Unicode ASCII) and to make
the word "one-way" meaningful...
Note : It is obvious that there is no need to compare each Byte of the
hash each turn, in fact, comparing the low 4-Bits of hash[0] with Z[64]
proves to be a good test (i.e. rejected) 90% of the time.
About Optimization
------------------
Although the aim of optimization is obvious : accelerating the execution of
a program, the path to it is not always so obvious.
First, let’s consider a few facts : in your computer the "slowest" device is
your hard-drive, next (not considering the various add-on cards) comes your
RAM, then Cache Memory and finally the processor. Therefore minimizing disk
access is a top priority (the number of attempts the brute-force performs
before saving things to disk was HIGHLY increased) (could someone spread the
word to Micro$oft ?), and under certain conditions it is sometimes preferable
to have the processor do multiple times the same simple operation than
storing the result in a variable and calling the memory stack every time its
needed.
Second, we will assume you're using an Intel Pentium processor, or a RISC
processor. One of the interesting facts about these is that almost every
assembly instruction uses 1 computer cycle to execute (i.e. if you have a 100Mhz
processor, you’re computing at most 100 million instructions per seconds (not
considering pipeline)). So the question when you're programming in something
else than assembly isn't much "In what assembly instruction is the compiler
gonna transform this ?", but rather "How many assembly instructions is the
compiler gonna come up with for this ?". Therefore, if you want your program
to run fast, 1) Use a low-level programming language (i.e. close to the actual
assembly code, like the C language) 2) Always turn the compiler optimize
options ON for better results.
Finally, maybe the most important of all is how your program is structured.
I always try to keep two questions in mind, where the first one prevails
on the second (because the answers might go in opposite directions):
1) "Does the program perform only the minimum needed to reach our goal"
2) "Is this the simplest (or easily read) code possible"
One good example of how these two questions can have opposite answers is
found in the XorwithID() function. The function itself is pretty clear but
the trouble is that it performs the XOR thing on all the 32 first Bytes
of Crypt2Data each turn, whereas only a few bytes actually changes each
turn (see the diagram of step_2). So the answer to 1) is "No" because you'd
save time only by changing the bytes that need it, but the answer to 2) is
"Yes" because obviously the XorwithID function doesn't go into the
complications of calculating the position of the bytes that really need to
change.
Another example that answer 1) is the way that the testing of the one-way
hash was approached : previously all bytes were processed and tested against
the original hash; in the modified version each brute-forced byte is
processed independently and you only move to the next if the current one
matches its original hash counter part. This went even further because now
the first byte of the hash is first tested for its 4-last bits (see step_4
for details).
One last thing : during the optimization process, it is wise to start with
optimizing the "deepest" functions, that is to say the ones that are
repeated the most frequently and to end up with the ones that are processed
scarcely.
Useless and Over-used routines
------------------------------
Sometimes during the development process of a program, you come up with
plenty of small routines that you'd thought be more important than they
actually are. These sub-routines should be included directly in the ones
that call them as long as the clarity of the code is preserved and this
doesn't cause the code to be too redundant. Redundancy is ugly, yes,
but it gives faster code because calling a sub-routine is always time
consuming, whatever the size of the sub-routine, and this increases with the
number of local variable in it, and worsen if it’s a very "deep" sub-routine.
Moreover, these sub-routines prevent from easily seeing two things : two
or more sub-routines could be merged and a sub-routine might even prove
useless.
For instance Shrinkbuf() and Crypt2() were merged within the "for" loop of
encryptp(), preparepw() and XorwithID() where modified and moved to the main()
(so that they were on a less "inner" level) and finally fowardcrypt(),
InitRevcalc() and revcrypt2() were simply removed.
Recursion vs. Iteration
----------------------
Recursion happens when a function is defined by itself. Ex :
function Get_Drunk(where_to_go)
{If Alcohol_Test is positive return home;
Drink(Same_as(the_guy_next_to_you));
If (the_guy_next_to_you=drunk)
then Get_Drunk(another_bar)
else Get_Drunk(same_place)
}
A recursive function usually needs a "seed" to start with (Get_Drunk(bar)),
but depending on your goal you will be looking for either a solution to a
problem (forward recursion, i.e. : how can I get drunk) or the problem which a
specific solution solves (reverse recursion, i.e. : what steps it took me to
get drunk before I got home).
If either your problem or your solution is unknown to you, and you came up
with a recursive algorithm to solve it, it may just be that there just isn't
any other way to do it. If there is another way that means using an Iterative
process where you can clearly define a beginning and an end to it.
One of the particularities of a recursive algorithm, is that usually you can't
say WHEN it will stop and worse IF it will stop. Moreover, recursive means
a function call each turn, and there's nothing more CPU consuming than a
function call.
The good news is that if you know the recursive algorithm always comes up
with a solution and in a determinable or even better fixed number of turns,
then you're assured there is an Iterative equivalent to your Recursion.
Some will argue that you can always turn a recursive function in an iterative
loop by stack saving the variables each step in the loop, but this doesn’t
give good optimization.
Regarding the crypto.c, we had a two-level recursive call between calc_c and
calcElement, but some simples tests proved that they always gave their
solution in a fixed number of turns. Therefore the first thing was to track
down the "seed", by looking at how the algorithm gave U[0] (=Z[32]), and then
crack down one after the other the two functions to their Iterative
equivalent.
Cipher/Decipher Algorithm of Remote NLM, by TheRuiner ('dreamer')
-----------------------------------------------------------------
'dreamer' cryptographically cracked Novell's Remote.NLM password
encryption algorithm. It is a very weak algorithm compared
to what Novell has implemented in NDS, as it is instantaneously
decryptable. RConsole password encryption is different from
Remote.NLM password encryption because:
1) Encrypted RConsole passwords are sent across the wire, via
RConsole. Remote.NLM's encrypted passwords are generated at
the server console by typing REMOTE ENCRYPT MyPass, and they
are optionally stored in SYS:System\LDRemote.NCF.
2) They use a different password encryption algorithm. RConsole
passwords are encrypted with information from the workstation
that is currently running RConsole. Remote.NLM passwords are
encrypted with a time byte, one of 255 constants in a hash
table, appended characters, some XORing, and bit-order
separation.
3) Encrypted RConsole passwords are locally obtained with a packet
sniffer, but Remote.NLM passwords are remotely accessible to
anyone with the ability to view SYS:System\LDRemote.NCF.
The Remote.NLM passwords are decrypted using only five steps. To
encrypt, simply reverse the steps. The password will look
something like this:
AF8CBBF48CA9955F5ADAFDADAA23
The structure of the password is as follows:
AF8CBBF48CA99 55F5ADAFDADAA - 23
The first section contains the low-order bits, and the second, the
high-order bits. 23 is the time byte, which is decremented by the
server once every two seconds, from FF to 02, then back up to FF,
etc.
Step 1) Realign the low-order bits and high-order bits
=======================================================
This is extremely simple to do. The high-order bits are in order
from the first character to the last, and so are the low-order
bits. Example:
Password: AF8CBBF48CA99 - 55F5ADAFDADAA,
Output: 5A 5F F8 5C AB DB AF F4 D8 AC DA A9 A9
At this point, ignore 5A 5F F8 5C, or the first four bytes. They
are appended somewhere during encryption, and decrypt to "%*@$".
It was a TERRIBLE idea for Novell to implement those four
characters into every single password, as those are what helps
rebuild their hash table from scratch. Also, if the length of the
password is 10, the password is automatically decryptable to nul.
Step 2)
=======
Match each of the password characters (group of two hex
characters) to the hash table below. Use their position from the
beginning of the table to determine the value of the pre-hash
encrypted password. Example: F4, the 8th character of the
password, matches the hash table at 95. This means that 95 is the
pre-hash value of F4. Thus far, (ignoring the first four
characters) the password was:
AB DB AF F4 D8 AC DA A9 A9
and now the password is:
98 A0 9B 95 A1 9D A6 9C 9C
Remote.NLM Hash Table
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00 5B 58 5E 5F 59 5C 5A 5D-73 70 76 77 71 74 72 75
10 13 10 16 17 11 14 12 15-7B 78 7E 7F 79 7C 7A 7D
20 53 50 56 57 51 54 52 55-03 00 06 07 01 04 02 05
30 1B 18 1E 1F 19 1C 1A 1D-0B 08 0E 0F 09 0C 0A 0D
40 2B 28 2E 2F 29 2C 2A 2D-63 60 66 67 61 64 62 65
50 83 80 86 87 81 84 82 85-3B 38 3E 3F 39 3C 3A 3D
60 8B 88 8E 8F 89 8C 8A 8D-33 30 36 37 31 34 32 35
70 93 90 96 97 91 94 92 95-6B 68 6E 6F 69 6C 6A 6D
80 9B 98 9E 9F 99 9C 9A 9D-A3 A0 A6 A7 A1 A4 A2 A5
90 F3 F0 F6 F7 F1 F4 F2 F5-AB A8 AE AF A9 AC AA AD
A0 DB D8 DE DF D9 DC DA DD-FB F8 FE FF F9 FC FA FD
B0 23 20 26 27 21 24 22 25-B3 B0 B6 B7 B1 B4 B2 B5
C0 CB C8 CE CF C9 CC CA CD-BB B8 BE BF B9 BC BA BD
D0 C3 C0 C6 C7 C1 C4 C2 C5-D3 D0 D6 D7 D1 D4 D2 D5
E0 43 40 46 47 41 44 42 45-E3 E0 E6 E7 E1 E4 E2 E5
F0 4B 48 4E 4F 49 4C 4A 4D-EB E8 EE EF E9 EC EA ED
Step 3)
=======
Subtract the length (the number of groups of hex characters,
excluding the time character) of the full password from each
encrypted password character. Now you have the ACTUAL pre-hash
encrypted password. If the subtracted value is negative then
simply continue from FF down to the negative value. Example: if
the password character is at 04, and the length is 6, the value of
the password character will be FF. The length is 13 (D in hex),
so the password was:
98 A0 9B 95 A1 9D A6 9C 9C
and is now:
8B 93 8E 88 94 90 99 8F 8F
Step 4)
=======
Get the time var, in this situation 23 (hex), and subtract it
from FF. This new character is for use in Step 5. Example:
FF-23=DC.
Step 5)
=======
Finally, XOR each character (group of 2 hex characters) of the
encrypted password with the new time character, and you now have
the decrypted password! The password was:
8B 93 8E 88 94 90 99 8F 8F (before the XOR)
Now, the decrypted password is:
57 4F 52 54 48 4C 45 53 53
"WORTHLESS"
The code to this algorithm was added to the Pandora API in Pandora rel.4
______________________________________________________________________________
I hope all the crap pulled here proves some kind of interest (and criticism).
The result of all this is a speed increase of about 76 times compared to
the first release of the project. Have Fun. Jitsu-Disk (jitsu@nmrc.org).
*