How can I understand the C program logic

Binary Reversing 101, or How Crackers Work - Part 1, The Basics

Crackers are people who try to bypass a program's licensing so that it can be used without a valid license. Either the license key validation is switched off completely, or a tool is written that generates valid license keys at the push of a button.

What is reverse engineering and what techniques are there?

Reverse engineering starts from a finished program and tries to understand the logic of the application without access to the source code.
For example, a program may ask for a password at startup before any further program logic is executed. This means that some part of the program must be responsible for checking the password. This check can be very simple, such as comparing the input with a hard-coded value. A more complicated check could add up the character values of the letters and verify that the sum is divisible by 7 and that the input begins with the string "PWD".
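
The two checks described above can be sketched in C as follows. This is only an illustration: the function names and the hard-coded value "secret" are made up for this example and do not come from any real program.

```c
#include <stdbool.h>
#include <string.h>

/* Simple variant: compare the input with a hard-coded value.
 * "secret" is a placeholder chosen for this sketch. */
bool check_simple(const char *input)
{
    return strcmp(input, "secret") == 0;
}

/* More involved variant: the input must begin with "PWD" and the
 * sum of all its character values must be divisible by 7. */
bool check_complex(const char *input)
{
    if (strncmp(input, "PWD", 3) != 0)
        return false;

    unsigned int sum = 0;
    for (const unsigned char *p = (const unsigned char *)input; *p != '\0'; ++p)
        sum += *p;

    return sum % 7 == 0;
}
```

With ASCII encoding, the input "PWD4" passes the second check (80 + 87 + 68 + 52 = 287 = 7 * 41), while "PWD" alone does not (235 is not divisible by 7). A reverse engineer sees only the compiled form of such a function and has to recover these rules from it.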

There are basically two types of techniques for analyzing a program: static and dynamic analysis.

Static analysis

Static analysis tries to understand the program logic without running the program. Techniques such as "disassembly" or "decompilation" are used.
"Disassembly" means converting the program's machine code (a sequence of bytes) into a "low-level" language. Assembly is such a low-level language; it consists of plain CPU instructions. An example of an instruction is storing a particular value in a particular CPU register.
Low-level languages can be read and understood by humans, but unlike high-level languages (C, Python or Java) they offer no abstractions and are platform-specific. An assembly program written for Intel CPUs cannot be executed on an ARM CPU.

During "decompilation", an attempt is made to recover source code from the machine code. This process is much more complicated than disassembly, especially for languages compiled to native code such as C or C++. During compilation, most metadata (e.g. variable names) is removed, and even simple high-level statements are translated into multiple CPU instructions.
In contrast, "decompilation" is easier for languages that are compiled to bytecode, such as Java.

The following simple C function calculates the square of a number:
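
A minimal version consistent with the description that follows might look like this. The function name "square" is an assumption; "num" is the parameter name the text refers to. The comment shows the kind of x86-64 assembly (Intel syntax) that GCC typically emits for it without optimization. The exact output depends on the compiler, its version, and its flags.

```c
/* Returns the square of num. */
int square(int num)
{
    return num * num;
}

/*
 * Typical unoptimized x86-64 output (Intel syntax):
 *
 * square:
 *         push    rbp
 *         mov     rbp, rsp
 *         mov     DWORD PTR [rbp-4], edi
 *         mov     eax, DWORD PTR [rbp-4]
 *         imul    eax, eax
 *         pop     rbp
 *         ret
 */
```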

The function is compiled into 7 CPU instructions. (The first line is a label that names the function; it is not an instruction and is not executed by the CPU.)

In the assembly code, the essentials happen from line 4:

  • "MOV eax, DWORD PTR [rbp-4]" loads the "num" parameter into a CPU register (named EAX).
  • EAX is multiplied by itself. According to Intel documentation, the result of the "IMUL" instruction is again stored in EAX.
  • Finally, the function is exited with the "RET" instruction. The caller finds the result in EAX, where the calling convention expects integer return values.

(The lines not mentioned are inserted automatically by the compiler. These lines correspond to the rules defined by calling conventions.)

As the example shows, the assembly code is clear and easy to map back to the original code when the source is at hand. However, it is much more difficult to infer the original logic from assembly code alone (without source code), especially when more complex operations are carried out.

Dynamic analysis

Dynamic analysis tries to collect as much information as possible while the program is running. The cracker analyzes the behavior of the program:

  • What data is loaded into memory?
  • What network traffic is generated?
  • Which files are created or read?
  • Are environment variables set or read?
  • Which libraries are used?
  • and many more…
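
To make two of these questions concrete, the following sketch shows a program fragment that reads an environment variable and then tries to open a file. Everything in it is hypothetical: the variable name APP_LICENSE_FILE, the default file name, and the function names are invented for this illustration. On Linux, tools such as strace (system calls) or ltrace (library calls) would typically reveal both the lookup and the attempt to open the file, even without source code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical default path; not taken from any real product. */
#define DEFAULT_LICENSE_PATH "license.key"

/* Choose the license path: the environment value wins if it is
 * set and non-empty, otherwise the default is used. */
const char *resolve_license_path(const char *env_value)
{
    if (env_value != NULL && env_value[0] != '\0')
        return env_value;
    return DEFAULT_LICENSE_PATH;
}

/* Returns 1 if the license file could be opened, 0 otherwise.
 * An observer tracing the process would see the getenv() lookup
 * and the attempt to open the file. */
int license_file_present(void)
{
    const char *path = resolve_license_path(getenv("APP_LICENSE_FILE"));
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}
```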

Debuggers are often used for dynamic analysis. Debuggers are specialized programs that can be used to define "breakpoints". When a breakpoint is reached, execution is paused. You can then inspect the program's memory or run the program step by step. With "single stepping", execution stops again after each instruction until the program either terminates or this mode is exited.

Debuggers are usually used during software development. But debuggers can also be used on finished programs - unless special protection mechanisms or flags have been set. However, experienced attackers can usually bypass these protective mechanisms as well. It just takes longer to find the right spots.
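
One commonly seen protection of this kind on Linux is a program checking whether it is currently being traced. The sketch below reads the "TracerPid" field from /proc/self/status, which is 0 when no tracer is attached. It is an assumption-laden illustration, not a robust defense: the function names are invented here, the approach is Linux-specific, and, as noted above, an attacker who finds the check can patch it out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Extract the TracerPid value from the text of /proc/<pid>/status.
 * Returns -1 if the field is not present. */
long parse_tracer_pid(const char *status_text)
{
    const char *field = strstr(status_text, "TracerPid:");
    if (field == NULL)
        return -1;
    /* strtol skips the whitespace between the colon and the number. */
    return strtol(field + strlen("TracerPid:"), NULL, 10);
}

/* Returns 1 if a tracer (e.g. a debugger) is attached, 0 if not,
 * and -1 if this cannot be determined. Linux-specific. */
int debugger_attached(void)
{
    char buf[4096];
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    long pid = parse_tracer_pid(buf);
    if (pid < 0)
        return -1;
    return pid != 0;
}
```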