x86_64 Assembler + C = One Love

In this article I will describe the process of calling C functions from assembler.
Let’s try to call printf ( “Hello World \ n!”); and exit (0);

section .rodata
    message: db "Hello, world!", 10, 0

section .text
    extern printf
    extern exit
    global main

    xor	rax, rax
    mov	rdi, message    
    call printf
    xor rdi, rdi
    call exit

Everything is much simpler than it seems, in the section .rodata we describe the static data, in this case the string “Hello, world!”, 10 it is a newline character, and will not forget it annihilate the.

The section of code declare outside of the printf function, exit libraries, stdio, stdlib, also declare main entry function:

section .text
    extern printf
    extern exit
    global main

In the case of the return function rax pass 0, can be used mov rax, 0; but to accelerate the use xor rax, rax; Further, in the first argument is a pointer to a string:

rdi, message

Next call external C functions printf:

    xor	rax, rax
    mov	rdi, message    
    call printf
    xor rdi, rdi
    call exit

By analogy, transfer case 0 in the first argument and calling exit:

    xor rdi, rdi
    call exit

As the Elves say:
Who does not listen
He eats plov @Alexander Pelevin



Source Code


Hello World x86_64 Assembly

In this article I will describe the IDE configuration process, writing the first Hello World assembler x86_64 for Ubuntu Linux operating system.
Let’s start with IDE SASM plant assembler nasm:

sudo apt install sasm nasm

Next, invoke SASM and write Hello World:

global main

section .text

    mov rbp, rsp      ; for correct debugging
    mov rax, 1        ; write(
    mov rdi, 1        ;   STDOUT_FILENO,
    mov rsi, msg      ;   "Hello, world!\n",
    mov rdx, msglen   ;   sizeof("Hello, world!\n")
    syscall           ; );

    mov rax, 60       ; exit(
    mov rdi, 0        ;   EXIT_SUCCESS
    syscall           ; );

section .rodata
    msg: db "Hello, world!"
    msglen: equ $-msg

Hello World code is taken from the blog James Fisher, Adapted for assembling and debugging SASM. In SASM documentation states that the entry point must be a function named main, otherwise debug and compile code is incorrect.
What we did in this code? Made the call syscall – an appeal to the Linux operating system kernel with the correct arguments in registers, a pointer to a string in the data section.

Zoom Enhance

Consider the code details:

global main

global – assembler directive allows you to set global symbols with string names. A good analogy – interface header files C / C ++ languages. In this case, we ask the main character for the input function.

section .text

section – assembler directive allows define sections (segments) of code. Section directive or a segment equal. The .text section is placed code.


Announces the beginning of the main function. The assembler function called subroutines (subroutine)

mov rbp, rsp

The first machine instruction mov – puts the value of the argument 1 to argument 2. In this case, we transfer the register value in rbp rsp. Of comments you can understand that this line added SASM to simplify debugging. Apparently that is a personal affair between SASM and debugger gdb.

Next, look at the code to .rodata data segment, two call syscall, first outputs Hello World string exits from the second application with the correct code 0.

Let us imagine that the registers are variables with names rax, rdi, rsi, rdx, r10, r8, r9. By analogy with the high-level language, turn from vertical to horizontal view of the assembly, then the call syscall will look like this:

syscall(rax, rdi, rsi, rdx, r10, r8, r9)

Then the call to print text:

syscall(1, 1, msg, msglen)

Calling the exit with the correct code 0:

syscall(60, 0)

Consider the arguments in more detail in the header asm/unistd_64.h file find function __NR_write – 1, then look in the documentation for the arguments write:
ssize_t write (int fd, const void * buf, size_t count);

The first argument – the file descriptor, the second – the buffer with the data, the third – the counter bytes to write to a file handle. We are looking for the number of file descriptor for standard output, in the manual on stdout find the code 1. Then the case for small, to pass a pointer to the Hello World string buffer from the data section .rodata – msg, byte count – msglen, transfer registers rax, rdi, rsi, rdx correct has argument and call syscall.

Designation constant length lines and is described in manual nasm:

message db 'hello, world'
msglen equ $-message

Simple enough right?



Source Code


Abilities in Space Jaguar Action RPG

The first article about the game in the development of Space Jaguar Action RPG. In this article I will describe gempleynuyu feature Jaguar – Specifications.

Many RPG system using static characteristics of the character, such as the characteristics of the DnD (Strength, Physique, Agility, Intelligence, Wisdom, Charm) or Fallout – S.P.E.C.I.A.L (Strength, Perception, Endurance, Cha, Intelligence, Agility, Luck).

The Space Jaguar I plan to implement dynamic characteristics of the system, such as Jag main hero of the game at the start of a total of three characteristics – Possession blade (polusablya), shadow transactions (transactions in the criminal world), picaresque capacity (breaking of locks, theft). During the game, characters will be endowed and deprived of the dynamic characteristics within the gaming unit, all checks will be made on the basis of the level of specific characteristics required for a given game situation. For example Jag not be able to win a game of chess, if no response has a chess game or not has sufficient for screening.

To simplify the logic checks, each characteristic is given by 6 digit code for English letters, name, description. Such as possession of Blade:

var bladeFightingAbility = new Object(); 
bladeFightingAbility.name = "BLADFG"; 
bladeFightingAbility.description = "Blade fighting ability"; 
bladeFightingAbility.points = 3;

Before the start of the gaming unit can view the list of public audits required for passage, also the creator can hide part of checks to create interesting game situations.

Know-hou? It will be interesting? Personally, I find this an interesting system that allows both to ensure the freedom of creativity creators of gaming units, and the ability to transfer characters from a different, but similar in characteristics to the players modules.

Hash Table

Hash table data structure allows to realize an associative array (dictionary), with an average capacity of O (1) to insert, delete, search.

Below is an example of a simple implementation of a hash mapy on nodeJS:

How it works? Watching the hands:

  • Inside is an array of hash mapy
  • Inside the element of the array is a pointer to the first node of a linked list
  • Partitioning the memory to an array of pointers (e.g. 65,535 cells)
  • Implement the hash function, the input dictionary is the key, and at the outlet it can do just about anything, but in the end returns the array index

How does the record:

  • At the entrance there is a pair of key – value
  • The hash function returns the index on
  • Get node linked list from an array by index
  • Check whether it matches the key
  • If it matches, then replace the value
  • If it does not, then move on to the next node, until we find or do not find the node with the correct key.
  • If the node has not found, we create it at the end of a linked list

How does the search key:

  • At the entrance there is a pair of key – value
  • The hash function returns the index on
  • Get node linked list from an array by index
  • Check whether it matches the key
  • If it matches, the return value
  • If it does not, then move on to the next node, until we find or do not find the node with the correct key.

Why do we need a linked list in the array? Because of possible conflicts in the calculation of the hash function. In such a case several different key-value pairs will be located on the same index in the array, in such a case is carried out by extending the linked list with the search key necessary.



Source Code


Resources access through NDK C++ Android

To work with resources in Android through ndk – C ++ there are several options:

  1. Use access to the resources of the apk file using AssetManager
  2. Download resources from the Internet and extract them in the application directory, used by standard methods C ++
  3. Combined method – to get access to the archive with resources apk through AssetManager, unpack them in the application directory, then use with standard C ++ techniques

Next, I will describe the combination of access methods using the game engine Flame Steel Engine.
When using SDL can facilitate access to the resources of the apk, library wraps the calls to AssetManager, offering a similar interface to the stdio (fopen, fread, fclose, etc.)

SDL_RWops *io = SDL_RWFromFile("files.fschest", "r");

After the file download from the apk to the buffer, you need to change the current working directory to the application directory, it is available to the application without additional permits. For this we use a wrapper on SDL:


Next, write down the file from the clipboard to the current working directory using fopen, fwrite, fclose. After the archive will be available in the directory for C ++, unpack it. Archives zip can extract a combination of two libraries – minizip and zlib, the first structure can work with files, the second decompresses the data.
For more control, simplicity, portability, I realized own archive format with zero compression called FSChest (Flame Steel Chest). This format supports the directory archiving files, and unpacking; Support folder hierarchy is missing, can work only with files.
Connecting the library header FSChest, unpack the archive:

#include "fschest.h" 
FSCHEST_extractChestToDirectory(archivePath, SDL_AndroidGetInternalStoragePath()); 

After unpacking, C / C ++ interfaces will be available files from the archive. So I did not have to rewrite all the work with the files in the engine, and add only unpacking files at startup.



Source Code


Stack Machine and RPN

Let’s say we need to implement a simple bytecode interpreter, which approach to the implementation of this task to choose?

Stack data structure provides an opportunity to implement a simple bytecode machine. Features and realization of machines stack described in numerous articles of Western and domestic Internet, just mention that the Java virtual machine is an example of a stack machine.

The principle of operation of the machine is simple, the input is a program containing data and opcodes (opcodes), using the stack manipulations performed realization of necessary operations. Consider the example of the program bytecode my stack machine:

пMVkcatS olleHП

At the output we get the string “Hello StackVM”. Stack machine reads the program from left to right, character by character by uploading data onto the stack, with the appearance of the opcode to symbol – performs implementation team using the stack.

An example of the implementation of the stack machine to nodejs:

Reverse polish notation (RPN)

Also, stacking machine is easy to use for the implementation of calculators, this is done using RPN (postfix notation).
An example of a conventional infix:

Converted into RPN:

To calculate postfix notation use a stack machine:
2 – at the top of the stack (stack 2)
2 – on top of the stack (Stack: 2.2)
* – get the top of the stack twice, multiply the result is sent to the top of the stack (stack of 4)
3 – on top of the stack (the stack 4, 3)
4 – on top of the stack (a stack of 4, 3, 4)
* – get the top of the stack twice, multiply the result is sent to the top of the stack (stack of 4, 12)
+ – get the top of the stack twice, add up the results, go to the top of the stack (stack 16)

As you can see – the result of operations 16 remains on the stack, it can be derived by implementing opcodes stack printing, for example:

П – print start opcode stack, n – opcode closure print stack and sending the final line in rendering.
For conversion from arithmetic operations in postfix infix used Edsger Dijkstra algorithm called “Shunting-yard algorithm”. An example implementation can be found above or in the project repository on the machine stack nodejs below.



Source Code


Skeletal Animation (Part 2 – Node Hierarchy, Interpolation)

Algorithm goes on to describe skeletal animation, as its implementation in the game engine Flame Steel Engine.

Because the algorithm is the most complex of all that I implemented, in the notes on the process of development can occur errors. In the last article of this algorithm, I made a mistake, bone mass is passed to the shader for each mesh separately, rather than for the entire model.

Node Hierarchy

To work correctly you need to model the algorithm contained a link bones together (graph). Imagine a situation in which both played two animations – jumping and raising his right hand. Animation jump should raise the model on the Y axis, the animation show of hands should take this into account and to rise along with the model in a jump, otherwise the hand will remain on its own on the spot.

Describe the relationship of nodes in this case – the body contains a hand. In developing the algorithm will produce bone graph reading, all animations will be included with the correct connections. The memory model graph is stored separately from all animations, just to reflect the connectivity model bones.

Interpolation on CPU

In the last article, I described the principle of rendering skeletal animation – “transformation matrix are transferred from the CPU to the shader when rendering each frame.”

Rendering each frame is processed on the CPU, for each bone mesh engine receives a final transformation matrix by interpolation position, rotation, zoom. During the final interpolation bone matrix produced by extending the tree nodes for all active nodes animations, final matrix is ​​multiplied to the parent, is then sent to the rendering in the vertex shader.

For interpolation position and increasing use of the vector, quaternions are used to rotate because they are very easy interpolated (SLERP) in contrast to the Euler angles, as they are very easy to imagine a transformation matrix.

How to simplify the implementation of

To simplify debugging work vertex shader, I added the simulation work on the vertex shader CPU using FSGLOGLNEWAGERENDERER_CPU_BASED_VERTEX_MODS_ENABLED macro. At NVIDIA graphics cards manufacturer has a tool for debugging the shader code Nsight, perhaps she, too, can simplify the development of complex algorithms vertex / pixel shaders, however, test the functionality I have not had the opportunity, enough simulation on the CPU.

In the next article I plan to describe the mixing of multiple animations, supplement to fill the remaining gaps.