I am Viktor Krapivenskiy. Here is my
CV.
Follows is the list of things I made.
luastatus,
a universal status bar content generator
luastatus is a universal status bar content generator. It allows the user to configure the way the data from event
sources is processed and shown, with Lua. Its main feature is that the content can be updated immediately as some
event occurs, be it a change of keyboard layout, active window title, volume or a song in your favorite music player
(provided that there is a plugin for it) — a thing rather uncommon for tiling window managers. Its motto is:
No more heavy-forking, second-lagging shell-script status bar generators!
It has a modular architecture, supporing plugins for providing data and barlibs for interacting with different status
bars. It supports
i3wm,
dwm,
lemonbar,
dzen/
dzen2,
xmobar,
yabar,
dvtm,
and others.
cmenu, a column-layout dynamic menu for the terminal
cmenu is like dmenu for terminal, but it’s also multi-column.
It can be used to select a Wi-Fi network to connect to, select a device to mount, and similar things.
jjhash, a simple string hash function that is faster than FNV-2 and has the same or better quality than FNV-2
jjhash is a string hash function.
It is:
- simple (also the implementation is very small and consists of a single header file);
- faster than FNV-2 (6x faster on pointer-and-length strings, 3x faster on null-terminated strings);
- either on par with, or better than, FNV-2 on statistical properties;
- streamable (i.e. supports fast calculation of the hash of concatenation A+B by the contents of B and the hashing state of A);
- has both 32-bit and 64-bit variants;
- licensed under public domain-like license (Unlicense).
libdeci, an arbitrary-precision decimal arithmetic library for C
libdeci is an arbitrary-precision decimal arithmetic library for C with add-on libraries
libdeci-kara implementing Karatsuba multiplication,
libdeci-ntt implementing multiplication via Number-Theoretic Transform (NTT), a variant of Fourier transform, and
libdeci-newt implementing fast inversion and division using Newton’s method.
It is faster than the
mpdecimal library.
See also the
libdeci WebAssembly demo.
calx, a bc-like programming language
calx an attempt to make a modern replacement for
bc, while preserving its best features, such as big-decimal
numbers and explicit support for interactivity in the language. It is a full-fledged programming language with
functions, local and global variables, lists, dicts, strings, etc.
It uses
libdeci for arithmetic operations.
This research project achieves a 3x—5x speedup over the
mpdecimal library. The paper describes the implementation
and discusses further possible optimizations.
It also present a simple cache-efficient algorithm for in-place 2n×n or n×2n matrix transposition,
the need for which arises in the “six-step algorithm” variation of the matrix Fourier algorithm, and which does
not seem to be widely known.
Another finding is that use of two prime moduli instead of three makes sense even
considering the worst case of increasing the size of the input, and makes for simpler answer recovery.
fiwia, a generator of x86-64 machine code for fixed-width multi-word arithmetics
The need for fixed-width multi-word arithmetics frequently arises in cryptography.
In this setting, full unroll is usually desirable in two reasons:
- loop overhead, and;
- the need to pass the carry flag to the next iteration, which, without unrolling, would have to be done via saving and restoring the carry flag (which is slow).
fiwia generates fully unrolled x86-64 assembly for fixed-width arithmetic operations, such as: addition/subtraction, masked addition/subtraction, negation, comparison, multiplication, bit shifts.
It’s generally faster than
libgmp’s
mpn_* (except for div/mod operations).
It provides useful information and mathematical proofs on the common methods of doing division and n-th root extraction on long numbers.
This information tends to be scattered across specialized literature and be in a hard-to-follow form.
In this paper I present my implementation of ordered set based on a trie.
It only supports integer keys and is optimized for market data, namely for what we call sequential locality.
The following is the list of what I believe to be novelties:
- cached path to exploit sequential locality in market data, and fast truncation thereof on erase operation;
- a hash table (or, rather, a cache table) with hard O(1) time guarantees on any operation to speed up key lookup (up to a pre-leaf node);
- hardware-accelerated "find next/previous set bit" operations with BMI2 instruction set extension on x86-64;
- order book-specific features: the preemption principle and the tree restructure operation that prevent the tree from consuming too much memory.
We achieve the following speedups over C++'s standard std::map container:
6x—20x on modifying operations,
30x on lookup operations,
9x—15x on real market data,
and a more modest 2x—3x speedup on iteration.
Support for Lua scripting in strace
I implemented support for Lua scripting in
strace as a participant of Google Summer of Code-2017 (
work product submission,
slides).
This project extends the
strace project with tampering capability, allowing the user to inject fake syscall results, and read
and write the memory of the process being traced.
It is a PoC of sharing Lua interpreter memory between multiple processes.
There’s nothing new in Lua being flexible: it allows to override all the memory management by passing a custom allocator to
lua_newstate. The only things left are to create a shared memory mapping and implement an allocator working in it.
That’s it! You can send me an email to ${my_github_handle}nine@gmail.com.