hckrnws
How Hackerman would create an image just by typing 0 and 1 – deep dive into GIF
by happybits
Some years ago (1996) I wrote a GIF decoder from scratch based entirely on http://qzx.com/pc-gpe/gif.txt (and a description of LZW bundled with my copy of pcgpe that seems to be missing from that page). I remember it being quite a struggle against off-by-one errors.
I do wish there was a really good format for describing binary file formats in a way that was amenable to codegen. Kaitai https://kaitai.io/ seems to be the state of the art.
Kaitai looks nice - have you used it enough to review how it handles? I'm just starting a project to deal with somewhat involved on-disk formats[0] and this might be helpful.
[0] The other day, someone was asking for a "tar2ext4" tool, and I thought "hey, that should exist, and I need a side project!". I was prepared to use an annotated hex viewer ( https://hachoir.readthedocs.io/en/latest/wx.html ) and hand roll the encoder/decoder, but I'll happily take tool assistance:)
Kaitai is good if:
- your format is fully known (it's less helpful if you're trying to incrementally build a parser while reverse engineering)
- you want to read files, but don't care about writing
- you don't mind that the development is not very active
For writing "tar2ext4" I would genuinely look at how much work it would be to run the ext4 code from the kernel in a different context; there's a lot of it to consider. Or do what the Apple "dmg" tooling does and make a ramdisk.
Okay, so of interest but maybe not applicable to my usecase. Thanks:)
Yeah, it remains to be seen how complex the actual format/code is. Would need to balance the difficulty of recreating it (which I assume to be quite high) against difficulty of extracting kernel code... although https://github.com/lkl/linux exists so for all I know maybe it's easy¯\_(ツ)_/¯
And yes, if I needed to actually write a "tar2ext4" tool today - like, start working in the morning and have it done by EOD - I would absolutely use... actually probably a loopback device rather than a true ramdisk, but yeah. But that requires root access and fiddling with loopback config, which seems excessive for what is, ultimately, just another archive format (from a certain point of view). And honestly some of it is just that it sounds fun to get my hands dirty with filesystem code in userspace:)
(Forgive my replying to myself; I'm out of the edit window and this seemed good to leave in case someone stumbles across this comment)
Yeah so I'm maybe a touch dumb and missed this right in the lkl readme - their demos all but include that exact function:
fs2tar - a tool that converts a filesystem image to a tar archive
cptofs/cpfromfs - a tool that copies files to/from a filesystem image
lklfuse - a tool that can mount a filesystem image in userspace, without root priviledges, using FUSE
Love learning about image/video formats and I enjoy how you broke it all down. Just a heads up, pretty sure you've got a typo:
"Next is the Global Packed Field, which in this case is 70 which in binary form is 00000000."
70 in binary would be 1000110. (64 + 0 + 0 + 0 + 4 + 2 + 0)
This pops up again later when he says 81 is 10000001. Or that width 8 corresponds to 04. I don’t know enough about the gif format to know if I just misunderstood these parts or if they were written incorrectly, but it was a bit confusing.
81 in hex is 10000001 in binary, so I think this part is correct
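The disagreement here comes down to whether "70" and "81" are read as decimal numbers or as hex bytes. A quick check in Python (purely illustrative, not taken from the article) shows both readings:

```python
# "70" can be read as a decimal number or as a hex byte; the results differ.
as_decimal = format(70, "b")      # 70 read as decimal
as_hex = format(0x70, "08b")      # 70 read as a hex byte, padded to 8 bits
byte_81 = format(0x81, "08b")     # 81 read as a hex byte

print(as_decimal)  # 1000110
print(as_hex)      # 01110000
print(byte_81)     # 10000001
```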
Thanks mikecx, it was a typo, I've changed the text to "Next is the Global Packed Field, which in this case is 00..."
CUSTOMER
Hackerman, I need an impressive icon for my website. It should be 5x5 pixels big and look like a rabbit. Can you please draw it for me?
HACKERMAN
Draw!? Bah! I don’t need any graphic program for that. I am Hackerman. I will code it for you. You will get the image next week.
CUSTOMER
Next week?? But…
HACKERMAN
No buts! I just need to read about how the GIF file format works, then I can create the image in no time.
[TIME PASSES]
After spending some evenings, Hackerman gets the main idea of how the GIF file format works and the compression algorithm called LZW. With that knowledge, he succeeded in creating the image within an hour.
Hackerman calculated that the binary of the image should be as follows:
47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 05 00 05 00 81 11 11 11 FF FF FF D5 D7 D9 00 00 00 07 0F 80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81 00 3b
So he just opened his code editor, saved the file as rabbit.gif, and sent it to his customer. Boom! Easy-peasy!
Do you want to understand the GIF file format and be as cool as Hackerman?
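For anyone who wants to reproduce the file without a hex editor, here is a minimal sketch that turns the byte string quoted above into a file and reads a few fields back. The offsets assume the standard GIF89a layout (6-byte header, 7-byte Logical Screen Descriptor, then the Image Descriptor); the names `HEX`, `data` and `write_gif` are made up for this example.

```python
from pathlib import Path

# The bytes quoted from the article, split by block for readability.
HEX = (
    "47 49 46 38 39 61 00 00 00 00 70 00 00 "      # header + screen descriptor
    "2c 00 00 00 00 05 00 05 00 81 "               # image descriptor
    "11 11 11 FF FF FF D5 D7 D9 00 00 00 "         # local color table (4 colors)
    "07 0F 80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81 00 3b"  # image data + trailer
)

data = bytes.fromhex(HEX)  # fromhex ignores the spaces between byte pairs


def write_gif(path):
    """Write the bytes to `path` and return how many bytes were written."""
    return Path(path).write_bytes(data)


signature = data[:6].decode("ascii")              # "GIF89a"
width = int.from_bytes(data[18:20], "little")     # image descriptor width
height = int.from_bytes(data[20:22], "little")    # image descriptor height
block_len = data[36]                              # size of the one data sub-block

print(signature, width, height, block_len, len(data))
```

Calling `write_gif("rabbit.gif")` would save the file; whether a given viewer accepts it as-is is a separate question.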
I teach a foundational media technology course at one of the bigger European art universities, and I do the same thing with the students using a broadcast wave file.
The goal of the thing isn't to turn them into hackers, it is to give them a feeling for what the stuff they work with is made of, what a file is. This is also a great introduction to talk about compression, metadata, encoding, decoding, sample rate, bit depth and so on.
If you dive that deep into it, the settings in a typical media conversion program will suddenly become much less intimidating. My motto always was: this was made by humans so it should be possible for humans to understand it as well. And this is maybe the "hidden" lesson: If you bring enough patience you can go into the depth of nearly every topic.
I absolutely love that you do this. There appears to be a trend to discount the value of knowing the low-level details of these sorts of things -- but even if you never actually work at that level, knowing what's happening "behind the scenes" increases and deepens understanding and can make high-level things that seem arbitrary or nonsensical make sense.
First principles and curiosity > blithe unconcern about the "magic box". Likewise, coders who don't have enough understanding all the way down the stack including hardware are merely engineer posers in it for a paycheck.
Thanks a lot JohnFen.
One reason for writing the post and trying to create GIF images from binary was to come as close as possible to the GIF format, and especially the LZW algorithm: to understand it in an intuitive way, and to get a bit closer to the minds who created the algorithm in the first place.
I forgot most of what little I learned in semiconductor physics and analog design courses in college, but I remember enough to know that nothing about computers is magic. Nothing. That knowledge gives me confidence that I can figure out anything having to do with computers given enough quiet time.
On the other hand, I don't really know much about cars. If I had a 2024 model year car and something went wrong with it, I'd very quickly run up against a wall of arcane-seeming knowledge that I don't have easy access to. I don't know where I would start to learn everything about cars from scratch like I did with computers.
There might be a name for this, but I don't know what it is.
Exactly this is the reason I do it. The "knowledge" part will fade if it isn't applied regularly. If I manage to convey a deeper understanding of how things are intertwined or at least the feeling that it is actually possible to understand things and that curiosity is king, that is all I can do.
It is an art school after all. I will never turn all of them into low level systems programmers, but I can give them enough understanding of the tech world that it can become the topic of art without making people with a tech background cringe.
And some of them take this as a starting point to go way deeper.
> nothing about computers is magic
In this vein I think about nelhage's "Computers Can Be Understood" from time to time - https://blog.nelhage.com/post/computers-can-be-understood/ .
computers contain transistors and how transistors work is by magic.
before that, tubes. tubes also work by magic.
before that, relays, which are based on electromagnets. electromagnets work by magic.
> how transistors work is by magic
ok but it's "just" semiconductor band-gap physics, which my monkey grug brain can kind of just barely make out, if I squint
> tubes also work by magic
tubes work by "just" thermionic emission, or "tube physics", which grug brain has a better handle on
> electromagnets work by magic
okay, you've got me there, physics grug brain think magnets high magic.
tubes work by not just thermionic emission (that'd be a diode) but also by a sort of uncanny leveraging of inverse square laws in free space surrounding the grid. when the first tube amplifier was invented, it was by accident, it wasn't understood how it worked.
You can make computers (in a non economical fashion) using pipes of water and water actuated valves. This can be done in a manner _highly_ analogous to transistor, tube or relay machines. That was what convinced me that computers are not magic.
atoav: this sounds like a great exercise. Have you written any public article about that?
The main reasons for writing this article were:
* My curiosity: what secret is behind all the strange bytes in a graphics file?
* To spend some time doing something weird that I know will have zero effect on my career and no chance of giving any benefit or profit in the future.
That first
> 00 00 00 00
should be `05 00 05 00`.
Yeah the article says these "aren't used anymore," but my image viewer (sxiv) complained that the gif was corrupted until I put something there. It still displayed the image, but it also complained afterwards.
In the case of viewnior, it said that the resulting image has zero size and displayed nothing. That statement of it not being used seems false.
Aha, thanx for the info, I didn't know. So the canvas width and height should be the same as the width and height?
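To make the suggested fix concrete: the canvas width and height are two little-endian 16-bit values at byte offsets 6 to 9, right after the "GIF89a" signature. A small sketch of patching them (the helper name `set_canvas_size` is invented here, and only the first 13 bytes of the file are used to keep it short):

```python
def set_canvas_size(gif, width, height):
    """Return a copy of `gif` with the Logical Screen Descriptor size replaced."""
    patched = bytearray(gif)
    patched[6:8] = width.to_bytes(2, "little")    # canvas width, low byte first
    patched[8:10] = height.to_bytes(2, "little")  # canvas height, low byte first
    return bytes(patched)


# Header and screen descriptor as posted, with the zeroed canvas size.
header = bytes.fromhex("47 49 46 38 39 61 00 00 00 00 70 00 00")
fixed = set_canvas_size(header, 5, 5)

print(fixed[6:10].hex())  # 05000500
```

For a single frame placed at offset 0,0, a canvas equal to the image size seems like the natural choice; a larger canvas should also be valid as far as I can tell.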
Yes? Is that such a bad thing? Is this trying to say there is no value in learning something so low level. Exploration leads to learning, and learning leads to innovation. Perhaps Hackerman will go on to create the greatest image encoding library for images so that they can easily scale from 5x5 to 25x25 or more and fit in the same space as the 5x5. Who knows.
He’s quoting from TFA.
Defeats the purpose a bit, but it's far easier to type out a PBM [0] file by hand and then just convert it to gif using e.g. imagemagick
There's a cute plugin[0] for Vim which converts any image to XPM, which is a similar format that Vim has syntax-coloring for. You can edit the text, and then on save, it will get converted back to the original format. I've used it a few times to quickly preview an image or edit a favicon. It's more of a party trick than seriously useful, though.
Not even just manually typing... Last time I wanted to have a program save a picture [0] it was easiest to write PPM and then convert that to a real format. Super inefficient file sizes, but good tradeoff for a hobby project. I can take some big intermediate files in exchange for not needing a graphics file format library:)
[0] I was playing with the Linux framebuffer and wrote - among other things - a screenshot tool.
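As a rough illustration of why PPM is so convenient for this, a plain-text P3 file is just a magic number, the dimensions, the maximum channel value, and then the RGB triples. The function name `ppm_p3` and the 2x2 test image below are invented for the example:

```python
def ppm_p3(pixels):
    """Serialize rows of (r, g, b) tuples as a plain-text PPM (P3) string."""
    height = len(pixels)
    width = len(pixels[0])
    lines = ["P3", f"{width} {height}", "255"]  # magic, size, max channel value
    for row in pixels:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"


# A 2x2 checkerboard of white and near-black pixels.
white, dark = (255, 255, 255), (17, 17, 17)
text = ppm_p3([[white, dark], [dark, white]])
print(text)
```

Saving `text` to a `.ppm` file and running it through a converter such as ImageMagick then gives a "real" format without linking any image library.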
Wait, TFA doesn't even contain a link to this vid from Hackerman hacking time!?
(my favorite part is when he goes into hardcore hacking mode while putting a Nintendo glove on)
What was the follow up content that's no longer available?
At some point the writer showed someone the script and said "I'm going to make this" and a bunch of people looked over it and said "thor, dinosaurs, hitler ... bullets going through the phone? right, ok, sure."
Eh, yes it did :-)
Reminds me of graphic designer vs css programmer
> Next is the Global Packed Field, which in this case is 70 which in binary form is 00000000
… what? 70 should be 1000110 surely?
70 in hex. So 01110000. The 111 bits are unused according to the article so it can be anything.
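For reference, here is a small sketch of how that packed byte splits up under the GIF89a Logical Screen Descriptor layout. The function and key names are made up for illustration. The three set bits land in the color resolution field, which, as far as I know, most decoders ignore.

```python
def decode_screen_packed(byte):
    """Split the Logical Screen Descriptor packed byte into its GIF89a fields."""
    return {
        "global_color_table": bool((byte >> 7) & 1),  # bit 7
        "color_resolution": (byte >> 4) & 0b111,      # bits 6-4
        "sorted": bool((byte >> 3) & 1),              # bit 3
        "table_size_field": byte & 0b111,             # bits 2-0
    }


fields = decode_screen_packed(0x70)
print(format(0x70, "08b"), fields)
```

With 0x70 the global color table flag is off, which fits the posted file: its only color table is the local one that follows the image descriptor.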
One great resource for GIF-related explorations is Matthew Flickinger's "What's In A GIF" project:
* https://www.matthewflickinger.com/lab/whatsinagif/index.html
The original version is apparently from ~2005 and is used as the basis of the giflib docs referenced by the original article[0]. (The giflib docs do expand on the content of the original, so are still worth reading.)
But Matthew Flickinger's original version has continued to be updated as recently as 2022[1] and now includes two helpful browser-based GIF tools:
* GIF Explorer: https://www.matthewflickinger.com/lab/whatsinagif/gif_explor...
* GIF Encoder: https://www.matthewflickinger.com/lab/whatsinagif/gif_encode...
GIF Explorer displays the "interpreted" bytes of any GIF file in an almost "literate" style and has an UI/UX which I'd be really interested to see used in a generic reverse-engineering/binary viewer tool.
GIF Encoder enables you to create an image in the browser & see how it is GIF encoded.
I have a rant about how modern GIF usage could be so much better than it is (and still be within the original specification) but instead of subjecting you to that I'll subject you to this project of mine instead: https://audiogif.rancidbacon.com
I absolutely love the "What's In A GIF" series. It's what inspired me to write my own GIF decoder while learning Erlang at the same time: https://github.com/avik-das/giferly
The first time around, I struggled a lot with decoding errors. Many years later, after being a more experienced developer, I wrote the LZW decompression with unit tests. Doing so forced me to think about each edge case, and fix issues without breaking existing functionality. Very quickly, I was able to open pretty much any GIF file I threw at it!
Thanks follower
I've read his posts about GIF and referred to it at the end of my article. But I didn't know about "GIF Encoder" and "GIF Explorer" - interesting!
Bah. Should've used PCX files for that extra steganographic space goodness. (JPG is too easy.)
If you can't create punchcards or hex blindfolded, there are always tools:
[pdf] https://www.pedramhayati.com/images/docs/survey_of_steganogr...
[zip] https://ftp.funet.fi/pub/crypt/archive/idea.sec.dsi.unimi.it...
[zip] https://web.archive.org/web/20230828124101/https://dl.packet...
> In Visual Studio Code, there is an extension called Hex Editor, which lets you view and edit the binary file.
I'll take this opportunity to bring up a method to patch binary data in True Scots^H^H^H^H^HHackerman fashion, using nothing more than vim and xxd, which are already installed everywhere (for some definition of "everywhere").
LiveOverflow describes it between 5:02-7:46 in:
https://www.youtube.com/watch?v=LyNyf3UM9Yc&t=302s
It is the `:%!xxd` and `:%!xxd -r` trick.
(Trying it out again before commenting here, it seems like one might need to `:set nofixeol` beforehand so as not to append any nonexistent newline at the end of file).
My mind is blown by this trick, and I've still never got around to understanding wtf is happening here. (ETA: Upon some thought, I reckon `xxd -r` can just chug along happily by completely ignoring the ascii rendering columns.)
This isn't so much a "trick" as it is the main purpose of xxd. xxd is distributed with vim (as in, I'm pretty sure if you want to send a patch to xxd, you send it to the vim repo). One of its primary purposes is to allow for the editing of binary files in Vim.
I had no idea!
Netpbm https://en.wikipedia.org/wiki/Netpbm is even easier
I like reading articles like this. Compression is always a hard part for me to understand, just feels very unintuitive
Note that the article glosses over the first byte of image data, which specifies how many bits each pixel occupies, and sets it to 07, which makes the first 128 symbols of the resulting LZW stream byte-aligned. This is probably never the case in practice (for the presented images it should be 1 and 2, respectively).
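To see what the byte alignment buys: with a minimum code size of 7, the LZW codes start out 8 bits wide, so each data byte is exactly one code until the code table reaches 256 entries. Below is a minimal sketch of a decoder that relies on that shortcut, fed with the 15 data bytes from the rabbit file quoted earlier in the thread. It is a simplified illustration (the name `lzw_decode_bytes` is made up), not a general GIF decoder, since it never unpacks variable-width codes.

```python
def lzw_decode_bytes(codes, min_code_size=7):
    """Decode GIF LZW data in which every code is exactly one byte wide."""
    clear = 1 << min_code_size           # 128: resets the code table
    end = clear + 1                      # 129: end-of-information code
    table = {i: [i] for i in range(clear)}
    next_code = end + 1
    prev = None
    out = []
    for code in codes:
        if code == clear:
            table = {i: [i] for i in range(clear)}
            next_code = end + 1
            prev = None
            continue
        if code == end:
            break
        if code in table:
            entry = table[code]
        elif prev is not None and code == next_code:
            entry = prev + [prev[0]]     # code refers to the entry being built
        else:
            raise ValueError(f"unexpected code {code}")
        out.extend(entry)
        if prev is not None:
            table[next_code] = prev + [entry[0]]
            next_code += 1
            if next_code >= 256:
                raise ValueError("codes would widen to 9 bits; shortcut no longer applies")
        prev = entry
    return out


block = bytes.fromhex("80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81")
pixels = lzw_decode_bytes(block)
for y in range(5):
    print(pixels[y * 5:(y + 1) * 5])
```

Each printed row is one line of the 5x5 image, with the values being indices into the color table.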
Someone needs to release bit for bit break downs of common image formats. It’s always fun to draw by hand.
It is mentioned near the end though.
Thanks, I wonder if I can now train an AI to create exquisite GIFs in binary using these instructions.