Out of curiosity about how JPEG works, I decided to try writing a JPEG decompressor. This is the start of the project. Currently it only works on greyscale images. Missing from this is the ability to do color, downsampling, and reset markers. Hopefully soon, if I have time, I'll add these features. I also plan on speeding up the Huffman decoding and the IDCT function using SSE2. Currently it outputs a lot of debug information, so it's also quite slow :).
jpeg-2006-05-08.tar.gz (Source Code)
This package is NOT open source. If you'd like to use this code in your own projects (commercial or not) let me know and we can work something out.
While working on JPEG decompression I thought about something. JPEG works by taking 8x8 pixel blocks and running a 2D DCT transformation on the pixels. The 2D DCT transformation by itself is nearly lossless (other than floating point rounding errors and the fact that each DCT coefficient is floor()'d). After building the DCT matrix, each DCT coefficient is divided by a quantization number. This is where JPEG really becomes lossy. The DCT coefficients also tend to order themselves a little bit. The first coefficient is typically a pretty big number while the trailing coefficients are smaller (possibly 0's, depending on the colors in the image and how high the quantization numbers are). Anyway, knowing this, I wondered if it was possible to take an image that has been compressed and create a score from 0 to 100 based on how hard the image was compressed. This program reads in a BMP file and runs an 8x8 2D DCT on every 8x8 block. It then counts the trailing coefficients that are lower than the threshold used.
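To make the DCT and quantization step concrete, here is a minimal sketch in Python. It is not the code from the package: the names dct_2d and quantize are made up for illustration, the test block is a hypothetical gradient, and a single quantization number is used for the whole block, whereas a real JPEG file carries a table of 64 of them. It also rounds where the program described here may floor.

```python
import math

N = 8

def dct_2d(block):
    """Forward 8x8 2D DCT (type II, JPEG scaling) of level-shifted samples."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency
        for u in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16.0)
                              * math.cos((2 * y + 1) * v * math.pi / 16.0))
            out[v][u] = 0.25 * c(u) * c(v) * total
    return out

def quantize(coeffs, q):
    """Divide every coefficient by one quantization number (an assumption,
    real JPEG uses a per-coefficient table) and round to an integer."""
    return [[int(round(value / q)) for value in row] for row in coeffs]

# Hypothetical 8x8 block: a smooth horizontal gradient, shifted by -128.
pixels = [[100 + 4 * x for x in range(N)] for _ in range(N)]
shifted = [[p - 128 for p in row] for row in pixels]

coeffs = dct_2d(shifted)
light = quantize(coeffs, 2)    # mild quantization
heavy = quantize(coeffs, 40)   # harsh quantization

def nonzero(matrix):
    return sum(1 for row in matrix for value in row if value != 0)

print("nonzero coefficients, light:", nonzero(light))
print("nonzero coefficients, heavy:", nonzero(heavy))
```

The larger quantization number leaves fewer nonzero coefficients, which is the effect the scoring idea relies on.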
To use this program type: ./anal_dct myimage.bmp 0
This will take in a BMP image called myimage.bmp and first convert it to YUV (ignoring the U and V values). For each 8x8 block of pixels, the pixel values minus 128 are printed and then the 8x8 DCT matrix is printed. Finally, the number of trailing 0's (0 being the threshold value I used), counted in zig zag order, is printed. After all the DCT's have been run, an array is shown in the form a = [ ] showing how many 8x8 blocks had each count of trailing 0's. In other words, if the second to last number in the array is 546, then there were 546 8x8 DCT blocks that had only 1 trailing 0. This array can be cut and pasted into GNU Octave or Matlab and plotted by typing plot(a) at the Octave command prompt.
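The counting step can be sketched roughly like this in Python. This is an illustration, not the program's actual code: zigzag_order, trailing_count and histogram are hypothetical names, the comparison "magnitude at or below the threshold" is my assumption about how the threshold is applied, and the array here is simply indexed by the number of trailing values, which may not match the order anal_dct prints.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in JPEG zig-zag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals run bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def trailing_count(coeffs, threshold=0):
    """Count coefficients at the end of the zig-zag scan whose magnitude
    is at or below the threshold (assumed interpretation)."""
    count = 0
    for r, c in reversed(zigzag_order(len(coeffs))):
        if abs(coeffs[r][c]) > threshold:
            break
        count += 1
    return count

def histogram(blocks, threshold=0):
    """a[k] = number of blocks with exactly k trailing values (k = 0..64)."""
    a = [0] * 65
    for block in blocks:
        a[trailing_count(block, threshold)] += 1
    return a

# Hypothetical integer DCT blocks, made up for this example.
flat = [[0] * 8 for _ in range(8)]
flat[0][0] = -128                      # only the first coefficient is set
busy = [[1] * 8 for _ in range(8)]     # every coefficient is nonzero
edge = [[0] * 8 for _ in range(8)]
edge[0][0], edge[0][1], edge[1][0] = 50, -7, 3

a = histogram([flat, busy, edge])
print("a = [ " + ", ".join(str(n) for n in a) + " ]")
```

A block like flat, where only the first coefficient is set, ends in 63 trailing 0's no matter how it was produced.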
The score value is a guess at how likely it is that this image was heavily JPEG compressed. If the score is close to 100, then the image was probably previously heavily compressed with JPEG. A score around 15 means it was probably done with medium compression. A score of 1 probably shows that it was never compressed at all. Keep in mind this isn't foolproof at all. An image with not many colors will appear to have been heavily compressed. For example, one of my test images, a 320x200 pure black BMP, gets a score of 98.
anal_dct-2006-05-11.tar.gz (Source Code)
This package is NOT open source. If you'd like to use this code in your own projects (commercial or not) let me know and we can work something out.
JPEG In Practice (What Not To Do)
I recently (for some stupid reason) bought a crappy digital camera from Walmart. The images created with this camera are corrupted. If you'd like to read about why they are wrong, I put up an explanation here.
Copyright 1997-2013 - Michael Kohn