How does 'dictionary size' affect compression?

Solution 1:

Repeated patterns are stored in a dictionary, and a short code is substituted for each one.

THIS IS AN OVERSIMPLIFICATION

aaaaaaaaaaaaaaaaaaaaaaaa  0001
bbbbbbbbbbbbbbbbbbbbbbbb  0002
alsdjl;asjdfkl;asdfjkljj  0003

Instead of storing the whole line, the compressor just puts the code in its place. The larger the dictionary, the more codes it can handle. Normally, when a dictionary becomes full, a new one is started on the fly. The new dictionary starts out blank, and new codes are assigned to patterns as they are detected.
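The substitution idea above can be sketched as a toy in Python. This is only an illustration of the simplified description, not how any real archiver works; the function name `encode` and the `max_entries` limit are made up for the example.

```python
def encode(lines, max_entries):
    """Toy dictionary coder: replace each repeated line with a short code.

    When the dictionary is full it is thrown away and a blank one is
    started, as in the simplified description above.
    """
    dictionary = {}
    output = []
    for line in lines:
        if line in dictionary:
            # Seen before: emit the short code instead of the whole line.
            output.append(("code", dictionary[line]))
        else:
            if len(dictionary) >= max_entries:
                dictionary = {}  # dictionary is full: start a blank one
            dictionary[line] = len(dictionary) + 1
            output.append(("literal", line))
    return output


lines = ["a" * 24, "b" * 24, "a" * 24, "b" * 24]

# Room for both patterns: the repeats become codes.
print(encode(lines, max_entries=2))

# Room for only one pattern: every line has to be stored in full.
print(encode(lines, max_entries=1))
```

With room for two entries, the third and fourth lines are replaced by codes. With room for only one, the dictionary keeps getting reset before a repeat shows up, so nothing is saved.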

Generally, larger is better, up to a point. The entire dictionary is held in memory, so you need more RAM than the dictionary size.

The right dictionary size depends on how compressible your data is, the number of files, their individual sizes, and the overall size.

Generally, 32 MB is more than enough, but if you're compressing numerous multi-gigabyte files, a much higher number can be used. Larger dictionaries often make the process slower, but result in a smaller file.

Solution 2:

Compression depends on what you are compressing as well as how much RAM you have. Already-compressed files like pictures and videos are more difficult to compress than text, or than directories containing many similar or identical files.

I have six backup directories of a personal PHP website (XAMPP) with mostly minor differences between them. Each main directory is about 600 MB to 1 GB, give or take, totaling about 6 GB for all directories together. Again, they all hold similar files.

Dictionary size versus compressed size, with all directories in a single archive:

4M dictionary = 1,687,995KB
24M dictionary = 1,685,337KB
128M dictionary = 1,685,336KB
512M dictionary = 1,685,336KB (no change from 128) 
1024M dictionary = 315,224KB 

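Since every directory is roughly the same, and each is larger than 512M, the 1024M dictionary seems to be the best in this situation. It is presumably the only one large enough for the compressor to match files in one directory against their copies in the previous one.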
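The same cliff can be reproduced on a small scale with Python's standard `lzma` module, which accepts a `dict_size` in its LZMA2 filter. This is a rough sketch rather than a reproduction of the test above: the random 256 KiB block is a stand-in for one backup directory, and the two dictionary sizes are picked purely for illustration.

```python
import lzma
import os


def xz_size(data: bytes, dict_size: int) -> int:
    """Return the size of `data` compressed as .xz with the given dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


# Stand-in for two near-identical backup directories: one incompressible
# 256 KiB block followed by an exact copy of it.
block = os.urandom(256 * 1024)
data = block + block

small = xz_size(data, 64 * 1024)    # dictionary smaller than one copy
large = xz_size(data, 1024 * 1024)  # dictionary larger than one copy

print(f"input size:        {len(data):,} bytes")
print(f"64 KiB dictionary: {small:,} bytes")
print(f"1 MiB dictionary:  {large:,} bytes")
```

With the small dictionary the compressor cannot see back as far as the first copy, so the output should stay close to the input size. With the large dictionary the second copy can be encoded as references to the first, and the output should drop to roughly half.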

Compressing 6 GB worth of folders of different music using the same 1024M dictionary size resulted in a compression ratio of only 96%: 5.76 GB instead of 6 GB.

The best thing to do for compression of video is lossy compression, where you use a program to convert the video. Try lowering the bit rate to something you don't notice, or to a video quality you can accept. HandBrake is a decent video tool, but there are many. VLC is capable of video compression as well, using the convert option. Both programs are free to use.