For instance, I don't think they ever thought anyone would want to index the 6-piece tablebases with that scheme. It uses 1 index per 8K positions, while I believe the RobboTotalBases use 1 per 64K positions, or even 1 per 1M positions (something called "hyper-indexing", though what it actually does is not immediately clear).
I have some idea what is going on, having generated endgames for:
chess (see http://3.bp.blogspot.com/_KQ8DMAfZCik/S ... Gothic.jpg),
checkers (see http://www.liquidnitrogenoverclocking.com/report.txt),
and Gothic Chess (see http://www.gothicchess.com/javascript_endings.html).
First of all, each position is translated to an index (just a number). That number tells you "how far into" the file to look for the result for that position. In a bitbase, the results can be highly compressed, and you can actually store several positions in a single byte. If, however, you are looking at a "distance to conversion" tablebase, every position typically requires one byte. If it is a "distance to win" tablebase, it can take up to 2 bytes per position.
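To put some rough numbers on that, here is a minimal C++ sketch of how a lookup differs between the three formats. The layouts (2-bit win/draw/loss packed four to a byte, one byte for distance-to-conversion, two little-endian bytes for distance-to-win) are illustrative assumptions, not any particular tablebase's actual format:

```cpp
#include <cstdint>
#include <vector>

// Bitbase: a 2-bit win/draw/loss result, so 4 positions share one byte.
uint8_t bitbase_lookup(const std::vector<uint8_t>& data, uint64_t index) {
    uint8_t packed = data[index / 4];
    return (packed >> ((index % 4) * 2)) & 0x3;
}

// Distance-to-conversion: one byte per position (distances fit in 0..255).
uint8_t dtc_lookup(const std::vector<uint8_t>& data, uint64_t index) {
    return data[index];
}

// Distance-to-win: distances can exceed 255, so up to two bytes each.
uint16_t dtm_lookup(const std::vector<uint8_t>& data, uint64_t index) {
    return static_cast<uint16_t>(data[2 * index]) |
           (static_cast<uint16_t>(data[2 * index + 1]) << 8);
}
```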
The tablebases cluster indices together in what are called BLOCKS. Most blocks for smaller tablebases are 8K in size. This is what I believe you were calling an "index" earlier. Each BLOCK is indexed, starting from 0 and going up to however many blocks a tablebase needs.
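So looking up a position really just means splitting its index into a block number and an offset within that block. A tiny sketch, assuming an 8K block size:

```cpp
#include <cstdint>

constexpr uint64_t BLOCK_SIZE = 8 * 1024;   // entries per BLOCK (8K)

struct BlockAddress {
    uint64_t block;    // which BLOCK, numbered from 0
    uint64_t offset;   // how far into that BLOCK the entry sits
};

BlockAddress locate(uint64_t position_index) {
    return { position_index / BLOCK_SIZE, position_index % BLOCK_SIZE };
}
```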
The smaller your BLOCK size is, the better your RAM buffer usage will be, but the more memory you will need for the block index. Typically, with 4-byte BLOCK indices, each BLOCK needs 4 bytes of RAM for the buffering bookkeeping in addition to the block data itself.
So, with one 4-byte index per 8192 (8K) entries, you can load 1 million BLOCKs with 4 MB of RAM for the indices plus the 1 million x 8K = 8 GB of RAM for the data. You can see that using one 4-byte index per 64K BLOCK is not an earth-shattering savings of RAM. You can load 128,000 64K blocks, requiring roughly the same 8 GB of RAM, and you're using only 128,000 x 4 bytes = 512 KB (about 0.5 MB) of RAM for the buffering of the BLOCK indices.
So, you saved about 3.5 MB of RAM with 8 GB at your disposal.
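If you want to check the arithmetic, here it is as a tiny program (with round decimal figures the 64K block count comes out to 125,000, close to the ~128,000 above):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    const uint64_t data_bytes  = 1000000ULL * 8192;  // ~8 GB of block data
    const uint64_t index_entry = 4;                   // one 4-byte index per block

    const uint64_t blocks_8k  = data_bytes / 8192;    // 1,000,000 blocks
    const uint64_t blocks_64k = data_bytes / 65536;   // 125,000 blocks

    std::printf("8K blocks : %llu indices, %.1f MB of index RAM\n",
                (unsigned long long)blocks_8k,  blocks_8k  * index_entry / 1e6);
    std::printf("64K blocks: %llu indices, %.1f MB of index RAM\n",
                (unsigned long long)blocks_64k, blocks_64k * index_entry / 1e6);
    return 0;
}
```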
What the 8K scheme "buys you" is less disk access. If none of your 128,000 resident 64K blocks contains the position that needs accessing, your buffer marks block #127,999 (the least recently seen one) as "back on disk", does a very expensive disk read of 64K of data, marks the newly read block as the "most recently seen" buffer block, and the links "downgrade" every other block by 1, leaving another 64K block just about ready to be paged back to disk.
If one of your 8K blocks needs to be paged out to disk, it's much faster, and it happens less frequently, since you are able to keep the very most recently seen blocks in RAM while the truly inactive ones page out to disk.
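What is being described is essentially a least-recently-used (LRU) buffer of blocks. A minimal sketch of that idea, with hypothetical names and a stubbed-out disk read, not any real tablebase's code:

```cpp
#include <cstdint>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <vector>

// Placeholder for the expensive read; a real one would seek to
// block_id * block_size in the tablebase file and decompress.
static std::vector<uint8_t> read_block_from_disk(uint32_t /*block_id*/,
                                                 std::size_t block_size) {
    return std::vector<uint8_t>(block_size);
}

class BlockCache {
    struct Entry {
        uint32_t id;
        std::vector<uint8_t> data;
    };

    std::size_t max_blocks_;
    std::size_t block_size_;
    std::list<Entry> lru_;   // front = most recently seen, back = next to evict
    std::unordered_map<uint32_t, std::list<Entry>::iterator> map_;

public:
    BlockCache(std::size_t max_blocks, std::size_t block_size)
        : max_blocks_(max_blocks), block_size_(block_size) {}

    const std::vector<uint8_t>& get(uint32_t block_id) {
        auto it = map_.find(block_id);
        if (it != map_.end()) {
            // Hit: promote this block to "most recently seen".
            lru_.splice(lru_.begin(), lru_, it->second);
            return lru_.front().data;
        }
        // Miss: every resident block effectively moves one step closer to
        // eviction; if the buffer is full, the least recently seen block
        // is marked "back on disk" and dropped to make room.
        if (lru_.size() >= max_blocks_) {
            map_.erase(lru_.back().id);
            lru_.pop_back();
        }
        // Very expensive disk read of one whole block.
        lru_.push_front({block_id, read_block_from_disk(block_id, block_size_)});
        map_[block_id] = lru_.begin();
        return lru_.front().data;
    }
};
```

Usage would look something like BlockCache cache(1000000, 8192); followed by cache.get(locate(index).block); (again, hypothetical names tying back to the earlier sketch).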
In my experience with checkers endgames, only about 20% of the entire tablebase of trillions of positions needs to be held in RAM, and that delivers about 95% of the performance of having the entire thing loaded.