
Difference between FSQ and LFQ #152

Open
JasonShen-SH opened this issue Aug 6, 2024 · 2 comments

JasonShen-SH commented Aug 6, 2024

Hi,

I was wondering about the difference between FSQ (finite scalar quantization) and LFQ (look-up free quantization) for vector quantization.

Consider a single vector from the encoder output, regardless of the architecture, and suppose the quantizer only maps each dimension to -1 or 1. What is the difference then? LFQ still calls the quantized vector a "codebook vector" within the "codebook". But in essence, that "codebook" is a manually constructed concept: it is nothing more than the set of combinations of the quantization possibilities at each dimension, since it is not learned at all.

Then what exactly happens in FSQ? As the paper says: if we map each entry z_i to L values (followed by rounding to integers), we obtain a quantized ẑ, where ẑ is one of L^d unique possible vectors (d is the vector dimension, L is the number of quantization levels; L = 2 gives -1 and 1). I believe these L^d possible vectors are themselves the codebook anyway.
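To make this concrete, here is a tiny sketch (with hypothetical numbers) enumerating that implicit codebook as all combinations of per-dimension levels:

```python
import itertools

# Hypothetical example: with d = 3 dimensions and L = 2 levels (-1 and 1),
# the implicit "codebook" is just the set of all L**d = 8 sign combinations.
d, L = 3, 2
levels = [-1, 1]
codebook = list(itertools.product(levels, repeat=d))
print(len(codebook))  # 8
```

Nothing here is learned; the "codebook" exists only as the Cartesian product of the per-dimension levels.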

I hope someone can discuss this with me. Many thanks!



function2-llx commented Dec 11, 2024

I think both of them are lookup-free quantization; the LFQ method, a specific and simple instance of lookup-free quantization, just happens to occupy this general term (so, for clarity, I won't use an abbreviation for the general approach).

Lookup-free quantization works without explicitly maintaining embeddings for index lookup (e.g., via nearest-neighbor search); instead, it uses a pre-defined map (typically without any learnable parameters) to obtain an index. As you can verify, both LFQ and FSQ work like this.

Well, in essence, that "codebook" is a manually constructed concept: it is nothing more than combinations of the quantization possibilities at each dimension, since it is not learned at all.

Yes, I agree: the "codebook" in this context is implicitly represented as a set of integers.

So, since both are lookup-free quantization, they can only differ in their specific implementations, that is, in the pre-defined map that produces the index from the (projected) representation vector. As far as I can tell, LFQ simply uses the sign function, while FSQ defines something like f(z) = round((L - 1) / 2 * tanh(z)).
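A minimal sketch of the two maps, assuming the forms above (not the exact implementations in any particular repo):

```python
import numpy as np

def lfq_quantize(z):
    # LFQ: sign of each dimension; map 0 to +1 so codes stay in {-1, 1}
    return np.where(z >= 0, 1.0, -1.0)

def fsq_quantize(z, L=5):
    # FSQ (assumed form): bound each entry with tanh to (-(L-1)/2, (L-1)/2),
    # then round to integers, giving L levels per dimension
    half = (L - 1) / 2
    return np.round(half * np.tanh(z))

z = np.array([-2.3, -0.1, 0.4, 1.7])
print(lfq_quantize(z))       # [-1. -1.  1.  1.]
print(fsq_quantize(z, L=5))  # [-2. -0.  1.  2.]
```

With L = 2 the FSQ map degenerates to two levels per dimension, which is where the two methods look almost identical, as the question points out.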

Note that even without explicitly maintaining codebook embeddings, after obtaining the index (in base L), they still have to project the code to a higher dimension, and I believe the weight of that projection layer plays a role similar to the codebook embedding. I have also noticed that a popular explanation for this is that lookup-free quantization decouples the codebook embedding from the index-lookup procedure.
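For instance, that projection step could look like the following sketch (names and shapes are hypothetical; in practice the weight would be learned):

```python
import numpy as np

rng = np.random.default_rng(0)
code_dim, model_dim = 4, 64

# quantized code, e.g. an LFQ output in {-1, 1}^code_dim
code = np.array([1.0, -1.0, -1.0, 1.0])

# projection weight (random here for illustration; learnable in a real model)
W = rng.normal(size=(code_dim, model_dim))

# project the low-dimensional code up to the model dimension;
# W plays a role analogous to a codebook embedding table
embedded = code @ W
print(embedded.shape)  # (64,)
```

Each distinct code selects a distinct signed combination of the rows of W, which is why the projection weight ends up acting like an implicit codebook embedding.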
