Difference between FSQ and LFQ #152
I think both of them are lookup-free quantization; the LFQ method is just a specific and simple instance of lookup-free quantization that happens to occupy this general term (so for clarity, I won't use the abbreviation for the general approach). Lookup-free quantization works without explicitly maintaining embeddings for index lookup (e.g., via nearest-neighbor search); instead it uses a pre-defined map (typically without any learnable parameters) to obtain an index. As you can verify, both LFQ and FSQ work like this.
Yes, I agree: the "codebook" in this context is implicitly represented as a set of integers. Since both of them are lookup-free quantization, they can only differ in their specific implementations, that is, the pre-defined map that produces the index from the (projected) representation vector. As far as I can tell, LFQ simply uses the sign function, while FSQ defines something like f(z) = round((L - 1) / 2 * tanh(z)). Note that even without explicitly maintaining a codebook embedding, after obtaining the index (in base L) they still have to project the index back to a higher dimension, and I believe the weights of that projection layer play a similar role to a codebook embedding. I have also noticed that a popular explanation for this is that lookup-free quantization decouples the codebook embedding from the index-lookup procedure.
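For concreteness, here is a minimal NumPy sketch (my own illustration, not this repository's implementation) of the two pre-defined maps and the base-L index computation described above; straight-through gradient estimation and the even-L offset used in the FSQ paper are omitted.

```python
import numpy as np

def lfq_quantize(z):
    """LFQ: each dimension is mapped to {-1, +1} by the sign function."""
    return np.where(z >= 0, 1.0, -1.0)

def fsq_quantize(z, L):
    """FSQ (odd L here): bound with tanh, then round to one of L integer levels."""
    return np.round((L - 1) / 2 * np.tanh(z))

def lfq_index(q):
    """Read the signs as bits of a binary integer -- no nearest-neighbour search."""
    bits = (q > 0).astype(int)
    return int((bits * 2 ** np.arange(len(bits))).sum())

def fsq_index(q, L):
    """Read the shifted levels as digits of a base-L integer."""
    digits = (q + (L - 1) // 2).astype(int)
    return int((digits * L ** np.arange(len(digits))).sum())

z = np.array([0.3, -1.2, 0.7])
print(lfq_quantize(z), lfq_index(lfq_quantize(z)))          # [ 1. -1.  1.] -> 5
print(fsq_quantize(z, 5), fsq_index(fsq_quantize(z, 5), 5)) # [ 1. -2.  1.] -> 78
```

In both cases the index is computed directly from the quantized values, which is exactly the "lookup-free" part: no learned embedding table is consulted to find a nearest neighbor.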
Hi,
I was wondering about the difference between FSQ (finite scalar quantization) and LFQ (look-up free quantization) for vector quantization.
Consider a single vector from the encoder output, whatever the architecture, and suppose the quantizer only maps each dimension to -1 or 1. Then what's the difference? LFQ still calls the quantized vector a "codebook vector" within the "codebook". But in essence, that "codebook" is a manually constructed concept: it is no more than the set of combinations of the quantization possibilities at each dimension, since it is not learned at all.
Then what exactly does FSQ do differently? As the paper says: if we map each entry z_i to L values (..., followed by rounding to integers), we obtain a quantized ẑ, where ẑ is one of L^d unique possible vectors (d is the vector dimension, L is the number of quantization levels; L = 2 gives -1 and 1). I believe these L^d possible vectors are the codebook itself anyway.
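For instance, here is a quick sketch (my own illustration, not taken from either paper) of that implicit codebook: it is just the Cartesian product of the L allowed levels over d dimensions, L^d vectors in total, fixed in advance rather than learned.

```python
import itertools
import numpy as np

d, L = 3, 2                     # L = 2 gives the levels {-1, +1}, as in LFQ
levels = np.linspace(-1, 1, L)  # the L allowed values per dimension (fixed, not learned)
implicit_codebook = np.array(list(itertools.product(levels, repeat=d)))
print(implicit_codebook.shape)  # (L**d, d) -> (8, 3)
```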
I hope someone could help clarify and discuss this, many thanks!