
question about normalized MSE #8

Closed
lukaemon opened this issue Jun 25, 2024 · 2 comments

@lukaemon

In the paper, section 2.1:

> We report a normalized version of all MSE numbers, where we divide by a baseline reconstruction error of always predicting the mean activations.

In the README example:

```python
normalized_mse = (reconstructed_activations - input_tensor).pow(2).sum(dim=1) / (input_tensor).pow(2).sum(dim=1)
```

Which is the same as in loss.py:

```python
import torch


def normalized_mean_squared_error(
    reconstruction: torch.Tensor,
    original_input: torch.Tensor,
) -> torch.Tensor:
    """
    :param reconstruction: output of Autoencoder.decode (shape: [batch, n_inputs])
    :param original_input: input of Autoencoder.encode (shape: [batch, n_inputs])
    :return: normalized mean squared error (shape: [1])
    """
    return (
        ((reconstruction - original_input) ** 2).mean(dim=1) / (original_input**2).mean(dim=1)
    ).mean()
```
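
For reference, a quick usage sketch (shapes and values are made up for illustration):

```python
import torch

batch, n_inputs = 4, 8
original_input = torch.randn(batch, n_inputs)
reconstruction = original_input + 0.1 * torch.randn(batch, n_inputs)  # imperfect reconstruction

# per-sample squared error over per-sample squared norm, averaged across the batch
print(normalized_mean_squared_error(reconstruction, original_input))  # roughly 0.01 for this noise level
```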

The way I understand "normalized MSE, divided by a baseline reconstruction error of always predicting the mean activations" is:

```python
mean_activations = input_tensor.mean(dim=1, keepdim=True)  # keepdim so the subtraction broadcasts
baseline_mse = (input_tensor - mean_activations).pow(2).mean()
actual_mse = (reconstructed_activations - input_tensor).pow(2).mean()
normalized_mse = actual_mse / baseline_mse
```

What did I miss? Did I misunderstand the paper or the code? Thanks for your time.

@TomDLT (Contributor) commented Jul 9, 2024

Thanks for reporting this discrepancy.

The version used in the paper is:

```python
# computed only once before training, on a fixed set of activations
mean_activations = original_input.mean(dim=0)  # averaging over the batch dimension
baseline_mse = (original_input - mean_activations).pow(2).mean()

# computed on each batch during training and testing
actual_mse = (reconstruction - original_input).pow(2).mean()
normalized_mse = actual_mse / baseline_mse
```
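
To make the two denominators concrete, here is a self-contained toy comparison (random data, made-up shapes; it pools the means over the whole batch for simplicity, whereas loss.py averages per-sample ratios):

```python
import torch

torch.manual_seed(0)
batch, n_inputs = 1024, 64
original_input = torch.randn(batch, n_inputs) + 3.0  # activations with a nonzero mean
reconstruction = original_input + 0.1 * torch.randn(batch, n_inputs)

actual_mse = (reconstruction - original_input).pow(2).mean()

# README / loss.py denominator: mean squared activation (no mean subtraction)
readme_denominator = original_input.pow(2).mean()  # ~ variance + mean^2 = 1 + 9 = 10

# paper baseline: MSE of always predicting the per-feature mean activation
mean_activations = original_input.mean(dim=0)  # shape: [n_inputs]
baseline_mse = (original_input - mean_activations).pow(2).mean()  # ~ variance = 1

print(actual_mse / readme_denominator)  # roughly 0.001
print(actual_mse / baseline_mse)        # roughly 0.01
```

The two normalizations only agree when the activations have (near-)zero mean.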

@lukaemon (Author) commented Jul 9, 2024

Got it. It matches the code in train.py. Thanks for the clarification.

@lukaemon lukaemon closed this as completed Jul 9, 2024
lukaemon added a commit to lukaemon/mission-sae that referenced this issue Jul 9, 2024