cringe fix because I guess I moved which logit gets trained for len duration (I should probably rethink this)

mrq 2025-03-25 21:33:01 -05:00
parent a1eb96e6c1
commit 476d87d4aa
2 changed files with 4 additions and 1 deletion

@@ -145,6 +145,9 @@ These settings should be avoided:
To be evaluated thoroughly.
* The smaller model seems to have hit its capacity limit, while the larger model is slowly improving (although objective metrics are not noted).
* The model seems pretty quick, even for the large model.
* The smaller model seems small enough for CPU-only inference.
* Despite its poor zero-shot performance, it could be perfectly fine for finetuning.
At a glance, compared to the prior model setup, this implementation allows for the model to better represent speech as it's able to see the entire signal and account for it in its latent space, rather than only specific levels of it.

@@ -765,7 +765,7 @@ class Base_V2(nn.Module):
 # needed, cringe
 if task_type == "len":
-    batch[-1] = torch.cat( [ batch[-1], self.sep[None] ] )
+    batch[-1] = torch.cat( [ batch[-1], self.sep[None], self.sep[None] ] )
 x_list.append( _join( batch, self.sep ) )
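
To make the effect of the one-line change easier to see, here is a rough sketch of how the joined sequence changes for the `len` task. It is an illustration only: plain Python lists stand in for the embedding tensors, `SEP` stands in for `self.sep`, and `join_segments`, `build_sequence`, and the segment contents are hypothetical names, since the body of `_join` is not shown in this diff. It assumes `_join` places a single separator between neighbouring segments.

```python
# Illustration only: lists stand in for tensors, SEP stands in for self.sep.
SEP = "<sep>"

def join_segments(segments, sep):
    """Hypothetical stand-in for _join: one separator between neighbours."""
    out = list(segments[0])
    for seg in segments[1:]:
        out.append(sep)
        out.extend(seg)
    return out

def build_sequence(batch, task_type, n_trailing):
    """Append n_trailing separators to the last segment for "len", then join."""
    batch = [list(seg) for seg in batch]  # copy so the caller's lists are untouched
    if task_type == "len":
        batch[-1] = batch[-1] + [SEP] * n_trailing
    return join_segments(batch, SEP)

# Made-up segments: two text tokens, one prompt token, one len marker.
batch = [["t0", "t1"], ["p0"], ["len"]]

before = build_sequence(batch, "len", n_trailing=1)  # the removed line
after = build_sequence(batch, "len", n_trailing=2)   # the added line

print(before)
print(after)
```

Under these assumptions the only difference is one extra trailing separator position, which would be consistent with the commit message's remark about which logit gets trained for the duration; other task types are left unchanged.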