cringe fix because I guess I moved which logit gets trained for len duration (I should probably rethink this)
parent a1eb96e6c1
commit 476d87d4aa
@@ -145,6 +145,9 @@ These settings should be avoided:
To be evaluated thoroughly.
* The smaller model seems to have hit its capacity limit, while the larger model is slowly improving (although objective metrics are not noted).
* The model seems pretty quick, even for the large model.
* The smaller model seems small enough for CPU-only inferencing
* Despite its poor zero-shot performance, it could be perfectly fine for finetuning.

At a glance, compared to the prior model setup, this implementation allows for the model to better represent speech as it's able to see the entire signal and account for it in its latent space, rather than only specific levels of it.

@@ -765,7 +765,7 @@ class Base_V2(nn.Module):
 # needed, cringe
 if task_type == "len":
-    batch[-1] = torch.cat( [ batch[-1], self.sep[None] ] )
+    batch[-1] = torch.cat( [ batch[-1], self.sep[None], self.sep[None] ] )

 x_list.append( _join( batch, self.sep ) )

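A rough sketch of what this hunk changes in the joined input, for anyone skimming the diff. It assumes `_join` simply places `self.sep` between consecutive segments (its body is not shown here), and it uses plain Python lists as a stand-in for tensors. The names `SEP`, `join`, `build_len_input`, and the token values are hypothetical and only for illustration.

```python
# Stand-in for the separator token; in the diff this is the tensor `self.sep`.
SEP = "<sep>"

def join(segments, sep):
    """Hypothetical stand-in for `_join`: concatenate segments with `sep` between them."""
    out = list(segments[0])
    for seg in segments[1:]:
        out.append(sep)
        out.extend(seg)
    return out

def build_len_input(segments, n_trailing_sep):
    """Append `n_trailing_sep` separators to the last segment, then join everything."""
    batch = [list(seg) for seg in segments]
    batch[-1] = batch[-1] + [SEP] * n_trailing_sep
    return join(batch, SEP)

# Made-up segments; the real batch holds embedded inputs, not strings.
segments = [["t1", "t2"], ["p1"]]

before = build_len_input(segments, 1)  # one trailing separator (removed line)
after = build_len_input(segments, 2)   # two trailing separators (added line)

print(before)
print(after)
```

Under these assumptions the only difference is that the `len` input ends one position later, which is presumably what moves the logit that gets trained, as the commit message suggests.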