Error when training : TypeError: new(): invalid data type 'str' #212

Open
opened 2023-04-19 19:05:58 +00:00 by Snowad14 · 4 comments
23-04-19 20:59:30.492 - INFO: Training Metrics: {"loss_text_ce": 3.1088507175445557, "loss_mel_ce": 2.75213360786438, "loss_gpt_total": 5.8609843254089355, "lr": 3.90625e-08, "it": 66, "step": 66, "steps": 2254, "epoch": 0, "iteration_rate": 13.106424331665039}
Traceback (most recent call last):
  File "C:\Logiciel\ai-voice-cloning\src\train.py", line 64, in <module>
    train(config_path, args.launcher)
  File "C:\Logiciel\ai-voice-cloning\src\train.py", line 31, in train
    trainer.do_training()
  File "C:\Logiciel\ai-voice-cloning\src\dlas\train.py", line 406, in do_training
    for train_data in tq_ldr:
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\dataloader.py", line 634, in __next__
    data = self._next_data()
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\dataloader.py", line 1372, in _process_data
    data.reraise()
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\_utils.py", line 644, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 127, in collate
    return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 127, in <dictcomp>
    return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 119, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 183, in collate_int_fn
    return torch.tensor(batch)
TypeError: new(): invalid data type 'str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 264, in default_collate
    return collate(batch, collate_fn_map=default_collate_fn_map)
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 130, in collate
    return {key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem}
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 130, in <dictcomp>
    return {key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem}
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 119, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
  File "C:\Logiciel\ai-voice-cloning-v1\.venv\lib\site-packages\torch\utils\data\_utils\collate.py", line 183, in collate_int_fn
    return torch.tensor(batch)
TypeError: new(): invalid data type 'str'

batch_size: 200 (I have tested many other values)

I tried to rule out a dataset problem by deleting parts of the dataset, but the error persisted.
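The traceback ends in `collate_int_fn`, which the default collate picks when the first value it sees for a key is an `int`; `torch.tensor(batch)` then fails because another sample in the same batch holds a `str` under that key. That suggests at least one sample returns a different type for some field. A minimal sketch for locating such samples, assuming each sample is a plain dict (the key names below are hypothetical, not taken from the actual dataset class):

```python
from collections import defaultdict


def find_mixed_type_keys(samples):
    """Return {key: {type_name: [sample indices]}} for keys whose value type varies."""
    seen = defaultdict(lambda: defaultdict(list))
    for index, sample in enumerate(samples):
        for key, value in sample.items():
            seen[key][type(value).__name__].append(index)
    # Keep only the keys that were observed with more than one type.
    return {key: dict(types) for key, types in seen.items() if len(types) > 1}


# Hypothetical samples: the second one stores a length as a string.
samples = [
    {"text_lengths": 12, "filename": "a.wav"},
    {"text_lengths": "13", "filename": "b.wav"},
    {"text_lengths": 9, "filename": "c.wav"},
]

report = find_mixed_type_keys(samples)
print(report)
```

Running this over the real dataset (for example over `dataset[i]` for every index) should point at the offending entries, which can then be traced back to lines in train.txt.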

Author

"fixed" with a try/catch and the use of iter() instead of the "for" :

    # Replaces the `for train_data in tq_ldr:` loop in do_training (src/dlas/train.py).
    _t = time()
    step = 0
    metrics = []
    data_iter = iter(tq_ldr)
    while True:
        try:
            # Fetch the next batch by hand so a collate error can be caught.
            train_data = next(data_iter)
            step = step + 1
            metric = self.do_step(train_data)
            metrics.append(metric)
            if self.rank <= 0:
                logs = process_metrics(metrics)
                logs['lr'] = self.model.get_current_learning_rate()[0]
                if self.use_tqdm:
                    tq_ldr.set_postfix(logs, refresh=True)
                logs['it'] = self.current_step
                logs['step'] = step
                logs['steps'] = len(self.train_loader)
                logs['epoch'] = self.epoch
                logs['iteration_rate'] = self.iteration_rate
                self.logger.info(f'Training Metrics: {json.dumps(logs)}')
        except StopIteration:
            # Loader exhausted: end of the epoch.
            break
        except Exception as e:
            # Skip the failing batch. Note that this also hides errors raised by do_step.
            print(f"Error: {e}")
            continue
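The workaround above catches every exception inside the loop body, including ones raised by the training step itself. A narrower variant, sketched here under assumptions, wraps only the batch fetch and gives up after a configurable number of failures. `iter_skipping_errors`, `max_errors`, and `FlakyLoader` are hypothetical names introduced for illustration; whether a real loader iterator can keep going after it raises depends on its implementation.

```python
import logging

logger = logging.getLogger(__name__)


def iter_skipping_errors(iterable, max_errors=10):
    """Yield items, skipping fetches that raise; re-raise once max_errors is exceeded."""
    iterator = iter(iterable)
    errors = 0
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return
        except Exception as exc:
            errors += 1
            logger.warning("Skipping batch after loader error %d: %s", errors, exc)
            if errors > max_errors:
                raise
            continue
        # The yield sits outside the try block, so errors raised by the
        # consumer (e.g. the training step) are never swallowed here.
        yield item


class FlakyLoader:
    """Hypothetical stand-in for a loader that fails on some batches."""

    def __init__(self, batches):
        self._batches = list(batches)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._batches):
            raise StopIteration
        batch = self._batches[self._pos]
        self._pos += 1
        if batch is None:
            raise TypeError("new(): invalid data type 'str'")
        return batch


good = list(iter_skipping_errors(FlakyLoader([1, None, 2, 3])))
print(good)
```

With a wrapper like this, the original `for train_data in tq_ldr:` loop could stay as it is and iterate over the wrapped loader instead. Skipped batches still mean the underlying data problem is unresolved, so the warnings are worth reading.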
"fixed" with a try/catch and the use of iter() instead of the "for" : _t = time() step = 0 metrics = [] data_iter = iter(tq_ldr) while True: try: train_data = next(data_iter) step = step + 1 metric = self.do_step(train_data) metrics.append(metric) if self.rank <= 0: logs = process_metrics(metrics) logs['lr'] = self.model.get_current_learning_rate()[0] if self.use_tqdm: tq_ldr.set_postfix(logs, refresh=True) logs['it'] = self.current_step logs['step'] = step logs['steps'] = len(self.train_loader) logs['epoch'] = self.epoch logs['iteration_rate'] = self.iteration_rate self.logger.info(f'Training Metrics: {json.dumps(logs)}') except StopIteration: break except Exception as e: print(f"Erreur : {e}") continue

Can you post your train.txt?
Author
https://anonfiles.com/kdSavbmdza/train_txt (50 MB) https://anonfiles.com/o5S9v8m8zb/valid_txt (1 MB)

"fixed" with a try/catch and the use of iter() instead of the "for" :

    _t = time()
    step = 0
    metrics = []
    data_iter = iter(tq_ldr)
    while True:
        try:
            train_data = next(data_iter)
            step = step + 1
            metric = self.do_step(train_data)
            metrics.append(metric)
            if self.rank <= 0:
                logs = process_metrics(metrics)
                logs['lr'] = self.model.get_current_learning_rate()[0]
                if self.use_tqdm:
                    tq_ldr.set_postfix(logs, refresh=True)
                logs['it'] = self.current_step
                logs['step'] = step
                logs['steps'] = len(self.train_loader)
                logs['epoch'] = self.epoch
                logs['iteration_rate'] = self.iteration_rate
                self.logger.info(f'Training Metrics: {json.dumps(logs)}')
        except StopIteration:
            break
        except Exception as e:
            print(f"Erreur : {e}")
            continue

Hello, where do you put this fix? I have the same error. :/
Reference: mrq/ai-voice-cloning#212