Lastly, we will train the network. Since we are in WSL, we don't have a display driver on the Linux side, and Nvidia itself does not recommend installing one there. So, we need to erase/comment the lines in the network code that depend on it.
Erase/comment any reference to the class GPUAccounting (declared in "runstats.py") made in "utils/deeplearningutilities/torch/trainer.py", and the "IterationTimer()" for the display in the same file.
In a nutshell, erase/comment:
- self._display_iteration_timer = IterationTimer()
- self._gpu_accounting = GPUAccounting()
- The "if" right below "# gpu stats"
- The "if" right below "# print display strings"
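If you would rather not delete those lines outright, one option is to wrap them in a check that detects WSL at runtime. The snippet below is only a sketch of such a check: the helper name `is_wsl` is hypothetical (it is not part of the network code), and it assumes the WSL kernel release string contains the word "microsoft", which is the case for the WSL kernels I am aware of.

```python
import platform


def is_wsl(release=None):
    """Return True if the kernel release string looks like a WSL kernel.

    `release` can be passed explicitly for testing; by default the
    release string of the running kernel is used.
    """
    if release is None:
        release = platform.uname().release
    # Assumption: WSL kernels carry "microsoft" (any casing) in the release string.
    return "microsoft" in release.lower()


if __name__ == "__main__":
    print("Running under WSL:", is_wsl())
```

With something like this, the GPU accounting and display timer lines could be placed under `if not is_wsl():` instead of being removed.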
It is recommended to install PRCTL for Python (python-prctl) so you can load the files in parallel and get rid of a warning. First, install its dependencies:
- sudo apt-get install python3.7-dev libcap-dev python3-setuptools
Then, follow the instructions of the official site:
- git clone http://github.com/seveas/python-prctl
- cd python-prctl
- python setup.py build
- sudo python setup.py install
Look for the "prctl" files in the active miniconda environment, and test the import with this command:
- python3.7 -c "import sys; sys.path.append('/usr/local/lib/python3.7/dist-packages/python_prctl-1.8.1-py3.7-linux-x86_64.egg/'); import prctl; print(prctl); print(sys.version_info)"
Now, add this line to train_network_torch.py:
- sys.path.append('/usr/local/lib/python3.7/dist-packages/python_prctl-1.8.1-py3.7-linux-x86_64.egg/')
Note: Remember to replace the egg path shown above with the one you found.
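Since the egg path changes with the Python and python-prctl versions, it may be easier to search for it than to hard-code it. The following is a minimal sketch under that assumption: the function names `find_prctl_eggs` and `add_prctl_to_path` are hypothetical helpers, and the search directory in the example is simply the one used above, so adjust it to your system.

```python
import glob
import os
import sys


def find_prctl_eggs(search_roots):
    """Return the sorted python_prctl egg paths found directly under the given directories."""
    matches = []
    for root in search_roots:
        matches.extend(glob.glob(os.path.join(root, "python_prctl-*.egg")))
    return sorted(matches)


def add_prctl_to_path(search_roots):
    """Append the last matching egg to sys.path; return its path, or None if nothing was found."""
    eggs = find_prctl_eggs(search_roots)
    if not eggs:
        return None
    egg = eggs[-1]
    if egg not in sys.path:
        sys.path.append(egg)
    return egg


if __name__ == "__main__":
    # Assumed location, taken from the command above; change it if yours differs.
    roots = ["/usr/local/lib/python3.7/dist-packages"]
    print("prctl egg added to sys.path:", add_prctl_to_path(roots))
```

If the function returns None, the egg is not in any of the listed directories, and you will need to look for it elsewhere before running the training script.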
Start the training:
- python3 train_network_torch.py default.yaml
Evaluate:
- python3 evaluate_network.py --trainscript train_network_torch.py --cfg default.yaml --weights pretrained_model_weights.pt
The End.