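When training an mmseg project across multiple machines with multiple GPUs each, I ran the following commands.
First machine: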
NNODES=2 NODE_RANK=0 PORT=8888 MASTER_ADDR=192.168.XX.XX sh tools/dist_train.sh ./configs/temp.py 4
Second machine:
NNODES=2 NODE_RANK=1 PORT=8888 MASTER_ADDR=192.168.XX.XX sh tools/dist_train.sh ./configs/temp.py 4
The error message was:
RuntimeError: The server socket has failed to listen on any local network address. The server socket has failed to bind to [::]:8888 (errno: 98 - Address already in use). The server socket has failed to bind to ?UNKNOWN? (errno: 98 - Address already in use).
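The "Address already in use" part of this message is the operating system's EADDRINUSE error (errno 98 on Linux), which is raised whenever a process tries to bind a port that another socket already holds. The sketch below reproduces it with plain Python sockets and no PyTorch involved. The helper name `try_bind` is made up for this illustration and is not part of mmseg or PyTorch.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to bind and listen on (host, port).

    Returns (socket, None) on success, or (None, errno_code) on failure.
    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


# Port 0 asks the OS for any free port, so the first bind succeeds.
first, _ = try_bind(0)
port = first.getsockname()[1]

# Binding the same port again fails the same way the training launcher did.
second, err = try_bind(port)
print("second bind failed with EADDRINUSE:", err == errno.EADDRINUSE)

first.close()
```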
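The error message shows that port 8888 is already in use on the machine that reported it. The fix is simply to change PORT to a port that is free, for example 29050 or 29051. Both machines must use the same new PORT value, because every node connects to MASTER_ADDR on that port. Source: https://www.toymoban.com/news/detail-629069.html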
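Before relaunching, it can help to confirm that the replacement port is actually free on the machine that runs NODE_RANK=0. The following is a minimal sketch under some assumptions: `port_is_free` and `first_free_port` are hypothetical helper names, the check only covers IPv4 TCP, and a port that is free at check time could still be taken by another process before training starts.

```python
import socket


def port_is_free(port, host=""):
    """Return True if a TCP socket can currently bind (host, port).

    An empty host means "all local IPv4 interfaces".
    Hypothetical helper, not part of mmseg or PyTorch.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates):
    """Return the first port in candidates that can be bound, or None."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None


# The candidate range is only an example, chosen to match the ports above.
print("suggested PORT:", first_free_port(range(29050, 29060)))
```

Run this on the first machine and pass the printed value as PORT in the launch command on both machines.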
With that, the problem is solved!