Using a Proxy in the Linux Terminal
2021/10/22 7:13:13
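Preface
Recently I ran a GitHub project that uses Hugging Face's Datasets library. The library downloads raw dataset files from the network on its own, and its download sources are the links under which each dataset was originally published. For the Spider dataset, for example, the source is the Google Drive link published by the original authors. The servers at my school, however, cannot reach the external internet, so a proxy is needed for the program to access Google Drive.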
Problem
The following short script and its error output illustrate the problem.
from datasets import load_dataset
dataset = load_dataset('spider')
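Running this code directly makes the program try to download the Spider dataset from Google Drive. Because of the network restriction, it fails with the following error.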
(slurm) jxqi@main-2:~/Text-to-SQL/tmp$ python test_google.py
Using the latest cached version of the module from /home/jxqi/.cache/huggingface/modules/datasets_modules/datasets/spider/edbe505fd96c6218feb563fa547869bbc170052a1484d675f9d96d090a9473cf (last modified on Wed Oct 20 15:33:00 2021) since it couldn't be found locally at spider/spider.py or remotely (ConnectionError).
Downloading and preparing dataset spider/spider (download: 95.12 MiB, generated: 5.17 MiB, post-processed: Unknown size, total: 100.29 MiB) to /home/jxqi/.cache/huggingface/datasets/spider/spider/1.0.0/edbe505fd96c6218feb563fa547869bbc170052a1484d675f9d96d090a9473cf...
Traceback (most recent call last):
  File "test_google.py", line 3, in <module>
    dataset = load_dataset('spider')
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/load.py", line 742, in load_dataset
    builder_instance.download_and_prepare(
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/builder.py", line 574, in download_and_prepare
    self._download_and_prepare(
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/builder.py", line 630, in _download_and_prepare
    split_generators = self._split_generators(dl_manager, **split_generators_kwargs)
  File "/home/jxqi/.cache/huggingface/modules/datasets_modules/datasets/spider/edbe505fd96c6218feb563fa547869bbc170052a1484d675f9d96d090a9473cf/spider.py", line 78, in _split_generators
    downloaded_filepath = dl_manager.download_and_extract(_URL)
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/utils/download_manager.py", line 287, in download_and_extract
    return self.extract(self.download(url_or_urls))
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/utils/download_manager.py", line 195, in download
    downloaded_path_or_paths = map_nested(
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/utils/py_utils.py", line 195, in map_nested
    return function(data_struct)
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/utils/download_manager.py", line 218, in _download
    return cached_path(url_or_filename, download_config=download_config)
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 281, in cached_path
    output_path = get_from_cache(
  File "/home/jxqi/anaconda3/envs/slurm/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 623, in get_from_cache
    raise ConnectionError("Couldn't reach {}".format(url))
ConnectionError: Couldn't reach https://drive.google.com/uc?export=download&id=1_AckYkinAnhqmRQtGsQgUKAnTHxxX5J0
As the last line shows, the error occurs because the server cannot reach the Google Drive link.
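To separate a network restriction from a problem in the library itself, a quick reachability check can help. The following is a minimal sketch using only the Python standard library; the helper name can_reach and the 5-second default timeout are assumptions made for illustration and are not part of the Datasets library.

```python
import urllib.error
import urllib.request


def can_reach(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP(S) request to `url` receives any response."""
    try:
        # urlopen picks up http_proxy / https_proxy from the environment by default.
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is still reachable.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, or a blocked route.
        return False
```

Calling can_reach("https://drive.google.com") on the server before and after configuring a proxy gives a rough indication of whether the restriction is what blocks the download.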
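Solution
After some searching I found reports of similar problems. Following "Linux 让终端走代理的几种方法" [1], you can make the current user's programs go through a proxy by editing the shell configuration file .bashrc.
Specifically, open the .bashrc file and append the following two lines to the end of the file: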
export http_proxy="http://proxy_host:port"
export https_proxy="http://proxy_host:port"
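Replace proxy_host with the host name of your proxy server and port with the proxy port. If the proxy requires authentication, also include the user name and password: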
export http_proxy="http://username:password@proxy_host:port"
export https_proxy="http://username:password@proxy_host:port"
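If the user name or password contains characters such as @ or :, they generally need to be percent-encoded before being placed in the URL, otherwise the URL is parsed incorrectly. The following is a minimal sketch of how such a URL could be assembled; the function name build_proxy_url and the sample values are hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def build_proxy_url(host: str, port: int,
                    username: Optional[str] = None,
                    password: Optional[str] = None) -> str:
    """Build an http:// proxy URL, percent-encoding any credentials."""
    if username is None:
        return f"http://{host}:{port}"
    # safe="" makes quote() also encode "/", "@" and ":" inside credentials.
    user = quote(username, safe="")
    if password is None:
        return f"http://{user}@{host}:{port}"
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


# Hypothetical values for illustration only.
print(build_proxy_url("proxy_host", 8080, "user", "p@ss:word"))
```

The printed string can then be pasted as the value of http_proxy and https_proxy in .bashrc.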
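After that, reload the shell configuration so that the change takes effect in the current session (newly opened shells read .bashrc automatically). Use the following command: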
source ~/.bashrc
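Once the configuration has been reloaded, programs started from that shell can reach the external network through the proxy.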
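To confirm that a Python process started from the shell actually sees the proxy settings, the environment can be inspected from inside Python. The following is a small sketch using the standard library function urllib.request.getproxies(); the helper name active_proxies is an assumption for illustration.

```python
import urllib.request


def active_proxies() -> dict:
    """Return the HTTP and HTTPS proxies visible to this process (None if unset)."""
    # On Linux, getproxies() reads variables such as http_proxy and https_proxy.
    proxies = urllib.request.getproxies()
    return {scheme: proxies.get(scheme) for scheme in ("http", "https")}


for scheme, proxy_url in active_proxies().items():
    # Note: the printed URL includes the password if one was configured.
    print(f"{scheme}: {proxy_url}")
```

If both entries show the expected proxy URL, libraries that honor these environment variables, such as requests, should route their traffic through the proxy.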
References
[1] Linux 让终端走代理的几种方法 (Several ways to route the Linux terminal through a proxy), https://zhuanlan.zhihu.com/p/46973701