Deploying MiniGPT-4 on a Server Without External Network Access

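Contents

Step 1: Download the repository
Step 2: Set up the environment
Step 3: Prepare the pretrained model weights
- 3.1 Download the model
- 3.2 Edit the configuration file
Step 4: Prepare the pretrained model checkpoint
- 4.1 Download the checkpoint
- 4.2 Edit the configuration file
Step 5: Prepare the related transformer models
- 5.1 Q-Former
- 5.2 Bert
- 5.3 ViT
Step 6: Launch the demo locally
Citation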

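Many existing MiniGPT-4 deployment tutorials assume that the server can reach external sites such as Hugging Face. When it cannot, you have to work out one by one how to load the several models that are normally fetched over the network, which is tedious. To save you time, this article walks through the full deployment of MiniGPT-4 (Vicuna version) step by step, including what to do when the external downloads fail.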

Step 1: Download the repository

Download the official repository:

git clone git@github.com:Vision-CAIR/MiniGPT-4.git

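Step 2: Set up the environment

Run the following commands to create and activate the conda environment (skip the git clone line if you already cloned the repository in Step 1):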

git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigptv
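
Before moving on, it can help to confirm that the activated environment actually contains the packages the demo needs. The snippet below is a minimal sketch using only the standard library. The package list (torch, transformers, gradio) is an assumption about what environment.yml installs, and missing_packages is a hypothetical helper name, so adjust both to your setup.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; adjust it to match environment.yml.
    required = ["torch", "transformers", "gradio"]
    missing = missing_packages(required)
    print("missing packages:", ", ".join(missing) if missing else "none")
```

Run it with the minigptv environment active. An empty result only means the packages are importable, not that their versions match environment.yml.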

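Step 3: Prepare the pretrained model weights

3.1 Download the model

The official documentation lists several model versions, and you can download whichever one you need. This article uses Vicuna V0 7B: https://huggingface.co/Vision-CAIR/vicuna-7b/tree/main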

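After downloading, upload the vicuna-7b folder to the server. At this point the folder does not yet contain prerained_minigpt4_7b.pth; that file is added in Step 4.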

3.2 Edit the configuration file

In MiniGPT-4/minigpt4/configs/models/minigpt4_vicuna0.yaml, change line 18 from

llama_model: "please set this value to the path of vicuna model"

to the following (replace the placeholder with the actual path to your vicuna-7b folder):

llama_model: "{path to your vicuna-7b folder}"
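
If you prefer to script this edit rather than open the file by hand, something like the following could work. It is a minimal sketch: set_yaml_string and patch_file are hypothetical helpers, not part of MiniGPT-4, and the regex only rewrites the first line that starts with the given key (any trailing comment on that line is dropped). The same helper could be pointed at the ckpt key in eval_configs/minigpt4_eval.yaml in Step 4.2.

```python
import re


def set_yaml_string(text, key, value):
    """Rewrite the first `key: ...` line in YAML text to `key: "value"`.

    Indentation is preserved. Raises KeyError if the key is not found.
    """
    pattern = re.compile(
        r"^(?P<indent>[ \t]*)" + re.escape(key) + r"[ \t]*:.*$", re.MULTILINE
    )

    def replacement(match):
        return f'{match.group("indent")}{key}: "{value}"'

    new_text, count = pattern.subn(replacement, text, count=1)
    if count == 0:
        raise KeyError(f"{key!r} not found")
    return new_text


def patch_file(path, key, value):
    """Apply set_yaml_string to a file in place."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(set_yaml_string(text, key, value))


# Example call (the target path is a placeholder for your own):
# patch_file("minigpt4/configs/models/minigpt4_vicuna0.yaml",
#            "llama_model", "/path/to/vicuna-7b")
```

Keeping a backup copy of the original YAML file before patching is a sensible precaution, since the helper overwrites the file.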

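Step 4: Prepare the pretrained model checkpoint

4.1 Download the checkpoint

Download the checkpoint that matches your model version, as listed in the official documentation. Here we use the one for Vicuna 7B: https://drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view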

Upload prerained_minigpt4_7b.pth to the server. Here it is placed in the vicuna-7b folder that holds the model downloaded in Step 3.
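
Large files can be truncated or corrupted during upload, so it may be worth comparing a checksum computed on your local machine with one computed on the server. The following is a minimal sketch using the standard library; sha256_of is a hypothetical helper, and no official checksum for this file is assumed here, so the comparison is only between your own two copies.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example call (the path is a placeholder):
# print(sha256_of("/path/to/vicuna-7b/prerained_minigpt4_7b.pth"))
```

Run the same function on both machines; if the two digests differ, upload the file again.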

4.2 Edit the configuration file

In eval_configs/minigpt4_eval.yaml, change line 8 from

ckpt: 'please set this value to the path of pretrained checkpoint'

to:

ckpt: '{path to prerained_minigpt4_7b.pth}'

Step 5: Prepare the related transformer models

If the server can reach Hugging Face, skip straight to Step 6. If it cannot, prepare the following three models in advance.
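
If you are not sure whether the server can reach the download hosts, a quick TCP probe can give a first indication. This is a minimal sketch: can_reach is a hypothetical helper, and a successful TCP connection on port 443 does not guarantee that an actual HTTPS download will work (for example behind a filtering proxy).

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example calls, to be run on the server:
# can_reach("huggingface.co")
# can_reach("storage.googleapis.com")
```

If either call returns False, follow the offline steps in 5.1 to 5.3.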

5.1 Q-Former

Download URL: https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained_flant5xxl.pth

Upload blip2_pretrained_flant5xxl.pth to the server.

In MiniGPT-4/minigpt4/models/minigpt4.py, change

q_former_model="https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained_flant5xxl.pth"

to:

q_former_model="{path to blip2_pretrained_flant5xxl.pth on the server}"

5.2 Bert

Download URL: https://huggingface.co/bert-base-uncased

Upload the downloaded bert-base-uncased folder to the server.

Likewise, in MiniGPT-4/minigpt4/models/minigpt4.py, change

encoder_config = BertConfig.from_pretrained("bert-base-uncased")

to:

encoder_config = BertConfig.from_pretrained("{path to bert-base-uncased on the server}")
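
When from_pretrained is given a local folder, it reads the config.json inside that folder, so an incomplete upload surfaces as a load error. The sketch below checks for that file and also sets the HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE environment variables, which ask the Hugging Face libraries not to attempt network access. has_model_config is a hypothetical helper, and where you place the environment settings (for example at the very top of demo.py) is an assumption for illustration.

```python
import os

# Ask the Hugging Face libraries to stay offline. These should be set
# before `transformers` is imported so that they take effect.
os.environ.setdefault("HF_HUB_OFFLINE", "1")
os.environ.setdefault("TRANSFORMERS_OFFLINE", "1")


def has_model_config(model_dir):
    """Return True if `model_dir` contains the config.json that from_pretrained reads."""
    return os.path.isfile(os.path.join(model_dir, "config.json"))


# Example call (the path is a placeholder):
# has_model_config("/path/to/bert-base-uncased")
```

With the offline variables set, a missing local file tends to fail immediately with a clear error instead of waiting on a network timeout.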

5.3 ViT

Download URL: https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/eva_vit_g.pth

Upload eva_vit_g.pth to the server.

In MiniGPT-4/minigpt4/models/eva_vit.py, change

url = "https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/eva_vit_g.pth"
cached_file = download_cached_file(
        url, check_hash=False, progress=True
)
state_dict = torch.load(cached_file, map_location="cpu")

to:

local_path = "{path to eva_vit_g.pth on the server}"
state_dict = torch.load(local_path, map_location="cpu")
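
With all five local paths now configured (Vicuna folder, MiniGPT-4 checkpoint, Q-Former, Bert, ViT), a quick pre-flight check can catch a mistyped path before the demo spends time loading models. This is a minimal sketch: find_missing is a hypothetical helper, and every path in EXAMPLE_PATHS is a placeholder to replace with the values you entered in Steps 3 to 5.

```python
import os


def find_missing(paths):
    """Given a {label: path} mapping, return the labels whose path does not exist."""
    return [label for label, path in paths.items() if not os.path.exists(path)]


# Placeholder locations; replace each with the real path on your server.
EXAMPLE_PATHS = {
    "vicuna-7b folder": "/path/to/vicuna-7b",
    "MiniGPT-4 checkpoint": "/path/to/vicuna-7b/prerained_minigpt4_7b.pth",
    "Q-Former weights": "/path/to/blip2_pretrained_flant5xxl.pth",
    "bert-base-uncased folder": "/path/to/bert-base-uncased",
    "ViT weights": "/path/to/eva_vit_g.pth",
}

if __name__ == "__main__":
    missing = find_missing(EXAMPLE_PATHS)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found.")
```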

Step 6: Launch the demo locally

Run the following command:

python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0

Open the URL that the demo prints in the terminal.

This opens the chat page, where you can upload an image and type a prompt to chat with MiniGPT-4.

That completes the MiniGPT-4 deployment. For further configuration options, see the official documentation: https://github.com/Vision-CAIR/MiniGPT-4?tab=readme-ov-file

Citation

If you use MiniGPT-4/MiniGPT-v2 in your research or applications, please cite it with the following BibTeX:

@article{chen2023minigptv2,
      title={MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning}, 
      author={Chen, Jun and Zhu, Deyao and Shen, Xiaoqian and Li, Xiang and Liu, Zechu and Zhang, Pengchuan and Krishnamoorthi, Raghuraman and Chandra, Vikas and Xiong, Yunyang and Elhoseiny, Mohamed},
      year={2023},
      journal={arXiv preprint arXiv:2310.09478},
}

@article{zhu2023minigpt,
  title={MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models},
  author={Zhu, Deyao and Chen, Jun and Shen, Xiaoqian and Li, Xiang and Elhoseiny, Mohamed},
  journal={arXiv preprint arXiv:2304.10592},
  year={2023}
}