Using Git LFS for Large Files

Installing Git LFS

Git LFS must be installed on the local machine, and the Git server must support it as well, before large-file hosting works end to end.

https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage

Installing Git LFS on Ubuntu

https://github.com/git-lfs/git-lfs/blob/main/INSTALLING.md

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

sudo apt-get install git-lfs

Installing Git LFS on macOS

Homebrew

brew install git-lfs

Network conditions in mainland China are unreliable, however, and without a proxy brew often fails to download:

✗ brew install git-lfs
Warning: You are using macOS 13.4.
We do not provide support for this pre-release version.
You will encounter build failures with some formulae.
Please create pull requests instead of asking for help on Homebrew's GitHub,
Discourse, Twitter or IRC. You are responsible for resolving any issues you
experience while you are running this pre-release version.

==> Cloning https://go.googlesource.com/tools.git
Updating /Users/zhlin/Library/Caches/Homebrew/go--gotools--git
fatal: unable to access 'https://go.googlesource.com/tools.git/': Failed to connect to go.googlesource.com port 443 after 75020 ms: Couldn't connect to server
Error: Failed to download resource "go--gotools"
Failure while executing; `git fetch origin` exited with 128. Here's the output:
fatal: unable to access 'https://go.googlesource.com/tools.git/': Failed to connect to go.googlesource.com port 443 after 75020 ms: Couldn't connect to server

Alternatively, download the zip archive from the website and run install.sh.

https://git-lfs.com/

Git LFS Basics

Official Git LFS usage instructions

Git Large File Storage

https://git-lfs.com/

Install git-lfs first, then follow the steps below.

  • Configure LFS globally; the resulting settings can be seen in the global ~/.gitconfig

You only need to run this once per user account.

git lfs install
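  • In a local Git repository, choose the files LFS should track; an entire folder can be tracked too
git lfs track "*.psd"
  • Add the generated .gitattributes to Git
git add .gitattributes

After that, use Git as usual. Run git lfs ls-files to check which files are managed by Git LFS; they can also be viewed directly on the repository's web page, which is convenient.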

Gitee's introduction to LFS

It is clearly written, and reading it covers the basics.

Git LFS Operation Guide

https://gitee.com/help/articles/4235#article-header0

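Git LFS builds on Git's .gitattributes configuration file. A smudge filter locates the large file's content from the pointer file, and a clean filter creates a new version of the pointer file whenever the large file changes. A pre-push hook uploads the large files to the Git LFS server: on git push, if the commits include files tracked by LFS, the hook detects them and uploads them to the Git LFS server.

For a file managed by LFS, what the local repository actually stores is a pointer file, in a format like this: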

$ git show HEAD:2.svg

version https://git-lfs.github.com/spec/v1
oid sha256:158213f90f8b27012034c6f58db63e1861b12aa122d98910de311bf1cb1e50a0
size 14651
(END)
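
Since a pointer file is just a few "key value" lines, its fields can be read with ordinary text tools. The following is a minimal sketch, assuming a POSIX shell with awk; lfs_pointer_field is a hypothetical helper name, not a git-lfs command, and the demo file simply reuses the pointer contents shown above.

```shell
# Print one field ("oid" or "size") from a Git LFS pointer file.
# Hypothetical helper for illustration; not part of git-lfs.
lfs_pointer_field() {
  file=$1
  key=$2
  # Each pointer line is "<key> <value>"; print the value of the first match.
  awk -v key="$key" '$1 == key { print $2; exit }' "$file"
}

# Demo: write the pointer contents shown above to a temporary file.
pointer=$(mktemp)
cat > "$pointer" <<'EOF'
version https://git-lfs.github.com/spec/v1
oid sha256:158213f90f8b27012034c6f58db63e1861b12aa122d98910de311bf1cb1e50a0
size 14651
EOF

lfs_pointer_field "$pointer" oid
lfs_pointer_field "$pointer" size
```

The size field is the byte size of the real file, which is why the pointer itself stays tiny no matter how large the tracked file is.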

Installing Git LFS on a local Linux system

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

sudo apt-get install git-lfs

Step 1: set up the LFS configuration from within the Git repository:

$ git lfs install

Tips:

This command modifies the Git configuration file .gitconfig. The change is global, and it adds the following settings to the file:

[filter "lfs"]
clean = git-lfs clean -- %f
smudge = git-lfs smudge -- %f
process = git-lfs filter-process
required = true

Step 2: choose the files to track with LFS:

$ git lfs track "*.svg"
# or a specific file
$ git lfs track "2.png"
$ git lfs track "example.lfs"

Tips:

This command updates the repository's .gitattributes file (creating it if it does not already exist):

Its contents look like this:

$ cat .gitattributes
*.svg filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text

A curious reader may ask: how do I find out which files are actually being tracked?

Easy, one command does it.

git lfs ls-files shows the files currently tracked by LFS at any time:

$ git lfs ls-files
9a3c7dae41 * 1.png
d61cf5835a * 2.png
158213f90f * 3.svg

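A file is tracked, and shows up in this list, only after it has been added with git add.

Someone may also wonder: what if I no longer want LFS to track a file?

Again, one command does it: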

$ git lfs untrack "1.png"

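With those questions answered, back to step 2: once the files for LFS are chosen, it is best to save the configuration right away.

Step 3: save and commit the configuration: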

$ git add .gitattributes
$ git commit -m "add .gitattributes"

Configuration summary:

Once Git LFS is installed, three steps set up LFS in a repository:

#step 1
$ git lfs install

#step 2
$ git lfs track files

# step 3
$ git add .gitattributes

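In practice, step 1 is a global setting, so running it once is enough. Other repositories that use LFS later do not need it again, unless the LFS configuration was removed in the meantime.

Tips: run git lfs uninstall to remove the global LFS configuration.

Tracking an entire folder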

Git LFS track folder recursively

https://stackoverflow.com/questions/35769330/git-lfs-track-folder-recursively

Use git lfs track "myfolder/**", with quotes to avoid the shell already expanding the pattern. All that the track command does is to write to .gitattributes, which in turn uses (almost) the same pattern matching rules as .gitignore, see the PATTERN FORMAT description.
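
To make "all that the track command does is to write to .gitattributes" concrete, here is a rough sketch of the lines that end up in the file. It assumes a POSIX shell; lfs_attr_lines is a hypothetical helper, not a git-lfs command, and the real git lfs track does more (for example, it avoids duplicate entries), so this only illustrates the line format seen in the .gitattributes listings in this document.

```shell
# Print the .gitattributes line for each pattern, in the format that
# appears in this document's .gitattributes listings.
# Hypothetical helper for illustration only; use git lfs track in practice.
lfs_attr_lines() {
  for pattern in "$@"; do
    printf '%s filter=lfs diff=lfs merge=lfs -text\n' "$pattern"
  done
}

# Quote the patterns so the shell passes them through unexpanded.
lfs_attr_lines "myfolder/**" "*.psd"
```

The quoting matters for the same reason as with git lfs track: unquoted, the shell could expand the pattern into matching file names before the command ever sees it.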

Tracking multiple file extensions at once

Images and office documents are best all handed over to LFS.

https://stackoverflow.com/questions/46334725/git-lfs-how-to-track-multiple-file-types-with-one-command

git lfs track "*.jpg" "*.png" "*.jpeg" "*.tiff" "*.pdf" "*.docx" "*.doc" "*.xlsx" "*.xls" "*.ppt" "*.pptx" "*.zip" "*.tar" "*.gz" "*.pps"
 cat .gitattributes 
assets/** filter=lfs diff=lfs merge=lfs -text
docs/** filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.jpeg filter=lfs diff=lfs merge=lfs -text
*.pdf filter=lfs diff=lfs merge=lfs -text
*.docx filter=lfs diff=lfs merge=lfs -text
*.doc filter=lfs diff=lfs merge=lfs -text
*.xlsx filter=lfs diff=lfs merge=lfs -text
*.xls filter=lfs diff=lfs merge=lfs -text
*.ppt filter=lfs diff=lfs merge=lfs -text

Enabling LFS on the Gitea server

Once LFS is enabled in the Gitea server configuration, clients can use Git LFS. LFS files are stored in a separate directory.

Git Large File Storage setup

https://docs.gitea.io/en-us/administration/git-lfs-setup/#git-large-file-storage-setup

[server]
; Enables git-lfs support. true or false, default is false.
LFS_START_SERVER = true

[lfs]
; Where your lfs files reside, default is data/lfs.
PATH = /home/gitea/data/lfs

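Testing Gitea LFS with a local project

2023-05-08

The local assets folder holds all the images used by the Markdown files, and the docs folder holds the PDF, Excel, and similar files. None of these non-text files need version control through Git; they only need to be stored.

The local test sets up Git LFS from scratch, starting from a project that is not yet a Git repository.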

(ocx) ➜ geesnoopy git lfs track "assets/**"
Not in a Git repository.
(ocx) ➜ geesnoopy git init
Initialized empty Git repository in /Users/geehanlin/projects/geesnoopy/.git/
(ocx) ➜ geesnoopy git:(master) git lfs track "assets/**"
Tracking "assets/**"
(ocx) ➜ geesnoopy git:(master) git lfs track "docs/**"
Tracking "docs/**"
(ocx) ➜ geesnoopy git:(master) cat .gitattributes
assets/** filter=lfs diff=lfs merge=lfs -text
docs/** filter=lfs diff=lfs merge=lfs -text
(ocx) ➜ geesnoopy git:(master) git lfs ls-files
...
cat .gitattributes 
assets/** filter=lfs diff=lfs merge=lfs -text
docs/** filter=lfs diff=lfs merge=lfs -text
(ocx) ➜  geesnoopy git:(master) git push -u origin master

Locking support detected on remote "origin". Consider enabling it with:
$ git config lfs.https://gitea.gee.cool/gee/geesnoopy.git/info/lfs.locksverify true
Uploading LFS objects: 100% (2545/2545), 1.6 GB | 848 KB/s, done.
Enumerating objects: 3648, done.
Counting objects: 100% (3648/3648), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3594/3594), done.
client_loop: send disconnect: Broken pipeB | 2.06 MiB/s
fatal: the remote end hung up unexpectedly
fatal: the remote end hung up unexpectedly

Running the push again succeeds, and the LFS objects uploaded the first time are not sent again:

(ocx) ➜ geesnoopy git:(master) git push -u origin master

Locking support detected on remote "origin". Consider enabling it with:
$ git config lfs.https://gitea.gee.cool/gee/geesnoopy.git/info/lfs.locksverify true
Uploading LFS objects: 100% (2545/2545), 1.6 GB | 0 B/s, done.
Enumerating objects: 3648, done.
Counting objects: 100% (3648/3648), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3594/3594), done.
Writing objects: 100% (3648/3648), 6.63 MiB | 456.00 KiB/s, done.
Total 3648 (delta 10), reused 0 (delta 0)
remote: Resolving deltas: 100% (10/10), done.
remote: . Processing 1 references
remote: Processed 1 references in total
To gitea.gee.cool:gee/geesnoopy.git
* [new branch] master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.

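Debugging Git LFS problems

File permission problem on the Git LFS server

2023-05-31

After creating a new local repo, git push kept failing to upload the Git LFS files with a 500 Internal Server Error, and the first suspicion was a network problem.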

 git push
Locking support detected on remote "origin". Consider enabling it with:
$ git config lfs.https://gitea.gee.cool/gee/muddy.git/info/lfs.locksverify true
Fatal error: Internal Server Error 0 B | 0 B/s
Uploading LFS objects: 0% (0/2), 0 B | 0 B/s, done.
Fatal error: Internal Server Error
error: failed to push some refs to 'git@gitea.gee.cool:gee/muddy.git'
  • Debugging locally

ChatGPT suggested turning on Git tracing locally to see the details.

GIT_TRACE=1 GIT_LFS_TRACE=1 git push

It also suggested checking whether DNS was at fault:

nslookup <git-lfs-server-url>

Checking the logs locally:

git lfs logs last
git lfs status

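  • Checking the Gitea logs on the server with docker-compose logs -f

Nothing on the local side revealed the cause, so the next step was to look through the logs on the server.

Gitea's web access log showed nothing useful, but the Gitea server log was very clear: every Git LFS operation from the remote client is recorded there, and it pointed to a file permission problem.

View the log with docker-compose logs -f: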

gitea     | 2023/05/31 19:42:22 ...lfs/content_store.go:58:Put() [E] [6477329e-2] Whilst putting LFS OID[6f619290f1f9cc82e4cf8866b44bdaa04fa6e25d5e54c168bab84d02d7adb292]: Failed to copy to tmpPath: 6f/61/9290f1f9cc82e4cf8866b44bdaa04fa6e25d5e54c168bab84d02d7adb292 Error: mkdir /data/lfs/6f/61: permission denied
gitea | 2023/05/31 19:42:22 ...rvices/lfs/server.go:323:func1() [E] [6477329e-2] Error putting LFS MetaObject [6f619290f1f9cc82e4cf8866b44bdaa04fa6e25d5e54c168bab84d02d7adb292] into content store. Error: mkdir /data/lfs/6f/61: permission denied
gitea | 2023/05/31 19:42:22 ...rvices/lfs/server.go:336:UploadHandler() [E] [6477329e-2] Error whilst uploadOrVerify LFS OID[6f619290f1f9cc82e4cf8866b44bdaa04fa6e25d5e54c168bab84d02d7adb292]: mkdir /data/lfs/6f/61: permission denied
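
The log lines also reveal how the object is laid out on disk: the OID 6f619290... is stored under 6f/61/9290..., that is, the first two characters, the next two, then the remainder. Below is a minimal sketch of that mapping, inferred only from the log above rather than from Gitea's source; lfs_object_path is a hypothetical helper name. It can help locate the directory whose permissions need checking.

```shell
# Map an LFS OID to the relative path seen in the Gitea log above:
# <first 2 chars>/<next 2 chars>/<rest>.
# Layout inferred from the log output; hypothetical helper name.
lfs_object_path() {
  oid=$1
  a=$(printf '%s' "$oid" | cut -c1-2)
  b=$(printf '%s' "$oid" | cut -c3-4)
  rest=$(printf '%s' "$oid" | cut -c5-)
  printf '%s/%s/%s\n' "$a" "$b" "$rest"
}

lfs_object_path 6f619290f1f9cc82e4cf8866b44bdaa04fa6e25d5e54c168bab84d02d7adb292
```

In the failing case the mkdir for /data/lfs/6f/61 was denied, so the thing to inspect is the owner and mode of the lfs directory itself.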

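The server had recently been moved to a new machine and the whole Git server migrated with it, so the likely cause was that the owner of the migrated files differs from the user that runs the service.

  • Some Gitea server folders are owned by ubuntu, others by lighthouse

Checking the processes on the host with ps aux | grep gitea showed that the process owner is lighthouse, while the lfs folder is owned by ubuntu and not writable by other users. That makes the problem clear: the users do not match.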

ls -lah /data/gitea/data/
total 24K
drwxr-xr-x 6 ubuntu ubuntu 4.0K May 7 19:53 .
drwxr-xr-x 3 ubuntu ubuntu 4.0K May 7 19:06 ..
drwxr-xr-x 4 lighthouse lighthouse 4.0K May 7 19:50 git
drwxr-xr-x 15 lighthouse lighthouse 4.0K May 7 19:54 gitea
drwxrwxr-x 259 ubuntu ubuntu 4.0K May 8 03:10 lfs
drwx------ 2 root root 4.0K May 7 19:07 ssh
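  • Cause: the same Docker user UID maps to different users on different systems

The Docker deployment's environment variables pin the user that runs the service by UID, and on both machines that UID is 1000. The catch is that the same UID can belong to different users on different operating system installs.

On the old server (Ubuntu 22), UID 1000 is ubuntu; on the new server (Ubuntu 18), UID 1000 is lighthouse.

Ubuntu 18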

cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
ubuntu:x:500:500:ubuntu,,,:/home/ubuntu:/bin/bash
lighthouse:x:1000:1000::/home/lighthouse:/bin/bash

Ubuntu 22

 cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
ubuntu:x:1000:1000:qcloud:/home/ubuntu:/bin/bash
lxd:x:999:100::/var/snap/lxd/common/lxd:/bin/false
ntp:x:113:118::/nonexistent:/usr/sbin/nologin
lighthouse:x:1001:1001::/home/lighthouse:/bin/bash
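
A quick way to catch this kind of mismatch is to resolve the configured UID on each machine and compare it with the numeric owner of the data directory. The following is a small sketch, assuming a POSIX shell with awk; uid_name and owner_uid are hypothetical helper names, and the demo data is just the relevant passwd lines from the two listings above.

```shell
# Hypothetical helpers for comparing UID mappings; not part of Docker or Gitea.

# Print the user name that a numeric UID maps to in a passwd-format file.
uid_name() {
  awk -F: -v uid="$2" '$3 == uid { print $1; exit }' "$1"
}

# Print the numeric owner UID of a path (ls -n shows IDs instead of names).
owner_uid() {
  ls -ldn "$1" | awk '{ print $3 }'
}

# Demo data: the relevant /etc/passwd lines from the two servers above.
old_passwd=$(mktemp)
new_passwd=$(mktemp)
printf '%s\n' 'ubuntu:x:1000:1000:qcloud:/home/ubuntu:/bin/bash' \
  'lighthouse:x:1001:1001::/home/lighthouse:/bin/bash' > "$old_passwd"
printf '%s\n' 'ubuntu:x:500:500:ubuntu,,,:/home/ubuntu:/bin/bash' \
  'lighthouse:x:1000:1000::/home/lighthouse:/bin/bash' > "$new_passwd"

echo "UID 1000 on the old server: $(uid_name "$old_passwd" 1000)"
echo "UID 1000 on the new server: $(uid_name "$new_passwd" 1000)"
```

On the real host, comparing owner_uid /data/gitea/data/lfs with the USER_UID value in .env would show directly whether the service user can own that directory.
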
  • Fix: change the user and restart

.env

USER_UID=1000
USER_GID=1000

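Change both values to 500, which is the ubuntu user on this machine, then restart the service.

Comparing the process owner before and after shows that it is now ubuntu.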

ubuntu@VM-0-4-ubuntu:/data/geedeploy/gitea$ ps aux | grep gitea
root 7922 0.0 0.0 1040 720 ? S 19:51 0:00 s6-supervise gitea
lightho+ 7924 11.8 4.9 878392 174440 ? Ssl 19:51 0:00 /usr/local/bin/gitea web
ubuntu 7982 0.0 0.0 13776 1020 pts/6 S+ 19:51 0:00 grep --color=auto gitea
ubuntu@VM-0-4-ubuntu:/data/geedeploy/gitea$ vim .env
ubuntu@VM-0-4-ubuntu:/data/geedeploy/gitea$ docker-compose down
Stopping gitea ... done
Removing gitea ... done
Removing network gitea_gitea
ubuntu@VM-0-4-ubuntu:/data/geedeploy/gitea$ docker-compose up -d
Creating network "gitea_gitea" with the default driver
Creating gitea ...
Creating gitea ... done
ubuntu@VM-0-4-ubuntu:/data/geedeploy/gitea$ ps aux | grep gitea
root 8466 0.0 0.0 1040 732 ? S 19:52 0:00 s6-supervise gitea
ubuntu 8468 10.5 4.9 878136 173736 ? Ssl 19:52 0:00 /usr/local/bin/gitea web
ubuntu 8530 0.0 0.0 13776 1072 pts/6 S+ 19:52 0:00 grep --color=auto gitea
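  • git pull still fails with Permission denied

The cause was that the owner of the folders listed above had not been changed yet. After changing them to the ubuntu user, access recovered.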

git pull
git@gitea.gee.cool: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
  • The push now uploads the LFS files successfully

(ocx) ➜ muddy git:(master) git push
Locking support detected on remote "origin". Consider enabling it with:
$ git config lfs.https://gitea.gee.cool/gee/muddy.git/info/lfs.locksverify true
Uploading LFS objects: 100% (2/2), 2.5 MB | 694 KB/s, done.
Enumerating objects: 46, done.
Counting objects: 100% (46/46), done.
Delta compression using up to 8 threads
Compressing objects: 100% (30/30), done.
Writing objects: 100% (35/35), 18.90 KiB | 1.89 MiB/s, done.
Total 35 (delta 4), reused 0 (delta 0)
remote: . Processing 1 references
remote: Processed 1 references in total
To gitea.gee.cool:gee/muddy.git
4a58645..ad6c09e master -> master

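LFS files from git clone are only pointer files

2023-05-31

After a git clone on a remote machine, the LFS files turned out to be only pointer files, small text stubs rather than the real content. A search showed that some setup is needed first.

The downloaded image files are only pointers: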

ubuntu@VM-0-4-ubuntu:/data/muddy$ git pull
Already up to date.

ubuntu@VM-0-4-ubuntu:/data/muddy$ ls -alh assets/2023-05/
total 16K
drwxrwxr-x 2 ubuntu ubuntu 4.0K May 31 19:58 .
drwxrwxr-x 3 ubuntu ubuntu 4.0K May 29 20:45 ..
-rw-rw-r-- 1 ubuntu ubuntu 132 May 31 19:58 20230529083513-32558.png
-rw-rw-r-- 1 ubuntu ubuntu 132 May 31 19:58 20230531052303-20549.png

ubuntu@VM-0-4-ubuntu:/data/muddy$ cat assets/2023-05/20230529083513-32558.png
version https://git-lfs.github.com/spec/v1
oid sha256:f84dc8c326d25adb8b4cc56369c8e8b2f8cae271aec449084689af988a21649a
size 1457024
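
The same check can be automated: a file whose first line is the LFS spec version line is still a pointer. Here is a minimal sketch, assuming a POSIX shell with find, head, and grep; is_lfs_pointer and list_unfetched are hypothetical helper names, and the demo directory only mimics the situation above.

```shell
# Succeed if the file's first line is the Git LFS pointer version line.
# Hypothetical helper; not part of git-lfs.
is_lfs_pointer() {
  head -n 1 "$1" 2>/dev/null | grep -qxF 'version https://git-lfs.github.com/spec/v1'
}

# List the files under a directory that are still pointer files,
# skipping anything inside .git.
list_unfetched() {
  find "$1" -type f ! -path '*/.git/*' | while IFS= read -r f; do
    if is_lfs_pointer "$f"; then
      printf '%s\n' "$f"
    fi
  done
}

# Demo: one pointer file (contents as shown above) and one ordinary file.
demo=$(mktemp -d)
printf '%s\n' 'version https://git-lfs.github.com/spec/v1' \
  'oid sha256:f84dc8c326d25adb8b4cc56369c8e8b2f8cae271aec449084689af988a21649a' \
  'size 1457024' > "$demo/pointer.png"
printf 'not a pointer\n' > "$demo/real.txt"

list_unfetched "$demo"
```

In this repository, running list_unfetched assets before and after the fix would show the two PNG files first listed and then gone from the output.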

Following ChatGPT's suggestions resolved the problem.

  • Install Git LFS on Ubuntu

https://github.com/git-lfs/git-lfs/blob/main/INSTALLING.md

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
sudo apt-get install git-lfs
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
libpcre16-3 libpcre32-3 libpcrecpp0v5
Use 'sudo apt autoremove' to remove them.
The following NEW packages will be installed:
git-lfs
0 upgraded, 1 newly installed, 0 to remove and 97 not upgraded.
Need to get 2,129 kB of archives.
After this operation, 7,662 kB of additional disk space will be used.
Get:1 http://mirrors.tencentyun.com/ubuntu bionic/universe amd64 git-lfs amd64 2.3.4-1 [2,129 kB]
Fetched 2,129 kB in 2s (1,382 kB/s)
Selecting previously unselected package git-lfs.
(Reading database ... 103213 files and directories currently installed.)
Preparing to unpack .../git-lfs_2.3.4-1_amd64.deb ...
Unpacking git-lfs (2.3.4-1) ...
Setting up git-lfs (2.3.4-1) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
  • Configure Git LFS
ubuntu@VM-0-4-ubuntu:/data/muddy$ git lfs
git-lfs/2.3.4 (GitHub; linux amd64; go 1.8.3)
Sorry, no usage text found for "git-lfs"


ubuntu@VM-0-4-ubuntu:/data/muddy$ git lfs install
Updated git hooks.
Git LFS initialized.


ubuntu@VM-0-4-ubuntu:/data/muddy$ cat .git
.git/ .gitattributes .gitignore

ubuntu@VM-0-4-ubuntu:/data/muddy$ cat .gitattributes
assets/** filter=lfs diff=lfs merge=lfs -text
docs/** filter=lfs diff=lfs merge=lfs -text

  • Fetch and check out the LFS files manually

This resolved the problem.

git lfs fetch --all
git lfs checkout
ubuntu@VM-0-4-ubuntu:/data/muddy$ git lfs fetch --all
Scanning for all objects ever referenced...
2 objects found
Fetching objects...
Git LFS: (2 of 2 files) 2.40 MB / 2.40 MB

ubuntu@VM-0-4-ubuntu:/data/muddy$ ls -alh assets/2023-05/
total 16K
drwxrwxr-x 2 ubuntu ubuntu 4.0K May 31 20:19 .
drwxrwxr-x 3 ubuntu ubuntu 4.0K May 29 20:45 ..
-rw-rw-r-- 1 ubuntu ubuntu 132 May 31 19:58 20230529083513-32558.png
-rw-rw-r-- 1 ubuntu ubuntu 132 May 31 19:58 20230531052303-20549.png

ubuntu@VM-0-4-ubuntu:/data/muddy$ git lfs checkout
Git LFS: (2 of 2 files) 2.40 MB / 2.40 MB

ubuntu@VM-0-4-ubuntu:/data/muddy$ ls -alh assets/2023-05/
total 2.5M
drwxrwxr-x 2 ubuntu ubuntu 4.0K May 31 20:19 .
drwxrwxr-x 3 ubuntu ubuntu 4.0K May 29 20:45 ..
-rw-rw-r-- 1 ubuntu ubuntu 1.4M May 31 20:19 20230529083513-32558.png
-rw-rw-r-- 1 ubuntu ubuntu 1.1M May 31 20:19 20230531052303-20549.png

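Asking ChatGPT

question: the LFS files downloaded by git clone are only pointers, not the real files

answer:

If the LFS files are only pointers rather than real files after cloning a repository that contains Git LFS (Large File Storage) files with git clone, the LFS files were probably not downloaded correctly. The following steps address this:

  1. Confirm that Git LFS is installed: Git LFS must be installed before cloning a repository that contains LFS files. Use this command to check whether it is already installed: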
git lfs version

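If the output includes the Git LFS version, Git LFS is installed.

  2. Initialize Git LFS: before cloning a repository that contains LFS files, Git LFS must be initialized in the local Git repository. Use this command to initialize it: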
git lfs install

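  3. Download the LFS files: once Git LFS is initialized, git clone can be used to download the repository that contains LFS files, and Git LFS automatically downloads the LFS files to the designated storage location. If the downloaded LFS files are still only pointers, use these commands to download them: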

git lfs fetch --all
git lfs checkout

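These commands download all LFS files and check them out into the local repository.

In short, to download LFS files correctly with Git, first install Git LFS, then initialize Git LFS in the local Git repository, and finally clone the repository that contains LFS files with git clone and use git lfs fetch and git lfs checkout to download and check out the LFS files.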

Asking about Git LFS locking support

Locking support detected on remote "origin". Consider enabling it with:
$ git config lfs.https://gitea.gee.cool/gee/muddy.git/info/lfs.locksverify true
