~jacksonchen666/gitlab-archive-all-projects

helper scripts and instructions to export everything from a gitlab account and make all projects 'archived'

clone

read-only
https://git.sr.ht/~jacksonchen666/gitlab-archive-all-projects
read/write
git@git.sr.ht:~jacksonchen666/gitlab-archive-all-projects


#prying data out of gitlab

the primary goal of this guide is migration of git repositories and grabbing a copy of everything in the least amount of effort.

NOTE: this guide is based on the gitlab 15.9 docs (a version not yet released as of 2023-02-12), and hard-codes some things (the graphql query, the rest api endpoints, the format of graphql global ids) that may need to be adapted in the future.

#applicability

this guide applies to the gitlab at gitlab.com. if you're gonna apply this to a gitlab other than gitlab.com, replace gitlab.com yourself throughout this repo.

#tools

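get bash and curl. curl 7.66.0 or newer is nice to have, since that is the version where --retry started honoring the Retry-After header, but it's not super necessary as gitlab won't even send the Retry-After header for some reason. you'll also need jq for processing the project list later.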
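a quick way to check the tools before starting could look like the sketch below. the function names (missing_tools, version_at_least) are made up for this example, and the list of tools is just the ones this guide mentions or uses.

```shell
# print every tool from the argument list that is not on PATH
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# succeed if dotted version $1 is greater than or equal to $2
version_at_least() {
  local have want h w
  have=${1%%-*}   # drop suffixes like -DEV
  want=$2
  while [ -n "$have" ] || [ -n "$want" ]; do
    h=${have%%.*}
    w=${want%%.*}
    [ "${h:-0}" -gt "${w:-0}" ] && return 0
    [ "${h:-0}" -lt "${w:-0}" ] && return 1
    # drop the component that was just compared
    case $have in *.*) have=${have#*.} ;; *) have= ;; esac
    case $want in *.*) want=${want#*.} ;; *) want= ;; esac
  done
  return 0
}

missing=$(missing_tools bash curl git jq)
[ -z "$missing" ] || echo "missing tools: $missing" >&2

if command -v curl >/dev/null 2>&1; then
  # first line of `curl --version` looks like "curl 7.88.1 (...)"
  curl_version=$(curl --version | awk 'NR == 1 { print $2 }')
  version_at_least "$curl_version" 7.66.0 ||
    echo "curl $curl_version is older than 7.66.0" >&2
fi
```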

since you are probably reading this guide to pry data out of gitlab, you probably have git installed. if not, install it.

#get an access token

https://gitlab.com/-/profile/personal_access_tokens?name=exporter&scopes=read_api,read_user,read_registry,read_repository

make sure it expires tomorrow, as the token is only needed while you follow this guide.

save it wherever you want, or don't (and keep it in the page).
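if you want to confirm the token works before running anything else, a sketch like the following asks the rest api who you are. GITLAB_TOKEN and GITLAB_HOST are hypothetical variable names chosen for this example (the scripts in this repo read the token from the clipboard instead), and check_token is not part of the repo.

```shell
# ask the api for the current user; fails if the token is rejected
check_token() {
  curl --fail --silent --show-error \
    --header "PRIVATE-TOKEN: ${GITLAB_TOKEN:?set GITLAB_TOKEN first}" \
    "https://${GITLAB_HOST:-gitlab.com}/api/v4/user"
}
```

running check_token with GITLAB_TOKEN set should print a json blob describing your account, or an error from curl if the token is wrong or expired.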

#modify the scripts

the scripts were primarily written for the macOS clipboard commands (pbcopy to copy to the clipboard and pbpaste to paste from it), so they won't work as-is on linux, where the commands are different.

all of the script files can be listed with this command

ls */{*/,}*.sh

the ls command can be replaced with another command, like sed (needs more args) or vim.

you can then replace pbpaste with whatever command grabs the clipboard contents and prints them to standard output, or with a command that prints the personal access token from gitlab (generated in the "get an access token" section).

here is an example with sed:

sed -i -e 's/pbpaste/whatever paste command in linux or something/' */{*/,}*.sh
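as a sketch of what the replacement command could be: the function below prints the token from an environment variable if one is set, and otherwise falls back to whichever clipboard tool is installed. GITLAB_TOKEN and paste_token are names invented for this example, and which clipboard tool you have (wl-paste, xclip, xsel) depends on your setup.

```shell
# print the access token: env var first, then the first clipboard tool found
paste_token() {
  if [ -n "${GITLAB_TOKEN:-}" ]; then
    printf '%s\n' "$GITLAB_TOKEN"
  elif command -v pbpaste >/dev/null 2>&1; then
    pbpaste                          # macOS
  elif command -v wl-paste >/dev/null 2>&1; then
    wl-paste --no-newline            # wayland
  elif command -v xclip >/dev/null 2>&1; then
    xclip -selection clipboard -o    # x11
  elif command -v xsel >/dev/null 2>&1; then
    xsel --clipboard --output        # x11
  else
    echo 'no clipboard command found; set GITLAB_TOKEN instead' >&2
    return 1
  fi
}
```

since the scripts run as separate processes, a function like this would have to be saved as an executable script somewhere on your PATH before its name can stand in for pbpaste.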

#get the projects you're a member in

{
  projects(membership: true) {
    nodes {
      id,
      fullPath
    }
  }
}

https://gitlab.com/-/graphql-explorer

run the query, take the returned json and save it somewhere.

then, set the file variable to the filename (relative or absolute) of the returned json.

if you want to exclude some projects (like ones that are not yours), remove their blocks of json from the file first. then process the file with the following commands:

jq -r '.data.projects.nodes[].fullPath' "$file" > project_paths.txt
jq -r '.data.projects.nodes[].id' "$file" | sed -e 's~gid://gitlab/Project/~~' > project_ids.txt

this gets us started with some information about your projects. that's all that's necessary, for now.
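before moving on, it can be worth checking that the two files line up. the sketch below (check_project_lists is a made-up name) verifies that every id is a bare number, meaning the gid://gitlab/Project/ prefix really got stripped, and that both files have the same number of lines, then prints ids and paths side by side.

```shell
# sanity-check the id and path lists generated from the graphql json
check_project_lists() {
  local ids paths
  ids=${1:-project_ids.txt}
  paths=${2:-project_paths.txt}
  # every id should be a plain number once the gid prefix is gone
  if grep -qvE '^[0-9]+$' "$ids"; then
    echo "non-numeric line in $ids" >&2
    return 1
  fi
  # both files come from the same json, so the line counts should match
  if [ "$(wc -l < "$ids")" -ne "$(wc -l < "$paths")" ]; then
    echo "$ids and $paths have different line counts" >&2
    return 1
  fi
  # tab-separated: id, then full path
  paste "$ids" "$paths"
}
```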

#mirror clone your repos

the core part is your git repos. this is also kind of the easiest step here:

./git-mirror-repos/mirror_clone.sh SSH $(<project_paths.txt)

you can change the SSH argument to HTTPS for https cloning if that's preferred. you can also pass NONE if you know what you're doing, as it will not modify anything after the NONE argument (unlike other options like SSH and HTTPS).

note that private repositories are included, so use a protocol that lets you authenticate with gitlab.

this section is included because i don't trust the bundles to work, or i just don't know how to use git bundle.
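for a rough idea of what a mirror clone boils down to, here is a sketch. it is a guess at the behavior described above, not the actual contents of mirror_clone.sh: it assumes the arguments are project paths (like those in project_paths.txt), that SSH and HTTPS prefix the path with the gitlab.com clone url, and that NONE passes the argument through untouched. clone_url and mirror_clone are names made up for this example.

```shell
# turn a project path into a clone url, depending on the mode
clone_url() {
  local mode path
  mode=$1
  path=$2
  case "$mode" in
    SSH)   printf 'git@gitlab.com:%s.git\n' "$path" ;;
    HTTPS) printf 'https://gitlab.com/%s.git\n' "$path" ;;
    NONE)  printf '%s\n' "$path" ;;   # already a full url, leave it alone
    *)     echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# mirror clone every project given after the mode argument
mirror_clone() {
  local mode path
  mode=$1
  shift
  for path in "$@"; do
    git clone --mirror "$(clone_url "$mode" "$path")"
  done
}
```

git clone --mirror copies all refs (branches, tags, and everything else under refs/) into a bare repository, which is why it is a reasonable fit for a full backup.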

#export

you know that export option in that advanced section of the general settings for a project?

yeah that exists. it also tells you what is exported and what is not:

Export this project with all its related data in order to move it to a new GitLab instance. When the exported file is ready, you can download it from this page or from the download link in the email notification you will receive. You can then import it when creating a new project. Learn more.

The following items will be exported:

  • Project and wiki repositories
  • Project uploads
  • Project configuration, excluding integrations
  • Issues with comments, merge requests with diffs and comments, labels, milestones, snippets, and other project entities
  • LFS objects
  • Issue Boards
  • Design Management files and data

The following items will NOT be exported:

  • Job logs and artifacts
  • Container registry images
  • CI variables
  • Pipeline triggers
  • Webhooks
  • Any encrypted tokens

if you want any items listed under the "not exported" list, you'll need to get those manually.

if you want to exclude any projects from this step specifically, make a copy of the project_ids.txt file, remove the ids of the projects you don't want to export, then replace project_ids.txt in the commands below with the new file you just made.
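one way to build the trimmed id file without editing it by hand is to keep a second file listing the ids to skip and filter with grep. the file names in the usage note and the filter_ids function are hypothetical.

```shell
# print the ids from file $1 that are not listed in file $2
filter_ids() {
  # -v inverts the match, -x matches whole lines only (so 12 does not
  # knock out 123), -F treats patterns as fixed strings, -f reads them
  # from a file
  grep -vxFf "$2" "$1"
}
```

for example, filter_ids project_ids.txt exclude.txt > export_ids.txt writes the remaining ids to export_ids.txt, which can then be used in place of project_ids.txt.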

run the script with the access token in your clipboard. maybe with this command:

./download/request/export_project.sh $(<project_ids.txt)

you will get a notification for each project that gets exported. so if you don't like being bombarded with notifications, shut them off temporarily.

after that's finished, you'll have to download each archive.

run the script with the access token in your clipboard. maybe with this command:

./download/download_archive/download_project_export.sh $(<project_ids.txt)
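exports are generated in the background, so a download can fail if it is attempted too early. the sketch below polls the export status endpoint of the rest api until the export is finished. it is not taken from the scripts in this repo: GITLAB_TOKEN and POLL_SECONDS are hypothetical variables, the function names are made up, and the list of status values reflects the gitlab docs as i understand them.

```shell
# print the export status of project id $1
export_status() {
  curl --fail --silent --show-error \
    --header "PRIVATE-TOKEN: ${GITLAB_TOKEN:?set GITLAB_TOKEN first}" \
    "https://gitlab.com/api/v4/projects/$1/export" |
    jq -r '.export_status'
}

# block until the export of project id $1 is finished
wait_for_export() {
  local id status
  id=$1
  while true; do
    status=$(export_status "$id")
    case "$status" in
      finished)
        return 0 ;;
      queued|started|regeneration_in_progress)
        echo "project $id: $status, checking again soon" >&2 ;;
      *)
        # covers "none" (no export requested) and failed requests
        echo "project $id: unexpected status '$status'" >&2
        return 1 ;;
    esac
    sleep "${POLL_SECONDS:-30}"
  done
}
```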

#archive

this is the last step: make the projects "archived".

if you want to exclude any projects specifically in this step, you can make a copy of the project_ids.txt file and remove any ids for projects you don't want to "archive", then replace project_ids.txt in the command with the new file you just made.

run the script with the access token in your clipboard. maybe with this command:

./archive_all/archive_projects.sh $(<project_ids.txt)
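since archiving changes the projects themselves, a dry run can be reassuring. the sketch below is not the repo's archive_projects.sh; it is a stand-in that calls the rest api's archive endpoint for each id, and only prints what it would do when the hypothetical DRY_RUN variable is set. GITLAB_TOKEN is likewise a made-up variable name.

```shell
# archive each project id given as an argument (or just say so with DRY_RUN)
archive_projects() {
  local id
  for id in "$@"; do
    if [ -n "${DRY_RUN:-}" ]; then
      echo "would archive project $id"
      continue
    fi
    curl --fail --silent --show-error --request POST \
      --header "PRIVATE-TOKEN: ${GITLAB_TOKEN:?set GITLAB_TOKEN first}" \
      "https://gitlab.com/api/v4/projects/$id/archive" >/dev/null ||
      echo "failed to archive project $id" >&2
  done
}
```

for example, DRY_RUN=1 archive_projects $(cat project_ids.txt) lists the projects that would be archived without touching them.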

#the rest

getting the rest of the things not included in the git repos and gitlab exports is currently left as an exercise for the reader.