~jacksonchen666/gitlab-archive-all-projects

53ba14766e3b2ebb5ea19cb58daed6d63341d7a8 — Jackson Chen 9 months ago 93987a7
add readme with a lot of instructions and words
1 files changed, 193 insertions(+), 0 deletions(-)

A README.md => README.md +193 -0
@@ 0,0 1,193 @@
# prying data out of gitlab
the primary goal of this guide is migrating git repositories and grabbing a
copy of everything with the least amount of effort.

NOTE: this guide is based on the 15.9 docs (a version not yet released as of
2023-02-12), and hard codes some things (the graphql query, rest APIs, the
format of graphql global ids) that may need to be adapted in the future.

<!--
TODO: finish and postify (or just post link)
-->

## applicability

this guide applies to the gitlab at `gitlab.com`. if you're gonna apply this
to a gitlab other than `gitlab.com`, replace `gitlab.com` yourself throughout
this repo.

## tools

get `bash`, `curl` and `jq`. a `curl` version greater than or equal to
`7.66.0` is nice to have because of
[this](https://github.com/curl/curl/commit/640b9733de74d629af68afcad0ff8bb658e80eff),
but it's not super necessary, as gitlab won't even give the `Retry-After`
header for some reason.
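
if you want to check which side of `7.66.0` your `curl` is on, here's a minimal sketch. `version_ge` and `curl_version` are made-up helper names (not part of this repo), and it assumes your `sort` supports `-V` (version sort).

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# the second word on the first line of `curl --version` is the version number
curl_version() {
  curl --version | head -n 1 | cut -d ' ' -f 2
}

if command -v curl >/dev/null 2>&1; then
  if version_ge "$(curl_version)" 7.66.0; then
    echo "curl is at least 7.66.0"
  else
    echo "curl is older than 7.66.0"
  fi
fi
```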

since you are probably reading this guide to pry data out of gitlab, you
probably have `git` installed. if not, install it.

## get an access token

<https://gitlab.com/-/profile/personal_access_tokens?name=exporter&scopes=read_api,read_user,read_registry,read_repository>

make sure it expires tomorrow, as the token is only needed for a short time.

save it wherever you want, or don't (and keep it in the page).

## modify the scripts

the scripts were primarily made for the macOS clipboard (aka `pbcopy` and
`pbpaste`, for copy to clipboard and paste from clipboard respectively), so
they won't work as-is on linux, where the commands are different.

all of the script files can be listed with this command

```
ls */{*/,}*.sh
```

the `ls` command can be replaced with another command, like `sed` (needs
more args) or `vim`.

you can then replace `pbpaste` with whatever command grabs text from the
clipboard and prints it to standard output, or with a command that prints
the personal access token from gitlab (generated in the "get an access
token" section).

here is an example with `sed`:

```
sed -i -e 's/pbpaste/whatever paste command in linux or something/' */{*/,}*.sh
```
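
if you don't know which paste command you have, this is a rough sketch of a stand-in that tries a few common ones. `paste_token` and the `GITLAB_TOKEN` variable are hypothetical names made up for this example, the scripts in this repo don't know about them.

```shell
#!/usr/bin/env bash
# paste_token: print the access token on standard output.
# GITLAB_TOKEN is a hypothetical variable name, checked first so that
# no clipboard is needed at all.
paste_token() {
  if [ -n "${GITLAB_TOKEN:-}" ]; then
    printf '%s\n' "$GITLAB_TOKEN"
  elif command -v pbpaste >/dev/null 2>&1; then
    pbpaste                          # macOS
  elif command -v wl-paste >/dev/null 2>&1; then
    wl-paste --no-newline            # wayland (wl-clipboard)
  elif command -v xclip >/dev/null 2>&1; then
    xclip -selection clipboard -o    # X11
  elif command -v xsel >/dev/null 2>&1; then
    xsel --clipboard --output        # X11
  else
    echo "no clipboard tool found, set GITLAB_TOKEN instead" >&2
    return 1
  fi
}
```

any of the individual commands inside the function can also be used directly as the `sed` replacement.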

## get the projects you're a member in

```
{
  projects(membership: true) {
    nodes {
      id,
      fullPath
    }
  }
}
```

<https://gitlab.com/-/graphql-explorer>

run the query, take the returned json and save it somewhere.

then, set the `file` variable to the filename (relative or absolute) of the
returned json.
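
if you'd rather not use the explorer page, the same query can probably be sent with `curl`. this is only a sketch: `graphql_payload` and `fetch_projects` are made-up names, and it assumes the token from the "get an access token" section is accepted as a bearer token on `https://gitlab.com/api/graphql`. results may also be paginated, which this does not handle.

```shell
#!/usr/bin/env bash
# graphql_payload: the query wrapped in the json body that a graphql
# endpoint expects (commas between fields are optional in graphql).
graphql_payload() {
  printf '%s\n' '{"query":"{ projects(membership: true) { nodes { id fullPath } } }"}'
}

# fetch_projects TOKEN OUTFILE: POST the query and save the response.
fetch_projects() {
  curl --fail --silent --show-error \
    --request POST \
    --header "Authorization: Bearer $1" \
    --header "Content-Type: application/json" \
    --data "$(graphql_payload)" \
    --output "$2" \
    "https://gitlab.com/api/graphql"
}

# example (not run here):
#   fetch_projects "$(pbpaste)" projects.json
#   file=projects.json
```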

if you want to exclude some projects, like ones that are not yours, remove
their blocks of json from the file before processing it with the following
commands:

```
jq -r '.data.projects.nodes[].fullPath' "$file" > project_paths.txt
jq -r '.data.projects.nodes[].id' "$file" | sed -e 's~gid://gitlab/Project/~~g' > project_ids.txt
```

this gets us started with some information about your projects. that's all
that's necessary, for now.
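
to make the id cleanup concrete: each `id` in the json is a string like `gid://gitlab/Project/<number>`, and only the number is kept. here's a tiny sketch of the same thing in pure bash. `strip_gid` is a made-up name and `12345` is a made-up id.

```shell
#!/usr/bin/env bash
# strip_gid: turn "gid://gitlab/Project/<number>" into "<number>"
# by removing the fixed prefix with parameter expansion.
strip_gid() {
  printf '%s\n' "${1#gid://gitlab/Project/}"
}

strip_gid 'gid://gitlab/Project/12345'   # prints 12345
```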

## mirror clone your repos

the core part is your git repos. this is also kind of the easiest step here:

```
./git-mirror-repos/mirror_clone.sh SSH $(<project_ids.txt)
```

you can change the `SSH` argument to `HTTPS` for https cloning if that's
preferred. you can also pass `NONE` if you know what you're doing, as it
will not modify anything after the `NONE` argument (unlike other options
like `SSH` and `HTTPS`).

note that private repositories are included, so use a protocol that gitlab
can authenticate you with.

this section is included because i don't trust the bundles to work, or maybe
i just don't know how to use `git bundle`.
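
for reference, a mirror clone by hand looks roughly like the sketch below. this is NOT the contents of `mirror_clone.sh`: `clone_url` and `mirror_clone_all` are made-up names, and unlike the real script this sketch takes project paths (like the ones in `project_paths.txt`), because that's what a clone url is built from.

```shell
#!/usr/bin/env bash
# clone_url PROTOCOL PATH: build the remote url for a project path on gitlab.com.
clone_url() {
  case "$1" in
    SSH)   printf 'git@gitlab.com:%s.git\n' "$2" ;;
    HTTPS) printf 'https://gitlab.com/%s.git\n' "$2" ;;
    *)     printf '%s\n' "$2" ;;   # anything else: pass the argument through untouched
  esac
}

# mirror_clone_all PROTOCOL FILE: one bare mirror clone per non-empty line of FILE.
mirror_clone_all() {
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    git clone --mirror "$(clone_url "$1" "$path")"
  done < "$2"
}

# example (not run here):
#   mirror_clone_all SSH project_paths.txt
```

`git clone --mirror` makes a bare repository with all refs (branches, tags and others) copied as-is, which is the point of this step.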

## export

you know that export option in that advanced section of the general settings
for a project?

yeah that exists. it also tells you what is exported and what is not:

> Export this project with all its related data in order to move it to a new
> GitLab instance. When the exported file is ready, you can download it from
> this page or from the download link in the email notification you will
> receive. You can then import it when creating a new project. [Learn
> more.](https://gitlab.com/help/user/project/settings/import_export)
> 
> The following items will be exported:
> 
> * Project and wiki repositories
> * Project uploads
> * Project configuration, excluding integrations
> * Issues with comments, merge requests with diffs and comments, labels,
>   milestones, snippets, and other project entities
> * LFS objects
> * Issue Boards
> * Design Management files and data
> 
> The following items will NOT be exported:
> 
> * Job logs and artifacts
> * Container registry images
> * CI variables
> * Pipeline triggers
> * Webhooks
> * Any encrypted tokens

if you want any items listed under the "not exported" list, you'll need to
get those manually.

if you want to exclude any projects specifically in this step, you can make
a copy of the `project_ids.txt` file and remove any ids for projects you
don't want to export, then replace `project_ids.txt` in the command with
the new file you just made.

run the script with the access token in your clipboard. maybe with this
command:

```
./download/request/export_project.sh $(<project_ids.txt)
```

you will get a notification for each project that gets archived. so if you
don't like being bombarded with notifications, shut them off temporarily.
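
for the curious, requesting an export through the rest API looks roughly like this. it's a sketch and not the contents of `export_project.sh`: the function names are made up, and it uses the project export endpoints (`POST` to schedule, `GET` to check status). scheduling is a write operation, so my assumption is that a token with only `read_api` could get rejected here. check that before relying on it.

```shell
#!/usr/bin/env bash
api="https://gitlab.com/api/v4"

# export_url ID: endpoint used both to schedule an export (POST) and to check on it (GET).
export_url() {
  printf '%s/projects/%s/export\n' "$api" "$1"
}

# request_export TOKEN ID: ask gitlab to start building the export archive.
request_export() {
  curl --fail --silent --show-error --request POST \
    --header "PRIVATE-TOKEN: $1" "$(export_url "$2")"
}

# export_status TOKEN ID: print the export_status field of the response.
export_status() {
  curl --fail --silent --show-error \
    --header "PRIVATE-TOKEN: $1" "$(export_url "$2")" | jq -r '.export_status'
}

# wait_for_export TOKEN ID: poll every 10 seconds until the status is "finished".
# note: this loops forever if the export never finishes.
wait_for_export() {
  until [ "$(export_status "$1" "$2")" = "finished" ]; do
    sleep 10
  done
}
```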

after that's finished, you'll have to download each archive.

run the script with the access token in your clipboard. maybe with this
command:

```
./download/download_archive/download_project_export.sh $(<project_ids.txt)
```
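
downloading a finished export by hand is one more request. again only a sketch with made-up names (`download_url`, `download_export`), and the output filename is just something picked for this example.

```shell
#!/usr/bin/env bash
# download_url ID: endpoint that serves the finished export archive.
download_url() {
  printf 'https://gitlab.com/api/v4/projects/%s/export/download\n' "$1"
}

# download_export TOKEN ID: save the archive as project_<ID>_export.tar.gz.
# --retry makes curl try again on transient errors,
# --location follows redirects if the server sends any.
download_export() {
  curl --fail --silent --show-error --location \
    --retry 5 \
    --header "PRIVATE-TOKEN: $1" \
    --output "project_$2_export.tar.gz" \
    "$(download_url "$2")"
}

# example (not run here):
#   for id in $(<project_ids.txt); do download_export "$(pbpaste)" "$id"; done
```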

## archive
this is the last step: make the projects "archived".

if you want to exclude any projects specifically in this step, you can make
a copy of the `project_ids.txt` file and remove any ids for projects you
don't want to "archive", then replace `project_ids.txt` in the command with
the new file you just made.

run the script with the access token in your clipboard. maybe with this
command:

```
./archive_all/archive_projects.sh $(<project_ids.txt)
```
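
archiving a project through the rest API is a single `POST` per project. this is a sketch with made-up function names, not what `archive_projects.sh` necessarily does. since it changes the project, i'd assume the token needs more than `read_api` for this.

```shell
#!/usr/bin/env bash
# archive_url ID: endpoint that marks a project as archived.
archive_url() {
  printf 'https://gitlab.com/api/v4/projects/%s/archive\n' "$1"
}

# archive_project TOKEN ID: archive one project.
archive_project() {
  curl --fail --silent --show-error --request POST \
    --header "PRIVATE-TOKEN: $1" "$(archive_url "$2")"
}

# archive_all TOKEN FILE: archive every id listed in FILE, one per line,
# and keep going if one of them fails.
archive_all() {
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    archive_project "$1" "$id" || echo "failed to archive $id" >&2
  done < "$2"
}
```

if you archive something by mistake, there is a matching `unarchive` endpoint with the same shape.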

## the rest

getting the rest of the things not included in the git repos and gitlab
exports is currently left as an exercise for the reader.