How do I clone and pull all repositories from an organization?

Time: 2018-10-19 02:25:59

Tags: linux github clone pull organization

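I have seen some implementations that use GitHub's API to retrieve the list of repositories, but in my case they did not capture all of the repositories.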

1 answer:

Answer 0: (score: 0)

In case anyone is interested, I wrote a small bash script that does this. I hope it helps.

Replace XXXXXXXX with the name of the target organization.

#!/bin/bash
# Quoting the delimiter keeps the shell from expanding anything inside the Python code.
python3 << 'END'

import re
import urllib.request


ORGANIZATION = "XXXXXXXX"
URL_BASE = "https://github.com/{}?page=".format(ORGANIZATION)
URL_CLONE = "https://github.com/{}/{}.git"


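# Walk the organization's paginated repository listing until a page
# yields no matches. The pattern relies on "codeRepository" appearing in
# the page markup, so it may need adjusting if GitHub changes its HTML.
pattern = 'href="/{}/([^"]+)[^>]+codeRepository'.format(re.escape(ORGANIZATION))
page = 1
libraries = True
urls = []
while libraries:
    url = URL_BASE + str(page)
    code = ''
    # The context manager closes the response even when the status is not 200.
    with urllib.request.urlopen(url) as resp:
        if resp.status == 200:
            code = resp.read().decode("utf8")
    libraries = re.findall(pattern, code)
    for library in libraries:
        urls.append(URL_CLONE.format(ORGANIZATION, library))
    page += 1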


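# One URL per line, each newline-terminated so the shell loop reads the last entry too.
with open('libraries.txt', 'w', encoding='utf-8') as ofile:
    ofile.writelines(url + '\n' for url in urls)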

END

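# Clone each listed repository; the -n test also handles a final line without a trailing newline.
while IFS= read -r url || [ -n "$url" ]; do
  git clone "$url" || echo "    Error while cloning, probably already exists"
done < libraries.txt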

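# Update every subdirectory that is a git checkout, without parsing ls output.
for dir in */; do
  [ -d "${dir}.git" ] && git -C "$dir" pull
done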
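
The question mentions API-based listings that missed repositories. One possible cause, which is an assumption since the question gives no details, is pagination: GitHub's REST endpoint for organization repositories returns 30 entries per page by default and at most 100 with per_page. Below is a minimal sketch of walking every page. The names list_clone_urls and fetch_json are hypothetical helpers, not part of the script above, and unauthenticated requests only see public repositories.

```python
import json
import urllib.request

# REST endpoint for an organization's repositories, with explicit paging.
API_URL = "https://api.github.com/orgs/{}/repos?per_page={}&page={}"


def fetch_json(url):
    """Download one API page and decode it as JSON (unauthenticated)."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def list_clone_urls(organization, fetch=fetch_json, per_page=100):
    """Collect clone URLs from every page, stopping at the first empty page.

    fetch is injectable (hypothetical design choice) so the paging logic
    can be exercised without network access.
    """
    urls = []
    page = 1
    while True:
        repos = fetch(API_URL.format(organization, per_page, page))
        if not repos:
            break
        # Each repository object carries its HTTPS clone URL in "clone_url".
        urls.extend(repo["clone_url"] for repo in repos)
        page += 1
    return urls
```

Calling list_clone_urls("XXXXXXXX") and writing the result to libraries.txt, one URL per line, would let the clone loop above run unchanged. Private repositories would additionally need an authenticated request, which this sketch does not attempt.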