PyGithub: github.GithubException.RateLimitExceededException

Created on 7 May 2019  ·  15 comments  ·  Source: PyGithub/PyGithub

I am using the following code in my Flask app to get the number of open issues:

g = Github()

repo = g.get_repo(repo_name)

open_pulls = repo.get_pulls(state='open')
open_pull_titles = [pull.title for pull in open_pulls]

open_issues = repo.get_issues(state='open')
open_issues = [issue for issue in open_issues if issue.title not in open_pull_titles]

It raises github.GithubException.RateLimitExceededException.

repo.get_issues() returns open pull requests along with the open issues, which is why the pull request titles are filtered out.

stale

All 15 comments

If I understand correctly, get_issues and get_pulls both return a PaginatedList, which yields its elements one at a time during iteration. So no requests are made until open_issues = [issue for issue in open_issues if issue.title not in open_pull_titles] runs. RateLimitExceededException is raised when you hit the limit of your token.
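To make the lazy behavior concrete, here is a minimal sketch using a toy class. FakePaginatedList and its requests_made counter are hypothetical stand-ins written for this illustration; this is not PyGithub's implementation, only a model of "a page is requested when iteration needs it".

```python
class FakePaginatedList:
    """Toy stand-in for a lazily paginated result; not PyGithub's code."""

    def __init__(self, pages):
        self._pages = pages      # each inner list stands for one API page
        self.requests_made = 0   # counts simulated HTTP requests

    def __iter__(self):
        for page in self._pages:
            self.requests_made += 1  # a "request" happens only when a page is needed
            for element in page:
                yield element


open_issues = FakePaginatedList([["issue 1", "issue 2"], ["issue 3"]])
requests_before = open_issues.requests_made   # nothing has been fetched yet

titles = [title for title in open_issues]     # iterating triggers the fetches
requests_after = open_issues.requests_made
print(requests_before, requests_after)        # 0 2
```

In this model, creating the object costs nothing and the list comprehension is where every page request happens, which matches where the exception surfaces in the original code.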

Is there a workaround?

g = Github()
Did you authenticate at this step? The public API has a low rate limit.

Yes, I did authenticate at that step.

@242jainabhi
What I usually do when I hit the rate limit is pause my program for a while.

Instead of a list comprehension, you can use an ordinary loop with try/except. When a rate limit exception is raised, call sleep to wait for a while, then check the rate limit again through the GitHub API. The code proceeds only once the rate limit is back to 5000.

μ½”λ“œμ˜ 일뢀가 λˆ„λ½λ˜μ–΄ μ£„μ†‘ν•©λ‹ˆλ‹€. μ•„λž˜λŠ” 첫 번째 μ£Όμ„μ˜ μ½”λ“œμ— μ΄μ–΄μ§€λŠ” μ½”λ“œμž…λ‹ˆλ‹€.
λͺ¨λ“  λ―Έν•΄κ²° λ¬Έμ œμ— λŒ€ν•΄ created_at λ‚ μ§œμ— μ•‘μ„ΈμŠ€ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€. μ΄λ ‡κ²Œ ν•˜λ©΄ λͺ¨λ“  λ¬Έμ œμ— λŒ€ν•΄ API에 λ‹€μ‹œ μ•‘μ„ΈμŠ€ν•˜λ―€λ‘œ κ²°κ΅­ ν•œλ„ μ΄μƒμœΌλ‘œ ν˜ΈμΆœν•˜κ²Œ λ©λ‹ˆλ‹€.

for issue in open_issues:
    created_at = issue.created_at.timestamp()

I have not found a solution to this. Even with authenticated requests, the limit is exhausted when there are too many issues (say, 2000).

Something like this:

repositories = g.search_repositories(
    query='stars:>=10 fork:true language:python')

λ˜ν•œ 속도 μ œν•œμ„ νŠΈλ¦¬κ±°ν•©λ‹ˆλ‹€. μžλ™μœΌλ‘œ νŽ˜μ΄μ§€ 맀김을 μˆ˜ν–‰ν•˜κ³  속도 μ œν•œμ„ νŠΈλ¦¬κ±°ν•œλ‹€κ³  κ°€μ •ν•©λ‹ˆλ‹€. μΌμ‹œ 쀑지할 수 μžˆλ„λ‘ μˆ˜λ™μœΌλ‘œ μˆ˜ν–‰ν•  수 μžˆλŠ” 방법이 μžˆμŠ΅λ‹ˆκΉŒ?

I now realize this is a real problem.
One possible workaround is a snippet like the one below. I haven't tried it yet, so let me know whether it works.

import github
from time import sleep

iter_obj = iter(open_issues)  # PaginatedList fetches pages lazily as it is iterated
while True:
    try:
        issue = next(iter_obj)
        # do something with the issue
    except StopIteration:
        break  # end of the list
    except github.GithubException.RateLimitExceededException:
        sleep(3600)  # sleep for one hour
        # check the token's rate limit before retrying
        continue

@wangpeipei90 It doesn't work.

For whatever reason, this behavior is highly unpredictable and awful.
My program can loop and preemptively call the rate limit API to make sure it stays within the limits for an hour or two, and then it randomly gets a 403.

Some basic built-in rate limit compliance would be warmly welcome here. Having to implement sleep modes based on intuition, when the application starts failing after running smoothly for two hours, is not expected behavior.

  File "word.py", line 126, in get_stargazers_inner
    for i in repo.get_stargazers_with_dates():
  File "/usr/local/lib/python3.6/dist-packages/github/PaginatedList.py", line 62, in __iter__
    newElements = self._grow()
  File "/usr/local/lib/python3.6/dist-packages/github/PaginatedList.py", line 74, in _grow
    newElements = self._fetchNextPage()
  File "/usr/local/lib/python3.6/dist-packages/github/PaginatedList.py", line 199, in _fetchNextPage
    headers=self.__headers
  File "/usr/local/lib/python3.6/dist-packages/github/Requester.py", line 276, in requestJsonAndCheck
    return self.__check(*self.requestJson(verb, url, parameters, headers, input, self.__customConnection(url)))
  File "/usr/local/lib/python3.6/dist-packages/github/Requester.py", line 287, in __check
    raise self.__createException(status, responseHeaders, output)
github.GithubException.RateLimitExceededException: 403 {'message': 'API rate limit exceeded for user ID xxxx.', 'documentation_url': 'https://developer.github.com/v3/#rate-limiting'}

λ˜ν•œ λ°±μ˜€ν”„ 라이브러리λ₯Ό μ‚¬μš©ν•  수 μžˆμ§€λ§Œ ν•­λͺ© 반볡의 ν˜„μž¬ μœ„μΉ˜λ₯Ό μ„€λͺ…ν•  수 μ—†μœΌλ―€λ‘œ μ²˜μŒλΆ€ν„° λ‹€μ‹œ μ‹œμž‘ν•©λ‹ˆλ‹€.

Well, I went to https://github.com/settings/tokens and did a "Regenerate token". That got me rolling again, but I'm not sure for how long.

I used the "token" authentication method. Example:

    github = Github("19exxxxxxxxxxxxxxxxxxxxxe3ab065edae6470")

κ³Όλ„ν•œ μš”μ²­μ€ #1233도 μ°Έμ‘°ν•˜μ„Έμš”.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@wangpeipei90

It works for me, but RateLimitExceededException is not under GithubException in the version I use. Here is my code:

import calendar
import time

from github import RateLimitExceededException

# g, keyword, repo, pr_file, count and logger are defined elsewhere in the script
issues = g.search_issues(query=keyword, **{'repo': repo, 'type': 'pr'})
iter_obj = iter(issues)
while True:
    try:
        pr = next(iter_obj)
        with open(pr_file, 'a+') as f:
            f.write(pr.html_url + '\n')
        count += 1
        logger.info(count)
    except StopIteration:
        break  # loop end
    except RateLimitExceededException:
        search_rate_limit = g.get_rate_limit().search
        logger.info('search remaining: {}'.format(search_rate_limit.remaining))
        reset_timestamp = calendar.timegm(search_rate_limit.reset.timetuple())
        # add 10 seconds to be sure the rate limit has been reset
        sleep_time = reset_timestamp - calendar.timegm(time.gmtime()) + 10
        time.sleep(sleep_time)
        continue
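The sleep computation above can be pulled into a small helper. This is only a sketch: seconds_until_reset is a hypothetical name, and it assumes the reset time is a naive datetime in UTC, as the calendar.timegm call in the code above implies. It also clamps the wait at zero, because time.sleep raises ValueError for a negative argument, which the original expression could produce if the reset time is more than 10 seconds in the past.

```python
import calendar
from datetime import datetime


def seconds_until_reset(reset, now, margin=10):
    """Return how long to sleep until `reset`, plus a safety margin.

    Both arguments are naive datetimes interpreted as UTC. The result is
    never negative, so it is always safe to pass to time.sleep().
    """
    reset_ts = calendar.timegm(reset.timetuple())
    now_ts = calendar.timegm(now.timetuple())
    return max(reset_ts - now_ts, 0) + margin


wait = seconds_until_reset(datetime(2020, 1, 8, 23, 43, 0),
                           datetime(2020, 1, 8, 23, 42, 9))
print(wait)  # 61
```

Passing the current time as an argument, rather than reading the clock inside the function, keeps the helper easy to check with fixed values.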

λ‹€μŒμ€ 둜그의 μΌλΆ€μž…λ‹ˆλ‹€.

2020/01/08 23:42:09 PM - INFO - search remaining: 0

κ°μ‚¬ν•©λ‹ˆλ‹€, @Xiaoven λ§ˆμΉ¨λ‚΄ κ·€ν•˜μ˜ μ½”λ“œλ‘œ 이 문제λ₯Ό ν•΄κ²°ν•  수 μžˆμŠ΅λ‹ˆλ‹€.

이 νŽ˜μ΄μ§€κ°€ 도움이 λ˜μ—ˆλ‚˜μš”?
0 / 5 - 0 λ“±κΈ‰