Pygithub: InputGitTreeElement should allow passing "null" for sha

Created on 14 Dec 2019  ·  8Comments  ·  Source: PyGithub/PyGithub

Github's Tree creation api allows us to pass sha = null to indicate that the specified blob needs to be deleted.

However, I don't have a way to pass this info to my InputGitTreeElement. I can either give it a str or a github.GithubObject.NotSet. This means I have no way of deleting files from a tree using PyGithub (I'd like to delete multiple files in a single commit so tree creation is the ideal choice for me).

The current design is to only pass the sha if it is actually set:
https://github.com/PyGithub/PyGithub/blob/540a085001/github/InputGitTreeElement.py#L81

I can understand that passing a None goes against the design. I think something like github.GithubObject.Null could be introduced to explicitly say that this field is null. It can be used everywhere the GH API accepts a null value.

Example

new_tree = repo.create_git_tree(
    [
        InputGitTreeElement(
            path="my/dir/my_file.txt", mode="100644", type="blob", sha=github.GithubObject.Null
        ),
   ],
    base_tree=head_commit.tree
)

This will delete my/dir/my_file.txt


My current workaround is to directly hit the api to create tree (using requests, setting sha=None), get the tree sha & use it with pygithub for my remaining workflow (committing, etc).

Please let me know in case I misunderstood some aspect or if anything needs to be elaborated upon.

Most helpful comment

Thanks so much! I'll give this a try in the next day or so. Maybe without the f-strings. :-)

All 8 comments

Actually NotSet is an class in it's own right that has nothing to do with None, so if you pass in None, it should work fine.

Oh, wait, I see. We assert it's a str! Right, okay, so I think loosening to allow None is perfectly fine then, nice catch.

I'm having a lot of trouble with this change. I've made the necessary changes, but the create_git_tree() API either returns 422 GitRPC::BadObjectState or a generic 404 Not Found error. Are you able to provide a working example against a public repo so I can grab the replay data for the unit test?

My code is for a private repo, so I just created a stripped-down version of it for you here:

import json
import requests
from http import HTTPStatus
from github import Github

TOKEN = "<GITHUB TOKEN>"
gh_client = Github(TOKEN)

repo = gh_client.get_repo("duaraghav8/test")
my_branch = "tree-test"

# paths is list of files to delete
paths = ["pods/raghav/network/resources.tf", "abc"]

sha = repo.get_branch(my_branch).commit.sha
head_commit = repo.get_git_commit(sha)

tree_objects = [
    {"path": path, "mode": "100644", "type": "blob", "sha": None}
    for path in paths
]
payload = {
    "base_tree": head_commit.tree.sha,
    "tree": tree_objects,
}
headers = {"Authorization": f"token {TOKEN}"}
url = f"{repo.url}/git/trees"

response = requests.post(url, headers=headers, data=json.dumps(payload))
if response.status_code != HTTPStatus.CREATED:
    raise RuntimeError("Failed")

new_tree_sha = response.json()["sha"]
tree = repo.get_git_tree(new_tree_sha)

1 reason I too received 422 GitRPC::BadObjectState was because I was supplying filenames in paths that didn't exist in the base tree. For eg, try corrupting 1 of the filenames above.

I can't make my test repo public, but here is the structure, just setup dummy files in your own repo

pods/
  raghav/
    network/
      resources.tf

Thanks so much! I'll give this a try in the next day or so. Maybe without the f-strings. :-)

@s-t-e-v-e-n-k wohoo! thanks for fixing this
cc @rohitpaulk

@s-t-e-v-e-n-k I'm creating a commit like this, but this commits even if the file diff is zero. Please, any help on how to commit only if there is some diff . Thanks !

   file_list = [a.txt]
   file_names =[a.txt] 
    for i, entry in enumerate(file_list):
      with open(entry) as input_file:
        data = input_file.read()
      element = InputGitTreeElement(file_names[i], '100644', 'blob', data)
      element_list.append(element)
    tree = repo.create_git_tree(element_list, base_tree)
    parent = repo.get_git_commit(master_sha)
    commit = repo.create_git_commit(commit_message, tree, [parent])

@Abhishek627 Tree SHAs are essentially checksums of all the files in your repository, so you could just check if they match.

In your context, that'd be something like:

if tree.sha != base_tree.sha:
   # then commit

(I don't recall the exact method used to fetch a sha, but you get the idea)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

nixoz2k7 picture nixoz2k7  ·  7Comments

jacquev6 picture jacquev6  ·  3Comments

Borkason picture Borkason  ·  4Comments

rthill91 picture rthill91  ·  4Comments

xpdable picture xpdable  ·  5Comments