Gitea: Contributors graph

Created on 5 Feb 2017  ·  30 Comments  ·  Source: go-gitea/gitea

Implement contributor graphs: https://github.com/go-gitea/gitea/graphs/contributors

screenshot_20170205_131515



kind/feature reviewed/confirmed

All 30 comments

Is there a good graph-lib? In my opinion this can be rendered and cached server-side

Any progress?

would be nice to have 🎉

I would like to start working on this feature, if no one is already on it (yeah @lafriks, I learned my lesson, +1 is not constructive 😉).

I would probably need some assistance every now and then, e.g. on how to decide about server or client side rendering, what charting library to use etc.
I also basically don't know any Go but have good frontend knowledge, so it should work. There's a first time for everything, and I've wanted to dive into hacking on Gitea for a while 😄

Let's start by taking apart existing solutions to identify required data and possible data structure.

GitHub

API endpoint for contributions data is https://github.com/<owner>/<repo>/graphs/contributors-data.

The returned JSON data is basically a list of objects (each representing one contributor) sorted least contributions first, most contributions last:

[
  { ... }, // User with least contributions
    ...
  { ... }, // User with second most contributions
  { ... }  // User with most contributions
]

The structure is roughly similar to the one documented here and looks like this:

{
  "author": {
    "id": 12345,
    "login": "octocat",
    "avatar": "https://avatars3.githubusercontent.com/u/12345?s=60&v=4",
    "path": "/octocat",
    "hovercard_url": "/hovercards?user_id=12345"
  },
  "total": 123,
  "weeks": [
    // First week in which the repo existed
    {
      "w": 1391904000,
      "a": 6898,
      "d": 77,
      "c": 10
    },
    // Second week in which the repo existed
    {
      "w": 1392508800,
      "a": 2437,
      "d": 439,
      "c": 6
    },
    ...
    // Current week
    {
      "w": 1538265600,
      "a": 0,
      "d": 0,
      "c": 0
    }
  ]
}

Each member of the "weeks" array has the following attributes:

  • w - Start of the week, given as a Unix timestamp.
  • a - Number of additions
  • d - Number of deletions
  • c - Number of commits
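For a Go backend, the structure described above could be mapped onto types roughly like the following. This is only a sketch: the type and function names (`Week`, `Author`, `Contributor`, `ParseContributors`) and the trimmed `sample` payload are made up for illustration, and only the JSON keys come from the sample shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Week holds one contributor's stats for a single week.
type Week struct {
	Start     int64 `json:"w"` // start of the week, Unix timestamp
	Additions int   `json:"a"`
	Deletions int   `json:"d"`
	Commits   int   `json:"c"`
}

// Author mirrors part of the "author" object; keys not listed here are ignored.
type Author struct {
	ID     int64  `json:"id"`
	Login  string `json:"login"`
	Avatar string `json:"avatar"`
	Path   string `json:"path"`
}

// Contributor is one element of the top-level array.
type Contributor struct {
	Author Author `json:"author"`
	Total  int    `json:"total"`
	Weeks  []Week `json:"weeks"`
}

// ParseContributors decodes the JSON array into Go values.
func ParseContributors(data []byte) ([]Contributor, error) {
	var out []Contributor
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	return out, nil
}

// sample is a shortened, hand-written payload in the shape shown above.
const sample = `[{"author":{"id":12345,"login":"octocat","path":"/octocat"},
"total":16,"weeks":[{"w":1391904000,"a":6898,"d":77,"c":10},
{"w":1392508800,"a":2437,"d":439,"c":6}]}]`

func main() {
	contributors, err := ParseContributors([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, c := range contributors {
		fmt.Println(c.Author.Login, c.Total, len(c.Weeks))
	}
}
```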

All that information is used to build these cards:

image

The big contributions graph can obviously be built by adding up the stats of all users for each week n (0 <= n <= number of weeks since the repo was created) and plotting that total for each week.
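The summing step described here could look roughly like this in Go. It is a sketch under assumptions: `WeekStat` and `SumWeeks` are illustrative names, and the input is simply one slice of weekly stats per contributor.

```go
package main

import (
	"fmt"
	"sort"
)

// WeekStat is one contributor's activity in one week (illustrative type).
type WeekStat struct {
	Start     int64 // start of the week, Unix timestamp
	Additions int
	Deletions int
	Commits   int
}

// SumWeeks adds up the stats of all contributors for every week and
// returns one total per week, ordered by week start.
func SumWeeks(contributors [][]WeekStat) []WeekStat {
	totals := make(map[int64]*WeekStat)
	for _, weeks := range contributors {
		for _, w := range weeks {
			t, ok := totals[w.Start]
			if !ok {
				t = &WeekStat{Start: w.Start}
				totals[w.Start] = t
			}
			t.Additions += w.Additions
			t.Deletions += w.Deletions
			t.Commits += w.Commits
		}
	}
	out := make([]WeekStat, 0, len(totals))
	for _, t := range totals {
		out = append(out, *t)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Start < out[j].Start })
	return out
}

func main() {
	alice := []WeekStat{
		{Start: 1391904000, Additions: 10, Deletions: 2, Commits: 3},
		{Start: 1392508800, Commits: 1},
	}
	bob := []WeekStat{{Start: 1391904000, Additions: 5, Deletions: 1, Commits: 2}}
	for _, w := range SumWeeks([][]WeekStat{alice, bob}) {
		fmt.Println(w.Start, w.Commits)
	}
}
```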

GitLab

GitLab CE is Open Source, so we have the relevant files:

API endpoint is https://gitlab.com/<owner>/<repo>/graphs/master?format=json.

The returned JSON data is much simpler:

[
  { ... }, // Latest commit
  { ... }, // Second latest commit
    ...
  { ... }, // First commit
]

Each member of the array represents one commit, sorted latest commit first, initial commit last. The structure looks as follows:

{
  "author_name": "Some User",
  "author_email": "[email protected]",
  "date": "2018-10-02"
}

If a user made multiple commits on the same day, there will simply be duplicate entries with the same user information and date, one for each commit.

The per-user tiles contain less information than on GitHub. Plotting is done by counting the commits per day: the X-axis is time, the Y-axis the number of commits. That is done both for the whole repo (ignoring the username) and for each user (taking all commit entries for a specific user on a specific day).
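The per-day counting for a GitLab-style commit list could be sketched as follows. The names `Commit` and `CountPerDay` are hypothetical, and keying users by email is an assumption of this sketch (the thread does not say how GitLab groups entries).

```go
package main

import "fmt"

// Commit mirrors one entry of the GitLab-style list shown above.
type Commit struct {
	AuthorName  string `json:"author_name"`
	AuthorEmail string `json:"author_email"`
	Date        string `json:"date"` // "YYYY-MM-DD"
}

// CountPerDay returns the number of commits per day for the whole repo
// and, separately, per author. Authors are keyed by email (an assumption).
func CountPerDay(commits []Commit) (map[string]int, map[string]map[string]int) {
	repo := make(map[string]int)
	perAuthor := make(map[string]map[string]int)
	for _, c := range commits {
		repo[c.Date]++
		days, ok := perAuthor[c.AuthorEmail]
		if !ok {
			days = make(map[string]int)
			perAuthor[c.AuthorEmail] = days
		}
		days[c.Date]++
	}
	return repo, perAuthor
}

func main() {
	commits := []Commit{
		{AuthorName: "A", AuthorEmail: "a@example.com", Date: "2018-10-02"},
		{AuthorName: "A", AuthorEmail: "a@example.com", Date: "2018-10-02"},
		{AuthorName: "B", AuthorEmail: "b@example.com", Date: "2018-10-02"},
		{AuthorName: "B", AuthorEmail: "b@example.com", Date: "2018-10-01"},
	}
	repo, perAuthor := CountPerDay(commits)
	fmt.Println(repo["2018-10-02"], perAuthor["a@example.com"]["2018-10-02"])
}
```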


In both cases rendering is done client side, which has the great advantage of being able to build dynamic charts with zooming.

If it works with your general workflow here, I'd be fine with being assigned to this issue.


Some more thoughts on this. Constructive feedback is of course highly appreciated!

Placing the page link on the UI

image

That should work fine, no need to restructure anything for now.

Speaking of links, the page should probably live at https://git.example.com/<owner>/<repo>/contributors, that's how all the other links up there work.

Another idea, which I do not prefer, is putting the contributor graph(s) on the Activity page.

I did some DOM editing:

image

I chose octicon-organization as the icon, octicon-graph might work as well.

Now some quick CSS editing on the GitHub contributors chart for Gitea and merging the images:

image

That's a very rough idea of what it might look like, not taking individual per-user charts into account.

Looks wonderful ^-^

@linusg great! Go ahead!

@lunny I'm kinda confused right now: Who is @Morlinest and what role will he play in this issue?

It is probably a mistake or maybe he has some secret plans with me :D

@linusg @Morlinest :( Sorry, that was a mistake, like @Morlinest said. I wanted to assign this issue to @linusg, but I found that it cannot be assigned to non-maintainers and issue poster.

Ok, thanks for clarification :smile:

Oh, so I will have to do it now :D

Short heads-up for those interested: I wanted to work on this during the Christmas holidays, but couldn't find much time. I've created the basic stuff (page, routing etc.) and plan to continue working on it!

Thanks a lot ^-^

Ok folks, yet another update. I managed to get it to this state:

image


Click to expand:

Gitea vs GitHub (real-life example)

![image](https://user-images.githubusercontent.com/19366641/50791201-6f7d9500-12c1-11e9-9a3d-7612c63e6b4a.png) ![image](https://user-images.githubusercontent.com/19366641/50791210-7c9a8400-12c1-11e9-985c-b0dffcbfae3a.png)

Dark

![image](https://user-images.githubusercontent.com/19366641/50791412-0cd8c900-12c2-11e9-86e7-5fb4142a5bcc.png)



Details:

  • No data exposed over HTTP API, graphs are rendered to SVG (using https://github.com/wcharczuk/go-chart) on the server. This is really performant and keeps things simple.
  • Sorting by number of commits, additions and deletions
  • The UI is "slightly" based on GitHub 😄
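The sorting mentioned in the list above could be done server-side along these lines. This is a sketch, not the code from the unpublished branch; `ContributorStats`, `SortBy` and the key strings are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// ContributorStats is an illustrative per-user summary.
type ContributorStats struct {
	Name      string
	Commits   int
	Additions int
	Deletions int
}

// SortBy orders stats in place, highest value first.
// Unknown keys fall back to sorting by commits.
func SortBy(stats []ContributorStats, key string) {
	value := func(s ContributorStats) int {
		switch key {
		case "additions":
			return s.Additions
		case "deletions":
			return s.Deletions
		default:
			return s.Commits
		}
	}
	sort.SliceStable(stats, func(i, j int) bool {
		return value(stats[i]) > value(stats[j])
	})
}

func main() {
	stats := []ContributorStats{
		{Name: "alice", Commits: 3, Additions: 500, Deletions: 10},
		{Name: "bob", Commits: 9, Additions: 40, Deletions: 90},
	}
	SortBy(stats, "additions")
	fmt.Println(stats[0].Name)
	SortBy(stats, "commits")
	fmt.Println(stats[0].Name)
}
```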

Issues left:

  • Contributors who are not in the Gitea DB (e.g. because the repo was imported) will not show up.
  • Performance issues with bigger repositories. Edit: not performance, but HTTP 500 with the message "http: multiple response.WriteHeader calls" written to the console. Might be just me being a golang n00b.
  • Removing the AM/PM stuff from the X-Axis (can be easily done via custom formatter)
  • Fix the Y-Axis scale of user graphs, 1 commit should be half the height of 2 commits
  • Proper dark theme support (CSS for above was tweaked in the dev tools)
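On the "multiple response.WriteHeader calls" message: Go's net/http logs a message of this kind when a handler sets the status line more than once for the same response. A common way to get there is to report an error and then fall through to the normal response path. Whether that is what happened here cannot be told from the thread. The sketch below uses plain net/http rather than Gitea's own router, and `loadStats`, `contributorsHandler` and `statusFor` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// loadStats is a hypothetical stand-in for the expensive stats computation.
func loadStats(repo string) (string, error) {
	if repo == "" {
		return "", errors.New("no repository given")
	}
	return "<svg></svg>", nil
}

// contributorsHandler writes exactly one status line per request.
func contributorsHandler(w http.ResponseWriter, r *http.Request) {
	svg, err := loadStats(r.URL.Query().Get("repo"))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return // without this return, the code below would write a second header
	}
	w.Header().Set("Content-Type", "image/svg+xml")
	w.WriteHeader(http.StatusOK)
	fmt.Fprint(w, svg)
}

// statusFor runs the handler against an in-memory recorder and
// returns the recorded status code.
func statusFor(target string) int {
	rec := httptest.NewRecorder()
	contributorsHandler(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/contributors"))            // error path
	fmt.Println(statusFor("/contributors?repo=gitea")) // success path
}
```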

Possible enhancements:

  • Stats are for the master branch (hardcoded), this can be easily changed and exposed as a UI control

Ideas for changes and improvements welcome - I'm excited so far! I fear the upcoming code review though :smile:

Sooo... here we go! Now it's the time for some external input, so please see below the images.

image

(gitea repo taken from GitHub)

image

Let me explain:

  • Users who are not in the Gitea users DB will be shown, but without a link to a profile, obviously. Stats are calculated by username (only "name" and "email" are available for each commit), which is why there are "Unknown", "Unknwon" and "无闻" here vs. only "Unknwon" on GitHub: the information that these are all the same user is lost when cloning/importing the repo. I guess that's the best option available, thoughts?

  • GitHub compiles stats per week, I went with daily stats. Should this be changed?

    That's the reason why the Y-Axis on GitHub ends at ~150 [commits per week] and the Gitea one at 52 [commits per day]. It also makes the chart on Gitea appear to have more "spikes" (interpolation isn't available either).

  • GitHub excludes merge commits from the stats, I didn't implement anything of this kind (and don't know how hard distinguishing one from a normal commit would be). Do we want this feature?

  • Do you wish a separate color for the per-user charts?

  • What else do you think can be improved?
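If weekly stats and merge-commit filtering were wanted, both could be sketched as below. The assumptions: weeks start on Sunday 00:00 UTC (which matches the sample "w" timestamps from GitHub quoted earlier in the thread), and a merge commit is recognised by having more than one parent. `CommitInfo`, `WeekStart` and `CommitsPerWeek` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// CommitInfo is a hypothetical, minimal view of a commit.
type CommitInfo struct {
	When    int64 // commit time, Unix timestamp
	Parents int   // number of parent commits; more than one means a merge
}

// WeekStart returns the Unix timestamp of the Sunday 00:00 UTC that
// begins the week containing the given time.
func WeekStart(unix int64) int64 {
	t := time.Unix(unix, 0).UTC()
	day := time.Date(t.Year(), t.Month(), t.Day(), 0, 0, 0, 0, time.UTC)
	return day.AddDate(0, 0, -int(day.Weekday())).Unix()
}

// CommitsPerWeek counts commits per week, optionally skipping merges.
func CommitsPerWeek(commits []CommitInfo, skipMerges bool) map[int64]int {
	counts := make(map[int64]int)
	for _, c := range commits {
		if skipMerges && c.Parents > 1 {
			continue
		}
		counts[WeekStart(c.When)]++
	}
	return counts
}

func main() {
	commits := []CommitInfo{
		{When: 1391904000 + 3600, Parents: 1},    // Sunday of the first week
		{When: 1391904000 + 3*86400, Parents: 2}, // merge, same week
		{When: 1392508800 + 60, Parents: 1},      // following week
	}
	fmt.Println(CommitsPerWeek(commits, false)[1391904000])
	fmt.Println(CommitsPerWeek(commits, true)[1391904000])
}
```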

Performance:

I fixed all of the issues noted in my last post, and I'm back to some performance issues. All stats from my dev machine:

The contributors page of the Gitea blog repo takes 1.1s to load, which is probably fine (_Page: 1090ms Template: 7ms_)

The one for the gitea main repo took 1min 14s and reports _Page: 74443ms Template: 47ms_. It has nine years of history and almost 7k commits, though.

Possible improvements: the gitea repo contributor page ends up with 602 user cards, I believe GitHub cuts off at 100. See https://github.com/go-gitea/gitea/graphs/contributors.

What do you think about that? As the charts become less useful with very few changes/commits because of the fixed Y-Axis, should we either show only the top X contributors or not generate charts for more than e.g. 100 contributors?

image

Since the whole commit history is traversed each time the page is visited, we can probably also improve the situation by caching the stats. No clue if that makes sense or what an implementation would look like.
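One way such a cache could look, purely as a sketch: keep the computed stats per repository together with the commit ID of the branch head, and only walk the history again when the head has changed. `Stats`, `StatsCache` and its methods are hypothetical names, and an in-memory map is an assumption (a real setup might want something persistent).

```go
package main

import (
	"fmt"
	"sync"
)

// Stats is a placeholder for whatever the page needs to render.
type Stats struct {
	Commits int
}

type cacheEntry struct {
	head  string
	stats Stats
}

// StatsCache remembers computed stats per repository for a given head commit.
type StatsCache struct {
	mu      sync.Mutex
	entries map[string]cacheEntry
}

// NewStatsCache returns an empty cache.
func NewStatsCache() *StatsCache {
	return &StatsCache{entries: make(map[string]cacheEntry)}
}

// Get returns cached stats if the branch head is unchanged; otherwise it
// calls compute, stores the result and returns it. For simplicity the lock
// is held while computing, so concurrent requests wait for each other.
func (c *StatsCache) Get(repo, head string, compute func() Stats) Stats {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[repo]; ok && e.head == head {
		return e.stats
	}
	s := compute()
	c.entries[repo] = cacheEntry{head: head, stats: s}
	return s
}

func main() {
	cache := NewStatsCache()
	calls := 0
	compute := func() Stats { calls++; return Stats{Commits: 7000} }
	cache.Get("go-gitea/gitea", "abc123", compute) // computed
	cache.Get("go-gitea/gitea", "abc123", compute) // served from the cache
	cache.Get("go-gitea/gitea", "def456", compute) // head moved, recomputed
	fmt.Println(calls)                             // prints 2
}
```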

I had to clear the ServiceWorker's cache for the changed CSS files to show up (normal cache refresh wouldn't work). What do I have to do here so it works OOTB?

More screenshots, click to expand

![image](https://user-images.githubusercontent.com/19366641/50845620-95f90a00-136d-11e9-94a1-dfcdbdcf8908.png) ![image](https://user-images.githubusercontent.com/19366641/50845863-28011280-136e-11e9-8a93-a194dde3115c.png) ![image](https://user-images.githubusercontent.com/19366641/50846330-2be16480-136f-11e9-8aad-e814157d045f.png) ![image](https://user-images.githubusercontent.com/19366641/50846507-9b575400-136f-11e9-9a6f-97f7a9ec28a1.png)

@linusg Great job!!! How about running the work as a cron job when the repository is big (i.e. over 1000 commits)? It could be run every one or more days, according to the configuration. I think the top 100 is enough; otherwise pagination would be better.
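The "top 100, otherwise pagination" idea could be sketched like this, assuming the contributor list is already sorted. `Page` is a hypothetical helper; the 602 entries in the example match the number of user cards reported for the gitea repo above.

```go
package main

import "fmt"

// Page returns the 1-based page of size perPage from an already sorted
// list, or nil if the page is out of range.
func Page(names []string, page, perPage int) []string {
	if page < 1 || perPage < 1 {
		return nil
	}
	start := (page - 1) * perPage
	if start >= len(names) {
		return nil
	}
	end := start + perPage
	if end > len(names) {
		end = len(names)
	}
	return names[start:end]
}

func main() {
	// 602 placeholder contributors.
	names := make([]string, 602)
	for i := range names {
		names[i] = fmt.Sprintf("user%d", i+1)
	}
	fmt.Println(len(Page(names, 1, 100))) // top 100 only
	fmt.Println(len(Page(names, 7, 100))) // last, partial page
}
```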

@linusg

> • Stats are for the master branch (hardcoded), this can be easily changed and exposed as an UI control

Maybe you can use the default branch option instead of creating another option.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

Nope, still working on this - someone might remove the stale label!

> Nope, still working on this

That's good news, hoping to see this feature coming soon!
Hyped to see charts and graphic data representation everywhere...

> That's good news, hoping to see this feature coming soon!

Soon :tm:

Time is definitely a problem for me... next to my basically non-existent knowledge of golang, haha.

I'm happy to see all the excitement (I'm excited too, wouldn't work on this otherwise!), but if you folks feel like this is not making progress fast enough (damn, it's been over half a year), I can make a PR with all the changes and someone else can help out?

> Soon ™️

😅

> but if you folks feel like this is not making progress fast enough

Naaah... who cares how much time you take for a feature, if it is a very good and working feature?

> I can make a PR with all the changes and someone else can help out?

IDK, maybe you can post your repo publicly here on GH so other people can send PRs to your repo and get it working, so that you can then open a PR against the official one and get it integrated.
Or you can open a PR for a new branch on the official repo, which skilled people with time to spare can fork, work on and send PRs to instead, while waiting for the branch to get merged into master, of course...

@linusg Please send a PR so that someone could maybe help you when you are absent.

Looking forward to this feature. Is it stale now? Thanks.

Any further news here?

Did a PR or branch ever get submitted here?

I think not. @linusg

I don't think I ever pushed my changes. Not even sure if I still have them - sorry!
