--animate log in chronological order of the commits #96

Closed
yarikoptic opened this issue Jul 10, 2023 · 9 comments
@yarikoptic

Originally brought up in #95 (comment) that the order of commits appearing in --animate is not necessarily chronological. In that example commits are actually happening in parallel on two branches -- the default (e.g. master) and supplemental (git-annex) managed by a tool (git-annex).

Note: there are both Committer and Author dates

❯ git log --format=fuller
commit 9fc5bfff5864d379093089b740b883c2017d1ef3 (HEAD -> master, remote/synced/master, synced/master)
Author:     Yaroslav Halchenko <[email protected]>
AuthorDate: Thu Jul 6 15:36:14 2023 -0400
Commit:     Yaroslav Halchenko <[email protected]>
CommitDate: Thu Jul 6 15:36:14 2023 -0400

    Adding file.dat

which might differ. So maybe it should be an option with a value, e.g. --chronological=(author|commit)
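
A minimal sketch of how such an option could select the sort key. The `--chronological` flag is the proposal from this issue, not an existing git-sim option, and the `Commit` record, `build_parser`, and `chronological` names are hypothetical:

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    author_ts: int  # AuthorDate as a Unix timestamp
    commit_ts: int  # CommitDate as a Unix timestamp


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flag proposed in this issue; git-sim does not define it.
    parser = argparse.ArgumentParser(prog="git-sim-sketch")
    parser.add_argument(
        "--chronological",
        choices=("author", "commit"),
        default=None,
        help="order commits by author date or by committer date",
    )
    return parser


def chronological(commits, mode):
    """Return commits oldest first, using the date selected by mode."""
    if mode == "author":
        return sorted(commits, key=lambda c: c.author_ts)
    return sorted(commits, key=lambda c: c.commit_ts)
```

The two modes can disagree, for instance after a rebase, which by default keeps the AuthorDate but sets a new CommitDate.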

@initialcommit-io
Contributor

Thanks for creating this. Ok yes I'll think about the author date vs commit date as well.

One other question - if commits are displayed chronologically, do you expect to see all commits displayed linearly (even if they exist on different branches)? Or do you expect multiple branches to be displayed in parallel in addition to that?

If commits are purely chronological we may want to rethink how the arrows are drawn, since by convention the arrows in the Git DAG represent parent/child relationships and not time. Thoughts?

@yarikoptic
Author

One other question - if commits are displayed chronologically, do you expect to see all commits displayed linearly (even if they exist on different branches)? Or do you expect multiple branches to be displayed in parallel in addition to that?

if I got the question right -- "in parallel" is the correct answer, if that is what --animate log does now, since IIRC it did the right thing.

If commits are purely chronological we may want to rethink how the arrows are drawn, since by convention the arrows in the Git DAG represent parent/child relationships and not time. Thoughts?

it could retain that semantic, no problem. Time would just define the order in which commits, with their arrows to parents, appear. IIRC ATM it goes backwards -- from the most recent commits to their parents and so on -- whereas I was thinking of going from the oldest to the newest (arrows could still point to parents), with commits appearing in chronological order across branches.

note 1: it would be nice if the order (but not necessarily the chronological distance) along the "horizontal axis" is preserved across branches, i.e. a later commit on branch Y comes after an earlier commit on branch X

note 2: Sure thing, the commit order within a branch could be incongruent with chronological order (the clock could be set backward so that a child commit is earlier than its parent) -- then I guess it is a matter of deciding how to handle it ;)
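
The two notes can be sketched roughly as follows, assuming commits are given as plain dictionaries (`timestamps` maps a sha to a Unix time, `parents` maps a sha to its parent shas). Both function names are made up for illustration and are not part of git-sim:

```python
def horizontal_slots(timestamps):
    """Note 1: give every commit a column index by global chronological rank.

    Only the order is preserved, not the time distance between commits.
    Ties are broken by sha so the result is deterministic.
    """
    ordered = sorted(timestamps, key=lambda sha: (timestamps[sha], sha))
    return {sha: slot for slot, sha in enumerate(ordered)}


def skewed_children(timestamps, parents):
    """Note 2: list commits whose timestamp is earlier than a parent's."""
    return sorted(
        sha
        for sha, parent_shas in parents.items()
        if any(timestamps[sha] < timestamps[p] for p in parent_shas)
    )
```

With slots assigned this way, a later commit on branch Y always lands to the right of an earlier commit on branch X, and any commit returned by `skewed_children` marks a place where that rule conflicts with the parent/child arrows.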

@initialcommit-io
Contributor

if I got the question right -- "in parallel" is the correct answer, if that is what --animate log does now, since IIRC it did the right thing.

I think we are understanding each other, but just in case, here are the 2 ways I'm thinking we could do this:

  1. I believe Git's default log output is in reverse chronological order without considering parent/child relationships at all. If a merge commit exists with multiple parents, it is still the timestamps that determine order of all ancestors, which means that commits from both branch histories are interleaved together into a single linear display. But that would be a boring output in git-sim.

  2. It sounds like you prefer a "parallel" setup where commits are still displayed based on timestamp, so if there is a merge commit, the 2 merged histories would display "in parallel" to each other instead of being interleaved. And from your "note 1", preferably to stagger the display so that chronological ordering of commits between branches makes sense (not sure how hard this would be to do based on current git-sim implementation, might be better to start without that).

For (2) I think we could start from the most recent commit timestamp and display commits one by one until we reach one with multiple parents. Then we can split the chain so it displays in parallel, and continue each chain using chronological ordering. Each time we find a commit with multiple parents, we split again the same way.
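
A rough sketch of that splitting idea, under the assumption that the history is available as two dictionaries (`parents`: sha to list of parent shas, `timestamps`: sha to Unix time). The `assign_lanes` name and the "lane" numbering are illustrative only and do not reflect how git-sim is implemented:

```python
import heapq


def assign_lanes(tip, parents, timestamps):
    """Walk from tip, newest first; return [(sha, lane), ...] in display order."""
    lanes = {tip: 0}
    next_lane = 1
    heap = [(-timestamps[tip], tip)]  # max-heap on timestamp via negation
    seen = {tip}
    order = []
    while heap:
        _, sha = heapq.heappop(heap)
        order.append((sha, lanes[sha]))
        for index, parent in enumerate(parents.get(sha, [])):
            if parent in seen:
                continue  # already placed through another child
            seen.add(parent)
            if index == 0:
                lanes[parent] = lanes[sha]  # first parent continues the chain
            else:
                lanes[parent] = next_lane  # extra parents split into a new lane
                next_lane += 1
            heapq.heappush(heap, (-timestamps[parent], parent))
    return order
```

The heap keeps the overall display newest-first across all open chains, while the lane number records which parallel chain each commit belongs to.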

note 2: Sure thing, the commit order within a branch could be incongruent with chronological order (the clock could be set backward so that a child commit is earlier than its parent) -- then I guess it is a matter of deciding how to handle it ;)

Even in Git's default log, I believe that if a child commit somehow has an earlier timestamp than its parent, it would display first in the log (by that I mean later if we're talking reverse-chronological order). So I think we could just match Git's default behavior and not handle this in a special way.

@yarikoptic
Author

  1. I believe Git's default log output is in reverse chronological order without considering parent/child relationships at all.

It does indeed seem to be the case unless I use the --graph option -- then it is "off":

this script, which also collects timestamps per the above hackery but with a 1 sec sleep added,
#!/bin/bash
# https://github.com/datalad/datalad/issues/7371
#
# A helper

# with 1 sec delay
export PS4='> $(date "+%Y-%m-%d %H:%M:%S.%N"): $(sleep 1)'; 
log="$(mktemp /tmp/sim-XXXXXXX)"
exec 2> "$log"

set -x

bash -x "$@"

echo "Commands with time stamps collected in $log"
when run on this one,
#!/bin/bash
# https://github.com/datalad/datalad/issues/7371
#

# do not overload so we could use it outside
# export PS4='> '
# set -x

set -eu

umask 022
cd "$(mktemp -d /tmp/dl-XXXXXXX)"

mkdir remote
(
cd remote
git init
git annex init
)

# Let's slow down now and have it each 1 second
# export PS4='> $(sleep 1)'

mkdir origin
(
cd origin;
git init  # creates main branch, no commit
git annex init  # creates git-annex branch with a commit, also modifies .git/config
echo big-data > file.dat
git annex add file.dat  # creates commit in git-annex branches
git commit -m "Adding file.dat"   # creates commit in main
git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat  # only updates git-annex branch if no content change
# For now no remote
git remote add --fetch remote ../remote  # no commits
git annex sync  # if remote had its own history for git-annex branch -- it would get merged
git annex copy --to=remote file.dat  # commit in git-annex branch updating availability information

# now another file
echo text > text.txt
git add text.txt && git commit -m 'Added text file'

git merge --always HEAD^^  # some fake merge arch
echo more-data > another.dat
git annex add another.dat && git commit -m 'Added another.dat'
)


(
cd origin
#git sim log --all
git sim --animate log --all
)

pwd

produces the following timestamps:

❯ grep '^> ' /tmp/sim-Y2ig55e
> 2023-07-11 10:54:22.163362936: bash -x try-git-sim-longer.sh
> 2023-07-11 10:54:23.169581137: set -eu
> 2023-07-11 10:54:24.173617666: umask 022
> 2023-07-11 10:54:26.184822800: cd /tmp/dl-NVlZEHb
> 2023-07-11 10:54:27.187974748: mkdir remote
> 2023-07-11 10:54:28.194193618: cd remote
> 2023-07-11 10:54:29.198498613: git init
> 2023-07-11 10:54:30.206625548: git annex init
> 2023-07-11 10:54:31.266475134: mkdir origin
> 2023-07-11 10:54:32.274253001: cd origin
> 2023-07-11 10:54:33.278706977: git init
> 2023-07-11 10:54:34.284746511: git annex init
> 2023-07-11 10:54:35.342567494: echo big-data
> 2023-07-11 10:54:36.345593040: git annex add file.dat
> 2023-07-11 10:54:37.395980583: git commit -m 'Adding file.dat'
> 2023-07-11 10:54:38.420397319: git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat
> 2023-07-11 10:54:39.582342733: git remote add --fetch remote ../remote
> 2023-07-11 10:54:40.601416618: git annex sync
> 2023-07-11 10:54:41.850178450: git annex copy --to=remote file.dat
> 2023-07-11 10:54:42.923002010: echo text
> 2023-07-11 10:54:43.926877279: git add text.txt
> 2023-07-11 10:54:44.955743605: git commit -m 'Added text file'
> 2023-07-11 10:54:45.975189719: git merge --always 'HEAD^^'
> 2023-07-11 10:54:46.982942231: echo 'Commands with time stamps collected in /tmp/sim-Y2ig55e'

and the following git log --all ... output in graph (not quite chronological) and non-graph (seems indeed chronological) modes:

❯ git -C /tmp/dl-NVlZEHb/origin log --all --graph --date=iso --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cD) %C(bold blue)[%an]%Creset' --abbrev-commit --date=relative
* bb686d3 - (HEAD -> master) Added text file (Tue, 11 Jul 2023 10:54:45 -0400) [Yaroslav Halchenko]
* 4105ad5 - (remote/synced/master, synced/master) Adding file.dat (Tue, 11 Jul 2023 10:54:38 -0400) [Yaroslav Halchenko]
* 14b4e1f - (git-annex) update (Tue, 11 Jul 2023 10:54:42 -0400) [Yaroslav Halchenko]
*   c67b1a8 - (remote/synced/git-annex, remote/git-annex) merging remote/git-annex into git-annex (Tue, 11 Jul 2023 10:54:41 -0400) [Yaroslav Halchenko]
|\  
| * 3d5a8a4 - update (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]
| * 72cb69f - branch created (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]
* 09eaefc - update (Tue, 11 Jul 2023 10:54:39 -0400) [Yaroslav Halchenko]
* 932e32f - update (Tue, 11 Jul 2023 10:54:37 -0400) [Yaroslav Halchenko]
* 36bc849 - update (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]
* 541f666 - branch created (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]

❯ git -C /tmp/dl-NVlZEHb/origin log --all  --date=iso --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cD) %C(bold blue)[%an]%Creset' --abbrev-commit --date=relative
bb686d3 - (HEAD -> master) Added text file (Tue, 11 Jul 2023 10:54:45 -0400) [Yaroslav Halchenko]
14b4e1f - (git-annex) update (Tue, 11 Jul 2023 10:54:42 -0400) [Yaroslav Halchenko]
c67b1a8 - (remote/synced/git-annex, remote/git-annex) merging remote/git-annex into git-annex (Tue, 11 Jul 2023 10:54:41 -0400) [Yaroslav Halchenko]
09eaefc - update (Tue, 11 Jul 2023 10:54:39 -0400) [Yaroslav Halchenko]
4105ad5 - (remote/synced/master, synced/master) Adding file.dat (Tue, 11 Jul 2023 10:54:38 -0400) [Yaroslav Halchenko]
932e32f - update (Tue, 11 Jul 2023 10:54:37 -0400) [Yaroslav Halchenko]
36bc849 - update (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]
541f666 - branch created (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]
3d5a8a4 - update (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]
72cb69f - branch created (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]

NB addition of that --date=iso seems to change nothing for me for %cd (I used %cD here though)...

@yarikoptic
Author

For (2) I think we could start from the most recent commit timestamp and display commits one by one until we reach one with multiple parents. Then we can split the chain so it displays in parallel, and continue each chain using chronological ordering. Each time we find a commit with multiple parents, we split again the same way.

but what about going in reverse order -- from oldest to newest?
I think it might be more useful in the many cases where people want to demonstrate how the repo actually evolved instead of digging back through its history.

@initialcommit-io
Contributor

and the following git log --all ... output in graph (not quite chronological) and non-graph (seems indeed chronological) modes:

Hmm, so with the graph why is it showing the second commit 4105ad5 with timestamp (Tue, 11 Jul 2023 10:54:38 -0400) more recently than 14b4e1f and c67b1a8? If that commit was an older one that was merged in I would think the graph would reflect that? But it looks like a part of the linear history so I would assume it would be sorted reverse chronologically? Seems I'm missing something there...

but what about going in reverse order -- from oldest to newest?

I actually had that feature in my original program git-story but for git-sim I realized it was a lot simpler to program and draw the Git history in reverse parent/child order because you can just "walk" down the relationships recursively until the desired number of commits are drawn. Going in chronological order actually requires a bit more complexity. It should be do-able, but might take some time due to all the subcommands implemented now.
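
For illustration only, the recursive "walk" described above might look roughly like this. It is a sketch over a plain `parents` dictionary (sha to list of parent shas), not git-sim's actual code, and `walk_newest_first` is a made-up name:

```python
def walk_newest_first(tip, parents, limit):
    """Depth-first walk from tip through parent links, stopping after limit commits."""
    drawn = []
    seen = set()

    def visit(sha):
        if sha in seen or len(drawn) >= limit:
            return
        seen.add(sha)
        drawn.append(sha)  # "draw" the commit, then recurse into its parents
        for parent in parents.get(sha, []):
            visit(parent)

    visit(tip)
    return drawn
```

Nothing here looks at timestamps, which is what makes it simple: the walk only needs the tip and the parent links. Going oldest-first instead requires knowing the whole set of commits up front.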

@yarikoptic
Author

Hmm, so with the graph why is it showing the second commit 4105ad5 with timestamp (Tue, 11 Jul 2023 10:54:38 -0400) more recently than 14b4e1f and c67b1a8? ...

if I got it right -- it just confirms my point that the order is not chronological with --graph. As for "why", we can only hypothesize: note that those commits come from different branches, so I guess in --graph mode git tries to group commits so that multiple commits from the same branch are listed near each other, while sacrificing some chronological order... I would not really worry about it much or rely on it. I would get the commits, sort them by precedence (child/parent relationship) first and chronologically as the 2nd factor (so commits from parallel branches could potentially interleave in this lineup), and then animate the "events" in that order, drawing across multiple "horizontals" to reflect the "branching" structure... Ignorant me doesn't know yet how I would do the actual graph "rendering" though, which might get tricky and is probably best accomplished with smth like graphviz.
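
One way to read "precedence first, chronological second" is a topological sort that breaks ties by timestamp. Below is a minimal sketch using Kahn's algorithm with a min-heap. It assumes every parent sha also appears as a key in `parents`, and `evolution_order` is a hypothetical name rather than anything in git-sim:

```python
import heapq


def evolution_order(parents, timestamps):
    """Return shas oldest first, never placing a child before its parents."""
    children = {sha: [] for sha in parents}
    pending = {}  # number of parents not yet emitted, per commit
    for sha, parent_shas in parents.items():
        pending[sha] = len(parent_shas)
        for parent in parent_shas:
            children[parent].append(sha)

    # Root commits are ready immediately; the heap yields the oldest ready one.
    ready = [(timestamps[sha], sha) for sha, count in pending.items() if count == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, sha = heapq.heappop(ready)
        order.append(sha)
        for child in children[sha]:
            pending[child] -= 1
            if pending[child] == 0:  # all parents emitted, child may appear
                heapq.heappush(ready, (timestamps[child], child))
    return order
```

Commits from parallel branches interleave by time, while a child whose clock ran backwards still appears after its parent. Without --graph, `git log --date-order --reverse` produces a comparable ordering.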

I messed around with ChatGPT a bit to see if it could give me something usable, but "we" didn't figure out how to enforce chronological order. FWIW here is the script
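#!/bin/bash
# Emit a Graphviz digraph of all commits, with one edge per parent -> child link.

echo 'digraph G {'
# tformat: terminates every line, so `read` does not drop the last commit.
git log --all --abbrev-commit --pretty=tformat:'%h [label="%h %s %ci",shape=ellipse];' --date-order |
while IFS=' ' read -r commit label; do
    echo "\"$commit\" $label"
    # %p prints abbreviated parent hashes, matching the %h node names.
    for parent in $(git log --pretty=%p -n 1 "$commit"); do
        echo "\"$parent\" -> \"$commit\";"
    done
done
echo '}'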

which, when run, produces smth like

[image: graph]

which isn't really good

@initialcommit-io
Contributor

initialcommit-io commented Jul 12, 2023

Hahaha nice, I like the idea of referring to yourself and ChatGPT as "we". While we're being honest I also used it to help decipher your shell commands 😸...

Anyway - ok that's true, we don't need to mimic Git's ordering exactly. One thing that's not too encouraging for this is that if you try and run git log --all --graph --reverse to get regular chronological order instead of reverse-chronological, it just gives a fatal error saying that's not possible lol:

$ git log --graph --all --reverse
fatal: options '--reverse' and '--graph' cannot be used together

@initialcommit-io initialcommit-io self-assigned this Apr 18, 2024
@initialcommit-io initialcommit-io added the enhancement New feature or request label Apr 18, 2024
@initialcommit-io
Copy link
Contributor

Closing without implementation as the effort to implement such a thing likely outweighs the demand from a user perspective. However, if more folks request this I will reconsider.
