Understanding how pgrep determines process IDs

This early termination is caused by code using a ps command piped to several grep/grep -v commands. The intent is to check if a process is already running and, if so, not to execute this script again.

Don't do that. It's a horrible method, all the more so when the 'grep' matches any substring of the ps output instead of the exact command line. For example, if you leave vim pgrep_test.sh open, the script will conclude that it is already running, because the editor's command line contains the script name.

There are better ways to make a single-instance script:

  • run the script as a systemd .service (either by making your cronjob call 'systemctl start' or by using a systemd .timer to invoke it), because systemd will not start a second instance of a service that is still running;

    [Service]
    Type=oneshot
    User=user1
    ExecStart=/tmp/inferencing/pgrep_test.sh
    
  • or use a lock file through flock, which uses kernel-based exclusive locking to guarantee a single instance.

    * * * * * user1 flock -n /tmp/inferencing/lock /tmp/inferencing/pgrep_test.sh
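
If you would rather take the lock inside the script than in the crontab line, a minimal sketch could look like the following. The lock file path and the choice of file descriptor 9 are arbitrary and only for illustration; they are not taken from the original script.

```shell
#!/bin/bash
# Hypothetical lock file location; use any path the script's user can write to.
lockfile=${TMPDIR:-/tmp}/pgrep_test.lock

# Open the lock file on file descriptor 9 (the file is created if it is missing).
exec 9>"$lockfile"

# Try to take an exclusive lock without waiting.
# flock -n fails immediately if another process already holds the lock.
if ! flock -n 9; then
    echo "another instance is already running" >&2
    exit 1
fi

# The script's real work goes here. The kernel drops the lock
# automatically when the process exits and descriptor 9 is closed.
```

Because the lock belongs to the open file descriptor rather than to the file's existence, a crashed script cannot leave a stale lock behind.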
    

What even is PID 12961? PPID 12957 is the /bin/bash call but these two commands are identical otherwise

It's the "subshell" that handles the command within $( ... ). Every time you use command substitution, bash spawns a child process to handle it. If a simple command is being substituted, that subshell process may directly 'exec' the command in-place (e.g. in the case of $(ps -ef)), but if a whole pipeline is being substituted, that won't necessarily happen.

While $$ always expands to the PID of the main shell process (i.e. its value is cloned when bash spawns subshells), you can use $BASHPID to get the real process ID of the current interpreter. For example:

$ echo $$, $BASHPID; ps $$ $BASHPID
208231, 208231
    PID TTY      STAT   TIME COMMAND
 208231 pts/3    Ss     0:00 bash

$ (echo $$, $BASHPID; ps $$ $BASHPID)
208231, 208287
    PID TTY      STAT   TIME COMMAND
 208231 pts/3    Ss     0:00 bash
 208287 pts/3    S+     0:00 bash

$ { echo $$, $BASHPID; ps $$ $BASHPID; }
208231, 208231
    PID TTY      STAT   TIME COMMAND
 208231 pts/3    Ss     0:00 bash

$ var=$(echo $$, $BASHPID; ps $$ $BASHPID); echo "$var"
208231, 208294
    PID TTY      STAT   TIME COMMAND
 208231 pts/3    Ss+    0:00 bash
 208294 pts/3    R+     0:00 ps 208231 208294

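The 2nd and 4th examples use subshells (another easy way to detect this is to notice that variables set within a subshell do not get propagated back into the main shell), while the 1st and 3rd don't. In the 4th example, PID 208294 shows up as ps itself rather than as bash because the subshell exec'd its last command in place.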
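
The variable test mentioned above can be sketched like this. The variable names are made up for the illustration; only the grouping constructs matter.

```shell
#!/bin/bash
count=0

# Parentheses run the commands in a subshell, so the assignment is lost afterwards.
( count=1 )
after_parens=$count

# Braces group commands in the current shell, so the assignment sticks.
{ count=2; }
after_braces=$count

# Command substitution is also a subshell: the assignment is visible
# inside it but does not reach the main shell.
out=$(count=3; echo "inside: $count")
after_subst=$count

echo "$after_parens $after_braces $after_subst $out"
# prints: 0 2 2 inside: 3
```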