Perforce for Git users? [closed]
There is a lot of "Git for Perforce users" documentation out there, but seemingly very little of the opposite.
I have only used Git previously and recently started a job where I have to use Perforce a lot, and find myself getting very confused a lot of the time. The concepts I'm used to from Git seem not to map to Perforce at all.
Is anyone interested in putting together a few tips for using Perforce for someone who is used to Git?
This is something I've been working on over the past couple of weeks, on and off. It's still evolving, but it may be helpful. Please note I'm a Perforce employee.
An intro to Perforce for Git users
To say that moving from Git to Perforce or from Perforce to Git is non-trivial is a grand understatement. For two tools that ostensibly do the same thing, their approaches could not be more different. This brief write-up will try to help new Perforce users coming from Git understand the new world they are in.
One brief detour before we dive in: if you prefer Git, you can use Git with Perforce quite well. We provide a tool called Git Fusion that generates Git repositories that are kept in sync with the Perforce server. Git and Perforce people can live in harmony working on the same code, mostly unaffected by their co-workers' choice of version control. Git Fusion 13.3 is available from the Perforce web site. It does need to be installed by the Perforce administrator, but if you install it you will find that its repository slicing feature can be quite handy as a Git user.
If you can't convince your admin to install Git Fusion, Git itself comes with a Perforce binding called Git-P4 that allows you to use Git to change and submit files in a Perforce workspace. More information on that can be found at: https://git.wiki.kernel.org/index.php/GitP4
Still here? Good, let's look at Perforce.
Some Terminology Differences to Sort Out
Before we get into the details we need to briefly cover a couple terminology differences between Git and Perforce.
The first is checkout. In Git this is how you get a copy of the code from a given branch into your working area. In Perforce we call this a sync from the command line, or "Get Latest Revision" from our GUI, P4V. Perforce uses the word checkout in P4V, or p4 edit from the command line, to mean that you plan to change a file from the version control system. In the rest of this document, I'll be using checkout in the Perforce sense of the word.
The second is Git commit versus Perforce submit. Where you would commit in Git you will submit in Perforce. Because all operations happen against the shared Perforce versioning service, Perforce doesn't have an equivalent for git push. Likewise we don't have a pull; the sync command from above takes care of getting files for us. There is no concept of a purely local submit in Perforce unless you choose to use our P4Sandbox tool described briefly below.
Key Concepts in Perforce
If I were to simplify Perforce to two key concepts I would focus on the depot and the workspace. A Perforce depot is a repository of files that lives in a Perforce server. A Perforce server can have any number of depots and each depot can contain any number of files. Frequently you will hear Perforce users use depot and server interchangeably, but they are different. A Perforce site may choose to have multiple servers, but most commonly all files are in one server.
A Perforce workspace or client is an object in the system that maps a set of files in the Perforce server to a location on a user's file system. Every user has a workspace for each machine they use, and frequently users will have more than one workspace for the same machine. The most important part of a workspace is the workspace mapping or view.
The workspace view specifies the set of files in the depot that should be mapped to the local machine. This is important because there is a good chance that you do not want all of the files that are available on the server. A workspace view lets you select just the set that you care about. It's important to note that a workspace can map content from multiple depots, but can only map content from one server.
To compare Perforce to Git in this regard, with Git you pick and choose the set of Git repos that you are interested in. Each repo is generally tightly scoped to contain just related files. The advantage of this is there is no configuration to do on your part; you do a git clone of the things you care about and you're done. This is especially nice if you only work with one or two repositories. With Perforce you need to spend a bit of time picking and choosing the bits of code you want.
Many Perforce shops use streams which can automatically generate a workspace view, or they generate the view using scripts or template workspaces. Equally many leave their users to generate their workspaces themselves. One advantage of being able to map a number of modules in one workspace is you can easily modify multiple code modules in one checkin; you can be guaranteed that anyone with a similar client view who syncs to your checkin will have all the code in the correct state. This can also lead to overly dependent code though; the forced separation of Git can lead to better modularity. Thankfully Perforce can also support strict modularity as well. It's all a question of how you choose to use the tool.
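As a rough illustration of generating a view by script, the sketch below prints a minimal workspace spec that maps a list of depot directories. The function name make_client_spec, the workspace name, and the depot directories are all hypothetical, and the sketch assumes everything lives under a depot called "depot"; check which spec fields your server requires before relying on it.

```shell
#!/bin/sh
# make_client_spec NAME ROOT DEPOT_DIR...
# Prints a minimal workspace spec whose view maps each depot directory
# to a directory of the same name under the workspace root.
# All names are illustrative; nothing here talks to a server.
make_client_spec() {
    name=$1
    root=$2
    shift 2
    printf 'Client: %s\n\n' "$name"
    printf 'Root: %s\n\n' "$root"
    printf 'View:\n'
    for dir in "$@"; do
        # Left side: depot path. Right side: path inside the workspace.
        printf '\t//depot/%s/... //%s/%s/...\n' "$dir" "$name" "$dir"
    done
}
```

The output is meant to be piped to p4 client -i, which reads a spec from standard input, for example: make_client_spec demo-workspace /Users/matt/work main dev | p4 client -i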
Why Workspaces?
I think in coming from Git it is easy to feel like the whole workspace concept is way more trouble than it is worth. Compared to cloning a few Git repos this is undoubtedly true. Where workspaces shine, and the reason Perforce is still in business after all these years, is that workspaces are a fantastic way to pare down multi-million file projects for developers while still making it easy for build and release to pull all the source together from one authoritative source. Workspaces are one of the key reasons Perforce can scale as well as it does.
Workspaces are also nice in that the layout of files in the depot and the layout on the user's machine can vary if need be. Many companies organize their depot to reflect the organization of their company so that it is easy for people to find content by business unit or project. However their build system couldn't care less about this hierarchy; the workspace allows them to remap their depot hierarchy in whatever way makes sense to their tools. I have also seen this used by companies who are using extremely inflexible build systems that require code to be in very specific configurations that are utterly confusing to humans. Workspaces allow these companies to have a source hierarchy that is human navigable while their build tools get the structure they need.
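To make the remapping concrete, here is a toy sketch of how a single view line translates a depot path into a local path. The helper depot_to_local and the example paths are hypothetical. Real views also support multiple lines, exclusions, and other wildcards, all of which this ignores; it only handles a trailing "..." on both sides.

```shell
#!/bin/sh
# depot_to_local DEPOT_SIDE CLIENT_SIDE ROOT DEPOT_PATH
# Translate one depot path to a local path using ONE view line of the
# form "//depot/dir/... //client/dir/...". Hypothetical helper.
depot_to_local() {
    depot_side=${1%...}      # e.g. "//depot/main/"
    client_side=${2%...}     # e.g. "//demo-workspace/main/"
    root=$3
    path=$4
    case $path in
        "$depot_side"*) ;;   # path is covered by this view line
        *) echo "not mapped: $path" >&2; return 1 ;;
    esac
    rest=${path#"$depot_side"}         # part matched by "..."
    client_rel=${client_side#//*/}     # drop the "//client-name/" prefix
    echo "$root/$client_rel$rest"
}
```

With the view line "//depot/games/engine/... //demo-workspace/src/...", a root of /home/build, and the depot path //depot/games/engine/core/a.c, the function prints /home/build/src/core/a.c, so the build sees a flat src directory regardless of how the depot is organized.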
Workspaces in Perforce are not only used to map the set of files a user wants to work with, but they are also used by the server to track exactly which revisions of each file the user has synced. This allows the system to send the correct set of files to the user when syncing without having to scan the files to see which files need to be updated. With large amounts of data this can be a sizable performance win. This is also very popular in industries that have very strict auditing rules; Perforce admins can easily track and log which developers have synced which files.
For more information on the full power of Perforce workspaces read Configuring P4.
Explicit Checkout vs. Implicit Checkout
One of the biggest challenges for users moving from Git to Perforce is the concept of explicit checkout. If you are accustomed to the Git/SVN/CVS workflow of changing files and then telling the version control system to look for what you've done, it can be an extremely painful transition.
The good news is that if you so choose you can work with a Git style workflow in Perforce. In Perforce you can set the "allwrite" option on your workspace. This will tell Perforce that all files should be written to disk with the writable bit set. You may then change any file you wish without explicitly telling Perforce. To see which files you changed you can run "p4 status"; to have Perforce reconcile those changes, run "p4 reconcile" (or "p4 status -A"), which will open files for add, edit, and delete as appropriate. When working this way you will want to use "p4 update" instead of "p4 sync" to get new revisions from the server; "p4 update" checks for changes before syncing, so it will not clobber your local changes if you haven't reconciled them yet.
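A small sketch of switching an existing workspace to allwrite from the command line. The helper name enable_allwrite is made up for illustration, and it assumes the spec's Options: line still contains the default noallwrite keyword.

```shell
#!/bin/sh
# enable_allwrite: filter a workspace spec on stdin, switching
# "noallwrite" to "allwrite" on the Options: line only.
# Hypothetical helper; it does not contact the server itself.
enable_allwrite() {
    sed '/^Options:/s/noallwrite/allwrite/'
}

# Typical use, assuming p4 is installed and P4CLIENT is set (not run here):
#   p4 client -o | enable_allwrite | p4 client -i
#   p4 status       # review what changed locally
#   p4 reconcile    # open changed files for add, edit, and delete
#   p4 update       # fetch new revisions without clobbering local edits
```

Restricting the substitution to the Options: line keeps the filter from touching a description or view that happens to contain the same word.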
Why Explicit Checkout?
A question I frequently receive is "why would you ever want to use explicit checkout?" It can at first blush seem to be a crazy design decision, but explicit checkout does have some powerful benefits.
One reason for using explicit checkout is it removes the need to scan files for content changes. While with smaller projects calculating hashes for each file to find differences is fairly cheap, many of our users have millions of files in a workspace and/or have files that are 100's of megabytes in size, if not larger. Calculating all the hashes in those cases is extremely time consuming. Explicit checkout lets Perforce know exactly which files it needs to work with. This behavior is one of the reasons Perforce is so popular in large file industries like the game, movie, and hardware industries.
Another benefit is explicit checkout provides a form of asynchronous communication that lets developers know generally what their peers are working on, or at least where. It can let you know that you may want to avoid working in a certain area so as to avoid a needless conflict, or it can alert you to the fact that a new developer on the team has wandered into code that perhaps they don't need to be editing. My personal experience is that I tend to work either in Git or using Perforce with allwrite on projects where I'm either the only contributor or an infrequent contributor, and explicit checkout when I'm working tightly with a team. Thankfully the choice is yours.
Explicit checkout also plays nicely with the Perforce concept of pending changelists. Pending changelists are buckets that you can put your open files into to organize your work. In Git you would potentially use different branches as buckets for organizing work. Branches are great, but sometimes it is nice to be able to organize your work into multiple named changes before actually submitting to the server. With the Perforce model of potentially mapping multiple branches or multiple projects into one workspace, pending changelists make it easy to keep separate changes organized.
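For anyone who wants to create a named pending changelist without going through the editor, here is a rough sketch. The helper names fill_description and change_number are hypothetical; they assume the form printed by p4 change -o contains the placeholder text "<enter description here>" and that p4 change -i replies with a line starting "Change N created".

```shell
#!/bin/sh
# fill_description TEXT: replace the placeholder in a new changelist
# form (read from stdin) with a real description.
# Limitation: TEXT must not contain "/" or "&" (it is used in sed).
fill_description() {
    sed "s/<enter description here>/$1/"
}

# change_number: pull N out of a "Change N created." reply on stdin.
change_number() {
    awk '$1 == "Change" { print $2; exit }'
}

# Typical use (not run here; assumes a configured p4 client):
#   n=$(p4 change -o | fill_description "Fix login bug" | p4 change -i | change_number)
#   p4 edit -c "$n" main/foo      # open a file directly in that changelist
#   p4 reopen -c "$n" main/bar    # move an already-open file into it
```

Note that the form from p4 change -o lists the files currently open in the default changelist, so they move into the new changelist unless you remove them from the form first.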
If you use an IDE for development such as Visual Studio or Eclipse, I highly recommend installing a Perforce plugin for your IDE. Most IDE plugins will automatically check out files when you start editing them, freeing you from having to do the checkout yourself.
Perforce Replacements For Git Features
- git stash ==> p4 shelve
- git local branching ==> either Perforce shelves or task branches
- git blame ==> p4 annotate or Perforce Timelapse View from the GUI
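The command pairs mentioned in this answer can be collected into a tiny lookup, sketched below. The function git_to_p4 is purely hypothetical, and the mapping is approximate since the two tools do not correspond one-to-one.

```shell
#!/bin/sh
# git_to_p4 GIT_COMMAND: print the rough Perforce counterpart named in
# this answer. Hypothetical helper, approximate mapping.
git_to_p4() {
    case $1 in
        checkout) echo "p4 sync" ;;       # getting files from the server
        pull)     echo "p4 sync" ;;       # sync covers fetching as well
        commit)   echo "p4 submit" ;;
        push)     echo "(none: p4 submit already goes to the server)" ;;
        stash)    echo "p4 shelve" ;;
        blame)    echo "p4 annotate" ;;
        *)        echo "unknown"; return 1 ;;
    esac
}
```

For example, git_to_p4 stash prints p4 shelve. Remember that "checkout" in Perforce's own vocabulary means p4 edit, which has no direct Git counterpart.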
Working Disconnected
There are two options for working disconnected from the Perforce versioning service (that's our fancy term for the Perforce server).
1) Use P4Sandbox to have full local versioning and local branching
2) Edit files as you please and use 'p4 status' to see what changed and 'p4 reconcile' to tell Perforce what you've done
With both the above options you can opt to use the "allwrite" setting in your workspace so that you do not have to unlock files. When working in this mode you will want to use the "p4 update" command to sync new files instead of "p4 sync". "p4 update" will check files for changes before syncing over them.
Perforce Quickstart
All the following examples will be via the command line.
1) Configure your connection to Perforce
export P4USER=matt
export P4CLIENT=demo-workspace
export P4PORT=perforce:1666
You can stick these settings in your shell config file, use p4 set to save them on Windows and OS X, or use a Perforce config file.
2) Create a workspace
p4 workspace
# in the resulting editor, set your root to where your files should live:
Root: /Users/matt/work
# and change your view to map the depot files you care about:
//depot/main/... //demo-workspace/main/...
//depot/dev/... //demo-workspace/dev/...
3) Get the files from the server
cd /Users/matt/work
p4 sync
4) Check out the file you want to work on and modify it
p4 edit main/foo
echo cake >> main/foo
5) Submit it to the server
p4 submit -d "A trivial edit"
6) Run p4 help simple to see the basic commands that you will need to work with Perforce.
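The quickstart mentions a Perforce config file as an alternative to exporting variables. A minimal sketch of that option follows. The helper write_p4config is hypothetical, and the file name .p4config is only a common convention: p4 looks for whatever file name the P4CONFIG variable holds, searching the current directory and its parents.

```shell
#!/bin/sh
# write_p4config DIR: write the quickstart's example settings to
# DIR/.p4config. Hypothetical helper; values are the examples above.
write_p4config() {
    cat > "$1/.p4config" <<'EOF'
P4USER=matt
P4CLIENT=demo-workspace
P4PORT=perforce:1666
EOF
}

# Typical use (not run here):
#   write_p4config /Users/matt/work
#   export P4CONFIG=.p4config
```

Keeping the settings in a per-workspace file is handy when you have several workspaces on one machine, since the right client is picked up from whichever directory you are in.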
The biggest difference between git and p4, which none of the existing answers address, is that they use different units of abstraction.
- In git, the abstraction is the patch (aka diff, aka changeset). A commit in git is essentially the output of running diff between the previous and current state of the files being committed.
- In perforce, the abstraction is the file. A commit in p4 is the full content of the files in the commit at that point in time. This is organised into a changelist, but the revisions themselves are stored on a per-file basis, and the changelist simply collects different revisions of the files together.
Everything else flows from this difference. Branching and merging in git is painless because, from the perspective of git's abstraction, every file can be fully reconstructed by applying a set of patches in order, and therefore to merge two branches, you just need to apply all the patches on the source branch that aren't present in the target branch to the target branch in the correct order (assuming there are no patches on both branches that overlap).
Perforce branches are different. A branch operation in perforce will copy files from one subfolder to another, and then mark the linkage between the files with metadata on the server. To merge a file from one branch to another (integration in perforce terms), perforce will look at the complete content of the file at the 'head' of the source branch and the complete content of the file at the head of the target branch and if necessary merge using a common ancestor. It is unable to apply patches one by one like git can, which means manual merges happen more often (and tend to be more painful).
There's probably not a lot of such documentation because Perforce is a pretty traditional revision control system (closer to CVS, Subversion, etc.) and normally is considered to be less complicated than modern distributed revision control systems.
Trying to map commands from one to the other is not the right approach; concepts from centralized vs. distributed revision control systems aren't the same. Instead, I'll describe a typical workflow in Perforce:
- Run p4 edit on each file you want to edit. You need to tell Perforce which files you're editing. If you're adding new files, use p4 add. If you're deleting files, use p4 delete.
- Make your code changes.
- Run p4 change to create a changeset. Here you can create a description of your change and optionally add or remove files from your changeset too. You can run p4 change CHANGE_NUMBER to edit the description later if necessary.
- You can make additional code changes if you need to. If you need to add/edit/delete other files, you can use p4 {add,edit,delete} -c CHANGE_NUMBER FILE.
- Run p4 sync to pull in the latest changes from the server.
- Run p4 resolve to resolve any conflicts from syncing.
- When you're ready to submit your change, run p4 submit -c CHANGE_NUMBER.
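The tail end of that workflow can be wrapped in a couple of shell functions, sketched below. The names p4_run, open_in_change, and finish_change are hypothetical, as is the DRY_RUN switch, which only prints the commands so the script can be tried without a server.

```shell
#!/bin/sh
# p4_run: run a p4 command, or only print it when DRY_RUN=1.
p4_run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "p4 $*"
    else
        p4 "$@"
    fi
}

# open_in_change CHANGE_NUMBER FILE...: open files for edit in an
# existing numbered changelist.
open_in_change() {
    change=$1
    shift
    p4_run edit -c "$change" "$@"
}

# finish_change CHANGE_NUMBER: sync, resolve, then submit. Each step
# runs only if the previous one succeeded.
finish_change() {
    p4_run sync &&
    p4_run resolve &&
    p4_run submit -c "$1"
}
```

With DRY_RUN=1 set, finish_change 1234 prints the three commands p4 sync, p4 resolve, and p4 submit -c 1234 in order. Without it, p4 resolve may prompt interactively, so this is a convenience wrapper rather than something to run unattended.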
You can use p4 revert to revert your changes to files.
Note that you can be working on multiple changesets simultaneously as long as none of their files overlap. (A file in your Perforce client can be open in only one changeset at a time.) This sometimes can be convenient if you have small, independent changes.
If you find yourself needing to edit files that you already have open in another changeset, you can either create a separate Perforce client or you can stash your existing changeset for later via p4 shelve. (Unlike git stash, shelving does not revert the files in your local tree, so you must revert them separately.)
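A git stash style round trip therefore takes two commands on the way out and one on the way back. The sketch below only prints them; the function names stash_change and unstash_change are hypothetical, and 1234 in the usage stands in for a real pending changelist number.

```shell
#!/bin/sh
# stash_change CHANGE_NUMBER: print the commands that shelve a pending
# changelist and then revert the local copies of its files.
stash_change() {
    echo "p4 shelve -c $1"
    echo "p4 revert -c $1 //..."
}

# unstash_change CHANGE_NUMBER: print the command that restores the
# shelved files into the same changelist.
unstash_change() {
    echo "p4 unshelve -s $1 -c $1"
}

# Typical use: review the output, then pipe it to sh to execute, e.g.
#   stash_change 1234 | sh
```

The shelved copies stay on the server after unshelving until you delete them with p4 shelve -d -c CHANGE_NUMBER.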