Best Practices for a Network File Share?
Solution 1:
We often counsel Customers to "scorch the earth" and start fresh.
I have yet to see a good solution that works that doesn't have non-IT stakeholders involved. The best scenario I've seen yet is a Customer whose management identified "stewards" of various data areas and delegated control of the AD groups that control access to those shared areas to those "stewards". That has worked really, really well, but it has required some training on the part of the "stewards".
Here's what I know doesn't work:
- Naming individual users in permissions. Use groups. Always. Every time. Without fail. Even if it's a group of one user, use a group (see the sketch after this list). Job roles change, turnover happens.
- Letting non-IT users alter permissions. You'll end up with "computer Vietnam" (the parties involved have "good" intentions, nobody can get out, and everybody loses).
- Having overly grandiose ideas about permissions. "We want users to be able to write files here but not modify files they've already written", etc. Keep things simple.
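To make the "groups, not users" point concrete, here's a minimal sketch of granting rights on a share folder to an AD security group using Windows' built-in icacls tool, wrapped in a little Python. The path and group name are hypothetical placeholders, not anything from your environment:

```python
# Sketch only: grant Modify rights on a share folder to an AD security
# group (never an individual user) via Windows' built-in icacls tool.
# The path and group name below are hypothetical placeholders.
import subprocess

SHARE_PATH = r"D:\Shares\Finance"            # hypothetical share folder
GROUP = r"EXAMPLE\Finance-Share-Modify"      # hypothetical AD security group

# (OI)(CI) = inherit to files and subfolders, M = Modify rights
subprocess.run(
    ["icacls", SHARE_PATH, "/grant", f"{GROUP}:(OI)(CI)M"],
    check=True,
)
```

When someone changes roles, you adjust group membership in AD and never have to touch the ACL again.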
Things that I've seen work (some well, others not-so-well):
- Publish a "map" indicating where various data types are to be stored, typically by functional area. This is a good place to do interviews with various departments and learn how they use file shares.
- Consider "back billing" for space usage or, at the very least, regularly publishing a "leader board" of the departmental space users.
- Did I mention naming groups exclusively in permissions?
- Develop a plan for data areas that "grow without bounds" to take old data "offline" or to "nearline" storage. If you allow data to grow forever it will, taking your backups with it to infinity.
- Plan on some kind of trending for space usage and folder growth. You can use commercial tools (someone mentioned TreeSize Professional or SpaceObServer from JAM Software) or you can code up something reasonably effective yourself with a "du" program and some scripting "glue" (a minimal example follows this list).
- Segment file shares based on "SLA". You might consider having both a "business-critical" share that crosses departmental lines, and a "nice to have running but not critical" share. The idea is to keep the "business-critical" share segregated for backup/restore/maintenance purposes. Having to take down the business to restore 2TB of files from backup, when all that was really needed to go about business was about 2GB of files, is a little silly (and I've seen it happen).
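For the home-grown trending approach, here is a minimal "du"-style sketch, assuming a share laid out with one top-level folder per department; the UNC path and CSV file name are placeholders. Schedule it and you get a growth history you can graph:

```python
# Sketch: "du"-style per-department totals appended to a CSV so you can
# trend growth over time. The share root and output file are placeholders.
import csv
import os
from datetime import date
from pathlib import Path

SHARE_ROOT = Path(r"\\fileserver\departments")   # hypothetical share root
REPORT = Path("share_growth.csv")                # hypothetical output file

def folder_size(path: Path) -> int:
    """Total bytes of all regular files beneath path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass   # skip files we can't stat (locked, deleted, no access)
    return total

with REPORT.open("a", newline="") as fh:
    writer = csv.writer(fh)
    for child in sorted(SHARE_ROOT.iterdir()):
        if child.is_dir():
            writer.writerow([date.today().isoformat(), child.name, folder_size(child)])
```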
Solution 2:
I agree with Evan that starting over is a good idea. I've done 4 "file migrations" over the years at my current company, and each time we set up a new structure and copied (some) files over, backed up the old shared files and took them offline.
One thing we did on our last migration might work for you. We had a somewhat similar situation with what we called our "Common" drive, which was a place where anyone could read/write/delete. Over the years, a lot of files accumulated there as people shared things across groups. When we moved to a new file server, we set up a new Common directory, but we didn't copy anything to it for the users. We left the old Common in place (and called it Old Common), made it read-only, and told everyone they had 30 days to copy anything they wanted to the new directories. After that, we hid the directory, but we would un-hide it on request. During this migration, we also worked with all the departments, created new shared directories, and helped people identify duplicates.
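If it helps, here's a rough sketch of the "make it read-only, then hide it" steps using icacls and attrib. The path and group names are placeholders and you'd run this as an administrator, so treat it as an outline rather than a drop-in script:

```python
# Sketch of the "Old Common" treatment: replace the folder's ACL with
# read-only access for the old user group (keeping admins in control),
# then hide the folder once the grace period is over.
# Path and group names are hypothetical placeholders.
import subprocess

OLD_COMMON = r"D:\Shares\OldCommon"     # hypothetical old share folder
USERS = r"EXAMPLE\Domain Users"         # hypothetical group that had access
ADMINS = r"BUILTIN\Administrators"

def make_read_only() -> None:
    # Drop inherited ACEs and grant read/execute only, inherited downward;
    # admins keep full control so the folder can still be managed.
    subprocess.run(
        ["icacls", OLD_COMMON, "/inheritance:r",
         "/grant", f"{USERS}:(OI)(CI)RX",
         "/grant", f"{ADMINS}:(OI)(CI)F"],
        check=True,
    )

def hide() -> None:
    # Set the hidden attribute after the 30-day window closes.
    subprocess.run(["attrib", "+h", OLD_COMMON], check=True)

make_read_only()
# ...30 days later, run hide() (and un-hide with "attrib -h" on request)...
```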
We've used Treesize for years for figuring out who's using disk space. We've tried Spacehound recently and some of my co-workers like it, but I keep going back to Treesize.
After our most recent migration, we tried setting up an Archive structure that people could use on their own, but it hasn't worked very well. People just don't have the time to keep track of what's active and what's not. We're looking at tools that could do the archiving automatically, and in our case it would work to periodically move all the files that haven't been touched for 6 months off to another share.
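If a commercial tool doesn't pan out, the periodic move itself is simple enough to script. A minimal sketch, assuming last-access times are reliable on your volume and using placeholder UNC paths:

```python
# Sketch of the periodic archive job: move any file whose last-access time
# is older than roughly six months to a parallel tree on an archive share.
# Paths are placeholders; NTFS last-access timestamps must be enabled on
# the volume for this to be meaningful.
import os
import shutil
import time
from pathlib import Path

LIVE_SHARE = Path(r"\\fileserver\common")        # hypothetical live share
ARCHIVE_SHARE = Path(r"\\archiveserver\common")  # hypothetical archive share
SIX_MONTHS = 182 * 24 * 3600                     # roughly six months, in seconds

cutoff = time.time() - SIX_MONTHS
for root, _dirs, files in os.walk(LIVE_SHARE):
    for name in files:
        src = Path(root) / name
        try:
            if src.stat().st_atime < cutoff:
                dest = ARCHIVE_SHARE / src.relative_to(LIVE_SHARE)
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(src), str(dest))
        except OSError:
            pass   # skip files that are locked or already gone
```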
Solution 3:
At 3TB you probably have a lot of huge unnecessary files and duplicated junk in there. One useful method I've found is to do searches, starting with files > 100MB (I might even go up to 500MB in your case), then working the threshold down. It makes the job of finding the real space wasters more manageable.
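A minimal sketch of that kind of search, assuming a placeholder share path and a 500MB starting threshold that you'd then walk down:

```python
# Sketch: list every file over a size threshold, largest first, to find
# the real space wasters. Start high (say 500 MB) and work downward.
# The share root is a placeholder.
import os
from pathlib import Path

SHARE_ROOT = Path(r"\\fileserver\shared")   # hypothetical share root
THRESHOLD = 500 * 1024 * 1024               # 500 MB

big_files = []
for root, _dirs, files in os.walk(SHARE_ROOT):
    for name in files:
        path = os.path.join(root, name)
        try:
            size = os.path.getsize(path)
        except OSError:
            continue
        if size >= THRESHOLD:
            big_files.append((size, path))

for size, path in sorted(big_files, reverse=True):
    print(f"{size / 1024**2:10.1f} MB  {path}")
```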
Solution 4:
My first order of business would be to use an enterprise file manager/analyzer/reporter/whatever-you-want-to-call-it such as TreeSize Professional or SpaceObServer. You can see what files are where, and sort by creation date, access date, and a host of other criteria, including statistics on file types and owners. SpaceObServer can scan various file systems, including remote Linux/UNIX systems via an SSH connection. That can give you great visibility into your collection of files. From there, you can "Divide and Conquer".
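If you want a quick preview of the kind of breakdown those tools give you before buying anything, here's a rough sketch that totals space by file extension (the share path is a placeholder):

```python
# Sketch: a one-dimensional version of the file-type statistics an
# enterprise analyzer reports -- total bytes per file extension --
# so you can see which file types dominate the share.
import os
from collections import Counter
from pathlib import Path

SHARE_ROOT = Path(r"\\fileserver\shared")   # hypothetical share root
by_extension = Counter()

for root, _dirs, files in os.walk(SHARE_ROOT):
    for name in files:
        ext = os.path.splitext(name)[1].lower() or "(none)"
        try:
            by_extension[ext] += os.path.getsize(os.path.join(root, name))
        except OSError:
            continue

for ext, total in by_extension.most_common(20):
    print(f"{ext:10} {total / 1024**3:8.2f} GB")
```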
Solution 5:
You might want to consider just blanket-archiving anything more than six months old to another share, and watching for file accesses on that share. Files that are consistently accessed could be moved back to the primary server.
Another option is something like the Google Search Appliance. That way you can let Google's app smartly figure out what people are looking for when they search for things and it will "archive" by putting less-accessed documents further down on the search page.