Usually my work is fun, as it involves thinking about interesting problems. This is one of the great joys of being a law professor. But much of my work has not been fun these past ten days or so. The law school is doing renovations on my floor over the summer, and in service of this project all my file cabinets were due to be removed last Friday. (See The Paperless Office (Like it Or Not).) I spent most of last week doing triage and tagging with the help of one very game research assistant. In the end I threw away about half of what was in them, filling several recycle bins. It was a very nostalgic process, especially regarding my earliest cryptography, e-cash, and internet work: I had a chance to interact with some amazing people, far too many of whom I have lost track of since. But back to the files. I marked about 40% of the bulk for temporary storage. I took the rest, twelve banker’s boxes, home and am now trying to figure out where to put them. That takes some tidying and reorganization too.
There is, however, one aspect of this problem that does require at least a little thought: I’ve also been trying to create a schema for a paperless filing system, since the tentative plan is that the file cabinets are not coming back. As one does, I spent some time on Google looking for advice. Most of it was unhelpful, as it involved spending large sums of money on various proprietary document management tools. But I did find a few pieces of what looked like useful advice, especially Exadox’s Folder and File Naming Convention – 10 Rules for Best Practice. They wanted me to buy something too, but never mind that.
So here are the rules I’ve distilled for my use:
- Every document MUST have a descriptive name.
- Use short and simple folder names and folder structures and focus on using long and information-rich filenames.
- Use the underscore (_) as element delimiter. Use the hyphen (-) to delimit words within an element – NOT spaces. E.g. Smith-John_recommendation-letter.pdf
- Elements should be ordered from general to specific detail of importance as much as possible. The order of importance rule holds true when elements include date and time stamps. Dates should be ordered: YEAR, MONTH, DAY. (e.g. YYMMDD)
- Personal names within an element should have family name first followed by first names or initials.
- Documents with multiple versions (or likely to have multiple versions) should have as their FINAL element _V followed by at least 2 digits. To distinguish between working drafts (i.e. minor revisions) use the Vx-01 to Vx-99 range, and for final drafts (i.e. major version releases) use V1-00 to V9-xx (where x = 0-9). If you go beyond nine major versions, use the alphabet, i.e. VA-01, VB-01, etc.
- Aim for a flat file structure: avoid the temptation to create layers and layers of sub-folders, as things that are not visible will get lost.
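For anyone who would rather script these rules than apply them by hand, here is a minimal Python sketch of how they might fit together. Everything in it is an assumption made for illustration: the function names (`slug`, `person`, `date_element`, `version_tag`, `build_filename`) are hypothetical, the date uses the YYMMDD form from the rule above (changing `%y` to `%Y` gives a four-digit year), and mapping major versions 10 and up to the letters A, B, and so on is only one reading of the "use the alphabet" rule.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Join the words of one element with hyphens (no spaces or underscores)."""
    return "-".join(re.findall(r"[A-Za-z0-9]+", text))


def person(family: str, given: str) -> str:
    """Personal names: family name first, then first names or initials."""
    return f"{slug(family)}-{slug(given)}"


def date_element(d: date) -> str:
    """Dates ordered YEAR, MONTH, DAY; YYMMDD here (assumption: two-digit year)."""
    return d.strftime("%y%m%d")


def version_tag(major: int, minor: int) -> str:
    """Build tags such as V0-03 (working draft) or V1-00 (major release)."""
    if not 0 <= minor <= 99:
        raise ValueError("minor version must be between 0 and 99")
    if 0 <= major <= 9:
        head = str(major)
    elif 10 <= major <= 35:
        # Assumption: major version 10 becomes A, 11 becomes B, and so on.
        head = chr(ord("A") + major - 10)
    else:
        raise ValueError("major version must be between 0 and 35")
    return f"V{head}-{minor:02d}"


def build_filename(elements, ext, version=None):
    """Join elements (general to specific) with underscores; version goes last."""
    parts = [slug(e) for e in elements]
    if not parts or not all(parts):
        raise ValueError("every document must have a descriptive name")
    if version is not None:
        parts.append(version_tag(*version))
    return "_".join(parts) + "." + ext.lstrip(".")


if __name__ == "__main__":
    print(build_filename([person("Smith", "John"), "recommendation letter"], "pdf"))
    # A made-up versioned document with a made-up date:
    print(build_filename(
        ["syllabus draft", date_element(date(2014, 5, 23))], "docx", version=(1, 0)
    ))
```

The first call reproduces the Smith-John_recommendation-letter.pdf example from the rules; the second shows a date element followed by the version tag as the final element.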
I’ve also defined an initial file structure based on asking my secretary for her ideas, and then adding my own ideas based on some time looking through the things found in the drive space she and I share. (I also have my own private space that could really stand to be organized better, but one thing at a time.) The top level of this structure divides the world into Admin, Courses, Research, and Projects. The admin folder has ten sub-folders, only a few of which have sub-folders of their own. None of the other three major sections currently has sub-sub-folders. So it is a very flat structure.
There is one good aspect of all this, however. I think that grading is going to seem joyous by comparison.
You really want YYYYMMDD for dates, unless you don’t have anything from before 2000.
I’ll recommend getting a scanner that does OCR, for example the sheet-feeding Fujitsu ScanSnap or a Paperless. I’ve been going digital, and that means taking old, beloved but unused documents and scanning them in, then destroying them. I agree that good filenames are important, but the OCR means I can use Spotlight to find things. If only I had a version that did a thesaurus-augmented search. I’m not sure of the equivalent in the Windows world, but there are companies like the one making DevonThink that do indexing on a number of platforms.
Oh yeah, back your stuff up, back your stuff up, and keep a copy offsite. Luckily, you can fit gazillions of pages on a $100 TB drive.
There are large Xerox scanners at work, so that may be covered. For home use, the ScanSnap has tempted me for some time even though it is a little pricey. It doesn’t have a TWAIN driver, which is a little off-putting as it means I would not be able to get rid of my Canon flatbed, and there’s only so much gadget real estate in my study.
What is a “Paperless”?
Copernic does adequate disk searches, across file types, although I find that every new version seems worse than its predecessor.