Directory Structure
Ed Grochowski
Written 6-8-2020
Introduction
Computer users have the task of organizing a large amount of
information. Modern computers have so much storage capacity that the
ability to find information poses a greater challenge than the ability
to hold information.
This article describes the approaches that I use to organize my
collection of computer files.
Organize by Date
In determining a suitable directory structure, the first question to ask
is "Is the amount of information growing over time?" In many cases it
is: new information is periodically added to an existing archive.
In my experience, information that grows over time is best organized
chronologically. For example, income tax returns grow at a constant
rate of one set per year. These are ideal for organizing into folders
with one folder per year. Knowing the year of the tax return directly
leads to the right folder.
Other examples of information that grows over time include posts to
bulletin boards and digital photos. These are organized into folders by
year, optionally with sub-folders named for specific dates.
Names of the form YYYY-MM-DD can be sorted chronologically even
if the original timestamps are lost.
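The sorting property can be illustrated with a minimal Python sketch. The helper name dated_folder and the sample dates are assumptions made for illustration; the only thing taken from the text is the YYYY/YYYY-MM-DD layout.

```python
from datetime import date
from pathlib import PurePosixPath


def dated_folder(day: date) -> PurePosixPath:
    """Return the relative folder YYYY/YYYY-MM-DD for a given day.

    Hypothetical helper; the layout follows the year/date scheme above.
    """
    return PurePosixPath(f"{day.year:04d}") / day.isoformat()


# Sample folder names, deliberately out of order (illustrative data only).
names = ["2019-12-31", "2020-06-08", "2020-01-15", "2011-03-02"]

# Because every field is zero-padded and ordered from most to least
# significant, a plain string sort gives the same order as a date sort.
by_name = sorted(names)
by_date = [d.isoformat() for d in sorted(date.fromisoformat(n) for n in names)]

print(by_name == by_date)
print(dated_folder(date(2020, 6, 8)))
```

The comparison works only because the names are zero-padded; a name such as 2020-6-8 would not sort correctly as text.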
With chronological organization, information is retrieved according to
the date at which a post was made or a photo was taken. Even knowing
an approximate date greatly narrows the search space in archives
spanning a decade or more.
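As a rough sketch of how an approximate date narrows the search, the following Python function keeps only the dated folders that fall within a window around a guess. The function name folders_near, the 30-day default window, and the sample folder names are all assumptions for illustration.

```python
from datetime import date, timedelta


def folders_near(folder_names, approx, window_days=30):
    """Return the YYYY-MM-DD folder names within window_days of approx.

    Hypothetical helper: relies on zero-padded names comparing
    correctly as plain strings.
    """
    lo = (approx - timedelta(days=window_days)).isoformat()
    hi = (approx + timedelta(days=window_days)).isoformat()
    return sorted(name for name in folder_names if lo <= name <= hi)


# Illustrative archive spanning several years.
folders = ["2012-07-04", "2019-12-31", "2020-05-20", "2020-06-08", "2020-09-01"]

# "Sometime around the start of June 2020" leaves two candidates.
nearby = folders_near(folders, date(2020, 6, 1))
print(nearby)
```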
Organize by Subject
Information that does not grow in a periodic manner is best organized
hierarchically by subject. For example, source code is organized into
suite/project/{doc,icons,src,xml}/file. Text documents are organized
according to subject/name. Schematics and 3D models are similarly
organized.
Music is naturally organized into genre/artist_album/track.
Hybrids
The two approaches can be combined.
Email threads and invoices are organized first by the year of their
earliest date and then by correspondent or company. This makes it easy to follow a
conversation or transaction. Knowing an approximate date and the
correspondent allows emails to be found easily.
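The hybrid scheme can be sketched in Python as two small path builders. The function names and the sample correspondent and company are hypothetical; the resulting layouts (YYYY/Correspondent and YYYY/YYYY-MM-DD-Company) follow the summary table in the Conclusion.

```python
from datetime import date
from pathlib import PurePosixPath


def email_thread_path(message_dates, correspondent):
    """File an email thread under the year of its earliest message."""
    earliest = min(message_dates)
    return PurePosixPath(f"{earliest.year:04d}") / correspondent


def invoice_path(invoice_date, company):
    """File an invoice under its year, named YYYY-MM-DD-Company."""
    name = f"{invoice_date.isoformat()}-{company}"
    return PurePosixPath(f"{invoice_date.year:04d}") / name


# A thread that starts in late 2019 and continues into 2020 stays in 2019,
# so the whole conversation is kept in one place.
thread = email_thread_path([date(2020, 1, 3), date(2019, 12, 30)], "Alice")
invoice = invoice_path(date(2020, 6, 8), "Acme")
print(thread)
print(invoice)
```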
My preferred file format is plain text. Plain text facilitates the use
of the Unix grep command and ensures that files will remain
readable for the foreseeable future.
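For readers without grep at hand, a comparable recursive search over plain text files can be sketched in Python. This is an illustration rather than a replacement for grep: the function name grep_tree, the .txt suffix filter, and the sample files created in a temporary directory are all assumptions.

```python
import re
import tempfile
from pathlib import Path


def grep_tree(root, pattern, suffix=".txt"):
    """Yield (path, line_number, line) for each matching line under root."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        # errors="replace" keeps the search going past stray bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")


# Demonstration on throwaway files so that no real archive is touched.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root) / "2020"
    folder.mkdir()
    (folder / "Thread_1.txt").write_text(
        "first line\nabout directory structure\n", encoding="utf-8"
    )
    (folder / "Thread_2.txt").write_text(
        "nothing relevant here\n", encoding="utf-8"
    )
    hits = [(p.name, n, text) for p, n, text in grep_tree(root, r"directory")]

print(hits)
```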
Conclusion
The policies are summarized in the following table.
Key     | Class           | Folders            | Files
Date    | Income Taxes    | YYYY               | Form
Date    | Bulletin Boards | YYYY               | Thread_page.txt
Date    | Photos          | YYYY/YYYY-MM-DD    | IMAGE_NNNN.JPG
Subject | Source Code     | Suite/Project/Type | File
Subject | Text Documents  | Subject            | Name
Subject | Music           | Genre/Artist_Album | Track
Hybrid  | Emails          | YYYY               | Correspondent
Hybrid  | Invoices        | YYYY               | YYYY-MM-DD-Company
Even among decades of stored information, I can easily find the file
that I am looking for.