Ed Grochowski's Website

Directory Structure

Ed Grochowski

Written 6-8-2020

Introduction

Computer users face the task of organizing large amounts of information. Modern computers have so much storage capacity that finding information is now a greater challenge than storing it.

This article describes the approaches that I use to organize my collection of computer files.


Organize by Date

In determining a suitable directory structure, the first question to ask is "Is the amount of information growing over time?" In many cases, new information is periodically added to an existing archive, so the archive keeps growing.

In my experience, information that grows over time is best organized chronologically. For example, income tax returns grow at a constant rate of one set per year. These are ideal for organizing into folders with one folder per year. Knowing the year of the tax return directly leads to the right folder.

Other examples of information that grows over time include posts to bulletin boards and digital photos. These are organized into folders by year, with optional sub-folders named for specific dates. Names of the form YYYY-MM-DD sort chronologically even if the original timestamps are lost.
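As an illustration of the naming scheme, here is a minimal Python sketch that builds a YYYY/YYYY-MM-DD folder name from a date and shows that plain alphabetical sorting puts the names in chronological order. The function name photo_folder and the sample dates are hypothetical; they are not taken from my actual archive.

```python
from datetime import date
from pathlib import PurePosixPath


def photo_folder(taken: date) -> PurePosixPath:
    """Return the folder for a photo as YYYY/YYYY-MM-DD."""
    # isoformat() yields zero-padded YYYY-MM-DD, so names sort like dates.
    return PurePosixPath(f"{taken.year:04d}") / taken.isoformat()


# Hypothetical dates, deliberately out of order.
dates = [date(2020, 6, 8), date(2019, 12, 31), date(2020, 1, 15)]

# An ordinary string sort is enough; no timestamps are consulted.
folders = sorted(str(photo_folder(d)) for d in dates)
print(folders)
```

The zero padding is what makes this work: a name such as 2020-6-8 would sort after 2020-10-01.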

With chronological organization, information is retrieved according to the date at which a post was made or a photo was taken. Even knowing an approximate date greatly narrows the search space in archives spanning a decade or more.
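Retrieval by approximate date can be sketched in a few lines of Python. The function folders_near, its window_days parameter, and the sample folder names are assumptions made for this example; the idea is only that a rough date plus a tolerance selects a handful of folders out of many years.

```python
from datetime import date, timedelta


def folders_near(names, approx, window_days=30):
    """Keep the YYYY-MM-DD folder names within window_days of approx."""
    lo = (approx - timedelta(days=window_days)).isoformat()
    hi = (approx + timedelta(days=window_days)).isoformat()
    # Zero-padded ISO dates compare correctly as plain strings.
    return sorted(n for n in names if lo <= n <= hi)


# Hypothetical folder names spanning several years.
names = ["2012-03-04", "2015-07-19", "2015-08-02", "2019-11-30"]
print(folders_near(names, date(2015, 7, 25)))
```

Widening window_days trades precision for recall when the remembered date is vague.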


Organize by Subject

Information that does not grow in a periodic manner is best organized hierarchically by subject. For example, source code is organized into suite/project/{doc,icons,src,xml}/file. Text documents are organized according to subject/name. Schematics and 3D models are similarly organized.

Music is naturally organized into genre/artist_album/track.


Hybrids

The two approaches can be combined.

Email threads and invoices are first organized by earliest date and then by correspondent or company. This provides an easy means to follow a conversation or transaction. Knowing an approximate date and the correspondent allows emails to be found easily.
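For invoices, the hybrid scheme amounts to a year folder containing files named YYYY-MM-DD-Company, as listed in the summary table at the end of this article. A minimal Python sketch follows; the function name invoice_path and the company name "Acme" are hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def invoice_path(issued: date, company: str) -> PurePosixPath:
    """Return YYYY/YYYY-MM-DD-Company for an invoice."""
    # Date first (chronological key), then company (subject key).
    return PurePosixPath(f"{issued.year:04d}") / f"{issued.isoformat()}-{company}"


print(invoice_path(date(2020, 6, 8), "Acme"))
```

Putting the date before the company keeps each year's invoices in chronological order, while the company name remains visible in a directory listing.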

My preferred file format is plain text. Plain text facilitates the use of the Unix grep command and ensures that files will remain readable for the foreseeable future.
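In practice I use grep itself, but the same kind of search is easy to sketch in Python, which shows why plain text is convenient: any tool can read it line by line. The function name grep_text and the default "*.txt" pattern are assumptions for this example, and the sketch is not a substitute for grep.

```python
import re
from pathlib import Path


def grep_text(root, pattern, glob="*.txt"):
    """Yield (path, line_number, line) for lines matching pattern."""
    regex = re.compile(pattern)
    # rglob walks root and all of its sub-folders.
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going past stray non-UTF-8 bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

A call such as grep_text("archive", r"invoice") would list every matching line in the text files under a hypothetical folder named archive.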


Conclusion

The policies are summarized in the following table.

Key      Class            Folders             Files
Date     Income Taxes     YYYY                Form
Date     Bulletin Boards  YYYY                Thread_page.txt
Date     Photos           YYYY/YYYY-MM-DD     IMAGE_NNNN.JPG
Subject  Source Code      Suite/Project/Type  File
Subject  Text Documents   Subject             Name
Subject  Music            Genre/Artist_Album  Track
Hybrid   Emails           YYYY                Correspondent
Hybrid   Invoices         YYYY                YYYY-MM-DD-Company

Even among decades of stored information, I can easily find the file that I am looking for.