New Upload Format, *, for Scribe-style Uploads

Hank says:

I’d like to provide some information about a new file format to some of you who have been involved with uploading already-digitized materials to the Archive. (Please share this message with anyone I didn’t include and should have.)

You may be familiar with (and may be using) our existing and _jp2.tar files. Making these from your own existing images is inconvenient and error-prone, due to the rigid expectations for individual image filenames and directory structure.

The new format is much more flexible. If you provide a file whose name ends in, we’ll make a _jp2 archive from it: your archive will be unpacked, its contents sorted alphabetically (and any subdirectories flattened), and the set of images found within converted into a standard _jp2 archive, which we’ll then process as usual.

In a bit more detail, the archive will be scanned for files it contains, at any directory level, whose names end with .jp2, .jpg, .tif, or .png, matched case-insensitively; any other files (.xml, .txt, etc.) will be ignored. You can mix and match different image formats. All image files found will be sorted alphabetically by their full path (including any directory names, so that files originally in the same directory stay together in the new sequence), converted to jp2 if they’re not already, renamed the way our code expects, and packed into a new _jp2 archive, leaving your original archive in place as it was.
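The selection and ordering rules above can be sketched in a few lines of Python. This is only an illustration, not the Archive's actual code: the function name select_images is hypothetical, the input is assumed to be a plain list of member paths from the uploaded archive, and "sorted alphabetically" is assumed to mean an ordinary string sort of the full path.

```python
# Hypothetical sketch of the scan-and-sort step; not the Archive's own code.
IMAGE_SUFFIXES = (".jp2", ".jpg", ".tif", ".png")


def select_images(member_names):
    """Return the image paths from an archive listing, in page order."""
    images = [
        name
        for name in member_names
        # Case-insensitive suffix match; .xml, .txt, and directory
        # entries (which end in "/") fall through and are ignored.
        if name.lower().endswith(IMAGE_SUFFIXES)
    ]
    # Assumption: a plain string sort of the full path. Because the
    # directory name is part of the path, files from the same
    # directory stay next to each other.
    return sorted(images)
```

Given a listing that mixes directories, images, and metadata files, only the image paths come back, ordered by directory and then by filename.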

For an example of how messy an archive we can deal with, see:

listing from
	767010/	01-06-10 13:18	0
	767010/76701057/	01-06-10 06:59	0
	767010/76701057/00000001.jpg	01-06-10 06:59	268802
	767010/76701061/	01-06-10 07:00	0
	767010/76701061/00000001.jpg	01-06-10 07:00	292476
	767010/76701067/	01-06-10 07:01	0
	767010/76701067/00000001.jpg	01-06-10 07:01	230612
	767010/76701068/	01-06-10 07:02	0
	767010/76701068/00000001.jpg	01-06-10 07:02	235011
	767010/76701069/	01-06-10 07:05	0
	767010/76701069/00000001.jpg	01-06-10 07:05	281997

The 589 image files found there were converted into:

listing from
	hr100106_jp2/	02-22-11 05:31	0
	hr100106_jp2/hr100106_0000.jp2	(JPG)	02-22-11 05:30	143845
	hr100106_jp2/hr100106_0001.jp2	(JPG)	02-22-11 05:30	191348
	hr100106_jp2/hr100106_0002.jp2	(JPG)	02-22-11 05:30	93923
	hr100106_jp2/hr100106_0003.jp2	(JPG)	02-22-11 05:30	100340
	hr100106_jp2/hr100106_0004.jp2	(JPG)	02-22-11 05:30	164196
	hr100106_jp2/hr100106_0005.jp2	(JPG)	02-22-11 05:30	169330

Note that the new _jp2 archive, and the files it contains, are named according to the name of the original file (“hr100106”), regardless of how directories and files are named inside it. Those files and directories can be named any way you like; the names matter only in that they determine the sequence of the images in the new archive.
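As a rough illustration of that naming rule, the sketch below pairs each source image with the name it would receive. The pattern (a directory named after the original file plus "_jp2", and a zero-based, four-digit sequence number) is inferred only from the example listing above; the function name plan_output_names is hypothetical and this is not the Archive's actual code.

```python
# Hypothetical sketch; the naming pattern is inferred from the listing above.
def plan_output_names(stem, sorted_images):
    """Pair each source image path with its assumed output name.

    stem is the name of the original file, e.g. "hr100106".
    sorted_images must already be in the desired page order.
    """
    directory = f"{stem}_jp2"
    return [
        # Source names are discarded; only their position matters.
        (source, f"{directory}/{stem}_{index:04d}.jp2")
        for index, source in enumerate(sorted_images)
    ]
```

For instance, calling it with the stem "hr100106" and two sorted paths yields targets hr100106_jp2/hr100106_0000.jp2 and hr100106_jp2/hr100106_0001.jp2, matching the listing.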

Again, please share this info with anyone you think will be interested.

Thanks, Hank!