Digital Services (DLC)
Smathers Libraries
University of Florida
P.O Box 117003
Gainesville, FL 32611 USA

P: 352.273.2900
F: 352.846.3702
UFDC@uflib.ufl.edu

Digital Library Center Documentation & Technologies: Archiving

Checking Dropbox

  • Check Dropbox files, which are under \\ad.ufl.edu\uflib\dlc\archive\dropbox
  • Check all of the folders to make sure the BIB_VID format is right
  • Specific things to check:
    • All letters should be capitalized
    • BIBs should be in BIB_VID format OR should have VID folders under the BIB folder
      • Bulk rename any files with no VID to add _00001 after the BIBID (these are old items in the old format that are being migrated from the CD/DVD archive)
    • No BIB folder should be within another BIB folder (this is common with moving files and network lag)
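The naming checks above could be scripted along these lines. This is only a sketch: the pattern (two uppercase letters plus eight digits for the BIBID, five digits for the VID) is an assumption inferred from examples such as UF00001532_00001, and the function names are hypothetical.

```python
import re

# Assumed patterns, inferred from examples such as UF00001532_00001.
BIB_RE = re.compile(r"[A-Z]{2}[0-9]{8}")
BIB_VID_RE = re.compile(r"[A-Z]{2}[0-9]{8}_[0-9]{5}")


def classify_folder(name: str) -> str:
    """Return 'bib_vid', 'bib_only', 'lowercase', or 'invalid' for a folder name."""
    if BIB_VID_RE.fullmatch(name):
        return "bib_vid"
    if BIB_RE.fullmatch(name):
        return "bib_only"
    upper = name.upper()
    if BIB_VID_RE.fullmatch(upper) or BIB_RE.fullmatch(upper):
        # Would be valid once all letters are capitalized.
        return "lowercase"
    return "invalid"


def add_default_vid(name: str) -> str:
    """Append _00001 to a bare BIBID (old-format items with no VID)."""
    return name + "_00001" if BIB_RE.fullmatch(name) else name
```

A folder classified as 'bib_only' still needs a manual look, since it is acceptable when it contains VID folders and needs the _00001 rename only when it does not.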

Checking FTP Folders

  • Check all FTP folders
  • Check with area coordinator to move items into queue

FDA Pickups

  • Open WSFTP
  • Use the default FCLA SUS DL profile
  • Connect to FDA FTP server
  • Change the remote directory to be PICKUPS
  • Change the local directory to be {MAIN_UFDC_SERVER_PATH}\INCOMING\FDA
  • Transfer all reports to the local directory
  • Double-check that every report arrived. With so many reports in a single directory, the transfer often fails to grab them all; transferring at the directory level, rather than grabbing each file, should work.
  • Then, delete the reports from FCLA's FTP site

IR Self-Submitted Items

  • Processing instructions here.

Periodic Checks

  • Send periodic reminders to check local machines for TIFFs
    • TIFFs on local machines should be moved into the production queue. TIFFs for items that have been verified as completed (online AND archived) should be deleted from the local machine.
  • When reminding on TIFFs, also check and delete older PreQC and GoUFDC log files

CNS Costs

CNS Charging Algorithm, updated on 6/14/2010 (Upload/Download = $0.00019 per MB; Storage = $0.00001275 per MB per month)

Current Costs
(upload for $0.20/GB; storage = $0.00001275 per MB per month per location):

| | MB | GB (MB*1024) | TB (GB*1024) |
|---|---|---|---|
| Upload | $0.0001953125 | $0.20 | $204.80 |
| Archive / Month / Each | $0.00001275 | $0.013056 | $13.37 / month |
| Archive / Day / Each | | $0.0004296875 | $0.44 / day |
| Actual Archive / Day / Twice (archiving to Gainesville and Atlanta) | | $0.000859375 | $0.88 / day |

The above costs include tape archiving to both Gainesville and Atlanta. Per emails of 7/16-7/19, this has always been included, although we were not previously notified. Prior costs were slightly higher.
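As a rough worked example of the current rates, the arithmetic can be sketched as follows. The function names are hypothetical, and the default of two locations is an assumption based on the note about archiving to Gainesville and Atlanta.

```python
MB_PER_GB = 1024

UPLOAD_PER_GB = 0.20                # dollars per GB uploaded
STORAGE_PER_MB_MONTH = 0.00001275   # dollars per MB per month per location


def upload_cost(gb: float) -> float:
    """One-time upload cost in dollars for `gb` gigabytes."""
    return gb * UPLOAD_PER_GB


def monthly_storage_cost(gb: float, locations: int = 2) -> float:
    """Monthly storage cost in dollars for `gb` gigabytes at `locations` sites."""
    return gb * MB_PER_GB * STORAGE_PER_MB_MONTH * locations
```

For 1 TB (1024 GB), `upload_cost(1024)` comes to about $204.80 and `monthly_storage_cost(1024, locations=1)` to about $13.37, matching the current-cost figures.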

Prior Quoted Costs
(upload for $0.00034/MB; archive for $0.000017/MB/month)

| | MB | GB (MB*1024) | TB (GB*1024) |
|---|---|---|---|
| Upload | $0.00034 | $0.34 | $348.16 |
| Archive / Month / Each | $0.000017 / month | $0.017 / month | $17.82 / month |
| Archive / Day / Each | | $0.0005589 | $0.58 / day |
| Actual Archive / Day / Twice (updated 7/16/10; unknown start date for this billing) | | | $1.16 / day |

Any items that are ready for archiving should be placed into the TIVOLI drop box, located on the archive server under Archive\DROPBOX.  Every hour, these folders are examined and then moved into the area that TIVOLI picks up from.  Every night at 6pm, the TIVOLI service runs.  Once it runs, the resource is deleted from our server.

Additional Considerations:
  • Folders should be in the flat form (e.g., UF00001532_00001, UF00001532_00002, UF00001532_00017, etc.)
    • The processor will not step down into the folders. So, the digital resource folder must be placed at the highest level. 
    • If you have an intermediary folder between the drop box and the resource folders (e.g., DROPBOX\DLOC\CA00001234\... ), the processor will ignore it. If you need to do this temporarily for any reason, that is okay, but be aware that it will eat into the space we can use for the archiving process.
  • It is not necessary that the bibid and vid are actually present in the tracking database, as long as they appear to be valid.
  • Ensure that you are either done with all processing (such as sending to FDA) or that you retain an additional copy somewhere else.
    • Tivoli MOVES and then DELETES the digital resource files from our server.
  • A new status will appear in tracking indicating that some portion of the digital resource was archived into our TIVOLI solution.
  • If there is an identical file in the archive for a volume (as defined by the same bibid, vid, filename, file size, and last write date), the new file will be discarded.  It will not be ‘double archived’.
    • TIVOLI archiving is done file by file, not by entire digital resource.
    • If you drop a new package into the dropbox with one new TIFF file and one new METS file, those two new files will be detected (by a mismatch on the last write date).  All the other files should exactly match the existing files, as their size and last write date will be identical to those of the archived files.
      • Only the two new files will be selected for archiving; the rest will be deleted.
      • The two files will be moved over to the area that TIVOLI will pick up at 6pm.
      • TIVOLI does not care that two files with the same name and the same folder structure are archived.  It will retain both of them, and we would have to select which version we want to retrieve.  However, for administrative simplicity, we rename the files in this case.  If you were to retrieve this sample package (and let's say you updated 00002.tif and the METS file), you would get the following:
        00001.tif
        00002.tif (first tiff archived)
        00002 (2009_10_10).tif ( non-matching file; originally same name; found in dropbox 10/10/2009)
        00003.tif
        UF12345678_00001.mets
        UF12345678_00001 (2009_10_10).mets ( second METS file archived )
        This essentially matches what TIVOLI has archived for it.  
      • As a corollary to the above, if you were to add a NEW file with the second load (say 00004.tif) which had never been archived for this resource before, it would simply be loaded as 00004.tif, since there is no duplicate archived.  The date would not be appended to the filename in this case.
    • The DLC archiving and dissemination tool keeps a log of every file that has been archived, the date, the size, the location, and the last write date.  This local log will enable retrieval, as well as avoiding duplication in the archiving process.
  • When you request that a copy be disseminated, the dissemination tool might ask whether you want just the latest version of each file or truly everything archived, as seen above.
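The duplicate handling described above can be sketched as follows. This is an illustration under stated assumptions, not the DLC tool's actual code: the `ArchivedFile` record and the `plan_file` function are hypothetical names, and the archive log is modeled as a plain list.

```python
import os
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ArchivedFile:
    """One entry from the (hypothetical) log of files already archived for a volume."""
    name: str              # path relative to the volume folder
    size: int              # file size in bytes
    last_write: datetime   # last write date recorded at archive time


def plan_file(name: str, size: int, last_write: datetime,
              archived: List[ArchivedFile], today: date) -> Tuple[str, Optional[str]]:
    """Decide what to do with one incoming file.

    Returns ('discard', None) for an exact duplicate, otherwise
    ('archive', final_name), where final_name carries a date suffix
    if a different file of the same name was archived before.
    """
    same_name = [a for a in archived if a.name == name]
    if any(a.size == size and a.last_write == last_write for a in same_name):
        return ("discard", None)
    if same_name:
        stem, ext = os.path.splitext(name)
        return ("archive", f"{stem} ({today:%Y_%m_%d}){ext}")
    return ("archive", name)
```

With 00002.tif already in the log, a changed 00002.tif found on 10/10/2009 would be planned as '00002 (2009_10_10).tif', while a never-archived 00004.tif keeps its name.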

Implementation Notes:

  • Scheduled Tasks
    • The Tivoli Preparation Tool runs as a scheduled task directly on the archiving server every hour from 7am to 5pm each day, as the ufdc processor user.
    • The process of loading the data to CNS's Tivoli solution runs as a scheduled task on the archiving server once a day at 6pm
  • Tivoli Preparation Tool process
    1. All empty folders are removed from the destination area
    2. Loops through all the top-level folders in the dropbox. If a folder is in the flattened form (e.g., UF12345678_00001), that folder is processed. If it appears to be just a bibid, the tool loops through all the subfolders that appear to be volume folders (e.g., UF12345678\00001 or UF12345678\VID00001)
      1. Skips the folder if it was written to in the last five minutes (in case files are still being written)
      2. A folder is created in the destination area for this resource. The folder path is based on the Bib ID (e.g., UF12345678_00001 maps to UF\12\34\56\78\00001)
      3. List of any files already archived for this volume is pulled from the database
      4. Processor recurses through all subfolders and files. If a file was already archived and has the exact same last write date and size, it is deleted. If a file in the same folder and with the same name was archived previously but had a different last write date or size, this file is renamed to include the current date at the end ( i.e., '00001.tif' becomes '00001 (2009_10_29).tif' ). In addition, a list of the files to be archived is created.
      5. All files and subfolders are moved into the Tivoli archiving area. Whenever possible, entire directories are moved. If the directory already exists, files are just copied one at a time
      6. All the file information is saved to the database's tivoli log
      7. Finally, the empty source directory is deleted
    3. Process is complete and application terminates
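Two of the steps above, the five-minute skip and the Bib ID to folder mapping, can be sketched as below. The regular expression and function names are assumptions for illustration, based only on the UF12345678_00001 example; the actual tool may accept other forms.

```python
import re
import time
from pathlib import PureWindowsPath
from typing import Optional

# Assumed flat form: two letters, eight digits, underscore, five-digit VID.
FLAT_RE = re.compile(r"([A-Z]{2})([0-9]{8})_([0-9]{5})")


def destination_folder(flat_name: str) -> PureWindowsPath:
    """Map a flat folder name to its pair-wise nested destination path."""
    match = FLAT_RE.fullmatch(flat_name)
    if match is None:
        raise ValueError(f"not in flat BIB_VID form: {flat_name!r}")
    prefix, digits, vid = match.groups()
    # Split the eight digits into four two-digit folder levels.
    pairs = [digits[i:i + 2] for i in range(0, 8, 2)]
    return PureWindowsPath(prefix, *pairs, vid)


def recently_written(mtime: float, now: Optional[float] = None,
                     window_seconds: int = 300) -> bool:
    """True if the folder was written to within the last five minutes."""
    now = time.time() if now is None else now
    return now - mtime < window_seconds
```

In practice `mtime` could come from `os.path.getmtime(folder)`; a folder for which `recently_written` returns True would be left for the next hourly run.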

Last modified: Tuesday August 24 2010 lnt