Digital Support Services (DSS)
Smathers Libraries
University of Florida
P.O Box 117003
Gainesville, FL 32611 USA

P: 352.273.2900
F: 352.392.6597
UFDC@uflib.ufl.edu


Optical Character Recognition (OCR)

After text documents are digitally imaged and processed to correct image quality and skew, the DLC uses PrimeOCR for optical character recognition, and for image zoning when the target data is arranged in columns or tables. The OCR process recognizes the text in the TIFF image files and creates plain-text files from them.

Named Entity Recognition (NER)

The textual content is scanned against different database tables, and hits are tagged by type: geographic names, personal names, and company names. For example, in this text excerpt, the program tagged place names (geographical hits) with HTML to highlight them in red:

 earl  Hld  WMOM 
 Crossing,  for  the  site  at  which  the  Florida  
Souwh  was  completed  froalatka  to  Gaisesville  in  
188 co e e1eninsular Railroad. families of Calvin Waits and James Hawthorn. A diverse economy stimulated the development of Hawthorne. The addition of an "e" to the town's name 1. Wafts-Baker House - (606 Sid Martin Highway, U.S. 301) The original log house on this site, which was occupied in the nineteenth century by Hawthorne founder Calvin Waits and his family, had a sepa- rate kitchen and dining room to minimize the possibility of fire. Members of the Baker family have lived in it since 1909 when farmer and rural mail carrier R. B. Baker moved into the house with his new bride. 2. Hawthorne State Bank - (2 N. Johnson St.) This building housed Hawthorne's first bank, which was organized in 1911 and not long after advertised that it had assets of $15,000. Francis J. Hammond, leading merchant and grandson of town founder James M. Hawthorn, donated the land, and A. L. Webb, proprietor of a successful gen- eral store in Hawthorne, served as first presi- dent of the institution. 3. Moore House - (207 W. Lake Ave.) This q house, which was built in 191 1, still looks much as it did not long after Glen D. Moore purchased the house in 1913. A sleeping porch, casement windows, and a bathroom were added by Moore, whose father, William Shepard Moore, had arrived in Hawthorne from Tennessee in 1882. 4. Mahin House - (301 W. Lake Ave.) The broad porch on three sides of this tum-of-the-century dwelling provided a perfect place for occu- pants to sit and catch the breezes. Lottie Mahin, widow of a businessman influen- tial in the town during the second decade of the twentieth century, lived in the house | in the 1920s and rented part of it as apart- ments. 5. State Historical Marker - (On the church grounds, comer N. Johnson St. and N.W. 3rd Ave.) A brief history of the town of ^ Hawthorne is provided. 6. First Baptist Church - (Comer N. Johnson St. and N.W. 2nd Ave.) The First Baptist Church in Hawthorne was organized in 1853. This building was erected in 1900. 
Gothic style windows punctuate the white horizontal clapboard siding that covers the exterior.
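
The tagging step described above can be sketched in a few lines of Python. This is only an illustration of matching text against a name list and wrapping the hits in HTML; it is not the program used in production. The `PLACE_NAMES` list stands in for the geographic-names database table, and the function name and the `<span>` markup are assumptions made for this example.

```python
import html
import re

# Hypothetical stand-in for the geographic-names database table.
PLACE_NAMES = ["Hawthorne", "Tennessee", "Gainesville"]


def tag_places(text, names=PLACE_NAMES):
    """Wrap each whole-word place-name hit in a red <span>."""
    # Try longer names first so a long name wins over a shorter one
    # that happens to be its prefix.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in ordered) + r")\b"
    )
    # Escape &, < and > so the OCR text is safe to embed in HTML.
    escaped = html.escape(text, quote=False)
    return pattern.sub(r'<span style="color:red">\1</span>', escaped)


if __name__ == "__main__":
    print(tag_places("Moore had arrived in Hawthorne from Tennessee in 1882."))
```

Matching on word boundaries keeps the personal name "Hawthorn" from being tagged as the town "Hawthorne". Personal and company names could be handled the same way with their own lists and colors.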

About & Running PrimeOCR™

PrimeOCR features: six OCR engines (WordScan; TextBridge; Recore; TypeReader; OmniPage; FineReader); voting and weighing of the engine results; 11 Western languages; fault-tolerant architecture with automatic engine recovery; image pre-processing; 1 to 6 CPUs; 16 output formats, including .PRO (per-character metadata on location, confidence, etc.); and features for developers, including an application programming interface (API) with 40 documented calls, dynamic link libraries (DLLs), and configurable initialization files (INIs).
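
To give a feel for what voting across engines means, here is a minimal Python sketch of weighted per-character voting. It is not PrimeOCR's algorithm, which is not described here. It assumes the engine outputs are already aligned character by character and have equal length, and the function name and the weights are hypothetical.

```python
from collections import defaultdict


def vote_text(outputs, weights):
    """Pick, at each position, the character with the highest total weight.

    outputs: equal-length strings, one per engine (assumed pre-aligned).
    weights: one number per engine, e.g. how much that engine is trusted.
    """
    if len(outputs) != len(weights):
        raise ValueError("need exactly one weight per engine output")
    if len({len(text) for text in outputs}) > 1:
        raise ValueError("engine outputs must have equal length")

    result = []
    for chars in zip(*outputs):
        totals = defaultdict(float)
        for char, weight in zip(chars, weights):
            totals[char] += weight
        # On a tie, the character seen first (earliest engine) wins.
        result.append(max(totals, key=totals.get))
    return "".join(result)


if __name__ == "__main__":
    engines = ["Gainesville", "Gaisesville", "Gainesvi1le"]
    print(vote_text(engines, [1.0, 1.0, 1.0]))
```

With equal weights this reduces to a simple majority vote, so an error made by a single engine is outvoted by the others.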

Job Server (Batch OCR; Prioritized jobs; Network aware; Job file)
Prime Recognition Job File
Version=3.90 1
E:\Prdev\Images\UF70000002n001.tif
E:\Prdev\Templates\UF70000002n001.ptm
Document Template file:
Prime Recognition Document Template
Version=3.90
0,-1 1,0,1,0,10,0,64,1,0,0
4
0,0,1,999999,0,0,0,0,
1,0,1,999999,0,0,0,0,
2,0,1,999999,0,0,0,0,
3,0,1,999999,0,0,0,0,

Running PrimeOCR™: Write a job file to the Job folder
The Job Server, a Prime Recognition™ program that is always running on the remote server, reads the locations of the TIFF image and the document template from the job file. It processes the TIFF file according to the template and outputs the selected file types, PDF+Image and TXT, leaving the original archival TIFF unchanged in its folder.
Prime Recognition Job File
Version=3.90 1
\\Smathersnt19\ScanTech\ScanQC\Complete\UF00016972\00116.tif
\\Smathersnt19\Prdev\Templates\UF70000002n001.ptm
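
Writing a job file can be scripted. The Python sketch below follows the four-line layout of the first job file example on this page: header, version line, image path, template path. The version line is copied verbatim from that example. The function name, the Job folder location, and the `.job` file name are assumptions made for illustration, not documented PrimeOCR conventions.

```python
from pathlib import Path


def write_job_file(job_dir, job_name, image_path, template_path):
    """Write a one-image job file into the Job folder and return its path."""
    lines = [
        "Prime Recognition Job File",
        "Version=3.90 1",  # copied as-is from the example job file
        image_path,        # location of the TIFF image to process
        template_path,     # location of the document template (.ptm)
    ]
    job_path = Path(job_dir) / job_name
    job_path.write_text("\n".join(lines) + "\n")
    return job_path


if __name__ == "__main__":
    # Hypothetical local folder and file name, for demonstration only.
    Path("Job").mkdir(exist_ok=True)
    path = write_job_file(
        "Job",
        "UF00016972_00116.job",
        r"\\Smathersnt19\ScanTech\ScanQC\Complete\UF00016972\00116.tif",
        r"\\Smathersnt19\Prdev\Templates\UF70000002n001.ptm",
    )
    print(path.read_text())
```

Generating one job file per page image in a loop would queue a whole volume for the Job Server.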

Image Zoning

PrimeOCR is also bundled with PrimeView™ image zoning, which features a GUI to draw zones on the image:


Last modified: Saturday July 24 2010 lnt