Wednesday, February 08, 2006

This blog belongs to Poposki Dimitar and it's about Optical Character Recognition*

(*Abbreviation, OCR. In computer and data-processing operations, the reading of alphabetical, numerical, and other characters from hard copy (usually printed matter) by photoelectric methods. It converts the characters into digital data that can be stored in computer memory, on disks and tapes, and transmitted via digital communications networks. It can also allow a computer or robot to read signs, maps, etc. Source: The Illustrated Dictionary of Electronics)

and the scholarly articles covering the newest improvements in software for recognising various languages (sometimes even non-existent ones).

The sample picture


is a screenshot of the final project from my internship at the Scholarly Communication Center in the Alexander Library at Rutgers, The State University of New Jersey: OCR of John Milton’s (1608-1674) book "The Grand Case of Conscience", published in 1650. The book can be found in the Rare Books section of the Rutgers Digital Libraries.

It is written in the English of its period, with the long s (ſ), but it is fully text-searchable in modern English. My goal is to OCR non-existent languages and languages that cannot be OCR-ed.
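To give an idea of how a text printed with the long s can still be searched with modern spelling, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the method used in the Rutgers project: it assumes the OCR output keeps the long s as the Unicode character U+017F, and the function names `normalize_long_s` and `matches` are hypothetical.

```python
LONG_S = "\u017f"          # the long s character: ſ
LONG_S_T = "\ufb05"        # the long s + t ligature: ﬅ


def normalize_long_s(text: str) -> str:
    """Return the text with long-s forms replaced by modern letters."""
    # Replace the ligature first, then every remaining long s.
    return text.replace(LONG_S_T, "st").replace(LONG_S, "s")


def matches(query: str, ocr_text: str) -> bool:
    """Case-insensitive search of a modern-spelling query in OCR text."""
    # Both sides are normalized so the query may be typed either way.
    haystack = normalize_long_s(ocr_text).casefold()
    needle = normalize_long_s(query).casefold()
    return needle in haystack


if __name__ == "__main__":
    line = "The Grand Ca\u017fe of Con\u017fcience"
    print(normalize_long_s(line))
    print(matches("conscience", line))
```

A real index would probably store both the original transcription and the normalized form, so the page can be displayed as printed while the search runs on the modern spelling.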

Also, my passion is to actively participate in the digitisation of the world's cultural heritage and to help as much as I can to preserve it in digital form, so that it is accessible to people all around the world free of charge.

"Digitisation is an essential step aimed at preserving and promoting collective cultural heritage, thus safeguarding cultural diversity in the global environment. Also it could improve the presence of the cultural heritage of the region on the Web, more in accordance with its contribution to the world's cultural heritage."

Regional Meeting on Digitisation of Cultural Heritage
Ohrid, 17-20 March 2005
Republic of Macedonia
