TY  - THES
T1  - Document recognition application for the world wide web
A1  - Gabriel, Alan S.
A2  - Risbud, Sanjay S.
A2  - Sevilla, Jeremy R.
LA  - English
YR  - 2001
UL  - https://tuklas.up.edu.ph/Record/UP-99796217608094475
AB  - With the mass of printed documents being produced every day, and the need to encode these documents in a form suitable for the web, either for publishing or for storage, there is a need to devise a system that would make the process easier. This is the problem that the team wishes to address. The page is first scanned using a commercial scanner. The file is converted into a form suitable for system processing. Areas of the code are segregated into "text blocks" (those containing only readable words) and "image blocks" (those containing pictures, drawings, etc.; text in image blocks is considered part of the image). The text blocks are then passed through the neural network for initial recognition, and the output is checked against the dictionary to correct any remaining errors. The words and the images are then put together into a coherent web page.
CN  - LG 993.5 2001 C65 G33
KW  - Document imaging systems.
KW  - Optical pattern recognition.
KW  - Document recognition.
ER  -