Detecting Character in Video Frames for Traffic Vehicle

International Journal of Computer Science and Engineering
© 2017 by SSRG - IJCSE Journal
Volume 4 Issue 9
Year of Publication : 2017
Authors : P. Mahalakshmi, T. Manigandan, D. Chitra

How to Cite?

P. Mahalakshmi, T. Manigandan, D. Chitra, "Detecting Character in Video Frames for Traffic Vehicle," SSRG International Journal of Computer Science and Engineering, vol. 4, no. 9, pp. 9-13, 2017. Crossref, https://doi.org/10.14445/23488387/IJCSE-V4I9P102

Abstract:

Reading text from video frames is a challenging problem that has received significant attention. The two key components of most systems are (i) text detection in images and (ii) character recognition, and many recent methods have been proposed to design better feature representations and models for both. Detecting text and identifying characters in scene images is a difficult visual recognition problem. As in much of computer vision, the challenges posed by the complexity of these images have been combated with hand-designed features [1], [2], [3] and models that incorporate various pieces of high-level prior knowledge [4], [5]. We present results from a system that attempts to learn the necessary features directly from the data, as an alternative to purpose-built, text-specific features or models. Detecting text regions in natural scene images has become an important area due to its varied applications. Scene text detection in video and natural scene images is challenging because of variations in background, contrast, text type, font type, font size, etc.; arbitrary orientations of multi-script text add further complexity. A Text Information Extraction (TIE) system detects text regions in a given image, localizes them, extracts the text part, and recognizes the text using OCR. The extracted features are fed to a trained SVM classifier to detect text regions; after detection, characters are extracted and finally displayed. Text in camera-captured images carries important and useful information and can be used for identification, indexing, and retrieval. Detecting and localizing text in camera-captured images remains a challenging task due to the high variability of text appearance; detected text regions are merged and then localized.
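The pipeline the abstract describes (slide a window over the frame, score each window with a classifier, then merge and localize the accepted windows) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a simple local-contrast feature and a fixed threshold stand in for the extracted features and the trained SVM score, and all function names are hypothetical.

```python
# Sketch of a sliding-window text-detection stage of a TIE pipeline.
# The contrast feature and threshold are stand-ins for real features
# (e.g. gradient histograms) and a trained SVM decision function.

def patch_contrast(img, x, y, w, h):
    """Standard deviation of pixel intensities in a window --
    a crude proxy for the texture that text regions exhibit."""
    vals = [img[j][i] for j in range(y, y + h) for i in range(x, x + w)]
    mean = sum(vals) / len(vals)
    return (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5

def detect_text_windows(img, win=8, stride=4, thresh=30.0):
    """Slide a win x win window over the image; keep every window
    whose feature score exceeds the (stand-in) classifier threshold."""
    rows, cols = len(img), len(img[0])
    hits = []
    for y in range(0, rows - win + 1, stride):
        for x in range(0, cols - win + 1, stride):
            if patch_contrast(img, x, y, win, win) > thresh:
                hits.append((x, y, win, win))
    return hits

def merge_boxes(boxes):
    """Merge overlapping or touching windows into localized text regions,
    growing each merged box to the union of the boxes it absorbs."""
    merged = []
    for (x, y, w, h) in boxes:
        for i, (mx, my, mw, mh) in enumerate(merged):
            if x <= mx + mw and mx <= x + w and y <= my + mh and my <= y + h:
                nx, ny = min(x, mx), min(y, my)
                merged[i] = (nx, ny,
                             max(x + w, mx + mw) - nx,
                             max(y + h, my + mh) - ny)
                break
        else:
            merged.append((x, y, w, h))
    return merged
```

In a full TIE system the threshold test would be replaced by the SVM's decision on the extracted feature vector, and the merged, localized regions would then be passed to an OCR engine for character recognition.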

Keywords:

Text detection, Character Recognition, Computer Vision, Text Information Extraction, SVM, OCR.

References:

[1] Arbelaez P. and Maire M. and Fowlkes C. and Malik J. (2011), ‘Contour Detection and Hierarchical Image Segmentation’, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 898-916.
[2] Boykov Y. and Jolly M. P. (2001), ‘Interactive graph cuts for optimal boundary & region segmentation of objects in n-d images’, In Proceedings of International Conference of Computer Vision (ICCV), volume 1.
[3] Carreira J. and Sminchisescu C. (2012), ‘CPMC: Automatic Object Segmentation Using Constrained Parametric Min-Cuts’, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1312-1328.
[4] Ciresan D. and Meier U. and Masci J. and Schmidhuber J. (2011), ‘A committee of neural networks for traffic sign classification’, In Proceedings of the International Joint Conference on Neural Networks (IJCNN), pp. 1918-1921.
[5] Farabet C. and Couprie C. and Najman L. and LeCun Y. (2013), ‘Learning Hierarchical Features for Scene Labeling’, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1915-1929.
[6] Farabet C. and Couprie C. and Najman L. and LeCun Y. (2011), ‘Neuflow: A runtime reconfigurable dataflow processor for vision’, In Proceedings of the Fifth IEEE Workshop on Embedded Computer Vision, pp. 109-116
[7] Farabet C. and Couprie C. and Najman L. and LeCun Y. (2012), ‘Scene parsing with multiscale feature learning, purity trees and optimal covers’, In Proceedings of the International Conference on Machine Learning (ICML), pp. 575-582.
[8] Farabet C. and Couprie C. and Najman L. and LeCun Y. (2012), ‘Scene parsing with multiscale feature learning, purity trees and optimal covers’, In Proceedings of the International Conference on Machine Learning (ICML), pp. 575-582.
[9] Felzenszwalb P. and Huttenlocher D. (2004), ‘Efficient graph-based image segmentation’, International Journal of Computer Vision, vol. 59, no. 2, pp. 167-181.
[10] Fulkerson B. and Vedaldi A. and Soatto S. (2009), ’Class segmentation and object localization with superpixel neighborhoods’, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 670 – 677.
[11] Gould S. and Fulton R. and Koller D. (2009), ’Decomposing a scene into geometric and semantically consistent regions’, IEEE International Conference on Computer Vision, pp. 1 – 8.
[12] Gould S. and Rodgers J. and Cohen D. and Elidan G. and Koller D. (2008), ‘Multi-class segmentation with relative location prior’, IEEE International Conference on Computer Vision, pp. 1480 – 1486.
[13] He X. and Zemel R. (2008), ‘Learning hybrid models for image annotation with partially labeled data’, Advances in Neural Information Processing Systems, pp. 625-632.
[14] Kavukcuoglu K. and Ranzato M. and Fergus R. and LeCun Y. (2008), ‘Fast inference in sparse coding algorithms with applications to object recognition’, Technical report, Courant Institute of Mathematical Sciences, arXiv:1010.3467.
[15] Kavukcuoglu K. and Ranzato M. and Fergus R. and LeCun Y. (2009), ‘What is the best multi-stage architecture for object recognition?’, In Proceedings of the International Conference on Computer Vision (ICCV), pp. 2146-2153.
[16] Kavukcuoglu K. and Ranzato M. and Fergus R. and LeCun Y. (2010), ‘Learning convolutional feature hierarchies for visual recognition’, In Advances in Neural Information Processing Systems, volume 23.
[17] Kumar M. and Koller D. (2010), ‘Efficiently selecting regions for scene understanding’, In Computer Vision and Pattern Recognition (CVPR), DOI. 10.1109.
[18] LeCun Y. and Boser B. and Denker J. S. and Henderson D. and Howard R. E. and Hubbard W. and Jackel L. D. (1990), ‘Handwritten digit recognition with a back-propagation network’, In Advances in Neural Information Processing Systems, pp. 396-404.
[19] LeCun Y. and Bottou L. and Bengio Y. and Haffner P. (1998), ‘Gradient-based learning applied to document recognition’, Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324.
[20] LeCun Y. and Bottou L. and Orr G. and Muller K. (1998), ‘Efficient backprop’, In Neural Networks: Tricks of the Trade, Springer, pp. 9-48.
[21] Lempitsky V. and Vedaldi A. and Zisserman A. (2011), ‘A pylon model for semantic segmentation’, In Advances in Neural Information Processing Systems, pp. 171-179.
[22] Najman L. and Schmitt M. (1996), ’Geodesic saliency of watershed contours and hierarchical segmentation’, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1163 – 1173.
[23] Osadchy M. and LeCun Y. and Miller M. (2007), ‘Synergistic face detection and pose estimation with energy-based models’, Journal of Machine Learning Research, pp. 196-206.
[24] Pantofaru C. and Schmid C. and Hebert M. (2008), ‘Object recognition by integrating multiple image segmentations’, In Proceedings of the 10th European Conference on Computer Vision (ECCV), pp. 481-494.
[25] Ranzato M. and Huang F. and Boureau Y. and LeCun Y. (2007), ‘Unsupervised learning of invariant feature hierarchies with applications to object recognition’, In Proceedings of Computer Vision and Pattern Recognition (CVPR), DOI 10.1109.
[26] Russell B. and Torralba A. and Liu C. and Fergus R. and Freeman W. (2007), ‘Object recognition by scene alignment’, In Advances in Neural Information Processing Systems.
[27] Russell C. and Torr P. H. S. and Kohli P. (2009), ‘Associative hierarchical CRFs for object class image segmentation’, In Proceedings of Computer Vision and Pattern Recognition (CVPR), DOI 10.1109.
[28] Schulz H. and Behnke S. (2012), ‘Learning object-class segmentation with convolutional neural networks’, In 11th European Symposium on Artificial Neural Networks (ESANN), DOI. 10.1.1.307.2322
[29] Shotton J. and Winn J. M. and Rother C. and Criminisi A. (2006), ‘TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation’, In Proceedings of the European Conference on Computer Vision (ECCV), pp. 1-15.
[30] Socher R. and Lin C. C. and Ng A. Y. and Manning C. D. (2011), ‘Parsing Natural Scenes and Natural Language with Recursive Neural Networks’, In Proceedings of the 28th International Conference on Machine Learning (ICML), pp. 1675-1680.
[31] Turaga S. and Briggman K. and Helmstaedter M. and Denk W. and Seung H. (2009), ‘Maximin affinity learning of image segmentation’, In Advances in Neural Information Processing Systems, pp. 1865-1873.
[32] Vaillant R. and Monrocq C. and LeCun Y. (1994), ‘Original approach for the localisation of objects in images’, IEE Proceedings - Vision, Image and Signal Processing, DOI 10.1049.