Automatic image annotation

Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata, in the form of captions or keywords, to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest in a database.

This method can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size.^[1]^[2] Typically, machine learning techniques use image analysis in the form of extracted feature vectors, together with the training annotation words, to attempt to apply annotations to new images automatically. The first methods learned the correlations between image features and training annotations. Techniques were then developed that used machine translation to try to translate the textual vocabulary into a 'visual vocabulary' of clustered regions known as blobs. Work following these efforts has included classification approaches, relevance models, and so on.
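The idea of applying training annotations to a new image through its feature vector can be illustrated with a minimal sketch. It assumes a nearest-neighbour label-transfer scheme, which is only one simple stand-in for the correlation, translation, and relevance-model methods mentioned above. The three-dimensional feature vectors, the keywords, and the names `annotate` and `TRAINING` are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def annotate(query, training, k=3, n_words=2):
    """Transfer the most frequent keywords of the k nearest training images."""
    # Rank the training images by distance to the query feature vector.
    neighbours = sorted(training, key=lambda item: euclidean(query, item[0]))[:k]
    # Each neighbour votes once for each of its annotation words.
    votes = Counter(word for _, words in neighbours for word in words)
    # Sort by descending vote count, then alphabetically, so ties are deterministic.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n_words]]


# Hypothetical training set: (feature vector, annotation words).
TRAINING = [
    ((0.9, 0.1, 0.2), ["sunset", "sky"]),
    ((0.8, 0.2, 0.1), ["sunset", "beach"]),
    ((0.1, 0.8, 0.2), ["forest", "tree"]),
    ((0.2, 0.9, 0.1), ["forest", "grass"]),
    ((0.1, 0.2, 0.9), ["sea", "sky"]),
]

if __name__ == "__main__":
    print(annotate((0.85, 0.15, 0.15), TRAINING, k=2))
```

In a realistic system the feature vectors would come from an image-analysis step and the vocabulary would contain thousands of words, which is why the task resembles classification with a very large number of classes.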

The advantage of automatic image annotation over content-based image retrieval (CBIR) is that the user can specify queries more naturally.^[3] CBIR generally (at present) requires users to search by image concepts such as color and texture, or by finding example queries. Certain image features in the example images may override the concept that the user is really focusing on. Traditional methods of image retrieval, such as those used by libraries, have relied on manually annotated images, which is expensive and time-consuming, especially given the large and constantly growing image databases in existence.
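Once images carry keywords, a plain-text query can be answered with an ordinary inverted index, which is one way to picture why annotation-based querying feels more natural than searching by color or texture. The following is a minimal sketch under that assumption. The file names, keywords, and the functions `build_index` and `search` are hypothetical and are not taken from any particular retrieval system.

```python
from collections import defaultdict


def build_index(annotations):
    """Map each keyword to the set of image ids annotated with it."""
    index = defaultdict(set)
    for image_id, words in annotations.items():
        for word in words:
            index[word.lower()].add(image_id)
    return index


def search(index, query):
    """Return the sorted ids of images annotated with every word in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Start from the images matching the first term, then intersect the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical annotations, e.g. produced by an automatic annotation step.
ANNOTATIONS = {
    "img1.jpg": ["sunset", "beach"],
    "img2.jpg": ["sunset", "sky"],
    "img3.jpg": ["forest"],
}

if __name__ == "__main__":
    index = build_index(ANNOTATIONS)
    print(search(index, "sunset beach"))
```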

References

[1] Forsyth, David; Ponce, Jean (2012). Computer vision: a modern approach. Pearson.
[2] Russakovsky, Olga; Deng, Jia; Su, Hao; Krause, Jonathan; Satheesh, Sanjeev; Ma, Sean; Huang, Zhiheng; Karpathy, Andrej; Khosla, Aditya; Bernstein, Michael; Berg, Alexander C. (December 2015). "ImageNet Large Scale Visual Recognition Challenge". International Journal of Computer Vision. 115 (3): 211–252. ISSN 0920-5691. S2CID 2930547. arXiv:1409.0575. doi:10.1007/s11263-015-0816-y. hdl:1721.1/104944. Archived from the original on 2023-03-15. Retrieved 2020-11-20.
[3] "Archived copy" (PDF). i.yz.yamagata-u.ac.jp. Archived from the original (PDF) on 2014-08-08. Retrieved 2022-01-13.

Further reading

Datta, Ritendra; Dhiraj Joshi; Jia Li; James Z. Wang (2008). "Image Retrieval: Ideas, Influences, and Trends of the New Age". ACM Computing Surveys. 40 (2): 1–60. S2CID 7060187. doi:10.1145/1348246.1348248.
Nicolas Hervé; Nozha Boujemaa (2007). "Image annotation: which approach for realistic databases?" (PDF). ACM International Conference on Image and Video Retrieval. Archived from the original (PDF) on 2011-05-20.
M Inoue (2004). "On the need for annotation-based image retrieval" (PDF). Workshop on Information Retrieval in Context. pp. 44–46. Archived from the original (PDF) on 2014-08-08.
Y Mori; H Takahashi; R Oka (1999). "Image-to-word transformation based on dividing and vector quantizing images with words". Proceedings of the International Workshop on Multimedia Intelligent Storage and Retrieval Management. CiteSeerX 10.1.1.31.1704.
P Duygulu; K Barnard; N de Freitas; D Forsyth (2002). "Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary". Proceedings of the European Conference on Computer Vision. pp. 97–112. Archived from the original on 2005-03-05.
J Li; J Z Wang (2006). "Real-time Computerized Annotation of Pictures". Proc. ACM Multimedia. pp. 911–920.
J Z Wang; J Li (2002). "Learning-Based Linguistic Indexing of Pictures with 2-D MHMMs". Proc. ACM Multimedia. pp. 436–445.
J Li; J Z Wang (2008). "Real-time Computerized Annotation of Pictures". IEEE Transactions on Pattern Analysis and Machine Intelligence.
J Li; J Z Wang (2003). "Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach". IEEE Transactions on Pattern Analysis and Machine Intelligence. pp. 1075–1088.
K Barnard; D A Forsyth (2001). "Learning the Semantics of Words and Pictures". Proceedings of International Conference on Computer Vision. pp. 408–415. Archived from the original on 2007-09-28.
D Blei; A Ng; M Jordan (2003). "Latent Dirichlet allocation" (PDF). Journal of Machine Learning Research. pp. 3:993–1022. Archived from the original (PDF) on 2005-03-16.
G Carneiro; A B Chan; P Moreno; N Vasconcelos (2006). "Supervised Learning of Semantic Classes for Image Annotation and Retrieval" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. pp. 394–410.
R W Picard; T P Minka (1995). "Vision Texture for Annotation". Multimedia Systems.
C Cusano; G Ciocca; R Schettini (2004). Santini, Simone; Schettini, Raimondo (eds.). "Image Annotation Using SVM". Internet Imaging V. 5304: 330–338. Bibcode:2003SPIE.5304..330C. S2CID 16246057. doi:10.1117/12.526746.
R Maree; P Geurts; J Piater; L Wehenkel (2005). "Random Subwindows for Robust Image Classification". Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. pp. 1:34–40.
J Jeon; R Manmatha (2004). "Using Maximum Entropy for Automatic Image Annotation" (PDF). Int'l Conf on Image and Video Retrieval (CIVR 2004). pp. 24–32.
J Jeon; V Lavrenko; R Manmatha (2003). "Automatic image annotation and retrieval using cross-media relevance models" (PDF). Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 119–126.
V Lavrenko; R Manmatha; J Jeon (2003). "A model for learning the semantics of pictures" (PDF). Proceedings of the 16th Conference on Advances in Neural Information Processing Systems NIPS.
R Jin; J Y Chai; L Si (2004). "Effective Automatic Image Annotation via A Coherent Language Model and Active Learning" (PDF). Proceedings of MM'04.
D Metzler; R Manmatha (2004). "An inference network approach to image retrieval" (PDF). Proceedings of the International Conference on Image and Video Retrieval. pp. 42–50.
S Feng; R Manmatha; V Lavrenko (2004). "Multiple Bernoulli relevance models for image and video annotation" (PDF). IEEE Conference on Computer Vision and Pattern Recognition. pp. 1002–1009.
J Y Pan; H-J Yang; P Duygulu; C Faloutsos (2004). "Automatic Image Captioning" (PDF). Proceedings of the 2004 IEEE International Conference on Multimedia and Expo (ICME'04). Archived from the original (PDF) on 2004-12-09.
Quan Hoang Lam; Quang Duy Le; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen (2020). "UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning". Proceedings of the 2020 International Conference on Computational Collective Intelligence (ICCCI 2020). arXiv:2002.00175. doi:10.1007/978-3-030-63007-2_57.
J Fan; Y Gao; H Luo; G Xu (2004). "Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation". Proceedings of the 27th annual international conference on Research and development in information retrieval. pp. 361–368.
A Oliva; A Torralba (2001). "Modeling the shape of the scene: a holistic representation of the spatial envelope" (PDF). International Journal of Computer Vision. pp. 42:145–175.
A Yavlinsky; E Schofield; S Rüger (2005). "Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation" (PDF). Int'l Conf on Image and Video Retrieval (CIVR, Singapore, Jul 2005). Archived from the original (PDF) on 2005-12-20.
N Vasconcelos; A Lippman (2001). "Statistical Models of Video Structure for Content Analysis and Characterization" (PDF). IEEE Transactions on Image Processing. pp. 1–17.
Ilaria Bartolini; Marco Patella; Corrado Romani (2010). "Shiatsu: Semantic-based Hierarchical Automatic Tagging of Videos by Segmentation Using Cuts". 3rd ACM International Multimedia Workshop on Automated Information Extraction in Media Production (AIEMPro10).
Yohan Jin; Latifur Khan; Lei Wang; Mamoun Awad (2005). "Image annotations by combining multiple evidence & wordNet". 13th Annual ACM International Conference on Multimedia (MM 05). pp. 706–715.
Changhu Wang; Feng Jing; Lei Zhang; Hong-Jiang Zhang (2006). "Image annotation refinement using random walk with restarts". 14th Annual ACM International Conference on Multimedia (MM 06).
Changhu Wang; Feng Jing; Lei Zhang; Hong-Jiang Zhang (2007). "Content-based image annotation refinement". IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07). doi:10.1109/CVPR.2007.383221.
Ilaria Bartolini; Paolo Ciaccia (2007). "Imagination: Exploiting Link Analysis for Accurate Image Annotation". Springer Adaptive Multimedia Retrieval. doi:10.1007/978-3-540-79860-6_3.
Ilaria Bartolini; Paolo Ciaccia (2010). "Multi-dimensional Keyword-based Image Annotation and Search". 2nd ACM International Workshop on Keyword Search on Structured Data (KEYS 2010).
Emre Akbas; Fatos Y. Vural (2007). "Automatic Image Annotation by Ensemble of Visual Descriptors". Intl. Conf. on Computer Vision (CVPR) 2007, Workshop on Semantic Learning Applications in Multimedia. doi:10.1109/CVPR.2007.383484. hdl:11511/16027.
Ameesh Makadia; Vladimir Pavlovic; Sanjiv Kumar (2008). "A New Baseline for Image Annotation" (PDF). European Conference on Computer Vision (ECCV).
Chong Wang; David Blei; Li Fei-Fei (2009). "Simultaneous Image Classification and Annotation" (PDF). Conf. on Computer Vision and Pattern Recognition (CVPR).
Matthieu Guillaumin; Thomas Mensink; Jakob Verbeek; Cordelia Schmid (2009). "TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation" (PDF). Intl. Conf. on Computer Vision (ICCV).
Yashaswi Verma; C. V. Jawahar (2012). "Image Annotation Using Metric Learning in Semantic Neighbourhoods" (PDF). European Conference on Computer Vision (ECCV). Archived from the original (PDF) on 2013-05-14. Retrieved 2014-02-26.
Venkatesh N. Murthy; Subhransu Maji; R. Manmatha (2015). "Automatic Image Annotation Using Deep Learning Representations" (PDF). International Conference on Multimedia (ICMR).
Sarin, Supheakmungkol; Fahrmair, Michael; Wagner, Matthias; Kameyama, Wataru (2012). "Leveraging Features from Background and Salient Regions for Automatic Image Annotation". Journal of Information Processing. 20. pp. 250–266.
N. B. Marvasti; E. Yörük; B. Acar (2018). "Computer-Aided Medical Image Annotation: Preliminary Results With Liver Lesions in CT". IEEE Journal of Biomedical and Health Informatics.
