Xu-Cheng Yin – author
Document Analysis and Recognition – ICDAR 2025
19th International Conference, Wuhan, China, September 16–21, 2025, Proceedings, Part I
2 425 kr
Ships within 10-15 business days
2 960 kr
Read immediately after purchase
Document Analysis and Recognition – ICDAR 2025
19th International Conference, Wuhan, China, September 16–21, 2025, Proceedings, Part II
1 584 kr
Ships within 10-15 business days
1 922 kr
Read immediately after purchase
Document Analysis and Recognition – ICDAR 2025
19th International Conference, Wuhan, China, September 16–21, 2025, Proceedings, Part III
1 576 kr
Ships within 5-8 business days
1 922 kr
Read immediately after purchase
Document Analysis and Recognition – ICDAR 2025
19th International Conference, Wuhan, China, September 16–21, 2025, Proceedings, Part IV
1 576 kr
Ships within 5-8 business days
1 922 kr
Read immediately after purchase
Document Analysis and Recognition – ICDAR 2025
19th International Conference, Wuhan, China, September 16–21, 2025, Proceedings, Part V
1 584 kr
Ships within 10-15 business days
1 922 kr
Read immediately after purchase
1 916 kr
Ships within 5-8 business days
2 435 kr
Read immediately after purchase
This book delves into visual object tracking (VOT), a fundamental aspect of computer vision crucial for replicating human dynamic vision, with applications ranging from self-driving vehicles to surveillance systems. Despite significant strides propelled by deep learning, challenges such as target deformation and motion persist, exposing a disparity between cutting-edge VOT systems and human performance. This observation underscores the necessity to thoroughly scrutinize and enhance evaluation methodologies within VOT research.
Hence, the primary objective of this book is to equip readers with essential insights into dynamic visual tasks encapsulated by VOT. Beginning with the elucidation of task definitions, it integrates interdisciplinary perspectives on evaluation techniques. The book is organized into five parts, tracing the evolution of VOT from perceptual to cognitive intelligence, exploring the experimental frameworks utilized in assessments, analyzing the various agents involved, including tracking algorithms and human visual tracking, and dissecting evaluation mechanisms through both machine–machine and human–machine comparisons. Furthermore, it examines the trend toward crafting more human-like task definitions and comprehensive evaluation frameworks to effectively gauge machine intelligence.
This book serves as a roadmap for researchers aiming to grasp the bottlenecks in VOT capabilities and comprehend the gaps between current methodologies and human abilities, all geared toward advancing algorithmic intelligence. It also delves into the realm of data-centric AI, emphasizing the pivotal role of high-quality datasets and evaluation systems in the age of large language models (LLMs). Such systems are indispensable for training AI models while ensuring their safety and reliability. Utilizing VOT as a case study, the book offers detailed insights into these facets of data-centric AI research. Designed to cater to readers with foundational knowledge in computer vision, it employs diagrams and examples to facilitate comprehension, providing essential groundwork for understanding key technical components.
556 kr
Ships within 5-8 business days
714 kr
Read immediately after purchase
In real-world applications, new data, patterns, and categories that were not covered by the training data frequently emerge, so a recognizer must be able to detect and adapt to novel characters incrementally. Researchers refer to this challenge as the Open-Set Text Recognition (OSTR) task, which has in recent years emerged as one of the prominent issues in the field of text recognition. This book begins with an introduction to the background of the OSTR task, covering essential aspects such as open-set identification and recognition, conventional OCR methods, and their applications. Subsequently, the concept and definition of the OSTR task are presented, encompassing its objectives, use cases, performance metrics, datasets, and protocols. A general framework for OSTR is then detailed, composed of four key components: the Aligned Represented Space, the Label-to-Representation Mapping, the Sample-to-Representation Mapping, and the Open-set Predictor. In addition, possible implementations of each module within the framework are discussed. Following this, two specific open-set text recognition methods, OSOCR and OpenCCD, are introduced. The book concludes with a discussion of applications and future directions of open-set text recognition.
This book presents a comprehensive overview of the open-set text recognition task, including concepts, framework, and algorithms. It is suitable for graduate students and young researchers in pattern recognition and computer science, especially those engaged in interdisciplinary research.