Language, Data Science and Digital Humanities - Books
Showing all books in the series Language, Data Science and Digital Humanities. Shop with free shipping and fast delivery.
5 products
1 383 kr
Ships within 10-15 business days
This volume highlights the ways in which recent developments in corpus linguistics and natural language processing can engage with topics across language studies, humanities and social science disciplines. New approaches have emerged in recent years that blur disciplinary boundaries, facilitated by factors such as the application of computational methods, access to large data sets, and the sharing of code, as well as continual advances in technologies related to data storage, retrieval, and processing. The “march of data” denotes an area at the border region of linguistics, humanities, and social science disciplines, but also the inevitable development of the underlying technologies that drive analysis in these subject areas. Organized into three sections, the chapters are connected by the underlying thread of linguistic corpora: how they can be created, how they can shed light on varieties or registers, and how their metadata can be utilized to better understand the internal structure of similar resources. While some chapters in the volume make use of well-established existing corpora, others analyze data from platforms such as YouTube, Twitter or Reddit. The volume provides insight into the diversity of methods, approaches, and corpora that inform our understanding of the “border regions” between the realms of data science, language/linguistics, and social or cultural studies.
406 kr
Ships within 10-15 business days
This volume highlights the ways in which recent developments in corpus linguistics and natural language processing can engage with topics across language studies, humanities and social science disciplines. New approaches have emerged in recent years that blur disciplinary boundaries, facilitated by factors such as the application of computational methods, access to large data sets, and the sharing of code, as well as continual advances in technologies related to data storage, retrieval, and processing. The “march of data” denotes an area at the border region of linguistics, humanities, and social science disciplines, but also the inevitable development of the underlying technologies that drive analysis in these subject areas. Organized into three sections, the chapters are connected by the underlying thread of linguistic corpora: how they can be created, how they can shed light on varieties or registers, and how their metadata can be utilized to better understand the internal structure of similar resources. While some chapters in the volume make use of well-established existing corpora, others analyze data from platforms such as YouTube, Twitter or Reddit. The volume provides insight into the diversity of methods, approaches, and corpora that inform our understanding of the “border regions” between the realms of data science, language/linguistics, and social or cultural studies.
Text Analytics for Corpus Linguistics and Digital Humanities
Simple R Scripts and Tools
Hardcover, English, 2024
1 383 kr
Ships within 10-15 business days
Do you want to gain a deeper understanding of how big tech analyses and exploits our text data, or investigate how political parties differ by analysing textual styles, associations and trends in documents? Or create a map of a text collection and write a simple QA system yourself? This book explores how to apply state-of-the-art text analytics methods to detect and visualise phenomena in text data. Solidly based on methods from corpus linguistics, natural language processing, text analytics and digital humanities, this book shows readers how to conduct experiments with their own corpora and research questions, underpin their theories, quantify the differences and pinpoint characteristics. Case studies and experiments are detailed in every chapter using real-world and open access corpora from politics, World English, history, and literature. The results are interpreted and put into perspective, pitfalls are pointed out, and necessary pre-processing steps are demonstrated. This book also demonstrates how to use the programming language R, as well as simple alternatives and additions to R, to conduct experiments and employ visualisations by example, with extensible R code, recipes, links to corpora, and a wide range of methods. The methods introduced can be used across texts of all disciplines, from history or literature to party manifestos and patient reports.
Text Analytics for Corpus Linguistics and Digital Humanities
Simple R Scripts and Tools
Paperback, English, 2025
324 kr
Ships within 10-15 business days
Do you want to gain a deeper understanding of how big tech analyses and exploits our text data, or investigate how political parties differ by analysing textual styles, associations and trends in documents? Or create a map of a text collection and write a simple QA system yourself? This book explores how to apply state-of-the-art text analytics methods to detect and visualise phenomena in text data. Solidly based on methods from corpus linguistics, natural language processing, text analytics and digital humanities, this book shows readers how to conduct experiments with their own corpora and research questions, underpin their theories, quantify the differences and pinpoint characteristics. Case studies and experiments are detailed in every chapter using real-world and open access corpora from politics, World English, history, and literature. The results are interpreted and put into perspective, pitfalls are pointed out, and necessary pre-processing steps are demonstrated. This book also demonstrates how to use the programming language R, as well as simple alternatives and additions to R, to conduct experiments and employ visualisations by example, with extensible R code, recipes, links to corpora, and a wide range of methods. The methods introduced can be used across texts of all disciplines, from history or literature to party manifestos and patient reports.
Linguistic Data Science and the English Passive
Modeling Diachronic Developments and Regional Variation
Hardcover, English, 2025
1 314 kr
Ships within 10-15 business days
The choice between BE and GET as auxiliary verbs, as in “She was promoted” vs “She got promoted”, is a central grammatical feature, yet the many proposed nuances conditioning this phenomenon have escaped large-scale empirical validation to date. This book fills this gap, using multivariate statistical analyses of several large corpora to explore different factors determining the choice of the English passive auxiliary. Addressing both diachronic developments (using the Corpus of Historical American English) and synchronic regional variation (using the Corpus of Global Web-based English), the book employs methods that combine traditional corpus linguistics with newer machine-learning tools in an innovative and intricate manner. To circumscribe the variable context, the authors train a statistical model to distinguish central from peripheral passives. The study tests the influence of various predictors, derived from the previous literature on the passive, with the use of automated sentiment analysis and subject detection, manual animacy coding, distributional semantics, and a mixed-effects regression model. Putting forward an automatic way of distinguishing more stative from more dynamic passives, the book demonstrates how to examine the passive construction in a much larger dataset than in previous studies, and shows how advanced computational models can be used to productively engage traditional philological questions, such as those related to language change and regional variation.