Single Channel Phase-Aware Signal Processing in Speech Communication
Theory and Practice
1 158 kr
Product information
- Publication date: 2016-12-23
- Dimensions: 173 x 246 x 18 mm
- Weight: 544 g
- Format: Hardcover
- Language: English
- Number of pages: 256
- Publisher: John Wiley & Sons Inc
- ISBN: 9781119238812
More about the authors
Pejman Mowlaee, Graz University of Technology, Austria
Dr. Mowlaee is a Senior Research and Teaching Associate at the Speech Communication and Signal Processing Laboratory, Graz University of Technology, Austria. He has received several awards, including best M.Sc. thesis, awarded by the National Scientific Students’ Organization of Electrical Engineering in 2007, and was a member of the organizing committee for the annual European Signal Processing Conference in 2010 in Aalborg and the AUDIS workshop 2012 in Aachen. He has contributed over 40 journal and conference articles, is a senior member of IEEE, has acted as a reviewer for a number of journals, and has played an active role in organizing special sessions on the topic of the book at INTERSPEECH conferences. Dr. Mowlaee is also Guest Editor for the forthcoming Special Issue in Speech Communication (Elsevier) on Phase-Aware Signal Processing for Speech Communication.

Johannes Stahl, Graz University of Technology, Austria
In 2009, Johannes started studying Electrical Engineering and Audio Engineering at Graz University of Technology. In 2015, he received his Dipl.-Ing. (MSc) degree with distinction. In 2015 he joined the Signal Processing and Speech Communication Laboratory at Graz University of Technology, where he is currently pursuing his PhD thesis in the field of speech processing.

Josef Kulmer, Graz University of Technology, Austria
Josef received the M.Sc. degree from Graz University of Technology, Austria, in 2014. In 2014 he joined the Signal Processing and Speech Communication Laboratory at Graz University of Technology, where he is currently pursuing his PhD thesis in the field of signal processing.

Florian Mayer, Graz University of Technology, Austria
In 2006, Florian started studying Electrical Engineering and Audio Engineering at Graz University of Technology, and received his Dipl.-Ing. (MSc) in 2015. In 2015 he joined the Signal Processing and Speech Communication Laboratory at Graz University of Technology, where he is currently pursuing his PhD thesis in the field of speech processing.
Table of contents
- About the Authors
- Preface
- List of Symbols
- Part I History, Theory and Concepts
- 1 Introduction: Phase Processing, History (Pejman Mowlaee)
  - 1.1 Chapter Organization
  - 1.2 Conventional Speech Communication
  - 1.3 Historical Overview of the Importance or Unimportance of Phase
  - 1.4 Importance of Phase in Speech Processing
    - 1.4.1 Speech Enhancement
      - 1.4.1.1 Unimportance of Phase in Speech Enhancement
      - 1.4.1.2 Effects of Phase Modification in Speech Signals
      - 1.4.1.3 Phase Spectrum Compensation
      - 1.4.1.4 Phase Importance for Improved Signal Reconstruction
    - 1.4.2 Speech Watermarking
    - 1.4.3 Speech Coding
    - 1.4.4 Artificial Bandwidth Extension
    - 1.4.5 Speech Synthesis
    - 1.4.6 Speech/Speaker Recognition
  - 1.5 Structure of the Book
  - 1.6 Experiments
    - 1.6.1 Experiment 1.1: Phase Unimportance in Speech Enhancement
    - 1.6.2 Experiment 1.2: Effects of Phase Modification
    - 1.6.3 Experiment 1.3: Mismatched Window
    - 1.6.4 Experiment 1.4: Phase Spectrum Compensation
  - 1.7 Summary
  - References
- 2 Fundamentals of Phase-Based Signal Processing (Pejman Mowlaee)
  - 2.1 Chapter Organization
  - 2.2 STFT Phase: Background and Some Remarks
    - 2.2.1 Short-Time Fourier Transform
    - 2.2.2 Fourier Analysis of Speech: STFT Amplitude and Phase
  - 2.3 Phase Unwrapping
    - 2.3.1 Problem Definition
    - 2.3.2 Remarks on Phase Unwrapping
    - 2.3.3 Phase Unwrapping Solutions
      - 2.3.3.1 Detecting Discontinuities
      - 2.3.3.2 Numerical Integration (NI)
      - 2.3.3.3 Isolating Sharp Zeros
      - 2.3.3.4 Iterative Phase Unwrapping
      - 2.3.3.5 Polynomial Factorization (PF)
      - 2.3.3.6 Time Series Approach
      - 2.3.3.7 Composite Method
      - 2.3.3.8 Schur–Cohn and Nyquist Frequency
  - 2.4 Useful Phase-Based Representations
    - 2.4.1 Group Delay Representations
    - 2.4.2 Instantaneous Frequency
    - 2.4.3 Baseband Phase Difference
    - 2.4.4 Harmonic Phase Decomposition
      - 2.4.4.1 Background on the Harmonic Model
      - 2.4.4.2 Phase Decomposition using the Harmonic Model
    - 2.4.5 Phasegram: Unwrapped Harmonic Phase
      - 2.4.5.1 Definitions and Background
      - 2.4.5.2 Circular Mean and Variance
    - 2.4.6 Relative Phase Shift
    - 2.4.7 Phase Distortion
  - 2.5 Experiments
    - 2.5.1 Experiment 2.1: One-Dimensional Phase Unwrapping
      - 2.5.1.1 Clean Signal Scenario
      - 2.5.1.2 Noisy Signal Scenario
    - 2.5.2 Experiment 2.2: Comparative Study of Phase Unwrapping Methods
    - 2.5.3 Experiment 2.3: Comparative Study on Group Delay Spectra
    - 2.5.4 Experiment 2.4: Circular Statistics of the Harmonic Phase
    - 2.5.5 Experiment 2.5: Circular Statistics of the Spectral Phase
    - 2.5.6 Experiment 2.6: Comparative Study of Phase Representations
  - 2.6 Summary
  - References
- 3 Phase Estimation Fundamentals (Josef Kulmer and Pejman Mowlaee)
  - 3.1 Chapter Organization
  - 3.2 Phase Estimation Fundamentals
    - 3.2.1 Background and Fundamentals
    - 3.2.2 Key Examples: Phase Estimation Problem
      - 3.2.2.1 Example 1: Discrete-Time Sinusoid
      - 3.2.2.2 Example 2: Discrete-Time Sinusoid in Noise
    - 3.2.3 Phase Estimation
      - 3.2.3.1 Maximum Likelihood Estimation
      - 3.2.3.2 Maximum a Posteriori Estimation
  - 3.3 Existing Solutions
    - 3.3.1 Iterative Signal Reconstruction
      - 3.3.1.1 Background
      - 3.3.1.2 Griffin–Lim Algorithm (GLA)
      - 3.3.1.3 Extensions of the GLA
    - 3.3.2 Phase Reconstruction Across Time
    - 3.3.3 Phase Reconstruction Across Frequency
    - 3.3.4 Phase Randomization
    - 3.3.5 Geometry-Based Phase Estimation
    - 3.3.6 Least Squares (LS)
    - 3.3.7 Spectro-Temporal Smoothing of Unwrapped Phase
      - 3.3.7.1 Signal Segmentation
      - 3.3.7.2 Linear Phase Removal
      - 3.3.7.3 Apply Smoothing Filter
      - 3.3.7.4 Reconstruction of the Enhanced-Phase Signal
  - 3.4 Experiments
    - 3.4.1 Experiment 3.1: Monte Carlo Simulation Comparing ML and MAP
    - 3.4.2 Experiment 3.2: Monte Carlo Simulation on Window Impact
    - 3.4.3 Experiment 3.3: Phase Recovery Using the Griffin–Lim Algorithm
    - 3.4.4 Experiment 3.4: Phase Estimation for Speech Enhancement: A Comparative Study
  - 3.5 Summary
  - References
- Part II Applications
- 4 Phase Processing for Single-Channel Speech Enhancement (Johannes Stahl and Pejman Mowlaee)
  - 4.1 Introduction and Chapter Organization
  - 4.2 Speech Enhancement in the STFT Domain: General Concepts
    - 4.2.1 A priori SNR Estimation
      - 4.2.1.1 Decision-Directed a priori SNR Estimation
      - 4.2.1.2 Cepstro-Temporal Smoothing
    - 4.2.2 Noise PSD Estimation
      - 4.2.2.1 Minimum Statistics
  - 4.3 Conventional Speech Enhancement
    - 4.3.1 Statistical Model
    - 4.3.2 Short-Time Spectral Amplitude Estimation
  - 4.4 Phase-Sensitive Speech Enhancement
    - 4.4.1 Phase Estimation for Signal Reconstruction
    - 4.4.2 Spectral Amplitude Estimation Given the STFT Phase
    - 4.4.3 Iterative Closed-Loop Phase-Aware Single-Channel Speech Enhancement
    - 4.4.4 Incorporating Voiced/Unvoiced Uncertainty
    - 4.4.5 Uncertainty in Prior Phase Information
    - 4.4.6 Stochastic–Deterministic MMSE-STFT Speech Enhancement
      - 4.4.6.1 Obtaining the Speech Parameters
  - 4.5 Experiments
    - 4.5.1 Experiment 4.1: Proof of Concept
    - 4.5.2 Experiment 4.2: Consistency
    - 4.5.3 Experiment 4.3: Sensitivity Analysis
  - 4.6 Summary
  - References
- 5 Phase Processing for Single-Channel Source Separation (Pejman Mowlaee and Florian Mayer)
  - 5.1 Chapter Organization
  - 5.2 Why Single-Channel Source Separation?
    - 5.2.1 Background
    - 5.2.2 Problem Formulation
  - 5.3 Conventional Single-Channel Source Separation
    - 5.3.1 Source-Driven SCSS
      - 5.3.1.1 Ideal Binary Mask
      - 5.3.1.2 Ideal Ratio Mask
    - 5.3.2 Model-Based SCSS
      - 5.3.2.1 Deep Learning
      - 5.3.2.2 Non-Negative Matrix Factorization
  - 5.4 Phase Processing for Single-Channel Source Separation
    - 5.4.1 Complex Matrix Factorization Methods
      - 5.4.1.1 Complex Matrix Factorization
      - 5.4.1.2 Complex Matrix Factorization with Intra-Source Additivity
    - 5.4.2 Phase Importance for Signal Reconstruction
      - 5.4.2.1 Multiple Input Spectrogram Inversion
      - 5.4.2.2 Partial Phase Reconstruction
      - 5.4.2.3 Informed Source Separation Using Iterative Reconstruction (ISSIR)
      - 5.4.2.4 Sinusoidal-Based PPR
      - 5.4.2.5 Spectrogram Consistency
      - 5.4.2.6 Geometry-Based Phase Estimation
      - 5.4.2.7 Phase Decomposition and Temporal Smoothing
      - 5.4.2.8 Phase Reconstruction of Spectrograms with Linear Unwrapping
    - 5.4.3 Phase-Aware Time–Frequency Masks
      - 5.4.3.1 Phase-Insensitive Masks
      - 5.4.3.2 Phase-Sensitive Mask
      - 5.4.3.3 Complex Ratio Mask
      - 5.4.3.4 Complex Mask
    - 5.4.4 Phase Importance in Signal Interaction Models
  - 5.5 Experiments
    - 5.5.1 Experiment 5.1: Phase Estimation for Proof-of-Concept Signal Reconstruction
    - 5.5.2 Experiment 5.2: Comparative Study of GLA-Based Phase Reconstruction Methods
      - 5.5.2.1 Convergence Analysis
      - 5.5.2.2 Quantized Scenario
    - 5.5.3 Experiment 5.3: Phase-Aware Time–Frequency Mask
    - 5.5.4 Experiment 5.4: Phase-Sensitive Interaction Functions
    - 5.5.5 Experiment 5.5: Complex Matrix Factorization
  - 5.6 Summary
  - References
- 6 Phase-Aware Speech Quality Estimation (Pejman Mowlaee)
  - 6.1 Chapter Organization
  - 6.2 Introduction: Speech Quality Estimation
    - 6.2.1 General Definition of Speech Quality
    - 6.2.2 Speech Quality Estimators: Amplitude, Phase, or Both?
  - 6.3 Conventional Instrumental Metrics for Speech Quality Estimation
    - 6.3.1 Perceived Quality
    - 6.3.2 Speech Intelligibility
  - 6.4 Why Phase-Aware Metrics?
    - 6.4.1 Phase and Speech Intelligibility
    - 6.4.2 Phase and Perceived Quality
  - 6.5 New Phase-Aware Metrics
    - 6.5.1 Group Delay Deviation
    - 6.5.2 Instantaneous Frequency Deviation
    - 6.5.3 Unwrapped MSE
    - 6.5.4 Phase Deviation
    - 6.5.5 UnHPSNR and UnRMSE
  - 6.6 Subjective Tests
    - 6.6.1 CCR Test
    - 6.6.2 MUSHRA Test
    - 6.6.3 Statistical Analysis
    - 6.6.4 Speech Intelligibility Test
    - 6.6.5 Evaluation of Speech Quality Measures
  - 6.7 Experiments
    - 6.7.1 Experiment 6.1: Impact of Phase Modifications on Speech Quality
    - 6.7.2 Experiment 6.2: Phase and Perceived Quality Estimation
    - 6.7.3 Experiment 6.3: Phase and Speech Intelligibility Estimation
    - 6.7.4 Experiment 6.4: Evaluating the Phase Estimation Accuracy
  - 6.8 Summary
  - References
- 7 Conclusion and Future Outlook (Pejman Mowlaee)
  - 7.1 Chapter Organization
  - 7.2 Renaissance of Phase-Aware Signal Processing: Decline and Rise
  - 7.3 Directions for Future Research
    - 7.3.1 Related Research Disciplines
      - 7.3.1.1 Phase-Aware Processing for Speech and Speaker Recognition
      - 7.3.1.2 Speech Synthesis and Speech Coding
      - 7.3.1.3 Phase-Aware Speech Enhancement for De-Reverberation
      - 7.3.1.4 Iterative Signal Estimation
      - 7.3.1.5 More Robust Phase Estimators
      - 7.3.1.6 Instrumental Measures in Complex Signal Domain
      - 7.3.1.7 Multi-Channel Speech Processing
    - 7.3.2 Other Research Disciplines
      - 7.3.2.1 Processing Non-Speech Signals
      - 7.3.2.2 Processing Signals of Higher Dimensionality Than One
  - 7.4 Summary
  - References
- A MATLAB Toolbox
  - A.1 Chapter Organization
  - A.2 Phase Lab Toolbox
    - A.2.1 MATLAB® Code
    - A.2.2 Additional Material
  - References
- Index