Image‑Based Flow Prediction of Vocal Folds Using 3D Convolutional Neural Networks

doi:10.1007/s42235-023-00466-3

Journal of Bionic Engineering, 2024, Vol. 21, Issue 2: 991-1002. doi: 10.1007/s42235-023-00466-3

Yang Zhang1; Tianmei Pu2; Jiasen Xu1; Chunhua Zhou3

1 College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
2 College of Electrical, Energy and Power Engineering, Yangzhou University, Yangzhou 225127, China
3 College of Aeronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

Online: 2024-01-30 Published: 2024-04-09
Contact: Yang Zhang, E-mail: zhy@nuaa.edu.cn

Abstract

In this work, a three-dimensional (3D) convolutional neural network (CNN) model based on image slices of various normal and pathological vocal folds is proposed for accurate and efficient prediction of glottal flows. The 3D CNN model is composed of a feature extraction block and a regression block. The feature extraction block learns low-dimensional features from the high-dimensional image data of the glottal shape, and the regression block flattens the output of the feature extraction block and produces the desired glottal flow data. The input image data is a condensed set of 2D image slices captured in the axial plane of the 3D vocal folds, where the glottal shapes are synthesized from the equations of normal vibration modes. The output flow data consists of the corresponding flow rate, the averaged glottal pressure, and the nodal pressure distribution over the glottal surface. The 3D CNN model is built to establish the mapping between the input image data and the output flow data. The ground-truth flow variables of each glottal shape in the training and test datasets are obtained with a high-fidelity sharp-interface immersed-boundary solver. The proposed model is trained and then used to predict the flow variables of interest for the glottal shapes in the test set. The present 3D CNN model is more efficient than traditional Computational Fluid Dynamics (CFD) models while retaining their accuracy, and more powerful than previous data-driven prediction models because it provides more details of the glottal flow. The accuracy and efficiency of the trained 3D CNN model indicate that it could be promising for future clinical applications.

Keywords: Vocal folds · Computational fluid dynamics · Machine learning · 3D convolutional neural network
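
The two-block structure described in the abstract (a feature extraction block followed by a regression block that flattens the features and outputs flow quantities) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single convolution layer, the 2x2x2 averaging kernel, the toy 3x4x4 input, the pixel encoding, the constant weights, and the choice of two outputs. The actual model presumably uses several learned layers and many more outputs, including the nodal pressures.

```python
# Illustrative sketch only: layer sizes, kernel values, and weights are
# hypothetical and are not the architecture or parameters used in the paper.

def conv3d_valid(volume, kernel):
    """Naive single-channel 3D cross-correlation, stride 1, no padding."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(D - kd + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

def relu3d(volume):
    """Element-wise ReLU over a nested-list volume."""
    return [[[max(0.0, v) for v in row] for row in plane] for plane in volume]

def flatten(volume):
    """Flatten a depth x height x width volume into a 1D feature list."""
    return [v for plane in volume for row in plane for v in row]

def linear(features, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * f for w, f in zip(w_row, features)) + b
            for w_row, b in zip(weights, bias)]

def predict(slices, kernel, weights, bias):
    features = relu3d(conv3d_valid(slices, kernel))   # feature extraction block
    return linear(flatten(features), weights, bias)   # regression block

# Toy input: 3 axial slices of 4x4 pixels (assumed encoding: 1 = tissue, 0 = airway).
slices = [[[1, 0, 0, 1] for _ in range(4)] for _ in range(3)]
kernel = [[[1.0 / 8] * 2 for _ in range(2)] for _ in range(2)]  # 2x2x2 averaging kernel
n_features = (3 - 2 + 1) * (4 - 2 + 1) * (4 - 2 + 1)            # flattened feature count
n_outputs = 2  # hypothetical: flow rate and averaged glottal pressure
weights = [[0.1] * n_features for _ in range(n_outputs)]
bias = [0.0, 1.0]

outputs = predict(slices, kernel, weights, bias)
print(outputs)
```

In a trained model the kernel and the regression weights would be fitted to the solver-generated ground-truth data rather than set by hand as they are here.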

Yang Zhang, Tianmei Pu, Jiasen Xu & Chunhua Zhou. Image‑Based Flow Prediction of Vocal Folds Using 3D Convolutional Neural Networks[J]. Journal of Bionic Engineering, 2024, 21(2): 991-1002.
