Sound source localization with a microphone array: Python implementation for 2D and 3D localization

Published 2022-08-01 18:09



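0 Basic sound-processing terminology

FT - Fourier transform: converts a signal between the time domain and the frequency domain.

FFT - fast Fourier transform: an efficient algorithm for computing the DFT on a computer.

DTFT - discrete-time Fourier transform: discrete in time, continuous in frequency.

DFT - discrete Fourier transform: discrete in both time and frequency.
It is equivalent to sampling a continuous time-domain signal at equal intervals and then applying the Fourier transform to a finite block of those samples, which makes the spectrum discrete as well. IFT, IFFT, IDFT, ... are the corresponding inverse transforms.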
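The relationship between the DFT and the FFT can be checked numerically. The sketch below is an illustration only: it evaluates the DFT sum directly and compares it with NumPy's FFT, using a made-up test signal of 3 sine cycles in 32 samples.

```python
import numpy as np

def dft(x):
    """Direct O(N^2) DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x))
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / len(x)) @ x

# Test signal (an assumption for this demo): 3 full cycles in 32 samples
x = np.sin(2 * np.pi * 3 * np.arange(32) / 32)

X_direct = dft(x)                  # slow, straight from the definition
X_fft = np.fft.fft(x)              # same result, computed by the FFT
x_back = np.fft.ifft(X_fft).real   # the inverse transform recovers the samples
peak_bin = int(np.argmax(np.abs(X_fft[:16])))  # strongest bin below Nyquist
```

Both routes give the same spectrum; the FFT only changes how fast it is computed.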


 

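STFT - short-time Fourier transform: before the Fourier transform, the signal is multiplied by a window function h(t) of finite duration, on the assumption that a non-stationary signal is stationary within the short analysis window. Sliding h(t) along the time axis analyzes the signal segment by segment and yields a set of local "spectra". The STFT matters a great deal in sound processing because it produces the spectrogram.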
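A minimal sketch of the windowing-and-sliding idea, written with plain NumPy rather than any particular audio library. The frame length, hop size, and the two-tone test signal are assumptions chosen for the demo.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Short-time Fourier transform with a Hann window.

    Returns an array of shape (num_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(num_frames)])
    return np.fft.rfft(frames, axis=1)

fs = 16000
t = np.arange(fs) / fs
# Non-stationary test signal: 500 Hz for the first half second, then 2000 Hz
x = np.where(t < 0.5, np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 2000 * t))

spec = np.abs(stft(x))                        # magnitude spectrogram
freqs = np.fft.rfftfreq(256, d=1 / fs)        # frequency of each column
first_peak = freqs[np.argmax(spec[0])]        # dominant frequency, first frame
last_peak = freqs[np.argmax(spec[-1])]        # dominant frequency, last frame
```

Each row of `spec` is one local spectrum, so the change from 500 Hz to 2000 Hz shows up along the frame axis.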

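MFCC - mel-frequency cepstral coefficients. Mel frequency: a non-linear frequency scale based on how the human ear judges equally spaced changes in pitch. A commonly used form of its relationship to frequency f in hertz is:

 m = 2595 * log10(1 + f / 700)

 Cepstrum: the result of taking the logarithm of a signal's spectrum and then applying the inverse Fourier transform.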
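Both definitions are short enough to try out directly. The sketch below assumes the common 2595 / 700 form of the mel formula, and demonstrates the real cepstrum on a hypothetical impulse-plus-echo signal, where the echo delay appears as a peak in the cepstrum.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Mel value for a frequency in Hz (2595 / 700 form, assumed here)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def real_cepstrum(x):
    """Inverse FFT of the log magnitude spectrum."""
    magnitude = np.abs(np.fft.fft(x))
    return np.fft.ifft(np.log(magnitude + 1e-12)).real

# Demo signal: a unit impulse followed by a half-amplitude echo 8 samples later
x = np.zeros(64)
x[0] = 1.0
x[8] = 0.5
cep = real_cepstrum(x)
echo_delay = int(np.argmax(cep[1:32])) + 1   # strongest non-zero quefrency
```

A full MFCC pipeline would add a mel filter bank and a DCT on top of these pieces; that part is not shown here.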

DCT - discrete cosine transform: represents a finite sequence of data points as a sum of cosine functions oscillating at different frequencies.
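As an illustration, the sum of cosines can be written out by hand and compared with SciPy. The sketch assumes the unnormalized DCT-II, which is the default of `scipy.fft.dct`; the input values are arbitrary.

```python
import numpy as np
from scipy.fft import dct, idct

def dct2_direct(x):
    """Unnormalized DCT-II: X[k] = 2 * sum_n x[n] * cos(pi * k * (2n + 1) / (2N))."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return 2.0 * np.cos(np.pi * k * (2 * n + 1) / (2 * N)) @ x

x = np.array([1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0])  # arbitrary sample data
X = dct(x, type=2)           # library DCT-II
X_direct = dct2_direct(x)    # same coefficients from the definition
x_back = idct(X, type=2)     # the inverse DCT recovers the data points
```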

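 Further reading: "幅度谱、相位谱、能量谱等语音信号处理中的基础知识" (basics of speech signal processing: magnitude spectrum, phase spectrum, energy spectrum), IMU_YY's blog on CSDN.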

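1 Introduction

1.1 What is a microphone array

A microphone array is a system made up of a number of microphones that samples and filters the spatial characteristics of a sound field. Common arrays are classified by layout as linear, planar, or three-dimensional. The geometry is known by design, all microphones have the same frequency response, and their sampling clocks are synchronized.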

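1.2 What microphone arrays are used for

Microphone arrays are typically used for sound source localization (measuring both angle and distance), for suppressing background noise, interference, reverberation, and echo, and for signal extraction and signal separation. Source localization uses the array to compute the angle and distance of a source relative to the array, so that the target source can be tracked.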
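To make the angle measurement concrete, here is a minimal two-microphone sketch. It is a simplification for illustration, not the method used by acoular later in this post: it estimates the time difference of arrival by cross-correlation and converts it to a far-field angle. The sample rate, the 0.2 m spacing, the 343 m/s speed of sound, and the synthetic noise signal are all assumptions.

```python
import numpy as np

def estimate_delay(sig_a, sig_b):
    """Delay of sig_a relative to sig_b in samples (positive if sig_a lags)."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    return int(np.argmax(corr)) - (len(sig_b) - 1)

def arrival_angle_deg(delay_samples, fs, mic_distance, c=343.0):
    """Far-field angle from broadside for a pair of microphones."""
    sin_theta = np.clip(c * delay_samples / fs / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))

rng = np.random.default_rng(0)
mic1 = rng.standard_normal(2048)   # synthetic broadband source signal
mic2 = np.roll(mic1, 3)            # same signal arriving 3 samples later

lag = estimate_delay(mic2, mic1)
angle = arrival_angle_deg(lag, fs=16000, mic_distance=0.2)
print(lag, round(angle, 1))
```

With more than two microphones the same delays, taken over many pairs, constrain distance as well as angle, which is what beamforming over a grid exploits.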


2 The acoular library

The implementation is built on the acoular library; the manual on the official site has detailed tutorials.

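2.1 Installation

The library can be installed with pip.

    pip install acoular

It can also be installed from source: download it from GitHub and change into the folder.

    python setup.py install

Start Python and check the installation.

    import acoular
    acoular.demo.acoular_demo.run()

If the demo with a 64-microphone array and three simulated sound sources appears, the installation succeeded.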
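Before running the demo, the installed version can be queried without importing the package at all. This is an optional check, sketched with the standard library only; `installed_version` is a helper name made up for this example.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("acoular")
if version is None:
    print("acoular not found; run `pip install acoular` first")
else:
    print("acoular version:", version)
```

Knowing the version helps when an attribute name in the code below differs from the documentation for the installed release.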

 3 2D localization

First prepare the XML configuration file for the microphone array. Only the number of microphones and their spatial coordinates need to be changed.

    <?xml version="1.0" encoding="utf-8"?>
    <MicArray name="array_64">
        <pos Name="Point 1" x="0.4" y="-0.1" z="0"/>
        <pos Name="Point 2" x="0.2" y="0" z="0"/>
        <pos Name="Point 3" x="0.1" y="0.1" z="0"/>
        <pos Name="Point 4" x="-0.4" y="0.4" z="0"/>
        <pos Name="Point 5" x="-0.2" y="0" z="0"/>
        <pos Name="Point 6" x="-0.1" y="-0.2" z="0"/>
    </MicArray>
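For larger arrays, writing the file by hand gets tedious. The sketch below generates the same element layout from a list of coordinates with the standard library. `mic_geom_xml` and the array name "array_6" are made up for this example, and the coordinates are the six points listed above.

```python
import xml.etree.ElementTree as ET

def mic_geom_xml(positions, name="array_6"):
    """Build a MicArray XML document from (x, y, z) tuples in meters."""
    root = ET.Element("MicArray", name=name)
    for i, (x, y, z) in enumerate(positions, start=1):
        ET.SubElement(root, "pos", Name=f"Point {i}",
                      x=str(x), y=str(y), z=str(z))
    header = '<?xml version="1.0" encoding="utf-8"?>\n'
    return header + ET.tostring(root, encoding="unicode")

positions = [(0.4, -0.1, 0), (0.2, 0, 0), (0.1, 0.1, 0),
             (-0.4, 0.4, 0), (-0.2, 0, 0), (-0.1, -0.2, 0)]
xml_text = mic_geom_xml(positions)

# Parse the text back as a sanity check before saving it to disk
parsed = ET.fromstring(xml_text.encode("utf-8"))
print(len(parsed.findall("pos")), "microphones")
```

To use it, write `xml_text` to a file such as `array_6.xml` and pass that path as the geometry file.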

Next, prepare a recording from this microphone array. If you are using a USB microphone array, connect it first and then look up its device ID.

    import pyaudio

    # On Linux, `cat /proc/asound/devices` also lists the sound devices.
    p = pyaudio.PyAudio()
    info = p.get_host_api_info_by_index(0)
    num_devices = info.get('deviceCount')
    for i in range(num_devices):
        dev = p.get_device_info_by_host_api_device_index(0, i)
        # Only devices with input channels can be used for recording
        if dev.get('maxInputChannels') > 0:
            print('INPUT DEVICE ID:', i, '-', dev.get('name'))
    p.terminate()

Record the audio and save it as WAV (a common format for audio processing). Parameters such as the sample rate and the channel count must be adjusted to your device.

    import argparse
    import wave

    import numpy as np
    import pyaudio

    CHANNELS = 6   # number of microphones in the array; adjust to your device
    RATE = 16000   # sample rate in Hz; adjust to your device
    CHUNK = 8000   # frames read per block

    p = pyaudio.PyAudio()

    def record_voice(micid):
        # Open the microphone and set the stream format
        stream = p.open(format=pyaudio.paInt16, channels=CHANNELS, rate=RATE,
                        input=True, frames_per_buffer=CHUNK,
                        input_device_index=micid)
        return stream

    if __name__ == '__main__':
        parser = argparse.ArgumentParser(
            description="Record the microphone to a WAV file; the device ID is required.")
        parser.add_argument('micid', type=int, help="the ID of the mic device")
        args = parser.parse_args()
        stream = record_voice(args.micid)
        frames = []
        print('Recording started!')
        for _ in range(20):  # 20 blocks of 8000 frames = 10 s at 16 kHz
            data = stream.read(CHUNK, exception_on_overflow=False)
            # Convert to numpy to monitor the peak value of each block
            audio_data = np.frombuffer(data, dtype=np.int16)
            peak = np.max(np.abs(audio_data.astype(np.int32)))
            print('Current peak value:', peak)
            frames.append(data)
        print('Recording stopped!')
        stream.stop_stream()
        stream.close()
        p.terminate()
        wf = wave.open("./recordV.wav", 'wb')
        wf.setnchannels(CHANNELS)  # must match the channel count of the stream
        wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
        wf.setframerate(RATE)
        wf.writeframes(b''.join(frames))
        wf.close()
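Before converting the recording, it is worth confirming that the WAV header really carries the channel count and sample rate that were used for the stream. The sketch below uses only the standard `wave` module and NumPy; `wav_info` and `read_channels` are helper names made up here, and the demo runs on a small synthetic 6-channel file rather than a real recording.

```python
import os
import tempfile
import wave

import numpy as np

def wav_info(path):
    """Read the header fields of a WAV file."""
    with wave.open(path, "rb") as wf:
        return {"channels": wf.getnchannels(), "rate": wf.getframerate(),
                "sample_width": wf.getsampwidth(), "frames": wf.getnframes()}

def read_channels(path):
    """Return 16-bit samples with shape (frames, channels)."""
    with wave.open(path, "rb") as wf:
        raw = wf.readframes(wf.getnframes())
        return np.frombuffer(raw, dtype=np.int16).reshape(-1, wf.getnchannels())

# Synthetic stand-in for a recording: 100 frames, 6 interleaved channels
demo = np.arange(600, dtype=np.int16).reshape(100, 6)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(6)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(demo.tobytes())

info = wav_info(demo_path)
samples = read_channels(demo_path)
print(info, samples.shape)
```

If the header reports one channel for a six-microphone recording, the samples are still interleaved in the file but every downstream tool will misread them.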

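With the WAV file in hand, convert it to an H5 file, because acoular reads audio in H5 format. Change the file name below; the sample rate is taken from the WAV file.

    import tables
    import scipy.io.wavfile as wavfile

    name_ = "your_audio_file_name"  # file name without the .wav extension
    # data has shape (samples, channels) for a multi-channel WAV file
    samplerate, data = wavfile.read(name_ + '.wav')
    meh5 = tables.open_file(name_ + ".h5", mode="w")
    meh5.create_earray('/', 'time_data', obj=data)
    # Store the sample rate read from the WAV header instead of a fixed value
    meh5.set_node_attr('/time_data', 'sample_freq', samplerate)
    meh5.close()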

At this point both the H5 audio file and the XML configuration file are ready, and acoular can be used to locate the sound source.

    import acoular
    import pylab as plt

    micgeofile = '/home/......./array_6.xml'  # path to the microphone geometry file
    # Focus grid that is scanned for sources: 2 m x 2 m at z = 0.3 m, 1 cm spacing
    rg = acoular.RectGrid(x_min=-1, x_max=1, y_min=-1, y_max=1, z=0.3, increment=0.01)
    mg = acoular.MicGeom(from_file=micgeofile)  # read the microphone positions
    ts = acoular.TimeSamples(name='memory.h5')  # path to the H5 recording
    print(ts.numsamples)
    print(ts.numchannels)
    print(ts.sample_freq)
    print(ts.data)
    # Split into blocks and apply a window
    ps = acoular.PowerSpectra(time_data=ts, block_size=128, window='Hanning')

    plt.ion()  # switch on interactive plotting mode
    print(mg.mpos[0], type(mg.mpos))
    plt.plot(mg.mpos[0], mg.mpos[1], 'o')  # plot the microphone layout
    plt.show()
    plt.waitforbuttonpress()

    env = acoular.Environment(c=346.04)  # speed of sound in m/s
    # Steering vectors based on a single-source transfer model
    st = acoular.SteeringVector(grid=rg, mics=mg, env=env)
    # Basic delay-and-sum beamforming in the frequency domain
    bb = acoular.BeamformerBase(freq_data=ps, steer=st)
    pm = bb.synthetic(2000, 2)  # 1/2-octave band centered on 2 kHz
    Lm = acoular.L_p(pm)        # convert to sound pressure level in dB

    plt.figure()  # open new figure
    # vmin sets the displayed dB range below the maximum; widen it to see more of the map
    plt.imshow(Lm.T, origin='lower', vmin=Lm.max() - 0.1, extent=rg.extend())
    plt.colorbar()
    plt.waitforbuttonpress()
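To show what delay-and-sum beamforming in the frequency domain does conceptually, here is a NumPy-only sketch. It is not acoular's implementation. It assumes a single simulated monopole source at a hypothetical position, one frequency (2 kHz), a speed of sound of 343 m/s, the six microphone positions from the XML file above, and a coarse 11 x 11 focus grid.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def steering(points, mics, freq):
    """Monopole transfer functions from each point to each microphone, shape (P, M)."""
    r = np.linalg.norm(points[:, None, :] - mics[None, :, :], axis=2)
    k = 2 * np.pi * freq / C
    return np.exp(-1j * k * r) / r

mics = np.array([[0.4, -0.1, 0], [0.2, 0, 0], [0.1, 0.1, 0],
                 [-0.4, 0.4, 0], [-0.2, 0, 0], [-0.1, -0.2, 0]], dtype=float)

# Focus grid: 11 x 11 points in the plane z = 0.3 m
axis = np.linspace(-0.5, 0.5, 11)
gx, gy = np.meshgrid(axis, axis, indexing="ij")
grid = np.column_stack([gx.ravel(), gy.ravel(), np.full(gx.size, 0.3)])

freq = 2000.0
source = np.array([[0.2, -0.1, 0.3]])     # hypothetical source position
h_src = steering(source, mics, freq)[0]   # what the microphones receive
csm = np.outer(h_src, h_src.conj())       # cross spectral matrix, shape (M, M)

# Steer to every grid point with a normalized steering vector and
# evaluate the output power w^H C w for each point.
h = steering(grid, mics, freq)
w = h / np.linalg.norm(h, axis=1, keepdims=True)
power = np.einsum("pm,mn,pn->p", w.conj(), csm, w).real
best = grid[np.argmax(power)]
print("estimated source position:", best)
```

The map `power` plays the role of `pm` in the acoular script: it peaks where the assumed delays match the measured phase differences.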

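 [Figure: result of the 2D run]

 4 3D localization

3D localization is slower. Change the XML and H5 paths in the script below; the grid resolution and the displayed decibel range can also be adjusted.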

    # -*- coding: utf-8 -*-
    """
    Example "3D beamforming" for Acoular library.
    Demonstrates a 3D beamforming setup with point sources.
    Simulates data on 64 channel array,
    subsequent beamforming with CLEAN-SC on 3D grid.
    Copyright (c) 2019 Acoular Development Team.
    All rights reserved.
    """
    from os import path

    # imports from acoular
    # (WNoiseGenerator, PointSource and SourceMixer are only needed
    # for the commented-out simulation below)
    import acoular
    from acoular import L_p, MicGeom, PowerSpectra, \
        RectGrid3D, BeamformerCleansc, \
        SteeringVector, WNoiseGenerator, PointSource, SourceMixer

    # other imports
    from numpy import array, sum
    import mpl_toolkits.mplot3d
    from pylab import figure, show, subplot, imshow, title, colorbar, \
        xlabel, ylabel

    #===============================================================================
    # First, we define the microphone geometry.
    #===============================================================================
    micgeofile = path.join('/home/sunshine/桌面/code_C_PY_2022/py/7.acoular库mvdr实现音源定位/array_6.xml')
    m = MicGeom( from_file=micgeofile )

    #===============================================================================
    # Now, the sources (signals and types/positions) are defined.
    # This simulation is disabled because the data comes from a recording.
    #===============================================================================
    # sfreq = 51200
    # duration = 1
    # nsamples = duration*sfreq
    # n1 = WNoiseGenerator( sample_freq=sfreq, numsamples=nsamples, seed=1 )
    # n2 = WNoiseGenerator( sample_freq=sfreq, numsamples=nsamples, seed=2, rms=0.5 )
    # n3 = WNoiseGenerator( sample_freq=sfreq, numsamples=nsamples, seed=3, rms=0.25 )
    # p1 = PointSource( signal=n1, mics=m, loc=(-0.1,-0.1,0.3) )
    # p2 = PointSource( signal=n2, mics=m, loc=(0.15,0,0.17) )
    # p3 = PointSource( signal=n3, mics=m, loc=(0,0.1,0.25) )
    # pa = SourceMixer( sources=[p1,p2,p3])

    #===============================================================================
    # the 3D grid (very coarse to enable fast computation for this example)
    #===============================================================================
    g = RectGrid3D(x_min=-0.2, x_max=0.2,
                   y_min=-0.2, y_max=0.2,
                   z_min=0.1, z_max=0.36,
                   increment=0.02)

    #===============================================================================
    # The following provides the cross spectral matrix and defines the CLEAN-SC beamformer.
    # To be really fast, we restrict ourselves to FFT lines 5 to 15.
    # The original example's range of 2000 - 6000 Hz (5*400 - 15*400) assumed
    # 51200 Hz sampling; for a 16 kHz recording with block_size=128 the lines
    # are 125 Hz apart, so they cover 625 - 1875 Hz.
    #===============================================================================
    pa = acoular.TimeSamples( name='memory.h5' )  # read the H5 recording
    f = PowerSpectra(time_data=pa,
                     window='Hanning',
                     overlap='50%',
                     block_size=128,
                     ind_low=5, ind_high=16)
    st = SteeringVector(grid=g, mics=m, steer_type='true location')
    b = BeamformerCleansc(freq_data=f, steer=st)

    #===============================================================================
    # Calculate the result for the 1/2-octave band centered on 2 kHz
    #===============================================================================
    result = b.synthetic(2000, 2)

    #===============================================================================
    # Display views of setup and result.
    # For each view, the values along the respective axis are summed.
    # Note that, while Acoular uses a left-oriented coordinate system,
    # for display purposes, the z-axis is inverted, plotting the data in
    # a right-oriented coordinate system.
    #===============================================================================
    fig = figure(1, (8, 8))

    # plot the results
    subplot(221)
    map_z = sum(result, 2)
    mx = L_p(map_z.max())
    # vmin=mx-1 shows only the top 1 dB; widen it to adjust the decibel range
    imshow(L_p(map_z.T), vmax=mx, vmin=mx-1, origin='lower', interpolation='nearest',
           extent=(g.x_min, g.x_max, g.y_min, g.y_max))
    xlabel('x')
    ylabel('y')
    title('Top view (xy)' )

    subplot(223)
    map_y = sum(result, 1)
    imshow(L_p(map_y.T), vmax=mx, vmin=mx-1, origin='upper', interpolation='nearest',
           extent=(g.x_min, g.x_max, -g.z_max, -g.z_min))
    xlabel('x')
    ylabel('z')
    title('Side view (xz)' )

    subplot(222)
    map_x = sum(result, 0)
    imshow(L_p(map_x), vmax=mx, vmin=mx-1, origin='lower', interpolation='nearest',
           extent=(-g.z_min, -g.z_max, g.y_min, g.y_max))
    xlabel('z')
    ylabel('y')
    title('Side view (zy)' )
    colorbar()

    # plot the setup
    # ax0 = fig.add_subplot((224), projection='3d')
    # ax0.scatter(m.mpos[0],m.mpos[1],-m.mpos[2])
    # source_locs=array([p1.loc,p2.loc,p3.loc]).T
    # ax0.scatter(source_locs[0],source_locs[1],-source_locs[2])
    # ax0.set_xlabel('x')
    # ax0.set_ylabel('y')
    # ax0.set_zlabel('z')
    # ax0.set_title('Setup (mic and source positions)')

    # only display result on screen if this script is run directly
    if __name__ == '__main__': show()
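The frequency settings in the script interact: `ind_low` and `ind_high` choose which FFT lines are computed, and `synthetic(2000, 2)` then sums the lines inside a band around 2 kHz. The sketch below works out both for a 16 kHz recording. It assumes 1/n-octave band edges at center * 2^(±0.5/n); acoular's exact line selection at the band edges may differ slightly, so treat the result as an estimate.

```python
import numpy as np

def line_frequencies(fs, block_size, ind_low, ind_high):
    """Frequencies in Hz of FFT lines ind_low .. ind_high - 1."""
    return np.arange(ind_low, ind_high) * fs / block_size

def band_edges(center, num):
    """Lower and upper edge of a 1/num-octave band (assumed definition)."""
    return center * 2.0 ** (-0.5 / num), center * 2.0 ** (0.5 / num)

lines = line_frequencies(fs=16000, block_size=128, ind_low=5, ind_high=16)
lo, hi = band_edges(2000.0, 2)
used = lines[(lines >= lo) & (lines <= hi)]   # computed lines inside the band
print("computed lines:", lines)
print("band: %.1f - %.1f Hz, lines used:" % (lo, hi), used)
```

If `used` comes out empty or very short for your sample rate, raise `ind_high` or move the band center so the band and the computed lines overlap.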








Author: 我是小白兔

Link: https://www.pythonheidong.com/blog/article/1630390/be68bbfb41d87c1aa5c0/

Source: python黑洞网

Any reproduction, in any form, must credit the source. Infringement will be pursued under the law once discovered.
