function y = istft(spec,parameter)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Name: istft
% Date: 03-2014
% Programmer: Jonathan Driedger
% http://www.audiolabs-erlangen.de/resources/MIR/TSMtoolbox/
%
% Computing the 'inverse' of the stft according to the paper "Signal
% Estimation from Modified Short-Time Fourier Transform" by Griffin and
% Lim.
%
% Input:  spec              a complex spectrogram generated by stft.
%         parameter.
%          synHop           hop size of the synthesis window.
%          win              the synthesis window.
%          zeroPad          number of zeros that were padded to the
%                           window to increase the fft size and therefore
%                           the frequency resolution.
%          numOfIter        number of iterations the algorithm should
%                           perform to adapt the phase.
%          origSigLen       original length of the audio signal such that
%                           the output can be trimmed accordingly.
%          restoreEnergy    when windowing the synthesis frames, there is a
%                           potential for some energy loss. This option
%                           will rescale every windowed synthesis frame to
%                           compensate for this energy leakage.
%          fftShift         in case the stft was computed with an fftShift,
%                           setting this parameter to 1 will compensate for
%                           that by applying the same shifting operation
%                           again to each frame after the ifft.
%
% Output: y                 the time-domain signal.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Reference:
% If you use the 'TSM toolbox' please refer to:
% [DM14] Jonathan Driedger, Meinard Mueller
%        TSM Toolbox: MATLAB Implementations of Time-Scale Modification
%        Algorithms
%        Proceedings of the 17th International Conference on Digital Audio
%        Effects, Erlangen, Germany, 2014.
%
% License:
% This file is part of 'TSM toolbox'.
%
% MIT License
%
% Copyright (c) 2021 Jonathan Driedger, Meinard Mueller, International Audio
% Laboratories Erlangen, Germany.
%
% We thank the German Research Foundation (DFG) for various research grants
% that allow us for conducting fundamental research in music processing.
% The International Audio Laboratories Erlangen are a joint institution of
% the Friedrich-Alexander-Universitaet Erlangen-Nuernberg (FAU) and
% Fraunhofer Institute for Integrated Circuits IIS.
%
% Permission is hereby granted, free of charge, to any person obtaining a
% copy of this software and associated documentation files (the
% "Software"), to deal in the Software without restriction, including
% without limitation the rights to use, copy, modify, merge, publish,
% distribute, sublicense, and/or sell copies of the Software, and to permit
% persons to whom the Software is furnished to do so, subject to the
% following conditions:
%
% The above copyright notice and this permission notice shall be included
% in all copies or substantial portions of the Software.
%
% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
% OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
% MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
% IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
% CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
% TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
% SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% check parameters
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
if nargin<2
    parameter=[];
end
if ~isfield(parameter,'synHop')
    parameter.synHop = 2048;
end
if ~isfield(parameter,'win')
    parameter.win = win(4096,2); % hann window
end
if ~isfield(parameter,'zeroPad')
    parameter.zeroPad = 0;
end
if ~isfield(parameter,'numOfIter')
    parameter.numOfIter = 1;
end
if ~isfield(parameter,'origSigLen')
    parameter.origSigLen = -1; % no trimming
end
if ~isfield(parameter,'restoreEnergy')
    parameter.restoreEnergy = 0;
end
if ~isfield(parameter,'fftShift')
    parameter.fftShift = 0;
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% some pre calculations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
numOfFrames = size(spec,2);
numOfIter = parameter.numOfIter;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% audio calculation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% first iteration
Yi = spec;
yi = LSEE_MSTFT(Yi,parameter);

% remaining iterations
parStft.win = parameter.win;
parStft.numOfFrames = numOfFrames;
parStft.zeroPad = parameter.zeroPad;
parStft.anaHop = parameter.synHop;
for j = 2 : numOfIter
    Yi = abs(spec) .* exp(1j*angle(stft(yi,parStft)));
    yi = LSEE_MSTFT(Yi,parameter);
end

y = yi;

% if the original Length of the signal is known, also remove the zero
% padding at the end
if parameter.origSigLen > 0
    y = y(1:parameter.origSigLen);
end

end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% the Griffin Lim procedure
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function x = LSEE_MSTFT(X,parameter)
% some pre calculations
w = parameter.win;
w = w(:);
zp = parameter.zeroPad;
w = [zeros(floor(zp/2),1);w;zeros(floor(zp/2),1)];
winLen = length(w);
winLenHalf = round(winLen/2);
synHop = parameter.synHop;
numOfFrames = size(X,2);
winPos = (0:numOfFrames-1) * synHop + 1;
signalLength = winPos(end) + winLen - 1;

x = zeros(signalLength,1);  % resynthesized signal
ow = zeros(signalLength,1); % sum of the overlapping windows

for i = 1 : numOfFrames
    currSpec = X(:,i);

    % add the conjugate complex symmetric upper half of the spectrum
    Xi = [currSpec;conj(currSpec(end-1:-1:2))];
    xi = real(ifft(Xi));
    if parameter.fftShift == 1
        xi = fftshift(xi);
    end
    xiw = xi .* w;

    if parameter.restoreEnergy == 1
        xiEnergy = sum(abs(xi));
        xiwEnergy = sum(abs(xiw));
        xiw = xiw * (xiEnergy/(xiwEnergy+eps));
    end

    x(winPos(i):winPos(i)+winLen-1) = ...
        x(winPos(i):winPos(i)+winLen-1) + xiw;

    ow(winPos(i):winPos(i)+winLen-1) = ...
        ow(winPos(i):winPos(i)+winLen-1) + w.^2;
end
ow(ow<10^-3) = 1; % avoid potential division by zero
x = x ./ ow;

% knowing the zeropads that were added in the stft computation, we can
% remove them again now. But since we do not know exactly how many
% zeros were padded at the end of the signal, it is only safe to remove
% winLenHalf zeros.
x = x(winLenHalf+1:end-winLenHalf);

end
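
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Usage sketch (editorial addition, not part of the original toolbox code)
%
% A minimal round trip through stft and istft, kept as comments so that
% this function file stays valid. It assumes the toolbox functions stft and
% win are on the MATLAB path. The test signal x and the window and hop
% sizes are hypothetical; the struct fields are the ones this file itself
% reads or passes to stft.
%
%   x = randn(44100,1);               % hypothetical mono test signal
%
%   parStft.anaHop  = 1024;           % analysis hop size
%   parStft.win     = win(2048,2);    % hann window, as in the default above
%   parStft.zeroPad = 0;
%   spec = stft(x,parStft);
%
%   parIstft.synHop     = 1024;       % same hop size as in the analysis
%   parIstft.win        = parStft.win;
%   parIstft.zeroPad    = parStft.zeroPad;
%   parIstft.numOfIter  = 1;          % values > 1 re-estimate the phase
%   parIstft.origSigLen = length(x);  % trim the output to the input length
%   y = istft(spec,parIstft);
%
% With numOfIter = 1 the function performs a single weighted overlap-add.
% Larger values are mainly of interest when the magnitudes in spec were
% modified and the phases are no longer consistent with them.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%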