Merge pull request #91 from skim0119/feature/audio_encoding
Audio encoding (Lyon's Ear Model + BSA)
skim0119 authored Aug 9, 2022
2 parents 2c2dbd9 + 24938c6 commit ad936a5
Showing 15 changed files with 564 additions and 183 deletions.
15 changes: 8 additions & 7 deletions .github/workflows/ci-sphinx.yml
```diff
@@ -23,19 +23,20 @@ jobs:
       # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
       - uses: actions/checkout@v3
       - name: Setup Poetry
-        uses: Gr1N/setup-poetry@v7
-        with:
-          poetry-preview: false
+        uses: snok/install-poetry@v1
       - name: Poetry install docs dependencies
         run: |
           poetry --version
-          poetry config virtualenvs.in-project false
-          poetry config virtualenvs.create false
+          poetry config virtualenvs.in-project true
+          #poetry config virtualenvs.create false
           poetry install -E docs
       - name: Sphinx Build Check
         run: |
+          source .venv/bin/activate
           cd docs
+          python --version
+          which python
+          sphinx-build --version
           make clean
           if ! (make html SPHINXOPTS="-W --keep-going") ; then
             echo Please resolve the warnings/errors
@@ -44,9 +45,9 @@ jobs:
             echo Doc build success
             exit 0
           fi
       - name: Sphinx Link Check
         run: |
+          source .venv/bin/activate
           cd docs
           make clean
           make linkcheck
```
1 change: 1 addition & 0 deletions .gitignore
```diff
@@ -6,6 +6,7 @@ __pycache__/

 # C extensions
 *.so
+*.o

 # Distribution / packaging
 .Python
```
13 changes: 13 additions & 0 deletions docs/api/coding.rst
@@ -0,0 +1,13 @@
```rst
*************
Neural Coding
*************

Temporal Encoding/Decoding
==========================

.. automodule:: miv.coding.temporal

Spatial Encoding/Decoding
=========================

.. automodule:: miv.coding.spatial
```
176 changes: 176 additions & 0 deletions docs/guide/lyon_ear_model.md
@@ -0,0 +1,176 @@
---
jupytext:
text_representation:
extension: .md
format_name: myst
format_version: 0.13
jupytext_version: 1.13.8
kernelspec:
display_name: Python 3 (ipykernel)
language: python
name: python3
---

# Temporal Encoding: Lyon Ear Model

- Schroeder, M. R. (1973). An integrable model for the basilar membrane. The Journal of the Acoustical Society of America, 53(2), 429–434. doi:10.1121/1.1913339
- Zweig, G., Lipes, R., & Pierce, J. R. (1976). The cochlear compromise. Journal of the Acoustical Society of America, 59(4), 975–982. doi:10.1121/1.380956
- Lyon, R. F. (1982). A Computational Model of Filtering, Detection, and Compression in the Cochlea. IEEE Artificial Intelligence, 3–6.
- Lyon, R. F., & Lauritzen, N. (1985). Processing Speech With the Multi-Serial Signal Processor. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 981–984. doi:10.1109/icassp.1985.1168158
- Slaney, M. (1988). Lyon’s Cochlear Model. 1–79.
- Van, L. M. (1992). Pitch and voiced/unvoiced determination with an auditory model. Journal of the Acoustical Society of America, 91(6), 3511–3526. doi:10.1121/1.402840
- Gabbiani, Fabrizio. (1996). Coding of time-varying signals in spike trains of linear and half-wave rectifying neurons. Network: Computation in Neural Systems, 7(1), 61–85. doi:10.1080/0954898x.1996.11978655
- Reich, D. S., Victor, J. D., & Knight, B. W. (1998). The power ratio and the interval map: Spiking models and extracellular recordings. Journal of Neuroscience, 18(23), 10090–10104. doi:10.1523/jneurosci.18-23-10090.1998
- Gabbiani, F., & Metzner, W. (1999). Encoding and processing of sensory information in neuronal spike trains. Journal of Experimental Biology, 202(10), 1267–1279. doi:10.1242/jeb.202.10.1267
- Schrauwen, B., & Van Campenhout, J. (2003). BSA, a Fast and Accurate Spike Train Encoding Scheme. Proceedings of the International Joint Conference on Neural Networks, 4, 2825–2830. doi:10.1109/ijcnn.2003.1224019
- Cosi, P., & Zovato, E. (2012). Lyon's Auditory Model Inversion: A Tool for Sound Separation and Speech Enhancement. 3–6.
- Swanson, B. A. (2015). Pitch Perception with Cochlear Implants. (August 2008).
- Zai, A. T., Bhargava, S., Mesgarani, N., & Liu, S. C. (2015). Reconstruction of audio waveforms from spike trains of artificial cochlea models. Frontiers in Neuroscience, 9(OCT), 1–13. doi:10.3389/fnins.2015.00347
- Jin, Y., & Li, P. (2017). Performance and robustness of bio-inspired digital liquid state machines: A case study of speech recognition. Neurocomputing, 226(September 2015), 145–160. doi:10.1016/j.neucom.2016.11.045
- Huang, N., Slaney, M., & Elhilali, M. (2018). Connecting deep neural networks to physical, perceptual, and electrophysiological auditory signals. Frontiers in Neuroscience, 12(AUG), 1–14. doi:10.3389/fnins.2018.00532
- Petro, B., Kasabov, N., & Kiss, R. M. (2020). Selection and Optimization of Temporal Spike Encoding Methods for Spiking Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 31(2), 358–370. doi:10.1109/TNNLS.2019.2906158

- Xu, Y., Thakur, C. S., Singh, R. K., Hamilton, T. J., Wang, R. M., & van Schaik, A. (2018). A FPGA implementation of the CAR-FAC cochlear model. Frontiers in Neuroscience, 12(APR), 1–14. doi:10.3389/fnins.2018.00198

+++

## Quick Demo

```{code-cell} ipython3
:tags: [hide-cell]
import IPython
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
```

```{code-cell} ipython3
from miv.coding.temporal import LyonEarModel, BensSpikerAlgorithm
```

```{code-cell} ipython3
filepath = "Cochlear-Implant-Processor/Samples/Birds.wav"
IPython.display.Audio(filepath)
```

```{code-cell} ipython3
sampling_rate, waveform = wavfile.read(filepath)
waveform = waveform.astype(np.float_)
print(f"{sampling_rate=}, {waveform.shape=}")
left_wave = np.ascontiguousarray(waveform[:,0])
right_wave = np.ascontiguousarray(waveform[:,1])
```
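The `np.ascontiguousarray` calls above are not cosmetic: a column slice of a stereo buffer is a strided view, and C-backed filter code typically expects contiguous memory. A quick check on synthetic data (not the sample file):

```python
import numpy as np

# A stereo buffer: column slices are strided views, not contiguous memory.
stereo = np.zeros((8, 2))
assert not stereo[:, 0].flags["C_CONTIGUOUS"]

# np.ascontiguousarray copies the channel into contiguous memory,
# which C-backed filters (such as the lyon bindings) typically require.
left = np.ascontiguousarray(stereo[:, 0])
assert left.flags["C_CONTIGUOUS"] and left.shape == (8,)
```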

### Lyon Ear Model

```{code-cell} ipython3
decimation_factor = 64
ear_model = LyonEarModel(sampling_rate=sampling_rate, decimation_factor=decimation_factor)
```

```{code-cell} ipython3
cochlear_left = ear_model(left_wave)
cochlear_right = ear_model(right_wave)
print(cochlear_right.shape)
```

```{code-cell} ipython3
fig = plt.figure(figsize=(16,8))
ax1 = fig.add_subplot(211)
img1 = ax1.imshow(cochlear_left.T)
ax1.set_aspect('auto')
plt.colorbar(img1)
ax2 = fig.add_subplot(212)
img2 = ax2.imshow(cochlear_right.T)
ax2.set_aspect('auto')
plt.colorbar(img2)
```

## BSA

```{code-cell} ipython3
out = np.stack([cochlear_left, cochlear_right])
```

```{code-cell} ipython3
bsa = BensSpikerAlgorithm(out)
spiketrain = bsa.get_spikes()[0]
```
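`BensSpikerAlgorithm` here is miv's implementation; the underlying idea (Schrauwen & Van Campenhout, 2003) can be sketched independently: fire a spike wherever subtracting a FIR kernel from the signal reduces the reconstruction error. The kernel, threshold value, and helper name below are illustrative, not miv's API:

```python
import numpy as np

def bsa_encode(signal: np.ndarray, fir: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    """Ben's Spiker Algorithm, simplified sketch (not the miv implementation)."""
    residual = signal.astype(float).copy()
    spikes = np.zeros(len(signal), dtype=bool)
    for t in range(len(signal) - len(fir) + 1):
        window = residual[t:t + len(fir)]
        err_fire = np.abs(window - fir).sum()  # error if we spike here
        err_skip = np.abs(window).sum()        # error if we stay silent
        if err_fire <= err_skip - threshold:
            spikes[t] = True
            residual[t:t + len(fir)] -= fir    # remove the emitted kernel
    return spikes

# A signal that is exactly one kernel placed at index 5 decodes to one spike.
fir = np.ones(3) / 3
signal = np.zeros(16)
signal[5:8] += fir
print(bsa_encode(signal, fir, threshold=0.4).nonzero()[0])  # → [5]
```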

```{code-cell} ipython3
# Binning
spiketime = [[] for _ in range(spiketrain.shape[1])]
spikestamp = np.where(spiketrain)
for i, j in zip(spikestamp[0], spikestamp[1]):
    spiketime[j].append(i)
```
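The binning cell converts a dense boolean raster into per-channel spike-time lists; on a tiny synthetic raster the transposition is easy to see:

```python
import numpy as np

# Tiny binary spike raster: rows = time bins, columns = channels
spiketrain = np.array([[1, 0],
                       [0, 1],
                       [1, 1]], dtype=bool)
spiketime = [[] for _ in range(spiketrain.shape[1])]
rows, cols = np.where(spiketrain)
for i, j in zip(rows, cols):
    spiketime[j].append(i)
# channel 0 spikes at bins 0 and 2; channel 1 spikes at bins 1 and 2
```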

### Replay

```{code-cell} ipython3
from matplotlib import animation
plt.rcParams["animation.html"] = "jshtml"
plt.rcParams["figure.dpi"] = 150
plt.ioff()
```

```{code-cell} ipython3
fig, ax = plt.subplots()
window = 0.15
tfinal = 10.0
dps = int(spiketrain.shape[0] / tfinal)
fps = 60.0
def animate(frame):
    t = frame / fps
    plt.cla()
    low = t * dps
    high = (t + window) * dps
    plt.eventplot(
        [np.array([x for x in sp if low <= x <= high]) / dps for sp in spiketime]
    )
    plt.xlim(t, t + window)
    plt.xlabel('time (sec)')
    plt.ylabel('channel')
    plt.title(f'{t=:.02f} {frame=}')
    ax.set_aspect('auto')

animation.FuncAnimation(fig, animate, frames=int(tfinal * fps))
```
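The `animate` callback selects, per channel, the spike times falling inside the sliding window `[low, high]`. The same selection on hypothetical data:

```python
# Sliding-window selection used by the replay (hypothetical spike times)
spiketime = [[0, 5, 12, 30], [7, 12, 18]]
low, high = 5, 15
window = [[t for t in ch if low <= t <= high] for ch in spiketime]
# keeps only spikes inside the window, per channel
print(window)  # → [[5, 12], [7, 12]]
```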

## Testing (Sinusoidal)

```{code-cell} ipython3
raise Exception # Do not proceed beyond
```

```{code-cell} ipython3
import lyon.calc
calc = lyon.calc.LyonCalc()  # direct use of the underlying lyon package
out = calc.lyon_passive_ear(left_wave, sampling_rate)
print('Out:')
print(out.shape)
```

```{code-cell} ipython3
data2 = np.sin(np.arange(2041)/20000*2*np.pi*1000)
out = calc.lyon_passive_ear(data2, 20000, 20)
```

```{code-cell} ipython3
plt.imshow(out, aspect='auto')
plt.colorbar()
```

```{code-cell} ipython3
data3 = np.zeros(256); data3[0] = 1
out = calc.lyon_passive_ear(data3, 16000, 1)
out = np.fmin(out, 0.0004)
```

```{code-cell} ipython3
plt.imshow(out, aspect='auto')
plt.colorbar()
```

2 changes: 2 additions & 0 deletions docs/index.rst
```diff
@@ -42,6 +42,7 @@ If you are interested in contributing to this project, we prepared contribution
    guide/burst_analysis
    guide/info_theory
    guide/connectivity_network
+   guide/lyon_ear_model

 .. toctree::
    :maxdepth: 2
@@ -54,6 +55,7 @@ If you are interested in contributing to this project, we prepared contribution
    api/causality
    api/visualization
    api/signal_generator
+   api/coding

 .. toctree::
    :maxdepth: 2
```
Empty file added miv/coding/__init__.py
Empty file.
Empty file added miv/coding/spatial/__init__.py
Empty file.
13 changes: 13 additions & 0 deletions miv/coding/temporal/__init__.py
@@ -0,0 +1,13 @@
```python
__doc__ = """
Auditory Model
==============

References
----------
- https://engineering.purdue.edu/~malcolm/interval/1998-010/
"""

from miv.coding.temporal.lyon import *
from miv.coding.temporal.protocol import *
from miv.coding.temporal.spiker import *
```
80 changes: 80 additions & 0 deletions miv/coding/temporal/lyon.py
@@ -0,0 +1,80 @@
```python
__doc__ = """
Lyon Ear Model

.. autoclass:: LyonEarModel
"""
__all__ = ["LyonEarModel"]

from typing import Optional

import lyon.calc
import numpy as np


class LyonEarModel:  # TemporalEncoderProtocol
    """
    Lyon's ear model. Wrapper for the lyon.LyonCalc module.

    Implementation details can be found `here <https://engineering.purdue.edu/~malcolm/interval/1998-010/>`_.
    The above MATLAB/C implementation is ported to Python `here <https://github.com/sciforce/lyon>`_.
    """

    def __init__(
        self,
        sampling_rate: float = 16000,
        decimation_factor: int = 1,
        ear_quality: int = 8,
        step_factor: Optional[float] = None,
        differ: bool = True,
        agc: bool = True,
        tau_factor: int = 3,
    ):
        """
        Auditory nerve response, based on Lyon's model.

        Parameters
        ----------
        sampling_rate : float
            Waveform sample rate. (default=16000)
        decimation_factor : int
            How much to decimate the model output. (default=1)
        ear_quality : int
            Smaller values mean broader filters. (default=8)
        step_factor : Optional[float]
            Filter stepping factor. If not given, defaults to 25% overlap
            (ear_quality / 32).
        differ : bool
            Channel difference; improves the model's frequency response. (default=True)
        agc : bool
            Whether to use AGC for neural model adaptation. (default=True)
        tau_factor : int
            Reduces antialiasing in filter decimation. (default=3)
        """
        self.lyon = lyon.calc.LyonCalc()
        self.parameters = {
            "sample_rate": sampling_rate,
            "decimation_factor": decimation_factor,
            "ear_q": ear_quality,
            "step_factor": step_factor,
            "differ": differ,
            "agc": agc,
            "tau_factor": tau_factor,
        }

    def __call__(self, signal: np.ndarray) -> np.ndarray:
        """
        Parameters
        ----------
        signal : np.ndarray
            Mono input waveform.

        Returns
        -------
        output : np.ndarray
            Array of shape (N / decimation_factor, channels).
        """
        return self.lyon.lyon_passive_ear(signal, **self.parameters)
```
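The `decimation_factor` trades temporal resolution for output size: each output row summarizes `decimation_factor` input samples, so the output length is `N / decimation_factor`. A standalone numpy sketch of the idea (not the lyon implementation, which decimates inside its filter cascade):

```python
import numpy as np

def decimate_mean(x: np.ndarray, factor: int) -> np.ndarray:
    """Block-average decimation: keep one output sample per `factor` inputs."""
    n = (len(x) // factor) * factor          # trim the ragged tail, if any
    return x[:n].reshape(-1, factor).mean(axis=1)

signal = np.arange(128, dtype=float)
out = decimate_mean(signal, 64)
print(out.shape)  # → (2,), i.e. (N / decimation_factor,)
```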
25 changes: 25 additions & 0 deletions miv/coding/temporal/protocol.py
@@ -0,0 +1,25 @@
```python
__all__ = ["TemporalEncoderProtocol", "SpikerAlgorithmProtocol"]

from typing import Optional, Protocol, Tuple

import numpy as np

from miv.typing import SignalType, SpikestampsType, TimestampsType


class TemporalEncoderProtocol(Protocol):
    def __call__(self, signal: np.ndarray) -> SignalType:
        ...


class SpikerAlgorithmProtocol(Protocol):
    def __init__(self):
        ...

    def __call__(
        self, signal: np.ndarray, sample_rate: float
    ) -> Tuple[SpikestampsType, TimestampsType]:
        ...

    def save(self) -> None:
        ...
```
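Because these are `typing.Protocol` classes, conformance is structural: any class with a matching `__call__` satisfies the protocol without inheriting from it. A minimal sketch, with the protocol simplified to plain `np.ndarray` (instead of miv's `SignalType`) so it runs standalone:

```python
from typing import Protocol, runtime_checkable

import numpy as np

@runtime_checkable
class TemporalEncoderProtocol(Protocol):
    def __call__(self, signal: np.ndarray) -> np.ndarray: ...

class IdentityEncoder:
    """Trivial encoder: passes the signal through unchanged."""
    def __call__(self, signal: np.ndarray) -> np.ndarray:
        return signal

enc = IdentityEncoder()
assert isinstance(enc, TemporalEncoderProtocol)  # structural match, no inheritance
```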