Piano-SSM: Diagonal State Space Models for Efficient MIDI-to-Raw Audio Synthesis

Authors: Dominik Dallinger, Matthias Bittner, Daniel Schnöll, Matthias Wess and Axel Jantsch

Christian Doppler Laboratory for Embedded Machine Learning

  • The original MAESTRO audio is taken from the MAESTRO Dataset V3.0.0. The MAESTRO dataset is made available by Google LLC under a Creative Commons Attribution Non-Commercial Share-Alike 4.0 (CC BY-NC-SA 4.0) license. Please cite the following paper if you use the MAESTRO dataset:

    Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, and Douglas Eck. "Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset." In International Conference on Learning Representations, 2019.

  • The original MAPS audio is taken from the MAPS Dataset. The MAPS dataset is made available by ParisTech under a Creative Commons Attribution Non-Commercial Share-Alike 2.0 (CC BY-NC-SA 2.0) license. Please cite the following paper if you use the MAPS dataset:

    Valentin Emiya, Nancy Bertin, Bertrand David, and Roland Badeau. "MAPS - A piano database for multipitch estimation and automatic transcription of music."

    Evaluation Set

    MAESTRO 2008 Domenico Scarlatti Sonata in D Minor, K. 9 L. 413

    Model / Train Sampling Rate | Synthesis Sampling Rate 44.1kHz | Synthesis Sampling Rate 24kHz | Synthesis Sampling Rate 16kHz
    Original Audio
    Piano-SSM XL 44.1kHz
    Piano-SSM XL 24kHz
    Piano-SSM XL 16kHz
    Piano-SSM L 44.1kHz
    Piano-SSM L 24kHz
    Piano-SSM L 16kHz
    Piano-SSM S 44.1kHz
    Piano-SSM S 24kHz
    Piano-SSM S 16kHz
    DDSP-Piano v1 16kHz
    DDSP-Piano v2 24kHz

    MAPS Dataset - MUS-Ambient - MAPS_MUS-schub_d760_3_ENSTDkAm 24 kHz

    Original Audio | Piano-SSM XL trained on MAPS Ambient | Piano-SSM XL trained on MAPS Close | Piano-SSM XL trained on MAPS Close & Ambient | Piano-SSM XL trained on MAESTRO

    MAESTRO 2013 Franz Schubert Moment Musical Op. 94 No. 4 in C-sharp Minor, D780

    Model / Train Sampling Rate | Synthesis Sampling Rate 44.1kHz | Synthesis Sampling Rate 24kHz | Synthesis Sampling Rate 16kHz
    Original Audio
    Piano-SSM XL 44.1kHz
    Piano-SSM XL 24kHz
    Piano-SSM XL 16kHz
    Piano-SSM L 44.1kHz
    Piano-SSM L 24kHz
    Piano-SSM L 16kHz
    Piano-SSM S 44.1kHz
    Piano-SSM S 24kHz
    Piano-SSM S 16kHz
    DDSP-Piano v1 16kHz
    DDSP-Piano v2 24kHz

    Training Set

    MAESTRO 2009 Wolfgang Amadeus Mozart Sonata in D Major, K.311

    Model / Train Sampling Rate | Synthesis Sampling Rate 44.1kHz | Synthesis Sampling Rate 24kHz | Synthesis Sampling Rate 16kHz
    Original Audio
    Piano-SSM XL 44.1kHz
    Piano-SSM XL 24kHz
    Piano-SSM XL 16kHz
    Piano-SSM L 44.1kHz
    Piano-SSM L 24kHz
    Piano-SSM L 16kHz
    Piano-SSM S 44.1kHz
    Piano-SSM S 24kHz
    Piano-SSM S 16kHz
    DDSP-Piano v1 16kHz
    DDSP-Piano v2 24kHz

    MAPS ISOL, RAND and UCHO Set

    Just a few examples from the training set of the MAPS dataset.
    ISOL | RAND | UCHO

    Acknowledgments