Meet ‘EDGE’: A diffusion-based AI model that produces realistic, long-form dance sequences conditioned on music

Many cultures place a high value on dance as a means of expression, communication, and socialization. However, producing new dances or dance animations is challenging because dance movements are expressive and free-form while still being carefully choreographed to the music. In practice, this calls for either time-consuming manual animation or costly motion capture. The overhead of the creation process can be reduced by using computational methods to generate dances automatically. This has a wide range of applications, including helping animators create new choreography and rendering interactive characters in video games or virtual reality with realistic, varied movements driven by user-supplied music. In addition, computational dance generation can shed light on how music and movement interact, a sought-after area of study in neuroscience.

Previous research has made huge strides by applying machine learning-based techniques, but it has yet to achieve great success at producing dances from music that also adhere to a user’s requirements. Furthermore, earlier works often rely on quantitative criteria that prove unreliable, and evaluating generated dances remains a difficult and subjective process. This research presents Editable Dance Generation (EDGE), an advanced dance generation method that produces physically plausible, realistic dance movements from music input. In their approach, a powerful music feature extractor called Jukebox is used in conjunction with a transformer-based diffusion model.
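To make the idea of a transformer-based diffusion model conditioned on music more concrete, here is a minimal, illustrative PyTorch sketch. It is not the authors’ implementation: the `MotionDenoiser` module, the tensor shapes, the additive conditioning, and the DDPM-style noise-prediction objective are simplifying assumptions chosen for brevity, and the small `inpaint_step` helper only hints at how masking can support editing operations such as in-betweening.

```python
# Illustrative sketch only: a toy, DDPM-style diffusion step for motion
# sequences conditioned on precomputed music features. Shapes, module names,
# and the training objective are simplifying assumptions, not the EDGE code.
import torch
import torch.nn as nn

class MotionDenoiser(nn.Module):
    """Toy transformer that predicts the noise in a noisy motion sequence,
    conditioned on per-frame music features and a diffusion timestep."""
    def __init__(self, motion_dim=147, music_dim=64, d_model=256):
        super().__init__()
        self.motion_in = nn.Linear(motion_dim, d_model)
        self.music_in = nn.Linear(music_dim, d_model)
        self.t_embed = nn.Embedding(1000, d_model)  # diffusion step embedding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.out = nn.Linear(d_model, motion_dim)

    def forward(self, noisy_motion, music, t):
        # Fuse noisy motion, music conditioning, and timestep by simple
        # addition (a deliberately basic conditioning scheme for this sketch).
        h = self.motion_in(noisy_motion) + self.music_in(music)
        h = h + self.t_embed(t).unsqueeze(1)
        return self.out(self.encoder(h))

# Standard DDPM bookkeeping with a linear noise schedule.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

def training_step(model, motion, music):
    """One denoising step: noise the clean motion at a random timestep,
    then ask the model to predict the added noise."""
    b = motion.shape[0]
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(motion)
    a = alpha_bars[t].view(b, 1, 1)
    noisy = a.sqrt() * motion + (1 - a).sqrt() * noise
    pred = model(noisy, music, t)
    return ((pred - noise) ** 2).mean()

def inpaint_step(x_t, known_motion, known_mask, t, noise):
    """Editing/in-betweening idea: at each sampling step, overwrite the
    frames or joints the user has fixed with a correspondingly-noised copy
    of the known motion, so generation only fills in unconstrained parts."""
    a = alpha_bars[t].view(-1, 1, 1)
    noised_known = a.sqrt() * known_motion + (1 - a).sqrt() * noise
    return known_mask * noised_known + (1 - known_mask) * x_t

if __name__ == "__main__":
    model = MotionDenoiser()
    motion = torch.randn(2, 150, 147)  # (batch, frames, pose features)
    music = torch.randn(2, 150, 64)    # (batch, frames, music features)
    loss = training_step(model, motion, music)
    loss.backward()
    print("toy diffusion loss:", loss.item())
```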

EDGE creates physically plausible diverse dance choreographies based on musical compositions

Its diffusion-based methodology gives dance generation powerful editing capabilities, such as joint-wise conditioning. The authors also propose a new metric that captures the physical plausibility of ground-contact behavior without explicit physical modeling, and they demonstrate the benefits directly conferred by their modeling choices. In summary, their contributions are the following:

1. They present a diffusion-based dance generation method that produces arbitrary-length dance sequences while combining state-of-the-art performance with powerful editing tools (the masking idea behind such editing is hinted at in the sketch above).

2. They examine the metrics used in previous studies and show, through a significant user study, that they are inaccurate representations of human-assessed quality.

3. They introduce the Physical Foot Contact score, a new, simple acceleration-based quantitative metric for capturing the physical plausibility of generated kinematic motions that does not require explicit physical modeling. Using a novel Contact Consistency Loss, they also propose a method that eliminates physically implausible foot sliding in generated motions (a simplified sketch of the foot-contact idea appears after this list).

4. Using audio representations of music from Jukebox, a pretrained generative music model that has previously shown strong performance on music prediction tasks, they improve on hand-crafted audio feature extraction methodologies (the contrast between the two kinds of features is sketched below).
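To give a flavor of how ground-contact plausibility can be scored without explicit physical simulation, here is a small NumPy sketch. It is only loosely inspired by the high-level intuition mentioned above (center-of-mass acceleration should coincide with static foot contact) and is not the paper’s actual Physical Foot Contact metric: the joint inputs, the velocity threshold, and the scoring rule are assumptions made for illustration.

```python
# Illustrative sketch only: a simplified foot-contact plausibility score.
# Intuition: a character can only accelerate its center of mass while at
# least one foot is in (approximately) static contact with the ground.
# Thresholds and the scoring rule are assumptions, not the paper's metric.
import numpy as np

def foot_contact_plausibility(com_pos, left_foot_pos, right_foot_pos,
                              fps=30, vel_eps=0.15):
    """com_pos and *_foot_pos: arrays of shape (frames, 3) in meters.
    Returns a non-negative penalty; lower means more physically plausible."""
    dt = 1.0 / fps
    # Center-of-mass acceleration via finite differences.
    com_acc = np.diff(com_pos, n=2, axis=0) / dt**2            # (frames-2, 3)
    # Foot speeds; a foot moving slower than vel_eps is treated as planted.
    lf_speed = np.linalg.norm(np.diff(left_foot_pos, axis=0), axis=1) / dt
    rf_speed = np.linalg.norm(np.diff(right_foot_pos, axis=0), axis=1) / dt
    lf_speed, rf_speed = lf_speed[:-1], rf_speed[:-1]           # align with acc
    no_contact = (lf_speed > vel_eps) & (rf_speed > vel_eps)
    # Penalize COM acceleration that occurs while neither foot is planted.
    acc_mag = np.linalg.norm(com_acc, axis=1)
    return float(np.mean(acc_mag * no_contact))

if __name__ == "__main__":
    frames = 120
    t = np.linspace(0, 4, frames)
    com = np.stack([t, np.zeros(frames), 0.9 + 0.05 * np.sin(4 * t)], axis=1)
    left = np.stack([t, np.full(frames, 0.1), 0.1 * np.abs(np.sin(3 * t))], axis=1)
    right = np.stack([t, np.full(frames, -0.1), 0.1 * np.abs(np.cos(3 * t))], axis=1)
    print("plausibility penalty:", foot_contact_plausibility(com, left, right))
```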
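For context on what hand-crafted audio features typically look like in this line of work, the sketch below computes a common set of them with librosa and contrasts that with simply loading precomputed learned embeddings. The synthetic test signal, the hypothetical `embedding_path` file, and the particular feature choices are assumptions for illustration; the actual Jukebox feature extraction pipeline used by EDGE is not shown.

```python
# Illustrative sketch only: hand-crafted audio features (the kind of
# representation many earlier music-to-dance systems rely on) versus
# loading precomputed learned embeddings from a large pretrained music
# model. The test signal and file path are assumptions for illustration.
import numpy as np
import librosa

def handcrafted_features(y, sr, hop_length=512):
    """MFCC + chroma + onset strength + a beat indicator, stacked per frame."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20, hop_length=hop_length)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr, hop_length=hop_length)
    onset = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop_length)
    _, beat_frames = librosa.beat.beat_track(y=y, sr=sr, hop_length=hop_length)
    beats = np.zeros(onset.shape[0])
    beats[beat_frames] = 1.0
    n = min(mfcc.shape[1], chroma.shape[1], onset.shape[0], beats.shape[0])
    # (frames, 20 + 12 + 1 + 1) feature matrix aligned on the hop grid.
    return np.concatenate(
        [mfcc[:, :n].T, chroma[:, :n].T, onset[:n, None], beats[:n, None]],
        axis=1)

def learned_features(embedding_path):
    """In contrast, a learned representation is just loaded (or extracted
    from a frozen pretrained model) and used directly as conditioning.
    'embedding_path' is a hypothetical precomputed .npy file."""
    return np.load(embedding_path)  # e.g. shape (frames, embedding_dim)

if __name__ == "__main__":
    sr = 22050
    t = np.linspace(0, 5.0, int(sr * 5.0), endpoint=False)
    y = 0.5 * np.sin(2 * np.pi * 220 * t)  # synthetic stand-in for a song clip
    feats = handcrafted_features(y, sr)
    print("hand-crafted feature matrix:", feats.shape)
```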

You can check out their website, which also has great video demos. It’s something you won’t see every day.


Check out the paper and the project. All credit for this research goes to the researchers on this project. Also, don’t forget to join our Reddit page and Discord channel, where we share the latest AI research news, cool AI projects, and more.


Anish Teeku is a Consultant Trainee at MarktechPost. He is currently pursuing his undergraduate studies in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is in image processing, and he is passionate about building solutions around it. He likes connecting with people and collaborating on interesting projects.


