
Tencent Hunyuan releases and open-sources its image-to-video model, also launching lip-syncing and motion-driving capabilities

Tencent Hunyuan has released and open-sourced its image-to-video model; enterprises and developers can apply for API access on Tencent Cloud, and users can try the model on the Hunyuan AI Video official website, which supports background sound effects and 2K high-quality video. Users only need to upload an image and describe the desired motion, and Hunyuan generates a 5-second short video. The open-source release includes model weights, inference code, and LoRA training code, and covers a wide range of characters and scenes. The model scales flexibly and supports multi-dimensional control over generated videos.
Tencent Hunyuan has released and open-sourced its image-to-video model, alongside new lip-syncing and motion-driving features, with support for generating background sound effects and 2K high-quality video.
Building on the image-to-video capability, users only need to upload an image and briefly describe how they want the scene to move and how the camera should behave; Hunyuan then animates the image accordingly into a 5-second short video and automatically adds background sound effects. In addition, by uploading a photo of a person together with the text or audio they want the subject to "lip-sync," users can make the person in the image "speak" or "sing"; with the "motion driving" capability, they can also generate a dancing video with one click.
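For developers, the same image-plus-prompt workflow is exposed through the open-source weights. Below is a minimal sketch of what inference could look like with Hugging Face diffusers; the pipeline class, checkpoint id, frame count, and frame rate are assumptions based on diffusers conventions rather than details confirmed by this article, so consult the official repository for the actual entry point.

```python
# Minimal sketch of image-to-video inference, assuming a diffusers-style
# pipeline. The class name, checkpoint id, and numeric defaults below are
# assumptions -- check Tencent's official HunyuanVideo-I2V repo for the
# real entry point and recommended settings.
import torch
from diffusers import HunyuanVideoImageToVideoPipeline
from diffusers.utils import load_image, export_to_video

# Load the open-source weights (checkpoint id is an assumption).
pipe = HunyuanVideoImageToVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo-I2V",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# One input image plus a short motion description, matching the
# upload-and-describe workflow described above.
image = load_image("portrait.png")
prompt = "The woman turns toward the camera and smiles; slow dolly-in shot."

# A 5-second clip at ~25 fps is roughly 129 frames; both values are
# illustrative, not confirmed defaults.
video = pipe(image=image, prompt=prompt, num_frames=129).frames[0]
export_to_video(video, "output.mp4", fps=25)
```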
Currently, users can experience it through the Hunyuan AI video official website, and enterprises and developers can apply for API access on Tencent Cloud.
The open-source image-to-video model continues the open-source work on the Hunyuan text-to-video model and has a total of 13 billion parameters. It handles a wide range of characters and scenes, including realistic footage, anime characters, and even CGI character generation.
The open-source release includes model weights, inference code, and LoRA training code, allowing developers to train their own LoRA and other derivative models on top of Hunyuan. It is available for download in mainstream developer communities such as GitHub and Hugging Face.
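Because the release ships LoRA training code, derivative models typically take the form of LoRA adapters layered on top of the base weights. As a hedged illustration of how such an adapter might be applied at inference time, the snippet below uses diffusers' standard LoRA-loading calls; the adapter repository id and weight filename are hypothetical placeholders, not real artifacts named in this article.

```python
# Sketch of applying a community-trained LoRA on top of the base model.
# The adapter repo id and weight filename are hypothetical placeholders;
# the load_lora_weights / set_adapters API itself is standard diffusers.
import torch
from diffusers import HunyuanVideoImageToVideoPipeline  # assumed class name

pipe = HunyuanVideoImageToVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo-I2V",  # assumed checkpoint id
    torch_dtype=torch.bfloat16,
).to("cuda")

# Attach a derivative LoRA (hypothetical repo and filename).
pipe.load_lora_weights(
    "some-community/hunyuan-style-lora",
    weight_name="style_lora.safetensors",
    adapter_name="style",
)
pipe.set_adapters(["style"], adapter_weights=[0.8])  # blend strength

# From here, inference proceeds exactly as with the base pipeline.
```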
According to the Hunyuan open-source technical report, the Hunyuan video generation model scales flexibly: image-to-video and text-to-video are pre-trained on the same dataset. While maintaining ultra-realistic image quality, smooth rendering of large-scale motion, and native camera cuts, the model captures rich visual and semantic information and combines multiple input conditions, including images, text, audio, and poses, to achieve multi-dimensional control over the generated video.

Since its open-source release, the Hunyuan video generation model has remained highly popular: it topped the Hugging Face trending list last December and currently has over 8.9K stars on GitHub. Community developers have built plugins and derivative models on top of HunyuanVideo, accumulating more than 900 derivative versions. The earlier open-sourced Hunyuan DiT text-to-image model has over 1,600 derivative models at home and abroad.
To date, the Hunyuan open-source model family covers multiple modalities, including text, image, video, and 3D generation, and has received more than 23,000 stars and follows from developers on GitHub.
Risk Warning and Disclaimer
Markets carry risk; invest with caution. This article does not constitute personal investment advice and does not take into account the specific investment objectives, financial situation, or needs of individual users. Users should consider whether any opinions, views, or conclusions in this article fit their particular circumstances. Investment on this basis is at one's own risk.