Integrate Hugging Face and FFmpeg to automate workflows with a scalable backend
Connect Hugging Face and FFmpeg nodes in your workflow. Integrate with any tool or database and ship powerful backend logic and APIs instantly. No code required!
Getting Started
How To Connect Hugging Face and FFmpeg
Popular Templates With Hugging Face and FFmpeg
Explore our popular Hugging Face and FFmpeg templates below. Click. Remix. Ship!
🎞️
FFmpeg Compress Video
Compress any mp4 video of your choice using FFmpeg.
🎞️
FFmpeg Convert Video
Convert any mp4 video of your choice to mov using FFmpeg.
🗣️
Audio Translator
Given speech audio, transcribe it, translate it into the target language, and finally convert it back to speech using Google's translation and text-to-speech APIs.
Google Translate
👁️
Google Vision - Text Detection
Given an image, find the text in it using Google Vision.
Google Vision
Node stack
Supported Triggers & Actions
Hugging Face NODES
Caption Image
Generate a caption for an image using Hugging Face's [Salesforce/blip-image-captioning-large](https://huggingface.co/Salesforce/blip-image-captioning-large) model, an image-captioning model pretrained on the COCO dataset (base architecture with a ViT large backbone).
Image Classification
Get classification labels for your image using Hugging Face's [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224) model, a transformer encoder (BERT-like) pretrained in a supervised fashion on ImageNet-21k at a resolution of 224x224 pixels. The model was then fine-tuned on ImageNet (also referred to as ILSVRC2012), a dataset of 1 million images and 1,000 classes, at the same 224x224 resolution.
Text Summarization
Summarize long text using Hugging Face's [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) model, a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function and (2) learning a model to reconstruct the original text.
Text-To-Image
Generate an image from text using Hugging Face's [openskyml/dalle-3-xl](https://huggingface.co/openskyml/dalle-3-xl) model, a test model very similar to DALL·E 3.
Text-To-Music
Generate music from text using Hugging Face's [facebook/musicgen-small](https://huggingface.co/facebook/musicgen-small) model capable of generating high-quality music samples conditioned on text descriptions or audio prompts.
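For readers curious what a node like Text Summarization might do behind the scenes, here is a minimal sketch that builds (but does not send) an HTTP request for a hosted model. It is not BuildShip's implementation. The endpoint pattern in `ASSUMED_API_BASE`, the helper name `build_inference_request`, and the placeholder token are all assumptions for illustration; check the current Hugging Face documentation for the exact URL and payload your model expects.

```python
import json
import urllib.request

# Assumed endpoint pattern for Hugging Face's hosted inference service.
# Verify it against the current Hugging Face documentation before use.
ASSUMED_API_BASE = "https://api-inference.huggingface.co/models"


def build_inference_request(model_id: str, text: str, token: str) -> urllib.request.Request:
    """Build a POST request that carries a text input for a hosted model."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ASSUMED_API_BASE}/{model_id}",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "hf_xxx" is a placeholder, not a real access token.
request = build_inference_request("facebook/bart-large-cnn", "Long article text...", "hf_xxx")
# To actually send it: urllib.request.urlopen(request).read()
```

Building the request separately from sending it keeps the sketch easy to inspect without network access.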
FFmpeg NODES
Audio Overlay
Overlay audio on top of an existing video using FFmpeg.
script
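As a rough illustration of an audio overlay, the sketch below assembles an FFmpeg command that keeps the video stream from the first input and takes the audio from the second. This assumes "overlay" means replacing the soundtrack; the node may mix tracks instead. The function name `overlay_audio_args` and the file names are hypothetical.

```python
def overlay_audio_args(video: str, audio: str, output: str) -> list[str]:
    """Return an FFmpeg argument list that pairs a video with a new audio track."""
    return [
        "ffmpeg", "-y",      # -y overwrites the output file if it exists
        "-i", video,         # input 0: the existing video
        "-i", audio,         # input 1: the audio to lay over it
        "-map", "0:v:0",     # take the first video stream from input 0
        "-map", "1:a:0",     # take the first audio stream from input 1
        "-c:v", "copy",      # copy video without re-encoding
        "-c:a", "aac",       # encode audio as AAC
        "-shortest",         # stop at the end of the shorter stream
        output,
    ]


overlay_args = overlay_audio_args("clip.mp4", "voiceover.mp3", "clip_with_audio.mp4")
# To run it: subprocess.run(overlay_args, check=True)  (requires FFmpeg on PATH)
```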
Combine Audio
Combine multiple audio files into one.
script
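One way to join audio files end to end is FFmpeg's `concat` filter. The sketch below only builds the command; `concat_audio_args` and the file names are made up for illustration, and the node itself may work differently.

```python
def concat_audio_args(inputs: list[str], output: str) -> list[str]:
    """Return an FFmpeg argument list that joins audio files back to back."""
    if len(inputs) < 2:
        raise ValueError("need at least two audio files to combine")
    args = ["ffmpeg", "-y"]
    for path in inputs:
        args += ["-i", path]
    # One [i:a] label per input, then concat with no video and one audio output.
    labels = "".join(f"[{i}:a]" for i in range(len(inputs)))
    graph = f"{labels}concat=n={len(inputs)}:v=0:a=1[out]"
    args += ["-filter_complex", graph, "-map", "[out]", output]
    return args


audio_args = concat_audio_args(["intro.mp3", "body.mp3", "outro.mp3"], "episode.mp3")
```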
Combine Videos
Combine multiple videos into one, ensuring they all share a common resolution.
script
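The "common resolution" requirement can be sketched by scaling every input to the same size before concatenating. The example below is an assumption about how such a step could look, not the node's actual script. It assumes every input has an audio stream, and note that a plain `scale` stretches clips whose aspect ratio differs from the target. The function name and the default 1280x720 target are hypothetical.

```python
def combine_videos_args(
    inputs: list[str], output: str, width: int = 1280, height: int = 720
) -> list[str]:
    """Return an FFmpeg argument list that scales and concatenates videos."""
    if len(inputs) < 2:
        raise ValueError("need at least two videos to combine")
    args = ["ffmpeg", "-y"]
    for path in inputs:
        args += ["-i", path]
    count = len(inputs)
    # Scale each video stream to the shared resolution and reset the pixel aspect ratio.
    scaled = ";".join(
        f"[{i}:v]scale={width}:{height},setsar=1[v{i}]" for i in range(count)
    )
    # The concat filter expects streams interleaved per segment: video, audio, video, audio...
    pairs = "".join(f"[v{i}][{i}:a]" for i in range(count))
    graph = f"{scaled};{pairs}concat=n={count}:v=1:a=1[outv][outa]"
    args += ["-filter_complex", graph, "-map", "[outv]", "-map", "[outa]", output]
    return args


video_args = combine_videos_args(["a.mp4", "b.mp4"], "joined.mp4", width=640, height=360)
```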
Compress Audio
Compress an audio file (.mp3, .m4a) using FFmpeg.
script
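Audio compression usually comes down to re-encoding at a lower bitrate. This sketch picks an encoder from the output extension; the mapping, the 96k default, and the name `compress_audio_args` are illustrative assumptions, and `libmp3lame` is only available if your FFmpeg build includes it.

```python
from pathlib import Path

# Encoder per output extension (an assumption for this sketch).
AUDIO_ENCODERS = {".mp3": "libmp3lame", ".m4a": "aac"}


def compress_audio_args(src: str, dst: str, bitrate: str = "96k") -> list[str]:
    """Return an FFmpeg argument list that re-encodes audio at a lower bitrate."""
    suffix = Path(dst).suffix.lower()
    if suffix not in AUDIO_ENCODERS:
        raise ValueError(f"unsupported output format: {suffix!r}")
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-c:a", AUDIO_ENCODERS[suffix],  # encoder chosen from the output extension
        "-b:a", bitrate,                 # target audio bitrate
        dst,
    ]


mp3_args = compress_audio_args("talk.mp3", "talk_small.mp3")
```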
Compress Video
Compress a video file using FFmpeg.
script
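A common way to shrink a video with FFmpeg is H.264 encoding with a Constant Rate Factor (CRF), where a higher CRF gives a smaller file at lower quality. The sketch below assumes that approach; the node's real settings are not documented here, and `compress_video_args` and its defaults are hypothetical.

```python
def compress_video_args(
    src: str, dst: str, crf: int = 28, preset: str = "medium"
) -> list[str]:
    """Return an FFmpeg argument list that re-encodes video with libx264 and CRF."""
    if not 0 <= crf <= 51:
        raise ValueError("crf must be between 0 and 51 for 8-bit libx264")
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-c:v", "libx264",   # H.264 video encoder
        "-crf", str(crf),    # quality target: higher means more compression
        "-preset", preset,   # speed versus compression-efficiency trade-off
        "-c:a", "aac",
        "-b:a", "128k",
        dst,
    ]


compress_args = compress_video_args("raw.mp4", "small.mp4", crf=30)
```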
Convert Video Format
Convert the specified video from one format to another. NOTE: This process can be compute-intensive depending on the size of the video.
script
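Why conversion can be compute-intensive: re-encoding decodes and recompresses every frame, while a stream copy only rewrites the container. The sketch below contrasts the two. It is an illustration under assumptions, not the node's script; `convert_video_args` is a hypothetical name, and a stream copy only works when the target container supports the source codecs.

```python
def convert_video_args(src: str, dst: str, reencode: bool = True) -> list[str]:
    """Return an FFmpeg argument list that converts a video to another container."""
    args = ["ffmpeg", "-y", "-i", src]
    if reencode:
        # Full re-encode: slower, but works across incompatible codecs.
        args += ["-c:v", "libx264", "-c:a", "aac"]
    else:
        # Stream copy: fast, only changes the container around the same streams.
        args += ["-c", "copy"]
    return args + [dst]


reencode_args = convert_video_args("clip.mp4", "clip.mov")
remux_args = convert_video_args("clip.mp4", "clip.mov", reencode=False)
```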
Blog posts & Tutorials
Recommended Reads
Below are recommended blog posts that will help you on your journey.
Support
Need Help?
Here are some helpful resources to get you "unstuck".
💬
Join BuildShip Community ->
A large, active community of no-code / low-code builders. Ask questions, share feedback, showcase your project, and connect with other BuildShip enthusiasts.
🙋
Hire a BuildShip Expert ->
Need personalized help to build your product fast? Browse and hire from a range of independent freelancers, agencies, and builders, all well versed in BuildShip.
🛟
Send a Support Request ->
Got a specific question about your workflow or project, or want to report a bug? Send us a request using the "Support" button directly from your BuildShip Dashboard.
⭐️
Feature Request ->
Something missing in BuildShip for you? Share on the #FeatureRequest channel on Discord. Also browse and cast your votes on other feature requests.