Gfycat wants to fix your low-fidelity GIFs with machine learning
We all love to share GIFs, and there are plenty of ways to do that, through online portals or keyboards. But because there is so much content, you’ll often end up surfacing a lower-fidelity GIF.
There can be plenty of copies of the same video clip as a GIF, or maybe the clip is just difficult to capture and upload, but Gfycat hopes the problem can be solved at a technical level. Gfycat is now making a big push on the technical front to make those GIFs look better and easier to discover as creators continue to upload content, regardless of its quality or fidelity. And it’s more of a video problem than an image recognition problem, CEO Richard Rabbat said.
“We have scaled [through] creators through word of mouth, and they are just getting excited about Gfycat and [creating] content,” Rabbat said. “In many cases, what we’re building from an AI and machine learning perspective are additional tools to support their excitement. We want to enable them to drive more virality for their content, and in this case, make their content even more easily discoverable. That’s something that’s very important to us as we keep focusing on the creators.”
Rabbat said Gfycat will scour the web for the original version of the video the GIF came from (in some cases it comes from YouTube) and analyze that video to figure out which part of it the GIF was taken from. The company then produces a higher-quality GIF and swaps it in, so the version that spreads more widely is the higher-quality one. The company creates a kind of model for each frame in the GIF and then tries to match that up with the higher-quality videos, he said.
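To make the frame-matching idea concrete, here is a minimal sketch of one way a GIF could be located inside a longer source video. It is not Gfycat’s method: Rabbat did not say what the per-frame "model" is, so this sketch substitutes a simple average hash as the per-frame signature. The names `average_hash`, `hamming` and `locate_clip` are hypothetical, and the sketch assumes the GIF and the video have already been decoded to small grayscale frames of the same size and frame rate.

```python
from typing import List, Sequence, Tuple

# Assumption: a frame is a small grayscale image, given as rows of 0-255 values.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> int:
    """Build a bit signature: one bit per pixel, set when the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two signatures."""
    return bin(a ^ b).count("1")


def locate_clip(gif_frames: List[Frame], video_frames: List[Frame]) -> Tuple[int, int]:
    """Slide the GIF over the video and return (best_start_index, total_distance)."""
    gif_sigs = [average_hash(f) for f in gif_frames]
    video_sigs = [average_hash(f) for f in video_frames]
    if not gif_sigs or len(gif_sigs) > len(video_sigs):
        raise ValueError("GIF must be non-empty and no longer than the video")

    best_start, best_cost = 0, None
    for start in range(len(video_sigs) - len(gif_sigs) + 1):
        # Sum the per-frame distances for this alignment.
        cost = sum(hamming(g, video_sigs[start + i]) for i, g in enumerate(gif_sigs))
        if best_cost is None or cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost


# Toy 2x2 frames: the "video" moves a bright pixel around; the "GIF" is a
# noisy, lower-contrast copy of video frames 2 and 3.
video = [
    [[0, 0], [0, 0]],
    [[255, 0], [0, 0]],
    [[0, 255], [0, 0]],
    [[0, 0], [255, 0]],
    [[0, 0], [0, 255]],
]
gif = [
    [[10, 200], [20, 5]],
    [[5, 15], [220, 10]],
]
print(locate_clip(gif, video))  # (2, 0)
```

Once the start index is known, a higher-quality clip of the same length could be cut from the source video at that position. A production system would also have to cope with cropping, frame-rate differences and overlaid captions, which this sketch ignores.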
“What we noticed was a number of users that were uploading GIFs were incredibly popular, but when they uploaded most of the time they were really low quality,” Rabbat said. “We’ve been looking at AI and machine learning for a while now, as it relates [to] our initiative to beautify the web when it comes to GIFs.”
Beyond that, a creator who uploads a GIF that includes a celebrity might not tag it with that celebrity’s name. So the company has done some internal analysis to identify which celebrity is in the GIF and tag it automatically. The hope is that while the company already has a library of popular celebrities, it’ll be able to identify up-and-coming celebrities with these tools and automatically start tagging them as they come in.
Rabbat said Gfycat built both of these tools internally because the off-the-shelf products that were available didn’t work well with GIFs. Though GIFs are, of course, a series of images, he said a lot of different elements (like multiple celebrities) will often appear in sequence, while standard image recognition technology might only identify one or two of them. Gfycat’s technology is instead built around video, he said.
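The difference between tagging a single image and tagging a sequence can be sketched in a few lines. This is an illustration only, not Gfycat’s pipeline: it assumes some per-frame recognizer has already produced a set of labels for each frame, and the function `tags_for_clip`, its `min_frames` threshold and the placeholder names are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def tags_for_clip(per_frame_labels: Iterable[Set[str]], min_frames: int = 2) -> List[str]:
    """Return every label seen in at least `min_frames` frames, most frequent first."""
    counts: Counter = Counter()
    for labels in per_frame_labels:
        # Each frame contributes a set, so a label counts at most once per frame.
        counts.update(labels)
    kept = [(name, n) for name, n in counts.items() if n >= min_frames]
    # Sort by frequency (descending), then by name for a stable order.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in kept]


# Hypothetical recognizer output for a six-frame clip in which two people
# appear one after the other, plus one spurious single-frame detection.
frames = [
    {"person_a"},
    {"person_a"},
    {"person_a", "person_b"},
    {"person_b"},
    {"person_b"},
    {"person_c"},
]
print(tags_for_clip(frames))  # ['person_a', 'person_b']
```

Classifying only the first frame would tag the clip with `person_a` alone. Counting across frames picks up `person_b` as well, and the threshold drops the one-frame `person_c` detection as probable noise.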
“One of the big challenges is the raw amount of information a GIF includes,” Rabbat said. “It’s hundreds of frames, sometimes more. We need to identify at a very high rate these different celebrities that are being created. We wanted to do it in real time. We were able to do it within a minute of people creating content, we were able to identify the celebrity.”
Finally, Gfycat wants to identify the text in captions on GIFs as they come in. Again, part of the challenge is that a GIF might come in with a caption whose text is grainy and hard to read. Gfycat sought to build internal tools that work out what the captions say and then make the GIFs more discoverable based on those captions.
While Gfycat is definitely not alone in trying to make short-form video content like GIFs more easily discoverable (companies like Tenor and Giphy are looking to create robust platforms as well), it’s attempting to treat the problem with technical tools. And with more than 130 million monthly active users (Giphy, in comparison, has 300 million daily active users), it’s going to become a technical problem, as this kind of content can’t be curated at scale.