

Also, fuzzing is becoming quite popular. It’s a technique that automatically detects vulnerabilities on a binary. Though, it is computationally intensive, so I would love to the emergence of a peer-to-peer project that allows anyone to contribute by testing open-source software.
So you are basically building a classifier that tries to assert if a user will like a video. While many are against any kind of “algorithm” within the fediverse, I believe that it’s a necessity. But, I think allowing users to tag content and then building classifiers that allow you to filter based on that would be a more aligned with the fediverse.
Anyway, cosine similarity has worked for a lot of things, so I think it’s a solid foundation to get you started. Another thing you can try is using an embedding model, specifically a model that receives a segment of a video and yields a matrix with the property that similar input will result in outputs relatively close to each other (cosine or euclidean distance).
Another thing to consider is building a platform that will permanently store data. If you can come up with a set of endpoints, I can implement something in python to get ypu started. I don’t have experience with video processing so I cannot help you with that, but the crud aspect is no biggie.