{ "cells": [ { "cell_type": "markdown", "id": "3725096f", "metadata": {}, "source": [ "# Compare Perceptual Similarity between two Videos with Feluda\n", "\n", "This notebook demonstrates how to use the `VideoHash` operator to generate\n", "perceptual hashes for two videos and compare their similarity. It processes\n", "two sample videos and shows their hash values for similarity analysis." ] }, { "cell_type": "markdown", "id": "c32d1dd4", "metadata": {}, "source": [ "[![GitHub](https://img.shields.io/badge/GitHub-View%20Source-blue?logo=github)](https://github.com/tattle-made/feluda/blob/main/docs/examples/compare_video_hashes.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/tattle-made/feluda/blob/main/docs/examples/compare_video_hashes.ipynb)" ] }, { "cell_type": "markdown", "id": "b63d79e1", "metadata": {}, "source": [ "Install dependencies conditionally based on whether the notebook is running in Colab or locally." 
] }, { "cell_type": "code", "execution_count": null, "id": "97d892e0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running Notebook locally\n", "\u001b[2mUsing Python 3.10.12 environment at: /home/aatman/Aatman/Tattle/feluda/.venv\u001b[0m\n", "\u001b[2mAudited \u001b[1m6 packages\u001b[0m \u001b[2min 11ms\u001b[0m\u001b[0m\n", "CPU times: user 6.38 ms, sys: 4.13 ms, total: 10.5 ms\n", "Wall time: 138 ms\n" ] } ], "source": [ "%%time\n", "import sys\n", "\n", "IN_COLAB = \"google.colab\" in sys.modules\n", "print(\"Running Notebook in Google Colab\" if IN_COLAB else \"Running Notebook locally\")\n", "\n", "if IN_COLAB:\n", " # Since Google Colab has preinstalled libraries like tensorflow and numba, we create a folder called feluda_custom_venv and isolate the environment there.\n", " # This is done to avoid any conflicts with the preinstalled libraries.\n", " %pip install uv\n", " !mkdir -p /content/feluda_custom_venv\n", " !uv pip install --target=/content/feluda_custom_venv --prerelease allow feluda \"feluda-video-hash-tmk\" > /dev/null 2>&1\n", "\n", " sys.path.insert(0, \"/content/feluda_custom_venv\")\n", "else:\n", " !uv pip install feluda \"feluda-video-hash-tmk\" > /dev/null 2>&1" ] }, { "cell_type": "markdown", "id": "35a99415-ca33-4395-8bf0-84b109bcd382", "metadata": {}, "source": [ "We'll use one operator for this example." 
] }, { "cell_type": "code", "execution_count": null, "id": "2143fc54", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading TMK binary to /var/folders/4p/bw6h5x8x1nb_17vsgfc12dz00000gn/T/tmk-hash-video\n" ] } ], "source": [ "from feluda.factory import VideoFactory\n", "from feluda.operators import VideoHash\n", "\n", "hasher = VideoHash()" ] }, { "cell_type": "code", "execution_count": 2, "id": "85d10961-4a9f-4027-acfc-a280de6a0888", "metadata": {}, "outputs": [], "source": [ "VIDEO_URLS = [\n", "    \"https://github.com/tattle-made/feluda_datasets/blob/main/feluda-sample-media/en-speech.mp4\",\n", "    \"https://github.com/tattle-made/feluda_datasets/blob/main/feluda-sample-media/hi-speech.mp4\",\n", "]" ] }, { "cell_type": "markdown", "id": "5a92ecda-1b9b-40f2-b159-42e8b68520d9", "metadata": {}, "source": [ "The code block below computes a perceptual hash for each of the two videos using the `feluda-video-hash-tmk` operator. The operator uses the compiled TMK binary from Facebook Research."
] }, { "cell_type": "code", "execution_count": null, "id": "ad07d1a7-5e74-4604-aab4-6f53bca59dcd", "metadata": {}, "outputs": [], "source": [ "hashes = []\n", "for i, video_url in enumerate(VIDEO_URLS, 1):\n", "    # Convert the GitHub blob URL to a raw URL for direct download\n", "    raw_url = video_url.replace(\"/blob/\", \"/raw/\")\n", "\n", "    # Download video using VideoFactory\n", "    video_obj = VideoFactory.make_from_url(raw_url)\n", "    video_path = video_obj[\"path\"]\n", "\n", "    # Generate TMK hash for the video\n", "    # Returns Base64-encoded pure average feature vector\n", "    hash_value = hasher.run(video_path)\n", "    hashes.append(hash_value)\n", "\n", "    # Display hash information\n", "    print(f\"Video {i} URL: {video_url}\")\n", "    print(f\"TMK Hash: {hash_value[:50]}...\")  # Show first 50 characters\n", "    print(f\"Hash Length: {len(hash_value)} characters\")\n", "    print()\n", "\n", "# Unpack the two hashes; the next cell compares them\n", "hash1, hash2 = hashes" ] }, { "cell_type": "code", "execution_count": null, "id": "e698bd28", "metadata": {}, "outputs": [], "source": [ "import base64\n", "\n", "# Decode Base64 hashes to compare raw feature vectors\n", "try:\n", "    decoded1 = base64.b64decode(hash1)\n", "    decoded2 = base64.b64decode(hash2)\n", "\n", "    # Calculate similarity metrics\n", "    hash_length = len(decoded1)\n", "    print(f\"Feature vector length: {hash_length} bytes\")\n", "\n", "    # Simple byte-by-byte comparison\n", "    identical_bytes = sum(1 for a, b in zip(decoded1, decoded2, strict=False) if a == b)\n", "    similarity_percentage = (identical_bytes / hash_length) * 100\n", "\n", "    print(f\"Identical bytes: {identical_bytes}/{hash_length}\")\n", "    print(f\"Similarity: {similarity_percentage:.2f}%\")\n", "\n", "    # Interpret similarity\n", "    if similarity_percentage > 80:\n", "        print(\"Result: HIGH SIMILARITY - Videos are likely very similar\")\n", "    elif similarity_percentage > 50:\n", "        print(\"Result: MODERATE SIMILARITY - Videos share some characteristics\")\n", "    else:\n", "        print(\"Result: LOW SIMILARITY - Videos are likely different\")\n", "\n", "except Exception as e:\n", "    print(f\"Error comparing hashes: {e}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "29a50a4a-a1f9-415f-9bfe-780c9427cab5", "metadata": {}, "outputs": [], "source": [ "# Clean up resources when you're done\n", "\n", "hasher.cleanup()" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" } }, "nbformat": 4, "nbformat_minor": 5 }