Automatically detects the input file type (HTML, CSV, JSON, TXT) from its extension and converts the contents to JSON. Also validates links and detects titles, tags, and descriptions.
How the Python script works:
- File type recognition: Supports HTML, CSV, JSON and TXT.
- Extract links: Depending on the file type, URLs and additional data such as title and description are processed.
- Validation: URLs are checked to filter out invalid or inaccessible links.
- Preview: The first 10 entries are displayed so that you can check the data.
- Export: Saves the data as a JSON file that can be imported into Hoarder.
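
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names (`detect_type`, `extract`, `is_valid`, `convert`), the `url`/`title` column and key names, and the output layout are all assumptions, and the output is not guaranteed to match the schema Hoarder expects. Validation here is only a syntactic check; testing whether a link is actually reachable would need a network request, which is left out.

```python
import csv
import io
import json
import re
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical patterns for this sketch; a real HTML parser would be more robust.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
ANCHOR_RE = re.compile(r'<a\s[^>]*href="([^"]+)"[^>]*>(.*?)</a>', re.I | re.S)


def detect_type(filename):
    """Map a file extension to one of the supported types."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext == "htm":
        ext = "html"
    if ext not in {"html", "csv", "json", "txt"}:
        raise ValueError(f"unsupported file type: {filename}")
    return ext


def extract(text, kind):
    """Return a list of {"url", "title"} dicts from the file contents."""
    if kind == "html":
        # Use the anchor text (with nested tags stripped) as the title.
        return [{"url": url, "title": re.sub(r"<[^>]+>", "", label).strip()}
                for url, label in ANCHOR_RE.findall(text)]
    if kind == "csv":
        # Assumes a header row containing at least a "url" column.
        return [{"url": row.get("url") or "", "title": row.get("title") or ""}
                for row in csv.DictReader(io.StringIO(text))]
    if kind == "json":
        # Assumes a top-level list of objects with a "url" key.
        return [{"url": item.get("url") or "", "title": item.get("title") or ""}
                for item in json.loads(text)]
    # Plain text: collect every http(s) URL found anywhere in the file.
    return [{"url": url, "title": ""} for url in URL_RE.findall(text)]


def is_valid(url):
    """Syntactic check only: http(s) scheme and a non-empty host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def convert(filename, text, preview=10):
    """Detect the type, extract and filter links, preview, return JSON text."""
    kind = detect_type(filename)
    entries = [e for e in extract(text, kind) if is_valid(e["url"])]
    print(json.dumps(entries[:preview], indent=2))  # preview of first entries
    return json.dumps(entries, indent=2)
```

Calling `convert("bookmarks.html", Path("bookmarks.html").read_text())` would print the preview and return a JSON string that can then be written to a file.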
How the Bash script works:
- File type detection: Uses the file extension to select the appropriate method (CSV, JSON, HTML, TXT).
- Extract links: Uses awk, grep, or jq, depending on the file type.
- URL validation: Checks accessibility of URLs with curl.
- Preview: Shows a JSON preview of the first 20 entries with jq.
- Export: Saves the data in a JSON file.
Prerequisites:
- The Bash script requires curl, jq, awk, and grep to be installed.
## License
This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.