Heuristic-based boilerplate removal tool
Updated Feb 25, 2025 - Python
Undetected Web-Scraping & Seamless HTML Parsing in Python!
procyclingstats scraper
Parses CAP (Common Alerting Protocol) XML alerts and HTML, inserts new alerts into a database, and sends notifications through OneSignal (possible Android and iOS push notifications), Twitter, Facebook, and MailChimp (e-mail notifications), as part of an open-source early-warning solution for natural disasters.
BeautifulSoup4 packaged into a command line tool
django-janitor allows you to use bleach to clean HTML stored in a Model's field.
Web spider that scans UR for available rooms and outputs the results as CSV
This Python script scrapes internal links on a webpage. It prompts for a URL, sends a GET request to retrieve the HTML, and uses BeautifulSoup to parse the page and filter its links. It then prompts the user for an output mode (terminal or file) and either prints the links or writes them to a file. It installs the required modules (requests and beautifulsoup4) if they are not found.
This script analyzes the number of Telegram messages over time
Get insights into your Facebook Messenger activity with Splunk
The first public repository that provides a free BUBT website scraping API script on GitHub.
CLI tool for sitemap generation
A simple HTML form password brute-forcing tool written in Python.
Examples of how to process HTML files in Python
Simple example of a web scraper using Python. It asks the user on the console for the name of a band or artist, then uses Selenium WebDriver and BeautifulSoup to print information about that artist's or band's discography.
This Python script scrapes Salatomatic for US masjid data, including names, locations, and phone numbers. It uses requests, BeautifulSoup, and csv modules for web scraping and CSV handling.
Script for extracting data from the site "dop.edu.ru"
🤖 This bot parses a list of web pages and sends messages with the parsing results
A powerful desktop application to download, archive, and manage web pages locally with full resource support, built with Python and PyQt6.
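
The internal-link scraper described earlier in this list (fetch a page, parse the HTML, keep only links on the same site) can be illustrated with a minimal sketch. This is not that repository's code: it swaps BeautifulSoup for the standard library's `html.parser` so it runs without third-party packages, and the names `_LinkCollector` and `internal_links` are hypothetical. It assumes "internal" means "same host as the starting URL".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, base_url):
    """Return sorted, de-duplicated absolute links on the same host as base_url."""
    collector = _LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    found = set()
    for href in collector.hrefs:
        # Resolve relative links such as "/about" against the page URL.
        absolute = urljoin(base_url, href)
        parsed = urlparse(absolute)
        # Assumption: a link is internal when it is http(s) and shares the host.
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            found.add(absolute)
    return sorted(found)
```

To use it on a live page, the HTML string would first have to be fetched, for example with `urllib.request.urlopen(url).read().decode()`; the returned list can then be printed or written to a file.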