Web Scraping with Python: How To Crawl, Scrape, and Analyze URLs

Web Scraping 101: How to crawl, scrape, and analyze websites in Python

How do you convert an entire website into JSON when an API isn’t available? Many people would write a web crawler to discover every URL on a domain, then write a web scraper for each type of page to transform it into structured data. After that, they’d still have to de-dupe records, strip HTML, and more just to get the data into a structured state. That sounds like a lot of work.
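
To make the manual route concrete, here is a minimal sketch of the per-page work using only the Python standard library. It assumes the HTML has already been fetched; the names `PageParser` and `scrape_page` are hypothetical and chosen for illustration, not taken from any particular library. It strips the markup, collects same-domain links for a crawler to follow, and de-dupes them.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class PageParser(HTMLParser):
    """Collects link targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that a reader would actually see.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def scrape_page(url, html):
    """Turn one page into a dict with its text and de-duped same-domain links."""
    parser = PageParser()
    parser.feed(html)
    parser.close()

    domain = urlparse(url).netloc
    seen, links = set(), []
    for href in parser.links:
        # Resolve relative links and drop #fragments so duplicates collapse.
        absolute = urldefrag(urljoin(url, href)).url
        if urlparse(absolute).netloc == domain and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)

    return {"url": url, "text": " ".join(parser.text_parts), "links": links}


if __name__ == "__main__":
    sample = '<p>Hello <b>world</b></p><a href="/about">About</a>'
    print(json.dumps(scrape_page("http://example.com/", sample), indent=2))
```

A full crawler would wrap this in a loop that fetches each newly discovered link and tracks visited URLs, which is exactly the extra work described above.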
Use LDA to Classify Text Documents

The LDA microservice is a quick and useful implementation of MALLET, a Java-based machine learning toolkit for language. This topic modeling package automatically finds the relevant topics in unstructured text data.

The Algorithmia implementation makes LDA available as a REST API and removes the need to install multiple packages, manage servers, or deal with dependencies. This microservice accepts strings, files, and URLs, and can also take a stop word list as an argument.