fixed: self-referential type hinting for location
Signed-off-by: savoy <git@liberation.red>
fixed: type hinting cleanup
Type issues due to missing stubs are left as diagnostics for now, others
have been cleaned up.
Signed-off-by: savoy <git@liberation.red>
fixed: album scrape missed image count after HTML change
The HTML structure of an album page changed slightly, moving the
information about how many files the album contains and how many pages
it spans. The album scrape now reads that information correctly so that
each image link can be retrieved.
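
A minimal sketch of pulling both counts out of the album page text. The
"N files on M page(s)" wording, the COUNTS pattern and the
parse_counts() helper are assumptions made for illustration, not the
site's confirmed markup or the project's actual code.

```python
import re
from typing import Tuple

# Assumed wording of the stats text, e.g. "57 files on 3 page(s)".
COUNTS = re.compile(r"(\d+)\s+files?\s+on\s+(\d+)\s+page", re.IGNORECASE)


def parse_counts(text: str) -> Tuple[int, int]:
    """Return (file_count, page_count) found in the album stats text."""
    match = COUNTS.search(text)
    if match is None:
        # Fail loudly so a future HTML change is noticed right away.
        raise ValueError(f"no file/page counts found in: {text!r}")
    return int(match.group(1)), int(match.group(2))
```

Raising on a missing match, instead of returning zeros, is one way to
make the next layout change show up as an error instead of an empty
album.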
Signed-off-by: savoy <git@liberation.red>
fixed: proper album page iteration
Album pages were not being iterated. A new IterUrl class has been
created to facilitate this.
Page iteration has also moved from Album.__init__() to
Image.get_images() (which Album calls). This mirrors Category calling
Album.get_albums() and keeps the entry points consistent: passing the
base site (or any intermediate category) to get_categories() returns
all categories from that point in the hierarchy, passing a Category URL
to get_albums() returns the albums in that category, and passing an
Album URL to get_images() returns the images in that album.
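
A rough sketch of what a page-iterating helper like IterUrl could look
like. The constructor arguments and the "page" query parameter are
assumptions for illustration. The real class may be shaped differently.

```python
from typing import Iterator
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class IterUrl:
    """Yield one URL per album page (hypothetical sketch)."""

    def __init__(self, url: str, pages: int, param: str = "page") -> None:
        self.url = url
        self.pages = pages
        self.param = param  # assumed name of the page query parameter

    def __iter__(self) -> Iterator[str]:
        parts = urlsplit(self.url)
        query = dict(parse_qsl(parts.query))
        for number in range(1, self.pages + 1):
            # Set or overwrite the page number, keeping other parameters.
            query[self.param] = str(number)
            yield urlunsplit(parts._replace(query=urlencode(query)))
```

With this shape, get_images() could loop over IterUrl(album_url, pages)
and scrape each page in turn, with the page count taken from the album
stats.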
Signed-off-by: savoy <git@liberation.red>
changed: Site attributes are optional
Site can be run either to scrape the site hierarchy or to scrape only
the most recently updated albums, so one of `recent` or `categories`
will be None.
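
A minimal sketch of how the optional attributes could be typed and
consumed. The dataclass layout, the list-of-strings element type and
the scraped_items() helper are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """Hypothetical stand-in for the scraper's Site class."""

    url: str
    # Only one of these is expected to be populated per run.
    categories: Optional[List[str]] = None
    recent: Optional[List[str]] = None


def scraped_items(site: Site) -> List[str]:
    """Return whichever attribute was populated, guarding against None."""
    if site.recent is not None:
        return site.recent
    if site.categories is not None:
        return site.categories
    return []
```

Callers check for None explicitly before using either attribute, which
is also what a type checker would require of an Optional value.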
Signed-off-by: savoy <git@liberation.red>
added: timeout catches and prep to scrape images
Signed-off-by: savoy <git@liberation.red>
added: gets stats directly from album page
Album stats are generally pulled from the most closely related category
in the hierarchy, but the main album page also states how many files
the album contains. The date cannot be scraped from there, but the file
count can.
Signed-off-by: savoy <git@liberation.red>
changed: refactored code
The main entry point has been moved into __main__.py, and recursion
from site to category now works.
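
A small sketch of the site -> category recursion, assuming categories
form a tree. The Category fields and the walk() helper are hypothetical
and stand in for whatever the scraper actually builds.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Category:
    """Hypothetical category node with nested subcategories."""

    name: str
    children: List["Category"] = field(default_factory=list)


def walk(category: Category) -> Iterator[Category]:
    """Yield the category itself, then every descendant, depth first."""
    yield category
    for child in category.children:
        yield from walk(child)
```

Starting walk() at the site root covers the whole hierarchy, while
starting at an intermediate category covers only that subtree.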
Signed-off-by: savoy <git@liberation.red>