aim.hathifiles.poll module
- class aim.hathifiles.poll.NewFileHandler(new_files: list, store: list)[source]
Bases:
object
- notify_webhook()[source]
Sends the list of update files that have not been seen before to the Argo Events webhook for hathifiles.
- replace_store(store_path: str = 'tmp/hathi_file_list_store.json')[source]
Replaces the store file with a list of hathifile update files
- Parameters:
store_path (str, optional) – path to hathifiles store file. Defaults to S.hathifiles_store_path.
- property slim_store
Removes files from the store that are over one year old
- Returns:
list of update files that are newer than one year
- Return type:
list
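The one-year cutoff described for slim_store can be sketched as follows. This is a minimal illustration, not the module's implementation: it assumes update file names embed a date as YYYYMMDD (for example hathi_upd_20240601.txt.gz), and the helper name slim and the choice to keep names without a date are hypothetical.

```python
import re
from datetime import date, datetime, timedelta

# Assumption: each update file name contains its date as eight digits.
DATE_PATTERN = re.compile(r"(\d{8})")


def slim(store: list, today: date) -> list:
    """Return the entries of `store` dated within the last 365 days."""
    cutoff = today - timedelta(days=365)
    kept = []
    for name in store:
        match = DATE_PATTERN.search(name)
        if match is None:
            # Hypothetical choice: keep names that carry no parseable date.
            kept.append(name)
            continue
        file_date = datetime.strptime(match.group(1), "%Y%m%d").date()
        if file_date >= cutoff:
            kept.append(name)
    return kept


store = ["hathi_upd_20230101.txt.gz", "hathi_upd_20240601.txt.gz"]
recent = slim(store, today=date(2024, 7, 1))
```

Passing today explicitly keeps the sketch deterministic; the real property presumably uses the current date.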
- aim.hathifiles.poll.check_for_new_update_files(latest_update_files: list | None = None, store: list | None = None, new_file_handler_klass: ~typing.Type[~aim.hathifiles.poll.NewFileHandler] = <class 'aim.hathifiles.poll.NewFileHandler'>)[source]
Gets the latest list of hathifiles from hathitrust.org, loads the store file, and compares the two. If there are new files, it triggers the Argo Events webhook and updates the store. If there are no new files, it exits.
- Parameters:
latest_update_files (list | None, optional) – list of latest update files. This will call get_latest_update_files() when None is given.
store (list | None, optional) – list of hathifiles update files that have been seen before. This will call get_store() if None is given.
new_file_handler_klass (Type[NewFileHandler], optional) – Class that handles new update files. Defaults to NewFileHandler.
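The compare-then-handle flow, and the reason the handler class is injectable, can be sketched in isolation. Everything here is an assumption made for illustration: RecordingHandler and check_sketch are hypothetical names, and the order in which the real function calls notify_webhook() and replace_store() is not confirmed by this page.

```python
class RecordingHandler:
    """Hypothetical stand-in with the same constructor shape as NewFileHandler."""

    def __init__(self, new_files: list, store: list):
        self.new_files = new_files
        self.store = store
        self.calls = []

    def notify_webhook(self):
        # A real handler would contact the webhook; this one only records the call.
        self.calls.append("notify_webhook")

    def replace_store(self):
        # A real handler would rewrite the store file; this one only records the call.
        self.calls.append("replace_store")


def check_sketch(latest_update_files: list, store: list, handler_klass):
    """Hand files missing from `store` to a handler; return None if there are none."""
    new_files = [name for name in latest_update_files if name not in store]
    if not new_files:
        return None
    handler = handler_klass(new_files=new_files, store=store)
    handler.notify_webhook()
    handler.replace_store()
    return handler


handler = check_sketch(
    latest_update_files=["hathi_upd_20240601.txt.gz", "hathi_upd_20240602.txt.gz"],
    store=["hathi_upd_20240601.txt.gz"],
    handler_klass=RecordingHandler,
)
```

Supplying both lists and a substitute class, as the real signature allows, lets the comparison be exercised in tests without any network or file access.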
- aim.hathifiles.poll.create_store_file(store_path: str = 'tmp/hathi_file_list_store.json') None [source]
Creates a store file from the current list of update files on hathitrust.org if a store file does not already exist.
- Parameters:
store_path (str, optional) – path to store file. Defaults to S.hathifiles_store_path.
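A create-only-if-missing store can be sketched like this. It assumes the store is a JSON array of file names, which the .json default path suggests but this page does not state. The function name and the fetch_update_files callable are hypothetical; the callable stands in for the request to hathitrust.org.

```python
import json
from pathlib import Path
from typing import Callable


def create_store_file_sketch(
    store_path: str, fetch_update_files: Callable[[], list]
) -> bool:
    """Write the store only when it is missing; return True if a file was written."""
    path = Path(store_path)
    if path.exists():
        # Leave an existing store untouched so seen files are not forgotten.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(fetch_update_files()))
    return True
```

Calling it a second time with the same path is a no-op, which is the behavior the description above implies.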
- aim.hathifiles.poll.filter_for_update_files(hathi_file_list: list) list [source]
Takes the plain hathi_file_list list and keeps only the file names of update files.
- Parameters:
hathi_file_list (list) – full list of current hathifiles from hathitrust.org
- Returns:
flat list of update file names
- Return type:
list
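As an illustration of this filtering step, the sketch below assumes each entry is a dictionary with a "filename" key and a boolean "full" key that is false for update files. Those key names and the function name are assumptions and should be checked against the actual file list.

```python
def filter_update_names(hathi_file_list: list) -> list:
    """Return the file names of entries that are not flagged as full files."""
    return [
        entry["filename"]
        for entry in hathi_file_list
        if not entry.get("full", False)  # assumed flag: True marks a full file
    ]


sample = [
    {"filename": "hathi_full_20240601.txt.gz", "full": True},
    {"filename": "hathi_upd_20240602.txt.gz", "full": False},
]
update_names = filter_update_names(sample)
```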
- aim.hathifiles.poll.get_hathi_file_list() list [source]
Gets the current list of hathifiles from hathitrust.org.
- Returns:
list of dictionaries that describe hathifiles
- Return type:
list
- aim.hathifiles.poll.get_latest_update_files()[source]
Gets the current list of hathifiles from hathitrust.org and filters it down to a list of update files.
- Returns:
flat list of update file names
- Return type:
list
- aim.hathifiles.poll.get_store(store_path: str = 'tmp/hathi_file_list_store.json') list [source]
Loads the store file that contains the list of all hathifile update files that have been seen before.
- Parameters:
store_path (str, optional) – path to the store file. Defaults to S.hathifiles_store_path.
- Returns:
list of hathifile update files that have been seen before
- Return type:
list