minsci.xmu package
Subpackages
- minsci.xmu.containers package
- minsci.xmu.tools package
  - Subpackages
  - Submodules
  - minsci.xmu.tools.audits module
  - minsci.xmu.tools.describer module
  - minsci.xmu.tools.groups module
  - minsci.xmu.tools.legacy module
  - minsci.xmu.tools.mapper module
  - minsci.xmu.tools.matcher module
  - minsci.xmu.tools.operations module
  - Module contents
Submodules
minsci.xmu.fields module
Reads and returns information about EMu’s schema

class minsci.xmu.fields.XMuFields(schema_path=None, whitelist=None, blacklist=None, cache=True, verbose=False)
Bases: object
Reads and stores metadata about fields in EMu
Parameters:
- schema_path (str) – path to EMu schema file. If None, looks for a copy of the schema stored in files.
- whitelist (list) – list of EMu modules to include. If None, anything not on the blacklist is included.
- blacklist (list) – list of EMu modules to exclude. If None, no modules are excluded.
- cache (str) – path to cache file. If specified, script will check there for a cache file and create one if it isn’t found.
- verbose (bool) – triggers verbose output
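The whitelist and blacklist rules described above can be sketched as a small filter. This is an illustration only, not the XMuFields implementation: `select_modules` is a hypothetical helper, and the rule that the blacklist wins when a module appears on both lists is an assumption the documentation does not state.

```python
def select_modules(modules, whitelist=None, blacklist=None):
    """Return the modules that pass the whitelist/blacklist rules.

    Hypothetical helper, not part of minsci.
    """
    excluded = set(blacklist or [])
    if whitelist is None:
        # No whitelist: anything not on the blacklist is included
        candidates = list(modules)
    else:
        allowed = set(whitelist)
        candidates = [m for m in modules if m in allowed]
    # Assumption: the blacklist takes precedence over the whitelist
    return [m for m in candidates if m not in excluded]


modules = ["ecatalogue", "eparties", "emultimedia"]
print(select_modules(modules))
print(select_modules(modules, blacklist=["emultimedia"]))
print(select_modules(modules, whitelist=["ecatalogue"]))
```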

Attributes:
- schema (dict) – path-keyed dicts of field data
- tables (dict) – module-keyed lists of paths to tables
- map_tables (dict) – path-keyed lists of paths to tables
- verbose (bool) – triggers verbose output

add_table(columns)
Update table containers with new table
Parameters: columns (list) – columns in the table being added

get(*args)
Return data for an EMu export path
Modified from DeepDict.pull() to jump to a different module when a reference is encountered.
Parameters: *args – the path to a value in the dictionary, with one component of that path per arg
Returns: Dictionary with information about the given path
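As a rough sketch of the lookup described for get(), the function below walks a nested dictionary one path component at a time and jumps to another module when it meets a reference. The schema layout, the "ref" marker, and the field names are all assumptions made for illustration; the real XMuFields.schema may be organized differently.

```python
# Hypothetical schema layout: module -> field -> field data.
# A field holding {"ref": <module>} stands in for a reference field.
SCHEMA = {
    "ecatalogue": {
        "CatNumber": {"type": "Integer"},
        "IdeTaxonRef": {"ref": "etaxonomy"},
    },
    "etaxonomy": {
        "ClaSpecies": {"type": "Text"},
    },
}


def get(schema, *args):
    """Return the data stored at the path given by args."""
    node = schema
    for key in args:
        # When the current node is a reference, continue the lookup
        # in the module that it points to
        if isinstance(node, dict) and "ref" in node:
            node = schema[node["ref"]]
        node = node[key]
    return node


print(get(SCHEMA, "ecatalogue", "CatNumber"))
print(get(SCHEMA, "ecatalogue", "IdeTaxonRef", "ClaSpecies"))
```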

static get_xpath(*args)
Reformat plain-text path to xpath
Parameters: path (str) – an XMuFields path
Returns: Path string reformatted as in an EMu export
minsci.xmu.xmu module
Reads and writes XML formatted for Axiell EMu

class minsci.xmu.xmu.ABCEncoder(*args, **kwargs)
Bases: json.encoder.JSONEncoder

default(abc)
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)

class minsci.xmu.xmu.Grid(fields, operator)
Bases: tuple

fields
Alias for field number 0

operator
Alias for field number 1

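The "Alias for field number N" descriptions are the ones collections.namedtuple generates, which suggests Grid is defined that way. A minimal sketch under that assumption follows; the field values are placeholders, since what minsci stores in fields and operator is not documented here.

```python
from collections import namedtuple

# Assumed definition; the real one lives in minsci.xmu.xmu
Grid = namedtuple("Grid", ["fields", "operator"])

# Placeholder values for illustration only
grid = Grid(fields=["MinName_tab", "MinKind_tab"], operator="+")

# Values are available by name, by position, or by unpacking
print(grid.fields)
print(grid[1])
fields, operator = grid
```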
class minsci.xmu.xmu.XMu(path, fields=None, container=None, module=None)
Bases: object
Read and search XML export files from EMu

Attributes:
- fields (XMuFields) – based on fields kwarg
- module (str) – name of base module
- record (dict) – the currently active record
- schema (dict) – XMuFields.schema
- tables (dict) – XMuFields.tables
- verbose (bool) – triggers verbose output
- xpaths (list) – paths from source file
Parameters:
- path (str) – path to EMu XML report or directory containing multiple reports. If multiple reports are found, they are handled from newest to oldest.
- fields (XMuFields) – contains data about fields
- container (DeepDict) – class to use to store EMu data

autoiterate(keep=None, **kwargs)
Automatically iterates over the source file and caches the result

fast_iter(func=None, report=0, skip=0, limit=0, callback=None, callback_kwargs=None, **kwargs)
Use callback to iterate through an EMu export file
Parameters:
- func (function) – name of iteration function
- report (int) – number of records at which to report progress. If 0, no progress report is made.
- skip (int) – number of records to skip before processing
- limit (int) – number of records at which to stop processing the file
- callback (function) – name of function to run upon completion
Returns: Boolean indicating whether the entire file was processed successfully.

find(rec, *args)
Return value(s) for a given path in the EMu XML export
Parameters:
- rec (lxml.etree.ElementTree) – XML formatted for EMu
- *args (str) – strings comprising the path to a field
Returns: String (for atomic field) or list (for table) containing value(s) along the path given by *args. Note that blank rows following the last populated row in a table are not included in the list.
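To illustrate the two return types, here is a simplified lookup written against the standard library's xml.etree.ElementTree rather than lxml. The record layout (tuple, atom, and table elements carrying a name attribute) and the field names are assumptions, and the helper handles only one-part atomic paths and two-part table paths.

```python
import xml.etree.ElementTree as ET

# Assumed record layout, with illustrative field names
RECORD = """
<tuple>
  <atom name="CatPrefix">G</atom>
  <table name="MinName_tab">
    <tuple><atom name="MinName">Quartz</atom></tuple>
    <tuple><atom name="MinName">Calcite</atom></tuple>
  </table>
</tuple>
"""


def find(rec, *args):
    """Return a string for an atomic field or a list for a table."""
    if len(args) == 1:
        atom = rec.find("atom[@name='{}']".format(args[0]))
        return (atom.text or "") if atom is not None else ""
    table_name, column = args
    table = rec.find("table[@name='{}']".format(table_name))
    if table is None:
        return []
    xpath = "atom[@name='{}']".format(column)
    # One value per row; a row missing the column yields an empty string
    return [row.findtext(xpath) or "" for row in table.findall("tuple")]


rec = ET.fromstring(RECORD)
print(find(rec, "CatPrefix"))
print(find(rec, "MinName_tab", "MinName"))
```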

harmonize(new_val, old_val, path, action='fill')
Harmonize new values with existing values on the same path
Parameters:
- new_val (str) – new or replacement value
- old_val (str) – existing value
- path (str) – path to field in XMuSchema
- action (str) – one of ‘fill’ (add new value if blank), ‘append’ (append new value using either a new row or delimiter), or ‘replace’. The default is ‘fill’.
Returns: Tuple containing (revised value, update boolean)
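The three actions can be sketched for plain string values as follows. This is a simplified stand-in, not the XMu.harmonize method: it drops the path argument, covers only the delimiter form of 'append' (not the new-row form), and the delimiter and the duplicate check are assumptions.

```python
def harmonize(new_val, old_val, action="fill", delim="; "):
    """Return (revised value, whether the value changed)."""
    if action == "fill":
        # Only add the new value if nothing is there yet
        if old_val:
            return old_val, False
        return new_val, bool(new_val)
    if action == "append":
        # Assumption: values already present are not appended again
        if not new_val or new_val in old_val.split(delim):
            return old_val, False
        if not old_val:
            return new_val, True
        return old_val + delim + new_val, True
    if action == "replace":
        return new_val, new_val != old_val
    raise ValueError("unknown action: {!r}".format(action))


print(harmonize("Quartz", ""))
print(harmonize("Calcite", "Quartz", action="append"))
print(harmonize("Calcite", "Quartz", action="replace"))
```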

read(root, keys=None, result=None, counter=None)
Read an EMu XML record to a dictionary
This is much faster than iterating through the XMu.xpaths list.
Parameters:
- root (lxml.etree) – an EMu XML record
- keys (list) – parents of the current key
- result (XMuRecord) – path-keyed representation of root updated as the record is read
- counter (dict) – tracks row counts by path
Returns: Path-keyed dictionary representing root
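One possible reading of "path-keyed" is a flat dictionary whose keys are tuples of path components, built by recursing through the record while carrying the parent keys along. The sketch below takes that reading. It uses the standard library's ElementTree instead of lxml, omits the counter argument, and assumes a record layout of tuple, atom, and table elements with illustrative field names; the XMuRecord structure that minsci actually returns may differ.

```python
import xml.etree.ElementTree as ET

# Assumed record layout, with illustrative field names
RECORD = """
<tuple>
  <atom name="CatPrefix">G</atom>
  <table name="MinName_tab">
    <tuple><atom name="MinName">Quartz</atom></tuple>
    <tuple><atom name="MinName">Calcite</atom></tuple>
  </table>
</tuple>
"""


def read(root, keys=None, result=None):
    """Recursively read a record into a dict keyed by path tuples."""
    keys = keys or []
    result = {} if result is None else result
    for child in root:
        name = child.get("name")
        if child.tag == "atom":
            result[tuple(keys + [name])] = child.text or ""
        elif child.tag == "table":
            # Each row is addressed by its index within the table
            for i, row in enumerate(child.findall("tuple")):
                read(row, keys + [name, i], result)
        elif child.tag == "tuple":
            read(child, keys + [name], result)
    return result


data = read(ET.fromstring(RECORD))
for path, value in data.items():
    print(path, value)
```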

read1(root, keys=None, result=None, counter=None)
Read an EMu XML record to a dictionary
This is much faster than iterating through the XMu.xpaths list.
Parameters:
- root (lxml.etree) – an EMu XML record
- keys (list) – parents of the current key
- result (XMuRecord) – path-keyed representation of root updated as the record is read
- counter (dict) – tracks row counts by path
Returns: Path-keyed dictionary representing root

minsci.xmu.xmu.check_columns(*args)
Check if columns in the same table are the same length
Parameters: *args – lists of values for each column
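A length check of this kind can be sketched in one line. `same_length` is a hypothetical name, and returning a boolean is an assumption; the documentation does not say whether check_columns returns a value or raises an error on a mismatch.

```python
def same_length(*columns):
    """Return True if every column has the same number of rows."""
    return len({len(col) for col in columns}) <= 1


print(same_length(["Quartz", "Calcite"], ["primary", "secondary"]))
print(same_length(["Quartz", "Calcite"], ["primary"]))
```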

minsci.xmu.xmu.check_table(rec, *args)
Check that the columns in a table are all the same length
minsci.xmu.xmungo module
Reads data from NMNH MongoDB collections database

class minsci.xmu.xmungo.MongoBot(username, password, instance=None, container=None)
Bases: object
Contains methods to connect and interact with NMNH MongoDB

class minsci.xmu.xmungo.MongoDoc(*args, **kwargs)
Bases: dict
Dict subclass with methods supporting Mongo-style paths
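MongoDB addresses nested values with dot-separated paths such as "names.0.name". A dict subclass supporting that style of lookup might look like the sketch below. `PathDict` and `getpath` are hypothetical names, not the MongoDoc API, and the document keys are invented for illustration.

```python
class PathDict(dict):
    """Dict that resolves dot-separated, Mongo-style paths."""

    def getpath(self, path, default=None):
        node = self
        for key in path.split("."):
            try:
                # Numeric components index into lists
                if isinstance(node, list):
                    node = node[int(key)]
                else:
                    node = node[key]
            except (KeyError, IndexError, ValueError, TypeError):
                return default
        return node


doc = PathDict({
    "catalog": {"prefix": "G"},
    "names": [{"name": "Quartz"}, {"name": "Calcite"}],
})
print(doc.getpath("catalog.prefix"))
print(doc.getpath("names.1.name"))
print(doc.getpath("names.5.name", default=""))
```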

class minsci.xmu.xmungo.XMungo(*args, **kwargs)
Bases: minsci.xmu.xmungo.MongoBot
Contains methods to interact with Mongo data using XMu tools

fast_iter(query=None, func=None, report=0, skip=0, limit=0, callback=None, **kwargs)
Use function to iterate through a MongoDB record set
This method reproduces most (but not all) of the functionality of the XMu.fast_iter() method.
Parameters:
- func (function) – name of iteration function
- report (int) – number of records at which to report progress. If 0, no progress report is made.
- limit (int) – number of records at which to stop
- callback (function) – name of function to run upon completion
Returns: Boolean indicating whether the entire record set was processed successfully.

Module contents
Provides tools to read, write, and otherwise process EMu XML files