Moldable Data Wrapper
You are working in a domain with existing data that you want to turn into an explainable system.
How do you develop custom tools for existing data?
— When we explore data, we represent them using suitable low-level data structures.
— Data structures (lists, dictionaries) reflect the representation of data, not their interpretation.
— To analyze and explore data, we need higher-level views that reflect our understanding of the data.
Wrap each kind of data using a dedicated class reflecting the problem domain entity.
As you explore the data, add custom tools (views, etc.) to the domain class that answer the questions you ask about the data.
First extract the data of interest. This might be data sitting in your file system (for example, a CSV file), or data retrieved from a website. For example, here we retrieve a Dictionary representation of JSON data about the feenk GitHub organization:
url := 'https://api.github.com/orgs/feenkcom'.
json := ZnClient new get: url.
dictionary := STON fromString: json.
The dictionary representation, however, is not well suited to exploring the GitHub organization domain. If we inspect the resulting object (see below), we see only the keys and values of the raw downloaded data. We can still explore the data by navigating the Dictionary views, or by programmatically following other paths, but we cannot add or tailor views that specifically support the GitHub organization domain.
Instead, we wrap the dictionary in a dedicated domain class. We can then add custom views specific to this domain, for example, listing the repositories of an organization or its most recent GitHub events. For each new domain concept, we introduce a dedicated wrapper object, so that we can navigate the entire model via the domain concepts. For example:
GhOrganization new rawData: dictionary
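A minimal sketch of what such a wrapper class might look like follows. Only the class name GhOrganization and the rawData: setter come from the text above. The package name, the login and publicRepoCount accessors, and the gtDetailsFor: view are assumptions made for illustration, and the JSON keys 'login' and 'public_repos' are those returned by the GitHub organizations endpoint. Methods are shown in the conventional Class >> selector notation; each one is added to the class in the usual way.

```smalltalk
"Hypothetical wrapper class: a single slot holds the raw dictionary."
Object subclass: #GhOrganization
	instanceVariableNames: 'rawData'
	classVariableNames: ''
	package: 'GitHub-Model-Sketch'.

GhOrganization >> rawData: aDictionary
	"Store the parsed JSON dictionary that this object wraps."
	rawData := aDictionary

GhOrganization >> rawData
	"Answer the wrapped dictionary."
	^ rawData

GhOrganization >> login
	"Domain-level accessor (assumed name) for the organization's login."
	^ rawData at: 'login' ifAbsent: [ 'unknown' ]

GhOrganization >> publicRepoCount
	"Domain-level accessor (assumed name) for the number of public repositories."
	^ rawData at: 'public_repos' ifAbsent: [ 0 ]

GhOrganization >> gtDetailsFor: aView
	"A custom inspector view (assumed name) that lists a few domain properties."
	<gtView>
	^ aView columnedList
		title: 'Details';
		priority: 10;
		items: [ {'Login' -> self login.
			'Public repos' -> self publicRepoCount} ];
		column: 'Property' text: [ :assoc | assoc key ];
		column: 'Value' text: [ :assoc | assoc value gtDisplayString ]
```

With these methods in place, inspecting GhOrganization new rawData: dictionary would show a Details view alongside the default ones. Each new question about the data can be answered by adding another accessor or view to the same class.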
We see the result after some moldable development steps below:
By wrapping the raw domain data, you obtain a moldable object that can be customized to form part of an explainable system. Each time you navigate to other data representing further domain entities, wrap them as well, to build an explorable domain model. If the custom tools of the underlying data objects are useful, you can reuse them by making them available to the wrapper objects as well.
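Both ideas can be sketched as follows, assuming GhOrganization keeps its dictionary in a rawData instance variable. The repos and gtRawDataFor: selectors are illustrative, and GhRepo is a hypothetical wrapper class assumed to understand rawData: in the same way. The first method wraps each repository dictionary as it is reached. The second reuses the dictionary's own items view by forwarding to it.

```smalltalk
GhOrganization >> repos
	"Fetch the repository list and wrap each raw dictionary in its own domain object.
	GhRepo is a hypothetical class, assumed to be defined like GhOrganization."
	| json |
	json := ZnClient new get: (rawData at: 'repos_url').
	^ (STON fromString: json) collect: [ :each | GhRepo new rawData: each ]

GhOrganization >> gtRawDataFor: aView
	"Make the existing Dictionary items view available on the wrapper."
	<gtView>
	^ aView forward
		title: 'Raw data';
		priority: 50;
		object: [ rawData ];
		view: #gtItemsFor:
```

Forwarding keeps the raw keys and values one click away, so wrapping the data does not hide anything that the plain dictionary would have shown.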
Lifeware uses Moldable Data Wrappers extensively to wrap SQL data in the insurance domain, so that custom views can be created for these data. Within GT (as of August 2024) there are over 40 classes that wrap a rawData slot and provide custom views. These classes wrap diverse data, ranging from Jupyter notebooks to social media records. EGAD is a research prototype for exploring GitHub Actions by wrapping the raw data obtained from GitHub as moldable objects.
A Moldable Collection Wrapper serves a similar purpose, but solves a different problem, which is to allow the results of queries to be moldable.