The Constellation File Format

    Constellation saves graphs in a ".star" file. This is simply a container in zip format that holds the graph file "graph.txt" and optionally other files, such as node icons.

    This file format is also used when a graph is exported to JSON. There are two differences between saving and exporting:

    Other than these two differences, the description below applies to both variations.

    graph.txt (or name.json)

    The "graph.txt" (or "name.json") file contains data in UTF-8 encoded JSON format that represents a graph. To extract the data manually, an unzip utility can be used. The simplest way of doing this in Windows is to append ".zip" to the filename (so the file type is recognised) and double-click on the file in Windows Explorer.

    Because the storage format is JSON, any language with a JSON parser can read the graph. In particular, Python can be used to read and write a JSON file in a zip container.
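
    As a minimal sketch of that approach, the following uses only Python's standard zipfile and json modules. The function names read_graph and write_graph are hypothetical, and the sample document is a placeholder rather than a complete Constellation graph.

```python
import io
import json
import zipfile


def read_graph(star_file, member="graph.txt"):
    """Return the parsed JSON document stored in a .star container."""
    with zipfile.ZipFile(star_file) as container:
        with container.open(member) as entry:
            return json.loads(entry.read().decode("utf-8"))


def write_graph(star_file, document, member="graph.txt"):
    """Write a JSON document into a new .star container."""
    text = json.dumps(document, indent=2)
    with zipfile.ZipFile(star_file, "w", zipfile.ZIP_DEFLATED) as container:
        container.writestr(member, text.encode("utf-8"))


# Round trip through an in-memory buffer instead of a file on disk.
# The sample document is a placeholder, not a complete graph.
sample = [{"version": 1}, {"graph": [{"attrs": []}, {"data": [{}]}]}]
buffer = io.BytesIO()
write_graph(buffer, sample)
buffer.seek(0)
loaded = read_graph(buffer)
```

    Both functions also accept a path such as "mygraph.star" in place of the in-memory buffer.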

    The output shown below from a Python script demonstrates the top-level structure of a JSON document that describes a Constellation graph:

                [ { "attribute_mod_count": 65,
                    "global_mod_count": 591,
                    "schema": "au.gov.asd.tac.constellation.schema.InteractiveSchemaFactory",
                    "structure_mod_count": 65,
                    "version": 1},
                  { "graph": [{ "attrs": [...]}, { "data": [...]}]},
                  { "vertex": [{ "attrs": [...]}, { "data": [...]}]},
                  { "transaction": [{ "attrs": [...]}, { "data": [...]}]},
                  { "meta": [{ "attrs": [...]}, { "data": [...]}]}]
          

    Furthermore, Constellation encodes the JSON document in "pretty-printed" style, so it is possible to use simple tools such as grep(1) to search for text.

    graph.txt (or name.json) Sections

    The outermost structure of the graph is an ordered list containing five elements. Each element (apart from "version") contains a dictionary with a single key defining the graph section contained in that element. The sections must appear in the following order:

    1. version - An integer version number that defines the remaining structure. This section may also contain other unspecified data.
    2. graph - Contains data relevant to the graph (e.g. background color).
    3. vertex - Contains data relevant to the nodes (e.g. the name of the node).
    4. transaction - Contains data relevant to the transactions (e.g. line style).
    5. meta - Contains data about the graph environment (e.g. the attributes used to describe View states).

    Why do it like this? Why not just use a top-level dictionary containing the section keys? Unlike XML elements, JSON objects contain unordered name/value pairs. This means that (for example) when a Python dictionary is serialised, the "transaction" key might appear before the "vertex" key in the resulting JSON. Since transactions can't be added to the graph before their corresponding nodes, a reader would have to load the entire structure into memory to ensure that "vertex" could be accessed first, which would not be a good idea for large graphs.

    By making the top-level structure a JSON array with a specified order, the data can be streamed from the file in the required order, making graph reading more efficient.

    Given that the order is defined, putting a JSON object with a single name in each element is superfluous. However, the name provides a built-in level of documentation, making the file (slightly) more readable, and adds little overhead when reading and writing the file.
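
    The ordering rule can be checked with a short sketch like the one below. It works on an already-parsed document (so it does not demonstrate streaming), and the names SECTION_ORDER and split_sections are hypothetical, not part of Constellation.

```python
SECTION_ORDER = ["graph", "vertex", "transaction", "meta"]


def split_sections(document):
    """Return (header, sections) for a parsed graph document.

    Raises ValueError if a section is unknown or appears out of order.
    Sections that are absent are simply skipped.
    """
    header, *rest = document
    if "version" not in header:
        raise ValueError("first element must carry the version")
    sections = {}
    position = -1
    for element in rest:
        if len(element) != 1:
            raise ValueError("each section element must have exactly one key")
        name, body = next(iter(element.items()))
        if name not in SECTION_ORDER:
            raise ValueError(f"unknown section {name!r}")
        index = SECTION_ORDER.index(name)
        if index <= position:
            raise ValueError(f"section {name!r} is out of order")
        position = index
        # Each section is a list of two single-named objects.
        sections[name] = (body[0]["attrs"], body[1]["data"])
    return header, sections
```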

    Each of the graph sections ("graph", "vertex", "transaction", "meta") contains data in the same format: a list of two single-named objects, "attrs" followed by "data".

    The structure here is a list of single-named dictionaries, rather than a single dictionary with two names, for the same reason that the top-most structure is a list of single-named dictionaries: the attributes must be defined before they can be used.

    If an object does not define a name, the value of that name is assumed to be null. e.g. if the vertex "attrs" section defines a "Country" attribute, and an object in the vertex "data" section has no "Country" name, then the resulting graph will have a "Country" value of null.
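
    A reader could apply this rule as in the sketch below. The helper name complete_record is hypothetical, and Python's None stands in for JSON null.

```python
def complete_record(attrs, record):
    """Return a copy of record with every declared attribute present.

    Attributes that the record does not mention are set to None.
    Keys that are not declared attributes are kept unchanged.
    """
    completed = {attr["label"]: None for attr in attrs}
    completed.update(record)
    return completed


attrs = [{"label": "Name", "type": "string"},
         {"label": "Country", "type": "string"}]
vertex = complete_record(attrs, {"Name": "Node 0"})
```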

    Attributes

    Attributes define the data values that are attached to elements of the graph. Each attribute has four components, which appear in the examples in the sections below as "label", "type", "descr", and "default".

    Attributes are defined separately in each graph section; an attribute defined in the "vertex" section cannot be used in the "transaction" section.

    Data Types

    Constellation defines some built-in data types. These are listed below.

    All data type values have string representations, so they can be round-tripped from their internal representation to a JSON document when saved, and back to their internal representation when loaded. Some floating point numbers may not be retrieved exactly, due to the inexactness inherent in the string representation. The same round-tripping would work for other string formats such as CSV.
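
    The floating point caveat can be illustrated in Python. This is only a sketch of the general effect: the seven-digit format is chosen for illustration and is not a statement about how Constellation formats its numbers.

```python
value = 0.1 + 0.2  # stored as 0.30000000000000004 in a 64-bit float

exact_text = repr(value)           # shortest text that round-trips exactly
short_text = format(value, ".7g")  # seven significant digits

exact_back = float(exact_text)
short_back = float(short_text)

print(exact_back == value)  # the full representation is recovered exactly
print(short_back == value)  # the shortened representation is not
```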

    "graph" Section

    An example JSON document section is shown below:

                "graph" : [ {
                  "attrs" : [ {
                    "label" : "color",
                    "type" : "color",
                    "descr" : "The background color of the graph",
                    "default" : "Black",
                    "mod_count" : 0
                  }, {
                    "label" : "time_zone",
                    "type" : "time_zone",
                    "descr" : "time_zone",
                    "default" : "UTC",
                    "mod_count" : 0
                  } ]
                }, {
                  "data" : [ { } ]
                } ]
          

    Attributes are defined in the "attrs" object as an array of objects (in Python terms, a list of dictionaries). Although arrays have order, no ordering is imposed by Constellation. Each object has four defined name/value pairs, which may appear in any order.

    Data are stored in the "data" object. In this case, no data have been stored, so the defaults defined in the attributes will be used (e.g. the graph will use the default black background).

    "vertex" Section

    A part of an example "vertex" section is shown below:

                "vertex" : [ {
                  "attrs" : [ {
                    "label" : "x",
                    "type" : "float",
                    "descr" : "The x coordinate of the vertex",
                    "default" : 0.0
                  }, {
                    "label" : "icon",
                    "type" : "icon",
                    "descr" : "The icon of the vertex",
                    "default" : ""
                  }, {
                    "label" : "Name",
                    "type" : "string"
                  } ]
                }, {
                  "data" : [ {
                    "vx_id_" : 0,
                    "x" : 9.760799,
                    "icon" : "Flag.Australia",
                    "Name" : "Node 0"
                  }, {
                    "vx_id_" : 1,
                    "x" : 0.22238255,
                    "icon" : "Misc.Constellation",
                    "Name" : "Node 1"
                  } ]
                } ]
          

    In this example, three vertex attributes are defined in the "attrs" section: "x" (type float), "icon" (type icon), and "Name" (type string).
    NOTE: "Name" has no defined "descr" or "default" values, so these will be null. Two vertices are defined in the "data" section with specific values assigned to their attributes.
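
    The treatment of missing "descr" and "default" values can be sketched as follows. The helper name normalise_attr is hypothetical, and Python's None stands in for JSON null.

```python
def normalise_attr(attr):
    """Return an attribute definition with all four names present.

    "label" and "type" are taken as given. A missing "descr" or
    "default" becomes None.
    """
    return {
        "label": attr["label"],
        "type": attr["type"],
        "descr": attr.get("descr"),
        "default": attr.get("default"),
    }


name_attr = normalise_attr({"label": "Name", "type": "string"})
```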

    The vertices define values for a special attribute, which appears in the example above as "vx_id_".

    "transaction" Section

    This section is optional. If there is no "transaction" section, there will be no transactions in the resulting graph.

    A part of an example "transaction" section is shown below:

                "transaction" : [ {
                  "attrs" : [ {
                    "label" : "color",
                    "type" : "color",
                    "descr" : "The color of the transaction"
                  }, {
                    "label" : "line_style",
                    "type" : "line_style",
                    "descr" : "The line style of the transaction",
                    "default" : "SOLID"
                  } ]
                }, {
                  "data" : [ {
                    "vx_src_" : 8,
                    "vx_dst_" : 9,
                    "tx_dir_" : true,
                    "color" : "0.27553296,0.79927653,0.2556097,1.0",
                    "line_style" : "SOLID",
                    "Datetime" : "2014-03-21 06:15:48.471",
                    "Id" : "6"
                  }, {
                    "vx_src_" : 5,
                    "vx_dst_" : 9,
                    "tx_dir_" : true,
                    "color" : "0.7740898,0.7625852,0.9571049,1.0",
                    "visibility" : 0.11111111,
                    "line_style" : "SOLID",
                    "Datetime" : "2014-03-22 04:42:31.216",
                    "Id" : "918"
                  } ]
                } ]
          

    Two transaction attributes are defined in the "attrs" section: "color" (type color), and "line_style" (type line_style).
    NOTE: "color" has no defined "default" value, so transactions without an explicitly defined color will have a null color. Two transactions are defined in the "data" section with specific values assigned to their attributes.

    The transactions define values for three special attributes, which appear in the example above as "vx_src_", "vx_dst_", and "tx_dir_".
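
    Because transactions cannot be added before their corresponding nodes, a reader may want to confirm that every transaction refers to vertices that exist. The sketch below assumes that "vx_src_" and "vx_dst_" hold "vx_id_" values from the "vertex" section, which the examples suggest but this document does not state. The function name dangling_transactions is hypothetical.

```python
def dangling_transactions(vertex_data, transaction_data):
    """Return the transactions whose endpoints are not defined vertices.

    Assumes vx_src_ and vx_dst_ hold vx_id_ values from the vertex data.
    """
    known = {vertex["vx_id_"] for vertex in vertex_data}
    return [
        tx for tx in transaction_data
        if tx["vx_src_"] not in known or tx["vx_dst_"] not in known
    ]


vertices = [{"vx_id_": 8}, {"vx_id_": 9}]
transactions = [
    {"vx_src_": 8, "vx_dst_": 9, "tx_dir_": True},
    {"vx_src_": 5, "vx_dst_": 9, "tx_dir_": True},  # vertex 5 is not defined
]
missing = dangling_transactions(vertices, transactions)
```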

    "meta" Section

    This section is optional. If there is no "meta" section, defaults will be used.

    The "meta" section contains data about the graph's environment. The data is defined by various Constellation modules which write their state to the JSON document on save, and read their state from the JSON document on open.

    Although the "meta" section has the same "attrs"+"data" format as the other sections, the attributes are defined by modules, rather than being built-in types. For instance, the module that defines what filters and configurations to display in the Conversation View saves its state in the attribute "conversation_view_state" of type "conversation_view_state". When the document is opened, the Constellation graph opener will find a "conversation_view_state" attribute and advertise it to the current modules. The module responsible for conversation_view_state (Core Conversation View in this case) will recognise the attribute, claim it, and read its state.

    The data section array contains a single object, with each key having a name corresponding to an attribute name. For instance, the module responsible for Conversation View state will have a "conversation_view_state" key in which its state is saved.
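
    A lookup of one module's state might be sketched as below. The function name module_state is hypothetical, and the state value in the sample is a placeholder, since each module defines its own format.

```python
def module_state(meta_section, attribute_name):
    """Return the saved state for one attribute of the "meta" section.

    meta_section is the two-element list stored under the "meta" key.
    Returns None if the attribute is not declared or has no data.
    """
    attrs = meta_section[0]["attrs"]
    data = meta_section[1]["data"]
    declared = any(attr["label"] == attribute_name for attr in attrs)
    if not declared or not data:
        return None
    # The data array holds a single object keyed by attribute name.
    return data[0].get(attribute_name)


meta = [
    {"attrs": [{"label": "conversation_view_state",
                "type": "conversation_view_state"}]},
    {"data": [{"conversation_view_state": {"placeholder": True}}]},
]
state = module_state(meta, "conversation_view_state")
```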

    Generally, the format of the data used by individual modules is documented by the modules, rather than Constellation itself. Some modules may consider their data to be for internal use only, and not document their format.