GraphBLAS / LAGraph-Working-Group

Public document and planning repository for the LAGraph Working Group

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Use GrB_Matrix/Vector or define LAGr_Matrix/Vector

mcmillan03 opened this issue · comments

One consideration, is carrying along property flags that can effect how algorithms are carried out:

GrB_Matrix + property flags --> LAGr_Matrix

Example properties include:

  • is the graph directed or undirected?
  • are there self-loops?
  • are weights strictly positive?
  • is the graph unweighted?
    • this becomes less important with GrB_PAIR which can treat a weighted graph as unweighted
  • is the graph directed and acyclic (a DAG)?
  • is the graph a tree?

Properties should be discoverable, although doing so might require an expensive computation.

We should allow users to pass in the known value of properties to avoid computation.

I see the crux of this issue being interoperability with GraphBLAS. If we go with LAGr_Matrix, we can no longer send that object directly to GraphBLAS without unpacking, even if it's just syntax (e.g. LAGr_Matrix_instance->GrB_Matrix_instance).

One solution is to provide user-visible wrappers for all GraphBLAS functions via LAGraph, something we already have to some degree to simplify error handling. This isn't true interoperability, but it would allow us to simplify the syntax to LAGr_GrBFn(LAGr_Matrix).

The added benefit to this approach would be that if there are any hints we can send along to GraphBLAS regarding this structure, we can handle it inside these wrappers.

I was against LAGr_Matrix/Vector at first, but it's growing on me.

Decision: For now we are going to move forward and assume that we will have LAGraph objects that are opaque. We will move on in subsequent meetings and define function signatures for the library. That will let us write application-level code using LAGraph. We can then see if the opaqueness creates any problems and if needed revisit the issue.

Discussion:

We call it "GraphBLAS" but really its all about sparse linear algebra over algebraic semi-rings. There is surprisingly little Graph specific information we carry along with the objects. As we move to LAGraph, that is no longer the case. Now the GraphBLAS objects are specialized to graphs.

Hence, we need to know ... is this particular graphBLAS matrix, for example, an adjacency matrix or an incidence matrix. Is the graph undirected in which case the adjacency matrix is symmetric. Or do you have a directed graph but for a particular algorithm you want to treat it as undirected (i.e. a non symmetric matrix buy the algorithm will only use the upper or lower triangle). I could continue in this vein, but hopefully the point is clear; there is additional information we must carry with the GraphBLAS objects when they are used in LAGraph functions.

Hence, we need LAGraph objects. The next major decision is: are the LAGraph objects opaque or non opaque?

Pros for opaque objects:

  • they give us flexibility to add low level properties of LAGraph objects algorithm-implementors need but users of the library may not care about.

  • They give us the flexibility to later support deferred execution at the level of the calls to LAGraph functions (note: deferred execution of graphBLAS inside LAGraph functions is orthogonal to this discussion).

  • The sizeof operator doesn't change as properties are added to the type which could let users use a new version of LAGraph with changes to the opaque types without the need to recompile their application code.

Cons for opaque objects:

  • Opaque objects force us to add a whole suite of accessor functions to the API.

  • When mixing GraphBLAS and LAGraph calls in a single application, I would need to explicitly transform my GraphBLAS object into an LAGraph object through calls to accessor functions. This could lead to verbose code ... i.e. many function calls instead of a few low level references to fields of a structure.

There are other pros and cons, but those are a few that stood out in our discussion. The point is we need to move forward in the design of LAGraph and then revisit questions of usability of the API in application-level code to make a final decision.

Another question that came up in the discussion was "what properties do we want to define in the LAGraph object"? It was suggested that we could analyze NetworkX (with 200 or so algorithms) and get some sense of the full set of properties we might need. The list might be short enough that we could realistically "get it done" once and for all in the definition of our objects (as opposed to picking a minimal set now and expanding it from one release of the spec to another)

Sorry. I closed the issue by mistake. I don't know if we want to mark this issues as closed or leave it open and close later after we revisit it.