Development Guide

Building

To build the platform and all the in-tree collectors:


# Clone the repo
$ git clone git@github.com:soluble-ai/soluble.git

# enter the source tree
$ cd soluble

# To build with tests:
$ ./gradlew clean check

# To build without running tests (in stock Gradle, "build" also runs "check", so exclude the test task):
$ ./gradlew clean build -x test

TIP

You will need JDK 11+ installed.

Build from Source

We use Gradle and build all code from source. We don't publish intermediate libraries, though we do use Maven for upstream OSS dependencies.

If you want to create a collector that is not in-tree, you need to check out the main repo and build the dependencies you need. This is exactly what we do for our enterprise collectors.

There is a script install-jars that will do this for you.

Architecture

Configuration

Soluble components should all handle config as if env vars are the primary means of configuration.

There is a special class EnvConfig that makes this available. This should be used in preference to direct reads of files, system properties, env vars, etc.

It exposes layered configuration in a coherent way and will allow for dynamic/remote reconfig.

Try to avoid Spring value injection where possible.
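The layering idea can be sketched roughly as follows. This is only an illustration and not the real EnvConfig API: the class name LayeredConfigSketch, its methods, and the lookup order (explicit overrides first, then env vars, then system properties) are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of layered configuration; NOT the real EnvConfig API. */
final class LayeredConfigSketch {
    // Highest-priority layer, e.g. values pushed by a dynamic/remote reconfig.
    private final Map<String, String> overrides = new LinkedHashMap<>();
    // Snapshot of the environment; injectable so tests do not depend on the real one.
    private final Map<String, String> env;

    LayeredConfigSketch(Map<String, String> env) {
        this.env = env;
    }

    /** Convenience factory that uses the real process environment. */
    static LayeredConfigSketch fromSystem() {
        return new LayeredConfigSketch(System.getenv());
    }

    /** Sets a value in the top layer, shadowing env vars and system properties. */
    void override(String key, String value) {
        overrides.put(key, value);
    }

    /** Assumed lookup order: overrides, then env vars, then JVM system properties. */
    Optional<String> get(String key) {
        String value = overrides.get(key);
        if (value == null) {
            value = env.get(key);
        }
        if (value == null) {
            value = System.getProperty(key);
        }
        return Optional.ofNullable(value);
    }
}
```

Callers ask one object for a key such as GRAPH_URL and never touch System.getenv() or a properties file directly, which is what leaves room for adding or reordering layers later.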

Unit and Integration Testing

Be diligent about writing good unit and integration tests, but let's not be dogmatic about it.

Largely speaking we don't care about absolute test coverage numbers. We care that there is enough test coverage to prevent defects from making their way downstream.

Unit Tests

Unit tests should:

  1. Have no external dependencies (database, etc.)
  2. Not depend on application runtime state (Spring Initialization, etc.)

Integration Tests

Integration tests do depend on specific runtime state and may have external dependencies.

However, we like integration tests to gracefully degrade. This means that if neo4j is available, we will use it. If it is not available, the tests will be skipped. In the CI environment we fail the build if those tests are skipped.
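The run/skip/fail policy described above can be sketched as below. This is a hypothetical illustration, not the project's actual base-class logic: the class DegradeSketch, the TCP probe, and the use of a CI environment variable to detect the CI environment are assumptions. The port 7687 is taken from the default bolt URL used elsewhere in this guide.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical sketch of graceful degradation; not the project's real test support code. */
final class DegradeSketch {
    enum Decision { RUN, SKIP, FAIL }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or unresolvable: treat as unavailable
        }
    }

    /** Run when the dependency is up; otherwise skip locally but fail in CI. */
    static Decision decide(boolean available, boolean inCi) {
        if (available) {
            return Decision.RUN;
        }
        return inCi ? Decision.FAIL : Decision.SKIP;
    }

    public static void main(String[] args) {
        boolean available = isReachable("localhost", 7687, 500);
        boolean inCi = System.getenv("CI") != null; // assumption: CI is signalled by this variable
        System.out.println(decide(available, inCi));
    }
}
```

In a JUnit-based test, SKIP would typically map to a failed assumption (for example Assumptions.assumeTrue in JUnit 5) and FAIL to an ordinary assertion failure.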

In soluble-collector-core, integration tests should extend CoreIntegrationTest. This has all the necessary logic to detect the running environment.

When building a specific collector, integration tests should extend CollectorIntegrationTest. This has a significant amount of boilerplate that makes it easier to write tests.

Additionally there is a method checkAvailability(Collector collector) that can be used to determine if a given external dependency is available (AWS, Kubernetes, Slack, etc.). If it is not available, tests will be skipped.

Dev Config

When the integration tests are first run, a file ~/.soluble/config.yml will be written if it does not exist. By default it will write configuration for connecting to neo4j.

$ cat ~/.soluble/config.yml
---
GRAPH_URL: "bolt://localhost:7687"
GRAPH_USERNAME: "neo4j"
GRAPH_PASSWORD: "graph"

This can be used to keep any local configuration out of the source tree. It is useful for tokens, API keys, passwords, or anything else that you need to keep out of source control.
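The "write it only if it does not exist" behavior might look something like the sketch below. The class DevConfigSketch and its methods are hypothetical and are not the project's actual code. The default values are copied from the sample file shown above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch of writing default dev config on first run; not the project's real code. */
final class DevConfigSketch {
    // Defaults copied from the sample ~/.soluble/config.yml in this guide.
    static final String DEFAULTS = String.join("\n",
            "---",
            "GRAPH_URL: \"bolt://localhost:7687\"",
            "GRAPH_USERNAME: \"neo4j\"",
            "GRAPH_PASSWORD: \"graph\"",
            "");

    /** Location of the per-user config file, outside the source tree. */
    static Path defaultLocation() {
        return Paths.get(System.getProperty("user.home"), ".soluble", "config.yml");
    }

    /** Writes the defaults only if the file is missing; returns true if a file was written. */
    static boolean writeIfAbsent(Path configFile) {
        if (Files.exists(configFile)) {
            return false; // never clobber local edits such as tokens or API keys
        }
        try {
            Path parent = configFile.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            Files.write(configFile, DEFAULTS.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because an existing file is left untouched, anything a developer adds to it by hand survives later test runs.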

Mocks

We try to avoid too much mocking. It usually ends up being a waste of time. If mocks are needed, please use Mockito.

MockWebServer

Square's MockWebServer is very nice for simulating remote servers and testing client code. When working with 3rd party integrations, it is very effective for capturing and testing the REST request/response flows.

TIP

It is not practical to use MockWebServer to mock complex REST APIs (AWS, Kubernetes, etc.). It is largely a waste of time. These need to be tested against the real thing.

CI

Soluble uses CircleCI for all builds.

https://circleci.com/gh/soluble-ai

Docker Registry

Docker images from all successful builds are pushed to Docker Hub.

https://cloud.docker.com/u/soluble/

All builds are pushed with the tag src-<git-commit>.

Builds from master are pushed with the latest tag.

Notes

Language & Platform

We use Java because it is easy to write robust code in a large code-base.

There are a lot of people that know how to write robust Java code.

We might consider Kotlin or Clojure in the future if we get bored. Although it would be possible to port the collector architecture to Go, it's not clear that there would be a lot of value in doing so. It would improve cold start times and memory consumption, but these aren't high priority items for us.

But in the end, we don't expect anyone to care about language. It might as well be written in Haskell for all it matters.

The dashboard will likely migrate toward a single-page model written in Vue or React with the UI implemented in JavaScript or TypeScript. Right now it is largely rendered server-side, with Bootstrap used for layout.

Jackson Usage

We try to use Jackson's tree model data structures wherever it makes sense to do so. What this means is that we try to avoid gratuitous data mapping to Java classes. We use the JsonNode tree model structures directly in the code instead.

Sometimes the code is not the prettiest, but it tends to be more reliable, less error-prone, and considerably less brittle in forward and backward compatibility situations.

Jackson's API is quite good at type conversion and null-safe processing of the underlying data.

Spring

We tried using Micronaut, which is a promising new framework that is optimized for cold starts in serverless environments. It can target GraalVM to build "native" executables.

However, we ended up not using it for the following reasons:

  1. We do a lot of runtime composition of features and code, so compile-time dependency injection made this very difficult.
  2. Cold start times of the components are not a factor for us. (See below)
  3. Spring is mature and just works.

Serverless

When we started on this project, we thought that it might make sense to run it as a constellation of serverless (Lambda) functions. Although this sounds delightful, it turns out that the runtime needs (caching, in-memory processing, etc.) are quite complex and this really wasn't viable.