razee-io / Razeedash-api

API used by razeedash

Store resource s3 information in resource.s3Data

rmgraham opened this issue

Currently in the resources collection, the resource.data field can have different values:

  • if s3 is disabled, resource.data stores a JSON.stringify'd version of the object
    (e.g. "data": '{"kind":"Deployment", ...}')
  • if s3 is enabled, the data gets uploaded to s3, and resource.data stores the s3 object url
    (e.g. "data": "https://s3.us-east.cloud-object-storage.appdomain.cloud/razee-stage-configs/c79ec168fdf229a5d57703a45804c4072789ef863d1e2b6de8254cf3b2b2ec7a")

That gets annoying, since we constantly have to parse the value to determine whether or not it's in s3.

So instead, let's make resources contain an s3Data attribute, like so:

"data": null,
"s3Data": {
    "endpoint":"https://s3.us-east.cloud-object-storage.appdomain.cloud",
    "region": "us-east",
    "bucketName":"razee-stage-configs",
    "path": `$orgId/resources/$cluster/$resourceHash`
}

When the resource is saved to s3, the data attribute will be null, and s3Data will contain the relevant information.

We also need to migrate the existing db data to this format. That will take multiple stages: for a while we'll support both possible ways of storing s3 data, while forcing new data to follow the s3Data format.
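
To make the transition concrete, here is a minimal sketch of a read path that supports both formats during the migration; `getS3Client` and `downloadFromUrl` are hypothetical helpers, not existing code:

```js
// Transition-period read helper (sketch): supports both the legacy
// url-in-data format and the new s3Data format during the migration.
const getResourceData = async (resource) => {
  if (resource.s3Data) {
    // new format: fetch the object described by s3Data
    const { endpoint, bucketName, path } = resource.s3Data;
    const s3 = getS3Client(endpoint); // hypothetical helper returning an S3 client
    const obj = await s3.getObject({ Bucket: bucketName, Key: path }).promise();
    return obj.Body.toString('utf8');
  }
  if (typeof resource.data === 'string' && resource.data.startsWith('https://')) {
    // legacy s3 format: resource.data holds the full s3 object url
    return downloadFromUrl(resource.data); // hypothetical helper
  }
  // s3 disabled: resource.data is the JSON.stringify'd object itself
  return resource.data;
};
```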

We will also need to figure out how to handle multiple S3 regions/connections. Currently we have a single set of S3_* env vars; we'll need to support multiple sets and swap out which connection we use based on the region we want to save to.
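
As a sketch of what that could look like, assuming a per-region env var naming scheme (S3_US_EAST_ENDPOINT, etc.) that does not exist today:

```js
// Sketch: build an S3 client for a given region from per-region env vars.
// The S3_<REGION>_* naming scheme is an assumption for illustration.
const AWS = require('aws-sdk');

const s3ClientForRegion = (region) => {
  const prefix = `S3_${region.toUpperCase().replace(/-/g, '_')}`; // e.g. S3_US_EAST
  const endpoint = process.env[`${prefix}_ENDPOINT`];
  if (!endpoint) {
    throw new Error(`no s3 connection configured for region "${region}"`);
  }
  return new AWS.S3({
    endpoint,
    accessKeyId: process.env[`${prefix}_ACCESS_KEY_ID`],
    secretAccessKey: process.env[`${prefix}_SECRET_ACCESS_KEY`],
    s3ForcePathStyle: true, // typical for S3-compatible object storage endpoints
  });
};
```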

Resource storage logic is used in the following places:

  • app/apollo/resolvers/channel.js: stores data in 'deployableVersions.content'
  • app/apollo/resolvers/resource.js: stores data in 'resources.data' and 'resourceYamlHist.yamlStr'
  • app/routes/v1/channels.js: stores resources in 'deployableVersions.content'
  • app/routes/v2/clusters.js: stores resources in 'resources.data'

Design

Here is a slightly improved design proposal that adds more flexibility by abstracting the resource storage mechanism.

The goals are:

  • Eliminate resource storage code duplication
  • Simplify API code by removing resource storage implementation details from it
  • Enable resource storage mechanism to be changed or extended without changing API code
  • Allow resource storage implementation to be as complex and flexible as needed
  • Enable a seamless migration from the legacy format

Essentially, this makes the resource storage implementation pluggable, similar to how auth is designed today.

How a resource is stored will be defined by the "type" field value in its metadata. Legacy values will also be recognized and handled appropriately by inferring their type from the "data" field contents, as is done today (see the inference sketch after the examples). Examples of "data" field values:

"embedded-legacy" type:

"data": "{"kind":"Deployment", ...}"

"embedded" type

"data": {  
          "meta-data": {"type": "embedded"},  
          "data": "{"kind":"Deployment", ...}"  
        }

"s3legacy" type, uses default COS configuration

"data": "https://s3.us-east.cloud-object-storage.appdomain.cloud/cos-razee/d9ad5f7b-39f2-4794-aad2-1673d6770813-12e3f63d-2c6a-4fea-adb4-b57b1d65cc59-service-version-1"

"s3" type

"data": {  
    "meta-data": {"type": "s3"},  
    "data": {  
              "endpoint":"s3.us-east.cloud-object-storage.appdomain.cloud",  
              "location": "us-east",  
              "bucketName":"razee-stage-configs",  
              "path": "$orgId/resources/$cluster/$resourceHash"  
            }  
        }  

Resource handling logic will be encapsulated into a separate set of objects behind a common StorageStrategy interface (a minimal sketch follows the list):

  • Embedded (current mongo)
  • S3legacy (current implementation)
  • S3 (S3 with multiple locations)
  • ...
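
A minimal sketch of what the common interface could look like; the method names match the usage examples below, but the exact shape is an open design question:

```js
// Sketch: the common interface every storage strategy implements.
class StorageStrategy {
  constructor(config) {
    this.config = config;
  }
  async getData() { throw new Error('not implemented'); }     // fetch the resource content
  async setData(data) { throw new Error('not implemented'); } // stage the resource content
  async deleteData() { throw new Error('not implemented'); }  // remove any remote copy
  serialize() { throw new Error('not implemented'); }         // value to store in mongo
}
```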

The following examples show the interface in action:

// Reading an existing resource
const encodedResource = readMongoField();
// Select the appropriate resource handler implementation dynamically based on the content metadata
const handler = storageFactory.deSerialize(encodedResource); // non-embedded implementations download resources
const resource = await handler.getData();

// Creating a new resource
const resource = getYamlResource();
const resourceName = `${org_id.toLowerCase()}-${channel.uuid}-${name}`;
// Falls back to the configured default strategy and default location when none is specified
const handler = storageFactory.newResourceHandler(resourceName, bucketName /*, 'us-standard' */);
await handler.setData(resource);
const encodedResource = handler.serialize(); // non-embedded implementations upload resources

// Deleting an existing resource
const encodedResource = await models.DeployableVersion.findOne({ org_id, channel_id: channel_uuid, uuid: version_uuid });
// Select the appropriate resource handler implementation dynamically based on the content metadata
const handler = storageFactory.deSerialize(encodedResource);
await handler.deleteData(); // non-embedded implementations delete remote resources

Program artifacts to be created / changed

Refactor the following modules to use the new resource handling interface:

  • app/apollo/resolvers/channel.js
  • app/apollo/resolvers/resource.js
  • app/routes/v1/channels.js
  • app/routes/v2/clusters.js

Add a storageConfig.json file with the list of all available resource handler implementations, using "type: impl-class" notation:

{
    "s3": "/app/storage/s3Resource",
    "embedded": "/app/storage/embeddedResource"
}

Add three resource handler implementations (a sketch of the embedded one follows the list):

  • s3Resource.js
  • s3legacyResource.js
  • embeddedResource.js (handles embedded legacy as well)
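
For example, embeddedResource.js could look roughly like this sketch, assuming the StorageStrategy shape above; real field names and error handling would differ:

```js
// Sketch: embedded storage handler; covers both the new envelope and
// the legacy bare-string format.
class EmbeddedResource {
  constructor(encodedResource) {
    if (encodedResource && typeof encodedResource === 'object' && encodedResource['meta-data']) {
      this.data = encodedResource.data; // 'embedded'
    } else {
      this.data = encodedResource;      // 'embedded-legacy'
    }
  }
  async getData() { return this.data; }
  async setData(data) { this.data = data; }
  async deleteData() { this.data = null; } // nothing remote to clean up
  serialize() {
    // always write the new envelope, migrating legacy entries on save
    return { 'meta-data': { type: 'embedded' }, data: this.data };
  }
}

module.exports = EmbeddedResource;
```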

Add a resource handler factory, storageFactory.js (a sketch follows the list):

  • uses storageConfig.json to get all available storage strategies
  • determines default handler implementation based on an env variable
  • handler types for existing resources are selected based on metadata
  • new resource handlers are created using the default implementation
  • each resource handler implementation configures itself appropriately
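
Tying these together, storageFactory.js could look roughly like this sketch; the STORAGE_DEFAULT_HANDLER env var name is an assumption:

```js
// Sketch: resource handler factory.
const storageConfig = require('./storageConfig.json');

// compressed version of the type inference sketched earlier
const inferType = (enc) => {
  if (enc && typeof enc === 'object' && enc['meta-data']) return enc['meta-data'].type;
  if (typeof enc === 'string' && enc.startsWith('https://')) return 's3legacy';
  return 'embedded-legacy';
};

const loadHandlerClass = (type) => {
  // legacy variants map onto the modules that also handle them;
  // 's3legacy' would need its own entry in storageConfig.json
  const aliases = { 'embedded-legacy': 'embedded' };
  const modulePath = storageConfig[aliases[type] || type];
  if (!modulePath) throw new Error(`unknown storage type "${type}"`);
  return require(modulePath);
};

module.exports = {
  // existing resources: pick the handler from the content metadata
  deSerialize(encodedResource) {
    const Handler = loadHandlerClass(inferType(encodedResource));
    return new Handler(encodedResource);
  },
  // new resources: use the configured default implementation and location
  newResourceHandler(resourceName, bucketName, location = 'us-standard') {
    const Handler = loadHandlerClass(process.env.STORAGE_DEFAULT_HANDLER || 'embedded');
    return new Handler({ resourceName, bucketName, location });
  },
};
```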

Add a location-to-S3Config mapping in addition to the default S3 configuration that exists today.
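
One possible shape for that mapping, using the existing default configuration as the fallback; the env var names and the eu-gb entry are assumptions for illustration:

```js
// Sketch: location-to-S3Config mapping with the current default as fallback.
const defaultS3Config = {
  endpoint: process.env.S3_ENDPOINT, // assumed name for an existing S3_* var
  locationConstraint: process.env.S3_LOCATION_CONSTRAINT,
};

const s3ConfigByLocation = {
  'us-east': {
    endpoint: 'https://s3.us-east.cloud-object-storage.appdomain.cloud',
    locationConstraint: 'us-east-standard',
  },
  'eu-gb': { // illustrative second location
    endpoint: 'https://s3.eu-gb.cloud-object-storage.appdomain.cloud',
    locationConstraint: 'eu-gb-standard',
  },
};

const getS3Config = (location) => s3ConfigByLocation[location] || defaultS3Config;
```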


Launched into PROD with the PRs:
#902
#907
#909