MedAnd / AspNetCore.DataProtection.ServiceFabric

ASP.NET Core DataProtection for Service Fabric

[Question] Linux support

dzoech opened this issue

Hello, thanks for your work! While skimming over the source code, I saw it contains functionality for ETW. Without looking too much into it, I'd like to ask if this service can also be run on a Linux cluster.
Thank you!

Hi, thanks for the feedback and question! You may be interested in this branch, which is still WIP, but I have tested that the core functionality works on Ubuntu 16.04: PersistKeysToServiceFabricServiceUriUpgrade.

Note I have paused work as I'm currently waiting for .NET Core 3.0 to GA; I will then switch focus to upgrading to the latest SF SDK, .NET Core 3.0, cross-platform support, etc. ETW logic was not tested on Linux.

PS. Glad to accept PRs in this direction though 😊

Thank you, I'll look into it :)

ETW logic was not tested on Linux.

Since ETW is a Windows-only mechanism, it certainly won't work; my question was whether the service depends on ETW. If I assume correctly, it does not.

Both the core service and the REST API sample site compile & run. You'll notice the following line of code though:

ServiceEventSource.Current.ServiceTypeRegistered(Process.GetCurrentProcess().Id, typeof(DataProtectionService).Name);

From memory (and it's been months), the above code did not throw an exception... but best to double-check and comment it out if required.
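If you want to be defensive about it, a minimal sketch (my suggestion, not code from the repo) would be to guard the EventSource call behind a platform check so non-Windows nodes simply log elsewhere:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical guard: keep the ETW-backed EventSource call on Windows only
// and fall back to plain console logging elsewhere.
if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
    ServiceEventSource.Current.ServiceTypeRegistered(
        Process.GetCurrentProcess().Id, typeof(DataProtectionService).Name);
}
else
{
    Console.WriteLine(
        $"Service type {typeof(DataProtectionService).Name} registered (PID {Process.GetCurrentProcess().Id}).");
}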

Out of interest, why Linux?

Thank you for your effort. I will try it out in the next couple of days when I get Redis running on the cluster.

Out of interest, why Linux?

I'm doing a comparison of AKS, SF-Linux and SF-Windows with regard to containerized applications for my master's thesis. And as far as I have seen so far, SF isn't really easy to work with when you want to use containers...

If that is the case, may I suggest also considering Azure Service Fabric Mesh... it is currently in preview and promises to combine the best parts of k8s and Azure Service Fabric:

containerized applications without having to manage VMs, storage or networking configuration, while keeping the enterprise-grade reliability, scalability, and mission-critical performance of Service Fabric. Service Fabric Mesh supports both Windows and Linux containers

I considered it initially but decided against it because it is still in preview and also makes the comparison to AKS (where I have access to the underlying infrastructure) more complicated. But I appreciate your input.

What really gives me a hard time with Service Fabric is the mapping of a container volume to a local VM disk. To me, it looks like it should be possible but I can't find any useful documentation. But well... 🤷

From what I know, the recommendation is to map container volumes to something like Azure Files; state should not be stored on Service Fabric disks, consider them ephemeral.

Service Fabric Azure Files Volume Driver (Preview)

I have a Redis container and want to map its volume to disk. Azure Files is not suitable due to obvious performance limitations. :)
The reason I want to use a volume is to still have a valid cache (and valid tokens for DataProtection) when the Redis container is restarted (moving the container to another node is another problem).

What I figured out to work is the following:

<ContainerHostPolicies CodePackageRef="Code"... >
  ...
  <Volume Source="C:\" Destination="C:\data" />
</ContainerHostPolicies>

But I don't really want to use C:\ as the source, rather a subdirectory. However, since the cluster is Azure-hosted (using a VM scale set), I don't know the nodes' directory structure.

The above is actually not recommended, as it means doing heavy writes to the OS drive. If you still wanted to do this though, you can RDP to each of the Service Fabric nodes to review the folder structure. See Remote connect to a virtual machine scale set instance or a cluster node.

As a side note, Azure premium files is actually used to run various Azure services such as Azure Database for PostgreSQL and Azure Database for MySQL, so it should be fast enough for the above scenario. Azure premium files support up to 20,000 IOPS. Most likely the Service Fabric node IOPS will be lower, but this will depend on how you provisioned your cluster and what you selected in the ARM templates.

In addition to the above, the way the .NET Core DataProtection library from MS is designed, it will cache keys on process start-up, so reads from then on will be from memory. Moreover, the way my microservice (AspNetCore.DataProtection.ServiceFabric) works is that it reads / writes to Service Fabric Reliable Collections, which means you cannot host this service within a container; your container process may be able to call "out" to this service, but I have not tested this. Lastly, the AspNetCore.DataProtection.ServiceFabric microservice may in turn cache keys in memory, as this is what Reliable Collections do by design (they act as a write-through cache).
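For illustration only (this is a sketch of the general Reliable Collections pattern, not the actual repo code; the dictionary and variable names are made up), persisting a key blob inside a stateful service looks roughly like this:

using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Data.Collections;

// Inside a StatefulService, where StateManager is available.
// "dataProtectionKeys", keyId and keyXml are illustrative names only.
string keyId = "key-00000000";
string keyXml = "<key>...</key>";

var keys = await StateManager.GetOrAddAsync<IReliableDictionary<string, string>>("dataProtectionKeys");

using (var tx = StateManager.CreateTransaction())
{
    await keys.AddOrUpdateAsync(tx, keyId, keyXml, (id, existing) => keyXml);
    await tx.CommitAsync();
}

Reads go through the same transactional API, and the replicated in-memory state is what gives you the write-through-cache behaviour mentioned above.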

Funny thing; every time I tried to RDP into the VMSS I had some problems which eventually made me stop trying. Now I gave it another try and it worked immediately... 😄

My thought was to attach a data disk to the VMSS and then use this to mount the docker volume. Do you see any problems with that approach?

The above is actually not recommended, as it means doing heavy writes to the OS drive.

Ah, I didn't think about this. Thanks for clarification. :)

In regard to Azure premium files, I'd like to use as few Azure resources as possible in order to minimize vendor lock-in (since SF can also be hosted on-premises). Therefore the cluster and its running microservice application should be as self-contained as possible.

In addition to the above, the way the .NET Core DataProtection library from MS is designed [...]

So if I understand this correctly, there are two write-through caches running: the DataProtection library itself as the first one, and then SF Reliable Collections as a subsequent one. Is this correct?

Regarding your comment above:

From what I know, the recommendation is to map container volumes to something like Azure Files; state should not be stored on Service Fabric disks, consider them ephemeral.

Do you have any sources for this? I'd be genuinely interested, and it would probably help me with my thesis.

Thank you again for your help! :)

So if I understand this correctly, there are two write-through caches running: the DataProtection library itself as the first one, and then SF Reliable Collections as a subsequent one. Is this correct?

That is correct; having looked into the .NET Core code, I think you'll find a good dose of caching going on for DataProtection. In addition, SF Reliable Collections often act as a write-through cache.
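To make the first layer concrete, here is a hypothetical app-side snippet (names are illustrative, not from the repo): once the key ring has been loaded at start-up, Protect / Unprotect calls are served from the cached key ring rather than hitting the key store on every call.

using Microsoft.AspNetCore.DataProtection;

// provider comes from DI; the purpose string is illustrative.
static string RoundTrip(IDataProtectionProvider provider)
{
    var protector = provider.CreateProtector("Sample.Purpose");

    // Protect / Unprotect use the key ring cached at start-up; the backing
    // key store is not re-read per call.
    var ciphertext = protector.Protect("hello");
    return protector.Unprotect(ciphertext);
}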

Do you have any sources for this? I'd be genuinely interested, and it would probably help me with my thesis.

My thought was to attach a data disk to the VMSS and then use this to mount the docker volume. Do you see any problems with that approach?

I assume this would be via ARM, and would mean each node in the cluster has an additional data disk... given Service Fabric can and does move services around the cluster, I am not sure how the data would likewise move around. Moreover, a cluster can be scaled in and out; what happens to the data in that scenario?

Whilst I know you do not wish to use Azure Files, I would argue it's the same as attaching a data disk to the VMSS, with the important difference that the Azure Files volume moves around the cluster along with the service that is using it.

See Service Fabric Azure Files Volume Driver (Preview) for more details.

PS. When your thesis is complete I look forward to having a read 😊

Correct, every node has its own data disk (which are not synchronized, to my knowledge). Moving the data around is all up to Redis. I'm not aware of how they do it, but a setup with one master and multiple slave nodes (though not a Redis service on every node, i.e. not -1 instances) should do the trick, I guess: when a Redis instance is moved from one node to another, that node's data disk is empty, but since Redis has its own replication mechanism, the moved instance will eventually have the correct data.

Whilst I know you do not wish to use Azure Files, I would argue it's the same as attaching a data disk to the VMSS, with the important difference that the Azure Files volume moves around the cluster along with the service that is using it.

I'd say the difference is that if I want to migrate the SF application from the Azure cluster to an on-premises cluster, I have to find a replacement for Azure Files. When using an attached disk, this should not be a problem.

PS. When your thesis is complete I look forward to having a read 😊

If you tell me how I can contact you, I'd be happy to send it to you :) Moreover, we wouldn't have to abuse this issue for our discussion hehe

If you tell me how I can contact you, I'd be happy to send it to you :) Moreover, we wouldn't have to abuse this issue for our discussion hehe

Any of the means listed on my website are good for comms 🙂

Re: Redis... still not sure the above will work the way you are expecting, but I'm interested to know how things progress...

In theory the Service Fabric Reliable Disk based volume should also work on vanilla Service Fabric clusters... I think this is what you really want: for your data to be replicated and available to any node in the cluster...

Fine! I'll just give it a try with the attached data disks because now I'm curious. I'll let you know how it works out.

Regarding the Reliable Disk based volume: this looks interesting; a possible issue I see is that it might interfere with Redis' replication. But as I said, I'll tinker around and let you know. And a huge thanks for your insights :)

It seems I forgot to mention that I'd also like to containerize the client (which consumes the DataProtection service). Am I correct in assuming that this is not possible because it uses the ServiceProxy class, which requires the Service Fabric SDK?

var proxy = ServiceProxy.Create<IDataProtectionService>(new Uri(_serviceUri), new ServicePartitionKey());

Limited access to GitHub currently... pls tweet 😎

The way AspNetCore.DataProtection.ServiceFabric works is that it reads / writes to Service Fabric Reliable Collections (via SDK APIs), which means you cannot host this service within a container... as for the client, my understanding is that it will also not work within a container; however, I have not tested this scenario (there might be workarounds).
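One possible workaround (untested, and purely a sketch: the application name, route and reverse proxy port below are placeholders, not part of this repo) would be to expose the key store over HTTP and have the containerized client call it through the Service Fabric reverse proxy instead of ServiceProxy:

using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical only: assumes the DataProtection service also exposed an HTTP endpoint,
// which the current remoting-based implementation does not. The reverse proxy port
// (19081 here) depends on how the cluster is configured.
static async Task<string> ReadKeysViaReverseProxyAsync()
{
    using var http = new HttpClient();
    var response = await http.GetAsync(
        "http://localhost:19081/MyApp/DataProtectionService/api/keys");
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();
}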

Thinking this can now be closed but feel free to comment and we'll re-open...