Caddy+Smallstep integration (trusted certs for localhost and internal sites)

Question

mholt opened this issue 4 years ago · comments

Thanks to a recent agreement, I'll soon be working on integrating Smallstep into Caddy. 👍

This should allow us to serve HTTPS for localhost and internal sites automatically, much like we already do for public sites via Let's Encrypt. Essentially, we want https://localhost or https://example.local to be the default, using trusted certificates and properly managed local/internal PKI.

In Caddy 1, tls self_signed simply generated a private key in memory and used it to self-sign a certificate that lasted for a week, and that's all: it wasn't trusted, the key wasn't reused, and there was no proper PKI. With this integration in v2, Caddy will be able to serve your local dev and internal sites over HTTPS using trusted certificates with a proper, long-lived CA. This brings the ACME protocol into local and internal environments, rather than hacking private keys together in memory.

In the future, we can probably extend this to properly-managed mTLS that is (near-)fully automatic in a cluster.

This feature is not a release blocker for Caddy 2.0. It may get into 2.1.

This issue is for tracking discussion of the proposal and getting feedback from interested users and companies. Please get involved!

Here are some questions to help bootstrap development discussion:

What do you want/need most from this feature?
Should HTTPS be the default for all sites, including local and internal ones?
What kind of configuration should be exposed?
What should the cert and CA default lifetimes be?
(Anything else that comes to your mind)

/cc Smallstep: @mmalone @maraino @mikemaxey

Michael Malone · Answer 1 · Thu Feb 06 2020 01:42:09 GMT+0800 (China Standard Time)

Hi everyone. I'm Mike, from smallstep.

We're super interested in uncovering use cases for this integration. To help get a conversation going I wanted to drop a couple ideas to gauge interest, and share a few links for anyone who wants to learn more about this stuff.

Use cases

Integrating Caddy with smallstep's CA (called step-ca) will make it easy for users to run their own internal PKI -- like your own internal Let's Encrypt. In general, a certificate is a sort of credential that you can use to authenticate a server or a client (e.g., for clients it's a replacement for a username/password or API access token). Running your own CA means you can issue these credentials yourself, however you want, whenever you want.

Some use cases that we think may be interesting to the Caddy community are:

Issuing certificates so you can use HTTPS during local development / staging (e.g., for foo.localhost)
Serving internal websites (e.g., on an intranet) using HTTPS
Securing (encrypting & mutually authenticating) communication between microservices using mutual TLS
Securing connections between internal systems that need to communicate across the public internet or other untrusted networks

More info

The smallstep CA is open source at https://github.com/smallstep/certificates
I wrote a (long) blog post about PKI at https://smallstep.com/blog/everything-pki/
I made the case for using TLS everywhere at https://smallstep.com/blog/use-tls/
For more details about ACME support in step-ca see https://smallstep.com/blog/private-acme-server/

Ryan Hurst · Answer 2 · Thu Feb 06 2020 02:41:50 GMT+0800 (China Standard Time)

@mholt I am not totally familiar with all of Caddy's capabilities, so this may not be totally relevant, but in general I see private PKI as being super important for API-key-like use cases (mTLS) and service-to-service authentication (again mTLS).

As such, the idea of using SmallStep, over ACME, with external account binding, to acquire mTLS certificates makes a lot of sense to me.

You can see Caddy (as an FE) as the API authentication endpoint where mTLS is in use instead of the typical pre-shared key style API authentication, and having clients keeping their mTLS certificates fresh via an exposed smallstep ACME instance.

You can also see the upstream server being authenticated to via mTLS, with certificates coming from a different smallstep instance, again via external account binding or maybe via the traditional ACME challenges.

There is also the case where the upstream is another caddy instance and it is doing the certificate validation, and of course, getting its own certificate via ACME.

In models like this, you can have point-to-point TLS used for all communications via Private or Public TLS.

Jared Folkins · Answer 3 · Fri Feb 07 2020 02:00:19 GMT+0800 (China Standard Time)

In the EDU space, we've talked about a binary/service that looks like our provider. A proxy to do DNS challenges to CloudFlare or something so that we can get back valid LE certs. And then on our internal servers, we'd set up certbot to call this proxy and voilà. Magic.

Smallstep looks cool, but many SMBs do not have the resources to ship root CAs everywhere. Remove that friction and we will buy.

Ryan Hurst · Answer 4 · Fri Feb 07 2020 02:12:37 GMT+0800 (China Standard Time)

I also like @jahands idea; the ability to use Caddy in the DMZ to do ACME challenge-response on behalf of hosts on the inside of a network is interesting (split horizon).

Though I think it's a bad practice to use public SSL certs within an enterprise where machines are managed, as it exposes the enterprise to associated operational risk and puts a weight on the public PKI (as we have seen with card readers doing the same thing), it's not clear to me UNIs have much of an option. I am sure there are other similar orgs as well; as such, something like this would be valuable and probably not a big extension to what you already have in Caddy.

Jared Folkins · Answer 5 · Fri Feb 07 2020 02:40:24 GMT+0800 (China Standard Time)

> Though I think it's a bad practice to use public SSL certs within an enterprise where machines are managed, as it exposes the enterprise to associated operational risk and puts a weight on the public PKI (as we have seen with card readers doing the same thing), it's not clear to me UNIs have much of an option. I am sure there are other similar orgs as well; as such, something like this would be valuable and probably not a big extension to what you already have in Caddy.

If we threat model this, we generally have a flow chart numbered below.

1. no certs
2. self signed certs
3. external LE certs
4. external certs + managed enterprise PKI

#1 is where most orgs are at.
#2 is a nice step but open to MiTM.
#3 does put the burden on public PKI infra, but I think that could be solved too.
#4 would be ideal, but many can't afford an FTE to support that infra.

If we are trying to make it harder for attackers, we want solutions that scale without a huge FTE and dollar investment. #3 is what I believe is best based on my experience and those who are underfunded who share their perspective at opsecedu.com. I'm open to other ideas that meet the criteria of super simple to install and scale without FTE or large dollar investments.

Nathan McNulty · Answer 6 · Fri Feb 07 2020 04:35:04 GMT+0800 (China Standard Time)

Going to dump some relevant and some non-relevant info here that @mholt had asked about in a discussion at Opsecedu - sorry if it's a bit scattered. To preface, the discussion was around internal CA's at school districts and possible gaps that smallstep and/or caddy may have that we are currently handling in other ways.

We utilize ADCS internally to issue certificates for things like EAP-TLS, ConfigMgr (which requires special purposes and EKUs on the certificates), and other things that are non-HTTPS focused. I don't believe you can request some of these things from Let's Encrypt, and I'm not familiar enough with smallstep to know if this is supported. It would be interesting if an ACME client were able to request these parameters, as that may help move off ADCS.

Another point was that a lot of MDM solutions and other equipment support SCEP but not ACME, or may lack the ability to run an ACME client. I hear that the frustration from the development side of things is that the tooling operations uses is different from theirs, and I don't believe either of these solutions aims to solve that problem. For example, dev may spin up a box and grab a LE cert, but in production, ops may issue an internal cert via ADCS or SCEP where the dev does not have access/control.

To describe the proxy concept @jaredfolkins mentioned, I can explain what we are currently doing. We built what is basically a centralized certbot that grabs LE certificates for most of our servers with a dns-challenge. It then runs install scripts on remote hosts (think scp cert, relaunch daemon, then validate new cert) and has health check components built into the service as well. This maintains all of our LE stuff, but it would be arguably much easier if smallstep were able to act as a certbot server here.

Some mechanism, whether native smallstep or 3rd party, would get LE certificates onto smallstep, and then local ACME clients such as certbot would be able to request those publicly trusted LE certificates from smallstep. This achieves Jared's wishes that everything use the same protocols for requesting certificates, a unified CA for handling all of your certificates, and dev/ops using the same tooling.

The part I'm unclear on is how this would cover the way we currently use ADCS to issue specific certificates to users and computers based on templates, disallow certain devices from requesting certificates, etc. I'm assuming that would be part of smallstep, but again, I'm just not familiar with it. Ultimately, we have to keep in mind that much of this is more than just HTTPS (we put certs on AD communications, RDP, ConfigMgr agents, SIEM agents, etc.), so I'm not sure how much is really applicable to Caddy vs Smallstep vs anything else.

Ryan Hurst · Answer 7 · Fri Feb 07 2020 11:33:38 GMT+0800 (China Standard Time)

> > Though I think it's a bad practice to use public SSL certs within an enterprise where machines are managed, as it exposes the enterprise to associated operational risk and puts a weight on the public PKI (as we have seen with card readers doing the same thing), it's not clear to me UNIs have much of an option. I am sure there are other similar orgs as well; as such, something like this would be valuable and probably not a big extension to what you already have in Caddy.
>
> If we threat model this, we generally have a flow chart numbered below.
>
> 1. no certs
> 2. self signed certs
> 3. external LE certs
> 4. external certs + managed enterprise PKI
>
> #1 is where most orgs are at.
> #2 is a nice step but open to MiTM.
> #3 does put the burden on public PKI infra, but I think that could be solved too.
> #4 would be ideal, but many can't afford an FTE to support that infra.
>
> If we are trying to make it harder for attackers, we want solutions that scale without a huge FTE and dollar investment. #3 is what I believe is best based on my experience and those who are underfunded who share their perspective at opsecedu.com. I'm open to other ideas that meet the criteria of super simple to install and scale without FTE or large dollar investments.

SmallStep is about lowering those costs and complexities so no full time FTE or other cost is required :)

Michael Malone · Answer 8 · Sat Feb 08 2020 08:08:58 GMT+0800 (China Standard Time)

Gonna hop in here and try to summarize & synthesize a little bit, based on what I'm reading above.

Mutual TLS use cases

@rmhrisk it sounds like you're onboard with the two big mTLS use cases, which would be:

The API Gateway use case, where Caddy would be validating client certificates sent by third party clients. This would be a replacement for ad-hoc API keys or something like two-legged OAuth. We'd need a way to issue certificates to third parties for this to work. I think OAuth actually has some spec extensions that do this. ACME probably does too. Either way, this would be a big project.
The service-to-service use case, where Caddy would be deployed as a forward and/or reverse proxy in a "service mesh" style. For this, the existing certificate enrollment mechanisms are probably sufficient.

For either of these use cases, I think Caddy would have to grow better client certificate / mTLS support. I'm not sure what the current capabilities are in Caddy v2 but, minimally, it'd need to be able to:

Configure an ingress to require client authentication, validating client certs using your internal CA's root certificate. If proxying, we'd probably want to send some/all of the cert details upstream so the app being proxied to can implement authorization.
Configure an egress to send a client certificate when one is requested by an upstream (i.e. Caddy needs to be able to send a client certificate when it's proxying to an HTTPS server)

"Web PKI" inside the DMZ

I think this is really going to be the bread-and-butter / low-hanging-fruit for an initial integration between Caddy & smallstep. Basically, the idea is to bring the standard PKI used by web browsers for HTTPS inside the DMZ. I think the big use cases for this are:

Development & staging environments, where you want an environment that's as similar as possible to production. Having ACME in step-ca means we can simulate LE and you can delete non-HTTPS code paths from your code bases.
Internal / intranet websites, where you want certificates that work in web browsers issued for internal domain names. Again, this will be easy to support with step-ca + Caddy.

I'm really interested in learning how common / how important these two use cases are, and whether there are additional use cases in this category.

@jaredfolkins your perspective here re: the cost of running an internal PKI is super valuable. Like @rmhrisk said, our mission is to make PKI more accessible. We want to bring that cost down. If you haven't checked out our CA it is pretty easy to get started, and I'd be very interested to hear feedback on how we can make things even easier.

You mentioned that many SMBs don't have the resources to ship root CAs. Is MDM too expensive (I honestly don't know what MDM solutions cost)? If so, you can actually use step to do root distribution if your audience is technical enough to run a couple commands:

# Download the root certificate
$ step ca root root.crt --ca-url <ca-url> --fingerprint <fingerprint>

# Install the root certificate in your system trust store(s)
$ step certificate install root.crt

Am I understanding the problem correctly? Any thoughts on how we can improve?

Other PKI use cases

@nathanmcnulty I think you're right that these non-HTTPS use cases aren't really relevant for the Caddy+smallstep integration. But we're absolutely interested in addressing these use cases with step-ca.

Certificate templating to support other purposes & EKUs is on our short-term open source roadmap. So is SCEP support. While it is possible for step-ca to work together with AD CS we've heard from a few folks that they'd rather not run more than one CA. We'd love to get to a place where we can replace AD CS! Is there anything else we need?

I don't think these use cases are really relevant to the Caddy integration, but they are important insofar as they make adopting step-ca more palatable. So I suppose it's worth cataloging use cases that might block adoption, too. Besides SCEP, the big ones that are on our radar already are:

OpenVPN certificates
Code signing certificates
EAP-TLS certificates

We're also working to support a generic templating mechanism so you can totally customize certificate attributes. But I think that'll be for pretty advanced use cases. See smallstep/cli#110 for a discussion. The big blocker on that, at the moment, is figuring out what format to use for the cert template (not sure if there's a standard for that).

Links to commonly used tools, specs / standards, blog posts, etc. that are relevant to any of these use cases would also be super helpful.

Nathan McNulty · Answer 9 · Sat Feb 08 2020 13:31:19 GMT+0800 (China Standard Time)

@mmalone That all looks great! I do think Microsoft has (unintentionally) made it difficult to fully replace AD CS because of how closely some features are integrated, such as Windows Hello for Business. These are pretty niche to a heavy Windows shop though, so I feel like those folks (like me) would tend to stick with what is most documented and what they are most familiar with anyway.

Having said that, I don't foresee Microsoft adding ACME to AD CS, so there's still a lot to like about smallstep here, especially from a developer standpoint. Microsoft is also doing a better job of supporting 3rd party CA's in Azure/InTune (SCEP currently), so smallstep may be able to serve up those EAP-TLS certificates to InTune devices once you get further down the road :)

Ryan Hurst · Answer 10 · Sat Feb 08 2020 13:34:49 GMT+0800 (China Standard Time)

Replacing AD/CS is straightforward. I built this previously and that's exactly what it does: https://www.globalsign.com/en/auto-enrollment-gateway

Microsoft Intune SCEP is really a Microsoft proprietary protocol. They've added other stuff to it that's not standard, the "standard" parts are not necessarily conformant with the SCEP draft either (there is no final RFC), and the CA must ask Intune to validate some proprietary blobs (a mix of XML and JSON as I recall) in the request prior to issuance.

JalonSolov · Answer 11 · Mon Feb 10 2020 00:47:40 GMT+0800 (China Standard Time)

Just as a bit of extra information... I have successfully used certstrap to set up a CA, and generate site certificates for pods in a kubernetes cluster to talk to each other via TLS.

Once you have the CA, you can generate whatever certs you want, for localhost or wherever else, whenever you need them. And since you're generating them yourself, you control the expiration, etc.

Michael Grosser · Answer 12 · Wed Feb 12 2020 03:46:51 GMT+0800 (China Standard Time)

There is also https://github.com/bradfitz/autocertdelegate as an example on how people can and do work around ACME for internal use.

Matt Holt · Answer 13 · Sun Mar 08 2020 04:21:42 GMT+0800 (China Standard Time)

After thousands of lines of refactoring and weeks of work on foundational things, I've finally pushed my WIP implementation of certificates for localhost to #3125. Please try it out!

This PR begins the pki app which manages CA certificates. It also implements the internal issuer, which is a Caddy module that can use one of those CAs to issue certificates. Storage and renewal is managed by Caddy; signing and keys and other cryptographic things are managed primarily by Smallstep. It also can add its root cert to your trust store.

A future PR will add support for running an actual ACME server (also powered by Smallstep).

Still lots of polish and some TODOs to take care of, but your early feedback is welcomed!

Matt Holt · Answer 14 · Sat Mar 14 2020 09:44:17 GMT+0800 (China Standard Time)

Beta 17 is released with the first part of the Smallstep integration: locally-managed PKI for all sites that don't qualify for public certs.

Next up: an embedded Smallstep ACME server.

mannp · Answer 15 · Wed Mar 25 2020 22:58:30 GMT+0800 (China Standard Time)

I currently use step certs on my internal network with ACME and traefik2. It works great for HTTPS, but not so well for mTLS certificates.

I very much hope mTLS certificates will get supported, as this would be a killer feature, for me at least :)

Edit:

> Next up: an embedded Smallstep ACME server.

I wondered how this would work for people with an existing step-ca cert server, as well as how that might work when the server is also used for SSH certs?

Mariano Cano · Answer 16 · Thu Mar 26 2020 01:37:01 GMT+0800 (China Standard Time)

@mannp what do you mean that mTLS certificates are not supported?
You can use step certificates to create client certificates, but it's true that there are no ACME challenges supported for clients. You will need to use a different provisioner; OIDC or JWK are your best options.

mannp · Answer 17 · Thu Mar 26 2020 02:20:36 GMT+0800 (China Standard Time)

@maraino I was referring to automatic generation of certs by traefik2 via my step acme server does not work so great for mTLS.

https certs are created automatically for https, so all my internal servers automatically get https certs and work great.

That same mechanism for mTLS is more problematic, and I haven't yet managed to get it to allocate both sides of the mTLS automatically via T2 and my step ACME server.

I can configure T2 to get the cert, but it doesn't work. That is more likely down to my misconfigured T2 than anything to do with smallstep.

I was rather hoping this caddy support would give me that auto cert generation for mTLS using acme :)

Mariano Cano · Answer 18 · Thu Mar 26 2020 03:43:42 GMT+0800 (China Standard Time)

@mannp I've just created a certificate using an acme provisioner and it should be able to support mTLS, it supports also client authentication.

        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
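The same check can be done programmatically. The following is a minimal Go sketch, assuming you only care whether a certificate carries both extended key usages shown in the output above. The helper names (usableForMTLS, demoCert, demoResult) and the host name demo.internal are made up for illustration, and the throwaway self-signed certificate merely stands in for one issued by step-ca.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// usableForMTLS reports whether cert carries both the server-auth and
// client-auth extended key usages, like the certificate dumped above.
func usableForMTLS(cert *x509.Certificate) bool {
	var server, client bool
	for _, eku := range cert.ExtKeyUsage {
		switch eku {
		case x509.ExtKeyUsageServerAuth:
			server = true
		case x509.ExtKeyUsageClientAuth:
			client = true
		}
	}
	return server && client
}

// demoCert builds a throwaway self-signed certificate with the given EKUs.
// It is a stand-in for a CA-issued certificate; demo.internal is hypothetical.
func demoCert(ekus ...x509.ExtKeyUsage) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo.internal"},
		DNSNames:     []string{"demo.internal"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  ekus,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// demoResult builds a demo certificate, with or without the client-auth EKU,
// and reports whether it passes the check.
func demoResult(withClientAuth bool) (bool, error) {
	ekus := []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth}
	if withClientAuth {
		ekus = append(ekus, x509.ExtKeyUsageClientAuth)
	}
	cert, err := demoCert(ekus...)
	if err != nil {
		return false, err
	}
	return usableForMTLS(cert), nil
}

func main() {
	for _, withClient := range []bool{true, false} {
		ok, err := demoResult(withClient)
		if err != nil {
			panic(err)
		}
		fmt.Printf("clientAuth present=%v -> usable for mTLS=%v\n", withClient, ok)
	}
}
```

For a real certificate you would decode the PEM file and pass the parsed certificate to usableForMTLS instead of generating one.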

The problem, I think, is that Traefik might not be able to use ACME to get client certificates in the same way it does for server certificates.

On Kubernetes, an automatic way to solve this would be to use cert-manager + step-issuer (https://github.com/smallstep/step-issuer/tree/cert-manager.io using that branch to support recent versions of cert-manager).

If you're not using Kubernetes you should be able to use a script to create and renew certificates, using step ca certificate to create a certificate once, and renewing it with step ca renew --daemon.

What I'm not sure about is whether Traefik is able to reload renewed certificates. If it's not, one possibility is to use provisioner-specific claims with longer certificate expiration times. Or, if you're using step ca renew --daemon, you might be able to use its options to send signals or run scripts to force the 'reload' of Traefik; see step ca renew --help.

@mholt If caddy supports client authentication, it would be great if you can use the ACME protocol for retrieving them.

mannp · Answer 19 · Thu Mar 26 2020 04:13:06 GMT+0800 (China Standard Time)

@maraino is there a specific option for step to create the TLS Web Server Authentication compatible cert? I have not tried it directly.

I tried using the following
( https://containo.us/blog/traefik-2-tls-101-23b4fbee81f1/ ) - Option 2 using step as the certificate resolver and TCP & TLS configured.

The cert is created by my step ca and received by T2, the connection appears terminated, but I did not inspect the cert to see if it had the X509v3 Extended Key Usage:TLS Web Server Authentication usage?

I guess it might have created an https type cert, as I am not aware of any specific options to change the type of cert from T2.

I believe T2 will renew the https certs when needed, via acme, so for me it is excellent for the https part.

I understood it might be a client SNI issue, but running a small business myself, I didn't have time to take things further.

Happy to join your step gitter if you need any more info on what I have tried.

T2 might have been updated for support now. Although I'm not currently using Caddy, I was keeping a keen eye on this thread to understand more about possible mTLS support.

Big thanks from me for step and caddy :)

Mariano Cano · Answer 20 · Thu Mar 26 2020 06:59:36 GMT+0800 (China Standard Time)

@mannp By default, all certs will have both Client and Server Authentication. So they should be valid for HTTPS and client authentication. The problem, I think, is that T2 does not support ACME for client certificates; it's not what it is intended for.

mannp · Answer 21 · Fri Mar 27 2020 17:24:10 GMT+0800 (China Standard Time)

> The problem, I think, is that T2 does not support ACME for client certificates; it's not what it is intended for.

I mentioned above that T2 creates a TLS cert as specified via my step ACME. Is this lack of support something other than that?

Matt Holt · Answer 22 · Sat Mar 28 2020 02:02:29 GMT+0800 (China Standard Time)

@maraino

> If caddy supports client authentication, it would be great if you can use the ACME protocol for retrieving them.

It does support client auth, but what do you mean by "retrieving them" -- what is "them"? And, as the client or server validating the client?

In practice, what does this mean? Is it like, the HTTP reverse proxy needs to present a client certificate to the upstream, so the user simply configures the proxy's server name, which the proxy gets and manages a certificate for?

^ Did I get it right, or did you mean something else?

Mariano Cano · Answer 23 · Sat Mar 28 2020 02:58:02 GMT+0800 (China Standard Time)

@mholt I mean to be able to use ACME to get client certificates too.
Client certificates are validated by the server, if it's enabled. To enable it the server needs a couple of things in their tls.Config:

ClientAuth set to RequireAndVerifyClientCert (enforce mTLS), or VerifyClientCertIfGiven
ClientCAs pool with the root certificate that signed the client certificate (unless that is globally accepted by the system)
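As a rough illustration of those two settings, here is a minimal Go sketch. The function names (mtlsServerConfig, throwawayRootPEM, requiresClientCert) and the CA name are hypothetical, and the throwaway root is only there so the example runs on its own. In practice rootPEM would be the root certificate of your internal CA. This is not Caddy's implementation.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// mtlsServerConfig returns a server-side tls.Config that verifies client
// certificates against the roots in rootPEM. With require set, a client
// certificate is mandatory (RequireAndVerifyClientCert); otherwise it is
// only verified when the client sends one (VerifyClientCertIfGiven).
func mtlsServerConfig(rootPEM []byte, require bool) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(rootPEM) {
		return nil, errors.New("no certificates found in root PEM")
	}
	mode := tls.VerifyClientCertIfGiven
	if require {
		mode = tls.RequireAndVerifyClientCert
	}
	return &tls.Config{
		ClientAuth: mode,
		ClientCAs:  pool,
		MinVersion: tls.VersionTLS12,
	}, nil
}

// requiresClientCert reports whether cfg enforces mTLS.
func requiresClientCert(cfg *tls.Config) bool {
	return cfg.ClientAuth == tls.RequireAndVerifyClientCert
}

// throwawayRootPEM generates a self-signed CA certificate for the demo.
// It stands in for the root certificate of a real internal CA.
func throwawayRootPEM() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "Example Internal Root"},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	rootPEM, err := throwawayRootPEM()
	if err != nil {
		panic(err)
	}
	cfg, err := mtlsServerConfig(rootPEM, true)
	if err != nil {
		panic(err)
	}
	fmt.Println("client certificate required:", requiresClientCert(cfg))
}
```

The returned config would still need a server certificate (Certificates or GetCertificate) before being handed to an http.Server or tls.Listen.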

Matt Holt · Answer 24 · Sat Mar 28 2020 03:02:41 GMT+0800 (China Standard Time)

Got it. So based on that, here's my proposal/plan:

A new option in the HTTP reverse proxy's client auth config where you specify the server name for the client certificate. Like ClientCertificateAutomateName string
The reverse proxy then gets that certificate using the automation policy matching that server name. (The default automation policy is to use ACME with Let's Encrypt, but automation policies can be customized! See https://caddyserver.com/docs/json/apps/tls/automation/policies/ - so using an internal ACME server is very easy.)
The reverse proxy then presents that fully-managed client certificate to the upstreams.

Sound good?

Mariano Cano · Answer 25 · Sat Mar 28 2020 03:08:49 GMT+0800 (China Standard Time)

Sounds good, but to use ACME the ClientCertificateAutomateName must be a host name that resolves to the Caddy server; you cannot use an email, a URN, or something like that.

I've verified a Let's Encrypt certificate and they also allow client authentication with their certs.

Matt Holt · Answer 26 · Sat Mar 28 2020 03:10:40 GMT+0800 (China Standard Time)

> Sounds good, but to use ACME the ClientCertificateAutomateName must be a host name that resolves to the Caddy server; you cannot use an email, a URN, or something like that.

Makes sense, since at this point, ACME only supports server names and not email/URL subjects.

Great! I'll get on this today.

Matt Holt · Answer 27 · Sat Mar 28 2020 04:32:42 GMT+0800 (China Standard Time)

@maraino @mannp Alrighty, with the latest push to the dev branch in commit d8eb39c, Caddy's reverse proxy can now use fully-automated client certificates:

...
{
	"handler": "reverse_proxy",
	"transport": {
		"protocol": "http",
		"tls": {
			"client_certificate_automate": "clientcert.test"
		}
	},
	"upstreams": [
		{"dial": "127.0.0.1:5000"}
	]
}
...

I tested this locally and it works great. 😃 Simply give the reverse proxy its own hostname, and tell Caddy to manage certs for that hostname using the internal ACME server; and then configure the upstream to trust the internal ACME server's root. That's it! The client keeps its client certificate renewed just like a server certificate.
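To make the moving parts concrete, the flow described above can be sketched in plain Go, outside of Caddy: an upstream that trusts an internal root and requires client certificates, and a client that presents a certificate for its own host name. This is only an illustration under stated assumptions. The helpers newCert and roundTrip are hypothetical, the CA and the client certificate are generated in memory rather than obtained over ACME, and the upstream is an httptest server on the loopback interface.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"io"
	"math/big"
	"net/http"
	"net/http/httptest"
	"time"
)

// newCert creates a certificate for name. With parent == nil it is a
// self-signed CA; otherwise it is a client-auth leaf signed by parent.
func newCert(name string, parent *tls.Certificate) (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: name},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	signerCert := tmpl
	var signerKey any = key
	if parent == nil {
		tmpl.IsCA, tmpl.BasicConstraintsValid = true, true
		tmpl.KeyUsage = x509.KeyUsageCertSign
	} else {
		tmpl.SerialNumber = big.NewInt(2)
		tmpl.DNSNames = []string{name}
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth}
		signerCert, signerKey = parent.Leaf, parent.PrivateKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signerCert, &key.PublicKey, signerKey)
	if err != nil {
		return tls.Certificate{}, err
	}
	leaf, err := x509.ParseCertificate(der)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key, Leaf: leaf}, nil
}

// roundTrip starts a test upstream that requires client certificates signed
// by a throwaway CA, connects with a certificate for proxyName, and returns
// the DNS name the upstream saw in the client certificate.
func roundTrip(proxyName string) (string, error) {
	ca, err := newCert("Example Internal Root", nil)
	if err != nil {
		return "", err
	}
	clientCert, err := newCert(proxyName, &ca)
	if err != nil {
		return "", err
	}

	// Upstream side: trust the internal root and enforce mTLS.
	pool := x509.NewCertPool()
	pool.AddCert(ca.Leaf)
	upstream := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// The handshake has already verified the chain; echo the identity.
		fmt.Fprint(w, r.TLS.PeerCertificates[0].DNSNames[0])
	}))
	upstream.TLS = &tls.Config{ClientAuth: tls.RequireAndVerifyClientCert, ClientCAs: pool}
	upstream.StartTLS()
	defer upstream.Close()

	// Client side: the test client already trusts the upstream's own
	// certificate; we only add the client certificate to present.
	httpClient := upstream.Client()
	httpClient.Transport.(*http.Transport).TLSClientConfig.Certificates = []tls.Certificate{clientCert}
	resp, err := httpClient.Get(upstream.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	seen, err := roundTrip("clientcert.test")
	if err != nil {
		panic(err)
	}
	fmt.Println("upstream saw client certificate for:", seen)
}
```

In the Caddy setup the reverse proxy plays the client role and keeps the certificate renewed itself; the sketch only shows what the upstream has to trust and what identity it ends up seeing.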

mike1237 · Answer 28 · Sat Apr 25 2020 06:00:04 GMT+0800 (China Standard Time)

I want to first thank everyone for their amazing work on both of these products.

I'm working on re-architecting all of my environments to Zero Trust, including my dev & staging environments.

For the Dev environment, specifically, I'm looking to use an internal non-IANA domain e.g. mike-lab.private (https://tools.ietf.org/html/rfc6762#appendix-G)

I'm testing Win10 with WSL2 and Docker for Windows, which means that Caddy 2's automatic HTTPS functionality breaks because Caddy 2 is running in Docker/WSL2 and can't add the root cert to Windows trusted certificate store.

So I am working on using step-ca with an ACME provisioner and I will add the root certs manually on the client PCs (long term goal is to interface something like Pomerium to proxy Devnet access and utilize an SSO provider so that I don't have to deal with client device root certs).

I am looking into provisioning step-ca and Caddy with an IaaS solution to configure Caddy to use step-ca's ACME provisioner via JSON and to stand up a step-ca instance + ACME.

Matt Holt · Answer 29 · Tue May 12 2020 07:19:03 GMT+0800 (China Standard Time)

All the immediately-planned features have now been implemented and merged into master -- so I will close this issue. Later on, we can figure out ways to better automate the configuration of these features in a cluster.