WIP: Add Application Load Balancer Controller Manager#879
WIP: Add Application Load Balancer Controller Manager#879kamilprzybyl wants to merge 76 commits into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
7ecc416 to
86bdf1d
Compare
| // generate self-signed certificates for the metrics server. While convenient for development and testing, | ||
| // this setup is not recommended for production. | ||
| // | ||
| // TODO(user): If you enable certManager, uncomment the following lines: |
There was a problem hiding this comment.
might be good to create separate examples for that
| sort.SliceStable(ruleMetadataList, func(i, j int) bool { | ||
| a, b := ruleMetadataList[i], ruleMetadataList[j] | ||
| // 1. Host name (lexicographically) | ||
| if a.host != b.host { |
| } | ||
|
|
||
| // cleanupCerts deletes the certificates from the Certificates API that are no longer associated with any Ingress in the IngressClass | ||
| func (r *IngressClassReconciler) cleanupCerts(ctx context.Context, ingressClass *networkingv1.IngressClass, ingresses []*networkingv1.Ingress) error { |
There was a problem hiding this comment.
do we really want to delete certificates? That could be very dangerous as users might just not associate the certificate anymore and want to keep it.
There was a problem hiding this comment.
Each Ingress TLS secret reference is mapped to a unique ALB Certificate resource. Even if multiple Ingresses reference the same k8s Secret, they will each trigger the creation of a separate, independent certificate on the ALB. That means, deleting one Ingress (and its associated ALB Certificate) does not impact the certificates still being used by other Ingresses.
|
|
||
| // isCertValid checks if the certificate chain is complete. It is used for checking if | ||
| // the cert-manager's ACME challenge is completed, or if it's sill ongoing. | ||
| func isCertValid(secret *corev1.Secret) (bool, error) { |
There was a problem hiding this comment.
isn't that already handled by the certificate api?
There was a problem hiding this comment.
The Cert API only checks if the public and private keys are matching + if the certificate is a valid X.509 format but it does not reject incomplete chains.
Renamed the function from isCertValid to isCertReady and clarified the function in the comment above.
d8de28a to
f903b25
Compare
|
We currently only delete certificates when the IngressClass is deleted or no Ingress references the IngressClass. |
|
3ad18ef to
9d6f767
Compare
| // It uses the trusted CAs from the operating system for validation. | ||
| tlsBridgingTrustedCaAnnotation = "alb.stackit.cloud/tls-bridging-trusted-ca" | ||
| // If set, the application load balancer enables TLS bridging with a custom CA provided as value. | ||
| tlsBridgingCustomCaAnnotation = "alb.stackit.cloud/tls-bridging-custom-ca" |
There was a problem hiding this comment.
This should probably use as value a reference to a TLS secret.
| // It merges and sorts all routing rules across the ingresses based on host, priority, path specificity, path type, and ingress origin. | ||
| // The resulting ALB payload includes targets derived from cluster nodes, target pools per backend service, HTTP(S) listeners, | ||
| // and optional TLS certificate bindings. This spec is later used to create or update the actual ALB instance. | ||
| func (r *IngressClassReconciler) albSpecFromIngress( //nolint:funlen,gocyclo // We go through a lot of fields. Not much complexity. |
There was a problem hiding this comment.
this is indeed complex, should be split up into multiple functions.
| return 0, fmt.Errorf("service not found: %s", path.Backend.Service.Name) | ||
| } | ||
|
|
||
| if path.Backend.Service.Port.Name != "" { |
There was a problem hiding this comment.
should be refactored to have less branching, 4 indents is too much imo.
| ## Overview | ||
|
|
||
| The STACKIT Cloud Provider includes both the Cloud Controller Manager (CCM) for managing cloud resources and the CSI driver for persistent storage. This deployment provides a unified solution for cloud integration and storage provisioning. | ||
| The STACKIT Cloud Provider includes the Cloud Controller Manager (CCM) for managing cloud resources, the CSI driver for persistent storage and the Application Load Balancer Controller Manager (ALBCM) for managing STACKIT Application Load Balancer (ALB) via Ingress Resources. |
There was a problem hiding this comment.
we should have different deployment docs for each binary. CCM and ALBCM should be different md files. It's easier to comprehend.
# Conflicts: # go.mod # go.sum
ccab903 to
cde054d
Compare
| } | ||
|
|
||
| targets := []albsdk.Target{} | ||
| for _, node := range nodes.Items { |
There was a problem hiding this comment.
We should also filter out nodes that are to be removed, similar to the CCM. This also requires extending the watcher predicate.
hown3d
left a comment
There was a problem hiding this comment.
some more general code concerns.
| var evtErr *errorEvent | ||
|
|
||
| if errors.As(errItem, &evtErr) { | ||
| log.Info(evtErr.description, "typ", evtErr.typ, "ingressRef", evtErr.ingressRef) |
There was a problem hiding this comment.
Instead of filtering here for the error type, include the "typ" and "ingressRef" in the Error method of the errorEvent type. This way you can just always use errItem.Error().
| ctx context.Context, | ||
| class *networkingv1.IngressClass, | ||
| ingresses []networkingv1.Ingress, | ||
| ) (payload *albsdk.CreateLoadBalancerPayload, activeCertIDs map[string]string, validationErrors []error, err error) { |
There was a problem hiding this comment.
It is a bad practice to have separate error variables return and also having slices of errors. Please use one error type and use errors.Join to join your errors.
In the calling function you can use unwrap to get a list of all the errors.
| return e.description | ||
| } | ||
|
|
||
| func (r *IngressClassReconciler) SendEvents(class *networkingv1.IngressClass, validationErrors []error) { |
There was a problem hiding this comment.
This could be a method of the errorEvent type.
func (e *errorEvent) RecordEvent(class *networkingv1.IngressClass, recorder record.EventRecorder) {
if evtErr.ingressRef.Name == "" {
return
}
recorder.Eventf(class, corev1.EventTypeWarning, "ALB", "Error in %s %s in Namespace %s: %s", evtErr.ingressRef.Kind, evtErr.ingressRef.Name, evtErr.ingressRef.Namespace, evtErr.description)
recorder.Event(&evtErr.ingressRef, corev1.EventTypeWarning, "ALB", evtErr.description)
}You currently parse the errors into the errorEvent to later call this SendEvents function to parse the errors as types again. This is not neccessary.
There was a problem hiding this comment.
replaced SendEvents with RecordEvent and made it an errorEvent method. The new method records a single event instead of a list of events.
| type errorEvent struct { | ||
| ingressRef corev1.ObjectReference | ||
| description string | ||
| typ string |
There was a problem hiding this comment.
This is always set to "error". What purpose does it bring?
Maybe a field.Path could be useful here for the user to directly see from what field in his ingress manifest the error stems.
There was a problem hiding this comment.
replaced with field.NewPath
| continue | ||
| } | ||
| if dstTargetPool.skipCertificateValidation != srcTargetPool.skipCertificateValidation { | ||
| mergeErrors = append(mergeErrors, &errorEvent{ |
There was a problem hiding this comment.
Please use errors.Join and don't return an error slice
| | `alb.stackit.cloud/cookie-persistence-name` | IngressClass, Ingress | Optional | Sets the name for session cookie persistence. | | ||
| | `alb.stackit.cloud/cookie-persistence-ttl-seconds`| IngressClass, Ingress | Optional | Sets the TTL (in seconds) for cookie persistence. | | ||
| | `alb.stackit.cloud/priority` | Ingress | Optional | Defines the evaluation priority of the Ingress. | | ||
| | `alb.stackit.cloud/traget-pool-tls-enabled` | IngressClass, Ingress, Service | Optional | Enables TLS bridging using OS trusted CAs. | |
| Currently, the Ingress API is supported. | ||
| Support for Gateway API is planned. | ||
|
|
||
| ##### Environment Variables |
There was a problem hiding this comment.
Maybe a section for "Get Started" that also links to the example deployment.yaml that tells a user how to actually run this alb controller in their cluster (if not managed by SKE).
Having a headline and then 5th headline with environment variable is confusing.
| For(&networkingv1.IngressClass{}, builder.WithPredicates(ingressClassPredicate())). | ||
| Watches(&corev1.Node{}, nodeEventHandler(r.Client), builder.WithPredicates(nodePredicate())). | ||
| Watches(&networkingv1.Ingress{}, ingressEventHandler(r.Client)). | ||
| Watches(&corev1.Secret{}, secretEventHandler(r.Client)). |
There was a problem hiding this comment.
Please make sure to disable the cache for secrets, as this would otherwise get quite huge pretty fast if we watch all secrets.
There was a problem hiding this comment.
In addition, the secret contains a server side field selector for "type". Therefore we could just watch for secrets of type TLS to reduce API server churn
| ) | ||
|
|
||
| func init() { | ||
| utilruntime.Must(clientgoscheme.AddToScheme(scheme)) |
There was a problem hiding this comment.
no need to use a custom scheme if we just use kubernetes core resources
| return len(c.CertificateConfig.CertificateIds) != len(d.CertificateConfig.CertificateIds) | ||
| } | ||
|
|
||
| func targetPoolsChanged(current, desired []albsdk.TargetPool) bool { |
There was a problem hiding this comment.
please implement these changed functions with slices.EqualFunc
Utilized getSortedIngressesForIngressClass for getAlbSpecForIngressClass
… alb specs via getAlbSpecForResources at last step
fixed unit tests and added more for ingress controller
unit tests for alb spec are added ensured that ephemeral address option defaults to false when internal-alb annotation is defined.
…errorEvent method; change logic to record a single event instead of a list of events
|
@kamilprzybyl: The following test failed, say
Full PR test history. Your PR dashboard. Command help for this repository. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
How to categorize this PR?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Breaking changes: