Cloud computing has opened a Pandora's box of new issues compared to the good old on-premises world. Chief among them, I believe, is data residency, or data location:

Data localization or data residency law requires data about a nation's citizens or residents to be collected, processed, and/or stored inside the country, often before being transferred internationally. Such data is usually transferred only after meeting local privacy or data protection laws, such as giving the user notice of how the information will be used and obtaining their consent. Data localization builds upon the concept of data sovereignty that regulates certain data types by the laws applicable to the data subjects or processors. While data sovereignty may require that records about a nation's citizens or residents follow its personal or financial data processing laws, data localization goes a step further in requiring that initial collection, processing, and storage first occur within the national boundaries. In some cases, data about a nation's citizens or residents must also be deleted from foreign systems before being removed from systems in the data subject's nation.

- Data localization

Data residency is an essential consideration for solution architects. Two critical scenarios can impact data residency requirements:

Legal requirements: Certain countries have laws that require data to be stored within their territory. For example, China has strict data residency laws, which can impact how businesses operating in the country store data. It is essential to be aware of such regulations and ensure compliance.

Jurisdictional challenges: Laws and regulations governing data privacy and security vary between countries. This gap can create challenges for cloud providers in managing data storage and access across multiple countries, as they need to comply with the laws and regulations of each jurisdiction. For instance, FISA and the Patriot Act in the US allow US actors to access the data of EU citizens, even though the GDPR protects them. As a cloud service provider responsible for upholding EU regulations, you could be liable under EU law if such access happens.

Most cloud providers offer this capability; e.g., Google. However, it assumes the provider has data centers in the desired location; Google, for example, has none in China. In this post, I want to offer a couple of options to handle this requirement.

Where To Compute the Location?

In this section, I'll list a couple of places where the location can be computed.

In the Code

We can manage data residency at the code level. It's the most flexible option, but because it requires writing code, it's also the most error-prone. Specifics depend on your tech stack, but it goes like this:

Get the request.
Optionally, query additional data.
Establish where the data should go.
Write the data to the computed location.

Here's what it looks like, where X and Y correspond to different countries:
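To make this concrete, here is a minimal Java sketch of the code-level approach, assuming one JDBC DataSource per country (the X and Y above) and that the country code has already been derived from the request. The class, table, and column names are hypothetical illustrations, not tied to any framework:

Java
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class ResidencyAwareWriter {

    // One DataSource per country code, e.g., "X" and "Y"; wiring is assumed
    private final Map<String, DataSource> dataSourcesByCountry;

    public ResidencyAwareWriter(Map<String, DataSource> dataSourcesByCountry) {
        this.dataSourcesByCountry = dataSourcesByCountry;
    }

    public void saveUser(String countryCode, String id, String name) throws SQLException {
        // Establish where the data should go
        DataSource target = dataSourcesByCountry.get(countryCode);
        if (target == null) {
            throw new IllegalArgumentException("No database configured for country " + countryCode);
        }
        // Write the data to the computed location
        try (Connection connection = target.getConnection();
             PreparedStatement insert = connection.prepareStatement(
                     "INSERT INTO users (id, name) VALUES (?, ?)")) {
            insert.setString(1, id);
            insert.setString(2, name);
            insert.executeUpdate();
        }
    }
}

A sharding driver such as Apache ShardingSphere automates precisely this routing decision, which is what the next option describes.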
In a Library/Framework

From an architectural point of view, the driver approach is similar to the one above. However, the code doesn't compute the final location; the library/framework offers sharding:

A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance, to spread load. Some data within a database remains present in all shards, but some appear only in a single shard. Each shard (or server) acts as the single source for this subset of data.

- Shard (database architecture)

In this approach, the application knows about all countries' database URLs and needs to keep track of them. For example, the Apache ShardingSphere project provides a JVM database driver with such sharding capabilities. One can configure the driver to write data to a shard depending on a key; i.e., the location.

Apache ShardingSphere is an ecosystem to transform any database into a distributed database system, and enhance it with sharding, elastic scaling, encryption features & more. The project is committed to providing a multi-source heterogeneous, enhanced database platform and further building an ecosystem around the upper layer of the platform. Database Plus, the design philosophy of Apache ShardingSphere, aims at building the standard and ecosystem on the upper layer of the heterogeneous database. It focuses on how to make full and reasonable use of the computing and storage capabilities of existing databases rather than creating a brand new database. It attaches greater importance to the collaboration between multiple databases instead of the database itself.

- Apache ShardingSphere

In a Proxy

The proxy approach is similar to the library/framework approach above; the difference is that a library runs inside the application, while a proxy is a dedicated component. The responsibility of keeping track of the databases now falls on the proxy's shoulders. Apache ShardingSphere provides both alternatives: a JDBC driver and a proxy.

In the API Gateway

Another approach is to compute the location in the API Gateway. Interestingly enough, keeping track of upstreams is already part of a Gateway's regular responsibilities.

How To Compute the Location

This section considers what we need to compute the location. Let's examine the case of an HTTP request from a client to a system. The system as a whole, regardless of the specific component, needs to compute the location to know where to save the data. Two cases can arise: either the HTTP request carries enough information to compute where to send the data, or it doesn't. In the former case, the information may come from a client cookie, from hidden fields that a previous request set on a web page, or from any other approach. In the latter case, the system needs additional data.

If the request is self-sufficient, it's better to forward it to the correct country as early in the request-processing chain as possible; i.e., from the API Gateway. Conversely, if the data requires enrichment, it should be done as close to the data-enrichment component as possible.

Imagine a situation where the data location is based on a user's place of residence. We could set a cookie to store this information, or data that allows us to compute it client-side. Every request would carry the data, and we wouldn't need additional data to compute the location. However, storing sensitive data on the client is a risk, as malicious actors could tamper with it. To improve security, we could keep everything server-side, but then the system would need to request additional data to compute the location on every request.

A Tentative Proposal

We can reconcile the best of both worlds at the cost of additional complexity. Remember that everything is a trade-off: your mileage may vary depending on your context. The following is a good first draft that you can refine later.

The first request can hit any of the two (or more) endpoints. The response adds metadata: which Gateway to query on subsequent requests, plus enough data to compute the location.
On the second request, the client queries the correct Gateway. Note that for resiliency purposes, there can be two layers. The request also carries the data necessary to compute the location. The Gateway receives the request, computes the location, and forwards the request to the app in the same location. The app receives the request and does what it's supposed to do using a sharding-friendly library. The library computes the location again. In most cases, it yields the current location; if the data has been tampered with or the topology has changed since the initial computation, it switches to the correct location. In the latter case, we incur a performance penalty; however, our design makes this unlikely.

The main principle is to select the correct location as early as possible, while leaving the option to move to the right location if needed. Here's the sequence diagram of the second call, with location metadata already added:

Conclusion

In this post, we looked at data residency and designed a draft architecture to implement it. In the next post, we will delve into the technical details.

To Go Further

Foreign Intelligence Surveillance Act
Patriot Act
GDPR
Apache ShardingSphere
In this blog, I would like to share a few best practices for creating Highly Available (HA) applications in Mule 4 from an infrastructure perspective ONLY (CloudHub in this article refers to CloudHub 1.0 ONLY). Most of the configuration details (only those relevant to HA) shared here are taken from MuleSoft documentation/articles/blogs.

1. Horizontal Scaling

Scaling horizontally means adding more servers so that the load is distributed across multiple nodes. Scaling horizontally usually requires more effort than vertical scaling, but it is easier to scale indefinitely once set up.

CloudHub (CH)

Add multiple CloudHub worker nodes. CloudHub (CH) provides high availability (HA) and disaster recovery against application and hardware failures. CH uses Amazon AWS for its cloud infrastructure, so availability depends on Amazon. Availability and deployments in CH are separated into different regions, which in turn map to the corresponding Amazon regions. If an Amazon region goes down, the applications within that region are unavailable and are not automatically replicated to other regions.

Deploy on Multiple Workers (Single Region)

In order to achieve HA, add multiple CH workers to your Mule application so that it scales horizontally. CH automatically distributes multiple workers for the same application for maximum reliability. When you deploy your application to two or more workers, the HTTP load-balancing service distributes requests across these workers, allowing you to scale your services horizontally. Requests are distributed on a round-robin basis.

Note: HTTP load balancing can be implemented by an internal reverse proxy server. Requests to the application (domain) URL http://appname.cloudhub.io are automatically load-balanced between all the application's worker URLs. In case of an AZ failure, all requests are served by the Mule application deployed in the other AZ in the same region.

Deploy on Multiple Regions (Also for Disaster Recovery (DR))

An application deployed to multiple regions provides a better disaster recovery (DR) strategy along with HA. Generally, a region never goes down, but it may fail during natural calamities. If you have a DR requirement, you can deploy the application to multiple regions and implement the CloudHub load balancer or any external load balancer (see below).

DR implementation by deploying the Mule application into multiple regions and setting up a load balancer

Enable CloudHub HA Features

CloudHub's High Availability (HA) features provide scalability, workload distribution, and added reliability to your applications on CloudHub. This functionality is powered by CloudHub's scalable load-balancing service, worker scale-out, and persistent queues features.

Note: You can enable HA features on a per-application basis using the Anypoint Runtime Manager console when either deploying a new application or redeploying an existing application.

Worker Scale-Out

Refer to 'Deploy on Multiple Workers (Single Region)' above.

Persistent Queues

Persistent queues ensure zero message loss and let you distribute workloads across a set of workers. If your application is deployed to more than one worker, persistent queues allow communication between workers and workload distribution.

How To Enable CloudHub HA Features

You can enable and disable either or both of the above CloudHub HA features in one of two ways:

When you deploy an application to CloudHub for the first time using the Runtime Manager console.
By accessing the Deployment tab in the Runtime Manager console for a previously deployed application.

Steps To Follow

Click an application to see the overview, and click Manage Application.
Next to Workers, select options from the drop-down menus to define the number and type of workers assigned to your application.
Click Settings, and select the Persistent Queues checkbox to enable queue persistence.
If your application is already deployed, redeploy it for the new settings to take effect.

On-Premises

Create multiple standalone Mule runtime instances (nodes) and then create a cluster out of them. When you deploy applications on-premises, you are responsible for the installation and configuration of the Mule runtime instances that run your Mule applications. Mule Enterprise Edition supports scalable clustering to provide high availability (HA) for deployed applications.

A cluster is a set of Mule runtime engines that acts as a unit. In other words, a cluster is a virtual server composed of multiple nodes (Mule runtime engines). The nodes in a cluster communicate and share information through a distributed shared memory grid. This means that the data is replicated across memory in different machines.

Cluster Setup in On-Premises Setup

Single Data Center Multi-Node Cluster

By default, clustering Mule runtime engines ensures high availability (HA). If a Mule runtime engine node becomes unavailable due to failure or planned downtime, another node in the cluster can assume the workload and continue to process existing events and messages. The following figure illustrates the processing of incoming messages by a cluster of two nodes. The load is balanced across nodes: Node 1 processes message 1 while Node 2 simultaneously processes message 2.

Processing of incoming messages by a cluster (when NO failure occurs)

If one node fails, the other available nodes pick up the work of the failing node. As shown in the following figure, if Node 2 fails, Node 1 processes both message 1 and message 2.

Processing of incoming messages by a cluster (when failure occurs)

How To Add Servers/Nodes to Runtime Manager (Applicable to Deployment Options Where the Control Plane Is either Anypoint Platform or On-Prem)

Prerequisites:

Your enterprise license is current.
You are running Mule 3.6.0 or later, and API Gateway 2.1 or later.
If you want to download the Runtime Manager (RM) Agent, you must have an Enterprise support account.
If the server is already registered with another Runtime Manager instance, remove that registration first.

Notes:

For Mule 3.6.x, install the Runtime Manager agent.
For Mule 3.7.x and later Mule 3.x versions, you can optionally update the Runtime Manager agent to the latest version to take advantage of all bug fixes and new features.
For info on RM Agent installation, refer here.

Steps To Follow

In order to add a Mule server to Runtime Manager so that you can manage it, you must first register it with the Runtime Manager agent.

Note: Use the amc_setup script to configure the Runtime Manager agent to communicate with Runtime Manager.

From Anypoint Platform, select Runtime Manager.
Click Servers in the left menu.
Click the Add Server button.
Enter a name for your server.

Notes: Server names can contain up to 60 alphanumeric characters (a-z, A-Z, 0-9), periods (.), hyphens (-), and underscores (_), but not spaces or other special characters. Runtime Manager supports Unicode characters in server names.
The server name must be unique in the environment, but it can be the same for the same organization in different environments.

Runtime Manager generates the amc_setup command. This command includes the server name you specified (server-name) and the registration token (token) required to register Mule in your environment. The registration token includes your organization ID and the current environment.

The arrow shows the amc_setup command in the Add Server window

Click Copy command to copy the amc_setup command. This button appears only if the server name you specify is valid.
In a terminal window, change to the $MULE_HOME/bin directory of the Mule instance that you're registering.
Paste the command on the command line, and include any other parameters on the amc_setup command line (see the example at the end of this subsection). For more info on parameters, refer here.
If your environment requires all outbound calls to go through a proxy, specify proxy settings in either the $MULE_HOME/conf/mule-agent.yml file or the $MULE_HOME/conf/wrapper.conf file.

When the amc_setup command completes successfully, you see the below messages:

Mule Agent configured successfully
Connecting to Access Management to extract client_id and client_secret
Credentials extracted correctly, updating wrapper conf file

After the script completes successfully, the name of your server appears on the Servers tab of Runtime Manager with a status of Created.

Note: If the server was running when you ran the amc_setup script, restart the server to reconnect with Runtime Manager.
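For illustration, running the copied command looks roughly like the following; the token and server name below are hypothetical placeholders, and you should always paste the exact command generated by Runtime Manager:

cd $MULE_HOME/bin
./amc_setup -H <registration-token> my-mule-server-01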
Setup Two Data Centers (Primary-Secondary) Multi-Node Clusters (Also Applicable for Disaster Recovery (DR))

To achieve high availability (HA), you can select from two deployment topology options: a two-data-center architecture (see below) or a three-data-center architecture. You would set up your Mule application the same way in a highly available cluster configuration in both the primary and the secondary data centers.

Use Active-Active Configuration ONLY

This configuration provides higher availability with minimal human involvement. Requests are served from both data centers. You should configure the load balancer with appropriate timeout and retry logic to automatically route requests to the second data center if a failure occurs in the first data center. The benefits of an Active-Active configuration are a reduced recovery time objective (RTO) and recovery point objective (RPO).

Two Data Centers Multi-Node Clusters in Active-Active Configuration

For the RPO requirement, data synchronization between the two active data centers must be extremely timely to allow seamless request flow.

Use an External Load Balancer

When Mule clusters are used to serve TCP requests (where TCP includes SSL/TLS, UDP, Multicast, HTTP, and HTTPS), some load balancing is needed to distribute the requests among the clustered instances. There are various software load balancers available. Many hardware load balancers can also route both TCP and HTTP or HTTPS traffic.

RTF (Runtime Fabric)

Runtime Fabric Clusters

These are clusters at the node (worker) level using Kubernetes (K8s). They differ from Mule runtime clusters (on-prem) in that they lack the following clustering features:

Distributed shared memory
Shared state
Shared VM queues

In RTF clusters, Mule runtimes are not aware of each other. They are comparable to a CloudHub environment with two or more workers.

Add More Workers or Controller Nodes to the Cluster

This management of nodes has to be carried out on the Kubernetes side, not in the MuleSoft control plane (Anypoint Platform Runtime Manager). There are two main ways to add nodes to the cluster:

The kubelet on a node self-registers to the control plane.
Manually add a Node object using kubectl.

After you create a Node object, or the kubelet on a node self-registers, the control plane checks whether the new Node object is valid.

Note: For more info, refer to the Kubernetes documentation.

Add More Pod Replicas to the Same Worker Node

This management of pod replicas can be carried out in Anypoint Platform Runtime Manager.

Steps To Follow

Log in to Anypoint Platform using your Enterprise Mule credentials, and go to Runtime Manager.
Select your application.
Click the Deployment Target tab.
Locate the Replicas drop-down and change the number of replicas to the desired number.

Runtime Manager screen where the number of pod replicas can be changed

Select the option Run in Runtime Cluster Mode.
Click Apply Changes.
Click View Status.

Status screen after pod replica creation

2. Load Balancing

Load balancers are one or more specialized servers that intercept requests intended for a group of (backend) systems, distributing traffic among them for optimum performance. If one backend system fails, these load balancers automatically redirect incoming requests to the other systems. By distributing incoming requests across multiple systems, you enable the still-operational systems to take over when one system fails, hence achieving High Availability (HA).

CloudHub (CH)

The Shared Load Balancer (SLB) is available within a region and serves all the CH customers within that region, so you have no control over it. Instead, create an Anypoint VPC first and then a Dedicated Load Balancer (DLB) within the VPC.

Create a Dedicated Load Balancer (DLB) Within an Anypoint Virtual Private Cloud (VPC)

Prerequisites:

Ensure that your profile is authorized to perform this action by adding the CloudHub Network Administrator permission to the profile of the organization where you are creating the load balancer.
Create an Anypoint Virtual Private Cloud (Anypoint VPC) in the organization where you want to create a load balancer.
Create at least one certificate and private key for your certificate.

There are three ways to create and configure a DLB for your Anypoint VPC:

Using Runtime Manager
Using the CLI
Using the CloudHub REST API

Steps To Follow (Using Runtime Manager)

From Anypoint Platform, click Runtime Manager.
Click Load Balancers > Create Load Balancer.
Enter a name for your load balancer.

Note: The CloudHub DLB name must be unique across all DLBs defined in Anypoint Platform (by all MuleSoft customers).

Select a target Anypoint VPC from the drop-down list.
Specify the amount of time the DLB waits for a response from the Mule application in the Timeout in Seconds field. The default value is 300 seconds.
Add any allow-listed classless inter-domain routing (CIDR) blocks as required. The IP addresses you specify here are the only IP addresses that can access the load balancer. The default value is 0.0.0.0/0.
Select the inbound HTTP mode for the load balancer. This property specifies the behavior of the load balancer when receiving an HTTP request. Values to use for HA:

On: Accepts the inbound request on the default SSL endpoint using the HTTP protocol.
Redirect: Redirects the request to the same URL using the HTTPS protocol.
Specify options:

Enable Static IPs specifies the use of static IPs, which persist when the DLB restarts.
Keep URL encoding specifies that the DLB passes only the %20 and %23 characters as is. If you deselect this option, the DLB decodes the encoded part of the request URI before passing it to the CloudHub worker.
Support TLS 1.0 specifies support for TLS 1.0 between the client and the DLB.
Upstream TLS 1.2 specifies forced TLS 1.2 between the DLB and the upstream CloudHub worker.
Forward Client Certificate specifies that the DLB forwards the client certificate to the CloudHub worker.

Add a certificate:

Click Add Certificate.

The arrow shows the Add Certificate option on the Create Load Balancer page

On the Create Load Balancer | Add certificate page, select Choose File to upload both the public key and private key files.
If you want to add a client certificate, click Choose File to upload the file.
If you want to add URL mapping rules, click the > icon to display the options.

The arrow shows the expand icon on the Create Load Balancer | Add certificate page

If you add more than one URL mapping rule, order the rules in the list according to the priority in which they should be applied.
Click Add New Rule, and then specify the input path, target app, output path, and protocol.
Click Save Certificate.
Click Create Load Balancer.

On-Premises

When on-prem Mule clusters are used to serve TCP requests (where TCP includes SSL/TLS, UDP, Multicast, HTTP, and HTTPS), some external load balancing is needed to distribute the requests among the clustered instances. There are various third-party software load balancers available; two of them, which MuleSoft recommends, are:

NGINX, an open-source HTTP server and reverse proxy. You can use NGINX's HttpUpstreamModule for HTTP(S) load balancing.
The Apache web server, which can also be used as an HTTP(S) load balancer.

Many hardware load balancers can also route both TCP and HTTP or HTTPS traffic. I will not go into more detail, as the choice is at your/the customer's discretion. Refer to the documentation of whichever load balancer you choose for configuration details.

RTF (Runtime Fabric)

Deploy Ingress Controllers (for Both Internal and External Calls)

Runtime Fabric supports any ingress controller that is compatible with your Kubernetes environment and supports a deployment model where a separate ingress resource is created per application deployment. In general, most off-the-shelf ingress controllers support this model. The ingress controller is not part of the product offering: customers need to choose and configure the ingress controller on self-managed RTF (BYOK). Runtime Fabric creates a separate Ingress resource for each application deployed. The default ingress controller in any cloud (AWS, Azure, or GCP) creates a separate HTTP(S) load balancer for each Ingress resource in the K8s cluster.

Prerequisite(s)

The RTF cluster is up and running.

Steps To Follow

Create the internal and external ingress controllers.

kubectl apply -f nginx-ingress.yaml
kubectl apply -f nginx-ingress-internal-v2.yaml

Validate that the ingress controllers are created with external and internal IPs.

kubectl get services --all-namespaces

Validate the IPs created.

Create the ingress templates.

kubectl apply -f ingress-template.yaml
kubectl apply -f ingress-template-internal.yaml

Note: For info on how to create templates, refer here.

Validate that the Ingress resources are created.

kubectl get ing --all-namespaces

Deploy your Mule application.

In the Ingress section of the deployed application, you should be able to add both internal and external endpoints.

Ingress section of the RM screen where we can add endpoints

External endpoint testing: execute a cURL command from your own machine. Internal endpoint testing: a good way of testing is spawning a container inside the K8s cluster, accessing it, and then executing a cURL command from there (see the sketch below).
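As an illustration, the two tests could look like the following; the hostnames are hypothetical placeholders for the endpoints you configured:

# External test, from your own machine
curl -v https://myapp.example.com/api/ping

# Internal test: spawn a throwaway pod inside the K8s cluster and curl from it
kubectl run curl-test --rm -it --image=curlimages/curl -- sh
# ...then, inside the pod:
curl -v http://myapp.internal.example.com/api/ping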
Deploy External Load Balancers

If you want your Mule applications to be externally accessible, you must add an external load balancer to your K8s cluster. Load balancers create a gateway for external connections to access your cluster, provided that the user knows the load balancer's IP address and the application's port number. When running multiple ingress controllers, you must have an external load balancer outside Runtime Fabric in front of each of the ingress controllers. The external load balancer must support TCP load balancing and must be configured with a server pool containing the IP addresses of each controller. A health check must be configured on the external load balancer, listening on port 443.

This configuration of the external load balancer provides the following:

High availability.
Protection against failure.
Automatic failover (if a replica of the internal load balancer restarts or is evicted and rescheduled on another controller).

Note: For info on which external load balancer to use (as per MuleSoft recommendations), refer to 'Load Balancing, On-Premises' above.

3. Managing Server Groups (Applicable to the Hybrid Deployment Option Only: Runtime Plane Is On-Prem and Control Plane Is Anypoint Platform)

A server group is a set of servers that act as a single deployment target for applications so that you don't have to deploy applications to each server individually. Unlike clusters, application instances in a server group run in isolation from the application instances running on the other servers in the group. Deploying applications to servers in server groups provides redundancy, so you can restore applications more seamlessly and quickly, with less downtime, hence High Availability (HA).

Create Server Groups

Prerequisites

All servers in a server group must be running the same version of the Mule runtime engine and the same version of the Runtime Manager agent.
You can create a server group with servers in the Running, Created, or Disconnected status, but you cannot have a server group with mixed statuses.
In order to add a server on which existing applications are currently running to a server group, you must first stop and delete the applications from the server.

Add Servers to Runtime Manager (RM)

Refer to 'How To Add Servers/Nodes to Runtime Manager' above.

Create Server Groups

To deploy applications to a server group, you can either add the servers to Runtime Manager first and then create the server group, or you can create the group and add servers to it later.

Steps To Follow

From Anypoint Platform, select Runtime Manager.
Select Servers in the left menu.
Click the Create Group button.
In the Create Server Group page, enter the name of the server group. Group names can contain between 3 and 40 alphanumeric characters (a-z, A-Z, 0-9) and hyphens (-). They cannot start or end with a hyphen and cannot contain spaces or other special characters.
Select the servers to include in your new server group.
Click the Create Group button.

Note: The new server group appears in the Servers list. The servers themselves no longer appear in the Servers list.
To see the list of servers in the group, click the server group name.

Add Servers to a Server Group

Prerequisites

At least one server is configured.
The server group is created.
No applications are deployed on the server that you are adding to the server group.

Steps To Follow

From Anypoint Platform, select Runtime Manager.
Select Servers in the left menu.
Click Group in the Type column to display the details pane.
Click the Add Server button.

The arrow shows the Add Server button in the details pane

A list of available servers appears. Select the servers to add to the group, and click the Add Servers button.

Note: The servers no longer appear in the Servers list. To see the list of servers in the server group, click the group name.

4. Managing Clusters (Applicable to the On-Premises Deployment Model Only)

A cluster is a set of up to eight servers that act as a single deployment target and high-availability processing unit. Unlike in server groups, application instances in a cluster are aware of each other, share common information, and synchronize statuses. If one server fails, another server takes over processing applications. A cluster can run multiple applications.

Note: Before creating a cluster, you must create the Mule runtime engine instances and add the Mule servers to Anypoint Runtime Manager.

Features That Help in Achieving HA

Clusters Management

To deploy applications to a cluster, you can either add the servers to Runtime Manager first and then create the cluster, or you can create the cluster and add servers to it later. There are two ways to create and manage clusters:

Using Runtime Manager
Manually, using a configuration file

Prerequisites/Restrictions

Do not mix cluster management tools.
All nodes in a cluster must have the same Mule runtime engine and Runtime Manager agent versions.
Manual cluster configuration is not synced to Anypoint Runtime Manager, so any change you make in the platform overrides the cluster configuration files. To avoid this scenario, use only one method for creating and managing your clusters: either manual configuration or configuration using Anypoint Runtime Manager.
If you are using a cumulative patch release, such as 4.3.0-20210322, all instances of Mule must be the same cumulative patch version.

How To Create Clusters Manually

Ensure that the node is not running; that is, the Mule runtime server is stopped.
Create a file named mule-cluster.properties inside the node's $MULE_HOME/.mule directory.
Edit the file with parameter = value pairs, one per line (see below).

Note: mule.clusterId and mule.clusterNodeId must be in the properties file.

Properties files
...
mule.cluster.nodes=192.168.10.21,192.168.10.22,192.168.10.23
mule.cluster.multicastenabled=false
mule.clusterId=<Cluster_ID>
mule.clusterNodeId=<Cluster_Node_ID>
...

Repeat this procedure for all Mule servers that you want in the cluster.
Start the Mule servers in the nodes.

How To Manage Clusters Manually

Manual management of a cluster is only possible for clusters that have been manually created.

Stop the node's Mule server.
Edit the node's mule-cluster.properties as desired, then save the file.
Restart the node's Mule server.

How To Create Clusters in Runtime Manager

To deploy applications to a cluster, you can either add the servers to Runtime Manager first and then create the cluster, or you can create the cluster and add servers to it later.

Prerequisites

Servers cannot contain any previously deployed applications.
Servers cannot belong to another cluster or server group.
Multicast servers can be in the Running or Disconnected state. Unicast servers must be in the Running state.
All servers in a cluster must be running the same Mule runtime engine version (including the monthly update) and the same Runtime Manager agent version.

Steps To Follow

From Anypoint Platform, select Runtime Manager.
Select Servers in the left menu.
Click the Create Cluster button.

The arrow shows the Create Cluster button on the Servers page

In the Create Cluster page, enter the name for the cluster.

Note: Cluster names can contain between 3 and 40 alphanumeric characters (a-z, A-Z, 0-9) and hyphens (-). They cannot start or end with a hyphen and cannot contain spaces or other special characters.

Select Unicast or Multicast.
Select the servers to include in your new cluster.
Click the Create Cluster button.

Note: The new cluster appears in the Servers list. The servers themselves no longer appear in the Servers list. To see the list of servers in the cluster, click the cluster name.

Add Servers to Runtime Manager

Refer to 'How To Add Servers/Nodes to Runtime Manager' above.

Multicast Clusters

Enable multicast for your clusters to achieve better HA. A multicast cluster comprises servers that automatically detect each other. Servers that are part of a multicast cluster must be on the same network segment.

Advantages

The server status doesn't need to be Running to configure it as a node in a cluster.
You can add nodes to the cluster dynamically without restarting the cluster.

Ports/IP Address Configuration Prerequisites

If you configure your cluster via Runtime Manager and you use the default ports, keep TCP ports 5701, 5702, and 5703 open. If you configure custom ports instead, keep the custom ports open.
Ensure communication between nodes is open through port 5701.
Keep UDP port 54327 open.
Enable the multicast IP address: 224.2.2.3. (See the sketch below for an example firewall configuration.)
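As an illustration only (this is an assumption about your environment, not MuleSoft-documented setup), on a Linux node running firewalld, opening these ports could look like:

firewall-cmd --permanent --add-port=5701-5703/tcp
firewall-cmd --permanent --add-port=54327/udp
firewall-cmd --reload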
Data Grid (In-Memory)

An in-memory data grid (IMDG) is a set of networked/clustered nodes that pool together their random-access memory (RAM) to let applications share data with other applications running in the cluster. Mule clusters have a shared memory based on Hazelcast IMDG. There are different ways in which Mule shares information between nodes, thereby achieving HA:

Some transports share information to ensure a single node is polling for resources. For example, in a cluster, there is a single FTP inbound endpoint polling for files.
VM queues are shared between the nodes, so if you put a message in a VM queue, potentially any node can grab it and continue processing the message.
The Object Store is shared between the cluster nodes.

For more info, refer here.

Data Sharing/Data Replication

Data sharing is the ability to distribute the same sets of data resources to multiple applications while maintaining data fidelity across all entities consuming the data. Data replication is a method of copying data to ensure that all information stays identical in real time between all data resources. The nodes in a cluster communicate and share information through a distributed shared memory grid (the IMDG described above). This means that the data is replicated across memory in different machines.

Data Sharing/Data Replication via IMDG

Fault Tolerance (FT)

Fault tolerance means ensuring recovery from the failure of an underlying component. By default, clustering Mule runtime engines ensures high availability (HA). If a Mule runtime engine node becomes unavailable due to failure or planned downtime, another node in the cluster can assume the workload and continue to process existing events and messages.

Note: For more info, refer to 'Single Data Center Multi-Node Cluster' above under 'Horizontal Scaling, On-Premises.'

Also, JMS can be used to achieve HA and FT by routing messages through JMS queues. In this case, each message is routed through a JMS queue whenever it moves from component to component.

Active-Active Configuration

Refer to 'Use Active-Active Configuration ONLY' above under 'Horizontal Scaling, On-Premises.'

Quorum Management

The quorum feature is only valid for components that use the Object Store. When managing a manually configured cluster, you can set a minimum quorum of machines required for the cluster to be operational. When a network is partitioned, clusters are available by default. However, by setting a minimum quorum size, you can configure your cluster to reject updates that do not pass a minimum threshold. This helps you achieve better consistency and protects your cluster in case of an unexpected loss of one of your nodes, hence HA.

Under normal circumstances, if a node were to die in the cluster, you may still have enough memory available to store your data, but the number of threads available to process requests would be reduced, as you now have fewer nodes available, and the partition threads in the cluster could quickly become overwhelmed. This could lead to:

Clients left without threads to process their requests.
The remaining members of the cluster becoming so overwhelmed with requests that they are unable to respond and are forced out of the cluster on the assumption that they are dead.

To protect the rest of the cluster in the event of member loss, you can set a minimum quorum size to stop concurrent updates to your nodes and throw a QuorumException whenever the number of active nodes in the cluster is below your configured value.

Note: To enable quorum, place the mule.cluster.quorumsize property in the cluster configuration file {MULE_HOME}/.mule/mule-cluster.properties, and define the minimum number of nodes for the cluster to remain in an operational state. Ensure you catch QuorumExceptions when configuring a quorum size for your cluster, and then decide what to do: send an email, stop a process, perform some logging, activate retry strategies, etc.

Primary Node

In the Active-Active model, there is no primary node; however, one of the nodes acts as the primary polling node. This means that sources can be configured to be used only by the primary polling node so that no other node reads messages from that source. This feature works differently depending on the source type:

Scheduler source: runs only on the primary polling node.
Any other source: defined by the primaryNodeOnly attribute. Check each connector's documentation for the default value of primaryNodeOnly for that connector.

XML
<flow name="jmsListener">
    <jms:listener config-ref="config" destination="listen-queue" primaryNodeOnly="true"/>
    <logger message="#[payload]"/>
</flow>

Object Store Persistence

You can persistently store JDBC data in a central system that is accessible to all cluster nodes when using the Mule runtime engine on-premises. The following relational database systems are supported:

MySQL 5.5+
PostgreSQL 9
Microsoft SQL Server 2014

To enable object store persistence, create a database and define its configuration values in the {MULE_HOME}/.mule/mule-cluster.properties file:

mule.cluster.jdbcstoreurl: JDBC URL for the connection to the database
mule.cluster.jdbcstoreusername: Database username
mule.cluster.jdbcstorepassword: Database user password
mule.cluster.jdbcstoredriver: JDBC driver class name
mule.cluster.jdbcstorequerystrategy: SQL dialect

Two tables are created per object store: one table stores data, and another stores partitions. An example configuration follows.
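For illustration, a PostgreSQL-backed configuration could look like the following; the URL, credentials, and dialect value are placeholder assumptions, so check the MuleSoft documentation for the exact values for your database:

Properties files
mule.cluster.jdbcstoreurl=jdbc:postgresql://db.internal:5432/mule_objectstore
mule.cluster.jdbcstoreusername=mule
mule.cluster.jdbcstorepassword=changeme
mule.cluster.jdbcstoredriver=org.postgresql.Driver
mule.cluster.jdbcstorequerystrategy=postgresql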
Recommendations

Create a dedicated database/schema that will only be used for the JDBC store.
The database user needs permission to create objects in the database (DDL): CREATE and DROP for tables; and to access and manage the objects it creates (DML): INSERT, UPDATE, DELETE, and SELECT.
Always keep in mind that the data storage needs to be hosted in a centralized DB reachable from all nodes.
Don't use more than one database per cluster.
Some relational databases have constraints regarding the name length of tables. Use the mule.cluster.jdbcstoretableNametransformerstrategy property to transform long table names into shorter values.

The persistent object store uses a database connection pool based on the ComboPooledDataSource Java class. The Mule runtime engine does not set any explicit values for the connection pool behavior; the standard configuration uses the default value for each property. For example, the default value for maxIdleTime is 0, which means that idle connections never expire and are not removed from the pool. Idle connections remain connected to the database in an idle state.

You can configure the connection pool behavior by passing your desired parameter values to the runtime, using either of the following options:

Pass multiple parameters in the command line when starting Mule:

$MULE_HOME/bin/mule start \
-M-Dc3p0.maxIdleTime=<value> \
-M-Dc3p0.maxIdleTimeExcessConnections=<value>

Replace <value> with your desired value in milliseconds.

Add multiple lines to the $MULE_HOME/conf/wrapper.conf file:

Properties files
wrapper.java.additional.<n>=-Dc3p0.maxIdleTime=<value>
wrapper.java.additional.<n>=-Dc3p0.maxIdleTimeExcessConnections=<value>

Replace <n> with the next highest sequential value from the wrapper.conf file.

Conclusion

This is an effort to collate good practices for achieving High Availability (HA) in one place. Mule developers keen on building highly available applications can refer to the exhaustive list above.

Thank you for reading! I hope you find this article helpful. Please don't forget to like and share, and feel free to share your thoughts in the comments section. If you're interested, please go through my previous blogs on best practices:

To build high-performant Mule applications, here.
To build highly reliable Mule applications, here.
In today's digital age, the reliability and availability of software systems are critical to the success of businesses. Downtime or performance issues can have serious consequences, including financial loss and reputational damage. Therefore, it is essential for organizations to ensure that their systems are resilient and can withstand unexpected failures or disruptions. One approach to achieving this is chaos engineering.

What Is Chaos Engineering?

Chaos engineering is a practice that involves intentionally introducing failures or disruptions into a system to test its resilience and identify weaknesses. By simulating real-world failure scenarios, chaos engineering helps organizations proactively identify and address potential issues before they occur in production. This approach can help organizations build more resilient systems, reduce downtime, and improve overall performance.

Steps Involved in Chaos Engineering

The chaos engineering process involves several steps. First, teams must identify the critical components of the system and the potential failure modes that could impact those components. Next, they must design and execute experiments that simulate these failure modes and measure the impact on the system, as the sketch below illustrates. Finally, teams must analyze the results of the experiments and use the insights gained to improve the system's resilience.
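To make the experiment step concrete, here is a minimal, framework-free Java sketch of fault injection around a dependency call; the class name and parameters are hypothetical illustrations, not a specific chaos tool's API:

Java
import java.util.Random;
import java.util.function.Supplier;

// Wraps a dependency call and randomly injects latency or failure,
// so you can observe how the caller copes (retries, timeouts, fallbacks).
public class ChaosWrapper {

    private static final Random RANDOM = new Random();
    private final double failureRate;  // fraction of calls that fail, e.g., 0.1
    private final long maxLatencyMs;   // upper bound for the injected delay

    public ChaosWrapper(double failureRate, long maxLatencyMs) {
        this.failureRate = failureRate;
        this.maxLatencyMs = maxLatencyMs;
    }

    public <T> T call(Supplier<T> target) throws InterruptedException {
        // Simulate a slow dependency with random latency
        Thread.sleep((long) (RANDOM.nextDouble() * maxLatencyMs));
        // Simulate an outright failure for a fraction of the calls
        if (RANDOM.nextDouble() < failureRate) {
            throw new IllegalStateException("Injected failure (chaos experiment)");
        }
        return target.get();
    }
}

Running such a wrapper against a critical call path in a staging environment, while watching dashboards and alerts, is the essence of the experiment-and-measure loop described above.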
Benefits of Chaos Engineering

One of the key benefits of chaos engineering is that it helps organizations identify and address potential issues before they occur in production. By intentionally introducing failures into a system, teams can identify weaknesses and areas for improvement. For example, if an experiment reveals that the system is not resilient to a particular type of failure, the team can take steps to address this weakness and improve overall system resilience.

Another benefit of chaos engineering is that it can help organizations reduce downtime and improve system availability. By identifying and addressing potential issues proactively, teams can prevent unexpected failures and disruptions that could impact system availability. This helps organizations maintain business continuity and avoid financial loss or reputational damage.

How Chaos Engineering Helps Organizations

Chaos engineering can also help organizations improve overall system performance. By testing the system's resilience under different conditions, teams can identify bottlenecks or performance issues, and then optimize their systems accordingly.

To implement chaos engineering, organizations must adopt a culture of experimentation and embrace failure as a learning opportunity. This requires a shift in mindset from one that views failure as a negative outcome to one that recognizes failure as a natural part of the learning process. By embracing failure and learning from it, teams can continuously improve their systems and build more resilient and reliable software.

Conclusion

Chaos engineering is a powerful practice that can help organizations build more resilient systems, reduce downtime, and improve overall performance. By intentionally introducing failures and disruptions into a system, teams can identify weaknesses and areas for improvement and proactively address potential issues before they occur in production. With a commitment to chaos engineering and a culture of experimentation, organizations can build more resilient and reliable software systems that withstand unexpected failures and disruptions.
Network virtualization has been one of the most significant advancements in the field of networking in recent years. It is a technique that allows the creation of multiple virtual networks, each with its own set of policies, services, and security mechanisms, on top of a single physical network infrastructure. Network virtualization helps to optimize network resources, reduce operational costs, and increase flexibility and agility in network deployment and management. In this article, we will delve deeper into the concept of network virtualization, its benefits, and the various technologies and protocols used in its implementation.

What Is Network Virtualization?

Network virtualization is the process of decoupling the network's logical functions from its physical infrastructure to create multiple virtual networks on a shared physical network. The idea is to allow multiple tenants or applications to share the same physical infrastructure while maintaining their own isolated logical networks with dedicated resources and policies. This enables the creation of a highly efficient and flexible network that can meet the needs of different users and applications.

Virtualization has been widely used in the IT industry for many years, primarily in the server and storage domains. Network virtualization extends the same concept to the networking domain, allowing multiple logical networks to be created on top of a single physical network infrastructure. It provides a layer of abstraction that separates the logical network from the physical infrastructure, enabling the logical network to be configured and managed independently of the physical network.

Benefits of Network Virtualization

Network virtualization has several benefits, including:

Resource Optimization

Network virtualization enables the efficient use of network resources by allowing multiple logical networks to share the same physical infrastructure. This reduces the need for dedicated physical networks for each application or user, leading to lower costs and better utilization of resources.

Improved Agility and Flexibility

Network virtualization makes it easier to create, manage, and modify logical networks as the needs of users and applications change. This enables network administrators to respond quickly to changing business requirements and deploy new applications and services more rapidly.

Better Security

Network virtualization provides better security by creating isolated logical networks that can be secured independently. This reduces the risk of security breaches spreading across the entire network and enhances overall network security.

Simplified Network Management

Network virtualization simplifies network management by enabling central management of multiple logical networks. This reduces the need for complex manual configuration and ensures consistency across the network.

Improved Network Scalability and Flexibility

Network virtualization allows organizations to easily scale their network resources up or down to meet changing demands. Virtual networks can be created and configured quickly and easily, without the need for additional physical network devices.

Reduced Network Complexity

Network virtualization simplifies network design and management, reducing the complexity of the underlying physical network infrastructure. This makes it easier to manage and troubleshoot network issues.
Cost Savings

Network virtualization reduces the need for additional physical network devices, leading to lower capital and operational costs.

Approaches to Network Virtualization

There are different approaches to implementing network virtualization, including:

Overlay Networks

Overlay networks create virtual networks on top of the existing physical network infrastructure, using tunneling protocols such as VXLAN or GRE to encapsulate virtual network traffic within the physical network. Overlay networks provide a simple and scalable approach to network virtualization, enabling the creation of virtual networks without the need for additional physical network devices. However, they can introduce additional network latency and may require additional network bandwidth.

VLANs

VLANs provide a simple approach to network segmentation, allowing different virtual networks to be created on a single physical network infrastructure. VLANs use tagging to identify virtual network traffic, enabling network administrators to isolate and secure virtual network traffic. However, VLANs have limitations in terms of scalability and flexibility, as they are limited to a maximum of 4096 VLANs per network. A small configuration sketch of both approaches follows.
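For a concrete feel of the difference between the two approaches, here is a minimal Linux sketch using iproute2: a VLAN sub-interface for tagged segmentation and a VXLAN interface for an overlay tunnel. The interface names, IDs, and addresses are hypothetical examples:

# VLAN: create a tagged sub-interface (VLAN ID 10) on eth0
ip link add link eth0 name eth0.10 type vlan id 10
ip addr add 192.168.10.1/24 dev eth0.10
ip link set eth0.10 up

# VXLAN: create an overlay interface (VNI 100) tunneling to a remote host
ip link add vxlan100 type vxlan id 100 dev eth0 remote 10.0.0.2 dstport 4789
ip addr add 172.16.0.1/24 dev vxlan100
ip link set vxlan100 up

Note that the VXLAN network identifier (VNI) is 24 bits wide, which is what lifts the 4096-segment limit of VLANs mentioned above.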
Software-Defined Networking (SDN)

SDN separates the network control plane from the data plane, enabling network administrators to manage network resources centrally using software-defined controllers. SDN provides a flexible and scalable approach to network virtualization, enabling the creation of virtual networks that can be configured and managed centrally. SDN also enables automated network configuration and management, reducing the complexity of network management and improving network scalability and flexibility.

Network Functions Virtualization (NFV)

NFV enables the creation of network services as virtualized network functions (VNFs) running on generic hardware rather than proprietary network devices. NFV provides a flexible and scalable approach to network services, enabling organizations to deploy and scale functions such as firewalls, load balancers, and routers on demand.

Applications of Network Virtualization

Network virtualization has several applications in different areas, including:

Data Centers

Network virtualization is widely used in data centers to create virtual networks for different applications, departments, and tenants. This enables the efficient utilization of network resources, as well as improved security and isolation between different applications and tenants.

Cloud Computing

Network virtualization is an essential component of cloud computing, as it enables the creation of virtual networks that connect different cloud services, applications, and users. This allows for the efficient sharing of resources, as well as improved security and isolation between different cloud tenants.

Internet Service Providers

Network virtualization is also used by internet service providers to create virtual networks for different customers, departments, and applications. This enables the efficient utilization of network resources, as well as improved security and isolation between different customers and applications.

Telecommunications

Network virtualization is also used in the telecommunications industry to create virtual networks for different services and applications, such as voice, data, and video. This enables the efficient sharing of network resources, as well as improved security and isolation between different services and applications.

Challenges of Network Virtualization

While network virtualization offers several benefits, it also presents some challenges, including:

Complexity

Network virtualization can be complex to set up and manage, requiring specialized skills and knowledge. This can increase the cost and complexity of network operations.

Performance

Network virtualization can impact network performance, especially in terms of latency and throughput. This can affect the user experience and application performance.

Compatibility

Network virtualization may not be compatible with all network hardware and software, requiring specialized hardware and software that support it.

Conclusion

Network virtualization is a revolutionary technology that has transformed the way we manage and utilize network resources. By enabling the creation of multiple virtual networks on top of a single shared physical infrastructure, it increases efficiency, flexibility, and scalability.
In today's fast-paced digital world, application performance has become critical in delivering a seamless user experience. Users expect applications to be lightning-fast and responsive, no matter the complexity of the task at hand. To meet these expectations, developers constantly look for ways to improve their applications' performance. One solution that has gained popularity in recent years is the integration of MicroStream and Redis. By combining these two cutting-edge technologies, developers can create ultrafast applications that deliver better results. In this post, we will explore the benefits of this integration and how developers can get started with this powerful combination.

MicroStream is a high-performance, in-memory persistence engine designed to improve application performance. MicroStream can store data in memory without needing a mapper or conversion process. This means that developers can work with objects directly, without worrying about the mapping process, saving around 90% of the computing power that would otherwise be consumed by mapping.

One of the critical advantages of MicroStream is its speed. By storing data in memory, MicroStream allows faster read and write operations, resulting in improved application performance. MicroStream's data structure is optimized for in-memory storage, enhancing its speed and efficiency. This makes it an ideal solution for applications that require fast response times and high throughput.

Another advantage of MicroStream is its simplicity. With MicroStream, developers can work with objects directly without dealing with the complexities of SQL databases or other traditional persistence solutions. This makes development faster and more efficient, allowing developers to focus on creating great applications instead of struggling with complex data management.

MicroStream's speed, simplicity, and efficiency make it an ideal solution for modern application development. By eliminating the need for a mapper or conversion process, MicroStream saves valuable computing power and resources, resulting in significant cost savings for developers. And with its optimized data structure and in-memory storage capabilities, MicroStream delivers fast and reliable performance, making it a powerful tool for building high-performance applications.

Enough of the theory: let's move to the next section, where we can finally see both technologies working together in an application.

Database Integration

In this article, we will explore the integration of MicroStream and Redis by creating a simple project using Jakarta EE. With the new Jakarta persistence specifications for data and NoSQL, it is now possible to combine the strengths of MicroStream and Redis in a single project. Our project will demonstrate how to use MicroStream as the persistence engine for our data and Redis as a cache for frequently accessed data. Combining these two technologies can create an ultrafast and scalable application that delivers better results.

We will walk through setting up our project, configuring MicroStream and Redis, and integrating them with Jakarta EE. We will also provide tips and best practices for working with these technologies and demonstrate how they can be used to create powerful and efficient applications. Overall, this project serves as a practical example of using MicroStream and Redis together, combined with Jakarta EE, to create high-performance applications.
Whether you are a seasoned developer or just starting, this project will provide valuable insights and knowledge for working with these cutting-edge technologies. The project is a Maven project where the first step is to add the dependencies (besides CDI) for the MicroStream Jakarta Data integration and the Redis connector: XML

<dependency>
    <groupId>expert.os.integration</groupId>
    <artifactId>microstream-jakarta-data</artifactId>
    <version>${microstream.data.version}</version>
</dependency>
<dependency>
    <groupId>one.microstream</groupId>
    <artifactId>microstream-afs-redis</artifactId>
    <version>${microstream.version}</version>
</dependency>

The next step is creating both the entity and the repository; in our scenario, we'll create a Book entity with Library as a repository collection. Java

@Entity
public class Book {

    @Id
    private String isbn;

    @Column("title")
    private String title;

    @Column("year")
    private int year;

    // Constructor and getter used by the sample below
    public Book(String isbn, String title, int year) {
        this.isbn = isbn;
        this.title = title;
        this.year = year;
    }

    public String getTitle() {
        return title;
    }
}

@Repository
public interface Library extends CrudRepository<Book, String> {

    List<Book> findByTitle(String title);
}

The final step before running is to create the Redis configuration, where we'll override the default StorageManager to use the Redis integration, highlighting that MicroStream can integrate with several storage targets such as MongoDB, Hazelcast, SQL databases, etc. Java

@Alternative
@Priority(Interceptor.Priority.APPLICATION)
@ApplicationScoped
class RedisSupplier implements Supplier<StorageManager> {

    private static final String REDIS_PARAMS = "microstream.redis";

    @Override
    @Produces
    @ApplicationScoped
    public StorageManager get() {
        // Read the Redis connection string (e.g., "redis://localhost:6379") from MicroProfile Config
        Config config = ConfigProvider.getConfig();
        String redis = config.getValue(REDIS_PARAMS, String.class);
        BlobStoreFileSystem fileSystem = BlobStoreFileSystem.New(
                RedisConnector.Caching(redis));
        return EmbeddedStorage.start(fileSystem.ensureDirectoryPath("microstream_storage"));
    }

    public void close(@Disposes StorageManager manager) {
        manager.close();
    }
}

Done, we're ready to go! For this sample, we'll use plain Java SE; however, you can do the same with MicroProfile and Jakarta EE microservices. Java

try (SeContainer container = SeContainerInitializer.newInstance().initialize()) {
    Book book = new Book("123", "Effective Java", 2002);
    Book book2 = new Book("1234", "Effective Java", 2019);
    Book book3 = new Book("1235", "Effective Java", 2022);
    Library library = container.select(Library.class).get();
    library.saveAll(List.of(book, book2, book3));
    List<Book> books = library.findByTitle(book.getTitle());
    System.out.println("The books: " + books);
    System.out.println("The size: " + books.size());
}

Conclusion In conclusion, MicroStream integration with multiple databases is a promising approach to designing high-performance data management systems. This project explores various integration techniques to connect MicroStream with databases such as MySQL, MongoDB, Oracle, and PostgreSQL. The system will be designed and implemented using a combination of programming languages such as Java, Python, and JavaScript. The project will also provide documentation, training materials, and benchmark tests to ensure the system meets the specified requirements and delivers user value. By leveraging the power of MicroStream technology and integrating it with different databases, organizations can build robust, scalable, and efficient data management systems that can handle large amounts of data and complex data structures. This approach can give organizations a competitive edge by enabling them to process data faster, make better-informed decisions, and enhance operational efficiency.
Overall, MicroStream integration with multiple databases is a promising approach that can benefit organizations in various industries. With the right design, implementation, and testing, organizations can leverage this approach to build data management systems that meet their unique business needs and drive success. Reference: Source code
Caches are very useful software components that all engineers must know. It is a transversal component that applies to all the tech areas and architecture layers, such as operating systems, data platforms, backend, frontend, and other components. In this article, we are going to describe what a cache is and explain specific use cases focusing on the frontend and client side. What Is a Cache? A cache can be defined in a basic way as an intermediate memory between the data consumer and the data producer that stores and provides data that will be accessed many times by the same or different consumers. It is a layer that is transparent to the data consumer, apart from the improved performance. Usually, the reusability of the data provided by the data producer is the key to taking advantage of the benefits of a cache. Performance is the other reason to use a cache system, such as an in-memory database, to provide a high-performance solution with low latency, high throughput, and concurrency. For example, how many people query the weather on a daily basis, and how many times do they repeat the same query? Let's suppose that there are 1,000 people in New York consulting the weather and 50% repeat the same query twice per day. In this scenario, if we can store the result of the first query as close as possible to the user's device, we achieve two benefits: we improve the user experience because the data is provided faster, and we reduce the number of queries to the data producer/server side. In this example, 500 of the 1,500 daily queries would be served from the cache, cutting server-side traffic by a third. The output is a better user experience and a solution that will support more concurrent users on the platform. At a high level, there are two caching strategies that we can apply in a complementary way: Client/Consumer Side: The cached data is stored on the consumer or user side, usually in the browser's memory when we are talking about web solutions (also called a private cache). Server/Producer Side: The cached data is stored in the components of the data producer architecture. Caches, like any other solution, have a series of advantages that we are going to summarize: Application performance: They provide faster response times because they can serve data more quickly. Reduced load on the server side: When we apply a cache in front of a system and reuse a piece of data, we avoid queries/requests to the following layer. Scalability and cost improvement: As data caching gets closer to the consumer, we increase the scalability and performance of the solution at a lower cost. Components closer to the client side are more scalable and cheaper for three main reasons: These components are focused on performance and availability but offer weaker consistency. They hold only part of the information: the data used most by the users. In the case of the browser's local cache, there is no cost for the data producer. The big challenges of caching are data consistency and data freshness, which means how the data is kept synchronized and up to date across the organization. Depending on the use case, we will have more or fewer restrictions, because caching images is very different from caching inventory stock or sales behavior. Client-Side Caches Speaking about the client-side cache, we can have different types of cache that we are going to analyze a little bit in this article: HTTP Caching: This caching type is an intermediate cache system, as it depends partially on the server. Cache API: This is a browser API that allows us to cache requests in the browser.
Custom Local Cache: The front-end app controls the cache storage, expiration, invalidation, and update. HTTP Caching It caches the HTTP requests for any resource (CSS, HTML, images, video, etc.) in the browser, and the browser manages everything related to storage, expiration, validation, fetching, etc. From the application's point of view, it is almost transparent: the app makes a request in a regular way, and the browser does all the "magic." Caching is controlled by using HTTP headers: on the server side, cache-specific headers are added to the HTTP response, for example: "Expires: Tue, 30 Jul 2023 05:30:22 GMT." The browser then knows this resource can be cached, and the next time the client (application) requests the same resource, if the request time is before the expiration date, the request will not be made, and the browser will return the local copy of the resource. HTTP headers also allow you to set the way responses are distinguished, as the same URL can generate different responses (and their cache should be handled in a different way). For example, in an API endpoint that returns some data (i.e., http://example.com/my-data), we could use the request header Accept to specify whether we want the response in JSON or CSV, etc. Therefore, the cache should be stored along with the request header(s) the response depends on. For that, the server should set a response header such as Vary: Accept to let the browser know the cache depends on that value. There are a lot of different headers to control the cache flow and behavior, but it is not the goal of this article to go deep into them. They will probably be addressed in another article. As we mentioned before, this caching type needs the server to set the resources' expiration, validation, etc. So this is not a pure frontend caching method or type, but it's one of the simplest ways to cache the resources the front-end application uses, and it is complementary to the other methods we will mention down below. Related to this cache type, as it is an intermediate cache, we can even delegate it to a "piece" between the client and the server; for example, a CDN, a reverse proxy (for example, Varnish), etc. Cache API It is quite similar to the HTTP caching method, but in this case, we control which requests are stored in or extracted from the cache. We have to manage the cache expiration ourselves (and it's not easy, because these caches were designed to live "forever"). Even though these APIs are available in windowed contexts, they are very much oriented toward usage in a worker context. This cache is well suited to offline applications. On the first request, we can fetch and cache all the resources needed (images, CSS, JS, etc.), allowing the application to work offline. It is very useful in mobile applications, for example, with the use of maps for our GPS systems in addition to weather data. This allows us to have all the information for our hiking route even if we have no connection to the server. One example of how it works in a windowed context:

const url = 'https://catfact.ninja/breeds'

caches.open('v1').then((cache) => {
  // Look for a cached response for this URL
  cache.match(url).then((res) => {
    if (res) {
      console.log('it is in cache')
      res.json().then((data) => console.log(data))
    } else {
      console.log('it is NOT in cache')
      // Fetch from the network and store a copy under the same key
      fetch(url).then((res) => {
        cache.put(url, res.clone())
      })
    }
  })
})

Custom Local Cache In some cases, we will need more control over the cached data and its invalidation (not just expiration). Cache invalidation is more than just checking the max-age of a cache entry. Imagine the weather app we mentioned above.
This app allows the users to update the weather to reflect the real weather in a place. The app needs to do a request per city and transform the temperature values from ºF to ºC (this is a simple example; calculations can be more expensive in other use cases). To avoid making requests to the server (even if they are cached), we can do all the requests the first time, put all the data together in a data structure convenient for us, and store it, for example, in the browser's IndexedDB, LocalStorage, SessionStorage, or even in memory (not recommended). The next time we want to show the data, we can get it from the cache: not just the resource data but even the computation we did, saving network and computation time. We can control the expiration of the caches by storing the issue time next to the cached data (see the sketch at the end of this article), and we can also control the cache invalidation. Imagine now that the user updates the weather for a city in their browser. We can just invalidate the cache and do the requests and calculations next time, or go further and update our local cache with the new data. Or, another user can change the value, and the server will send an event to notify all clients of the change. For example, using WebSockets, our front-end application can listen for these events and invalidate the cache or just update it. This kind of cache requires work on our side to check the caches, handle events that can invalidate or update them, etc., but it fits very well in a hexagonal architecture where the data is consumed from the API using a port adapter (repository) that can listen for domain events to react to the changes and invalidate or update some caches. This is not a generic cache solution. We need to think about whether it fits our use case, as it requires work on the front-end application side to invalidate the caches or to emit and handle data change events. In most cases, HTTP caching is enough. Conclusion Having a cache solution and a good caching strategy should be a must in any software architecture; without one, our solution will be incomplete and probably not optimized. Caches are our best friends, mostly in high-performance scenarios. It may seem that the technical cache invalidation process is the challenge, but the biggest challenge is to understand the business scenarios and use cases to identify the requirements in terms of data freshness and consistency that allow us to design and choose the best strategy. We will talk about other cache approaches for databases, backend, and in-memory databases in the next articles.
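To make the issue-time pattern mentioned above concrete, here is a minimal, illustrative sketch of such a cache. It is written in Java for consistency with the rest of the report's code examples (in a browser, the store would be IndexedDB or LocalStorage rather than a map); the class and its names are hypothetical, not a library API.

import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A minimal issue-time-based cache: each entry remembers when it was stored,
// and reads past the time-to-live are treated as misses (and evicted lazily).
public class TtlCache<K, V> {

    private record Entry<T>(T value, Instant issuedAt) {}

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final Duration ttl;

    public TtlCache(Duration ttl) {
        this.ttl = ttl;
    }

    public void put(K key, V value) {
        store.put(key, new Entry<>(value, Instant.now()));
    }

    public V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) {
            return null; // miss: the caller fetches and recomputes, then calls put()
        }
        if (Instant.now().isAfter(e.issuedAt().plus(ttl))) {
            store.remove(key); // expired: invalidate lazily on read
            return null;
        }
        return e.value();
    }

    // Explicit invalidation, e.g., triggered by a WebSocket data-change event
    public void invalidate(K key) {
        store.remove(key);
    }
}

The same shape works for the weather example: the key is the city, the value is the already-converted temperature data, and a data-change event from the server simply calls invalidate(city).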
For reference: Check out my previous article where I discuss connection pool high availability, "Connection Pool High Availability With CockroachDB and PgCat." Motivation The load balancer is a core piece of architecture for CockroachDB. Given its importance, I'd like to discuss the methods to overcome single-point-of-failure (SPOF) scenarios. High-Level Steps Start CockroachDB and HAProxy in Docker Run a workload Demonstrate fault tolerance Conclusion Step-By-Step Instructions Start CockroachDB and HAProxy in Docker I have a Docker Compose environment with all of the necessary services here. Primarily, we are adding a second instance of HAProxy and overriding the ports so they do not overlap with the existing load balancer in the base Docker Compose file. I am in the middle of refactoring my repo to remove redundancy and decided to split up my Compose files into a base docker-compose.yml, with any additional services in their own YAML files.

lb2:
  container_name: lb2
  hostname: lb2
  build: haproxy
  ports:
    - "26001:26000"
    - "8082:8080"
    - "8083:8081"
  depends_on:
    - roach-0
    - roach-1
    - roach-2

To follow along, you must start the Compose environment with the command:

docker compose -f docker-compose.yml -f docker-compose-lb-high-availability.yml up -d --build

You will see the following list of services:

✔ Network cockroach-docker_default Created 0.0s
✔ Container client2 Started 0.4s
✔ Container roach-1 Started 0.7s
✔ Container roach-0 Started 0.6s
✔ Container roach-2 Started 0.5s
✔ Container client Started 0.6s
✔ Container init Started 0.9s
✔ Container lb2 Started 1.1s
✔ Container lb Started

The diagram below depicts the entire cluster topology. Run a Workload At this point, we can connect to one of the clients and initialize the workload. I am using tpcc, as it's a good workload to demonstrate write and read traffic.

cockroach workload fixtures import tpcc --warehouses=10 'postgresql://root@lb:26000/tpcc?sslmode=disable'

Then we can start the workload from both client containers. Load Balancer 1:

cockroach workload run tpcc --duration=120m --concurrency=3 --max-rate=1000 --tolerate-errors --warehouses=10 --conns 30 --ramp=1m --workers=100 'postgresql://root@lb:26000/tpcc?sslmode=disable'

Load Balancer 2:

cockroach workload run tpcc --duration=120m --concurrency=3 --max-rate=1000 --tolerate-errors --warehouses=10 --conns 30 --ramp=1m --workers=100 'postgresql://root@lb2:26000/tpcc?sslmode=disable'

You will see output similar to this.

488.0s 0 1.0 2.1 44.0 44.0 44.0 44.0 newOrder
488.0s 0 0.0 0.2 0.0 0.0 0.0 0.0 orderStatus
488.0s 0 2.0 2.1 11.0 16.8 16.8 16.8 payment
488.0s 0 0.0 0.2 0.0 0.0 0.0 0.0 stockLevel
489.0s 0 0.0 0.2 0.0 0.0 0.0 0.0 delivery
489.0s 0 2.0 2.1 15.2 17.8 17.8 17.8 newOrder
489.0s 0 1.0 0.2 5.8 5.8 5.8 5.8 orderStatus

The logs for each instance of HAProxy will show something like this:

192.168.160.1:60584 [27/Apr/2023:14:51:39.927] stats stats/<STATS> 0/0/0 28724 LR 2/2/0/0/0 0/0
192.168.160.1:60584 [27/Apr/2023:14:51:39.927] stats stats/<STATS> 0/0/816 28846 LR 2/2/0/0/0 0/0
192.168.160.1:60584 [27/Apr/2023:14:51:40.744] stats stats/<STATS> 0/0/553 28900 LR 2/2/0/0/0 0/0
192.168.160.1:60584 [27/Apr/2023:14:51:41.297] stats stats/<STATS> 0/0/1545 28898 LR 2/2/0/0/0 0/0
192.168.160.1:60582 [27/Apr/2023:14:51:39.927] stats stats/<NOSRV> -1/-1/61858 0 CR 2/2/0/0/0 0/0

HAProxy exposes a web UI on port 8081. Since we have two instances of HAProxy, I exposed the second instance at port 8083. Demonstrate Fault Tolerance We can now start terminating the HAProxy instances to demonstrate failure tolerance.
Let's start with instance 1.

docker kill lb
lb

The workload will start producing error messages.

7 17:41:18.758669 357 workload/pgx_helpers.go:79 [-] 60 + RETURNING d_tax, d_next_o_id]
W230427 17:41:18.758737 357 workload/pgx_helpers.go:123 [-] 61 error preparing statement. name=new-order-1 sql=
W230427 17:41:18.758737 357 workload/pgx_helpers.go:123 [-] 61 + UPDATE district
W230427 17:41:18.758737 357 workload/pgx_helpers.go:123 [-] 61 + SET d_next_o_id = d_next_o_id + 1
W230427 17:41:18.758737 357 workload/pgx_helpers.go:123 [-] 61 + WHERE d_w_id = $1 AND d_id = $2
W230427 17:41:18.758737 357 workload/pgx_helpers.go:123 [-] 61 + RETURNING d_tax, d_next_o_id unexpected EOF
142.0s 3 0.0 0.2 0.0 0.0 0.0 0.0 delivery
142.0s 3 0.0 2.2 0.0 0.0 0.0 0.0 newOrder
142.0s 3 0.0 0.2 0.0 0.0 0.0 0.0 orderStatus
142.0s 3 0.0 2.2 0.0 0.0 0.0 0.0 payment

Our workload is still running using the HAProxy 2 connection. Let's bring instance 1 back up:

docker start lb

Notice the client reconnects and continues with the workload.

335.0s 1780 0.0 0.1 0.0 0.0 0.0 0.0 stockLevel
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
336.0s 1780 0.0 0.1 0.0 0.0 0.0 0.0 delivery
336.0s 1780 7.0 1.1 19.9 27.3 27.3 27.3 newOrder
336.0s 1780 0.0 0.1 0.0 0.0 0.0 0.0 orderStatus
336.0s 1780 2.0 1.0 10.5 11.0 11.0 11.0 payment
336.0s 1780 0.0 0.1 0.0 0.0 0.0 0.0 stockLevel
337.0s 1780 0.0 0.1 0.0 0.0 0.0 0.0 delivery
337.0s 1780 7.0 1.1 21.0 32.5 32.5 32.5 ne

The number of executed statements goes up once the second client successfully reconnects. We can now do the same with the second instance. Similarly, the workload reports errors that it can't find the lb2 host.

0.0 0.2 0.0 0.0 0.0 0.0 stockLevel
I230427 17:48:28.239032 403 workload/pgx_helpers.go:79 [-] 188 pgx logger [error]: connect failed logParams=map[err:lookup lb2 on 127.0.0.11:53: no such host]
I230427 17:48:28.267355 357 workload/pgx_helpers.go:79 [-] 189 pgx logger [error]: connect failed logParams=map[err:lookup lb2 on 127.0.0.11:53: no such host]

And we can observe the dip in the statement count. We can bring it back up:

docker start lb2

One thing we can improve on is starting the workload with both connection strings. This will allow each client to fall back to the other pgurl even when one of the HAProxy instances is down. What we have to do is stop both clients and restart them with both connection strings.

cockroach workload run tpcc --duration=120m --concurrency=3 --max-rate=1000 --tolerate-errors --warehouses=10 --conns 30 --ramp=1m --workers=100 'postgresql://root@lb:26000/tpcc?sslmode=disable' 'postgresql://root@lb2:26000/tpcc?sslmode=disable'

I am going to do that one client at a time so that the workload does not exit completely. At no point in this experiment have we lost the ability to read/write to and from the cluster. Let's shut down one of the HAProxy instances again and see the impact.

docker kill lb
lb

I'm now seeing errors across both clients, but both clients are still executing.
.817268 1 workload/cli/run.go:548 [-] 85 error in stockLevel: lookup lb on 127.0.0.11:53: no such host
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
156.0s 49 0.0 0.2 0.0 0.0 0.0 0.0 delivery
156.0s 49 1.0 2.1 31.5 31.5 31.5 31.5 newOrder
156.0s 49 0.0 0.2 0.0 0.0 0.0 0.0 orderStatus
156.0s 49 1.0 2.0 12.1 12.1 12.1 12.1 payment
156.0s 49 0.0 0.2 0.0 0.0 0.0 0.0 stockLevel
I230427 17:55:58.558209 354 workload/pgx_helpers.go:79 [-] 86 pgx logger [error]: connect failed logParams=map[err:lookup lb on 127.0.0.11:53: no such host]
I230427 17:55:58.698731 346 workload/pgx_helpers.go:79 [-] 87 pgx logger [error]: connect failed logParams=map[err:lookup lb on 127.0.0.11:53: no such host]
I230427 17:55:58.723643 386 workload/pgx_helpers.go:79 [-] 88 pgx logger [error]: connect failed logParams=map[err:lookup lb on 127.0.0.11:53: no such host]
I230427 17:55:58.726639 370 workload/pgx_helpers.go:79 [-] 89 pgx logger [error]: connect failed logParams=map[err:lookup lb on 127.0.0.11:53: no such host]
I230427 17:55:58.789717 364 workload/pgx_helpers.go:79 [-] 90 pgx logger [error]: connect failed logParams=map[err:lookup lb on 127.0.0.11:53: no such host]
I230427 17:55:58.841283 418 workload/pgx_helpers.go:79 [-] 91 pgx logger [error]: connect failed logParams=map[err:lookup lb on 127.0.0.11:53: no such host]

We can bring it back up and watch the workload recover. Conclusion Throughout the experiment, we've not lost the ability to read and write to the database. There were dips in traffic, but that's expected. The lesson here is to provide a highly available configuration in which clients can fall back across multiple connections; the sketch below shows the same idea for an ordinary application client.
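For completeness, here is a minimal, hypothetical Java sketch of that idea for application code, assuming the PostgreSQL JDBC driver (CockroachDB speaks the PostgreSQL wire protocol) and the two HAProxy hostnames used above. The driver's multi-host URL syntax tries each host in turn until one accepts the connection; this is an illustration, not part of the cockroach workload tooling.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FailoverClient {

    public static void main(String[] args) throws Exception {
        // Both HAProxy instances in one URL: the driver tries lb first
        // and falls back to lb2 if the first connection attempt fails.
        String url = "jdbc:postgresql://lb:26000,lb2:26000/tpcc?sslmode=disable&user=root";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT count(*) FROM warehouse")) {
            if (rs.next()) {
                System.out.println("warehouses: " + rs.getLong(1));
            }
        }
    }
}

Note that the fallback applies when a connection is established or re-established; connections already open against a killed load balancer will still fail and must be retried, which is what the --tolerate-errors flag absorbs in the workload runs above.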
In this series on simulating and troubleshooting performance problems in Scala, let's discuss how to make threads go into a blocked state. A thread will enter a blocked state when it cannot acquire a lock on an object because another thread already holds the lock on the same object and doesn't release it. Scala Blocked Thread Program Here is a sample program, which would make threads go into a blocked state.

package com.yc

class BlockedApp {
}

object BlockedApp {

  def main(args: Array[String]): Unit = {
    for (counter <- 1 to 10) {
      new AppThread().start()
    }
  }

  def action(): Unit = {
    this.synchronized {
      while (true)
        Thread.sleep(600000)
    }
  }

  class AppThread extends Thread {
    override def run(): Unit = BlockedApp.action()
  }
}

The sample program contains the BlockedApp object. In its main() method, 10 new threads are created and started. In the AppThread class, there is a run() method that invokes BlockedApp.action(). In this BlockedApp.action() method, the thread is put to continuous sleep (i.e., the thread repeatedly sleeps for 10 minutes at a time). But if you notice, the body of the action() method is wrapped in a synchronized block, which can be executed by only one thread at a time. If any other thread tries to execute the action() method while the previous thread is still working on it, then the new thread will be put in the blocked state. In this case, 10 threads are launched to execute the action() method. However, only one thread will acquire the lock and execute this method, and the remaining 9 threads will be put in a blocked state. NOTE: If threads are in a BLOCKED state for a prolonged period, then the application may become unresponsive. How To Diagnose Blocked Threads You can diagnose blocked threads either through a manual or an automated approach. Manual Approach In the manual approach, you need to capture thread dumps as the first step. A thread dump shows all the threads that are in memory and their code execution paths. You can capture a thread dump using one of the 8 options mentioned here. An important criterion, however, is that you need to capture the thread dump right when the problem is happening (which might be tricky to do). Once the thread dump is captured, you need to manually transfer the thread dump from your production servers to your local machine and analyze it using thread dump analysis tools like fastThread or samurai. Automated Approach On the other hand, you can also use the yCrash open-source script, which captures 360-degree data (GC log, 3 snapshots of thread dump, heap dump, netstat, iostat, vmstat, top, top -H, etc.) right when the problem surfaces in the application stack and analyzes it instantly to generate a root cause analysis report. We used the automated approach. Below is the root cause analysis report generated by the yCrash tool, highlighting the source of the problem. yCrash reporting the transitive dependency graph of 9 BLOCKED threads yCrash prints a transitive dependency graph that shows which threads are getting blocked and who is blocking them. In this transitive graph, you can see 'Thread-0' blocking 9 other threads. If you click on the thread names in the graph, you can see the stack trace of that particular thread. yCrash reporting the stack trace of 9 threads that are in a BLOCKED state Here is the screenshot that shows the stack trace of the 9 threads which are in a blocked state. From the stack trace, you can observe that the threads are stuck on the 'com.yc.BlockedApp$.action(BlockedApp.scala:17)' method.
Equipped with this information, one can easily identify the root cause of the blocked threads; for a complementary programmatic check, see the sketch below. Video To see the visual walk-through of this post, check out the video accompanying the original article.
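As a complement to thread dumps, the JVM's management API can flag blocked threads programmatically. Here is a minimal, illustrative Java sketch using the standard java.lang.management API (not part of the yCrash tooling); since Scala runs on the JVM, it works just as well against a program like BlockedApp when run in the same JVM.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedThreadReporter {

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Dump all threads, including the monitors and synchronizers they hold or wait on.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                System.out.printf("%s is BLOCKED on %s held by %s%n",
                        info.getThreadName(), info.getLockName(), info.getLockOwnerName());
            }
        }
    }
}

Run from a monitoring thread inside the same JVM as BlockedApp, this would report the 9 threads blocked on the lock held by 'Thread-0'.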
CPU isolation and efficient system management are critical for any application that requires low-latency, high-performance computing. These measures are especially important for high-frequency trading systems, where split-second decisions on buying and selling stocks must be made. To achieve this level of performance, such systems require dedicated CPU cores that are free from interruptions by other processes, together with wider system tuning. In modern production environments, there are numerous hardware and software hooks that can be adjusted to improve latency and throughput. However, finding the optimal settings for a system can be challenging, as it requires navigating a multidimensional search space. To accomplish this efficiently, it is necessary to understand the tuning landscape and to use tools and strategies that facilitate effective changes. Moreover, managing Java processes can be more difficult due to the number of auxiliary threads which are spawned by the JVM, even for logically single-threaded applications. The scheduling of these threads is critical to minimizing jitter and achieving optimal performance. In this article, we will explore the strengths and weaknesses of the standard solutions for controlling CPU isolation for low-latency applications under Linux, and how we at Chronicle Software developed Chronicle Tune to address the inherent trade-offs of these solutions. Using the isolcpus Linux Configuration isolcpus is a Linux boot command-line option that allows an explicit list of CPUs to be excluded from consideration by the Linux scheduler. This option provides very effective isolation; however, the problem is that it does not respect CPU ranges. For example, when you use taskset or sched_setaffinity to specify a range of allowed CPUs for a pinned process, only the first CPU in the allowed range is utilized, regardless of the number of threads in the process. Thus, when controlling thread placement under isolcpus, every thread requires explicit management, and particular care must be taken to avoid scheduling conflicts from auxiliary and/or child threads. Another major disadvantage of isolcpus is that the configuration is fixed once a host has started, so changes to the configuration require a reboot of the system. Using Cgroups or Csets An alternative that provides a similar level of isolation and allows for dynamically changing configuration is Linux cgroups. cgroups is a feature in the Linux kernel that enables administrators to limit, allocate, and prioritize system resources such as CPU, memory, disk I/O, and network bandwidth among processes or groups of processes. This can help prevent one application from monopolizing system resources, resulting in poor performance or instability. cset is a utility that is used specifically to manage CPU affinity and placement for groups of tasks. By defining csets, administrators can assign specific CPUs or CPU cores to particular tasks or groups of tasks, ensuring that those tasks have dedicated CPU resources and minimizing interference from other tasks. This can be especially useful in high-performance computing environments, where minimizing contention and maximizing performance is critical. Both cgroups and csets enable specific cpuset groupings to be defined, with processes confined to run within one particular group. Figure 1. A comparison of how threads can be managed with isolcpus and cgroups. isolcpus allows the management of individual threads but prevents the use of flexible CPU groups.
One drawback is that cgroups are primarily designed to work at the process level and are a less natural tool to use when the targeted control of individual threads is important. As touched on earlier, this can be a particular problem for Java applications, given the relatively large number of auxiliary threads started by the JVM. Even though many of these threads only run occasionally, they can still generate enough jitter to impact the high percentiles of any latency-sensitive application in the same group. A further complication when using cgroups is the absence of support from standard calls like taskset and sched_setaffinity, making it more challenging to combine cgroups with low-level libraries: moving processes between groups requires the use of specialized calls. Making the Procedure Better and Automatic Since there are drawbacks with both isolcpus and cgroups/csets, plus they can be time-consuming to configure, we developed software to make tuning and managing a system simpler and more transparent. Chronicle Tune blends features from isolcpus and cgroups/csets together with bespoke functionality to simplify CPU and system tuning, allowing changes to be applied dynamically without the need for reboots. Chronicle Tune can be especially useful for Java applications, where careful separation and control of application and background threads is essential for achieving the best performance. Chronicle Tune facilitates optimal process placement and control, helping to ensure fewer and shorter interrupts and allowing threads to be dynamically migrated. Figure 2. Comparison of Chronicle Tune, isolcpus, and cgroups How Much Could Tuning Improve Performance? For businesses seeking to improve their performance down to the nanosecond, it's crucial to understand how much of a difference tuning can make. To this end, we conducted a practical test to evaluate the impact of Chronicle Tune on the performance of Chronicle Queue. Specifically, we measured the write-to-read latency for 256-byte messages at a rate of 100,000 messages per second. Our testing shows that while Chronicle Tune had an effect even at lower percentiles, from around the 99.9th percentile onwards, the benefits of significantly reduced jitter became increasingly apparent, showing the machine running in a much cleaner, more optimal configuration. Figure 3. Write-to-read latency of Chronicle Queue exchanging 256-byte messages @ 100k msgs/s. Can Shrink-Wrapped Software Be as Efficient as Manual Tuning? Using ready-made software is certainly more convenient than tuning manually. So how effective is Chronicle Tune in comparison? To investigate this, we have a tool that measures the jitter experienced by a spinning, pinned thread (closely representing a typical latency-sensitive application thread), and Figure 4 below shows the results of a comparison between isolcpus and Chronicle Tune. This plot shows that while the total number of jitter events is slightly lower for isolcpus (as might be expected, given isolcpus integrates directly with the scheduler), the worst outliers are, in fact, lower with Chronicle Tune (6 µs vs. 14 µs), on account of the additional system tuning Chronicle Tune applies beyond just CPU isolation. Chronicle Tune achieves this with a simple, transparent configuration, which can be adjusted without the need for reboots. Figure 4. Average number of delays per hour, grouped by length of the delay. The statistics were gathered during a 91-second jitter test run.
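Returning to the earlier point about controlling individual Java threads: whatever isolation mechanism is in place, the application still has to place its critical threads onto the reserved CPUs. As a minimal illustration (this uses the open-source OpenHFT Java-Thread-Affinity library, not Chronicle Tune itself, and the reserved-CPU configuration mentioned is an assumption for the example), a latency-critical thread can pin itself to an isolated core like this:

import net.openhft.affinity.AffinityLock;

public class PinnedWorker {

    public static void main(String[] args) {
        // Acquire a CPU from the set the library is allowed to hand out
        // (configurable, e.g., via the affinity.reserved system property).
        AffinityLock lock = AffinityLock.acquireLock();
        try {
            System.out.println("Running pinned on CPU " + lock.cpuId());
            // ... latency-critical busy-spin work here ...
        } finally {
            lock.release(); // return the CPU so other threads can use it
        }
    }
}

Note that this addresses thread placement only; the wider system tuning (IRQ steering, kernel settings, and so on) that Chronicle Tune automates still needs to be handled separately.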
Conclusion The standard solutions for controlling CPU isolation for low-latency applications under Linux are isolcpus and cgroups/csets. However, they each have their downsides and can be awkward to use. Chronicle Tune simplifies the process of system tuning: it manages low-latency, low-jitter tasks, schedules threads separately from processes, adjusts allocations dynamically at runtime, manages interrupt requests (IRQs) efficiently, and performs whole-system optimization covering SSD, disk, memory, and network. All of this is achieved using a simple, transparent configuration that can be adjusted dynamically, without the need for a reboot to take effect.
Welcome back to this series all about file uploads for the web. In the previous posts, we covered things we had to do to upload files on the front end, things we had to do on the back end, and optimizing costs by moving file uploads to object storage.

Upload files with HTML
Upload files with JavaScript
Receive uploads in Node.js (Nuxt.js)
Optimize storage costs with Object Storage
Optimize performance with a CDN
Secure uploads with malware scans

Today, we'll do more architectural work, but this time it'll be focused on optimizing performance. Recap of Object Storage Solution By now, we should have an application that stores uploaded files somewhere in the world. In my case, it's an Object Storage bucket from Akamai cloud computing services, and it lives in the us-southeast-1 region. So when I upload a cute photo of Nugget making a big ol' yawn, I can access it at austins-bucket.us-southeast-1.linodeobjects.com/files/nugget.jpg. Nugget is a super cute dog. Naturally, a lot of people are going to want to see this. Unfortunately, this photo is hosted in the us-southeast-1 region, so anyone living far away from that region has to wait longer before their eyes can feast on this beast. Latency sucks. And that's why CDNs exist. What Is a CDN? CDN stands for "content delivery network," and it's a connected network of computers that are globally distributed and can store copies of the same files so that when a user makes a request for a specific file, it can be served from the computer nearest to the user. By using a CDN, the distance a request must travel is reduced, thereby resolving requests faster, regardless of a user's location. Here's a WebPageTest result for that photo of Nugget. The request was made from their servers in Japan, and it took 1.1 seconds for the request to complete. Instead of serving the file directly from my Object Storage bucket, I can set up a CDN in front of my application to cache the photo all over the world. So users in Tokyo will get the same photo but served from their nearest CDN location (which is probably in Tokyo), and users in Toronto are going to get that same file but served from their nearest CDN location (which is probably in Toronto). This can have significant performance implications. Let's look at that same request, but served from behind a CDN. The new WebPageTest results still show the same photo of Nugget, and the request still originated from Tokyo, but this time it only took 0.2 seconds: a fraction of the time! When the request is made for this image, the CDN checks whether it already has a cached version. If it does, it can respond immediately. If it doesn't, it can go fetch the original file from Object Storage, then save a cached version for any future requests. Note: the numbers reported above are from a single test. They may vary depending on network conditions. The Compounding Returns of CDNs The example above focused on improving the delivery speeds of uploaded files. In that context, I was only dealing with a single image that is uploaded to an Object Storage bucket. It shows almost a full-second improvement in response times, which is great, but things get even better when you consider other types of assets. CDNs are great for any static asset (CSS, JavaScript, fonts, images, icons, etc.), and by putting one in front of my application, all the other static files automatically get cached as well. This includes the files that Nuxt.js generates in the build process and which are hosted on the application server.
This is especially relevant when you consider the "critical rendering path" and render-blocking resources like CSS, JavaScript, or fonts. When a webpage loads, as the browser comes across a render-blocking resource, it will pause parsing and go download the resource before it continues (hence "render-blocking"). So any latency that affects a single asset may also impact the performance of other assets further down the network cascade. This means the performance improvements from a CDN are compounding. Nice! So is this about showing cute photos of my dog to more people even faster, or is it about helping you make your applications run faster? YES! Whatever motivates you to build faster websites, including a CDN as part of your application infrastructure is a crucial step if you plan on serving customers from more than one region. Connect Akamai CDN to Object Storage I want to share how I set up Akamai with Object Storage because I didn't find much information on the subject, and I'd like to help anyone that's looking for a solution. If it doesn't apply to your use case, feel free to skip this section. Akamai is the largest CDN provider in the world, with something like 300,000 servers across 4,000 locations. It's used by some of the largest companies in the world, but most enterprise clients don't like sharing which tools they use, so it's hard to find Akamai-related content. (Note: You will need an Akamai account and access to your DNS editor.) In the Akamai Control Center, I created a new property using the Ion Standard product, which is great for general-purpose CDN delivery. After clicking Create Property, you'll be prompted to choose whether to use the setup wizard to guide you through creating the property, or you can go straight to the Property Manager settings for the new property. I chose the latter. In the Property Manager, I had to add a new hostname in the Property Hostnames section. I added the hostname for my application. This is the URL where users will find your application. In my case, it was "uploader.austingil.com". Part of this process also requires setting up an SSL certificate for the hostname. I left the default value selected for Enhanced TLS. With all that set up, Akamai will show me the following Property Hostname and Edge Hostname. We'll come back to these later when it's time to make DNS changes.

Property Hostname: uploader.austingil.com
Edge Hostname: uploader.austingil.com-v2.edgekey.net

Next, I had to set up the actual property's behavior, which meant editing the Default Rule under the Property Configuration Settings. Specifically, I had to point the Origin Server Hostname to the domain where my origin server will live. In my DNS, I created a new A record pointing origin-uploader.austingil.com to my origin server's IP address, then added a CNAME record that points uploader.austingil.com to the Edge Hostname provided by Akamai.

A: origin-uploader.austingil.com -> origin server IP
CNAME: uploader.austingil.com -> uploader.austingil.com-v2.edgekey.net

This lets me build out my CDN configuration and test it as needed, only sending traffic through the CDN when I'm ready. Finally, to serve files in my Object Storage instance through Akamai, I created a new rule based on the blank rule template. I set the rule criteria to apply to all requests going to the /files/* sub-route. The rule behavior is set up to rewrite the request's Origin Server Hostname and change it to my Object Storage location: npm.us-southeast-1.linodeobjects.com.
This way, any request that goes to uploader.austingil.com/files/nugget.jpeg is served through the CDN, but the file originates from the Object Storage location. And when you load the application, all the static assets generated by Nuxt are served from the CDN as well. All other requests are passed through Akamai and forwarded to origin-uploader.austingil.com, which points to the origin server. So that’s how I’ve configured Akamai CDN to sit in front of my application. Hopefully, it all made sense, but if you have questions, feel free to ask me. To Sum Up Today we looked at what a CDN is, the role it plays in reducing network latency, and how to set up Akamai CDN with Object Storage. But this is just the tip of the iceberg. There’s a whole world of tweaking CDN configuration to get even more performance. There are also a lot of other performance and security features a CDN can offer beyond just static file caching: web application firewalls, faster network path resolution, DDoS protection, bot mitigation, edge compute, automated image and video optimization, malware scanning, request security headers, and more. My colleague, Mike Elissen, also covers some great security topics on his blog. The most important thing that I wanted to convey today is that using a CDN improves file delivery performance by caching content close to the user. I hope you’re enjoying the series so far and plan on sticking around until the end. We’ll continue next time by looking at ways to protect our servers from malicious file uploads. Thank you so much for reading. If you liked this article, and want to support me, the best ways to do so are to share it and follow me on Twitter.
Joana Carvalho
Performance Engineer,
Postman
Greg Leffler
Observability Practitioner, Director,
Splunk
Ted Young
Director of Open Source Development,
LightStep
Eric D. Schabell
Director Technical Marketing & Evangelism,
Chronosphere