JULY 2023 | CIOReview | 9

of our cloud spend. Because we did not want to break our existing infrastructure, monitoring was the next critical step. We installed monitoring and log ingestion tools on these servers to pull all key metrics and track what was occurring on them. One tip: be sure to check for cron jobs and open ports, which will help scope out each server's usage. Based on our findings, we detailed what was running on each machine, how it was being utilized and what resources it was consuming.

We had instances running at under 0.1 percent CPU utilization and targeted them first. We were able to quickly pause those instances, resize them and bring them back online. We moved CPU utilization closer to 5 percent, which cut our spend significantly. Our 5 percent target was arbitrary; the right number really depends on your peaks in usage. Next, it is important to have a production-like test environment where you can try different instance types and find the best utilization of resources without impacting performance. If you have a job that runs only once a day or once a week and spikes CPU significantly, consider moving it to a Lambda function to save further. Instead of paying for idle CPU throughout the day, you will only incur compute costs when the job runs.

Reducing Risk

It is always concerning to shut off a server, especially when you are not entirely sure of everything it is doing. However, in most cases the savings outweigh the risk. There are a few additional things you can do to further reduce the risk of an outage. Instead of deleting an instance, you can shut it down. Then, if you determine it is needed, you can simply start it back up. If that seems too risky, you can limit the instance's access via its firewall so the instance is never shut down, only temporarily unreachable. For our instances, once I felt confident that a server was not in use, we would spin it down for a week before terminating it.
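The triage and staged shutdown described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling we used: the `InstanceUsage` record, the `triage` and `next_step` functions and the exact decision rules are hypothetical, and the sketch makes no cloud API calls. Only the 0.1 percent and 5 percent thresholds and the one-week waiting period come from the article.

```python
from dataclasses import dataclass
from datetime import date, timedelta

IDLE_CPU_PERCENT = 0.1          # from the article: instances below this were targeted first
TARGET_CPU_PERCENT = 5.0        # from the article: an arbitrary target, tune it to your peaks
QUARANTINE = timedelta(days=7)  # from the article: stopped for a week before terminating


@dataclass
class InstanceUsage:
    """Hypothetical summary of what monitoring found for one instance."""
    name: str
    avg_cpu_percent: float
    has_cron_jobs: bool
    has_open_ports: bool


def triage(usage: InstanceUsage) -> str:
    """Classify an instance as 'stop', 'resize' or 'keep' (assumed rules)."""
    if usage.avg_cpu_percent >= TARGET_CPU_PERCENT:
        return "keep"
    looks_unused = not (usage.has_cron_jobs or usage.has_open_ports)
    if usage.avg_cpu_percent < IDLE_CPU_PERCENT and looks_unused:
        return "stop"    # stop rather than delete, so it can be restarted
    return "resize"      # still doing something: shrink it instead


def next_step(stopped_on: date, today: date, still_needed: bool) -> str:
    """Decide what to do with an instance that was stopped on `stopped_on`."""
    if still_needed:
        return "start"
    if today - stopped_on >= QUARANTINE:
        return "terminate"
    return "wait"
```

For example, an instance averaging 0.05 percent CPU with no cron jobs and no open ports is classified as "stop", while the same instance with a cron job is classified as "resize". An instance stopped on July 1 gets "wait" on July 5 and "terminate" on July 8, unless someone reports that it is still needed.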
Cloud Cleanup

Based on most of the implementations I have seen over the years, organizations have plenty of room to cut spend in their cloud environment without reducing quality. In fact, these types of exercises typically uncover poor design patterns, and fixing them ultimately allows for a better cloud architecture. In Lolli & Pops' case, we had four separate servers running on one virtual machine: Payara, Tomcat, NiFi and MySQL. We were able to both reduce costs and isolate these four servers for better uptime and redundancy. I would like to note that you should never run a database on a standalone VM. Use a redundant managed database service like RDS or Cloud SQL instead.

While you may not see an 85 percent cloud cost reduction like Lolli & Pops did, you will be surprised at what you can save without moving away from the cloud. If the task seems too daunting, hiring an outside cloud performance agency is always an option. Good luck on the next leg of your cloud journey.