Case Study: How Flatiron Health Gained Visibility and Control Over Total Platform Costs

The Zen of Total Platform Engineering Cost Management

Jeff Harris

In the fast-paced world of platform engineering, managing costs effectively can feel like a constant battle. You can tame the battle and achieve a well-managed budget and forecasting process for your total engineering platform by applying some of the concepts of mindfulness and flow to your team and cross-departmental processes. This webinar explores how the power of mindfulness and flow can be applied to your platform engineering processes to empower team productivity and foster a culture of cost optimization.

Key Takeaways:

  • Understand the fundamental principles of total platform engineering cost management and its significance in today’s technological landscape.
  • Mindfulness for Cost Visibility: Discover how cultivating mindfulness can enhance cost awareness within your team. Learn practical techniques for visibility and engagement to identify areas for cost optimization without feeling overwhelmed.
  • Flow State for Effective Cost Management: Explore the concept of flow and how creating a state of focused attention and engagement can lead to more efficient and strategic cost management decisions.
  • Actionable Strategies: Gain practical tips and actionable strategies to implement mindfulness and flow practices within your team, fostering a collaborative and productive environment for managing platform engineering costs effectively.

Who Should Watch:

  • Platform engineers and architects
  • Cost optimization leads
  • Cloud infrastructure specialists
  • Anyone seeking to improve team productivity and cost management in the realm of platform engineering.

Watch today and unlock the secrets of zen-like cost management! By harnessing the power of mindfulness and flow for total platform cost management, you can empower your team to achieve greater efficiency, productivity, and cost savings.

Length: 25 minutes

Webinar Transcript

Introduction

It looks like everybody’s getting into the room now. Awesome. Thanks, y’all. Apologize for the short delay there. Give everybody just a few minutes here before we get started.

Today we’ll be talking about the Zen of total platform engineering cost management. So, having a little fun with it today, by taking some of the ideas of mindfulness and flow and seeing how we can integrate those into our daily work around financial management of the cloud. Really even looking at how we can enable our teams to stay in their flow state while still being able to get access to the cost data. So, I’m excited for the conversation today.

While we’re waiting here for others to join, just take a moment and get yourself into the mindset of the conversation. We’re going to have maybe a few deep breaths to get yourselves zenned out, mindful, finish up your emails, put your phones away. We’ll go ahead and get kicked off.

State of Platform Engineering

So, before we do that, I’ll give you a little background on myself. My name is Jeff Harris. I’m currently Director of Strategy and Operations at Yotascale. I help a lot on the product side, on the customer side, really focusing on problems that companies in the market are trying to solve today and helping make sure we’re building a product that can assist in solving those problems in a well-thought-out and user-friendly way.

We’ll have some time for Q&A at the end, but feel free to submit questions into the chat and I’ll be keeping an eye on those as well as we go through here.

Great, let’s get started. So, where we’re going to start our conversation today is sort of taking an assessment of where we’re at. For those of you amateur mindfulness practitioners out there like me, that usually starts by taking in your senses and your surroundings and taking stock of what the problem is.

Here we’re starting with one of the issues in the current state of platform engineering. We’re seeing just an intense proliferation of SaaS tools, self-service SaaS tools. It may seem like a great idea for those of us who are building products, right? We want these to be self-service platforms. We don’t want to have to invest in a lot of hand-holding and management and making sure people get what they need out of the product. But that does tend to put a lot of the onus on you, the user, the team that’s administering the product.

So, we’re seeing a steady increase, and even from 2023 to 2024, the results of this survey show an incredible increase in the number of self-service platforms that are used within an organization. This could be Snowflake, which your developers are going in and using. It could be AWS, it could be Datadog. It could be any of these modern SaaS usage-based tools that exist out there. We’re now expecting our users and yourselves to implement these tools and then to use them and adopt them to get value out of them.

That becomes a challenge, right? You can’t be mindful of 14 different tools. You can’t be an expert on 14 different tools. So, we’ll start here with just this problem that we see in the industry today and we’re going to talk through a little bit about how the complexity scales as a company scales their platform engineering teams. Especially when you’re talking about cloud cost, really starting with all of these different platforms that exist out there.

Platform Engineering Cost Complexity Scaling

Imagine that you’re trying to track the cost of an application or how much a team is spending, and you see that there are multiple different cloud products that these teams are using, in some cases even multiple providers. The first step in this sort of mindfulness exercise is getting all of that information together. How do you sync those platforms with your organization as it changes? If you have new cost centers, teams, services, research projects spinning up, how do you track the cost of those business-oriented objects?

You have to be able to allocate that cost ownership, and this should be one of the first things that you try to do. Even if you don’t have large teams, you may want to start by just saying, “OK, what services are we running? What products are we running? What customers are we serving? How do I look at costs through these different lenses?”
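As a minimal sketch of looking at the same cost through different lenses, assume each cost record has already been labeled with a team, a service, and a customer. The record fields and values below are made up for illustration and do not reflect any particular billing schema or product.

```python
from collections import defaultdict

# Hypothetical cost records; field names are illustrative, not a real billing schema.
records = [
    {"provider": "aws", "service": "checkout", "team": "payments", "customer": "acme", "cost": 120.0},
    {"provider": "aws", "service": "search", "team": "discovery", "customer": "acme", "cost": 80.0},
    {"provider": "snowflake", "service": "checkout", "team": "payments", "customer": "globex", "cost": 50.0},
]


def cost_by(records, lens):
    """Sum cost per value of the chosen dimension (the "lens")."""
    totals = defaultdict(float)
    for record in records:
        # Records missing the dimension are kept visible as "unallocated".
        totals[record.get(lens, "unallocated")] += record["cost"]
    return dict(totals)


by_team = cost_by(records, "team")
by_service = cost_by(records, "service")
by_customer = cost_by(records, "customer")
```

Every lens sums to the same total, so the views differ only in how the spend is grouped, not in how much spend there is.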

Moving on up, as we scale, we start to think about, “OK, now we have a sense of where we’re spending our money. What is it being spent on?” Now it’s a better time to start thinking about how do we optimize it. You’ve probably had new services come in at this point, services leave. How do you manage this ever-changing platform environment that your teams are operating and managing?

As you think about optimizing those service costs, each of these vendors has different methods and ways that you can optimize cost. So, you need to now become an expert in each of those different products and services. You need to be able to take the recommendations that you’re finding, optimization opportunities that you’re finding, and then share them back to the team so that they can take action on them.

For somebody in this process, it’s a lot of work, and oftentimes it’s a lot of work for everybody in the process: the team that’s trying to identify the optimization opportunities as well as the teams that you’re trying to give them to so they can take action on them. Really organizing that information and bringing it all into a single platform can help you be mindful of the opportunities that are out there.

As you deliver that out to your end users, we’ll talk about how we can help keep them in flow as well. We move up the scale problem. You’re starting to optimize services. You’ve got an idea where you’re spending your money. Now you can start thinking about setting budgets, looking at forecasting. What do the users need from your FinOps platform or your FinOps practice? How can you help support them and give them information that allows them to track whether they’re meeting the targets that you’re setting at a company level or meeting the targets that they’re setting at their own team levels? Can you enable them to track both of those things?
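One way to picture tracking team-level and company-level targets side by side is a simple run-rate check. This is only a sketch: the team names, budgets, and the run-rate projection are assumptions for illustration, not how any specific FinOps platform does it.

```python
def budget_status(spend_to_date, budget, day_of_month, days_in_month):
    """Project month-end spend at the current daily run rate and compare to budget."""
    projected = spend_to_date * days_in_month / day_of_month
    return {"projected": projected, "over_budget": projected > budget}


# Hypothetical month-to-date spend and budgets, halfway through a 30-day month.
team_spend = {"payments": 6000.0, "discovery": 2000.0}
team_budgets = {"payments": 10000.0, "discovery": 5000.0}
company_budget = 15000.0

team_status = {
    team: budget_status(spend, team_budgets[team], day_of_month=15, days_in_month=30)
    for team, spend in team_spend.items()
}
company_status = budget_status(
    sum(team_spend.values()), company_budget, day_of_month=15, days_in_month=30
)
```

The same function answers both questions: whether each team is on track against its own target, and whether the company as a whole is on track against the company target.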

Forecasts at a provider level or at a service level, right? Putting all of these things into context so that your forecasts aren’t about a cloud service like EC2. It’s not, “I’m going to spend X amount on EC2 over the next six months.” It’s, “I’m going to spend Y amount on this service and Z amount on that service.” Getting to that level of granularity helps you be mindful of where you’re spending your money.
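To make the difference concrete, here is a small sketch that forecasts per business service rather than per cloud product. It assumes monthly cost history has already been allocated to services; the service names, the figures, and the naive linear projection are all illustrative assumptions.

```python
def forecast_next(monthly_costs, months_ahead=1):
    """Naive linear projection: extend the average month-over-month change."""
    if len(monthly_costs) < 2:
        raise ValueError("need at least two months of history")
    avg_change = (monthly_costs[-1] - monthly_costs[0]) / (len(monthly_costs) - 1)
    return monthly_costs[-1] + avg_change * months_ahead


# Hypothetical monthly cost history per business service, oldest month first.
history = {
    "checkout": [100.0, 110.0, 120.0],
    "search": [200.0, 190.0, 180.0],
}

# Six-month-ahead projection for each service.
forecasts = {
    service: forecast_next(costs, months_ahead=6)
    for service, costs in history.items()
}
```

Here the combined spend is flat at 300 each month, so a forecast on the total alone would hide that one service is growing while the other is shrinking.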

As we get further up this chain and we really start to self-actualize, you’re in a large organization. You’ve got finance asking questions that are separated from all of the technology decisions and the technology that’s being used. You may have started to spin up a FinOps practice internally where you’re trying to dedicate resources, actual people, to help manage this problem: the project of getting information into an organized state and delivering it out to the rest of the organization in an actionable way.

Again, as you grow and scale, you want to be mindful of how you are implementing these practices. There are multiple stakeholders. This is a cross-functional problem and you need to get everybody on the same page and to understand the information that exists so everybody’s looking at the same data set and able to make decisions based off of that.

Creating Order in Platform Engineering

Coming into how can you create that order? When we look at mindfulness and flow and these two things sort of balance, what is mindfulness? It’s a sort of state of active open attention to the present, right. That’s our definition here. But it’s being aware or being able to focus on something. When you practice mindfulness in the sort of breathing sense, right, you’re focusing on your breath. You’re trying to get rid of all the other distractions. How can we do that when it comes to cloud cost?

One example, right? Maybe you’re just taking the Cost and Usage Report straight as it is, or you’re looking at Cost Explorer. One of the challenges we hear from our customers when they come to Yotascale is, “I don’t know how to make sense of this information.” There’s a lot of information here. The CUR file is huge for some companies, hundreds of millions of rows of data. The values that exist in all of the columns, or the additional information about them, are extensive and confusing and a lot of times not worded so that a human can read them.

So, it’s really hard to stay mindful when you’re going through that type of information. What a product can do in this area is help you understand that data in words and terminology that make sense not just to you as a user, but to the business, right? The business has its own sort of way of talking about things, its own language, its own terminology, its own acronyms. All of that can be used to talk about cost as well. So, you can talk about costs in terms of your cost centers, your business units, your customers, the products that you deliver.

Being able to get that visibility and organize that data allows you to have that level of mindfulness. When we think about flow, a lot of the people who are our customers as a FinOps organization, right, are engineers. Engineers are the people that we want to enable with this information so they can make intelligent decisions.

One of the challenges we hear from engineers is that it is a distraction: “It’s not part of my core responsibilities to manage cost. I am out there to produce things that deliver value for our customers. Somebody else is going to figure out the cost problem.” That’s not always the case. Not everybody feels that way, but that feeling often exists because it is a challenge to get the information they need to make good decisions. If they have to go looking for a single tag in Cost Explorer, or go looking for information from Snowflake and Datadog, and work out “how do I really calculate the cost of my product?”, that is not their core function or their core value to the organization.

So how do you keep them in that flow state? Allow them to continue to make good decisions, while serving them with data that explains cost changes over time for their application, for their service, for the work that they are responsible for. Being able to get this information organized and delivered out to our users allows us to help them stay focused on the tasks that they are there to accomplish while helping the business stay mindful of its costs.

The Zen of Platform Engineering Cost Management

Getting into a little bit more details around that, how do we go about the evolution of building out this cost management visibility program? In this case, we’re starting here right now with visibility and putting this on the mindfulness side here because, in my view, having that visibility is awareness of your situation, not just looking at it in terms of AWS terminology or Azure terminology or GCP terminology. But what is that common language across these providers? And what is the common language for your business that you can then start to see costs through, get that information in real time, and have notifications pushed out at a granular level where users are able to understand what that notification is for.

So you’ve established the visibility, you’ve allocated the cost out to the teams, out to the services, the projects that are using it. You’ve made it possible for them to manage their own costs, get visibility into their own cost, and have the shared definition of cost across the organization.

When we think of coming into flow, that is engaging our teams. That is getting this information to our teams so that they can use it within the context that they operate in, so they don’t have to go searching across the organization or on these different platforms to get the rightsizing opportunities that exist. We can deliver all of that in a single view so that a team, a manager, a product owner can see: what are the opportunities that I could take based on the infrastructure that I use? Based on our spending rates, based on our discounts, how much can I save with these cloud providers?

And through Yotascale, you can see these views at a team level, at a product level, keeping those teams in flow so they don’t have to go searching for this information. Layer on top of that the needs of the finance organization to budget and to forecast. We also want to be able to budget and forecast around these entities that make sense to us. It’s not just forecasting my AWS cost, it’s forecasting the cost of the product, the cost of the service, the cost of the customer. That is the language that the business is able to use to operate in.

If you go tell finance, you know, EC2 cost is growing, they’re not going to be able to reconcile that with the business. What does that mean for the business other than I’m spending more money? Am I spending money to get more customers? Am I spending money because we’ve changed the products that we’re delivering? Why is that? I need the context around this. The ability to deliver anomaly alerts down to the detail of what resource it was allows teams to again stay in flow.

So when you get alerted that a NAT Gateway cost has spiked because all of a sudden it’s using an external IP, even while your production operations are all running smoothly, you’re now able to take that information and act on it. You get down to the resource ID level. This is an example from one of our customers, Zoom, who did experience a cost spike and was able to take the alert that was delivered to the team that managed the NAT Gateway and quickly shut down, or make a change to, the way that the NAT Gateway was set up in order to reduce the cost.

We see these types of events with customers all over the place. In this case, it was about a 40K-a-day anomaly that was happening. So you know that can be extreme savings, and the way that this team was able to do this was by staying in flow. They got the information in an alert that had the NAT Gateway and resource ID, and they were able to just remediate that problem right away.
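To illustrate what an alert at the resource ID level could look like, here is a minimal sketch that flags a resource when its latest daily cost sits far above its own recent history. The resource IDs, the figures, and the mean-plus-three-standard-deviations rule are assumptions for illustration; this is not a description of how Yotascale detects anomalies.

```python
from statistics import mean, stdev


def find_anomalies(daily_costs, threshold=3.0):
    """Flag resources whose latest daily cost is far above their own history.

    daily_costs maps a resource ID to a list of daily costs, oldest first.
    """
    alerts = []
    for resource_id, costs in daily_costs.items():
        history, latest = costs[:-1], costs[-1]
        if len(history) < 2:
            continue  # not enough history to estimate a spread
        baseline, spread = mean(history), stdev(history)
        # Use a floor of 1% of baseline so perfectly flat history still works.
        if latest > baseline + threshold * max(spread, 0.01 * baseline):
            alerts.append(
                {"resource_id": resource_id, "baseline": baseline, "latest": latest}
            )
    return alerts


# Hypothetical resource IDs and daily costs, for illustration only.
daily_costs = {
    "nat-gateway-example-1": [900.0, 1000.0, 1100.0, 1000.0, 40000.0],
    "nat-gateway-example-2": [500.0, 510.0, 490.0, 505.0, 500.0],
}

alerts = find_anomalies(daily_costs)
```

Because each alert carries the resource ID along with the baseline and the latest cost, the team that owns that resource can act on it without first having to hunt for what changed.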

Hope that you guys enjoyed this little talk on the mindfulness and flow aspects of cost management. 

Yotascale Demo

I did want to take a minute here to show you guys how we do this in the product as well. Just hang around. There’s going to be a good demo of one of the things that I didn’t call out quite directly. But as I pull up the console here, we’ll get into it.

This is our Copilot as well. This is something that we are seeing customers ask for because they want to just go from asking a question, or an intention that they have, to getting an answer. So we’re going to see how the Copilot can help you stay in flow within a product like Yotascale as well.

Briefly, we’re going to talk through this instance here, our tenant, our customer SaaS Tech. They’re a traditional SaaS company. They have started by looking at their costs through a finance-specific lens. So finance has defined cost centers and business units. We’ll dive into that a little bit more. What I want to highlight first is that within this view, we’re looking at all of our costs across multiple cloud providers that we use at SaaS Tech. And I see we’ve spent about 875K.

I talked about looking at this initial finance lens. Now, there’s also a team-oriented view of cost. We take that same cost data and that same 875K from all of our cloud providers and we’re going to break it down through a different lens for different personas. What this allows organizations that use Yotascale to do is define a view for cost that meets the needs of finance as well as a view of cost that meets the needs of engineering. We’re always allocating all of the cost, but this allows us to have those different views for different personas within the organization who may use different terminology.

When we come back to that finance-specific view, this is where we started with SaaS Tech. When they came in, they had their budgets defined, not just at the company level, but also at the business unit level. So I can drill in. In fact, I can give access to somebody on the product development business unit and they’re going to be able to see their costs as they split across front end and back end. You see what services they’re using as well. Again, this visibility and the allocation is what allows us to get to this level of granularity and allows us to talk about our cost with this terminology that makes sense to us.

If I jump back over to finance, back to our cost analytics and I’m going to show an example of, let’s say I’m a VP of engineering. I’ve logged into Yotascale. I don’t come to the product very often, so I came here with a question but I’m not really sure how to get to the answer. Let’s go with one of these predefined ones. What were the major cost drivers for the past three months based on service usage? What this is going to do is similar to your ChatGPT. We are leveraging an LLM on the backend here and it takes the user question, interprets it, and understands what they’re asking. It can translate it into an API call to Yotascale. It makes an API call to Yotascale, retrieves the information, and then it’s able to provide an answer. We get sort of a high level. These are the major cost drivers and I can continue to ask follow-ups on this and it may present me with some graphs to help me understand those costs, and I can expand this here.
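The flow described here (take a question, interpret it, turn it into an API call, retrieve data, answer) can be sketched roughly as below. Everything in this sketch is hypothetical: the `CostQuery` shape, the function names, and the canned data are invented for illustration, and a simple keyword matcher stands in for the LLM step. It does not represent Yotascale's actual API or implementation.

```python
from dataclasses import dataclass


@dataclass
class CostQuery:
    """Structured request a question is translated into (hypothetical shape)."""
    group_by: str
    months: int


def interpret(question: str) -> CostQuery:
    """Stand-in for the LLM step: map a question to a structured query.

    A real system would call a language model here; this keyword matcher
    only illustrates the input and output of that step.
    """
    q = question.lower()
    group_by = "service" if "service" in q else "business_unit"
    months = 3 if "three months" in q else 1
    return CostQuery(group_by=group_by, months=months)


def fetch_costs(query: CostQuery) -> dict:
    """Stand-in for the cost API call; returns canned, made-up data."""
    data = {
        "service": {"compute": 500.0, "storage": 200.0, "network": 100.0},
        "business_unit": {"product development": 600.0, "operations": 200.0},
    }
    return data[query.group_by]


def answer(question: str, top_n: int = 2) -> str:
    """Run the full loop: interpret, fetch, then summarize the top cost drivers."""
    query = interpret(question)
    costs = fetch_costs(query)
    top = sorted(costs.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    drivers = ", ".join(f"{name} ({cost:.0f})" for name, cost in top)
    return f"Top cost drivers by {query.group_by} over {query.months} month(s): {drivers}"


reply = answer(
    "What were the major cost drivers for the past three months based on service usage?"
)
```

Keeping the structured query as an explicit intermediate step means the language model only has to produce a small, checkable request, while the numbers themselves always come from the cost data.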

What we’re trying to do is eliminate the need for you to learn how to use yet another SaaS tool. We are at the very beginning stages of that, but already we’re seeing great feedback from customers and we’re seeing them ask questions they just want to get an answer to. They don’t know how to go build the dashboard. They could go learn it. They could take that time. But how do we keep you in flow? How can we allow you to just ask questions? I can create a new chat again here where we can ask a question about our business objects as well, right? So you see: how much have we spent on product development?

And again, this will take the question that I’ve asked. It will understand what I mean by product development. What is that for us at SaaS Tech? It understands that’s a business unit and it’s able to query Yotascale using the API to get back information about your cost. So immediately I’m able to get that answer to that question. I can say, can you break that down or can you show me month over month? And I can have a conversation here and continue to get deeper and deeper, change the question, redirect the assistant. Right? We’re working on getting it to understand your context. But there are times where it doesn’t. And so here we can see, you know, it’s broken it down over the last six months and there will be a lot more coming, right? This is the first entry for us in this GenAI area within the product.

We don’t just see it here, though. We do have a few other areas where generative AI is being integrated into the product. So when you’re thinking about, hey, I have maybe a different way of breaking cost down that I want to try. Yotascale allows you to do that by defining lenses, defining a way to break cost down. Now it might be that I have an idea for how I want to break my cost down. We’ve got this prod, staging, dev environments. So I want to come over here and I want to be able to engage with an assistant in order to build out that logic. What I can do is I’m able to tell this assistant, hello, I’d like to see costs. This assistant understands how to build these allocations for you so you don’t have to go in and tediously manage and create these lenses on your own. This assistant is able to define the lens, build it out in real time with you as a human in the loop so I can see the different tags that it’s discovered that it wants to map to this.

It understood what I asked by environment, it found the different variations of the environment tag that exist out there and it’s grouped them together into my prod environment. Now, admittedly, there’s some things here that I might want to clean up as I look at some of these, but we’re getting to a point where it’s going to be able to do all of this on its own, and you’re going to continue to be able to engage with it or edit yourself on this side as you’re ready to, right. So I can come in and make changes to this live. I can communicate back to my assistant and then we can save it and we can process our data through this lens and be able to see all of our cost data through this new view, breaking it down by both environment and accounts that are part of that environment.
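The grouping step described here, where different variations of an environment tag are collected under one environment, can be sketched as a small normalization pass. The tag keys, the variants, and the canonical names below are assumptions chosen for illustration; a real lens would be driven by the tags actually discovered in your own data.

```python
import re

# Hypothetical canonical environments and the tag-value variants mapped to each.
ENV_PATTERNS = {
    "prod": re.compile(r"^(prod|production|prd)$"),
    "staging": re.compile(r"^(stage|staging|stg)$"),
    "dev": re.compile(r"^(dev|development)$"),
}
# Tag keys treated as "the environment tag", compared case-insensitively.
ENV_TAG_KEYS = {"env", "environment"}


def resolve_environment(tags):
    """Map a resource's tags to a canonical environment, or 'unallocated'."""
    for key, value in tags.items():
        if key.strip().lower() not in ENV_TAG_KEYS:
            continue
        normalized = value.strip().lower()
        for env, pattern in ENV_PATTERNS.items():
            if pattern.match(normalized):
                return env
    return "unallocated"


# Hypothetical resources with inconsistent tagging.
resources = [
    {"tags": {"Environment": "Production"}, "cost": 300.0},
    {"tags": {"env": "PRD"}, "cost": 100.0},
    {"tags": {"ENV": "stg "}, "cost": 50.0},
    {"tags": {"owner": "data-team"}, "cost": 25.0},
]

cost_by_env = {}
for resource in resources:
    env = resolve_environment(resource["tags"])
    cost_by_env[env] = cost_by_env.get(env, 0.0) + resource["cost"]
```

Anything that matches no pattern lands in "unallocated", which keeps the gaps visible so a person reviewing the lens can clean them up.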

Conclusion

One other area I think, yeah, no, that is it. I covered all the things that I wanted to cover today. We do appreciate your time and I’m happy to stay on and answer any questions about Yotascale or about how customers are leveraging the product or any sort of individual questions you may have about your own situation. Again, I want to say thank you to everybody for taking the time to join us today. I appreciated it and I enjoyed the presentation. I hope you enjoyed it as well.

I’ll pause now to see if there are any questions. Looks like we have one here, just asking about what we use for our LLMs. Today we are leveraging OpenAI, but we’re also experimenting with others and even sort of fine-tuning and using our own.

And then another question about other platforms that we support. So today we support AWS, GCP and Azure as direct integrations, and then we support many, many others, Datadog, Databricks, Snowflake as well. So you can load those costs, and we have customers doing that today, loading those costs into Yotascale and displaying them alongside costs from other cloud providers.

Excellent. We’ll wrap up a little bit early today. Again, if you do have questions, you can feel free to reach out to me directly at jeff@yotascale.com or reach out to our support or sales team if you have any questions about the product. Appreciate everyone’s time today and hope you enjoy the rest of your day. Thank you.