8 Things That Need To Scale Better in 2025
Scalability issues have always plagued computing and networking efforts, but now environments are more complex and more challenging.
As businesses grow and tech stacks become more complex, scalability remains a top issue.
“Companies face significant challenges scaling across both physical and virtual spaces. While a holistic approach to operations across regions provides advantages, it also introduces complexity,” says Dustin Johnson, CTO of advanced analytics software provider Seeq. “The cloud can assist, but it’s not always a one-size-fits-all solution, especially regarding compute needs. Specialized resources like GPUs for AI workloads versus CPUs for standard processes are essential, and technologies like Kubernetes allow for effective clustering and scaling. However, applications must be designed to fully leverage these features, or they won’t realize the benefits.”
The variety of technologies involved creates significant complexity.
“Today, a vertically integrated tech stack isn’t practical, as companies rely on diverse applications, infrastructure, AI/ML tools and third-party systems,” says Johnson. “Integrating all these components -- ensuring compatibility, security, and scalability -- requires careful coordination across the entire tech landscape.”
A common mistake is treating scalability as a narrow technology issue rather than a foundational aspect of system design. Approaching it with a short-term, patchwork mentality limits long-term flexibility and can make it difficult to respond to growing demands.
Following are some more things that need to scale better in 2025.
1. Processes
A lot of organizations still have manual processes that prevent velocity and scale. For example, if a user needs to submit a ticket for a new server to implement a new project, someone must write the ticket, someone receives the ticket, someone must activate it, and then something must be done with it. It’s an entire sequence of steps.
“That’s not a scalable way to run your environment, so I think scaling processes by leveraging automation is a really important topic,” says Hillery Hunter, CTO and GM of innovation at IBM and an IBM Fellow. “There are a bunch of different answers to that [ranging] from automation to what people talk about, such as IT ops or orchestration technologies. If you have a CIO who is trying to scale something and needs to get permission separately from the chief information security officer, the chief risk officer or the chief data officer team, that serialization of approvals blocks speed and scalability.”
Organizations that want to achieve higher velocity should make scaling a joint responsibility among members of the C-suite.
“You don’t just want to automate inefficient things in your organization. You really want to transform the business process,” says Hunter. “When you bring together the owners of IT, information, and security at the same table, you remove that serialization of the decision process, and you remove the impulse to say no and create a collective impetus to say yes because everyone understands the transformation is mutual and a team goal.”
2. IT operations
IT is always under pressure to deliver faster without sacrificing quality, but the pressure to do more with less leaves IT leaders and their staff overwhelmed.
“Scalability needs to be done through greater efficiency and automation and use things like AIOps to oversee the environment and make sure that as you scale, you maintain your security and resiliency standards,” says Hunter. “I think re-envisioning the extent of automation within IT and application management is not done until those processes break. It’s maybe not investing soon enough so they can scale soon enough.”
3. Architectures
In the interest of getting to market quickly, startups might be tempted to build a new service from existing pre-made components that can be coupled together in ways that “mostly fit” but will demonstrate the business idea. This can lead to unintentionally complicated systems that are impossible to scale because of their sheer complexity. While this approach may work well in the beginning, getting business approval later to completely re-architect a working service that is showing signs of success may be very difficult.
“First of all, be very careful in the architectural phase of a solution [because] complexity kills. This is not just a reliability or security argument, it is very much a scalability argument,” says Jakob Østergaard, CTO at cloud backup and recovery platform Keepit. “A complex structure easily leads to situations where one cannot simply ‘throw hardware at the problem.’ This can lead to frustrations on both the business side and the engineering side.”
He advises: “Start with a critical mindset, knowing that upfront investment in good architecture will pay for itself many times over.”
4. Data visibility
Organizations are on a constant mission to monetize data. To do that they need to actively manage that data throughout the entire lifecycle at scale.
“While cloud computing has gained popularity over the past few decades, there is still a lot of confusion, resulting in challenges including understanding where your cloud data lives, what it contains, and how to ensure it is properly protected,” says Arvind Nithrakashyap, co-founder and CTO at data security company Rubrik. “When it comes to scalability, one blind spot is unstructured and semi-structured data.”
Unstructured data poses a security risk, as it can contain sensitive business data or personally identifiable information. And since unstructured data is shared with end-user applications using standard protocols over TCP/IP networks, it’s a prime target for threat actors. Since most companies have hybrid and multi-cloud implementations, IT needs to understand where sensitive data is, where it is going and how it is being secured.
“One of the toughest hurdles for organizations whose unstructured data portfolio includes billions of files, and/or petabytes of data, is maintaining an accurate, up-to-date count of those datasets and their usage patterns,” says Nithrakashyap. “[You need to understand] things [such as] how many files [exist], where they are, how old they are, and whether they’re still in active use. Without reliable, up-to-date visibility into the full spectrum of critical business files, your organization can easily be overwhelmed by the magnitude of your data footprint, not knowing where critical datasets are located, which datasets are still growing, [and] which datasets have aged out of use.”
5. SaaS service APIs
APIs are the glue that holds our modern software-driven world together. Keepit’s Østergaard says his company sees bottlenecks on the software-as-a-service APIs that vendors offer up for general use, ranging from explicit throttling to slow responses to outright intermittent failures. For better and tighter integrations between systems, APIs need to scale to higher volume use.
“Fundamentally, an API that does not scale is pointless,” says Østergaard. “For APIs to be useful we want them to be usable. Not a little bit, not just sometimes, but all the time and as much as we need. Otherwise, what's the point?”
Although it can be difficult to pinpoint a limiting factor, if user experience is any indication, it appears that some services are built on architectures that are difficult for the vendor to scale to higher volume use.
“This is a classical problem in computer science -- if a service is built, for example, around a central database, then adding more API front-end nodes may not do anything to improve the scalability of the APIs because the bottleneck may be in the central database,” says Østergaard. “If the system is built with a central database being core to its functionality, then replacing that central component with something that is better distributed over many systems could require a complete re-write of the service from the ground up. In practical terms for real world services, making a service scale to higher volume use is often very different from just clicking the ‘elastic scaling’ button on the cloud platform on which it runs.”
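The bottleneck Østergaard describes can be sketched as a toy capacity model. This is only an illustration under simplifying assumptions: every request passes through one front-end node and then a single central database, and the function name and capacity figures are hypothetical rather than measurements from any real service.

```python
def max_throughput(frontend_nodes: int,
                   per_node_rps: float,
                   database_rps: float) -> float:
    """Requests per second the service can sustain in this toy model.

    Each request is handled by one front-end node and then by the single
    central database, so the slower of the two stages caps the system.
    """
    frontend_capacity = frontend_nodes * per_node_rps
    return min(frontend_capacity, database_rps)


# Hypothetical capacities, chosen only for illustration.
PER_NODE_RPS = 500.0
DATABASE_RPS = 2_000.0

if __name__ == "__main__":
    for nodes in (1, 2, 4, 8, 16):
        rps = max_throughput(nodes, PER_NODE_RPS, DATABASE_RPS)
        print(f"{nodes:>2} front-end nodes: {rps:,.0f} requests/s")
```

In this model, throughput grows with each added node only until the front-end tier matches the database's capacity. Beyond that point, extra nodes add cost but no throughput.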
To scale a solution, it must be built on the “simplest possible” architecture, since architectural complexity is typically the main obstacle to scaling a solution. A complex architecture can make throwing hardware at a solution completely ineffective.
6. Artificial intelligence
As AI usage accelerates, cloud and cybersecurity scalability become even more critical.
“[M]ost companies are still in a discovery phase [with AI], and therefore what it takes to scale [in terms of] capabilities, cost, etc. is still not fully understood. It requires an approach of continuous learning and experimentation, with a strong focus on outcomes, to prioritize the right activities,” says Orla Daly, CIO at digital workforce transformation company Skillsoft.
IT leaders must ensure alignment with business leaders on the desired outcomes and critical success factors. They also need to understand the skills and resources in the organization, define KPIs and fill key gaps.
“Teams who are not proactively managing the need for scale will find suboptimal decisions or runaway costs on one side, or [a] lack of progress because the enablers and path to scale are not defined,” says Daly. “Scaling technology is ultimately about enabling business outcomes, therefore continuing to tie activities to the company priorities is important. It’s easy to get carried away by new and exciting capabilities, and innovation remains important, but when it comes to scaling, it’s more important to take a thoughtful and measured approach.”
7. Generative AI
Organizations are struggling to scale GenAI cost-effectively. Most providers bill for their models based on tokens, which are numerical representations of words or characters. The costs for input and output tokens differ. For example, Anthropic’s Claude 3.5 Sonnet charges $3.00 per million input tokens and $15.00 per million output tokens, while OpenAI’s gpt-4o model costs $2.50 per million input tokens and $10.00 per million output tokens. The two models are not equal and support different features, so the choice isn’t as clear cut as “which model is cheaper.”
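To see how per-token pricing turns into a bill, here is a minimal sketch using the per-million-token prices quoted above. The workload size is a hypothetical assumption, the dictionary keys are informal labels rather than official API model identifiers, and real invoices may include discounts or other charges not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Price in US dollars per one million tokens."""
    input_usd_per_million: float
    output_usd_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total cost in US dollars for the given token counts."""
        total = (input_tokens * self.input_usd_per_million
                 + output_tokens * self.output_usd_per_million)
        return total / 1_000_000


# Prices as quoted in the article; keys are informal labels only.
PRICING = {
    "Claude 3.5 Sonnet": TokenPricing(3.00, 15.00),
    "gpt-4o": TokenPricing(2.50, 10.00),
}

# Hypothetical monthly workload, assumed purely for illustration.
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 40_000_000

if __name__ == "__main__":
    for model, pricing in PRICING.items():
        usd = pricing.cost(INPUT_TOKENS, OUTPUT_TOKENS)
        print(f"{model}: ${usd:,.2f} per month")
```

Because output tokens cost several times more than input tokens, the ratio of output to input in a workload can matter as much as the headline price.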
“GenAI model consumers must pick a balance between price, capability and performance. Everyone wants the highest quality tokens at the lowest possible price as quickly as possible,” says Randall Hunt, CTO at leading cloud services company and AWS Premier Tier Services partner, Caylent.
An additional charge exists around “vectorization” of data, such as converting images, text, or other information into a numerical format, called an embedding, that represents the semantic meaning of the underlying data rather than the specific content.
“Embedding models are typically cheaper than LLMs. [For instance,] Cohere’s Embed English embedding model is $0.10 per million tokens. Embeddings can be searched somewhat efficiently using techniques like [hierarchical navigable small world] (HNSW) and cosine similarity, which isn’t important, but it requires the use of database extensions or specialized datastores that are optimized for those kinds of searches -- further increasing cost. [A]ll of this cost is additive, and it can affect the unit economics of various AI projects,” says Hunt.
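For readers unfamiliar with the cosine similarity mentioned in the quote, the following is a minimal sketch of a brute-force search over embeddings. The three-dimensional vectors and document names are invented for illustration; real embeddings have hundreds or thousands of dimensions, and an index such as HNSW exists precisely to avoid comparing the query against every stored vector as this sketch does.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest(query: Sequence[float],
            documents: Dict[str, Sequence[float]]) -> str:
    """Name of the stored vector most similar to the query (linear scan)."""
    return max(documents,
               key=lambda name: cosine_similarity(query, documents[name]))


# Toy embeddings, invented for illustration only.
DOCUMENTS = {
    "invoice": [0.9, 0.1, 0.0],
    "contract": [0.1, 0.9, 0.2],
    "photo": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    query = [0.8, 0.2, 0.1]
    print(nearest(query, DOCUMENTS))
```

The linear scan costs one comparison per stored vector for every query, which is the expense that specialized datastores and database extensions are designed to reduce.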
8. Operational technology data
Companies are being flooded with data. This goes for most organizations, but it’s especially true for industrial companies that are constantly collecting operational technology (OT) data from equipment, sensors, machinery and more. Industrial companies are eager to integrate insights from OT and IT data to enable data-driven decision making based on a holistic view of the business.
“In 2025 and beyond, companies that can successfully give data context and make efficient and secure connections between diverse OT and IT data sources, will be best equipped to scale data throughout the organization for the best possible outcomes,” says Heiko Claussen, chief technology officer at industrial software company AspenTech. “Point-to-point data connections can be chaotic and complex, resulting in siloes and bottlenecks that could make data less effective for agile decision making, enterprise-scale digital transformation initiatives and AI applications.”
Without an OT data fabric, an organization that has 100 data sources and 100 programs using those sources would need to write and maintain 10,000 point-to-point connections (100 × 100). With an OT data fabric, that drops to 200 connections (100 + 100), because each source and each program connects once to the fabric. In addition, many of these connections will be based on the same driver and thus much easier to maintain and secure.
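The connection counts follow from simple arithmetic, sketched below. The function names are hypothetical, and the model assumes the worst case in which every program reads from every source.

```python
def point_to_point_connections(sources: int, programs: int) -> int:
    """Each program connects directly to each source it uses (all of them)."""
    return sources * programs


def fabric_connections(sources: int, programs: int) -> int:
    """Each source and each program connects once to a shared data fabric."""
    return sources + programs


if __name__ == "__main__":
    sources, programs = 100, 100
    print("point-to-point:", point_to_point_connections(sources, programs))
    print("data fabric:   ", fabric_connections(sources, programs))
```

The gap widens as an organization grows: doubling both sources and programs quadruples the point-to-point count but only doubles the fabric count.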